API server: 'completion_tokens' always 1 (running Functionary 2.4) #1344

@ChristianWeyer

Prerequisites

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

When running Functionary 2.4 Small, the response from the API server endpoint should contain correct values for completion_tokens and total_tokens.

Current Behavior

completion_tokens is always 1. E.g.:

  "usage": {
    "prompt_tokens": 507,
    "completion_tokens": 1,
    "total_tokens": 508
  }
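A small sanity check on the response can flag this symptom automatically. The sketch below is only a heuristic under stated assumptions: it treats the usage as suspicious when total_tokens does not equal prompt_tokens + completion_tokens, or when completion_tokens is at most 1 although the response carries more than a few characters of generated output. The helper names `generated_text` and `usage_problems` and the 16-character threshold are hypothetical and not part of llama-cpp-python.

```python
def generated_text(response: dict) -> str:
    """Concatenate message content and tool-call names/arguments from a chat completion response."""
    parts = []
    for choice in response.get("choices", []):
        message = choice.get("message") or {}
        if message.get("content"):
            parts.append(message["content"])
        for call in message.get("tool_calls") or []:
            function = call.get("function") or {}
            parts.append(function.get("name") or "")
            parts.append(function.get("arguments") or "")
    return "".join(parts)


def usage_problems(response: dict) -> list[str]:
    """Return a list of human-readable problems found in the 'usage' object (empty if none)."""
    usage = response.get("usage") or {}
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", 0)

    problems = []
    if total != prompt + completion:
        problems.append("total_tokens != prompt_tokens + completion_tokens")
    # Assumption: more than 16 characters of output is very unlikely to be a single token.
    if completion <= 1 and len(generated_text(response)) > 16:
        problems.append("completion_tokens <= 1 but the response contains generated output")
    return problems
```

Passing the parsed JSON body of a /v1/chat/completions response to `usage_problems` returns an empty list when the counts look plausible.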

Environment and Context

  • macOS 14.4.1
    MacBook Pro, M3 Max

  • Darwin MacBook-Pro 23.4.0 Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:37 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6031 arm64

  • Python 3.11.5

Failure Information (for bugs)

The API server always reports completion_tokens as 1, so total_tokens is always prompt_tokens + 1.

Steps to Reproduce

Run

python3 -m llama_cpp.server --model "./functionary/functionary-small-v2.4.Q4_0.gguf" --chat_format functionary-v2 --hf_pretrained_model_name_or_path "./functionary" --n_gpu_layers -1

Then send an OpenAI tool-calling request to the endpoint, for example:

curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}'

The response contains the wrong values for usage:

"usage" : {
      "completion_tokens" : 1,
      "prompt_tokens" : 187,
      "total_tokens" : 188
   }
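For repeated testing, the same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the function names `build_request_body` and `fetch_usage` are hypothetical, and the default URL assumes the server is listening on localhost:8000 as in the reproduction steps.

```python
import json
import urllib.request


def build_request_body() -> dict:
    """Build the same tool-calling request body as the curl example."""
    return {
        "messages": [
            {"role": "user", "content": "What is the weather like in Boston?"}
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "description": "Get the current weather in a given location",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "The city and state, e.g. San Francisco, CA",
                            },
                            "unit": {
                                "type": "string",
                                "enum": ["celsius", "fahrenheit"],
                            },
                        },
                        "required": ["location"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }


def fetch_usage(base_url: str = "http://localhost:8000") -> dict:
    """POST the request to /v1/chat/completions and return the 'usage' object."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_request_body()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["usage"]
```

With the server from the first step running, `print(fetch_usage())` prints the usage counts for one request.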
