Description
Confirm this is an issue with the Python library and not an underlying OpenAI API
- [x] This is an issue with the Python library
Describe the bug
I ran into a problem where usage.prompt_tokens_details comes back as None, even though the prompt (2181 tokens) is well over the documented 1024-token minimum for prompt caching.
This is the output of response.usage:
CompletionUsage(completion_tokens=57, prompt_tokens=2181, total_tokens=2518, completion_tokens_details=None, prompt_tokens_details=None, reasoning_tokens=280, traffic_type='ON_DEMAND', promptTokensDetails=[{'modality': 'TEXT', 'tokenCount': 2181}], candidatesTokensDetails=[{'modality': 'TEXT', 'tokenCount': 57}])
Docs for prompt caching: https://platform.openai.com/docs/guides/prompt-caching
Requirements
Caching is available for prompts containing 1024 tokens or more, with cache hits occurring in increments of 128 tokens. Therefore, the number of cached tokens in a request will always fall within the following sequence: 1024, 1152, 1280, 1408, and so on, depending on the prompt's length.
All requests, including those with fewer than 1024 tokens, will display a cached_tokens field in the usage.prompt_tokens_details object of the Response or Chat Completion, indicating how many of the prompt tokens were a cache hit. For requests under 1024 tokens, cached_tokens will be zero.
"usage": {
"prompt_tokens": 2006,
"completion_tokens": 300,
"total_tokens": 2306,
"prompt_tokens_details": {
"cached_tokens": 1920
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
}
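
For reference, this is a minimal sketch of the check I would expect to work against the documented shape above (assuming `completion` is the response object); on this endpoint the details object comes back as None, so the fallback branch is taken:

# Minimal sketch, assuming `completion` is the ChatCompletion from the call below.
# Per the docs, prompt_tokens_details.cached_tokens should always be present
# (zero for prompts under 1024 tokens), but here the details object is None.
usage = completion.usage
details = usage.prompt_tokens_details
cached = details.cached_tokens if details is not None else None
print(f"prompt_tokens={usage.prompt_tokens}, cached_tokens={cached}")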
To Reproduce
My openai version is 1.99.6:
completion = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=messages,
    tools=available_tools,
    tool_choice="auto",
    max_tokens=20000,
    extra_headers={"X-TT-LOGID": ""},
)
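
As a hedged workaround sketch (not the library's documented API): judging from the repr above, the gateway attaches Gemini-style camelCase fields (promptTokensDetails, candidatesTokensDetails) as extra attributes on CompletionUsage instead of populating prompt_tokens_details. The field names below are taken from that repr and may vary by gateway or version:

# Workaround sketch: read the vendor's camelCase details when the
# standard snake_case field is missing. Extra (non-OpenAI) fields survive
# as attributes on the pydantic model; promptTokensDetails appears to be
# a list of {'modality': ..., 'tokenCount': ...} dicts.
usage = completion.usage
vendor_details = getattr(usage, "promptTokensDetails", None)
if usage.prompt_tokens_details is None and vendor_details:
    for entry in vendor_details:
        print(entry.get("modality"), entry.get("tokenCount"))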
Code snippets
OS
Linux
Python version
Python 3.11.2
Library version
openai v1.99.6