
LiteLLM + Gemini 2.5 Pro: cached_tokens=None crashes Agents SDK with Pydantic int-validation error #758


Open
ruidazeng opened this issue May 26, 2025 · 1 comment
Labels
bug Something isn't working

Comments


ruidazeng commented May 26, 2025

Please read this first

  • Have you read the docs? Yes – Agents SDK docs
  • Have you searched for related issues? Yes – nothing covering the cached_tokens=None → Pydantic error with LiteLLM + Gemini 2.5.

Describe the bug

When `Runner.run()` executes an agent whose model is a Gemini 2.5 Pro instance wrapped by LiteLLM, the pipeline dies during cost calculation with:

```
Error in Explainer (revision): 1 validation error for InputTokensDetails
cached_tokens
  Input should be a valid integer [type=int_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/int_type
```

The log shows LiteLLM repeatedly inserting cached_tokens=None into the request metadata; Pydantic 2.11’s InputTokensDetails model rejects None because the field is typed as int.
The fallback code then silently downgrades the model to o3-2025-04-16, masking the issue in production.
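The validation failure can be reproduced in isolation, without LiteLLM or the SDK at all, using a minimal stand-in for `InputTokensDetails` (a sketch for illustration, not the SDK's actual class definition):

```python
from pydantic import BaseModel, ValidationError


# Hypothetical stand-in for the SDK's InputTokensDetails model:
# the field is typed as a plain int, so Pydantic rejects None.
class InputTokensDetails(BaseModel):
    cached_tokens: int


try:
    # Mirrors what happens when LiteLLM reports cached_tokens=None.
    InputTokensDetails(cached_tokens=None)
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_type
```

An explicit `0` validates fine, which is why either proposed fix below would resolve the crash.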


Debug information

| Item | Value |
| --- | --- |
| Agents SDK | v0.0.16 |
| LiteLLM | v1.71.1 |
| Python | 3.12.7 |
| Pydantic | 2.11.03 |
| OS | macOS 14.4 (Apple Silicon) |
| Model string | gemini/gemini-2.5-pro-preview-05-06 |

Repro steps

"""
Minimal repro for cached_tokens=None crash with LiteLLM + Gemini 2.5.

Save as repro.py and run `python repro.py` (GOOGLE_API_KEY or GEMINI_API_KEY must be set).
"""

from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
import litellm, os, asyncio, json

# Suppress NULL fields – does *not* avoid the bug
litellm.drop_params = True

gemini = LitellmModel(
    model="gemini/gemini-2.5-pro-preview-05-06",
    api_key=os.getenv("GOOGLE_API_KEY") or os.getenv("GEMINI_API_KEY")
)

echo_agent = Agent(
    name="Echo",
    instructions="Return the user's message verbatim in JSON: {\"echo\": \"...\"}",
    model=gemini,
)

async def main():
    # Any prompt triggers the validation failure
    result = await Runner.run(echo_agent, [{"role": "user", "content": "ping"}])
    print(result.final_output)

asyncio.run(main())

Observed output

```
Error in Explainer (revision): 1 validation error for InputTokensDetails
cached_tokens
  Input should be a valid integer ...
```

Swapping the Gemini/LiteLLM model for any OpenAI model (o3-2025-04-16, gpt-4o) makes the script succeed, confirming the issue is isolated to the Gemini + LiteLLM path.


Expected behavior

  • LiteLLM should pass a valid integer (e.g., 0) for cached_tokens instead of None, or
  • the Agents SDK should coerce None to 0 before instantiating InputTokensDetails.

Either fix would allow Gemini 2.5 to run without crashing and would eliminate silent model downgrades.
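The second option can be sketched as a small normalization step applied to the usage payload before model construction (a sketch only; `coerce_cached_tokens` is a hypothetical helper, and the dict shape is assumed from the LiteLLM usage metadata described above):

```python
def coerce_cached_tokens(details: dict) -> dict:
    """Replace a None cached_tokens with 0 so int validation passes.

    `details` stands in for the prompt-tokens-details dict that LiteLLM
    attaches to the response usage (assumed shape, for illustration).
    """
    if details.get("cached_tokens") is None:
        # Return a copy rather than mutating the caller's dict.
        details = {**details, "cached_tokens": 0}
    return details


print(coerce_cached_tokens({"cached_tokens": None}))  # {'cached_tokens': 0}
print(coerce_cached_tokens({"cached_tokens": 5}))     # {'cached_tokens': 5}
```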

@ruidazeng ruidazeng added the bug Something isn't working label May 26, 2025
@DanielHashmi

I'm also getting this error:

```
pydantic_core._pydantic_core.ValidationError: 1 validation error for InputTokensDetails
cached_tokens
  Input should be a valid integer [type=int_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/int_type
```
