Unexpected Token Count Mismatch in tiktoken Integration with openai-python #2538

@LuminaX-alt

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

When using the SyncClient in the openai-python library, frequent httpx.PoolTimeout exceptions occur even under moderate request loads. This happens with the default timeout and connection-pool configuration, suggesting either a connection-pooling misconfiguration or default parameters that are insufficient for certain workloads.

The issue is reproducible in environments where multiple consecutive requests are made to the OpenAI API in short succession.

To Reproduce:

1. Install the latest version of openai-python.
2. Create a script that sends multiple consecutive requests using SyncClient.
3. Observe that after a few requests, httpx.PoolTimeout errors occur.

Expected behavior:
The SyncClient should handle consecutive requests without frequent pool timeout errors, as long as the request rate is within API limits.

Environment:

openai-python version: [latest]

Python version: 3.10+

OS: macOS / Linux / Windows

httpx version: latest

To Reproduce

from httpx import PoolTimeout
from openai import OpenAI

# openai v1.x uses a client object; the legacy module-level
# openai.ChatCompletion API is no longer available.
client = OpenAI(api_key="YOUR_API_KEY")

# Simulate moderate load with multiple consecutive requests
for i in range(50):
    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Hello, this is request {i}"}],
        )
        print(i, response.choices[0].message.content)
    except PoolTimeout as e:
        print(f"Request {i} failed due to PoolTimeout: {e}")

Code snippets

import httpx
import openai

# The default client inside openai uses something like:
# httpx.Limits(max_keepalive_connections=5, max_connections=10)

client = httpx.Client(limits=httpx.Limits(max_keepalive_connections=5, max_connections=10))

# This small pool can be exhausted quickly under burst loads,
# causing httpx.PoolTimeout errors before the OpenAI API responds.
Increasing the pool size and timeout when initializing the client can reduce the frequency of this error:

custom_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=20, max_connections=40),
    timeout=httpx.Timeout(60.0),  # increase the per-request timeout
)

# Pass the custom httpx client to the SDK (rather than patching private attributes)
client = openai.OpenAI(http_client=custom_client)
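
For completeness, a minimal end-to-end sketch of the workaround, re-running the same 50-request burst from the reproduction above through a client built with the larger pool; the limit and timeout values here are illustrative assumptions, not tuned recommendations:

import httpx
from openai import OpenAI

# Custom connection pool and timeout (illustrative values)
custom_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=20, max_connections=40),
    timeout=httpx.Timeout(60.0),
)
client = OpenAI(api_key="YOUR_API_KEY", http_client=custom_client)

# Same burst of consecutive requests as in the reproduction script
for i in range(50):
    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Hello, this is request {i}"}],
        )
        print(i, response.choices[0].message.content)
    except httpx.PoolTimeout as e:
        print(f"Request {i} failed due to PoolTimeout: {e}")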

OS

macOS

Python version

Python 3.11.4

Library version

openai v1.0.1
