Unexpected Token Count Mismatch in tiktoken Integration with openai-python #2538

@LuminaX-alt

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

When using the SyncClient in the openai-python library, frequent httpx.PoolTimeout exceptions occur even under moderate request loads. This happens with the default timeout and connection-pool configuration, suggesting either a connection-pooling misconfiguration or default parameters that are insufficient for certain workloads.

The issue is reproducible in environments where multiple consecutive requests are made to the OpenAI API in short succession.

To Reproduce:

1. Install the latest version of openai-python.
2. Create a script that sends multiple consecutive requests using SyncClient.
3. Observe that after a few requests, httpx.PoolTimeout errors occur.

Expected behavior:
The SyncClient should handle consecutive requests without frequent pool timeout errors, as long as the request rate is within API limits.

Environment:

openai-python version: [latest]

Python version: 3.10+

OS: macOS / Linux / Windows

httpx version: latest

To Reproduce

from httpx import PoolTimeout
from openai import OpenAI

# openai v1.x uses a client object; the legacy module-level
# openai.ChatCompletion API is no longer available.
client = OpenAI(api_key="YOUR_API_KEY")

# Simulate moderate load with multiple consecutive requests
for i in range(50):
    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Hello, this is request {i}"}],
        )
        print(i, response.choices[0].message.content)
    except PoolTimeout as e:
        print(f"Request {i} failed due to PoolTimeout: {e}")

Code snippets

import httpx
import openai

# The default client inside openai uses something like:
# httpx.Limits(max_keepalive_connections=5, max_connections=10)

client = httpx.Client(limits=httpx.Limits(max_keepalive_connections=5, max_connections=10))

# This small pool can be exhausted quickly under burst loads,
# causing httpx.PoolTimeout errors before the OpenAI API responds.
Increasing the pool size and timeout when initializing the client can reduce the frequency of this error:

custom_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=20, max_connections=40),
    timeout=httpx.Timeout(60.0),  # increase the per-request timeout
)

# Pass the custom httpx client to the SDK (rather than patching private attributes)
client = openai.OpenAI(http_client=custom_client)
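
For completeness, a minimal end-to-end sketch of the workaround, re-running the same 50-request burst from the reproduction above through a client built with the larger pool; the limit and timeout values here are illustrative assumptions, not tuned recommendations:

import httpx
from openai import OpenAI

# Custom connection pool and timeout (illustrative values)
custom_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=20, max_connections=40),
    timeout=httpx.Timeout(60.0),
)
client = OpenAI(api_key="YOUR_API_KEY", http_client=custom_client)

# Same burst of consecutive requests as in the reproduction script
for i in range(50):
    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Hello, this is request {i}"}],
        )
        print(i, response.choices[0].message.content)
    except httpx.PoolTimeout as e:
        print(f"Request {i} failed due to PoolTimeout: {e}")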

OS

macOS

Python version

Python 3.11.4

Library version

openai v1.0.1
