Fix Ollama GPT-OSS streaming with 'thinking' field #13375

Open

colesmcintosh wants to merge 4 commits into main

Conversation

colesmcintosh (Collaborator)

Title

Fix Ollama GPT-OSS streaming with 'thinking' field

Relevant issues

Fixes #13340

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests via make test-unit
  • My PR's scope is as isolated as possible; it solves only 1 specific problem

Type

🐛 Bug Fix

Changes

Problem

Ollama GPT-OSS models were failing with APIConnectionError: Unable to parse ollama chunk whenever a streaming response contained a 'thinking' field alongside an empty 'response'. The chunk parser didn't handle this case, so streaming failed.

Solution

  • Added handling for chunks that contain a 'thinking' field with an empty 'response'
  • These chunks are treated as intermediate chunks that carry no user-facing content
  • Streaming continues until actual response content arrives

Code Changes

  • Modified OllamaTextCompletionResponseIterator.chunk_parser() in litellm/llms/ollama/completion/transformation.py
  • Added a condition for "thinking" in chunk and not chunk["response"] (see the sketch below)
  • Returns an empty GenericStreamingChunk for these intermediate chunks
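
For reviewers, a minimal sketch of the updated parser. Only the elif branch is the actual change; the import path, the GenericStreamingChunk keys (text / is_finished / finish_reason / usage / index / tool_use), and the surrounding branches are assumptions pieced together from the diff excerpt further down:

```python
from litellm.types.utils import GenericStreamingChunk


def chunk_parser(self, chunk: dict) -> GenericStreamingChunk:
    if chunk.get("done") is True:
        # Terminal chunk: close the stream.
        return GenericStreamingChunk(
            text=chunk.get("response", ""),
            is_finished=True,
            finish_reason="stop",
            usage=None,
            index=0,
            tool_use=None,
        )
    elif "thinking" in chunk and not chunk["response"]:
        # New branch: a reasoning-only chunk with no user-facing content yet.
        # Emit an empty text chunk so the stream keeps going instead of
        # raising "Unable to parse ollama chunk".
        return GenericStreamingChunk(
            text="",
            is_finished=False,
            finish_reason="",  # "" rather than None, per the MyPy fix commit below
            usage=None,
            index=0,
            tool_use=None,
        )
    elif "response" in chunk:
        # Ordinary content chunk.
        return GenericStreamingChunk(
            text=chunk["response"],
            is_finished=False,
            finish_reason="",
            usage=None,
            index=0,
            tool_use=None,
        )
    raise ValueError(f"Unable to parse ollama chunk: {chunk}")
```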

Tests Added

  • test_chunk_parser_with_thinking_field(): tests the exact problematic chunk from the issue (sketched after this list)
  • test_chunk_parser_normal_response(): ensures normal chunks still work
  • test_chunk_parser_done_chunk(): verifies done chunks work correctly
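
A sketch of how the first test could look; the iterator's constructor arguments here are assumptions, not taken from the PR:

```python
from litellm.llms.ollama.completion.transformation import (
    OllamaTextCompletionResponseIterator,
)


def test_chunk_parser_with_thinking_field():
    # Constructor arguments are assumed; only chunk_parser() is exercised.
    iterator = OllamaTextCompletionResponseIterator(
        streaming_response=iter([]), sync_stream=True
    )
    # The exact chunk reported in #13340.
    chunk = {
        "model": "gpt-oss:20b",
        "created_at": "2025-08-06T14:34:31.5276077Z",
        "response": "",
        "thinking": "User",
        "done": False,
    }
    parsed = iterator.chunk_parser(chunk)
    assert parsed["text"] == ""
    assert parsed["is_finished"] is False
    assert parsed["finish_reason"] == ""  # empty string per the later MyPy fix
```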

Verification

Tested with the exact chunk from the error:

```python
{'model': 'gpt-oss:20b', 'created_at': '2025-08-06T14:34:31.5276077Z', 'response': '', 'thinking': 'User', 'done': False}
```

Result: the chunk parses successfully and yields an empty text chunk, allowing the stream to continue.

- Handle chunks containing 'thinking' field with empty 'response'
- Treat these as intermediate chunks that don't contain user content
- Add comprehensive tests for chunk parsing scenarios
- Resolves APIConnectionError for GPT-OSS model streaming

Fixes BerriAI#13340

Fixed MyPy type error in the Ollama completion transformation where finish_reason
was set to None instead of the expected string type. Changed finish_reason=None to
finish_reason="" to match the GenericStreamingChunk TypedDict requirements.

Also updated corresponding test to expect empty string instead of None.
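
For context, a hedged sketch of the GenericStreamingChunk shape implied by that fix; the fields other than finish_reason are assumptions:

```python
from typing import Optional

from typing_extensions import Required, TypedDict


class GenericStreamingChunk(TypedDict, total=False):
    text: Required[str]
    is_finished: Required[bool]
    # Declared as str, not Optional[str], which is why finish_reason=None
    # failed MyPy and finish_reason="" is used instead.
    finish_reason: Required[str]
    usage: Optional[dict]
    index: int
```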
```diff
@@ -459,6 +459,15 @@ def chunk_parser(self, chunk: dict) -> GenericStreamingChunk:
                 finish_reason="stop",
                 usage=None,
             )
+        elif "thinking" in chunk and not chunk["response"]:
```
Contributor

@colesmcintosh can we return a ModelResponseStream instead?

This way we can include the reasoning content in the response and allow it to be displayed on chat UIs like OpenWebUI; see the openrouter implementation:

```python
def chunk_parser(self, chunk: dict) -> ModelResponseStream:
```
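
A rough sketch of what that suggestion could look like; ModelResponseStream, StreamingChoices, and Delta here follow litellm.types.utils, and surfacing 'thinking' via reasoning_content is modeled on the openrouter handler rather than copied from it:

```python
from litellm.types.utils import Delta, ModelResponseStream, StreamingChoices


def chunk_parser(self, chunk: dict) -> ModelResponseStream:
    # Map Ollama's 'thinking' onto reasoning_content so chat UIs such as
    # OpenWebUI can render it, keeping 'response' as the visible content.
    return ModelResponseStream(
        choices=[
            StreamingChoices(
                index=0,
                delta=Delta(
                    content=chunk.get("response") or None,
                    reasoning_content=chunk.get("thinking") or None,
                ),
                finish_reason="stop" if chunk.get("done") else None,
            )
        ]
    )
```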

Development

Successfully merging this pull request may close these issues:

[Bug]: ollama gpt-oss not working