
Ensure that function_call_prompt extends system messages following its current schema #13243


Open · wants to merge 2 commits into base: main

Conversation


@nagyv nagyv commented Aug 3, 2025

Title

Fixes #11267 - Fix system message format bug when using tools with Ollama models

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

(Screenshot: completed pre-submission checklist, 2025-08-03 17:55)

Type

🐛 Bug Fix

Changes

Fixed a critical bug in function_call_prompt(): the code concatenated a string onto the system message content without first checking the content's format.

When using tools with Ollama models (for example via Claude Code), system messages can carry either string content or structured content (a list of objects with type/text/cache_control fields).

Root Cause:

The function_call_prompt() function in litellm/litellm_core_utils/prompt_templates/factory.py treated all system message content as a string and blindly attempted string concatenation. With Ollama + tools, however, system messages arrive in the structured format (a list of content objects), and the concatenation raised an AttributeError.
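
For illustration, the two shapes a system message can take (variable names and values below are only examples; the structured form is what Claude Code sends through LiteLLM):

# String content: handled by the existing concatenation path
system_message_str = {"role": "system", "content": "You are a helpful assistant."}

# Structured content: a list of content objects, which broke the old concatenation
system_message_structured = {
    "role": "system",
    "content": [
        {
            "type": "text",
            "text": "You are a helpful assistant.",
            "cache_control": {"type": "ephemeral"},
        }
    ],
}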

Fix Applied:

  • Added type checking to handle both string and structured content formats
  • When content is a string: append function prompt via string concatenation (existing behavior)
  • When content is structured: append function prompt as a new content object with proper schema
  • Maintains backward compatibility for existing string-based system messages (a minimal sketch of the resulting logic follows this list)
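
A minimal sketch of the resulting logic inside function_call_prompt() (my paraphrase of the diff further down, not a verbatim copy of the patch; the list check and the exact shape of the appended object may differ in detail):

for message in messages:
    if "system" in message["role"]:
        if isinstance(message["content"], str):
            # Existing behavior: plain string content, append via string concatenation
            message["content"] += f""" {function_prompt}"""
        elif isinstance(message["content"], list):
            # Structured content: append the function prompt as a new content object
            message["content"].append({"type": "text", "text": f""" {function_prompt}"""})
        function_added_to_prompt = True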

Testing:
Added comprehensive test case test_system_message_format_issue_reproduction() that:

  • Reproduces the exact conditions from the bug report
  • Uses Ollama model with tools to trigger the problematic code path
  • Verifies that system messages with structured content are handled correctly
  • Ensures function prompts are properly appended without breaking the message schema (a rough sketch of such a test follows below)
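
For reference, a rough sketch of the shape of such a test (the payloads and assertions here are illustrative, not the exact code added in the PR):

from litellm.litellm_core_utils.prompt_templates.factory import function_call_prompt

def test_system_message_format_issue_reproduction():
    # Structured system message, as produced when calling an Ollama model through Claude Code
    messages = [
        {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {"role": "user", "content": "What is the capital of France?"},
    ]
    # A minimal function definition, enough to trigger the function-call prompt path
    functions = [
        {
            "name": "get_capital",
            "description": "Look up the capital of a country",
            "parameters": {"type": "object", "properties": {"country": {"type": "string"}}},
        }
    ]

    function_call_prompt(messages=messages, functions=functions)

    # The system message content must remain a list of well-formed content objects
    system_content = messages[0]["content"]
    assert isinstance(system_content, list)
    assert all(isinstance(part, dict) and "text" in part for part in system_content)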

(The fix was created by a human. Only the PR text was written by AI using the prompt: Please, write a summary PR for issue #11267. The fix is in commit eca86cd, follow the format of @.github/pull_request_template.md)

vercel bot commented Aug 3, 2025

The latest updates on your projects:

litellm: ✅ Ready · Updated (UTC): Aug 6, 2025 4:19am

message["content"].append({
"type": "text",
"text": f""" {function_prompt}""",
"cache_control": {"type": "ephemeral"}
Contributor

why add a cache control flag?

Author

It likely can be removed. I don't really know how cache control works here, and copied the format of the other messages.

Contributor

please remove it

Author

Done
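
With cache_control removed, the appended content object presumably reduces to a plain text part:

message["content"].append({
    "type": "text",
    "text": f""" {function_prompt}""",
})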

@@ -3711,7 +3711,14 @@ def function_call_prompt(messages: list, functions: list):
     function_added_to_prompt = False
     for message in messages:
         if "system" in message["role"]:
-            message["content"] += f""" {function_prompt}"""
+            if isinstance(message["content"], str):
Contributor

is this even needed anymore? I thought ollama supported function calling now?

Author

If it does, LiteLLM does not know about it.

Reproduced with Ollama versions

I initially found it with Ollama version 0.7.0 (which should already support tools); today I verified it with the most recent version, 0.10.1, as well. I call LiteLLM through Claude (e.g. claude --model local "What is the capital of France?").

I run into this bug with the config:

model_list:
  - model_name: "local"
    litellm_params:
      model: "ollama/hf.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:UD-Q4_K_XL"
      max_tokens: 32684
      repetition_penalty: 1.05
      temperature: 0.7
      top_k: 20
      top_p: 0.8
      
litellm_settings:
  master_key: os.environ/LITELLM_MASTER_KEY

Question

Are you saying that the "bug" is that add_function_to_prompt is set to True?

I've seen that it's False by default and that the code sets it to True somewhere; I haven't tracked down where yet.

Author

It could be that the qwen3 model does not support tool calling, and add_function_to_prompt is set in

litellm.add_function_to_prompt = (


It would be nice to remove this added string from the system prompt altogether, because certain Ollama models are trained to provide their own tool response format.

Successfully merging this pull request may close these issues.

[Bug]: APIConnectionError parsing Tool call response from Ollama