Prevent preamble messages from being treated as final output when tool calls are pending #1689
Issue
When using structured outputs, if the model produced both a message and a function call in the same turn, the agent would terminate prematurely.
This behavior has become much easier to reproduce with GPT-5 tool calling + preambles.
Earlier models could also occasionally produce such outputs; for example, GPT-4.1 would do so if the prompt instructed it to think before calling a tool, making it possible to hit random failures whenever output_type was specified.
I think this fix should also resolve openai/openai-agents-python#1061.
Reproducible Example
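The original repro is not reproduced verbatim here; below is a minimal sketch of the kind of setup that triggers the issue. The get_weather tool, the WeatherAnswer schema, and the model choice are illustrative assumptions, not taken from the PR:

```python
import asyncio

from pydantic import BaseModel

from agents import Agent, Runner, function_tool


class WeatherAnswer(BaseModel):
    # Hypothetical structured output schema, for illustration only.
    city: str
    temperature_c: float
    summary: str


@function_tool
def get_weather(city: str) -> str:
    """Hypothetical tool returning canned data."""
    return f"The weather in {city} is 22 degrees Celsius and sunny."


agent = Agent(
    name="Weather agent",
    # Encouraging the model to narrate before tool calls makes it emit a
    # preamble message and a function call in the same turn.
    instructions="Briefly say what you are about to do before calling any tool.",
    model="gpt-5",  # easiest to reproduce with GPT-5 preambles, per the issue
    tools=[get_weather],
    output_type=WeatherAnswer,
)


async def main() -> None:
    result = await Runner.run(agent, "What's the weather in Tokyo?")
    print(result.final_output)
    print(result.to_input_list())


if __name__ == "__main__":
    asyncio.run(main())
```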
Example Output (before this fix)
The to_input_list() looks like:
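(An illustrative reconstruction rather than the verbatim output; IDs and extra fields are trimmed. Under strict structured outputs the preamble's text is typically constrained to the output schema as well, which is exactly why it can be parsed as a final answer.)

```python
[
    {"role": "user", "content": "What's the weather in Tokyo?"},
    # Preamble message emitted in the same turn as the tool call.
    {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "output_text", "text": '{"city": "Tokyo", "temperature_c": 0, "summary": "Checking the weather now."}'}],
    },
    # The pending tool call...
    {
        "type": "function_call",
        "name": "get_weather",
        "arguments": '{"city": "Tokyo"}',
        "call_id": "call_abc123",
    },
    # ...and its output, which is where the list stops before this fix.
    {
        "type": "function_call_output",
        "call_id": "call_abc123",
        "output": "The weather in Tokyo is 22 degrees Celsius and sunny.",
    },
]
```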
At this point, the list ends with the tool call output. The run loop does not pass this tool output back to the LLM for another turn. As a result, the agent finalizes prematurely, treating the preamble message as the structured output instead of producing a final answer that incorporates the tool result:
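So final_output ends up being the parsed preamble rather than an answer based on the tool result, e.g. something like (illustrative, using the hypothetical schema above):

```python
WeatherAnswer(city='Tokyo', temperature_c=0.0, summary='Checking the weather now.')
```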
Behavior Change
Before:
When a structured output schema was used, if the model produced both a preamble message and a function call in the same turn, the agent would treat the preamble message as the final output, even though tool calls were still pending.
This caused the run loop to terminate early.
After:
Structured output only triggers a final output when there are no pending tool calls or approvals.
If the model emits a tool call, we process it, append the result, and re-run the loop.
I think this aligns with the default tool_use_behavior="run_llm_again" and is more intuitive.
If developers prefer to treat a tool's output as the final output, they can explicitly configure tool_use_behavior to "stop_on_first_tool" or StopAtTools, as sketched below.
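For example (a sketch reusing the illustrative get_weather tool from above; the top-level import path for StopAtTools is assumed):

```python
from agents import Agent, StopAtTools

# Use the first tool call's output as the final output:
agent = Agent(
    name="Weather agent",
    tools=[get_weather],
    tool_use_behavior="stop_on_first_tool",
)

# Or stop only when specific tools are called:
agent = Agent(
    name="Weather agent",
    tools=[get_weather],
    tool_use_behavior=StopAtTools(stop_at_tool_names=["get_weather"]),
)
```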
✅ All tests are passing
Example Output (after this fix)
The list now continues after the tool call output, feeding the tool result back to the LLM and producing the proper final answer:
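(Again an illustrative reconstruction with fields trimmed; the items up to the tool call output are the same as in the pre-fix list above.)

```python
[
    # ... user message, preamble, function_call, as before ...
    {
        "type": "function_call_output",
        "call_id": "call_abc123",
        "output": "The weather in Tokyo is 22 degrees Celsius and sunny.",
    },
    # (Possibly a post-tool message here; see the note below.)
    # Final structured answer produced on the next turn, after the model
    # has seen the tool result.
    {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "output_text", "text": '{"city": "Tokyo", "temperature_c": 22, "summary": "Sunny and mild."}'}],
    },
]
```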
Note: after a function call output, the model may also produce an intermediate post-tool message (e.g. explaining that it received the weather data) before generating the final output message in the same turn.
And the final output now correctly incorporates the tool result:
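With the hypothetical WeatherAnswer schema from the sketch above, final_output would now look something like:

```python
WeatherAnswer(city='Tokyo', temperature_c=22.0, summary='Sunny and mild.')
```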