Chain-of-thought and structured output #800

Closed
akira108 opened this issue Jun 2, 2025 · 4 comments

Labels
question Question about using the SDK

Comments

akira108 commented Jun 2, 2025

Please read this first

  • Have you read the docs? Agents SDK docs: Yes
  • Have you searched for related issues? Others may have had similar requests: Yes

Question

I'd like to force the LLM to do chain-of-thought reasoning and make step-by-step tool calls, ultimately returning a structured output built with a Pydantic model. To achieve that, I'm using StopAtTools and stopping when the structured output has been built.

tool_use_behavior = StopAtTools(stop_at_tool_names=[build_output.name])

where

@function_tool
def build_output(foo: Foo) -> Foo:
    """Return the structured output; the run stops here via StopAtTools."""
    return foo

Today I can make this work by calling run_streamed, inspecting each interim tool-call result, and returning when a build_output call with the expected type appears, roughly as in the sketch below.
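
For reference, a self-contained sketch of that streaming approach (Foo, the instructions, and the prompt are illustrative placeholders):

import asyncio

from pydantic import BaseModel

from agents import Agent, Runner, StopAtTools, function_tool


class Foo(BaseModel):
    answer: str


@function_tool
def build_output(foo: Foo) -> Foo:
    """Return the structured output; the run stops here via StopAtTools."""
    return foo


async def main():
    agent = Agent(
        name="Assistant",
        instructions="Think step by step, then call build_output exactly once.",
        tools=[build_output],
        tool_use_behavior=StopAtTools(stop_at_tool_names=[build_output.name]),
    )
    result = Runner.run_streamed(agent, "Summarize the plan, then build the output.")
    async for event in result.stream_events():
        # Inspect each interim item; stop once the stop-at tool has produced a Foo.
        if event.type == "run_item_stream_event" and event.item.type == "tool_call_output_item":
            if isinstance(event.item.output, Foo):
                print(f"Got structured output: {event.item.output}")


if __name__ == "__main__":
    asyncio.run(main())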

Is there a way to achieve the same thing without using run_streamed?

akira108 added the question label Jun 2, 2025
@rm-openai (Collaborator)

@akira108 I might be missing something, but why can't you do this:

agent = Agent(
    name="Assistant",  # Agent requires a name
    model="o3",  # a reasoning model
    tools=[...],
    output_type=MyOutputType,
)

result = await Runner.run(agent, input)
final_output = result.final_output_as(MyOutputType)

The reasoning model should automatically call tools in its CoT, and keep going until it produces an output of that type.


akira108 commented Jun 2, 2025

@rm-openai

Thanks for the reply!

I’d love to use the reasoning model, but due to cost considerations, I’m trying to make it work with gpt-4.1.

Ref: https://cookbook.openai.com/examples/gpt4-1_prompting_guide#prompting-induced-planning–chain-of-thought

Do you have any suggestions for achieving similar behavior with gpt-4.1?

@rm-openai (Collaborator)

Oh gotcha, your approach should work for that. Note that because you're prompting the agent, it might be finicky:

  1. It might call the tool directly, without doing the step-by-step thinking.
  2. It might do the thinking, but not call the tool.

You could also try o4-mini; it might fit your budget.

import asyncio

from pydantic import BaseModel

from agents import (
    Agent,
    ItemHelpers,
    MessageOutputItem,
    Runner,
    function_tool,
)


class MathResult(BaseModel):
    result: int


@function_tool
def commit_final_result(result: int) -> MathResult:
    """When you have the final result, call this function."""
    print(f"[debug] Final result: {result}")
    return MathResult(result=result)


async def main():
    agent = Agent(
        name="Assistant",
        instructions="Think step by step. Once you have the final answer, call the commit_final_result tool exactly once. Do NOT call the commit_final_result until AFTER your thought process is done.",
        tool_use_behavior={
            "stop_at_tool_names": [commit_final_result.name],
        },
        tools=[commit_final_result],
    )

    result = await Runner.run(agent, "What is 373 * 41 + the year Roger Federer was born?")
    new_items = result.new_items

    assert isinstance(new_items[0], MessageOutputItem), (
        f"First item should be a message output item, got {type(new_items[0])}"
    )

    print(f"===Message===:\n {ItemHelpers.text_message_output(new_items[0])}")

    print(f"===Final output===\n {result.final_output}")


if __name__ == "__main__":
    asyncio.run(main())

output:

[debug] Final result: 17274
===Message===:
 To solve this problem, I'll break it down into steps:

1. **Calculate \(373 \times 41\):**

   \(373 \times 41 = 373 \times (40 + 1) = 373 \times 40 + 373 \times 1\)

   \[
   373 \times 40 = 373 \times 4 \times 10 = 1492 \times 10 = 14920
   \]

   \[
   373 \times 1 = 373
   \]

   \[
   373 \times 41 = 14920 + 373 = 15293
   \]

2. **Find the year Roger Federer was born:**

   Roger Federer was born in 1981.

3. **Add the results from Step 1 and Step 2:**

   \[
   15293 + 1981 = 17274
   \]

Now, I'll provide the final result.
===Final output===
 result=17274
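
The assert in the script already catches failure mode 1 (tool called with no preceding message). For failure mode 2, one rough guard (a sketch continuing the script above, not an official pattern) is to check the type of final_output and retry, since final_output is only a MathResult when commit_final_result was actually called:

async def run_with_retry(agent: Agent, prompt: str, max_attempts: int = 3) -> MathResult:
    # Re-run until the agent actually commits its answer via the tool.
    for _ in range(max_attempts):
        result = await Runner.run(agent, prompt)
        if isinstance(result.final_output, MathResult):
            return result.final_output
    raise RuntimeError("agent never called commit_final_result")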


akira108 commented Jun 2, 2025

@rm-openai

Thanks so much — that really helps!

I didn’t realize that even without specifying an output_type, final_output would take the return type of the tool.
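
E.g., if I'm reading the SDK right, this also works on the result above (with raise_if_incorrect_type=True it raises instead of silently returning a mismatched value):

final = result.final_output_as(MathResult, raise_if_incorrect_type=True)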

Appreciate the heads-up on the two pitfalls when prompting gpt-4.1, and I’ll definitely give o4-mini a try too!

akira108 closed this as completed Jun 2, 2025