Support For CodeAct In The Future? #383


Open
DrChrisLevy opened this issue Mar 28, 2025 · 4 comments
Labels
enhancement New feature or request

Comments

@DrChrisLevy

I'm curious to know if there are any future plans on the roadmap to also support the CodeAct approach as an alternative to the JSON tool-calling approach?

Like discussed here:

https://x.com/_philschmid/status/1905536547627110727

Looks like LangGraph is giving it a try: https://github.com/langchain-ai/langgraph-codeact

@DrChrisLevy DrChrisLevy added the enhancement New feature or request label Mar 28, 2025
@rm-openai
Collaborator

That's pretty cool. No immediate plans, but if you (or anyone in the community) built a good implementation, we'd be down to link to it from our docs. It should be possible without modifying the Agents SDK. I'll also try to build something if I can find some time!

@DrChrisLevy
Author

I was thinking about this more as I glanced through some of the code (with the help of AI). Here is what "we" concluded so far. It would seem tricky to integrate directly into the SDK. But thanks for entertaining the idea. I'm going to play with it some more...


That's an interesting question about integrating a CodeAct-style approach (like in smolagents or langgraph-codeact) into the OpenAI Agents SDK.

Based on our exploration of this SDK and the general understanding of these approaches:

  1. OpenAI Agents SDK Core Mechanism: As we've seen, this SDK is built heavily around the concept of Tools and Handoffs. These are presented to the underlying language model, which then generates structured output (effectively JSON) to indicate which predefined tool or handoff to call and what arguments to pass to it (Functions example, Integrating Tools). The Runner then parses this structured output and executes the corresponding Python function (FunctionTool, Handoff, etc.) with the provided arguments.

  2. CodeACT Paradigm: The CodeACT approach, as you described, typically involves the LLM generating executable code (often Python) directly as its action or plan step, rather than selecting from a predefined set of tools via JSON. The system then executes this generated code.

  3. The Challenge: Integrating CodeACT directly into this SDK appears quite challenging because the fundamental action mechanism is different.

    • LLM Output: The OpenAI Agents SDK relies on the model producing specific structured tool_calls or handoff requests in its response. The CodeACT approach requires the model to generate raw code within its text response.
    • Response Processing: The SDK's core loop (_run_impl.py) is designed to parse the structured tool_calls attribute of the LLM response. It would need significant modification to instead parse and identify executable code blocks from the main message content.
    • Action Execution: The SDK executes predefined Python functions (FunctionTool) or specific ComputerTool actions based on the parsed arguments. A CodeACT integration would require a new mechanism to safely execute arbitrary code generated by the LLM. This involves significant security considerations (sandboxing, restricting available modules/functions) that are not currently part of the SDK's design.
    • State Management: Integrating the results of arbitrary code execution back into the SDK's state (RunItem list, context, etc.) would also require careful design.
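
The contrast in the first two bullets can be made concrete with a toy sketch. The message shapes below are simplified and hypothetical (the `<execute>` tag is just one delimiting convention CodeAct-style systems use), not the SDK's actual types:

```python
import json
import re

# Style 1: JSON tool calling. The model names a predefined tool; the runner
# looks it up and dispatches to a registered Python function.
TOOLS = {"add": lambda a, b: a + b}
tool_call = {"name": "add", "arguments": json.dumps({"a": 2, "b": 3})}
result = TOOLS[tool_call["name"]](**json.loads(tool_call["arguments"]))
print(result)  # 5

# Style 2: CodeAct. The model emits code inside its free-text response; the
# runner must find it, extract it, and execute it.
message = "Adding the numbers.\n<execute>\nresult = 2 + 3\n</execute>"
code = re.search(r"<execute>(.*?)</execute>", message, re.DOTALL).group(1)
env: dict = {}
exec(code, env)  # no sandboxing here; see the security point above
print(env["result"])  # 5
```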

Conclusion:

While not strictly impossible, extending the current OpenAI Agents SDK to support a CodeACT paradigm wouldn't be a simple plug-in. It would likely involve:

  • Changing the prompting strategy to encourage code generation instead of tool use.
  • Fundamentally altering the response parsing logic in the execution loop (_run_impl.py).
  • Adding a secure execution environment for the LLM-generated code.
  • Adapting the state and result handling mechanisms.

It seems less like an extension and more like building a parallel execution model or significantly refactoring the core components, given the SDK's current reliance on the JSON-based tool-calling capabilities of the underlying OpenAI APIs.
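
To see why the "secure execution environment" point is nontrivial, here is about the simplest restriction one can apply: allow-listing builtins. This is a toy illustration only; restricted builtins are not a real security boundary, and a serious CodeAct runtime would need process- or container-level isolation.

```python
# Only these names are visible to the generated code.
SAFE_BUILTINS = {"sum": sum, "len": len, "range": range, "print": print}

def run_restricted(code: str) -> dict:
    """Execute `code` with an allow-listed set of builtins and return its namespace."""
    env = {"__builtins__": SAFE_BUILTINS}
    exec(code, env)
    env.pop("__builtins__")
    return env

print(run_restricted("n = sum(range(4))"))  # {'n': 6}

try:
    run_restricted("open('/etc/passwd')")  # open() is not in the allow-list
except NameError as exc:
    print("blocked:", exc)
```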

@aybidi

aybidi commented Apr 6, 2025

I briefly read the description, but I wonder if, in the future, we foresee a different Agent class for CodeAct, similar to how smolagents implements ToolCallingAgent (for the JSON tool-calling approach) and CodeAgent (for the CodeAct-style approach), both inheriting from a single MultiStepAgent class and overriding the appropriate methods for input, output, handoffs, etc.
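
That layout can be caricatured in a few lines. The class names follow the smolagents ones mentioned above, but the bodies are purely illustrative, not the smolagents implementation:

```python
import json

class MultiStepAgent:
    """Shared base: subclasses decide how to interpret an LLM 'action'."""
    def act(self, llm_output: str) -> str:
        raise NotImplementedError

class ToolCallingAgent(MultiStepAgent):
    """Treats the LLM output as a JSON tool call against predefined tools."""
    def __init__(self, tools: dict):
        self.tools = tools

    def act(self, llm_output: str) -> str:
        call = json.loads(llm_output)
        return str(self.tools[call["name"]](**call["arguments"]))

class CodeAgent(MultiStepAgent):
    """Treats the LLM output as Python code to execute."""
    def act(self, llm_output: str) -> str:
        env: dict = {}
        exec(llm_output, env)
        return str(env["result"])  # assumes the generated code assigns `result`

tools = {"add": lambda a, b: a + b}
print(ToolCallingAgent(tools).act('{"name": "add", "arguments": {"a": 2, "b": 3}}'))  # 5
print(CodeAgent().act("result = 2 + 3"))  # 5
```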

@yihuang

yihuang commented Apr 29, 2025

Isn't CodeAct just using a code executor as one of the tools?
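
Reading it that way, no SDK changes are needed: expose an interpreter as an ordinary function tool, and the model still emits a JSON tool call whose single argument is the code to run. A minimal sketch of what such a tool's body might look like (plain Python here; in the Agents SDK it could presumably be registered with the `function_tool` decorator), with all the sandboxing caveats from earlier still applying:

```python
import contextlib
import io

def run_python(code: str) -> str:
    """Execute a snippet and return its captured stdout as the tool result."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # unsandboxed; a real tool would isolate this
    return buffer.getvalue()

print(run_python("print(2 + 2)"))  # "4\n" is what the model sees as the tool output
```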
