Best practices for long running tools? #295

ostegm · 2025-03-21T22:42:11Z

I'm building a few tools that need to wait on long-running operations (tens of minutes or so), and I'm looking for suggestions on the best way to implement them in this framework.

One idea is to have the tool return a pointer that says, "Come back and check in later, and I may be able to give you some intermediate results."

Is there anything like this already built into the SDK, or anything coming?

rm-openai · 2025-03-21T22:48:33Z

I am thinking about this actively, but I want to make sure to do it right - so it may take a couple of weeks. For now, your main options are:

Just block on the long-running operation - easy, but bad for obvious reasons.
Return "operation in progress, will let you know when done" as the tool result. Asynchronously kick off the real operation. When it finishes, insert a user message into the input with the result. This is a hack but works OK.
Use something like Temporal for durable execution.

If you (or anyone) have suggestions on implementation, I'm open to them!
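A minimal sketch of the second option above, using only `asyncio` and a plain list of message dicts. It does not use any SDK API: `slow_operation`, `start_in_background`, and the message shapes are hypothetical stand-ins chosen for illustration.

```python
import asyncio


async def slow_operation(seconds: float) -> str:
    """Hypothetical stand-in for the real long-running work."""
    await asyncio.sleep(seconds)
    return "report ready"


def start_in_background(history: list, seconds: float):
    """Kick off the work and return a placeholder tool result right away."""

    async def runner() -> None:
        result = await slow_operation(seconds)
        # When the work finishes, append the result as a user message so the
        # next model call sees it (the "hack" described above).
        history.append(
            {"role": "user", "content": f"[background task finished] {result}"}
        )

    task = asyncio.create_task(runner())
    return "Operation in progress, will let you know when done.", task


async def demo() -> list:
    history = [{"role": "user", "content": "Please build the report."}]
    placeholder, task = start_in_background(history, seconds=0.05)
    # The placeholder is what the tool would return to the model immediately.
    history.append({"role": "tool", "content": placeholder})
    # A real app would keep serving the user here instead of awaiting.
    await task
    return history


if __name__ == "__main__":
    for message in asyncio.run(demo()):
        print(message)
```

In a real application you would keep a reference to the task so it is not garbage-collected, and trigger a new model run after the message is appended.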

ostegm · 2025-03-21T23:02:37Z

Thanks for the reply - it's not clear what the right approach is. The idea I'm exploring right now:

Define the original tool (start_X), which starts an async job that can stream output or post events to a queue, and returns an ID.
Define a second tool (check_X), which takes the ID returned by the first tool.
Optionally define a third (kill_X), which can stop the job.

The LLM could be instructed to use the ID to check on progress, fetch intermediate outputs while the job is running, and report back to the user. That would let the LLM work on other things in parallel and keep the user apprised of the progress.
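The start/check/kill idea could be sketched roughly like this, assuming an in-process `asyncio` job registry. `JobRegistry` and its method names are hypothetical; in practice each method would be exposed to the agent as its own tool, and the registry would likely live in a queue or database rather than in memory.

```python
import asyncio
import uuid


class JobRegistry:
    """In-memory registry of background jobs (hypothetical helper)."""

    def __init__(self) -> None:
        self._tasks = {}     # job_id -> asyncio.Task
        self._progress = {}  # job_id -> list of intermediate outputs

    def start(self, steps: int, delay: float) -> str:
        """start_X: launch the job and return its ID immediately."""
        job_id = uuid.uuid4().hex
        self._progress[job_id] = []
        self._tasks[job_id] = asyncio.create_task(self._work(job_id, steps, delay))
        return job_id

    async def _work(self, job_id: str, steps: int, delay: float) -> None:
        # Simulated work that records an intermediate output after each step.
        for i in range(steps):
            await asyncio.sleep(delay)
            self._progress[job_id].append(f"step {i + 1}/{steps} done")

    def check(self, job_id: str) -> dict:
        """check_X: report status plus intermediate outputs so far."""
        task = self._tasks[job_id]
        if task.cancelled():
            status = "cancelled"
        elif task.done():
            status = "failed" if task.exception() else "done"
        else:
            status = "running"
        return {"status": status, "progress": list(self._progress[job_id])}

    def kill(self, job_id: str) -> bool:
        """kill_X: request cancellation; False if the job already finished."""
        return self._tasks[job_id].cancel()


async def demo():
    registry = JobRegistry()
    job_id = registry.start(steps=3, delay=0.01)
    first = registry.check(job_id)       # checked before any step has run
    await asyncio.sleep(0.2)
    finished = registry.check(job_id)    # checked after the job completed
    other = registry.start(steps=100, delay=0.01)
    registry.kill(other)
    await asyncio.sleep(0.01)            # let the cancellation take effect
    killed = registry.check(other)
    return first, finished, killed


if __name__ == "__main__":
    for snapshot in asyncio.run(demo()):
        print(snapshot)
```

Returning plain dicts from `check` keeps the tool output easy for the model to read and summarize for the user.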

rm-openai · 2025-03-21T23:17:47Z

That could work - it totally depends on the UX you want. For example - Deep Research in ChatGPT kicks off a process, and sends occasional updates to the frontend to display to the user. But the user can't send messages while things are in progress.

On the other hand, if you wanted to allow the user to be able to keep chatting while the task ran, your approach sounds great to me.

ostegm · 2025-03-21T23:24:45Z

The part I'm not sure how to implement with my approach is the case where you trigger this long-running task, the LLM responds to the user, and then the user walks away, which means the LLM never gets a chance to check on the task's progress. So you may never get a status update until the user comes back and triggers another interaction with the agent.

rm-openai · 2025-03-22T00:42:58Z

There are a few pieces there:

You're probably storing the conversation in a database. You should probably insert a new item into the DB when the task is complete. That way, when the user refreshes the page, things appear correctly.
You might also want to notify the user via push notifications, so they see it.

the LLM responds to the user, and then the user walks away, which means the LLM never gets a chance to come check on the task progress

I guess I'd say - the LLM should only check on the task progress for the user. As long as the user is still active, you can periodically check for updates and update the message history. If they walk away, then only update the history when the task is done.

(Sorry if this is abstract - I'm speaking in general terms since I don't know the specifics of your application)
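To make the "insert a new item into the DB when the task is complete" idea concrete, here is a small sketch using `sqlite3` from the standard library. The `items` table, its columns, and the helper names are assumptions for illustration, not anything from the SDK or a specific application.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the conversation store and create the (hypothetical) items table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "conversation_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return conn


def append_item(conn: sqlite3.Connection, conversation_id: str, role: str, content: str) -> None:
    """Append one conversation item; the with-block commits the insert."""
    with conn:
        conn.execute(
            "INSERT INTO items (conversation_id, role, content) VALUES (?, ?, ?)",
            (conversation_id, role, content),
        )


def load_history(conn: sqlite3.Connection, conversation_id: str) -> list:
    """Return the conversation in insertion order, e.g. on page refresh."""
    rows = conn.execute(
        "SELECT role, content FROM items WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    )
    return [{"role": role, "content": content} for role, content in rows]


def on_task_complete(conn: sqlite3.Connection, conversation_id: str, result: str) -> None:
    """Called by the background worker when the long-running task finishes."""
    append_item(conn, conversation_id, "assistant", f"Your task finished: {result}")
    # A push notification to the user could also be sent from here.


if __name__ == "__main__":
    conn = open_store()
    append_item(conn, "conv-1", "user", "Start the export, please.")
    on_task_complete(conn, "conv-1", "export.csv is ready")
    print(load_history(conn, "conv-1"))
```

Because the result is written to the store rather than pushed through a live model call, it shows up correctly even if the user walked away and only returns later.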

BenjaminChoou · 2025-05-28T06:04:53Z

@rm-openai Thanks for sharing this idea. I've recently been using this SDK to build a voice agent, and I ran into a similar case, where:

I have long-running tools.
The voice agent should keep talking to the user while a long-running tool executes.

According to the examples in the SDK docs, the voice agent does not interact with the user until it gets the tool result.
I believe this is a common case for people building voice agents. One alternative would be an extra agent that generates filler messages in the meantime, but that is not very straightforward in my opinion.

Could you give any suggestions for this case?
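As a rough illustration of the filler idea, the sketch below races a slow tool against a timer with `asyncio.wait` and yields a canned filler line each time the timer fires first. This is plain `asyncio`, not the SDK's voice pipeline; `speak_while_waiting`, `slow_tool`, and the filler text are hypothetical, and a real agent might generate the filler with a model instead of a fixed string.

```python
import asyncio
from typing import AsyncIterator, Awaitable

FILLER = "Still working on that..."


async def speak_while_waiting(
    work: Awaitable[str], interval: float, filler: str = FILLER
) -> AsyncIterator[str]:
    """Yield filler lines every `interval` seconds until `work` finishes,
    then yield the real result as the final item."""
    task = asyncio.ensure_future(work)
    while True:
        # Wait for the tool, but give up after `interval` seconds.
        done, _pending = await asyncio.wait({task}, timeout=interval)
        if done:
            yield task.result()
            return
        yield filler


async def slow_tool() -> str:
    """Hypothetical long-running tool, shortened for the demo."""
    await asyncio.sleep(0.25)
    return "The weather in Paris is cloudy."


async def demo() -> list:
    return [line async for line in speak_while_waiting(slow_tool(), interval=0.1)]


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

Each yielded line would be handed to text-to-speech, so the user hears something while the tool is still running.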

BenjaminChoou · 2025-05-29T03:04:52Z

@rm-openai Here is my test, which simply returns "Waiting..." as the tool call result and then pushes the real execution result back to the LLM.

{
    "role": "system",
    "content": "You're an AI agent to help users. Do not make up information."
},
{
    "role": "user",
    "content": "Hi there. Who are you? What is the weather like in Paris today?"
},
{
    "tool_calls": [
        {
            "function": {
                "arguments": "{\"location\": \"Paris, France\"}",
                "name": "get_weather"
            },
            "id": "call_xxx",
            "type": "function"
        }
    ],
    "role": "assistant"
}

Immediately return a placeholder tool output that asks the model to wait:

{
    "role": "tool",
    "tool_call_id": "call_xxx",
    "content": "Waiting for tool executing result..."
}

-> LLM ->

{
    "role": "assistant",
    "content": "I'm an AI designed to assist users with various inquiries. I'm currently fetching the weather information for Paris. Please hold on for a moment."
}

Re-insert the previous tool call message and the real execution result:

{
    "tool_calls": [
        {
            "function": {
                "arguments": "{\"location\": \"Paris, France\"}",
                "name": "get_weather"
            },
            "id": "call_xxx",
            "type": "function"
        }
    ],
    "role": "assistant"
},
{
    "role": "tool",
    "tool_call_id": "call_xxx",
    "content": "The weather in Paris, France is cloudy, temperature 25 Celsius degree."
}

-> LLM ->

{
    "role": "assistant",
    "content": "The weather in Paris today is cloudy with a temperature of 25 degrees Celsius. If you have any more questions or need further assistance, feel free to ask!"
}

This example shows that it's possible to run the tool in the background and push the execution result back to the agent once it is ready.
However, I haven't found a way to implement this in the Agents SDK yet. Could you give any suggestions, or will the SDK support a similar feature in the future?
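Outside the SDK, the re-insertion step from the test above can be sketched as a small helper over chat-style message dicts. `with_real_result` is a hypothetical name; whether a given model API accepts a repeated tool call ID in the history is an assumption based on the test shown in this comment.

```python
import copy


def with_real_result(history: list, tool_call_id: str, real_content: str) -> list:
    """Return a new history that replays the original tool call and then
    appends the real tool result, leaving the input list untouched."""
    for message in history:
        for call in message.get("tool_calls", []):
            if call["id"] == tool_call_id:
                replay = {"role": "assistant", "tool_calls": [copy.deepcopy(call)]}
                result = {
                    "role": "tool",
                    "tool_call_id": tool_call_id,
                    "content": real_content,
                }
                return history + [replay, result]
    raise KeyError(f"no tool call with id {tool_call_id!r} in history")


if __name__ == "__main__":
    history = [
        {"role": "user", "content": "What is the weather like in Paris today?"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_xxx",
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": '{"location": "Paris, France"}',
                    },
                }
            ],
        },
        {
            "role": "tool",
            "tool_call_id": "call_xxx",
            "content": "Waiting for tool executing result...",
        },
        {"role": "assistant", "content": "Please hold on for a moment."},
    ]
    updated = with_real_result(history, "call_xxx", "Cloudy, 25 degrees Celsius.")
    for message in updated[-2:]:
        print(message)
```

The helper only builds the message list; the caller still has to send the updated history to the model to get the final answer.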

ostegm added the "question" label (Question about using the SDK) on Mar 21, 2025
