Best practices for long running tools? #295

ostegm opened this issue Mar 21, 2025 · 7 comments
Labels
question Question about using the SDK

Comments

ostegm commented Mar 21, 2025

I'm building a few tools that have to wait on long-running operations (tens of minutes or so), and I'm looking for suggestions on the best way to implement them in this framework.

One idea is to have the tool return a pointer that says, "Come back and check in later, and I may be able to give you some intermediate results."

Is there anything like this already built into the SDK, or anything coming?

ostegm added the question label Mar 21, 2025

rm-openai (Collaborator) commented

I am thinking about this actively, but I want to make sure to do it right - so it may take a couple of weeks. For now, your main options are:

  1. Just block on the long-running operation. Easy, but bad for obvious reasons.
  2. Return "operation in progress, will let you know when done" as the tool result and kick off the real operation asynchronously. When it finishes, insert a user message into the input with the result. This is a hack but works OK.
  3. Use something like Temporal for durable execution.

If you (or anyone) have suggestions on implementation, I'm open to them!
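
Option 2 above can be sketched with plain `asyncio`, independent of the SDK. This is a minimal sketch under stated assumptions, not the SDK's API: `slow_operation`, `start_tool`, the `history` list, and the `[tool finished]` message format are all hypothetical names chosen for illustration.

```python
import asyncio


async def slow_operation(query: str) -> str:
    """Hypothetical stand-in for the real long-running job."""
    await asyncio.sleep(0.05)
    return f"result for {query!r}"


async def start_tool(query: str, history: list, background: set) -> str:
    """Tool body: schedule the job and return a placeholder right away."""

    async def run_and_report() -> None:
        result = await slow_operation(query)
        # The hack from option 2: surface the result as a new user message.
        history.append({"role": "user", "content": f"[tool finished] {result}"})

    task = asyncio.create_task(run_and_report())
    background.add(task)  # keep a strong reference so the task is not garbage collected
    task.add_done_callback(background.discard)
    return "operation in progress, will let you know when done"


async def demo() -> list:
    history: list = []
    background: set = set()
    placeholder = await start_tool("quarterly report", history, background)
    history.append({"role": "tool", "content": placeholder})
    # A real application would keep its event loop running instead of waiting here.
    await asyncio.gather(*background)
    return history
```

The placeholder reaches the model immediately; the real result arrives later as a separate history item, which the application would then feed into the next run.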

ostegm (Author) commented Mar 21, 2025

Thanks for the reply. It's not clear what the right approach is. The idea I'm exploring right now:

  1. Define the original tool (start_X), which starts an async job that can stream output or post events to a queue, and returns an ID.
  2. Define a second tool (check_X), which takes the ID returned by the first tool.
  3. Optionally define a third tool (kill_X), which can stop the job.

The LLM could be instructed to use the ID to check on progress, fetch intermediate outputs while the job is running, and report back to the user. That would let the LLM work on other things in parallel and keep the user apprised of the progress.
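
The three tools above could look roughly like this. It is a sketch assuming an in-memory job registry and `asyncio` tasks; the `JOBS` dictionary, the `_work` function, and the status strings are hypothetical, and `start_x` has to be called from inside a running event loop.

```python
import asyncio
import uuid

# Hypothetical in-memory registry: job_id -> {"task": Task, "progress": [str, ...]}
JOBS: dict = {}


async def _work(job_id: str, steps: int) -> str:
    for i in range(steps):
        await asyncio.sleep(0.01)  # stand-in for one unit of real work
        JOBS[job_id]["progress"].append(f"step {i + 1}/{steps} done")
    return "final result"


def start_x(steps: int = 3) -> str:
    """Tool 1: start the job and return its ID immediately."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"progress": [], "task": None}
    JOBS[job_id]["task"] = asyncio.create_task(_work(job_id, steps))
    return job_id


def check_x(job_id: str) -> dict:
    """Tool 2: report status plus any intermediate output so far."""
    job = JOBS[job_id]
    task = job["task"]
    status, result = "running", None
    if task.cancelled():
        status = "cancelled"
    elif task.done():
        if task.exception() is not None:
            status = "failed"
        else:
            status, result = "done", task.result()
    return {"status": status, "progress": list(job["progress"]), "result": result}


def kill_x(job_id: str) -> bool:
    """Tool 3: request cancellation; returns False if the job already finished."""
    return JOBS[job_id]["task"].cancel()
```

A registry like this only lives as long as the process. Surviving restarts would need a queue or database behind the same three functions.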

rm-openai (Collaborator) commented

That could work - it totally depends on the UX you want. For example - Deep Research in ChatGPT kicks off a process, and sends occasional updates to the frontend to display to the user. But the user can't send messages while things are in progress.

On the other hand, if you wanted to allow the user to be able to keep chatting while the task ran, your approach sounds great to me.

ostegm (Author) commented Mar 21, 2025

The part I'm not sure how to implement with my approach is the case where you trigger this long-running task, the LLM responds to the user, and then the user walks away. The LLM then never gets a chance to check on the task's progress, so you may never get a status update until the user comes back and triggers another interaction with the agent.

rm-openai (Collaborator) commented

There are a few pieces there:

  1. You're probably storing the conversation in a database. You should probably insert a new item into the DB when the task is complete. That way, when the user refreshes the page, things appear correctly.
  2. You might also want to notify the user via push notifications, so they see it.

> the LLM responds to the user, and then the user walks away, which means the LLM never gets a chance to come check on the task progress

I guess I'd say - the LLM should only check on the task progress for the user. As long as the user is still active, you can periodically check for updates and update the message history. If they walk away, then only update the history when the task is done.

(Sorry if this is abstract - I'm speaking in general terms since I don't know the specifics of your application)
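
As one concrete reading of the two points above, the completion handler can write straight into the conversation store and then fire an optional notification. This is a sketch assuming a SQLite table; the `items` schema, `on_task_complete`, `load_history`, and the `notify` callback are hypothetical names, not part of the SDK.

```python
import sqlite3
from typing import Callable, Optional


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "conversation_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
    )


def on_task_complete(
    conn: sqlite3.Connection,
    conversation_id: str,
    result: str,
    notify: Optional[Callable[[str, str], None]] = None,
) -> None:
    """Called by the background worker when the long-running task finishes."""
    # Persist the result as a conversation item, whether or not the user is online.
    conn.execute(
        "INSERT INTO items (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, "user", f"[task finished] {result}"),
    )
    conn.commit()
    if notify is not None:
        notify(conversation_id, "Your task has finished.")  # e.g. a push notification


def load_history(conn: sqlite3.Connection, conversation_id: str) -> list:
    """What the page reads on refresh: every stored item, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM items WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]
```

Because the result is stored independently of any LLM turn, a user who walked away sees it after the next page load without the agent having to poll.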

BenjaminChoou commented May 28, 2025

@rm-openai Thanks for sharing this idea. I have recently been using this SDK to build a voice agent, and I ran into a similar case:

  • I have long-running tools.
  • The voice agent should keep talking to the user while those tools run.

According to the examples in the SDK docs, the voice agent does not interact with the user until it gets the tool result.
I believe this is a common case for people building voice agents. One alternative is an extra agent that generates filler messages in the meantime, but that is not very straightforward in my opinion.

Could you kindly give any suggestions for this case?
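
Outside the SDK, the filler idea can be approximated by polling the tool task with a timeout and emitting a filler line each time the timeout fires. This is a sketch with plain `asyncio`, assuming fixed filler text rather than a second agent; `with_fillers` and `slow_weather_lookup` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Awaitable


async def with_fillers(
    job: Awaitable[str], interval: float, filler: str
) -> AsyncIterator[str]:
    """Yield `filler` every `interval` seconds until `job` finishes, then its result."""
    task = asyncio.ensure_future(job)
    while True:
        # asyncio.wait does not cancel the task when the timeout expires.
        done, _pending = await asyncio.wait({task}, timeout=interval)
        if done:
            yield task.result()
            return
        yield filler


async def slow_weather_lookup() -> str:
    await asyncio.sleep(0.2)  # stand-in for the long-running tool
    return "cloudy, 25 degrees Celsius"


async def demo() -> list:
    spoken = []
    async for line in with_fillers(slow_weather_lookup(), 0.05, "Still checking..."):
        spoken.append(line)  # a voice agent would send this to text-to-speech
    return spoken
```

The number of filler lines depends on timing, so the caller should only rely on the last yielded item being the tool result.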

BenjaminChoou commented May 29, 2025

@rm-openai Here is my test. It simply returns "Waiting..." as the tool call result, and later pushes the real execution result back to the LLM.

{
    "role": "system",
    "content": "You're an AI agent to help users. Do not make up information."
},
{
    "role": "user",
    "content": "Hi there. Who are you? What is the weather like in Paris today?"
},
{
    "tool_calls": [
        {
            "function": {
                "arguments": "{\"location\": \"Paris, France\"}",
                "name": "get_weather"
            },
            "id": "call_xxx",
            "type": "function"
        }
    ],
    "role": "assistant"
}

Immediately return a placeholder output for the long-running tool, asking the model to wait:

{
    "role": "tool",
    "tool_call_id": "call_xxx",
    "content": "Waiting for tool executing result..."
}

-> LLM ->

{
    "role": "assistant",
    "content": "I'm an AI designed to assist users with various inquiries. I'm currently fetching the weather information for Paris. Please hold on for a moment."
}

Re-insert the previous tool call message and the real execution result:

{
    "tool_calls": [
        {
            "function": {
                "arguments": "{\"location\": \"Paris, France\"}",
                "name": "get_weather"
            },
            "id": "call_xxx",
            "type": "function"
        }
    ],
    "role": "assistant"
},
{
    "role": "tool",
    "tool_call_id": "call_xxx",
    "content": "The weather in Paris, France is cloudy, temperature 25 Celsius degree."
}

-> LLM ->

{
    "role": "assistant",
    "content": "The weather in Paris today is cloudy with a temperature of 25 degrees Celsius. If you have any more questions or need further assistance, feel free to ask!"
}

This example shows that it's possible to run the tool in the background and wait for a completion notification to be pushed back to the agent.
However, I haven't found a way to implement this in the Agents SDK yet. Could you give any suggestions, or will the Agents SDK support a similar feature in the future?
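
The message sequence in this test can be reproduced with two small helpers that work on plain chat-message dictionaries. This is a sketch of the bookkeeping only, with no SDK integration; `placeholder_result` and `reinsert_with_result` are hypothetical names.

```python
import copy

PLACEHOLDER = "Waiting for tool executing result..."


def placeholder_result(tool_call_id: str) -> dict:
    """Tool message returned immediately so the model can keep talking."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": PLACEHOLDER}


def reinsert_with_result(messages: list, tool_call_id: str, real_output: str) -> list:
    """Return a new history: the old one, the repeated tool call, and the real output."""
    for msg in messages:
        calls = msg.get("tool_calls", [])
        if msg.get("role") == "assistant" and any(c["id"] == tool_call_id for c in calls):
            repeated = copy.deepcopy(msg)
            # Repeat only the call that has now finished.
            repeated["tool_calls"] = [
                c for c in repeated["tool_calls"] if c["id"] == tool_call_id
            ]
            break
    else:
        raise KeyError(f"no assistant tool call with id {tool_call_id!r}")
    real = {"role": "tool", "tool_call_id": tool_call_id, "content": real_output}
    return messages + [repeated, real]
```

The original list is left untouched, so the caller decides when the extended history is sent back to the model.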


3 participants