Skip to content

StreamableHttp - client refactoring and resumability support #595

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 72 commits into from
May 2, 2025

Conversation

ihrpr
Copy link
Contributor

@ihrpr ihrpr commented Apr 28, 2025

This PR implements client-side support for resumable execution in MCP, allowing
clients to resume interrupted long-running requests, for example, after network disconnections.

NOTE: support was added only on the lowest level, to send request, as we are not ready to add it to call_tool API as there is ongoing discussion about long running tasks. Once we are clear on the API design for long running tools, we will add it to session API.

Summary

  • Refactor client implementation, remove nesting
  • Enable clients to resume execution after connection breaks.
  • Add resumption token support to track tool execution progress
  • Implement SSE Last-Event-ID header for automatic event #replay

Note: stacked on top of #591

@ihrpr ihrpr marked this pull request as ready for review April 28, 2025 19:32
@ihrpr ihrpr added this to the 2025-03-26 spec release milestone Apr 29, 2025
Copy link
Contributor

@bhosmer-ant bhosmer-ant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ihrpr posting some comments early for curiosity's sake, then continuing review

@@ -255,9 +260,20 @@ async def unsubscribe_resource(self, uri: AnyUrl) -> types.EmptyResult:
)

async def call_tool(
self, name: str, arguments: dict[str, Any] | None = None
Copy link
Contributor

@bhosmer-ant bhosmer-ant May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kind of curious why you decided against a separate resume_tool method - at first blush it feels clearer as an API and it avoids the need to pass dummy arguments.

edit: looking at the way resume_tool was implemented in the earlier commits, I can see how the refactor disentangles the logic and avoids the fake "resume_from_token" tool call, but I do wonder whether resume_tool is a better Session API and (if so) whether it can be implemented in a way that preserves the current improvements. Interested in your thoughts (and definitely not looking to hold up the PR over it regardless)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resume_tool will only be applicable for one transport, introducing a new API for only one transport feels confusing
But in the same way I'm not happy with calling tool to resume the tool.

It's all back to the discussion how we support long runnng tools. Now I feel like we should just remove the resumption from session API call_tool and later introduce clear API how to do it for any transport

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: removed from call_tool API, left only on the lowest level send_request. Once we are happy with long running tools discussion we can add it properly

bhosmer-ant
bhosmer-ant previously approved these changes May 2, 2025
Copy link
Contributor

@bhosmer-ant bhosmer-ant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuation of earlier review - some additional comments inline but overall looks really nice, approving so that if none of the comments warrant changes you're able to go ahead and land it

ctx.metadata.on_resumption_token_update if ctx.metadata else None,
)
if is_complete:
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly a dumb question, but if we're not returning anything to indicate that we got something back before timing out, how does the client know to (e.g.) retry the resumption request?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in _handle_sse_event we are sending the reponses and notifications (as well as saving resumtion tokes) before we return true or false if we need to stop listening. Spec states that the last even should be response, hence we need to stop there. Client will already received the response before we got to this break

@ihrpr ihrpr changed the title StreamableHttp - client resumability StreamableHttp - client refactoring and resumability support May 2, 2025
dsp-ant
dsp-ant previously approved these changes May 2, 2025
Base automatically changed from ihrpr/use-session-message-for-related-request to main May 2, 2025 13:35
@ihrpr ihrpr dismissed stale reviews from dsp-ant and bhosmer-ant May 2, 2025 13:35

The base branch was changed.

@ihrpr ihrpr merged commit 74f5fcf into main May 2, 2025
9 checks passed
@ihrpr ihrpr deleted the ihrpr/client-resumability branch May 2, 2025 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants