-
Notifications
You must be signed in to change notification settings - Fork 763
RFC: add Tool.outputSchema
and CallToolResult.structuredContent
#371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
A couple of comments on this as I prepare to undraft #223 for RFC:
[update]
With guidance that Servers returning a Structured Response MUST return a CallToolResult containing one EmbeddedResource of type |
Having
Another drawback I see is the lack of ability to dynamically define the structure/schema for response content. There are certainly cases where the output schema may not be known ahead of time, but would still be useful for the client or LLM consuming the content; it also enriches the capability of sampling and prompt messages. Lastly, going with this approach, extending its functionality in the future would likely represent a breaking chance since it largely goes against the implied design pattern of |
@lukaswelinder first of all, my apologies for putting up this PR without first participating in the discussion on #356 - I only saw it as I was writing the PR comments for this, but hadn't had a chance to look properly yet. Definitely didn't mean to step on your ongoing work. I'll respond to your comment a bit later (not at keyboard right now) and also make comments on #356. Meanwhile I'll move this to draft, pending further discussion. |
@evalstate thanks for the heads up - will come back to this after we see what comes out of the discussion on #356 , per previous comment |
@bhosmer-ant No offense taken, just glad to see there is motivation here. Input and feedback on #356 would be great. |
74c0122
to
30d1a5e
Compare
30d1a5e
to
9876ab9
Compare
Tool.outputSchema
Tool.outputSchema
and CallToolResult.structuredContent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Largely LGTM, thank you so much for moving this forward 🙏. Minor nit: given it's json native, I would love if we could avoid json-in-json and just have structuredOutput as an object.
"result": { | ||
"structuredContent": { | ||
"type": "text", | ||
"text": "[{\"id\":\"doc-1\",\"title\":\"Introduction to MCP\",\"url\":\"https://example.com/docs/1\"},{\"id\":\"doc-2\",\"title\":\"Tool Usage Guide\",\"url\":\"https://example.com/docs/2\"}]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have a separate key for structuredContent, would it be preferable for its value to directly conform to the outputSchema rather than having a type and a serialized json? At the very least should we avoid serialization?
schema/draft/schema.ts
Outdated
* | ||
* If the Tool defines an outputSchema, this field MUST be present in the result, and contain a serialized JSON object that matches the schema. | ||
*/ | ||
structuredContent?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
structuredContent?: string; | |
structuredContent?: object; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be aligned more with tool input?
structuredContent?: string; | |
structuredContent?: { [key: string]: unknown }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my comment above, maybe
structuredContent?: string; | |
structuredContent?: any; |
|
||
1. Clients **MUST** validate that the tool result contains a `structuredContent` field whose contents validate against the declared `outputSchema`. | ||
|
||
2. Servers **MUST** provide tool results that conform to their declared `outputSchema`s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to provide some guidelines on how the server should respond when validation of the output against the schema fails.
For example InvalidParams
is used in https://github.com/modelcontextprotocol/typescript-sdk/pull/454/files
or recommend server authors to pick a custom error code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm not sure we'd need to make an error path explicit for this situation - this was more just a restatement of the contract, since (in the absence of upstream errors returned in the usual way) the server should be able to guarantee that results conform to the declared schema. But definitely LMK if you have a specific situation in mind that I'm not thinking of!
@@ -130,11 +131,12 @@ | |||
"isError": { | |||
"description": "Whether the tool call ended in an error.\n\nIf not set, this is assumed to be false (the call was successful).", | |||
"type": "boolean" | |||
}, | |||
"structuredContent": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this, given that outputSchema is any arbitrary json schema (which makes sense), should this be of type any
rather than an object,i.e. should we omit the type=object? It would make sense for eg if the tool just returns an int or even an array. This as is will prevent an array of objects from being returned for eg. even though you could describe it in json schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah see above - on balance I think the sticking with the top-level-object restriction in the rest of the protocol (and standard practice more generally) is worth the extra trouble of wrapping top-level primitives/arrays
schema/draft/schema.ts
Outdated
* | ||
* If the Tool defines an outputSchema, this field MUST be present in the result, and contain a serialized JSON object that matches the schema. | ||
*/ | ||
structuredContent?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my comment above, maybe
structuredContent?: string; | |
structuredContent?: any; |
- `structuredOutput` typed as object rather than string - tighten CallToolResult to codify at-least-one constraint and explicitly allow structured results to contain `content` (but not the reverse) - update docs
@ihrpr fyi new rev makes |
OK - I'll just note my outstanding concerns on this one - not expecting a response - just adding my perspective as a Host application developer.
|
} | ||
``` | ||
|
||
Example valid response for this tool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add an example for a valid response that is b/w comaptible?
2. Servers **MUST** provide structured results in `structuredContent` that conform to the declared `outputSchema` of the tool. | ||
|
||
<Info> | ||
For backwards compatibility, a tool that declares an `outputSchema` may also return unstructured results in the `content` field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be based on the MCP client version string during version negotiation rather than doing this unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems useful to avoid mandating complex version-dependent logic here. (Paraphrasing offline discussion with @ihrpr:) In practice the SDKs will be handling construction of actual results, so it's useful to leave some freedom, and client/server devs will be spared the implementation details in any case. But even if both formats are sent unconditionally, the perf impact isn't huge, definitely not worth baking complexity into the spec to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it say "for backwards compatibility"? Does that mean structured output is always preferable over unstructured output? Why not specifically encourage tools to return both an unstructured and a structured response, so that the MCP client can pick the desired one based on the use case? For example, an unstructured response can be optimized for an LLM to read (e.g. when chatting with Claude), and a structured result can be used when further transformations are required.
Furthermore, you might also want to allow structured results contain extra data like IDs or timestamps that are not very meaningful for an LLM, but can be useful when doing data transformations (This also contradicts the bullet points below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
[update: relaxed the modality of
content
andstructuredContent
and the coupling ofstructuredContent
andoutputSchema
validation in #559]Adds support for strict validation of structured tool results.
Tool
can now optionally provide anoutputSchema
property, containing a JSON schema that defines the structure of its output.CallToolResult
adds a newstructuredContent
property, mutually exclusive withCallToolResult.content
property:result.structuredContent
will be absent, andresult.content
will be returned as before.result.structuredContent
will contain an object whose contents must validate against the tool's outputSchema.Prototype for typescript SDK support in modelcontextprotocol/typescript-sdk#454.
Design notes
This PR aims to provide simple, lightweight support for strict validation of tool result data whose structure can be entirely described by a single JSON schema. The approach here pairs a new
Tool.outputSchema
property with a newCallToolResult.structuredContent
property, avoiding use of theCallToolResult.content
array.This approach leaves the path open for adding schematic validation support to the much richer and more complex space of tools that make use of the full expressiveness of the
CallToolResult.content
array, via an additionalTool
property. Support for these use cases has been proposed in #356, and is under active discussion there.(After exploring possible ways of providing integrated support for both kinds use cases with one set of protocol additions, it's clear that both will be better served by a disjoint approach: strict validation of statically typed data results can be accomplished with the simple additions provided here, and the subtleties arising from supporting full space of
CallToolResult.content
shapes - see e.g. #415, in addition to #356 - can be addressed more naturally absent the need to support the use cases addressed here.)Motivation and Context
For tools that return structured output, having a description of that structure available is useful for various tasks, including:
outputSchema
s (or their absence) when making decisions about which tools to expose to the model.How Has This Been Tested?
No tests yet.
Breaking Changes
Optional new property that introduces a new behavior, not a breaking change.
Types of changes
Checklist
Additional context