Skip to content

Add Resource.uriScope to enable direct reads #607

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jonathanhefner
Copy link
Contributor

@jonathanhefner jonathanhefner commented May 30, 2025

(UPDATE: The original proposal was to add Resource.supportsDirectRead. The current proposal is to add Resource.uriScope.)

Motivation and Context

When reading a binary resource via resources/read, the resource content must be encoded into Base64 and wrapped in JSON on the server, and then must be decoded on the client. Base64 increases payload size by 33%, so this process can add significant overhead for large resources.

This commit adds an optional uriScope property to Resource, which indicates when the resource content may be read directly via its URI, thereby avoiding the overhead. If the value of uriScope is "external", clients may attempt to read via the URI. If the value of uriScope is "internal", clients must verify that the server is local before attempting to read via the URI. If these conditions are not met or if the read fails, clients should fall back to resources/read.

It's also worth mentioning that if a client SDK's resources/read API method accepts a resource object (as returned by resources/list), then the API method can transparently take advantage of this optimization.

How Has This Been Tested?

Not tested.

Breaking Changes

None.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Closes #527.

@connor4312
Copy link
Contributor

connor4312 commented May 30, 2025

I'm not a huge fan of this proposal since an MCP server does not always know where the client environment is and its network conditions. Consider...

  • Cloud hosted MCP servers might be behind a proxy running under stdio and reference file:/// URIs, which would normally be fine, but they are actualy inaccessible for connected clients
  • Enterprises like to firewall environments and could host MCP servers for their employees. While the server might be able to use a fetch tool and reference URLs on the public internet, a client might not have access to it
  • VS Code (and forks) support remote development where the client is connected to and runs (some) MCP servers on the remote compute which is a whole other environment with its own filesystem and network configuration

base64 decoding does increase the resource size, but if MCP is served over a compressed protocol--which will generally be almost every non-local case--then the size increase is just a couple percent, fairly negligible. The main concern just becomes memory usage, which I think could be better solved with streamed reads.

@jonathanhefner
Copy link
Contributor Author

  • Cloud hosted MCP servers might be behind a proxy running under stdio and reference file:/// URIs, which would normally be fine, but they are actualy inaccessible for connected clients

Perhaps the alternative approach with directUri would be better for use cases like this?

  • Enterprises like to firewall environments and could host MCP servers for their employees. While the server might be able to use a fetch tool and reference URLs on the public internet, a client might not have access to it

In this case, I think supportsDirectRead would just be false (or directUri would be null).

Alternatively, the uri (or directUri) could point to an internal proxy server that serves the resource.

  • VS Code (and forks) support remote development where the client is connected to and runs (some) MCP servers on the remote compute which is a whole other environment with its own filesystem and network configuration

The specifics of this scenario are unclear to me, but perhaps some combination of the above solutions could apply?

@connor4312
Copy link
Contributor

Perhaps the alternative approach with directUri would be better for use cases like this?

How would the MCP server in this case know how / where to create that?

In this case, I think supportsDirectRead would just be false (or directUri would be null). Alternatively, the uri (or directUri) could point to an internal proxy server that serves the resource.

But the MCP server is just some random 3rd party binary, it does not know that web URIs it can access are not accessible to the client nor how to use the internal proxy server.

@LucaButBoring
Copy link

Cloud hosted MCP servers might be behind a proxy running under stdio and reference file:/// URIs, which would normally be fine, but they are actualy inaccessible for connected clients

It's up to the SDK to upgrade to this if it makes sense for a particular request, so this wouldn't necessarily cause problems. I put a sequence diagram in #527, but copying it here:

sequenceDiagram
    participant Application as Host Application
    participant Client as MCP SDK Client
    participant Server as MCP Server
    participant CDN

    Server->>Client: <resource reference from a tool call or something>
    Client->>Application: <ref>
    Application->>Client: readResource(ref)
    alt resource.supportsDirectRead is supported and true
        Client->>CDN: fetch
        CDN->>Client: <resource content>
    else resource.supportsDirectRead is not supported or false, or the resource URI is misunderstood, etc.
        Client->>Server: resources/read
        Server->>Client: <resource content>
    end
    Client->>Application: <resource object>

Loading

We could extend this to handle the error case where the resource URI was declared as supportsDirectRead despite that not being the case by choosing to fall back to resources/read in that scenario.

As it's entirely up to the SDK to leverage, we could even (hypothetically) implement SDKs such that if resources/read succeeds after a direct fetch fails, assume there's a connectivity issue of some kind and don't attempt to do direct fetches for the remainder of that session with the server to avoid taking a consistent performance hit from failed initial requests. A sophisticated implementation would be more complicated than that as you'd want to handle cases where the direct fetch only had a temporary failure etc., but the general idea should hold.

@connor4312
Copy link
Contributor

It's up to the SDK to upgrade to this if it makes sense for a particular request, so this wouldn't necessarily cause problems.

The SDK on the server would not know this information. The SDK on the client... might? In a generic sense for an editor like VS Code where the client runs anywhere and connect to MCP's via arguments or URI configurations, we know a little bit about our own setup but don't know topology beyond that. To give another example, the folks over at Docker support a cloud running scenario and to do that (at least as of the demo I saw last week) they have a Docker MCP server running over stdio that proxies into their service, and as a client I am totally unaware this is happening, it looks like a normal local server to me.

Going back to what I queried before: we can probably assume that traffic sent over the network is going to get compressed where base64 overhead is not that big. The pro's I see for this are:

  1. There is less compute to compress things
  2. Serving things from elsewhere could be done in a more efficient way where they already exist, e.g. by CDN for HTTP data
  3. Given the resource already exists 'somewhere', it's less work to implement (though everyone does still need to implement resources/read anywhere)

The biggest benefit is no. 2, which works the best when you have things like S3 permalinks and so on. Is that big enough to take this instead of streaming, or could we have streaming in addition to this?

@LucaButBoring
Copy link

LucaButBoring commented May 31, 2025

The client SDK is the only entity that knows if the upgrade makes sense or not, because it would have to be explicitly written to handle particular resource schemes etc. On top of that, it's able to fall back to a standard resource read if there are connectivity mismatches like what you're describing, so that should just be an edge case where we'd fail the direct read and then succeed on the resources/read.

Regarding compression — one of the concerns driving this was memory usage by the application server. HTTP response compression also applies after the raw encoding to b64, so having compression doesn't avoid that memory increase at all. You would still have at least a few representations of the same file you'd be shuffling in memory on the server, which is the problem we're trying to sidestep:

  1. Raw resource (bytes)
  2. Raw resource (b64)
  3. Raw resource (b64 in JSON-RPC)
  4. Raw resource (compressed b64 in JSON-RPC)

(also I recognize you don't need all of those buffers in flight at the same time, I'm just listing them to illustrate the problem — only your largest buffer matters)


I think this proposal is entirely independent of partial/chunked result streaming, and is not a replacement for it. This is solving a problem specific to large resources, and protocol-level streaming will still be the ideal solution for most use cases (and especially for tool calls, which aren't covered by this at all).

@LucaButBoring
Copy link

Actually, on the topic of SDKs doing this transparently, should we also have an optional size parameter in the resource object, representing the byte size of the underlying resource? Given that this is basically an escape hatch, I think we should give SDKs the option to only upgrade to this above some arbitrary resource size. Something along the lines of being able to compare the cost of opening a new connection to speculatively request a file directly versus reusing the active connection for relatively small files. Naturally, that introduces some ambiguity around how to handle mismatches between the stated and real resource sizes (do we fail out? should the client attempt to handle it by allocating more buffers?), but it might be worth addressing that complexity if it means being able to make more intelligent decisions about opting into this.

That would also allow us to nudge applications towards protocol-level streaming whenever that gets finalized, by tuning that breakpoint in the SDK according to both the total size of the file and whether or not it supports chunked delivery at a protocol level. IMO it should generally be made clear that we want to err towards what we can do within the protocol, but also want to enable efficient large file downloads with this where it makes sense.

@evalstate
Copy link
Contributor

Actually, on the topic of SDKs doing this transparently, should we also have an optional size parameter in the resource object, representing the byte size of the underlying resource?

/**
* The size of the raw resource content, in bytes (i.e., before base64 encoding or any tokenization), if known.
*
* This can be used by Hosts to display file sizes and estimate context window usage.
*/
size?: number;

Resource does have an optional size.

@jonathanhefner jonathanhefner force-pushed the resource-supportsDirectRead branch from f0d4b41 to c76c416 Compare June 1, 2025 19:06
@jonathanhefner
Copy link
Contributor Author

Those are good points. I added a caveat to the documentation indicating that if the server is remote, then the client should verify it can access the URI before reading. If the URI is not accessible, the client should fall back to resources/read.

I think the main concern would be false positives / negatives. Since this is an optimization, clients can behave conservatively when unsure, but are there cases where a client could mistake a server-local URI for a generally accessible URI?


Actually, on the topic of SDKs doing this transparently, should we also have an optional size parameter in the resource object, representing the byte size of the underlying resource?

Resource does have an optional size.

I had added documentation to concepts/resources.mdx for that field in this PR, but I also just opened #621 to do so separately.

@jonathanhefner jonathanhefner force-pushed the resource-supportsDirectRead branch from c76c416 to 8792cfa Compare June 2, 2025 00:17
@jonathanhefner
Copy link
Contributor Author

I think the main concern would be false positives / negatives. Since this is an optimization, clients can behave conservatively when unsure, but are there cases where a client could mistake a server-local URI for a generally accessible URI?

New alternative: instead of Resource.supportsDirectRead, add an optional Resource.uriScope which can be either "external" or "internal".

  • If uriScope is "external", the client can read the URI, but should fall back to resources/read if the read fails.
  • If uriScope is "internal", the client must confirm that the server is local before reading the URI. If the server cannot be confirmed as local or the read fails, fall back to resources/read.
  • If uriScope is not specified, use resources/read.

That avoids having to do any heuristics on the resource URI.

When reading a binary resource via `resources/read`, the resource
content must be encoded into Base64 and wrapped in JSON on the server,
and then must be decoded on the client.  Base64 increases payload size
by 33%, so this process can add significant overhead for large
resources.

This commit adds an optional `uriScope` property to `Resource`, which
indicates when the resource content may be read directly via its URI,
thereby avoiding the overhead.  If the value of `uriScope` is
`"external"`, clients may attempt to read via the URI.  If the value of
`uriScope` is `"internal"`, clients must verify that the server is local
before attempting to read via the URI.  If these conditions are not met
or if the read fails, clients should fall back to `resources/read`.

It's also worth mentioning that if a client SDK's `resources/read` API
method accepts a resource object (as returned by `resources/list`), then
the API method can transparently take advantage of this optimization.
@jonathanhefner jonathanhefner force-pushed the resource-supportsDirectRead branch from 8792cfa to 8d556f0 Compare June 3, 2025 16:37
@jonathanhefner jonathanhefner changed the title Add Resource.supportsDirectRead Add Resource.uriScope to enable direct reads Jun 3, 2025
@connor4312
Copy link
Contributor

connor4312 commented Jun 3, 2025

I think the hint is moving in the right direction. As a client implementor I would want a little clarification on "client must confirm that the server is local before reading the URI". E.g.:

  • Assume that the file scheme is always local
  • For other URIs, ensure that the authority is loopback or that the authority resolves to a loopback address

is that enough?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Base64 in JSON RPC will not scale for file content
4 participants