feat: Add RunPipeline tool #253
6bee4bc to 256587a
export const RunPipelineOperationArgs = {
  documents: z
    .array(z.record(z.string(), z.unknown()))
    .describe("Documents to run the pipeline against. 500 is maximum.")
nit: Worth adding .max(500) to codify this?
nice idea, added
src/tools/playground/runPipeline.ts
Outdated
export class RunPipeline extends ToolBase {
  protected name = "run-pipeline";
  protected description =
    "Run aggregation pipeline for provided documents without needing an Atlas account, cluster, or collection.";
I wonder if it would help to give the LLM some context in the description about when to use this tool, e.g. "can be useful in cases such as x, y, z", since the use cases seem more open-ended than those of the other tools.
Added a small clause: "The tool can be useful for running ad-hoc pipelines for testing or debugging."
I agree it's quite an open-ended tool, so I would leave it to the LLM to decide when exactly to use it.
    .array(z.record(z.string(), z.unknown()))
    .describe("Aggregation pipeline to run on the provided documents.")
    .default(DEFAULT_PIPELINE),
  searchIndexDefinition: z
Are more specific types for aggregationPipeline/searchIndexDefinition/synonyms useful for the LLM, or is it already pretty good at determining the types from the description?
For example, the search playground looks limited to a subset of aggregation pipeline stages. Would those be helpful to include in the type?
I feel it would be hard to add more specific zod types here. Unfortunately, all these entities have a complex dynamic structure.
I updated "Aggregation pipeline..." to "MongoDB aggregation pipeline" (same for the other fields) to stress the MongoDB part, which hopefully nudges the LLM in the right direction.
Regarding supported stages, I'd avoid listing them here. If we hardcode them, the list will likely get out of sync between the Playground and MCP over time. I'd rather rely on the Playground's response to flag any unsupported stages. It actually supports more than what's in the public docs (product wants to position it as a Search-only playground for now).
2f25ada to 3df5eca
lgtm!
Add a new RunPipeline tool that can execute an aggregation pipeline without requiring an Atlas account, cluster, or collection. The tool accepts a set of documents, an aggregation pipeline, and a search index definition, and runs them against the Search Playground. The Search Playground internally creates an ephemeral collection and executes the pipeline in a temporary environment.
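To make the input shape described above concrete, here is a minimal, dependency-free sketch of what a caller might pass to the tool, plus the 500-document limit mentioned in the review thread. The field names documents, aggregationPipeline, and searchIndexDefinition come from the discussion; the RunPipelineInput type, MAX_DOCUMENTS, validateRunPipelineInput, and the sample data are hypothetical and are not the PR's actual code, which uses zod schemas. Whether searchIndexDefinition is required is also an assumption here.

```typescript
// Illustrative sketch only: the type and function names below are hypothetical.
type Doc = Record<string, unknown>;

interface RunPipelineInput {
  documents: Doc[];
  aggregationPipeline: Doc[];
  searchIndexDefinition: Doc; // assumed required in this sketch
}

// The review thread states that 500 documents is the maximum.
const MAX_DOCUMENTS = 500;

// Returns a list of human-readable problems; an empty list means the input passes.
function validateRunPipelineInput(input: RunPipelineInput): string[] {
  const errors: string[] = [];
  if (input.documents.length > MAX_DOCUMENTS) {
    errors.push(
      `documents: at most ${MAX_DOCUMENTS} allowed, got ${input.documents.length}`
    );
  }
  return errors;
}

// Sample input: two documents, a $search stage, and a dynamic index mapping.
const exampleInput: RunPipelineInput = {
  documents: [
    { _id: 1, title: "Introduction to aggregation" },
    { _id: 2, title: "Search basics" },
  ],
  aggregationPipeline: [
    { $search: { index: "default", text: { query: "search", path: "title" } } },
  ],
  searchIndexDefinition: { mappings: { dynamic: true } },
};

console.log(validateRunPipelineInput(exampleInput));
```

In the PR itself the limit is expressed on the zod schema (the reviewer's .max(500) suggestion), so a hand-written check like this would only be needed outside that schema.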
Manual testing + integration test. More tests will be added in the following PRs.
What would be the result of query?

Is the query syntax correct?

Why $search doesn't return first doc?
