Transform subAgent tool into agentStart and agentMessage tools #111

Closed
5 tasks
bhouston opened this issue Mar 5, 2025 · 5 comments · Fixed by #205

bhouston commented Mar 5, 2025

Transform subAgent tool into agentStart and agentMessage tools

Problem Statement

Currently, the subAgent tool creates a sub-agent that runs autonomously until completion. This can lead to sub-agents getting off task and wasting time, as there's no way for the parent agent to monitor or intervene in the sub-agent's execution.

Proposed Solution

Transform the subAgent tool into two separate tools similar to how shellStart and shellMessage work:

  1. agentStart: Starts a sub-agent and immediately returns an instance ID

    • Does not wait for any timeout before returning (unlike shellStart)
    • Returns immediately so the parent agent can continue execution
  2. agentMessage: Bidirectional communication with a running sub-agent

    • Get the current state/output of the sub-agent
    • Send instructions or guidance to the sub-agent
    • Optionally terminate the sub-agent

Benefits

  • Parent agents can monitor sub-agent progress in real-time
  • Parent agents can provide guidance or correction if sub-agents get off track
  • Multiple sub-agents can run in parallel with parent supervision
  • Improved efficiency by preventing sub-agents from wasting time on unproductive paths

Requirements

agentStart Tool

  • Similar parameters to the current subAgent tool
  • Returns an instance ID and initial state immediately
  • Maintains a global map of running sub-agents (similar to processStates in shellStart)
  • Sub-agent execution happens asynchronously

agentMessage Tool

  • Takes an instance ID parameter to identify which sub-agent to interact with
  • Allows getting the current state/output of the sub-agent
  • Optionally allows sending guidance or instructions to the sub-agent
  • Optionally allows terminating the sub-agent

Implementation Notes

  • Keep the existing subAgent tool for backward compatibility and comparison
  • Create a new implementation that uses the same toolAgent core but in an asynchronous manner
  • Ensure proper cleanup of resources when sub-agents complete or are terminated
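
One way to approach the cleanup note above is to prune finished agents from the shared map after a retention window, so the parent still has time to read the final result. The sketch below is an assumption, not the project's implementation: `TrackedAgent`, `pruneFinishedAgents`, and the `maxAgeMs` parameter are hypothetical names, and only the `status` and `lastUpdated` fields mirror the proposed `AgentState`.

```typescript
// Hypothetical, trimmed-down view of an agent entry (assumption).
type AgentStatus = 'running' | 'paused' | 'completed' | 'error';
type TrackedAgent = { status: AgentStatus; lastUpdated: number };

// Removes agents that have finished (completed or error) and have not been
// updated within `maxAgeMs`. Returns the IDs that were removed.
function pruneFinishedAgents(
  states: Map<string, TrackedAgent>,
  maxAgeMs: number,
  now: number = Date.now(),
): string[] {
  const removed: string[] = [];
  states.forEach((agent, id) => {
    const finished = agent.status === 'completed' || agent.status === 'error';
    if (finished && now - agent.lastUpdated > maxAgeMs) {
      removed.push(id);
    }
  });
  // Delete after iterating so the map is not mutated mid-traversal.
  removed.forEach((id) => states.delete(id));
  return removed;
}
```

Running or paused agents are never pruned here; stopping them would go through the terminate path instead.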

Acceptance Criteria

  • New agentStart and agentMessage tools implemented
  • Existing subAgent tool maintained for backward compatibility
  • Documentation updated to explain both approaches
  • Tests added for the new tools
  • Example usage provided in documentation

bhouston commented Mar 5, 2025

Implementation Plan

1. Code Structure

New Files to Create

  • packages/agent/src/tools/interaction/agentStart.ts
  • packages/agent/src/tools/interaction/agentMessage.ts
  • packages/agent/src/tools/interaction/agentStart.test.ts
  • packages/agent/src/tools/interaction/agentMessage.test.ts

Shared State

Create a shared state module to track running agents:

  • packages/agent/src/tools/interaction/agentState.ts - Will contain a Map similar to processStates in shellStart

2. Implementation Details

agentState.ts

import { CoreMessage } from 'ai';
import { v4 as uuidv4 } from 'uuid';
import { Tool } from '../../core/types.js';
import { ToolContext } from '../../core/toolAgent/types.js';

// Define AgentState type
export type AgentState = {
  instanceId: string;
  goal: string;
  prompt: string;
  tools: Tool[];
  context: ToolContext;
  messages: CoreMessage[];
  status: 'running' | 'paused' | 'completed' | 'error';
  result: string;
  error?: string;
  lastUpdated: number;
};

// Global map to store agent states
export const agentStates: Map<string, AgentState> = new Map();

// Helper to create a new agent instance ID
export const createAgentInstanceId = (): string => {
  return uuidv4();
};

agentStart.ts

import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { Tool, ToolContext } from '../../core/types.js';
import { getTools } from '../getTools.js';
import { agentStates, createAgentInstanceId, AgentState } from './agentState.js';
import { getDefaultSystemPrompt, getModel } from '../../core/toolAgent/index.js';

// Similar schema to current subAgent but returns immediately
const parameterSchema = z.object({
  description: z
    .string()
    .describe("A brief description of the sub-agent's purpose (max 80 chars)"),
  goal: z
    .string()
    .describe('The main objective that the sub-agent needs to achieve'),
  projectContext: z
    .string()
    .describe('Context about the problem or environment'),
  workingDirectory: z
    .string()
    .optional()
    .describe('The directory where the sub-agent should operate'),
  relevantFilesDirectories: z
    .string()
    .optional()
    .describe('A list of files, which may include ** or * wildcard characters'),
});

const returnSchema = z.object({
  instanceId: z.string().describe('The ID of the created agent instance'),
  status: z.string().describe('The current status of the agent'),
});

// Implementation will start the agent asynchronously and return immediately
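
A minimal sketch of what the "start asynchronously and return immediately" step might look like. It assumes a hypothetical `runAgent` function standing in for the toolAgent core, a trimmed `AgentState`, and a counter-based ID helper in place of uuid so the example has no dependencies; `startAgent` is also an illustrative name.

```typescript
// Sketch only: `startAgent`, `RunAgent`, and the trimmed AgentState are
// hypothetical stand-ins, not the project's real API.
type AgentStatus = 'running' | 'paused' | 'completed' | 'error';

type AgentState = {
  instanceId: string;
  goal: string;
  status: AgentStatus;
  result: string;
  error?: string;
  lastUpdated: number;
};

const agentStates: Map<string, AgentState> = new Map();

// Dependency-free ID helper; the plan above uses uuid v4 instead.
let nextAgentId = 0;
const createAgentInstanceId = (): string => `agent-${++nextAgentId}`;

// Placeholder for the real toolAgent run (assumption).
type RunAgent = (goal: string) => Promise<string>;

function startAgent(
  goal: string,
  runAgent: RunAgent,
): { instanceId: string; status: string } {
  const instanceId = createAgentInstanceId();
  const state: AgentState = {
    instanceId,
    goal,
    status: 'running',
    result: '',
    lastUpdated: Date.now(),
  };
  agentStates.set(instanceId, state);

  // Deliberately not awaited: the promise settles in the background and
  // records its outcome on the shared state object.
  void runAgent(goal).then(
    (result) => {
      state.status = 'completed';
      state.result = result;
      state.lastUpdated = Date.now();
    },
    (err: unknown) => {
      state.status = 'error';
      state.error = err instanceof Error ? err.message : String(err);
      state.lastUpdated = Date.now();
    },
  );

  return { instanceId, status: state.status };
}
```

Because both the success and failure handlers write to the state object, a later status query can report the outcome without the caller ever holding the promise.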

agentMessage.ts

import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { Tool } from '../../core/types.js';
import { agentStates } from './agentState.js';

const parameterSchema = z.object({
  instanceId: z.string().describe('The ID returned by agentStart'),
  instruction: z
    .string()
    .optional()
    .describe('Optional instruction or guidance to send to the agent'),
  action: z
    .enum(['get_status', 'pause', 'resume', 'terminate'])
    .optional()
    .describe('Action to perform on the agent'),
  description: z
    .string()
    .describe('The reason for this agent interaction (max 80 chars)'),
});

const returnSchema = z.object({
  status: z.string().describe('The current status of the agent'),
  result: z.string().describe('The current result or output from the agent'),
  error: z.string().optional().describe('Error message if any'),
});

// Implementation will handle bidirectional communication with the agent
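
The handler behind this schema could be sketched roughly as follows. This is an illustration under assumptions: `handleAgentMessage` and the `pendingInstructions` queue are hypothetical, and reporting a terminated agent as `completed` is a choice made here only because the proposed status enum has no dedicated "terminated" value.

```typescript
type AgentStatus = 'running' | 'paused' | 'completed' | 'error';
type AgentAction = 'get_status' | 'pause' | 'resume' | 'terminate';

type AgentState = {
  status: AgentStatus;
  result: string;
  error?: string;
  pendingInstructions: string[]; // hypothetical queue read by the agent loop
};

const agentStates = new Map<string, AgentState>();

type MessageParams = {
  instanceId: string;
  instruction?: string;
  action?: AgentAction;
};
type MessageResult = { status: string; result: string; error?: string };

function handleAgentMessage(params: MessageParams): MessageResult {
  const { instanceId, instruction, action } = params;
  const state = agentStates.get(instanceId);
  if (!state) {
    return { status: 'error', result: '', error: `No agent with ID ${instanceId}` };
  }

  const active = state.status === 'running' || state.status === 'paused';

  // Guidance is only queued while the agent can still act on it.
  if (instruction !== undefined && active) {
    state.pendingInstructions.push(instruction);
  }

  switch (action) {
    case 'pause':
      if (state.status === 'running') state.status = 'paused';
      break;
    case 'resume':
      if (state.status === 'paused') state.status = 'running';
      break;
    case 'terminate':
      // Assumption: a terminated agent is reported as completed and
      // dropped from the map so its resources can be released.
      if (active) state.status = 'completed';
      agentStates.delete(instanceId);
      break;
    default:
      // 'get_status' or no action: just report the current state.
      break;
  }

  return { status: state.status, result: state.result, error: state.error };
}
```

An unknown instance ID is returned as an error result rather than thrown, which keeps the response shape aligned with `returnSchema`.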

3. Execution Flow

  1. Agent Start Process:

    • Create a new agent instance ID
    • Initialize agent state in the global map
    • Start agent execution as a background asynchronous task, without awaiting it
    • Return the instance ID immediately
  2. Agent Message Process:

    • Get the agent state from the global map
    • If sending instruction: Add to agent's message queue
    • If requesting status: Return current state
    • If terminating: Clean up resources
  3. Background Agent Execution:

    • Run in an async manner without blocking
    • Update agent state as execution progresses
    • Handle completion and cleanup
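
The background execution step above can be sketched as a loop that checks the shared state between iterations. Everything here is illustrative: `runInBackground`, the `Step` callback (standing in for one LLM/tool iteration), the `terminated` flag, and the `pendingInstructions` queue are assumptions, and the default of 200 iterations is arbitrary.

```typescript
type AgentStatus = 'running' | 'paused' | 'completed' | 'error';

type LoopState = {
  status: AgentStatus;
  terminated: boolean;           // hypothetical flag set via agentMessage
  pendingInstructions: string[]; // hypothetical queue filled via agentMessage
  messages: string[];            // simplified conversation history
  result: string;
  error?: string;
};

// Stand-in for one agent iteration: returns a final answer, or undefined
// when more iterations are needed.
type Step = (messages: string[]) => Promise<string | undefined>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runInBackground(
  state: LoopState,
  step: Step,
  maxIterations: number = 200,
): Promise<void> {
  try {
    let iterations = 0;
    while (iterations < maxIterations && !state.terminated) {
      if (state.status === 'paused') {
        // Wait without consuming an iteration until resumed or terminated.
        await sleep(10);
        continue;
      }
      // Fold any queued guidance from the parent into the history.
      const queued = state.pendingInstructions.splice(0);
      state.messages.push(...queued);

      const answer = await step(state.messages);
      iterations++;
      if (answer !== undefined) {
        state.result = answer;
        break;
      }
    }
    state.status = 'completed';
  } catch (err) {
    state.status = 'error';
    state.error = err instanceof Error ? err.message : String(err);
  }
}
```

Checking the flags only between iterations means a pause or terminate request takes effect after the current step finishes, not in the middle of it.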

4. Testing Strategy

  1. Unit Tests:

    • Test agentStart and agentMessage independently
    • Mock the toolAgent execution for faster tests
  2. Integration Tests:

    • Test the full flow with actual agent execution
    • Test parallel execution of multiple agents
    • Test termination and cleanup
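
For the parallel-execution case, a mocked agent run can be driven by a deferred promise so the test controls exactly when each agent finishes. The sketch below is hypothetical throughout: `start`, `deferred`, and `testParallelAgents` are illustrative names, the registry is a stripped-down stand-in for the real state map, and plain `throw` checks are used here in place of a test runner.

```typescript
// Hypothetical minimal agent registry used only for this test sketch.
type Status = 'running' | 'completed' | 'error';
const states = new Map<string, { status: Status; result: string }>();

function start(id: string, run: () => Promise<string>): void {
  const state = { status: 'running' as Status, result: '' };
  states.set(id, state);
  void run().then(
    (result) => { state.status = 'completed'; state.result = result; },
    () => { state.status = 'error'; },
  );
}

// A deferred promise lets the test decide when each mocked agent finishes.
function deferred<T>() {
  let resolve!: (value: T) => void;
  let reject!: (reason: Error) => void;
  const promise = new Promise<T>((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}

// Lets already-queued promise callbacks run before the next check.
const flush = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

async function testParallelAgents(): Promise<void> {
  const a = deferred<string>();
  const b = deferred<string>();
  start('a', () => a.promise);
  start('b', () => b.promise);

  // Both agents are running before either mock resolves.
  if (states.get('a')?.status !== 'running') throw new Error('a should be running');
  if (states.get('b')?.status !== 'running') throw new Error('b should be running');

  // Finish b first: a must be unaffected.
  b.resolve('b done');
  await flush();
  if (states.get('b')?.status !== 'completed') throw new Error('b should be completed');
  if (states.get('a')?.status !== 'running') throw new Error('a should still be running');

  // Fail a: the error is recorded instead of escaping as a rejection.
  a.reject(new Error('a failed'));
  await flush();
  if (states.get('a')?.status !== 'error') throw new Error('a should be in error');
}
```

Because no real agent or timer-based work is involved, a test like this runs in milliseconds and does not depend on completion order by chance.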

5. Documentation Updates

  1. Tool Documentation:

    • Add detailed documentation for both tools
    • Provide examples of common usage patterns
  2. Migration Guide:

    • Explain differences between subAgent and agentStart/agentMessage
    • Provide guidance on when to use each approach

6. Timeline Estimate

  • Research and Design: 1 day
  • Implementation: 2-3 days
  • Testing: 1-2 days
  • Documentation: 1 day
  • Code Review and Fixes: 1-2 days

Total: 1-2 weeks depending on complexity and integration challenges

@bhouston

/mycoder, can you implement this as a PR?

I've implemented the agentStart and agentMessage tools as requested in this issue. The implementation follows the pattern of shellStart and shellMessage tools and maintains backward compatibility with the existing subAgent tool.

The PR is available here: #205

Key features implemented:

  • agentStart: Starts a sub-agent and returns an instance ID immediately
  • agentMessage: Allows interaction with a running sub-agent, including monitoring progress, providing guidance, and termination
  • Documentation explaining both approaches and how to use them
  • Tests for the new tools

The implementation allows for:

  • Running multiple sub-agents in parallel
  • Monitoring sub-agent progress
  • Providing guidance to sub-agents (limited implementation in current version)
  • Terminating sub-agents if needed

Let me know if you'd like any changes or have any questions about the implementation!

github-actions bot pushed a commit that referenced this issue Mar 12, 2025
# [mycoder-agent-v1.1.0](mycoder-agent-v1.0.0...mycoder-agent-v1.1.0) (2025-03-12)

### Bug Fixes

* convert absolute paths to relative paths in textEditor log output ([a5ea845](a5ea845))
* implement resource cleanup to prevent CLI hanging issue ([d33e729](d33e729)), closes [#141](#141)
* llm choice working well for openai, anthropic and ollama ([68d34ab](68d34ab))
* **openai:** add OpenAI dependency to agent package and enable provider in config ([30b0807](30b0807))
* replace @semantic-release/npm with @anolilab/semantic-release-pnpm to properly resolve workspace references ([bacb51f](bacb51f))
* up subagent iterations to 200 from 50 ([b405f1e](b405f1e))

### Features

* add agent tracking to background tools ([4a3bcc7](4a3bcc7))
* add Ollama configuration options ([d5c3a96](d5c3a96))
* **agent:** implement agentStart and agentMessage tools ([62f8df3](62f8df3)), closes [#111](#111) [#111](#111)
* allow textEditor to overwrite existing files with create command ([d1cde65](d1cde65)), closes [#192](#192)
* implement background tool tracking (issue [#112](#112)) ([b5bb489](b5bb489))
* implement Ollama provider for LLM abstraction ([597211b](597211b))
* **llm:** add OpenAI support to LLM abstraction ([7bda811](7bda811))
* **refactor:** agent ([a2f59c2](a2f59c2))

🎉 This issue has been resolved in version mycoder-agent-v1.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
