How Anthropic Reduced Tool Definition Tokens by 98%
TRIGGER
Agents connected to hundreds or thousands of MCP tools were consuming excessive context by loading all tool definitions upfront: 150,000+ tokens of tool descriptions processed before the agent even read the user's request, with latency and cost growing as the tool ecosystem grew.
APPROACH
Generate a filesystem representation of MCP tools in which each server becomes a directory and each tool becomes a TypeScript file containing the interface definition and a wrapper function. Input: connected MCP servers with tool definitions. Output: a file tree like `./servers/google-drive/getDocument.ts` that agents navigate on demand. The agent discovers tools by listing directories and reading only the specific tool files needed for the current task. Reported reduction: from 150,000 tokens to 2,000 tokens (98.7% savings) for tool-definition loading.
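A generated tool file might look like the sketch below. This is a hypothetical rendering of `./servers/google-drive/getDocument.ts`; the interface names, the `callMCPTool` bridge, and the result shape are all assumptions standing in for whatever the real generator and MCP client produce.

```typescript
// Hypothetical generated file: ./servers/google-drive/getDocument.ts
// Names and shapes are illustrative, not Anthropic's actual generated code.

// Stub standing in for the bridge that forwards calls to the MCP server.
async function callMCPTool<T>(
  server: string,
  tool: string,
  input: unknown,
): Promise<T> {
  // A real bridge would send `input` to `tool` on `server` over MCP;
  // this stub returns a canned result so the sketch is self-contained.
  return { content: `(stub result from ${server}/${tool})` } as unknown as T;
}

// Input/output interfaces generated from the tool's JSON Schema.
export interface GetDocumentInput {
  documentId: string;
}

export interface GetDocumentOutput {
  content: string;
}

// Thin wrapper the agent imports only when this specific tool is needed;
// until then, none of its tokens occupy the context window.
export async function getDocument(
  input: GetDocumentInput,
): Promise<GetDocumentOutput> {
  return callMCPTool<GetDocumentOutput>("google-drive", "getDocument", input);
}
```

Because each tool lives in its own file, the agent pays the token cost of a definition only at the moment it imports that file.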
PATTERN
“Tool definitions consumed before the agent reads the user request. Generate a filesystem of tool wrappers; the agent lists directories and reads what it needs. Lazy discovery beats upfront enumeration.”
✓ WORKS WHEN
- Agent has access to 50+ tools across multiple MCP servers where loading all definitions exceeds 50k tokens
- Tasks typically use 2-5 tools from a much larger available set
- Agent has filesystem access and can execute TypeScript/JavaScript code
- Tool interfaces are stable enough to generate wrapper files
- Search/navigation is faster than reading all definitions (tool count > ~20)
✗ FAILS WHEN
- Total tool definitions fit comfortably in context (<10k tokens, roughly 20-30 tools)
- Agent lacks code execution environment or filesystem access
- Most tasks require most available tools, making on-demand loading overhead net-negative
- Tool definitions change frequently, requiring constant regeneration of the filesystem
- Latency of filesystem operations exceeds latency savings from reduced token processing