build

How Anthropic Improved Tool Selection with Lazy Loading

TRIGGER

Agent systems connected to multiple tool providers (MCP servers, APIs) were spending 50-100K+ tokens on tool definitions before any conversation began. That left too little of the context window for actual work, and tool selection accuracy degraded because similarly named tools were easy to confuse.

APPROACH

Anthropic built a Tool Search Tool. Tools marked with `defer_loading: true` are excluded from the initial context. When Claude needs a capability, it searches tool names and descriptions (regex, BM25, or embeddings), and only the matching tools (~3-5) are expanded into full definitions. Input: a search query such as 'github'. Output: references to the matching tools, which are loaded on demand. A 58-tool setup dropped from ~55K tokens upfront to ~500 tokens (the search tool only) plus ~3K once tools are discovered. Reported results: 85% token reduction; accuracy on MCP evaluations rose from 49% to 74% (Opus 4) and from 79.5% to 88.1% (Opus 4.5).
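The flow above can be simulated locally to make the mechanics concrete. This is a minimal sketch, not Anthropic's implementation: only the `defer_loading` flag comes from the description above, while the `Tool` class, `initial_context`, `search_tools`, the tool names, and the token counts are hypothetical and chosen for illustration. It uses the regex variant of search.

```python
import re
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    description: str
    schema_tokens: int          # illustrative size of the full definition
    defer_loading: bool = True  # True = keep out of the initial context


def initial_context(tools):
    """Tools whose full definitions are sent before the conversation starts."""
    return [t for t in tools if not t.defer_loading]


def search_tools(tools, query, limit=5):
    """Regex search over the names and descriptions of deferred tools."""
    pattern = re.compile(query, re.IGNORECASE)
    hits = [
        t for t in tools
        if t.defer_loading and pattern.search(f"{t.name} {t.description}")
    ]
    return hits[:limit]


# Hypothetical catalog: one always-loaded search tool plus deferred tools.
TOOLS = [
    Tool("tool_search", "Search the tool catalog", 500, defer_loading=False),
    Tool("github_create_pull_request", "Open a pull request on GitHub", 900),
    Tool("github_list_issues", "List issues in a GitHub repository", 800),
    Tool("slack_send_message", "Post a message to a Slack channel", 700),
    Tool("sentry_get_error", "Fetch an error event from Sentry", 600),
]

upfront = sum(t.schema_tokens for t in initial_context(TOOLS))
matches = search_tools(TOOLS, "github")
loaded = sum(t.schema_tokens for t in matches)
print(upfront, [t.name for t in matches], loaded)
```

In this toy catalog only the search tool costs tokens upfront; the two GitHub definitions are paid for only after the query matches them, and the Slack and Sentry tools never enter the context.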

PATTERN

An agent that discovers ~5 relevant tools on demand outperforms one with 50+ loaded upfront. Similar names cause confusion, and too many options degrade selection. In the reported MCP evaluations, lazy loading beat the fully loaded toolset by 25 percentage points on Opus 4 (49% to 74%) and by 8.6 points on Opus 4.5 (79.5% to 88.1%).

WORKS WHEN

  • Tool definitions exceed 10K tokens total
  • Tool library contains 10+ tools with overlapping names or functions
  • Building MCP-powered systems connecting multiple servers
  • Most tools are used infrequently: only 3-5 are needed per typical task
  • Prompt caching is available (deferred tools don't break cache since they're excluded entirely)

FAILS WHEN

  • Tool library is small (<10 tools), so discovery overhead exceeds savings
  • All tools are used frequently in every session (no sparse access pattern)
  • Tool definitions are already compact (<10K tokens total)
  • Search latency is unacceptable for the use case (adds one inference step)
  • Tools have poor names/descriptions that won't match reasonable search queries
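
The numeric conditions in the two lists above can be folded into a quick pre-check. This is a rough sketch under stated assumptions: the function name `lazy_loading_recommended` and its parameters are hypothetical, the thresholds are the ones listed above (10 tools, 10K tokens), and the qualitative conditions (search latency, description quality, prompt caching) are not modeled.

```python
def lazy_loading_recommended(total_definition_tokens, tool_count, tools_used_per_task):
    """Return (recommended, reasons_against) using the thresholds listed above."""
    reasons_against = []
    if tool_count < 10:
        reasons_against.append("fewer than 10 tools")
    if total_definition_tokens < 10_000:
        reasons_against.append("definitions under 10K tokens")
    if tools_used_per_task >= tool_count:
        reasons_against.append("every tool is used in each session")
    return (not reasons_against, reasons_against)


# The 58-tool, ~55K-token setup from the approach section, using ~4 tools per task.
print(lazy_loading_recommended(55_000, 58, 4))
# A small, compact toolset where every tool is used each session.
print(lazy_loading_recommended(6_000, 8, 8))
```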

Stage

build

From

November 2025
