
Anthropic's Query Complexity Routing Architecture

TRIGGER

A single LLM handling all queries forces a tradeoff between capability and cost—using a powerful model wastes resources on simple queries, while a cheap model fails on complex ones. Optimizing prompts for one query type often degrades performance on others.

APPROACH

Anthropic's teams and customers implement a routing workflow where a classifier (LLM or traditional ML) categorizes incoming requests and directs them to specialized handlers. Input: user query. Output: classification category plus response from the appropriate path. For customer support, queries route to distinct handlers: general questions, refund requests, and technical support each get their own optimized prompts and tools. For cost optimization, easy/common questions route to Claude Haiku (10x cheaper) while hard/unusual questions route to Claude Sonnet. Each downstream path can be independently optimized without degrading performance on other query types—avoiding the zero-sum tradeoff inherent in single-prompt approaches.
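The classify-then-dispatch workflow can be sketched as follows. This is a minimal illustration, not Anthropic's implementation: the keyword classifier is a stand-in for the LLM or traditional ML classifier the approach describes, and the category names, handler prompts, and model labels are hypothetical.

```python
from typing import Callable, Dict, Tuple

def classify(query: str) -> str:
    """Stand-in classifier. In practice this would be an LLM call or a
    trained model; keyword matching is used here only to keep the sketch
    self-contained and runnable."""
    q = query.lower()
    if "refund" in q or "charged" in q:
        return "refund"
    if "error" in q or "crash" in q:
        return "technical"
    return "general"

# Each category gets its own handler with an independently optimized
# prompt and model tier (cheap model for common questions, capable
# model for hard ones). Handlers here just tag the query for clarity.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "general":   lambda q: f"[cheap model, general-support prompt] {q}",
    "refund":    lambda q: f"[cheap model, refund-policy prompt] {q}",
    "technical": lambda q: f"[capable model, troubleshooting prompt] {q}",
}

def route(query: str) -> Tuple[str, str]:
    """Classify the query, then dispatch to the specialized handler.
    Returns (category, response), matching the input/output contract
    described above."""
    category = classify(query)
    return category, HANDLERS[category](query)
```

Because each handler owns its prompt and model choice, tuning the refund path cannot regress the technical-support path, which is the point of splitting them.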

PATTERN

Optimizing prompts for one query type actively degrades performance on others—you can't build one prompt that handles everything well. Route queries to specialized handlers so you can optimize each path independently without the zero-sum tradeoff.

WORKS WHEN

  • Queries fall into distinct categories with meaningfully different handling requirements (customer support: general questions vs refund requests vs technical issues)
  • Classification accuracy is high enough that misrouting costs don't exceed routing benefits (a threshold above 90% is typical)
  • Cost or latency difference between model tiers is significant (e.g., Haiku 10x cheaper than Sonnet)
  • Volume justifies the engineering investment in multiple specialized paths
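The accuracy and cost-ratio conditions above can be checked with simple expected-value arithmetic. The sketch below is an illustrative model, not a formula from the source: it assumes a misrouted easy query still succeeds on the big model (just wastefully), while a misrouted hard query fails on the cheap model and must be retried on the big one.

```python
def expected_cost(p_easy: float, c_cheap: float, c_big: float,
                  accuracy: float) -> float:
    """Expected per-query cost of a routed system.

    p_easy:   fraction of queries the cheap model can handle
    c_cheap:  cost of one cheap-model call (e.g. Haiku)
    c_big:    cost of one big-model call (e.g. Sonnet)
    accuracy: classifier accuracy, assumed equal for both classes
    """
    # Easy query: correctly routed -> cheap model; misrouted -> big
    # model, which still answers correctly but costs more.
    easy = p_easy * (accuracy * c_cheap + (1 - accuracy) * c_big)
    # Hard query: correctly routed -> big model; misrouted -> cheap
    # model fails, so we pay for it plus a retry on the big model.
    hard = (1 - p_easy) * (accuracy * c_big
                           + (1 - accuracy) * (c_cheap + c_big))
    return easy + hard

# With 70% easy queries, a 10x price gap (1 vs 10 cost units), and 90%
# classifier accuracy, routing costs 4.36 units per query versus 10 for
# sending everything to the big model.
cost = expected_cost(p_easy=0.7, c_cheap=1.0, c_big=10.0, accuracy=0.9)
```

Dropping accuracy or the easy-query fraction shrinks the gap, which is the quantitative form of the misrouting and volume caveats above.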

FAILS WHEN

  • Query types blend together without clear categorical boundaries
  • All queries genuinely require the same capability level
  • Classification latency exceeds the time saved by using smaller models
  • Low volume makes maintaining multiple specialized paths more expensive than using one capable model for everything

Stage

build

From

December 2024
