build

Anthropic's Three-Tier Verification System for Agent Self-Correction

TRIGGER

Agents make mistakes that compound over multiple steps: a small error early in execution cascades into larger failures. Without a feedback mechanism, an agent cannot correct itself before delivering flawed output.

APPROACH

The Claude Agent SDK approach uses three tiers of verification: (1) rules-based feedback such as linting, with TypeScript preferred over JavaScript because type errors give the agent more feedback signals; (2) visual feedback via screenshots for UI and formatting tasks, checking layout, styling, content hierarchy, and responsiveness; (3) LLM-as-judge for fuzzy criteria such as tone matching. The agent loop becomes: gather context → take action → verify against the appropriate feedback type → iterate if needed. Example: an email agent validates the address format (rules), screenshots HTML-formatted emails for a visual check, and uses a separate subagent to judge whether the tone is consistent with the user's previous messages.
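The loop above can be sketched in a few lines of Python. This is a minimal illustration, not SDK code: `Feedback`, `check_recipient`, and `run_with_verification` are hypothetical names, and only the rules tier is implemented. A screenshot check or an LLM judge would plug in as further verifiers with the same signature.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Feedback:
    tier: str          # "rules", "visual", or "judge"
    passed: bool
    message: str = ""


# Hypothetical signatures for this sketch; these are not SDK types.
Verifier = Callable[[str], Feedback]
Action = Callable[[List[Feedback]], str]

# Deliberately rough format check, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_recipient(draft: str) -> Feedback:
    """Rules tier: the first line must look like 'To: <address>'."""
    lines = draft.splitlines()
    first_line = lines[0] if lines else ""
    address = first_line.partition("To:")[2].strip()
    ok = EMAIL_RE.match(address) is not None
    return Feedback("rules", ok, "" if ok else f"invalid recipient: {address!r}")


def run_with_verification(
    act: Action, verifiers: List[Verifier], max_iterations: int = 3
) -> Tuple[str, bool]:
    """Take action, verify, and iterate until every verifier passes or the budget runs out."""
    failures: List[Feedback] = []
    draft = ""
    for _ in range(max_iterations):
        # The action sees the failures from the previous round as context.
        draft = act(failures)
        results = [verify(draft) for verify in verifiers]
        failures = [fb for fb in results if not fb.passed]
        if not failures:
            return draft, True
    return draft, False
```

Here `act` stands in for the agent step that produces or revises a draft. The second return value reports whether the final draft passed every check within the iteration budget.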

PATTERN

The symptom is output that passes validation but "looks wrong": the "single-layer verification trap", where a linter catches syntax errors but misses a broken layout. Errors show up along three dimensions, each with its own kind of check: structural (deterministic checks), visual (screenshots), and subjective (an LLM judge). Each layer catches what the others miss.

WORKS WHEN

  • Task has clear correctness criteria that can be expressed as rules (valid email format, code compiles)
  • Output has visual component where screenshot comparison is meaningful
  • Multiple verification types can run in parallel to minimize latency impact
  • Iteration budget allows for 2-3 correction cycles before timeout
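Running the verification types in parallel, as the third condition suggests, could look like the sketch below. It assumes each check is an independent, thread-safe callable that takes the output and returns a pass/fail boolean. The function name `run_checks_in_parallel` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_checks_in_parallel(
    output: str, checks: Dict[str, Callable[[str], bool]]
) -> Dict[str, bool]:
    """Run every check on the same output concurrently and collect the verdicts."""
    # One worker per check, so wall-clock time is roughly that of the slowest
    # check rather than the sum of all of them.
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {name: pool.submit(check, output) for name, check in checks.items()}
        # result() re-raises any exception thrown inside a check.
        return {name: future.result() for name, future in futures.items()}
```

Threads suit checks that mostly wait on I/O, such as a screenshot service or a judge model call. CPU-bound checks would need processes instead.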

FAILS WHEN

  • Verification latency exceeds 30% of total task latency budget and iteration isn't possible
  • Correctness criteria are entirely subjective with no rule-based component
  • LLM-as-judge produces inconsistent results that cause infinite correction loops
  • Task is write-once with no opportunity to iterate on feedback
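Two of the failure modes above can be guarded against with small helpers. The sketch below is one possible mitigation, not something the source prescribes: `stable_verdict` damps an inconsistent judge by majority vote, and `verification_fits_budget` applies the 30% latency threshold from the list. Both names are hypothetical.

```python
from collections import Counter
from typing import Callable


def stable_verdict(judge: Callable[[str], bool], output: str, samples: int = 3) -> bool:
    """Ask the judge several times and return the majority verdict."""
    votes = Counter(judge(output) for _ in range(samples))
    # A tie (possible when samples is even) counts as a failure.
    return votes[True] > votes[False]


def verification_fits_budget(
    verification_seconds: float, task_budget_seconds: float, max_fraction: float = 0.30
) -> bool:
    """True if verification uses no more than max_fraction of the task's latency budget."""
    return verification_seconds <= max_fraction * task_budget_seconds
```

Majority voting multiplies judge cost by `samples`, so it trades latency for consistency. A hard cap on correction cycles is still needed to rule out infinite loops.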

Stage

build

From

September 2025
