Anthropic's Three-Tier Verification System for Agent Self-Correction
TRIGGER
Agents make mistakes that compound over multiple steps—a small error early in execution cascades into larger failures, and without feedback mechanisms, agents cannot self-correct before delivering flawed outputs.
APPROACH
Claude Agent SDK implements a three-tier verification system:
- Rules-based feedback via linting; TypeScript is preferred over JavaScript because type errors provide more feedback signals.
- Visual feedback via screenshots for UI/formatting tasks: checking layout, styling, content hierarchy, responsiveness.
- LLM-as-judge for fuzzy criteria such as tone matching.
The agent loop becomes: gather context → take action → verify against the appropriate feedback type → iterate if needed. Example: an email agent validates address format (rules), screenshots HTML-formatted emails for a visual check, and uses a separate subagent to judge tone consistency against the user's previous messages.
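The loop above can be sketched in a few lines. This is a hedged illustration, not Claude Agent SDK code: `run_with_verification` and the three checker functions are hypothetical names, and the visual and judge tiers are stubbed with cheap heuristics standing in for a real screenshot inspection and a real LLM judge.

```python
# Illustrative sketch of the gather -> act -> verify -> iterate loop.
# All names here are assumptions for the example, not SDK APIs.
import re

def rules_check(output: dict) -> list[str]:
    """Tier 1: deterministic rules (e.g. address format). Cheapest, run first."""
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", output["to"]):
        return ["invalid recipient address format"]
    return []

def visual_check(output: dict) -> list[str]:
    """Tier 2: visual feedback. Stands in for screenshotting the rendered email."""
    return [] if "<body>" in output["html"] else ["rendered email has no <body>"]

def judge_check(output: dict) -> list[str]:
    """Tier 3: LLM-as-judge for fuzzy criteria (tone). Stubbed as a flag here."""
    return [] if output.get("tone_ok", True) else ["tone mismatch with prior messages"]

def run_with_verification(produce, max_iters: int = 3):
    """Act, verify against all three tiers, feed issues back, repeat."""
    feedback: list[str] = []
    for _ in range(max_iters):
        output = produce(feedback)  # the agent acts, seeing prior feedback
        feedback = rules_check(output) + visual_check(output) + judge_check(output)
        if not feedback:
            return output, []       # all tiers clean: deliver
    return output, feedback         # iteration budget spent: surface remaining issues
```

In a real agent, `produce` would be a closure that calls the model with the accumulated feedback appended to its context, so each retry sees what failed last time.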
PATTERN
Output that passes validation but "looks wrong" is the single-layer verification trap: the linter catches syntax errors but misses the broken layout. Errors manifest across dimensions: structural (use deterministic checks), visual (use screenshots), subjective (use an LLM judge). Each layer catches what the others miss.
✓ WORKS WHEN
- Task has clear correctness criteria that can be expressed as rules (valid email format, code compiles)
- Output has visual component where screenshot comparison is meaningful
- Multiple verification types can run in parallel to minimize latency impact
- Iteration budget allows for 2-3 correction cycles before timeout
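The parallelism point above is worth making concrete: because the three tiers are independent, fanning them out means the added latency is roughly the slowest single check rather than the sum of all three. A minimal sketch, assuming each check is a callable returning a list of issue strings (`verify_parallel` is an illustrative helper, not an SDK function):

```python
# Sketch: run independent verification tiers concurrently so total
# verification latency ~= the slowest check, not the sum of all checks.
from concurrent.futures import ThreadPoolExecutor

def verify_parallel(output, checks):
    """Fan out every check at once; flatten their issue lists in order."""
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        results = pool.map(lambda check: check(output), checks)
    return [issue for issues in results for issue in issues]
```

Threads suit this shape because screenshot capture and LLM-judge calls are I/O-bound; CPU-bound linting could go in a process pool instead.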
✗ FAILS WHEN
- Verification latency exceeds 30% of the total task latency budget and iteration isn't possible
- Correctness criteria are entirely subjective with no rule-based component
- LLM-as-judge produces inconsistent results that cause infinite correction loops
- Task is write-once with no opportunity to iterate on feedback
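The infinite-correction-loop failure mode has a cheap mitigation: track the feedback from each pass and stop when it repeats or stops shrinking, rather than trusting an inconsistent judge indefinitely. A sketch of such a loop guard (`should_stop` is a hypothetical helper, not an SDK feature):

```python
# Loop guard for an inconsistent LLM-as-judge: halt iteration when
# feedback is clean, repeats verbatim, or stops shrinking between passes.
def should_stop(history: list[list[str]]) -> bool:
    """history holds the feedback list from each verification pass, oldest first."""
    if not history or not history[-1]:
        return True   # clean pass: done
    if len(history) >= 2 and set(history[-1]) == set(history[-2]):
        return True   # same complaints twice in a row: correction is not converging
    if len(history) >= 2 and len(history[-1]) >= len(history[-2]):
        return True   # issue count not shrinking: bail before looping forever
    return False
```

Combined with a hard iteration cap, this bounds the cost of a flaky judge to one wasted retry instead of an unbounded loop.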