Anthropic's Three-Tier Verification System for Agent Self-Correction
TRIGGER
Agents make mistakes that compound over multiple steps—a small error early in execution cascades into larger failures, and without feedback mechanisms, agents cannot self-correct before delivering flawed outputs.
APPROACH
Claude Agent SDK implements a three-tier verification system:
- Rules-based feedback via linting; TypeScript is preferred over JavaScript because type errors provide more feedback signals.
- Visual feedback via screenshots for UI/formatting tasks: checking layout, styling, content hierarchy, responsiveness.
- LLM-as-judge for fuzzy criteria such as tone matching.
The agent loop becomes: gather context → take action → verify against the appropriate feedback type → iterate if needed. Example: an email agent validates address format (rules), screenshots HTML-formatted emails for a visual check, and uses a separate subagent to judge tone consistency against the user's previous messages.
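The loop above can be sketched in a few lines. This is a hedged illustration, not Claude Agent SDK code: `run_with_verification` and the three checker functions are hypothetical names, and the visual and judge tiers are stubbed with cheap heuristics standing in for a real screenshot inspection and a real LLM judge.

```python
# Illustrative sketch of the gather -> act -> verify -> iterate loop.
# All names here are assumptions for the example, not SDK APIs.
import re

def rules_check(output: dict) -> list[str]:
    """Tier 1: deterministic rules (e.g. address format). Cheapest, run first."""
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", output["to"]):
        return ["invalid recipient address format"]
    return []

def visual_check(output: dict) -> list[str]:
    """Tier 2: visual feedback. Stands in for screenshotting the rendered email."""
    return [] if "<body>" in output["html"] else ["rendered email has no <body>"]

def judge_check(output: dict) -> list[str]:
    """Tier 3: LLM-as-judge for fuzzy criteria (tone). Stubbed as a flag here."""
    return [] if output.get("tone_ok", True) else ["tone mismatch with prior messages"]

def run_with_verification(produce, max_iters: int = 3):
    """Act, verify against all three tiers, feed issues back, repeat."""
    feedback: list[str] = []
    for _ in range(max_iters):
        output = produce(feedback)  # the agent acts, seeing prior feedback
        feedback = rules_check(output) + visual_check(output) + judge_check(output)
        if not feedback:
            return output, []       # all tiers clean: deliver
    return output, feedback         # iteration budget spent: surface remaining issues
```

In a real agent, `produce` would be a closure that calls the model with the accumulated feedback appended to its context, so each retry sees what failed last time.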
PATTERN
Output that passes validation but "looks wrong" is the single-layer verification trap: the linter catches syntax errors but misses the broken layout. Errors manifest across dimensions: structural (use deterministic checks), visual (use screenshots), subjective (use an LLM judge). Each layer catches what the others miss.
✓ WORKS WHEN
- Task has clear correctness criteria that can be expressed as rules (valid email format, code compiles)
- Output has visual component where screenshot comparison is meaningful
- Multiple verification types can run in parallel to minimize latency impact
- Iteration budget allows for 2-3 correction cycles before timeout
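The parallelism point above is worth making concrete: because the three tiers are independent, fanning them out means the added latency is roughly the slowest single check rather than the sum of all three. A minimal sketch, assuming each check is a callable returning a list of issue strings (`verify_parallel` is an illustrative helper, not an SDK function):

```python
# Sketch: run independent verification tiers concurrently so total
# verification latency ~= the slowest check, not the sum of all checks.
from concurrent.futures import ThreadPoolExecutor

def verify_parallel(output, checks):
    """Fan out every check at once; flatten their issue lists in order."""
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        results = pool.map(lambda check: check(output), checks)
    return [issue for issues in results for issue in issues]
```

Threads suit this shape because screenshot capture and LLM-judge calls are I/O-bound; CPU-bound linting could go in a process pool instead.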
✗ FAILS WHEN
- Verification latency exceeds 30% of the total task latency budget and iteration isn't possible
- Correctness criteria are entirely subjective with no rule-based component
- LLM-as-judge produces inconsistent results that cause infinite correction loops
- Task is write-once with no opportunity to iterate on feedback
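The infinite-correction-loop failure mode has a cheap mitigation: track the feedback from each pass and stop when it repeats or stops shrinking, rather than trusting an inconsistent judge indefinitely. A sketch of such a loop guard (`should_stop` is a hypothetical helper, not an SDK feature):

```python
# Loop guard for an inconsistent LLM-as-judge: halt iteration when
# feedback is clean, repeats verbatim, or stops shrinking between passes.
def should_stop(history: list[list[str]]) -> bool:
    """history holds the feedback list from each verification pass, oldest first."""
    if not history or not history[-1]:
        return True   # clean pass: done
    if len(history) >= 2 and set(history[-1]) == set(history[-2]):
        return True   # same complaints twice in a row: correction is not converging
    if len(history) >= 2 and len(history[-1]) >= len(history[-2]):
        return True   # issue count not shrinking: bail before looping forever
    return False
```

Combined with a hard iteration cap, this bounds the cost of a flaky judge to one wasted retry instead of an unbounded loop.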