
Anthropic's Test-Driven Iteration Loop for Agentic Coding

TRIGGER

AI code generation produces plausible-looking code that may not actually work—without a feedback signal, the agent can't distinguish between code that compiles and code that behaves correctly, leading to subtle bugs that surface later.

APPROACH

Anthropic's internal workflow:

  1. Ask Claude to write tests based on expected input/output pairs, explicitly stating you're doing TDD so it avoids creating mock implementations.
  2. Tell Claude to run the tests and confirm they fail; explicitly forbid implementation code at this stage.
  3. Commit the tests when satisfied.
  4. Ask Claude to write code that passes the tests without modifying them, instructing it to keep going until all tests pass. It iterates: write code → run tests → adjust code → run tests again.
  5. Optionally use independent subagents to verify the implementation isn't overfitting to the tests.
  6. Commit the code when satisfied.
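A minimal sketch of steps 1 and 4 in one file. The function `slugify` and its spec are illustrative assumptions, not from Anthropic's write-up; in the real workflow the test block is written and committed first, and fails until the implementation below it exists.

```python
# Hypothetical example: the `slugify` spec is invented for illustration.
import re

# --- Step 1: tests derived from expected input/output pairs. ---
# Committed before any implementation exists (step 3); at that point
# each test fails, which is exactly the signal the loop needs.
def test_lowercases():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("Rock & Roll!") == "rock-roll"

def test_collapses_whitespace():
    assert slugify("  a   b  ") == "a-b"

# --- Step 4: implementation the agent revises until all tests pass. ---
def slugify(text: str) -> str:
    """Lowercase, keep alphanumeric runs, join them with single hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))

if __name__ == "__main__":
    for test in (test_lowercases, test_strips_punctuation,
                 test_collapses_whitespace):
        test()
    print("all tests pass")
```

The point of writing the assertions first is that they pin down concrete behavior ("Rock & Roll!" → "rock-roll") rather than letting the implementation define its own success criteria.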

PATTERN

The agent has no internal signal for "this works" versus "this compiles." Write failing tests first, then instruct the agent to iterate until green. Binary pass/fail turns code generation from "get it right once" into "search until verified."
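The "search until verified" loop above can be sketched as a small harness. This is an assumption-laden illustration, not Anthropic's tooling: `revise` stands in for a call to the coding agent, and `run_pytest` shows one concrete way to get the binary pass/fail signal.

```python
# Sketch of the outer iteration loop: the only signal the agent gets is
# a binary pass/fail from the test runner. `revise` is a hypothetical
# stand-in for sending failure output back to the coding agent.
import subprocess

def run_pytest():
    """Run the committed (frozen) test suite; return (passed, output)."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def iterate_until_green(run_tests, revise, max_rounds: int = 10) -> bool:
    """Loop: run tests -> if red, ask the agent to adjust the code -> repeat.
    The tests themselves are never modified inside this loop."""
    for _ in range(max_rounds):
        passed, output = run_tests()
        if passed:          # binary signal: all tests green
            return True
        revise(output)      # agent edits implementation only
    return False            # still red after max_rounds: escalate to a human
```

A round cap matters in practice: without it, an agent stuck overfitting to one failing test can burn iterations indefinitely instead of surfacing the problem.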

WORKS WHEN

  • Behavior is easily verifiable with unit, integration, or end-to-end tests
  • Expected input/output pairs can be defined upfront before implementation
  • Test execution is fast enough to support multiple iteration cycles
  • The problem domain has deterministic expected outputs (not subjective quality)

FAILS WHEN

  • Correct behavior is subjective or requires human judgment (UX, copy, design)
  • Test setup requires extensive mocking of systems the agent can't access
  • Expected outputs aren't known upfront and emerge during implementation
  • Tests themselves are the uncertain part (unclear requirements, exploratory work)

Stage

build

From

April 2025
