How Anthropic Compacts Context with Selective Tool Result Pruning
TRIGGER
Agents working on long-horizon tasks (large codebase migrations, multi-hour research) accumulated more context than the window could hold, and performance degraded before the task completed. Simply truncating old messages discarded critical architectural decisions and lost track of unresolved bugs.
APPROACH
Claude Code implements compaction by passing the message history to the model for summarization when context utilization exceeds 70-80%. The summary preserves architectural decisions, unresolved bugs, and implementation details while discarding redundant tool outputs. Input: a conversation nearing the context limit. Output: a compressed context summary plus the five most recently accessed files. The team tuned the compaction prompt on complex agent traces, first maximizing recall to capture every relevant piece of information, then iterating to improve precision by eliminating superfluous content. A lightweight variant, tool result clearing, removes raw tool outputs deep in the history, on the premise that the agent has already internalized their insights; it is now available as a platform feature on the Claude Developer Platform. Compaction enables Claude Code to work on multi-hour tasks such as large codebase migrations without running out of context.
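The two mechanisms described above can be sketched roughly as follows. This is a minimal illustration, not Claude Code's implementation or the Claude Developer Platform API. The `Message` type, the `CLEARED` placeholder, the 4-characters-per-token estimate, the 0.75 threshold (a midpoint of the 70-80% range), and the caller-supplied `summarize` function (standing in for a model call) are all assumptions. Trying tool result clearing before full summarization is also a design choice of this sketch, and it keeps the last few messages after the summary rather than the five most recently accessed files.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

CLEARED = "[tool result cleared]"  # hypothetical placeholder text


@dataclass(frozen=True)
class Message:
    role: str      # "user", "assistant", or "tool"
    content: str


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: assume ~4 characters per token."""
    return max(1, len(text) // 4)


def utilization(messages: List[Message], window_tokens: int) -> float:
    """Fraction of the context window the history currently occupies."""
    return sum(estimate_tokens(m.content) for m in messages) / window_tokens


def clear_old_tool_results(messages: List[Message], keep_recent: int = 3) -> List[Message]:
    """Replace all but the newest `keep_recent` tool outputs with a placeholder."""
    tool_idx = [i for i, m in enumerate(messages) if m.role == "tool"]
    stale = set(tool_idx[:-keep_recent]) if keep_recent > 0 else set(tool_idx)
    return [replace(m, content=CLEARED) if i in stale else m
            for i, m in enumerate(messages)]


def compact(messages: List[Message], window_tokens: int,
            summarize: Callable[[List[Message]], str],
            threshold: float = 0.75, keep_recent: int = 3) -> List[Message]:
    """Try the cheap step first; summarize only if the history is still too large."""
    if utilization(messages, window_tokens) < threshold:
        return messages                      # far from the limit: do nothing
    pruned = clear_old_tool_results(messages, keep_recent)
    if utilization(pruned, window_tokens) < threshold:
        return pruned                        # clearing alone was enough
    summary = summarize(pruned)              # full compaction via the model
    return [Message("user", summary)] + pruned[-keep_recent:]
```

In this sketch the agent's own notes (assistant messages) survive clearing untouched, which is what makes dropping the raw outputs safe: the decisions derived from a tool result remain in context even after the result itself is gone.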
PATTERN
“Keeping ‘all tool outputs just in case’ actively harms long-running agents: the model re-processes redundant data it won't reuse, and the context fills before the task completes. Clear tool results once they have been internalized; they are high signal when returned and near zero later.”
✓ WORKS WHEN
- Tasks span tens of minutes to multiple hours of continuous work
- Tool outputs are large relative to the insights extracted from them (database queries, file reads, API responses)
- Agent's subsequent reasoning already incorporates tool results into decisions or notes
- Context window utilization exceeds 70-80% and performance degradation is observable
- Task has clear milestones where state can be checkpointed
✗ FAILS WHEN
- Agent needs to re-examine raw tool outputs for verification or error recovery
- Task requires precise citation or audit trail back to original data
- Tool results contain structured data the agent references repeatedly (lookup tables, schemas)
- Context window is far from limits and compaction overhead exceeds its benefit
- Compaction prompt itself consumes significant tokens relative to savings
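The last two failure conditions amount to a cost-benefit check, which can be roughed out with simple arithmetic. The model below is an assumption made for illustration, not something the source specifies: it treats the compaction call as a one-time cost (reading the history and the compaction prompt, then writing the summary), treats the benefit as the tokens no longer re-processed on each later turn, weighs input and output tokens equally, and ignores prompt caching. The function name and parameters are hypothetical.

```python
import math
from typing import Optional


def break_even_turns(history_tokens: int, summary_tokens: int,
                     prompt_tokens: int) -> Optional[int]:
    """Turns after which compaction has paid for itself, or None if it never does."""
    per_turn_saving = history_tokens - summary_tokens
    if per_turn_saving <= 0:
        return None  # the summary is no smaller than the history it replaces
    # One-time cost: read history + compaction prompt, then generate the summary.
    one_time_cost = history_tokens + prompt_tokens + summary_tokens
    return math.ceil(one_time_cost / per_turn_saving)


# Large history, small summary: pays off within a couple of turns.
print(break_even_turns(120_000, 4_000, 1_000))   # 2
# Small history: the overhead takes several turns to recover.
print(break_even_turns(6_000, 4_000, 1_000))     # 6
```

Under these assumptions, compaction is worth running only when the task is expected to continue for more turns than the break-even count, which is why it pays off near the context limit and not when the window is mostly empty.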