How Anthropic Writes Tool Descriptions as Mistake-Specific Guardrails
TRIGGER
Models were misusing tools in predictable ways—misunderstanding specs, using wrong parameter formats, or falling into common pitfalls—leading to failed tool calls and wasted turns in agentic loops.
APPROACH
Anthropic's SWE-bench team reported spending more time optimizing tool descriptions than the overall prompt, and their agent reached 49% on SWE-bench Verified, state of the art at the time. They tested each tool to uncover how the model misunderstood it, then edited the description to preempt those errors. Input: failure patterns observed in testing. Output: description text that explicitly prevents those failures. For the Bash tool, the description states that command contents don't need XML escaping, that there is no internet access, and that background commands are run with "&". For the Edit tool (str_replace_editor), they required absolute paths after observing the model mishandle relative paths once the agent had moved out of the root directory; this change alone eliminated an entire class of errors. Its string replacement command specifies that old_str must match exactly one location, and it returns a clear error message when there are zero or multiple matches.
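The guardrails described above can be sketched as a tool definition plus a handler that enforces the same rules. This is a minimal illustration, not Anthropic's actual implementation: the EDIT_TOOL dict, the str_replace handler, the example path, and the exact description and error wording are all assumptions made for this example.

```python
import os

# Hypothetical tool definition. The description wording is illustrative,
# written to preempt the two failure modes named in the text.
EDIT_TOOL = {
    "name": "str_replace_editor",
    "description": (
        "Replace one occurrence of old_str with new_str in a file.\n"
        "* path MUST be absolute (e.g. /repo/src/app.py); relative paths are rejected.\n"
        "* old_str MUST match exactly one location in the file. "
        "Include enough surrounding context to make it unique."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Absolute path to the file."},
            "old_str": {"type": "string"},
            "new_str": {"type": "string"},
        },
        "required": ["path", "old_str", "new_str"],
    },
}


def str_replace(path: str, old_str: str, new_str: str) -> str:
    """Apply the edit and return a message the model can act on."""
    # Guardrail 1: reject relative paths instead of guessing the working directory.
    if not os.path.isabs(path):
        return f"Error: {path!r} is not an absolute path. Retry with an absolute path."
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Guardrail 2: old_str must match exactly one location.
    count = text.count(old_str)
    if count == 0:
        return "Error: old_str was not found. It must match the file contents exactly."
    if count > 1:
        return f"Error: old_str matches {count} locations. Add context so it matches exactly one."
    with open(path, "w", encoding="utf-8") as f:
        f.write(text.replace(old_str, new_str, 1))
    return "Edit applied."
```

The description and the handler state the same constraints on purpose: the description tries to prevent the mistake, and the error message tells the model how to recover when it makes the mistake anyway.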
PATTERN
“Models misuse tools in predictable ways—wrong path formats, missing escapes, incorrect parameter types—and small error rates compound across agentic turns. Tool descriptions aren't documentation; they're guardrails. Run the tool 10+ times, observe failure modes, then encode "do not do X" in the description.”
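The "run, observe, encode" loop can be made concrete by tallying how tool calls fail across repeated trials. The following is a rough sketch, assuming each trial is logged as a dict with a hypothetical "error" field (None on success); the log format, the category names, and the tally_failures helper are inventions for this example.

```python
from collections import Counter


def tally_failures(trials: list[dict]) -> list[tuple[str, int]]:
    """Count failed tool calls by error category, most frequent first."""
    counts = Counter(t["error"] for t in trials if t.get("error") is not None)
    return counts.most_common()


# Hypothetical log from 10 runs of the same task (made-up data).
trials = [
    {"tool": "edit", "error": "relative_path"},
    {"tool": "edit", "error": None},
    {"tool": "bash", "error": "xml_escaped_command"},
    {"tool": "edit", "error": "relative_path"},
    {"tool": "bash", "error": None},
    {"tool": "edit", "error": None},
    {"tool": "edit", "error": "relative_path"},
    {"tool": "bash", "error": None},
    {"tool": "edit", "error": None},
    {"tool": "bash", "error": None},
]

for category, n in tally_failures(trials):
    print(f"{category}: {n}")
```

The most frequent category is the first candidate for a "do not do X" line in the description; after editing the description, rerunning the same trials shows whether that category actually disappeared.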
✓ WORKS WHEN
- You have the ability to iterate on tool descriptions based on observed model behavior
- Tools will be used across many agentic turns where small error rates compound
- Failure modes are consistent enough to be preempted with description changes
- The tool interface has inherent ambiguities (relative vs absolute paths, escaping rules)
- You're building for a specific model family whose error patterns you can characterize
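A back-of-the-envelope calculation shows why compounding matters in the conditions above. This sketch assumes every tool call fails independently with the same probability, which is a simplification; the error rates and the turn count are illustrative, not measured values.

```python
def p_clean_run(error_rate: float, turns: int) -> float:
    """Probability that every one of `turns` tool calls succeeds,
    assuming independent calls with a fixed per-call error rate."""
    return (1.0 - error_rate) ** turns


for rate in (0.05, 0.01):
    print(f"{rate:.0%} error rate, 30 turns: {p_clean_run(rate, 30):.0%} clean runs")
```

Under these assumptions, a 5% per-call error rate leaves roughly one in five 30-turn runs free of tool errors, while cutting the rate to 1% raises that to roughly three in four.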
✗ FAILS WHEN
- Tool usage is one-shot, so there is no opportunity to observe failures and iterate
- The model's error patterns are unpredictable or vary widely across invocations
- The tool interface is already unambiguous and self-documenting
- Description length is constrained and you can't fit the necessary detail
- You're building for multiple model families with different failure modes