Why Anthropic Chose Scripts Over Prompts for Deterministic Operations
TRIGGER
Agents using LLMs for tasks that require deterministic reliability or computational efficiency—like parsing structured data, sorting lists, or extracting form fields—produce inconsistent results and waste tokens on operations that traditional code handles better.
APPROACH
Anthropic's PDF skill bundles a Python script that extracts form fields from PDFs. Rather than loading the PDF into context and having Claude parse it via token generation, Claude executes the script directly. The script runs outside the context window and returns structured results. Input: a PDF file path passed to the bundled script. Output: extracted form fields as structured data, with neither the PDF content nor the script code loaded into context.
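Anthropic's actual script is not reproduced here. As a minimal stdlib-only sketch of the same contract (file path in, structured fields out as JSON on stdout), the hypothetical extractor below parses a plain-text `Name: value` form instead of a real PDF; a real implementation might use a library such as pypdf's `PdfReader.get_fields()`:

```python
import json
import re
import sys
from pathlib import Path

# Hypothetical stand-in for the bundled PDF extractor: same shape
# (file path in, structured fields out), but it parses a plain-text
# "Name: value" form so the sketch stays stdlib-only.
FIELD_RE = re.compile(r"^(?P<name>[A-Za-z_][\w ]*):\s*(?P<value>.*)$")

def extract_fields(path: str) -> dict[str, str]:
    """Deterministically extract name/value pairs from a form file."""
    fields: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        m = FIELD_RE.match(line.strip())
        if m:
            fields[m.group("name").strip()] = m.group("value").strip()
    return fields

if __name__ == "__main__" and len(sys.argv) > 1:
    # Invoked as a subprocess by the agent; only this JSON line
    # (never the file content or this code) reaches the context window.
    print(json.dumps(extract_fields(sys.argv[1])))
```

Because the extraction is a pure function of the file's bytes, running it twice on the same input always yields the same fields — the deterministic guarantee that token-by-token generation cannot offer.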
PATTERN
The failure mode is the "LLM-for-everything trap": prompting the model to extract form fields when a 20-line script handles the job deterministically, producing inconsistent parses and wasting tokens. Agents don't need to understand code to use it; they need to know when to invoke it. Bundle scripts for mechanical operations; reserve tokens for judgment calls.
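On the agent side, invoking a bundled script reduces to a plain subprocess call. A minimal sketch, assuming a script that takes a file path as its argument and prints JSON to stdout (this interface is an assumption for illustration, not Anthropic's published one):

```python
import json
import subprocess
import sys

def run_bundled_script(script_path: str, input_path: str) -> dict:
    """Run a bundled extractor out-of-context and return its JSON result.

    The agent never loads the script's code or the input file into its
    context window; only the structured stdout comes back.
    """
    proc = subprocess.run(
        [sys.executable, script_path, input_path],
        capture_output=True,
        text=True,
        check=True,  # surface a non-zero exit as an exception
    )
    return json.loads(proc.stdout)
```

The agent's only real decision is dispatch: mechanical, well-specified extraction goes to the script; interpretation and judgment stay with the model.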
✓ WORKS WHEN
- Operation has well-defined inputs and outputs that can be scripted
- Task requires deterministic, repeatable results (parsing, validation, computation)
- Processing would require loading large files into context (PDFs, images, datasets)
- Agent has code execution capabilities in its environment
- Script reliability exceeds what LLM generation could achieve for the task
✗ FAILS WHEN
- Operation requires judgment or interpretation that code can't capture
- Input formats vary unpredictably and need LLM flexibility to handle
- Agent runs in sandboxed environment without code execution
- Script maintenance burden exceeds the cost of LLM-based processing
- Task benefits from LLM's ability to handle malformed or unexpected inputs gracefully