Why Anthropic Chose Scripts Over Prompts for Deterministic Operations
TRIGGER
Agents using LLMs for tasks that require deterministic reliability or computational efficiency—like parsing structured data, sorting lists, or extracting form fields—produce inconsistent results and waste tokens on operations that traditional code handles better.
APPROACH
Anthropic's PDF skill bundles a Python script that extracts form fields from PDFs. Rather than loading the PDF into context and having Claude parse it via token generation, Claude executes the script directly. The script runs outside the context window and returns structured results. Input: a PDF file path passed to the bundled script. Output: extracted form fields as structured data, with neither the PDF content nor the script code loaded into context.
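Anthropic's actual script is not reproduced here. As a minimal stdlib-only sketch of the same contract (file path in, structured fields out as JSON on stdout), the hypothetical extractor below parses a plain-text `Name: value` form instead of a real PDF; a real implementation might use a library such as pypdf's `PdfReader.get_fields()`:

```python
import json
import re
import sys
from pathlib import Path

# Hypothetical stand-in for the bundled PDF extractor: same shape
# (file path in, structured fields out), but it parses a plain-text
# "Name: value" form so the sketch stays stdlib-only.
FIELD_RE = re.compile(r"^(?P<name>[A-Za-z_][\w ]*):\s*(?P<value>.*)$")

def extract_fields(path: str) -> dict[str, str]:
    """Deterministically extract name/value pairs from a form file."""
    fields: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        m = FIELD_RE.match(line.strip())
        if m:
            fields[m.group("name").strip()] = m.group("value").strip()
    return fields

if __name__ == "__main__" and len(sys.argv) > 1:
    # Invoked as a subprocess by the agent; only this JSON line
    # (never the file content or this code) reaches the context window.
    print(json.dumps(extract_fields(sys.argv[1])))
```

Because the extraction is a pure function of the file's bytes, running it twice on the same input always yields the same fields — the deterministic guarantee that token-by-token generation cannot offer.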
PATTERN
The failure mode is the "LLM-for-everything trap": prompting the model to extract form fields when a 20-line script handles the job deterministically, producing inconsistent parses and wasting tokens. Agents don't need to understand code to use it; they need to know when to invoke it. Bundle scripts for mechanical operations; reserve tokens for judgment calls.
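On the agent side, invoking a bundled script reduces to a plain subprocess call. A minimal sketch, assuming a script that takes a file path as its argument and prints JSON to stdout (this interface is an assumption for illustration, not Anthropic's published one):

```python
import json
import subprocess
import sys

def run_bundled_script(script_path: str, input_path: str) -> dict:
    """Run a bundled extractor out-of-context and return its JSON result.

    The agent never loads the script's code or the input file into its
    context window; only the structured stdout comes back.
    """
    proc = subprocess.run(
        [sys.executable, script_path, input_path],
        capture_output=True,
        text=True,
        check=True,  # surface a non-zero exit as an exception
    )
    return json.loads(proc.stdout)
```

The agent's only real decision is dispatch: mechanical, well-specified extraction goes to the script; interpretation and judgment stay with the model.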
✓ WORKS WHEN
- Operation has well-defined inputs and outputs that can be scripted
- Task requires deterministic, repeatable results (parsing, validation, computation)
- Processing would require loading large files into context (PDFs, images, datasets)
- Agent has code execution capabilities in its environment
- Script reliability exceeds what LLM generation could achieve for the task
✗ FAILS WHEN
- Operation requires judgment or interpretation that code can't capture
- Input formats vary unpredictably and need LLM flexibility to handle
- Agent runs in sandboxed environment without code execution
- Script maintenance burden exceeds the cost of LLM-based processing
- Task benefits from LLM's ability to handle malformed or unexpected inputs gracefully