What Linear Learned About Context Retrieval vs Model Capability

TRIGGER

Small models driven by rigid, pre-scoped prompts failed on nuanced triage decisions because they could work only with the information supplied upfront. If something important was missing from the initial context, they had no way to retrieve it, so fuzzy or complex issues were misclassified.

APPROACH

Linear's team initially used GPT-4o mini and Gemini 2.0 Flash with tightly scoped prompts for duplicate detection. When these failed on context-sensitive cases, the team switched to an agentic approach with larger models (GPT-5, Gemini 2.5 Pro) that could autonomously pull additional context from Linear's data. Input: the incoming issue plus candidate issues from semantic search. Output: a duplicate/related classification, property suggestions (labels, assignees), and an explanation of the reasoning. The agentic architecture also enabled user-defined 'Additional Guidance' prompts that steer decision-making per workspace or team.
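The input and output described above can be sketched as plain data types, together with one way the 'Additional Guidance' text might be folded into a prompt. This is a minimal illustration, not Linear's implementation: every name here (Issue, TriageRequest, TriageResult, build_prompt) and the prompt wording are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    id: str
    title: str


@dataclass
class TriageRequest:
    incoming: Issue
    candidates: List[Issue]        # assumed to come from semantic search
    additional_guidance: str = ""  # free text set per workspace or team


@dataclass
class TriageResult:
    relation: str                  # "duplicate", "related", or "none"
    target_id: Optional[str] = None
    suggested_labels: List[str] = field(default_factory=list)
    suggested_assignee: Optional[str] = None
    reasoning: str = ""


def build_prompt(req: TriageRequest) -> str:
    """Assemble a prompt; guidance is included only when a team has set it."""
    lines = ["Decide whether the incoming issue duplicates or relates to a candidate."]
    guidance = req.additional_guidance.strip()
    if guidance:
        lines.append(f"Team guidance: {guidance}")
    lines.append(f"Incoming [{req.incoming.id}]: {req.incoming.title}")
    for cand in req.candidates:
        lines.append(f"Candidate [{cand.id}]: {cand.title}")
    return "\n".join(lines)
```

Keeping guidance as a separate field, rather than baking it into a template, is what lets each workspace or team change it without an engineering change.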

PATTERN

When accuracy plateaus despite repeated prompt iteration, the bottleneck is likely context, not model capability. Small models confidently misclassify when they lack information a human would go and fetch. Giving the model agency to retrieve additional context unlocks quality improvements that no amount of prompt engineering on fixed inputs can achieve.
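The difference between fixed inputs and retrieval can be sketched as a small loop in which the model either answers or asks for more context. This is a hedged toy: run_agent, stub_model, the get_issue tool, and the ENG-7 record are all hypothetical, and the stub stands in for a real model call.

```python
from typing import Callable, Dict, List, Union

ToolCall = Dict[str, str]    # {"tool": name, "arg": value}
Step = Union[ToolCall, str]  # a tool request, or a final label


def run_agent(model: Callable[[List[str]], Step],
              tools: Dict[str, Callable[[str], str]],
              context: List[str],
              max_steps: int = 4) -> str:
    """Let the model fetch context until it commits to a label."""
    context = list(context)  # copy so the caller's list is not mutated
    for _ in range(max_steps):
        step = model(context)
        if isinstance(step, str):
            return step
        fetch = tools.get(step["tool"])
        result = fetch(step["arg"]) if fetch else "unknown tool"
        context.append(f"{step['tool']}({step['arg']}) -> {result}")
    return "uncertain"  # step budget exhausted without a decision


# Hypothetical backlog record and a stub "model" for illustration only.
BACKLOG = {"ENG-7": "Checkout button unresponsive on Safari; root cause: payment iframe"}


def stub_model(context: List[str]) -> Step:
    fetched = [line for line in context if line.startswith("get_issue(")]
    if not fetched:
        return {"tool": "get_issue", "arg": "ENG-7"}  # ask for missing detail
    return "duplicate" if "payment iframe" in fetched[-1] else "not_duplicate"


tools = {"get_issue": lambda issue_id: BACKLOG.get(issue_id, "not found")}
start = ["Incoming: payment iframe fails to load on Safari",
         "Candidate ENG-7: Checkout button unresponsive"]
label = run_agent(stub_model, tools, start)
```

With only the two starting lines, nothing links the incoming issue to ENG-7; the link appears only after the candidate's full record is fetched, which is the kind of detail a fixed prompt would have had to anticipate in advance.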

WORKS WHEN

  • Triage decisions require cross-referencing information scattered across the backlog (related issues, historical patterns, team conventions)
  • Issue complexity varies widely: some issues are straightforward duplicates, others require understanding project context or implicit relationships
  • Users need to customize AI behavior per workspace or team without engineering intervention
  • Latency tolerance is seconds rather than sub-second (users accept thinking time for better accuracy)
  • Existing backlog is large enough to serve as meaningful retrieval context (hundreds or more organized issues)

FAILS WHEN

  • All relevant context can be provided upfront in a single prompt (simple classification with known schema)
  • Sub-200ms response time is required and cannot be masked with progressive UI
  • Backlog is sparse, inconsistent, or newly created, so there are no historical organization patterns to draw on
  • Cost per triage decision must stay under small-model pricing (agentic calls with large models are 10-50x more expensive)
  • Triage rules are fully deterministic and don't benefit from fuzzy reasoning
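The conditions in the two lists above can be read as a rough routing check. The sketch below is one possible encoding, not a rule from Linear: TriageConstraints, choose_path, and the default thresholds (a 10x cost multiple taken from the low end of the range quoted above, and a 100-issue backlog floor standing in for "hundreds") are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class TriageConstraints:
    latency_budget_ms: int         # how long a user will wait for a result
    max_cost_multiple: float       # allowed cost relative to a small-model call
    context_fits_upfront: bool     # all relevant context fits in one prompt
    rules_are_deterministic: bool  # triage is fully rule-driven
    backlog_size: int              # number of organized issues available


def choose_path(c: TriageConstraints,
                agentic_cost_multiple: float = 10.0,
                min_backlog: int = 100) -> str:
    """Return "rules", "fixed_prompt", or "agentic" for a given workload."""
    if c.rules_are_deterministic:
        return "rules"         # no fuzzy reasoning needed
    if c.context_fits_upfront:
        return "fixed_prompt"  # nothing left to retrieve
    if c.latency_budget_ms < 200:
        return "fixed_prompt"  # no room for multi-step retrieval
    if c.max_cost_multiple < agentic_cost_multiple:
        return "fixed_prompt"  # budget capped near small-model pricing
    if c.backlog_size < min_backlog:
        return "fixed_prompt"  # too little history to retrieve from
    return "agentic"


path = choose_path(TriageConstraints(3000, 50.0, False, False, 500))
```

The check is deliberately conservative: the agentic path is chosen only when none of the failure conditions apply.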

Stage

build

From

September 2025
