build

What Linear Learned About Existing Data as Fine-Tuning

TRIGGER

Off-the-shelf AI models don't understand how a specific team organizes work: what its labels mean, who handles what, or how it has historically classified similar issues. Without this context, suggestions feel generic and require constant correction.

APPROACH

Linear's team (engineer Yann-Edern Gillet) rebuilt their search infrastructure, moving from basic keyword matching to a unified semantic backend built on vector search. The input is the text of a new issue; the output is a ranked list of semantically similar historical issues. For Triage Intelligence, each incoming issue is embedded and matched against the existing backlog to surface candidate similar issues. These candidates become few-shot context for LLMs (initially GPT-4o mini and Gemini 2.0 Flash, later upgraded to GPT-5 and Gemini 2.5 Pro for more nuanced reasoning), which evaluate duplicates, related issues, and property suggestions such as labels and assignees. The backlog becomes an implicit training set that improves with each issue the team organizes.
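
The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not Linear's implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model (so it matches on shared words, not meaning), and the issue fields, IDs, and function names are all hypothetical. A production system would swap in learned embeddings and a vector index.

```python
import math
import re
from collections import Counter

# Hypothetical backlog; field names and IDs are illustrative only.
BACKLOG = [
    {"id": "ENG-101", "title": "Login page crashes on Safari",
     "labels": ["bug", "auth"], "assignee": "dana"},
    {"id": "ENG-102", "title": "Add dark mode to settings",
     "labels": ["feature", "ui"], "assignee": "lee"},
    {"id": "ENG-103", "title": "Safari login button unresponsive",
     "labels": ["bug", "auth"], "assignee": "dana"},
]

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def similar_issues(new_text: str, backlog: list[dict], k: int = 3) -> list[dict]:
    """Return up to k backlog issues ranked by similarity to new_text."""
    query = embed(new_text)
    scored = [(cosine(query, embed(issue["title"])), issue) for issue in backlog]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop issues with no overlap at all so they never reach the prompt.
    return [issue for score, issue in scored[:k] if score > 0]

matches = similar_issues("Cannot login on Safari", BACKLOG)
print([issue["id"] for issue in matches])  # -> ['ENG-101', 'ENG-103']
```

The ranked matches carry their labels and assignees with them, which is what lets them serve as few-shot context in the next step.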

PATTERN

Skip the fine-tuning project—your backlog is already training data. Instead of training custom models or writing elaborate prompts describing team conventions, retrieve examples of how this team has already solved similar problems and let the model infer the pattern. Every organized issue becomes implicit few-shot context.
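
One way to turn retrieved issues into few-shot context is to format each one as a worked example ahead of the new issue. The sketch below assumes a hypothetical `build_triage_prompt` helper and illustrative issue fields; the prompt wording is an assumption, and the call to an actual LLM is deliberately left out.

```python
# Hypothetical retrieved examples; in practice these come from the
# similarity search over the team's backlog.
EXAMPLES = [
    {"title": "Login page crashes on Safari",
     "labels": ["bug", "auth"], "assignee": "dana"},
    {"title": "Safari login button unresponsive",
     "labels": ["bug", "auth"], "assignee": "dana"},
]

def build_triage_prompt(new_issue: str, examples: list[dict]) -> str:
    """Format past triage decisions as few-shot examples for an LLM."""
    lines = [
        "Suggest labels and an assignee for the new issue, "
        "following how this team handled similar issues.",
        "",
    ]
    for ex in examples:
        lines.append(f"Issue: {ex['title']}")
        lines.append(f"Labels: {', '.join(ex['labels'])}")
        lines.append(f"Assignee: {ex['assignee']}")
        lines.append("")
    # End with the new issue and an open slot for the model to complete.
    lines.append(f"Issue: {new_issue}")
    lines.append("Labels:")
    return "\n".join(lines)

prompt = build_triage_prompt("Cannot login on Safari", EXAMPLES)
print(prompt)
```

Note that the prompt never states the team's conventions; the model is expected to infer them from the examples, which is the point of the pattern.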

WORKS WHEN

  • Teams have an existing backlog with consistent organization patterns (100+ well-labeled issues)
  • Classification rules are implicit in historical decisions rather than explicit documentation
  • Search infrastructure can surface semantically similar items, not just keyword matches
  • Teams want personalized suggestions without maintaining explicit configuration
  • The domain has natural clustering where similar issues should be handled similarly
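
A quick readiness check can indicate whether a backlog meets the first condition above. The sketch below is an assumption-laden heuristic: the 100-issue threshold comes from the list above, while the `backlog_readiness` name, the issue fields, and the idea of flagging labels used only once as a rough consistency signal are illustrative choices, not anything Linear has described.

```python
from collections import Counter

# Hypothetical sample; a real check would run over the full backlog.
SAMPLE = [
    {"id": "ENG-101", "labels": ["bug", "auth"]},
    {"id": "ENG-102", "labels": ["feature", "ui"]},
    {"id": "ENG-103", "labels": ["bug", "auth"]},
    {"id": "ENG-104", "labels": []},  # unlabeled, so not usable as an example
]

def backlog_readiness(backlog: list[dict], min_labeled: int = 100) -> dict:
    """Summarize whether the backlog has enough labeled history to learn from."""
    labeled = [issue for issue in backlog if issue.get("labels")]
    label_counts = Counter(label for issue in labeled for label in issue["labels"])
    # Labels seen only once offer no precedent for the model to generalize from.
    rare = sorted(label for label, n in label_counts.items() if n == 1)
    return {
        "labeled": len(labeled),
        "enough_history": len(labeled) >= min_labeled,
        "rare_labels": rare,
    }

report = backlog_readiness(SAMPLE)
print(report)
```

On the four-issue sample, the report shows three labeled issues, not enough history, and two labels without precedent.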

FAILS WHEN

  • Backlog is empty, inconsistent, or full of misclassified issues (garbage in, garbage out)
  • Organization rules are explicit and rule-based rather than pattern-based (use deterministic automation instead)
  • Privacy/security constraints prevent using historical data as context
  • Classification categories are new or changing rapidly—no historical precedent exists
  • Team wants to break from historical patterns rather than reinforce them

Stage

build

From

September 2025
