Showing 1-24 of 126 patterns
Flux.1-Dev requires ~33GB of memory in BFloat16, exceeding the 24GB limit of consumer GPUs such as the RTX 4090. CPU offloading enables execution but caps speedup at 1.12x, because FP8 quantization isn't compatible with offloading + compilation. No single quantization strategy could solve both the memory and the speed problem.
Source: Hugging Face • July 2025
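The memory claim above can be sanity-checked with back-of-envelope arithmetic. A minimal sketch, assuming illustrative per-component parameter counts for a Flux-class pipeline (not exact Flux.1-Dev figures): at 2 bytes per parameter in BFloat16 the weights alone overflow 24GB, while 1-byte FP8 halves that.

```python
# Back-of-envelope VRAM arithmetic for a Flux-class pipeline.
# Component parameter counts below are illustrative assumptions.
def weight_gib(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

components = {              # assumed parameter counts
    "transformer": 12e9,
    "t5_text_encoder": 4.7e9,
    "clip_text_encoder": 0.1e9,
    "vae": 0.1e9,
}
total_params = sum(components.values())

bf16_gib = weight_gib(total_params, 2)  # BFloat16: 2 bytes per parameter
fp8_gib = weight_gib(total_params, 1)   # FP8: 1 byte per parameter

print(f"bf16: {bf16_gib:.1f} GiB, fp8: {fp8_gib:.1f} GiB")
```

Under these assumed counts, BFloat16 weights land above the 24GB budget and FP8 weights comfortably below it, which is why a mixed strategy is needed once FP8 is ruled out by the offloading constraint.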
[build] Agent performance is a product of framework, tools, and underlying model—when an agent underperforms, teams can't tell whether to switch frameworks, upgrade tools, or use a different LLM because all three vary simultaneously across experiments.
Source: Hugging Face • July 2025
[build] Teams building on-device LLM experiences typically optimize for the latest flagship hardware with NPUs and advanced CPU features, but this excludes the majority of users whose devices are 3-5 years old and lack these capabilities—limiting adoption to a small fraction of the potential user base.
Source: Hugging Face • August 2025
[build] Multiple LLMs given full discussion context were producing superficial responses that didn't engage meaningfully with each other's arguments—they had information but no framework for productive disagreement or synthesis.
Source: Hugging Face • July 2025
[build] When LLMs with function-calling capabilities access external tools (web search, databases), users lose visibility into what information is being retrieved and which sources inform the response—the tool use happens invisibly within the model's reasoning.
Source: Hugging Face • July 2025
[build] Full model compilation with torch.compile provides maximum speedup but has high memory overhead during the compilation process itself, and the compiled graph consumes additional memory. On consumer GPUs with 24GB VRAM, full compilation can cause OOM even when the model itself fits in memory.
Source: Hugging Face • July 2025
[build] AI benchmarks using fixed test sets suffer from data contamination—models may have seen answers during training, making it impossible to distinguish genuine reasoning from memorization without access to full training pipelines.
Source: Hugging Face • July 2025
[build] RL training for code generation tasks wastes failed rollouts—when the model produces incorrect code, it receives zero reward and learns nothing from the specific failure, even though the verifier's error message contains actionable debugging information.
Source: Hugging Face • August 2025
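The pattern above amounts to recycling the verifier's error message into a second attempt instead of discarding the rollout. A minimal sketch; the verifier, prompt format, and function names are hypothetical, not from any specific RL framework:

```python
# Sketch: turn a failed rollout plus the verifier's error message into a
# revision prompt, so the failure still produces usable training signal.
def run_verifier(code: str) -> tuple[bool, str]:
    """Toy verifier: execute the candidate and report the first exception."""
    try:
        exec(code, {})
        return True, ""
    except Exception as e:
        return False, f"{type(e).__name__}: {e}"

def make_retry_prompt(task: str, failed_code: str, error: str) -> str:
    """Build a second-turn prompt that carries the debugging signal."""
    return (
        f"Task: {task}\n"
        f"Your previous attempt failed the verifier with: {error}\n"
        f"Previous attempt:\n{failed_code}\n"
        f"Fix the error and return the corrected code."
    )

bad = "def add_one(x): return x + ofset\nadd_one(1)"  # deliberate typo
ok, err = run_verifier(bad)
retry = make_retry_prompt("implement add_one(x)", bad, err)
```

The retry prompt carries the concrete NameError rather than a bare zero reward, which is the actionable signal the card says is otherwise thrown away.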
[build] Compiled diffusion models with LoRA adapters required recompilation whenever adapters were swapped, negating the speed benefits of torch.compile. Each LoRA has different ranks and target layers, causing architecture changes that trigger recompilation on every swap.
Source: Hugging Face • July 2025
[build] Training VLMs with RL using only accuracy-based rewards produces models that sometimes get correct answers but with malformed or unparseable output formats, making the responses unusable in downstream pipelines.
Source: Hugging Face • August 2025
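A common fix is to pair the accuracy term with a format term so that parseable output is rewarded on its own. A minimal sketch, assuming an `<answer>` tag convention and a 0.2 format weight (both illustrative choices, not from a specific training recipe):

```python
import re

# Combined RL reward: a format term plus an accuracy term, so correct but
# unparseable completions no longer score the same as well-formed ones.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)

def reward(completion: str, gold: str, w_format: float = 0.2) -> float:
    match = ANSWER_RE.search(completion)
    format_score = 1.0 if match else 0.0
    accuracy = 1.0 if match and match.group(1).strip() == gold else 0.0
    return w_format * format_score + (1.0 - w_format) * accuracy

well_formed = reward("The area is <answer>42</answer>", "42")
unparseable = reward("The area is 42", "42")  # correct but unusable downstream
```

Note the deliberate asymmetry: a well-formed wrong answer still earns the small format reward, while an unparseable completion earns nothing, steering the policy toward output the pipeline can consume.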
[build] RL training on uniformly sampled problems wastes compute on easy examples the model already solves consistently (providing no gradient signal) while undersampling hard problems that would drive improvement.
Source: Hugging Face • August 2025
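One way to act on this is difficulty-aware sampling: weight each problem by p·(1−p), the variance of its observed pass rate, so near-solved (p≈1) and near-impossible (p≈0) problems are drawn rarely and mid-difficulty problems dominate the batch. A minimal sketch with an illustrative weighting rule, not the method from any specific paper:

```python
import random

# Weight problems by the variance of their pass rate, p * (1 - p):
# p near 0 or 1 gives near-zero group-relative advantage, so spend
# rollouts where the outcome is genuinely uncertain.
def sample_batch(solve_rates: dict[str, float], k: int, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    names = list(solve_rates)
    weights = [p * (1.0 - p) for p in solve_rates.values()]
    return rng.choices(names, weights=weights, k=k)

rates = {"easy": 0.98, "medium": 0.50, "hard": 0.15}
batch = sample_batch(rates, k=1000)
```

With these rates the "easy" problem gets weight 0.0196 versus 0.25 for "medium", so uniform sampling's waste on already-solved examples largely disappears.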
[build] Building AI features often requires capabilities beyond what a single LLM provides—like domain-specific image generation, speech synthesis, or scientific computing—but training or fine-tuning custom models is prohibitively expensive, even though the capabilities already exist in specialized models.
Source: Hugging Face • July 2025
[build] Full-context sharing between all LLM agents creates token cost that scales quadratically with participants and rounds, while also producing information overload that dilutes focus on the most relevant prior arguments.
Source: Hugging Face • July 2025
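The quadratic claim follows from each speaker re-reading the entire transcript. A rough cost model under assumed per-turn sizes (the 500- and 400-token figures are illustrative):

```python
# Tokens read across a multi-agent debate under two context policies.
def full_context_tokens(agents: int, rounds: int, turn_tokens: int) -> int:
    turns = agents * rounds
    # Speaker i re-reads all i earlier turns: 0 + 1 + ... + (turns - 1),
    # so total reads grow quadratically in the number of turns.
    return turn_tokens * turns * (turns - 1) // 2

def summary_tokens_read(agents: int, rounds: int, summary_tokens: int) -> int:
    # Each speaker reads only one fixed-size summary of the previous
    # round, so total reads grow linearly in the number of turns.
    return summary_tokens * agents * rounds

full = full_context_tokens(agents=4, rounds=6, turn_tokens=500)
lean = summary_tokens_read(agents=4, rounds=6, summary_tokens=400)
```

With 4 agents over 6 rounds the full-context policy reads 138,000 tokens against 9,600 for round summaries, and doubling the agent count more than quadruples the full-context figure.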
[build] When comparing AI agents, accuracy alone doesn't reveal how models approach problems differently—two agents with similar scores may have vastly different reasoning strategies, cost profiles, and failure modes that matter for production deployment.
Source: Hugging Face • July 2025
[build] AI design tools produce inconsistent outputs that don't match brand guidelines, component libraries, or product patterns—requiring extensive manual cleanup that negates the speed benefits.
Source: Figma • October 2025
[design] Product requirements documents written before prototyping miss micro-interactions and edge cases that only surface when you can click through the experience—leading to specification gaps discovered late in development.
Source: Figma • October 2025
[build] AI coding agents receiving design prototypes as images or rendered visuals couldn't understand the underlying implementation logic—they could see what the UI looked like but not how it was built, forcing developers to manually translate visual designs into code patterns.
Source: Figma • September 2025
[build] AI code generation and prototyping tools produce working output but ignore existing design systems—generating inconsistent UI that doesn't match production standards. Teams worry that making AI tools accessible to more contributors will degrade quality and craft. AI-generated prototypes look polished but generic, making them less useful for realistic user testing.
Source: Figma • October 2025
[build] AI coding agents generating UI code from designs would produce inconsistent component implementations—sometimes using the correct design system component, sometimes creating one-off implementations—because they couldn't see the relationship between visual design components and their production code equivalents.
Source: Figma • September 2025
[scope] Feature ideas languish in backlogs because static mockups and PRDs fail to generate executive buy-in—stakeholders can't viscerally understand the value proposition from wireframes or written specs alone, leading to slow alignment and abstract debates about theoretical features.
Source: Figma • October 2025
[design] Design exploration phases get compressed because generating multiple directions is time-expensive—teams converge prematurely on a single approach without fully exploring the solution space. PMs need to form opinions on ill-defined problem spaces but don't know where to start.
Source: Figma • October 2025
[build] Search teams at companies with privacy constraints cannot view user queries or content to build evaluation datasets—the standard approach of human judges labeling real query-document pairs is impossible when the data is private user designs.
Source: Canva • November 2024
[build] AI-generated content needs human review before user-facing deployment, but the volume (millions of items) makes comprehensive review impossible—yet releasing unreviewed AI content risks inappropriate or off-brand output.
Source: Canva • February 2025
[build] Classification systems using keyword matching achieve high coverage on common cases but fail on a long tail of edge cases where keywords don't directly match—yet using AI for all classification is expensive or slow at scale.
Source: Canva • February 2025
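The usual resolution is a two-tier cascade: cheap keyword rules decide the common cases, and only items the rules can't place fall through to an expensive model call. A minimal sketch with made-up keywords, labels, and a mocked model (none of these are Canva's actual rules):

```python
# Two-tier classifier: fast keyword path for common cases, model fallback
# reserved for the long tail the rules can't decide.
KEYWORD_RULES = {
    "refund": "billing",
    "password": "account",
    "crash": "bug-report",
}

def classify(text: str, model_fallback) -> tuple[str, str]:
    lowered = text.lower()
    for keyword, label in KEYWORD_RULES.items():
        if keyword in lowered:
            return label, "keywords"       # fast path: no model call
    return model_fallback(text), "model"   # slow path: long tail only

mock_model = lambda text: "billing"        # stand-in for a real classifier
label, route = classify("I was charged twice, need a refund", mock_model)
tail_label, tail_route = classify("my card shows an old amount", mock_model)
```

Because the model runs only on the fall-through fraction, cost scales with the size of the long tail rather than with total traffic.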