How Linear Scaled pgvector to Tens of Millions of Embeddings
TRIGGER
Without proper indexing, vector similarity searches over tens of millions of embeddings timed out or took multiple seconds, and building a single index over the entire table failed outright, even with hundreds of GB of memory allocated to the database server.
APPROACH
Linear partitioned their pgvector embeddings table by workspace ID across several hundred partitions, then created an index on each partition separately rather than attempting a single global index. They used PostgreSQL with the pgvector extension hosted on Google Cloud. Input: issue text content (title and description concatenated). Output: a vector embedding stored with metadata (status, workspace ID, team ID). The system now powers duplicate detection across their entire user base in production.
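The layout described above can be sketched as generated PostgreSQL DDL. This is a minimal sketch, not Linear's actual schema: the table name `issue_embeddings`, the column names, the embedding dimension, hash partitioning on `workspace_id`, and the choice of pgvector's HNSW index are all illustrative assumptions.

```python
def partition_ddl(num_partitions: int, dim: int = 1536) -> list[str]:
    """Generate DDL for a parent table hash-partitioned by workspace_id,
    one child table per partition, and a per-partition HNSW index.
    All names and the dimension are hypothetical."""
    stmts = [
        # Parent table is partitioned, so no single global vector index exists.
        f"""CREATE TABLE issue_embeddings (
    issue_id     bigint NOT NULL,
    workspace_id bigint NOT NULL,
    team_id      bigint,
    status       text,
    embedding    vector({dim}),
    PRIMARY KEY (workspace_id, issue_id)
) PARTITION BY HASH (workspace_id);"""
    ]
    for i in range(num_partitions):
        stmts.append(
            f"CREATE TABLE issue_embeddings_p{i} PARTITION OF issue_embeddings "
            f"FOR VALUES WITH (MODULUS {num_partitions}, REMAINDER {i});"
        )
        # Each CREATE INDEX only has to fit one partition's vectors in
        # memory, which is what turns the impossible global build into
        # many tractable per-partition builds.
        stmts.append(
            f"CREATE INDEX ON issue_embeddings_p{i} "
            f"USING hnsw (embedding vector_cosine_ops);"
        )
    return stmts
```

In practice the per-partition index statements would be run one at a time (or with bounded concurrency) so that no single build exceeds available memory.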
PATTERN
“Index creation fails entirely—not slowly—when vector tables exceed memory limits. Partitioning by tenant turns one impossible index-build into many tractable ones, and naturally scopes queries since similarity searches almost always filter by tenant anyway.”
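The second half of the pattern, queries that naturally scope to one tenant, shows up in the query shape itself. A hedged sketch, reusing the hypothetical `issue_embeddings` table: `<=>` is pgvector's cosine-distance operator, and the `%(name)s` placeholders are psycopg-style parameters.

```python
def duplicate_candidates_sql() -> str:
    """A tenant-scoped ANN query. The workspace_id filter lets the
    planner prune to a single partition, so the search is served by
    that partition's small HNSW index rather than a global one."""
    return (
        "SELECT issue_id, embedding <=> %(query_vec)s AS distance "
        "FROM issue_embeddings "
        "WHERE workspace_id = %(workspace_id)s "
        "ORDER BY embedding <=> %(query_vec)s "
        "LIMIT 10;"
    )
```

Because duplicate detection only ever compares an issue against other issues in the same workspace, this filter is not a restriction added for the index's sake; it is the query the feature needed anyway.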
✓ WORKS WHEN
- Multi-tenant SaaS where queries naturally filter by tenant/workspace (>90% of queries are tenant-scoped)
- Total embedding count exceeds what can be indexed in a single operation (tens of millions of rows)
- Database memory is constrained relative to index size requirements
- Using managed PostgreSQL with pgvector where you cannot easily scale to dedicated vector DB infrastructure
- Tenant count is large enough to spread load across hundreds of partitions, but not so large that partition management becomes unwieldy
✗ FAILS WHEN
- Cross-tenant similarity queries are required (e.g., finding similar content across all users)
- Tenant sizes are highly skewed—one tenant with 50M issues still hits the same indexing wall
- Single-tenant deployment or small number of tenants (<10) where partitioning adds overhead without benefit
- Query patterns frequently span multiple tenants for analytics or global search features
- Using a dedicated vector database that handles index scaling automatically