How Linear Scaled pgvector to Tens of Millions of Embeddings
TRIGGER
Without proper indexing, vector similarity searches over tens of millions of embeddings timed out or took multiple seconds, and building a single index over the entire table failed outright, even with hundreds of GB of memory allocated to the database server.
APPROACH
Linear partitioned their pgvector embeddings table by workspace ID across several hundred partitions, then created an index on each partition separately rather than attempting a single global index. They used PostgreSQL with the pgvector extension hosted on Google Cloud. Input: issue text content (title and description concatenated). Output: a vector embedding stored with metadata (status, workspace ID, team ID). The system now powers duplicate detection across their entire user base in production.
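The layout described above can be sketched as generated PostgreSQL DDL. This is a minimal sketch, not Linear's actual schema: the table name `issue_embeddings`, the column names, the embedding dimension, hash partitioning on `workspace_id`, and the choice of pgvector's HNSW index are all illustrative assumptions.

```python
def partition_ddl(num_partitions: int, dim: int = 1536) -> list[str]:
    """Generate DDL for a parent table hash-partitioned by workspace_id,
    one child table per partition, and a per-partition HNSW index.
    All names and the dimension are hypothetical."""
    stmts = [
        # Parent table is partitioned, so no single global vector index exists.
        f"""CREATE TABLE issue_embeddings (
    issue_id     bigint NOT NULL,
    workspace_id bigint NOT NULL,
    team_id      bigint,
    status       text,
    embedding    vector({dim}),
    PRIMARY KEY (workspace_id, issue_id)
) PARTITION BY HASH (workspace_id);"""
    ]
    for i in range(num_partitions):
        stmts.append(
            f"CREATE TABLE issue_embeddings_p{i} PARTITION OF issue_embeddings "
            f"FOR VALUES WITH (MODULUS {num_partitions}, REMAINDER {i});"
        )
        # Each CREATE INDEX only has to fit one partition's vectors in
        # memory, which is what turns the impossible global build into
        # many tractable per-partition builds.
        stmts.append(
            f"CREATE INDEX ON issue_embeddings_p{i} "
            f"USING hnsw (embedding vector_cosine_ops);"
        )
    return stmts
```

In practice the per-partition index statements would be run one at a time (or with bounded concurrency) so that no single build exceeds available memory.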
PATTERN
“Index creation fails entirely—not slowly—when vector tables exceed memory limits. Partitioning by tenant turns one impossible index-build into many tractable ones, and naturally scopes queries since similarity searches almost always filter by tenant anyway.”
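The second half of the pattern, queries that naturally scope to one tenant, shows up in the query shape itself. A hedged sketch, reusing the hypothetical `issue_embeddings` table: `<=>` is pgvector's cosine-distance operator, and the `%(name)s` placeholders are psycopg-style parameters.

```python
def duplicate_candidates_sql() -> str:
    """A tenant-scoped ANN query. The workspace_id filter lets the
    planner prune to a single partition, so the search is served by
    that partition's small HNSW index rather than a global one."""
    return (
        "SELECT issue_id, embedding <=> %(query_vec)s AS distance "
        "FROM issue_embeddings "
        "WHERE workspace_id = %(workspace_id)s "
        "ORDER BY embedding <=> %(query_vec)s "
        "LIMIT 10;"
    )
```

Because duplicate detection only ever compares an issue against other issues in the same workspace, this filter is not a restriction added for the index's sake; it is the query the feature needed anyway.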
✓ WORKS WHEN
- Multi-tenant SaaS where queries naturally filter by tenant/workspace (>90% of queries are tenant-scoped)
- Total embedding count exceeds what can be indexed in a single operation (tens of millions of rows)
- Database memory is constrained relative to index size requirements
- Using managed PostgreSQL with pgvector where you cannot easily scale to dedicated vector DB infrastructure
- Tenant count is large enough to spread load across hundreds of partitions, but not so large that partition management becomes unwieldy
✗ FAILS WHEN
- Cross-tenant similarity queries are required (e.g., finding similar content across all users)
- Tenant sizes are highly skewed—one tenant with 50M issues still hits the same indexing wall
- Single-tenant deployment or small number of tenants (<10) where partitioning adds overhead without benefit
- Query patterns frequently span multiple tenants for analytics or global search features
- Using a dedicated vector database that handles index scaling automatically