Sharding Pgvector
10 comments
·March 26, 2025alexgarcia-xyz
ankitml
Well, I have a brute force strategy for pgvector working reasonably well. Individual, partial indexes. It works for all those queries with category_id=<> clauses. You only need an index for larger categories, for categories with rows below a threshold you dont need index a KNN/dot product would work.
jeffchuber
This will work very poorly when your data is changing because the centroids degrade and you'll have very poor recall but likely not know it unless you are also monitoring recall.
I didn't see this in the write-up, so adding it here as a common foot gun.
levkk
Good call-out. The article mentions using a representative sample, but I should definitely put it somewhere on top.
redskyluan
sharding is a bad solution for any databases, especially vector database. See https://milvus.io/blog/why-manual-sharding-is-a-bad-idea-for...
isoprophlex
That's an ad, plain and simple, and as such can't be taken seriously.
> When we reached 100 million vector embeddings, query latency spiked to over a second, something far beyond what our customers would tolerate.
I don't know what else they were doing but 1000 msec latency for 100 million embeddings... that's definitely not on pgvector.
redskyluan
What’s the largest deployment you’ve maintained in production?
If you’ve had the chance to manage one—especially while handling concurrent insert and search operations—you’ve probably felt the pain firsthand and understand just how critical scalability really is.
Insert into PGvector is super slow when data accumulate and the index build will take forever
ankitml
Depends on the hardware, sharding layout, bulk/individual insert and table bloat. Looks like you are missing pg_repack..
cluckindan
The article talks about manual sharding, not sharding in general.
charliereese
Thanks for writing about this :)
I really dig more content about how vector databases/tools handle problems like this!
In sqlite-vec, there's only a flat brute-force index (though DiskANN/IVF will be coming soon). But we do have a concept of partition keys[0], which allows you to "internally shard" the vector index based on a user_id or any other value.
Then at query time, any WHERE constraints on a partition key are pushed-down, so only matching vectors are searched instead of the entire index. Definitely not as performant as a proper vector index, but a lot of real-world application have these natural groups anyway. Like "search only English documents" or "search entries in this workspace only", or even "search comments only from the past 30 days."Internally sqlite-vec stores vectors in "chunks", so when partition keys are definds, every chunk is associated with a unique combination of all partition keys. Kinda hard to describe, but if you create a vec0 virtual table and insert some values, you can inspect the internal "shadow tables" in the SQLite database to see how it's all stored.
[0] https://alexgarcia.xyz/sqlite-vec/features/vec0.html#partiti...