Vector Databases: A Comparison
Choosing the right vector database is one of the most impactful infrastructure decisions you’ll make when building AI applications. After running benchmarks across three popular options, here’s what I found.
The Contenders
For this comparison, I evaluated three vector databases that represent different points on the managed-vs-self-hosted spectrum:
- Pinecone — Fully managed, serverless option
- Weaviate — Open-source with managed cloud option
- Milvus — Open-source, designed for massive scale
Each was tested with a dataset of 1 million 1536-dimensional embeddings (OpenAI text-embedding-3-small output).
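If you want to reproduce this setup without paying for embedding API calls, random unit-norm vectors of the same shape make a serviceable stand-in (a sketch; note that synthetic vectors lack the clustered geometry of real embeddings, so absolute recall numbers will differ):

```python
import numpy as np

N, DIM = 1_000_000, 1536  # matches the benchmark size and text-embedding-3-small's dimension


def make_synthetic_embeddings(n: int, dim: int, seed: int = 0) -> np.ndarray:
    """Generate random unit-norm float32 vectors as a stand-in for real embeddings."""
    rng = np.random.default_rng(seed)
    vecs = rng.standard_normal((n, dim)).astype(np.float32)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs


# Build a small sample here; scale n up to N for the full run.
sample = make_synthetic_embeddings(1_000, DIM)
```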
Benchmark Results
| Metric | Pinecone | Weaviate | Milvus |
|---|---|---|---|
| P50 Latency (ms) | 12 | 18 | 15 |
| P99 Latency (ms) | 45 | 72 | 55 |
| Recall @10 | 0.95 | 0.93 | 0.94 |
| Cost/month (1M vectors) | $70 | $25* | $15* |
| Setup Complexity | Low | Medium | High |
| Filtering Support | Metadata | GraphQL-like | Boolean expressions |
*Self-hosted on AWS, compute costs only.
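Recall@10 in the table is the fraction of the true ten nearest neighbors that the index actually returns, averaged over queries, with ground truth computed by brute force. A minimal sketch of how that measurement works (function and variable names here are my own, not from any client library):

```python
import numpy as np


def recall_at_k(approx_ids: np.ndarray, exact_ids: np.ndarray, k: int = 10) -> float:
    """Average fraction of the exact top-k neighbors recovered by the ANN results."""
    hits = [len(set(a[:k]) & set(e[:k])) / k for a, e in zip(approx_ids, exact_ids)]
    return float(np.mean(hits))


# Brute-force ground truth on a small synthetic set (inner-product similarity)
rng = np.random.default_rng(42)
base = rng.standard_normal((500, 64)).astype(np.float32)
queries = rng.standard_normal((10, 64)).astype(np.float32)
scores = queries @ base.T                      # similarity of each query to each base vector
exact = np.argsort(-scores, axis=1)[:, :10]    # exact top-10 ids per query
```

In practice you would feed the IDs returned by each database into `approx_ids` and compare against this brute-force baseline.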
Indexing Performance
The write path matters just as much as reads. Here’s a simple benchmark script:
```python
import time

import numpy as np


def benchmark_ingest(client, vectors: np.ndarray, batch_size: int = 100):
    """Benchmark vector ingestion throughput."""
    total = len(vectors)
    start = time.perf_counter()

    for i in range(0, total, batch_size):
        batch = vectors[i : i + batch_size]
        client.upsert(
            vectors=[
                {"id": f"vec_{j}", "values": v.tolist()}
                for j, v in enumerate(batch, start=i)
            ]
        )

    elapsed = time.perf_counter() - start
    throughput = total / elapsed
    print(f"Ingested {total} vectors in {elapsed:.1f}s ({throughput:.0f} vec/s)")
    return throughput
```

Results for 1M vectors:
- Pinecone: ~8,500 vectors/sec (serverless, no tuning needed)
- Weaviate: ~12,000 vectors/sec (with batch import API)
- Milvus: ~15,000 vectors/sec (with bulk insert, 3-node cluster)
My Recommendations
Choose Pinecone if you want zero operational overhead and predictable costs. It’s the “just works” option — ideal for startups and small teams.
Choose Weaviate if you need rich querying capabilities (its GraphQL-like query language is excellent) and want the option to self-host later. The hybrid search (vector + keyword) is particularly strong.
Choose Milvus if you’re operating at massive scale (100M+ vectors) and have the engineering team to manage infrastructure. Its distributed architecture handles horizontal scaling better than the alternatives.
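The filtering differences from the comparison table are concrete. Here are illustrative filter shapes for the same predicate (`category == "news"` and `year >= 2023`) in each system's style; these are sketches of the general shape, so check each client's documentation for exact syntax:

```python
# Pinecone: MongoDB-style metadata filter (a dict of operators)
pinecone_filter = {
    "category": {"$eq": "news"},
    "year": {"$gte": 2023},
}

# Milvus: a boolean expression string passed to search
milvus_expr = 'category == "news" and year >= 2023'

# Weaviate: a GraphQL-style "where" filter as nested operands
weaviate_where = {
    "operator": "And",
    "operands": [
        {"path": ["category"], "operator": "Equal", "valueText": "news"},
        {"path": ["year"], "operator": "GreaterThanEqual", "valueInt": 2023},
    ],
}
```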
The Real Question
Honestly, for most teams starting out, the database choice matters less than you think. What matters more is:
- Your embedding model — This determines the quality of your search results far more than the database
- Your chunking strategy — How you split documents affects retrieval relevance
- Your metadata design — Good filtering can compensate for imperfect vector search
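To make the chunking point concrete, here is a minimal fixed-size chunker with overlap, one strategy among many (the size and overlap values are arbitrary examples, not recommendations):

```python
def chunk_text(text: str, size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so content cut
    at a chunk boundary still appears whole in the next chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Even this naive approach beats no overlap at all; sentence- or structure-aware splitting usually improves retrieval relevance further.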
Pick the database that fits your team’s operational capabilities, and spend your optimization energy on the retrieval pipeline instead.