Vector Databases: A Comparison
Choosing the right vector database is one of the most impactful infrastructure decisions you’ll make when building AI applications. After running benchmarks across three popular options, here’s what I found.
The Contenders
For this comparison, I evaluated three vector databases that represent different points on the managed-vs-self-hosted spectrum:
- Pinecone — Fully managed, serverless option
- Weaviate — Open-source with managed cloud option
- Milvus — Open-source, designed for massive scale
Each was tested with a dataset of 1 million 1536-dimensional embeddings (OpenAI text-embedding-3-small output).
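If you want to reproduce this setup without paying for embedding API calls, random unit-norm vectors of the same shape make a serviceable stand-in (a sketch; note that synthetic vectors lack the clustered geometry of real embeddings, so absolute recall numbers will differ):

```python
import numpy as np

N, DIM = 1_000_000, 1536  # matches the benchmark size and text-embedding-3-small's dimension


def make_synthetic_embeddings(n: int, dim: int, seed: int = 0) -> np.ndarray:
    """Generate random unit-norm float32 vectors as a stand-in for real embeddings."""
    rng = np.random.default_rng(seed)
    vecs = rng.standard_normal((n, dim)).astype(np.float32)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs


# Build a small sample here; scale n up to N for the full run.
sample = make_synthetic_embeddings(1_000, DIM)
```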
Benchmark Results
| Metric | Pinecone | Weaviate | Milvus |
|---|---|---|---|
| P50 Latency (ms) | 12 | 18 | 15 |
| P99 Latency (ms) | 45 | 72 | 55 |
| Recall @10 | 0.95 | 0.93 | 0.94 |
| Cost/month (1M vectors) | $70 | $25* | $15* |
| Setup Complexity | Low | Medium | High |
| Filtering Support | Metadata | GraphQL-like | Boolean expressions |
*Self-hosted on AWS, compute costs only.
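Recall@10 in the table is the fraction of the true ten nearest neighbors that the index actually returns, averaged over queries, with ground truth computed by brute force. A minimal sketch of how that measurement works (function and variable names here are my own, not from any client library):

```python
import numpy as np


def recall_at_k(approx_ids: np.ndarray, exact_ids: np.ndarray, k: int = 10) -> float:
    """Average fraction of the exact top-k neighbors recovered by the ANN results."""
    hits = [len(set(a[:k]) & set(e[:k])) / k for a, e in zip(approx_ids, exact_ids)]
    return float(np.mean(hits))


# Brute-force ground truth on a small synthetic set (inner-product similarity)
rng = np.random.default_rng(42)
base = rng.standard_normal((500, 64)).astype(np.float32)
queries = rng.standard_normal((10, 64)).astype(np.float32)
scores = queries @ base.T                      # similarity of each query to each base vector
exact = np.argsort(-scores, axis=1)[:, :10]    # exact top-10 ids per query
```

In practice you would feed the IDs returned by each database into `approx_ids` and compare against this brute-force baseline.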
Indexing Performance
The write path matters just as much as reads. Here’s a simple benchmark script:
```python
import time

import numpy as np


def benchmark_ingest(client, vectors: np.ndarray, batch_size: int = 100):
    """Benchmark vector ingestion throughput."""
    total = len(vectors)
    start = time.perf_counter()

    for i in range(0, total, batch_size):
        batch = vectors[i : i + batch_size]
        client.upsert(
            vectors=[
                {"id": f"vec_{j}", "values": v.tolist()}
                for j, v in enumerate(batch, start=i)
            ]
        )

    elapsed = time.perf_counter() - start
    throughput = total / elapsed
    print(f"Ingested {total} vectors in {elapsed:.1f}s ({throughput:.0f} vec/s)")
    return throughput
```

Results for 1M vectors:
- Pinecone: ~8,500 vectors/sec (serverless, no tuning needed)
- Weaviate: ~12,000 vectors/sec (with batch import API)
- Milvus: ~15,000 vectors/sec (with bulk insert, 3-node cluster)
My Recommendations
Choose Pinecone if you want zero operational overhead and predictable costs. It’s the “just works” option — ideal for startups and small teams.
Choose Weaviate if you need rich querying capabilities (its GraphQL-like query language is excellent) and want the option to self-host later. The hybrid search (vector + keyword) is particularly strong.
Choose Milvus if you’re operating at massive scale (100M+ vectors) and have the engineering team to manage infrastructure. Its distributed architecture handles horizontal scaling better than the alternatives.
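The filtering differences from the comparison table are concrete. Here are illustrative filter shapes for the same predicate (`category == "news"` and `year >= 2023`) in each system's style; these are sketches of the general shape, so check each client's documentation for exact syntax:

```python
# Pinecone: MongoDB-style metadata filter (a dict of operators)
pinecone_filter = {
    "category": {"$eq": "news"},
    "year": {"$gte": 2023},
}

# Milvus: a boolean expression string passed to search
milvus_expr = 'category == "news" and year >= 2023'

# Weaviate: a GraphQL-style "where" filter as nested operands
weaviate_where = {
    "operator": "And",
    "operands": [
        {"path": ["category"], "operator": "Equal", "valueText": "news"},
        {"path": ["year"], "operator": "GreaterThanEqual", "valueInt": 2023},
    ],
}
```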
The Real Question
Honestly, for most teams starting out, the database choice matters less than you think. What matters more is:
- Your embedding model — This determines the quality of your search results far more than the database
- Your chunking strategy — How you split documents affects retrieval relevance
- Your metadata design — Good filtering can compensate for imperfect vector search
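To make the chunking point concrete, here is a minimal fixed-size chunker with overlap, one strategy among many (the size and overlap values are arbitrary examples, not recommendations):

```python
def chunk_text(text: str, size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so content cut
    at a chunk boundary still appears whole in the next chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Even this naive approach beats no overlap at all; sentence- or structure-aware splitting usually improves retrieval relevance further.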
Pick the database that fits your team’s operational capabilities, and spend your optimization energy on the retrieval pipeline instead.