Vector Databases

What They Do

Store high-dimensional vectors (embeddings) and find the most similar ones efficiently. They are the retrieval engine behind retrieval-augmented generation (RAG).

How Similarity Search Works

1. Embed your documents into vectors (e.g., 768 or 1536 dimensions)
2. Index the vectors for fast approximate nearest neighbor (ANN) search
3. Query: embed the query with the same model, then find the top-K most similar vectors
4. Return the documents/chunks associated with those vectors
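
The steps above can be sketched end to end in plain Python. This is only an illustration: `embed` is a hypothetical stand-in (a hashed bag-of-words, not a real embedding model), and the "index" is an exhaustive scan rather than an ANN structure, so it shows the data flow, not the performance characteristics.

```python
import math
import zlib

def embed(text, dims=1024):
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def top_k(query_vec, index, k=2):
    """Score every stored vector by dot product and return the k best (score, doc) pairs."""
    scored = [(sum(q * d for q, d in zip(query_vec, vec)), doc) for doc, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

documents = [
    "qdrant stores vectors with payload filters",
    "postgres can store vectors with pgvector",
    "bananas are rich in potassium",
]

# Steps 1-2: embed and "index" (here just a list; a real store would build an ANN index).
index = [(doc, embed(doc)) for doc in documents]

# Steps 3-4: embed the query the same way, then return the top-K documents.
results = top_k(embed("how does qdrant filter vectors"), index, k=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

A real vector database replaces the exhaustive scan in `top_k` with an ANN index, which is what keeps queries fast once the collection is large.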

Tools Compared

Tool	Language	ANN Index	Persistence	Filtering	Best For
Chroma	Python	HNSW	SQLite	Metadata	Prototyping, small scale
Qdrant	Rust	HNSW	Disk/Memory	Payload filters	Production, filtering
FAISS	C++/Python	IVF, HNSW, PQ	In-memory (manual save/load to disk)	None (DIY)	Research, billion-scale
Weaviate	Go	HNSW	Disk	GraphQL	Full-featured, hybrid search
Milvus	Go/C++	Multiple	Disk	Attribute	Enterprise, distributed
Pinecone	Managed	Proprietary	Cloud	Metadata	Serverless, zero-ops
pgvector	C (Postgres extension)	IVFFlat, HNSW	PostgreSQL	SQL WHERE	When you already use Postgres

Key Concepts

Embedding Models

Model	Dimensions	Quality	Speed	Open Source
OpenAI text-embedding-3-large	3072	High	API	No
Cohere embed-v4	1536 (default; smaller sizes selectable)	High	API	No
BGE-large-en-v1.5	1024	Good	Fast	Yes
nomic-embed-text	768	Good	Fast	Yes (Ollama)
all-MiniLM-L6-v2	384	Decent	Very fast	Yes
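
The Dimensions column also determines raw storage cost. A rough back-of-the-envelope sketch, assuming vectors are stored as uncompressed float32 (4 bytes per value) and ignoring index overhead and metadata; `raw_vector_size_gb` is an illustrative helper, not part of any library:

```python
BYTES_PER_FLOAT32 = 4  # assumption: uncompressed float32 storage

def raw_vector_size_gb(num_vectors, dims, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of the vectors alone, in gigabytes (10^9 bytes)."""
    return num_vectors * dims * bytes_per_value / 1e9

models = [
    ("text-embedding-3-large", 3072),
    ("BGE-large-en-v1.5", 1024),
    ("nomic-embed-text", 768),
    ("all-MiniLM-L6-v2", 384),
]

for name, dims in models:
    print(f"{name}: {raw_vector_size_gb(1_000_000, dims):.2f} GB per million vectors")
# text-embedding-3-large: 12.29 GB per million vectors
# BGE-large-en-v1.5: 4.10 GB per million vectors
# nomic-embed-text: 3.07 GB per million vectors
# all-MiniLM-L6-v2: 1.54 GB per million vectors
```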

HNSW (Hierarchical Navigable Small World)

The dominant ANN algorithm. Builds a multi-layer graph where:

- Top layers: sparse, for quick global navigation
- Bottom layers: dense, for precise local search
- Query traverses top-down, getting closer at each layer

Trade-offs: ef_construction (higher values build a better graph but slow down indexing), M (more links per node improve recall but use more memory).
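
The layered traversal can be illustrated with a toy sketch. This is a deliberate simplification, not the real algorithm: the graph is built by brute force (each node is linked to its M nearest neighbours in every layer it belongs to), whereas real HNSW inserts points incrementally with a neighbour-selection heuristic. The names `build_layers` and `search` are hypothetical, and `ef` here is the search-time candidate list size, an assumption that is separate from ef_construction.

```python
import heapq
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_layers(points, M=4, seed=0):
    """Toy build: random top layer per point, brute-force M-nearest links per layer."""
    rng = random.Random(seed)
    # Exponentially decaying level assignment, so upper layers are sparse.
    levels = [int(-math.log(1.0 - rng.random()) / math.log(M)) for _ in points]
    top = max(levels)
    layers = []
    for layer in range(top + 1):
        members = [i for i, lvl in enumerate(levels) if lvl >= layer]
        graph = {i: set() for i in members}
        for i in members:
            others = sorted((j for j in members if j != i),
                            key=lambda j: dist(points[i], points[j]))
            for j in others[:M]:
                graph[i].add(j)  # links are bidirectional, so degree can exceed M
                graph[j].add(i)
        layers.append(graph)
    return layers, levels.index(top)  # entry point: a node present in the top layer

def search(points, layers, entry, query, k=3, ef=10):
    current = entry
    # Upper layers: greedy walk towards the query, then drop down one layer.
    for graph in reversed(layers[1:]):
        improved = True
        while improved:
            improved = False
            for nb in graph[current]:
                if dist(points[nb], query) < dist(points[current], query):
                    current, improved = nb, True
    # Bottom layer: best-first search keeping the ef closest nodes seen so far.
    graph = layers[0]
    d0 = dist(points[current], query)
    visited = {current}
    candidates = [(d0, current)]  # min-heap of nodes still to expand
    best = [(-d0, current)]       # max-heap (negated distances), size <= ef
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break  # closest unexpanded node is worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(points[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-d, i) for d, i in best)[:k]

rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(200)]
layers, entry = build_layers(points, M=4)
query = (0.5, 0.5)
approx = search(points, layers, entry, query, k=3, ef=10)
exact = sorted((dist(p, query), i) for i, p in enumerate(points))[:3]
print("approx:", approx)
print("exact: ", exact)
```

Comparing `approx` against `exact` for different values of M and ef is a quick way to see the recall side of the trade-off; the result is approximate by design and is not guaranteed to match the exhaustive scan.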

Distance Metrics

Metric	Formula	When to Use
Cosine similarity	cos(a,b) = sum(a*b) / (|a|*|b|); cosine distance = 1 - cos(a,b)	Normalized embeddings (most common)
Euclidean (L2)	sqrt(sum((a-b)^2))	Unnormalized embeddings
Dot product	sum(a*b)	When magnitude matters
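
A small worked example of how the metrics in the table can disagree on raw vectors and agree once vectors are normalized to unit length. The vectors and the `best_match` helper are made up for illustration:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(a):
    n = norm(a)
    return [x / n for x in a]

def best_match(query, docs, score, higher_is_better=True):
    """Return the name of the best-scoring vector under the given metric."""
    pick = max if higher_is_better else min
    return pick(docs, key=lambda name: score(query, docs[name]))

query = [1.0, 0.0]
raw = {
    "same_direction_long": [10.0, 0.0],  # same direction as the query, 10x the length
    "close_but_rotated": [1.0, 0.5],     # nearby in space, slightly different direction
}

print(best_match(query, raw, cosine_similarity))                  # same_direction_long
print(best_match(query, raw, euclidean, higher_is_better=False))  # close_but_rotated
print(best_match(query, raw, dot))                                # same_direction_long

# After normalization, all three metrics pick the same vector.
unit = {name: normalize(vec) for name, vec in raw.items()}
print(best_match(query, unit, cosine_similarity))                  # same_direction_long
print(best_match(query, unit, euclidean, higher_is_better=False))  # same_direction_long
print(best_match(query, unit, dot))                                # same_direction_long
```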

My Setup

LightRAG uses: Qdrant for vector storage, PostgreSQL/AGE for graph
For experiments: Chroma (simplest to set up, good for prototyping)
Embedding: nomic-embed-text via Ollama (local, free)