Vector Search
How an AI agent finds the right document, chunk, or row to ground its answer in — and why the part that matters is the pipeline around the database, not the database itself.
Retrieval is a system, not a database call
Vector search by itself solves only the easy half of retrieval: finding text that is semantically near a query. The harder half is the system around it: source quality, chunking strategy, metadata, freshness, access control, hybrid scoring (BM25 keyword matching alongside dense vectors), reranking with a cross-encoder, and feedback from failed answers. Skipping any of these tends to produce a confident-sounding agent that cites the wrong thing.
- Chunking and overlap policies tuned per content type
- Hybrid dense + BM25 retrieval with metadata filters
- Cross-encoder rerank for the top-k passages
- Evaluation against known questions and known failures
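As an illustration of the first item, here is a minimal sketch of word-window chunking with overlap. The function name, the content-type keys, and the size and overlap numbers are all hypothetical placeholders; real policies would be tuned per content type against the eval set, and production chunkers usually split on tokens or document structure rather than whitespace.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words; each window repeats the
    last `overlap` words of the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical per-content-type policies as (size, overlap) in words.
# The numbers are placeholders, not recommendations.
POLICIES = {"policy_doc": (300, 60), "chat_log": (80, 10)}

sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
```

Overlap trades index size for a lower chance that an answer is split across a chunk boundary, which is why it is treated as a tunable rather than a constant.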
Common stacks
pgvector for teams already on Postgres; Qdrant or Weaviate for self-hosted dedicated stores; Pinecone or Turbopuffer for managed services. Cohere Rerank, BGE, or Voyage rerankers for the second stage, and OpenSearch or Elasticsearch for the BM25 leg. The store is a means; the pipeline is the work.
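Whichever stores supply the dense and BM25 legs, their ranked lists have to be merged before reranking. The sketch below uses reciprocal rank fusion (RRF), one common merging method; the text above does not say which fusion method is used, so treat the choice, the function name, and the document IDs as assumptions. The constant k = 60 is a conventional default, not a tuned value.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60, top_n: int = 10) -> list[str]:
    """Merge ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_n]


# Hypothetical results from the two legs, best match first.
dense_hits = ["d3", "d1", "d7"]
bm25_hits = ["d1", "d9", "d3"]

fused = rrf_fuse([dense_hits, bm25_hits])
print(fused)  # ['d1', 'd3', 'd9', 'd7']
```

RRF uses only ranks, so it avoids having to normalize BM25 scores against cosine similarities. A weighted score blend is the alternative when hybrid weights need to be tuned directly. The fused list would then be passed to the reranker, which is not shown here.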
Permissions in retrieval
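Retrieval that ignores permissions is a data leak waiting for a vector match. Access boundaries from the Source Graph are applied either as pre-filters on the index or as post-filters on the result set. Pre-filtering searches only what the caller may see but can add latency when the filter is highly selective; post-filtering is simpler to add but can return fewer than k usable results once restricted hits are dropped. The choice affects recall and latency, so it is part of the eval set.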
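The recall difference between the two options can be seen in a toy in-memory index. This is a sketch under stated assumptions: the `Chunk` type, the group names, and the two-dimensional vectors are all hypothetical, and a real store would apply the filter inside its own query engine rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    vector: tuple[float, ...]
    allowed_groups: frozenset[str]


def dot(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(chunks: list[Chunk], query: tuple[float, ...], k: int) -> list[Chunk]:
    return sorted(chunks, key=lambda c: dot(c.vector, query), reverse=True)[:k]


def search_prefilter(chunks, query, user_groups, k):
    # Restrict to visible chunks first, then rank: up to k permitted results.
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    return top_k(visible, query, k)


def search_postfilter(chunks, query, user_groups, k):
    # Rank everything, then drop restricted hits: may return fewer than k.
    return [c for c in top_k(chunks, query, k) if c.allowed_groups & user_groups]


# Hypothetical index: the two best matches belong to a group the caller is not in.
INDEX = [
    Chunk("hr-1", (0.9, 0.1), frozenset({"hr"})),
    Chunk("hr-2", (0.8, 0.2), frozenset({"hr"})),
    Chunk("eng-1", (0.7, 0.3), frozenset({"eng"})),
    Chunk("eng-2", (0.1, 0.9), frozenset({"eng"})),
]
query = (1.0, 0.0)

pre = [c.chunk_id for c in search_prefilter(INDEX, query, {"eng"}, k=2)]
post = [c.chunk_id for c in search_postfilter(INDEX, query, {"eng"}, k=2)]
print(pre)   # ['eng-1', 'eng-2']
print(post)  # []
```

Neither variant leaks restricted chunks, but the post-filter comes back empty because both of its top-2 slots went to documents the caller cannot see. Over-fetching (ranking more than k before filtering) narrows that gap at the cost of latency.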
What it works with
Sits inside Data Foundations. Reads access boundaries from the Source Graph at query time. Feeds Agent Workflows the chunks they need to act and cite. Measured by Workflow Evals (which cases retrieve correctly versus not). Optimized by Self-Optimizing Agents (chunking, rerank depth, hybrid weights are tunable surfaces).
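A retrieval eval of the "which cases retrieve correctly versus not" kind can be very small. The sketch below computes a hit rate at k over known questions and returns the misses for inspection. It is not how Workflow Evals is implemented; the function name, the case format, and the fake retriever are hypothetical stand-ins.

```python
from typing import Callable, Iterable


def hit_rate_at_k(
    cases: list[tuple[str, Iterable[str]]],
    retrieve: Callable[[str], list[str]],
    k: int = 5,
) -> tuple[float, list[str]]:
    """Return the fraction of questions with an expected chunk ID in the
    top k results, plus the list of questions that missed."""
    failures = []
    for question, expected_ids in cases:
        retrieved = retrieve(question)[:k]
        if not set(retrieved) & set(expected_ids):
            failures.append(question)
    hit_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return hit_rate, failures


# Hypothetical retriever output and known-answer cases.
FAKE_RESULTS = {
    "What is the refund window?": ["policy-7", "faq-2"],
    "Who approves travel?": ["faq-9", "blog-1"],
}
CASES = [
    ("What is the refund window?", {"policy-7"}),
    ("Who approves travel?", {"hr-4"}),
]

rate, missed = hit_rate_at_k(CASES, lambda q: FAKE_RESULTS.get(q, []), k=2)
print(rate, missed)  # 0.5 ['Who approves travel?']
```

Running the same cases before and after changing chunking, rerank depth, or hybrid weights turns those tunable surfaces into something that can be compared rather than guessed at.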
When you need it
Signals: AI assistants that cite the wrong document or invent answers; retrieval that finds 'something close' but never the right thing for specific terms; an internal Q&A bot whose accuracy is plateauing because the team has tuned the prompt but never the retrieval. Hybrid retrieval with rerank often beats vector-only search on real corpora, especially for exact terms such as product codes or names, though the margin is worth confirming on your own eval set.