Hybrid Retrieval
BM25 keyword search and dense vector search combined with a tunable weight, plus a cross-encoder rerank on the top-k: the configuration that consistently beats vector-only retrieval.
Vector search alone misses queries that hinge on a specific term (product code, error string, person name). BM25 alone misses semantic equivalence: paraphrases and synonyms that share no terms with the query. Hybrid with rerank covers both failure modes, which is why it is the configuration that wins on real corpora.
What it solves
Closes the gap between queries where semantic similarity alone gets close to the right answer and queries where the right answer requires both a specific term and the right context.
How we build it
OpenSearch or Elasticsearch for BM25; pgvector, Qdrant, Weaviate, or Pinecone for dense vectors. The retriever queries both, fuses the results (reciprocal rank fusion or a weighted sum of scores), and reranks the fused top-k, starting at the top 50, with a cross-encoder (Cohere Rerank, BGE, Voyage). Weights and rerank depth are eval-set parameters, not magic numbers.
- BM25 leg and dense leg in parallel
- Reciprocal rank fusion or tuned weights
- Cross-encoder rerank on top-k
- Eval-set tuned weights, not guesses
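The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production retriever: `bm25_search`, `dense_search`, and `rerank` are hypothetical callables standing in for the BM25 index, the vector store, and the cross-encoder, and the constant `k=60` in the fusion is a conventional default rather than a tuned value.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids. Each doc scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def weighted_sum_fusion(bm25_scores, dense_scores, alpha=0.5):
    """Min-max normalise each leg, then alpha * dense + (1 - alpha) * bm25.

    alpha is the tunable weight; 0.5 is a placeholder, not a recommendation.
    """
    def normalise(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}

    bm25, dense = normalise(bm25_scores), normalise(dense_scores)
    fused = {
        doc: alpha * dense.get(doc, 0.0) + (1 - alpha) * bm25.get(doc, 0.0)
        for doc in set(bm25) | set(dense)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))


def hybrid_retrieve(query, bm25_search, dense_search, rerank, rerank_depth=50):
    """Run both legs in parallel, fuse with RRF, rerank the top candidates.

    bm25_search and dense_search are hypothetical callables: query -> ranked doc ids.
    rerank is a hypothetical callable: (query, doc ids) -> reordered doc ids.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        bm25_future = pool.submit(bm25_search, query)
        dense_future = pool.submit(dense_search, query)
        bm25_ids, dense_ids = bm25_future.result(), dense_future.result()
    candidates = reciprocal_rank_fusion([bm25_ids, dense_ids])[:rerank_depth]
    return rerank(query, candidates)
```

Reciprocal rank fusion only needs ranks, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales. The weighted sum needs normalisation first, but in exchange exposes a single weight that can be tuned against the eval set.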
What changes when it is in place
Recall on hard queries goes up; the agent stops missing answers that were in the corpus the whole time.
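One way to check that claim on a given corpus is to measure recall@k over a labelled eval set before and after the change. The sketch below assumes a hypothetical eval set of `(query, relevant_ids)` pairs and a hypothetical `run` callable that maps a query to ranked doc ids; neither name comes from a specific library.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc ids that appear in the top k retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(run, eval_set, k):
    """Average recall@k over (query, relevant_ids) pairs.

    run is a hypothetical callable: query -> ranked list of doc ids.
    """
    scores = [recall_at_k(run(query), relevant, k) for query, relevant in eval_set]
    return sum(scores) / len(scores) if scores else 0.0
```

Running `mean_recall_at_k` once with the vector-only retriever and once with the hybrid retriever, on the same eval set and the same `k`, gives a like-for-like comparison. The same function can serve as the objective when sweeping fusion weights or rerank depth.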