Data substrate

RAG (Retrieval-Augmented Generation)

The pattern where an AI agent retrieves relevant context from your data before generating an answer — instead of relying only on what the model learned during training.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

Retrieval-Augmented Generation, or RAG, is the most common pattern for grounding AI answers in your own data. Step one: take the user's question and search your indexed content for relevant chunks. Step two: send those chunks plus the question to the model and ask it to answer using them. The model gets to use your data; you get to control what it sees; the answer can be cited.
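The two steps above can be sketched in a few lines. This is a toy illustration, not a production retriever: the `Chunk` type, the word-overlap scoring, the sample documents, and the prompt wording are all assumptions made for the example, and the final model call is left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str  # hypothetical source identifier, used for citations
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercased word set with punctuation stripped. A real system would use
    # BM25 or embeddings instead of raw word overlap.
    return {w.strip(".,?!:;()'\"").lower() for w in text.split()} - {""}


def retrieve(question: str, index: list[Chunk], k: int = 2) -> list[Chunk]:
    # Step one: score each chunk by word overlap with the question and keep
    # the top k chunks that match at all.
    q = tokenize(question)
    scored = [(len(q & tokenize(c.text)), c) for c in index]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    # Step two: give the model only the retrieved chunks, each tagged with
    # its source id so the answer can cite it.
    context = "\n".join(f"[{c.doc_id}] {c.text}" for c in chunks)
    return (
        "Answer using only the sources below and cite their ids in brackets.\n"
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


# Made-up sample content for illustration only.
index = [
    Chunk("hr-001", "Employees accrue 25 days of paid vacation per year."),
    Chunk("it-014", "Laptops are replaced every three years."),
    Chunk("hr-007", "Unused vacation days expire at the end of March."),
]
question = "How many vacation days do employees get?"
prompt = build_prompt(question, retrieve(question, index))
print(prompt)
```

The string in `prompt` is what would be sent to the model. Because the laptop chunk never reaches the prompt, the model cannot draw on it, which is the "you get to control what it sees" part of the pattern.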

Why it matters

Without RAG, an AI assistant answers from the general knowledge baked into the model, which is incomplete, possibly outdated, and unaware of your company. RAG is how 'AI that knows nothing about us' becomes 'AI that cites our docs.' Done well, it's table stakes for any internal AI application. Done badly, it's a leading source of hallucinations, because the model answers confidently from whatever chunks it was handed, relevant or not.

How it works

Index your content (chunking, embeddings, metadata). At query time: retrieve candidates with a hybrid of BM25 keyword search and dense vector search, rerank them, filter by the user's permissions, and send the top-k chunks to the model with instructions to cite them. Most production RAG bugs are at the retrieval step, not the generation step, which is why Vector Search is treated as a system, not a database call.
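The query-time steps can be sketched as one function. Several things here are assumptions rather than anything the text prescribes: the two ranked id lists stand in for the output of real BM25 and dense retrievers, reciprocal rank fusion is one common way to merge them, the reranker is an identity placeholder, and `allowed_groups` is a hypothetical permission model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Chunk:
    chunk_id: str
    text: str
    allowed_groups: set[str] = field(default_factory=set)  # hypothetical ACL


def rrf_fuse(rankings: list[list[str]], c: int = 60) -> list[str]:
    # Reciprocal rank fusion: every ranking adds 1 / (c + rank) to an id's
    # score, so ids ranked well by both retrievers rise to the top.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (c + rank)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


def select_context(
    bm25_ranked: list[str],
    dense_ranked: list[str],
    store: dict[str, Chunk],
    user_groups: set[str],
    rerank: Callable[[list[Chunk]], list[Chunk]] = lambda chunks: chunks,
    k: int = 3,
) -> list[Chunk]:
    # 1. Hybrid retrieval: merge the keyword and dense candidate lists.
    candidates = [store[cid] for cid in rrf_fuse([bm25_ranked, dense_ranked])]
    # 2. Rerank: identity placeholder; a real system would plug in a scorer.
    reranked = rerank(candidates)
    # 3. Permission filter: drop chunks the user's groups may not read.
    permitted = [ch for ch in reranked if ch.allowed_groups & user_groups]
    # 4. Top-k: only these chunks are sent to the model.
    return permitted[:k]


# Made-up sample data for illustration only.
store = {
    "a": Chunk("a", "Vacation policy", {"staff"}),
    "b": Chunk("b", "Executive compensation", {"board"}),
    "c": Chunk("c", "Expense policy", {"staff", "finance"}),
    "d": Chunk("d", "Travel policy", {"staff"}),
}
bm25_ranked = ["b", "a", "c"]   # pretend output of a keyword retriever
dense_ranked = ["a", "d", "b"]  # pretend output of a vector retriever

context = select_context(bm25_ranked, dense_ranked, store, {"staff"}, k=2)
print([ch.chunk_id for ch in context])
```

Chunk `b` ranks second after fusion but is dropped because the user is not in the `board` group. The sketch filters after reranking to follow the order given above; applying the permission filter earlier, at retrieval time, is also a reasonable design.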

Related resources

Vector Search

How an AI agent finds the right document, chunk, or row to ground its answer in — and why the part that matters is the pipeline around the database, not the database itself.

Hybrid Retrieval

A capability in the Group e-media information AI stack. This resource connects the subject to data substrate, agent runtime, evals, and operations.

Reranking Policy

A capability in the Group e-media information AI stack. This resource connects the subject to data substrate, agent runtime, evals, and operations.

Source Graph

A navigable map of every system your data lives in — schemas, documents, code, tickets, events, owners, and permissions — so an AI agent can find the right source and respect the right access boundary.

Citation Quality

A capability in the Group e-media information AI stack. This resource connects the subject to data substrate, agent runtime, evals, and operations.
