Agent runtime

External Memory

Memory that lives outside the model's context window — in a database, a vector store, or a structured memory store — and is retrieved on demand instead of carried in every call.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

External memory is anything an AI agent remembers that isn't in its current context window. Conversation history older than the window, user preferences, facts from past sessions, and organizational context are all stored externally and retrieved when relevant. Without external memory, an agent restarts cold every session. With it, an agent can carry context across days or months.

Why it matters

Model context windows are growing, but they remain finite, and filling them is expensive and slow. External memory keeps the per-call payload small and the long-term recall large. The trade-off is retrieval quality: bad memory retrieval can be worse than no memory at all, because it surfaces the wrong precedent at the wrong moment.

How it works

Common substrates: pgvector or Qdrant for embedding-indexed memory; structured key-value stores for facts; full-text search for historical conversation lookup; and hybrid retrievers that combine all three. Each memory entry carries a source, a confidence, a decay, and an expiry. At inference time, retrieval scores entries by recency, relevance, and confidence.
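As a rough illustration of scoring by recency, relevance, and confidence, here is a minimal sketch in Python. Everything in it is an assumption made for the example rather than a prescribed design: the `MemoryEntry` fields, the half-life form of decay, the choice to multiply the three signals together, and the `min_score` floor are all hypothetical. The relevance value is assumed to come from whichever retriever is in use (for instance, a similarity score from the vector store).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass
class MemoryEntry:
    """Hypothetical memory record; field names are illustrative only."""
    text: str
    source: str                            # where the memory came from
    confidence: float                      # 0.0 to 1.0
    created_at: datetime
    half_life_days: float                  # decay: recency halves every N days (assumed form)
    expires_at: Optional[datetime] = None  # hard cutoff; None means no expiry


def recency(entry: MemoryEntry, now: datetime) -> float:
    """Exponential decay: 1.0 for a brand-new entry, 0.5 after one half-life."""
    age_days = max((now - entry.created_at).total_seconds(), 0.0) / 86400.0
    return 0.5 ** (age_days / entry.half_life_days)


def score(entry: MemoryEntry, relevance: float, now: datetime) -> float:
    """Combine relevance, recency, and confidence; expired entries score 0."""
    if entry.expires_at is not None and now >= entry.expires_at:
        return 0.0
    # Multiplying is one possible combination; a weighted sum is another.
    return relevance * recency(entry, now) * entry.confidence


def retrieve(
    candidates: List[Tuple[MemoryEntry, float]],
    now: datetime,
    k: int = 3,
    min_score: float = 0.1,
) -> List[MemoryEntry]:
    """Return up to k entries, best first, dropping anything below min_score."""
    scored = [(score(entry, rel, now), entry) for entry, rel in candidates]
    kept = [(s, entry) for s, entry in scored if s >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in kept[:k]]


# Usage: (entry, relevance) pairs would normally come from the retriever.
current_time = datetime.now(timezone.utc)
example_candidates = [
    (MemoryEntry("prefers metric units", "chat", 0.9, current_time, 30.0), 0.8),
    (MemoryEntry("asked about invoices", "chat", 0.6,
                 current_time - timedelta(days=90), 30.0), 0.7),
]
for entry in retrieve(example_candidates, current_time):
    print(entry.source, entry.text)
```

The `min_score` floor reflects the idea that returning nothing can be preferable to returning a weak match; the threshold value itself would need tuning against real retrieval results.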

Related resources

Agent Memory

How an AI agent remembers the user it serves — what they said before, what they prefer, what context not to repeat — without that memory drifting the agent's behavior for everyone else.

Context Budget

The total number of tokens an AI agent has available for instructions, memory, retrieved context, conversation history, and tool results — and how that budget is allocated across them.

Compaction

Shrinking an AI agent's conversation history so the most relevant context stays in the model's window without exceeding the token budget — by summarizing, truncating, or selectively dropping turns.

Vector Search

How an AI agent finds the right document, chunk, or row to ground its answer in — and why the part that matters is the pipeline around the database, not the database itself.
