Agent runtime

External Memory

Memory that lives outside the model's context window — in a database, a vector store, or a structured memory store — and is retrieved on demand instead of carried in every call.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

External memory is anything an AI agent remembers that isn't in its current context window. Conversation history older than the window, user preferences, facts from past sessions, organizational context — all stored externally and retrieved when relevant. Without external memory, an agent restarts cold every session. With it, an agent can carry context across days or months.
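A minimal sketch of the cold-restart point, assuming SQLite as a stand-in for whatever database a real deployment would use. The `memory` table and the `open_store`, `remember`, and `recall` helpers are hypothetical names chosen for illustration, not part of any particular agent framework.

```python
import sqlite3


def open_store(path):
    """Open (or create) a file-backed store that survives process restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memory ("
        " user_id TEXT NOT NULL,"
        " key TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (user_id, key))"
    )
    return conn


def remember(conn, user_id, key, value):
    """Write a fact outside the context window, overwriting any older value."""
    conn.execute(
        "INSERT OR REPLACE INTO memory (user_id, key, value) VALUES (?, ?, ?)",
        (user_id, key, value),
    )
    conn.commit()


def recall(conn, user_id, key):
    """Fetch one fact on demand; returns None if nothing was stored."""
    row = conn.execute(
        "SELECT value FROM memory WHERE user_id = ? AND key = ?",
        (user_id, key),
    ).fetchone()
    return row[0] if row else None
```

A later session that reopens the same file can call `recall` and place only the returned value into the prompt, rather than replaying the conversation that produced it.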

Why it matters

Model context windows are growing, but they are still finite, and filling them makes every call more expensive and slower. External memory keeps the per-call payload small and the long-term recall large. The trade-off is retrieval quality: bad memory retrieval can be worse than no memory at all, because it surfaces the wrong precedent at the wrong moment.

How it works

Common substrates: pgvector or Qdrant for embedding-indexed memory; structured key-value stores for facts; full-text search for historical conversation lookup; hybrid retrievers combining all three. Each memory entry has a source (where it came from), a confidence (how far it is trusted), a decay (how quickly it loses weight with age), and an expiry (when it stops being eligible at all). At inference time, the retriever scores entries by recency, relevance, and confidence and returns the top-scoring ones.
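One way the entry fields and the scoring step could fit together is sketched below. Everything here is an assumption made for illustration: the `MemoryEntry` layout, the half-life form of decay, cosine similarity as the relevance measure, and the choice to multiply the three factors. A production retriever would delegate the similarity search to the vector store and might weight the factors differently.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MemoryEntry:
    text: str
    source: str                 # where the memory came from, e.g. "chat"
    confidence: float           # 0..1, how far the entry is trusted
    half_life_s: float          # decay: seconds for the recency weight to halve
    expires_at: float           # expiry: Unix time after which the entry is ignored
    created_at: float           # Unix time the entry was written
    embedding: Sequence[float]  # vector used for the relevance comparison


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(entry, query_embedding, now):
    """Combine relevance, recency, and confidence into a single score."""
    if now >= entry.expires_at:
        return 0.0  # expired entries never surface
    age = max(0.0, now - entry.created_at)
    recency = 0.5 ** (age / entry.half_life_s)
    relevance = max(0.0, cosine(entry.embedding, query_embedding))
    return relevance * recency * entry.confidence


def retrieve(entries, query_embedding, now, k=3, min_score=0.0):
    """Return up to k entries scoring above min_score, best first."""
    scored = [(score(e, query_embedding, now), e) for e in entries]
    kept = [(s, e) for s, e in scored if s > min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in kept[:k]]
```

Raising `min_score` makes the retriever return nothing rather than a weak match, which is one guard against surfacing the wrong precedent at the wrong moment.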
