Responses API
OpenAI's API for stateful, multi-turn agent interactions: the successor to the Assistants API, designed for production agent workloads with built-in tool use and state management.
Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.
What it is
The Responses API is OpenAI's agent-oriented API surface, introduced in March 2025 as the successor to the Assistants API. It bundles model calls, tool use, state management, and streaming into one interface aimed at production agent workloads rather than ad-hoc chat. It is one of several competing "agent platform" offerings from major providers.
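To make "one interface" concrete, here is a minimal sketch of the JSON bodies for two chained turns sent to the /v1/responses endpoint. It only builds the payloads and makes no network call. The field names (model, input, previous_response_id, tools, stream) follow the public API as we understand it. The model name, the response ID, and the list_failed_jobs tool are hypothetical placeholders, so check the current API reference before relying on any of this.

```python
import json
from typing import Any, Dict, List, Optional

RESPONSES_URL = "https://api.openai.com/v1/responses"  # where these bodies would be POSTed


def build_turn(
    model: str,
    user_input: str,
    previous_response_id: Optional[str] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    stream: bool = False,
) -> Dict[str, Any]:
    """Build the request body for one turn (illustrative helper, not part of any SDK)."""
    body: Dict[str, Any] = {"model": model, "input": user_input, "stream": stream}
    if previous_response_id is not None:
        # State management: the provider links this turn to the stored prior response.
        body["previous_response_id"] = previous_response_id
    if tools:
        # Tool use: tool definitions travel in the same request as the model call.
        body["tools"] = tools
    return body


# Hypothetical function tool, shown only to illustrate the shape of a tool definition.
list_failed_jobs = {
    "type": "function",
    "name": "list_failed_jobs",
    "description": "Return the IDs of jobs that failed on a given date.",
    "parameters": {
        "type": "object",
        "properties": {"date": {"type": "string"}},
        "required": ["date"],
    },
}

first = build_turn("example-model", "Summarise yesterday's failed jobs.", tools=[list_failed_jobs])

# "resp_123" stands in for the id returned by the first call.
follow_up = build_turn(
    "example-model",
    "Which of those are retryable?",
    previous_response_id="resp_123",
)

print(json.dumps(follow_up, indent=2))
```

The follow-up body carries no conversation history. The reference to the earlier response is what lets the provider hold the state, which is also the part that does not carry over to another provider.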
Why it matters
Building production agents requires plumbing — state, streaming, tool dispatch, retries — that every team would otherwise build themselves. Provider-side agent APIs offer that plumbing as a managed service. The trade-off is portability: lean on the provider's API and switching providers becomes a rewrite, not a routing rule.
How we use it
Where the Responses API fits a workflow's needs, we use it through the model gateway, with the same governance and observability as any other route. Where portability or custom orchestration matters more, we call OpenAI through the standard Chat Completions endpoint and provide our own runtime around it.
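The routing decision described above can be sketched as a small rule. This is an illustration under assumptions, not our gateway's actual configuration: the WorkflowNeeds fields and the route names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowNeeds:
    """Hypothetical description of what a workflow requires from its model route."""
    needs_provider_state: bool   # wants the provider to hold multi-turn state
    needs_builtin_tools: bool    # wants provider-side tool dispatch
    must_stay_portable: bool     # must be able to switch providers by routing rule
    custom_orchestration: bool   # needs control flow our own runtime implements


def choose_route(needs: WorkflowNeeds) -> str:
    """Return a gateway route name (names are placeholders)."""
    # Portability and custom orchestration take priority over managed plumbing.
    if needs.must_stay_portable or needs.custom_orchestration:
        return "openai/chat-completions"
    # Otherwise, use the managed agent API when its features are actually needed.
    if needs.needs_provider_state or needs.needs_builtin_tools:
        return "openai/responses"
    # Default to the more portable route when neither side has a strong pull.
    return "openai/chat-completions"


support_agent = WorkflowNeeds(True, True, False, False)
batch_classifier = WorkflowNeeds(False, False, True, False)

print(choose_route(support_agent))
print(choose_route(batch_classifier))
```

Either way the call goes through the gateway, so rate limits, cost attribution, and traces apply to both routes.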
Related resources
The execution engine that turns an AI agent from a chat-window demo into a long-running, event-driven, restartable process you can trust with real operations.
A capability in the Group e-media information AI stack. This resource connects the subject to data substrate, agent runtime, evals, and operations.
The component of an AI platform that routes every model call — choosing the provider, applying rate limits and fallback, attributing cost, and emitting traces — so applications never call providers directly.
The mechanism by which a language model invokes external functions — APIs, databases, code execution, retrieval — and reads the results back to continue its work.