Model Gateway
The component of an AI platform that routes every model call — choosing the provider, applying rate limits and fallback, attributing cost, and emitting traces — so applications never call providers directly.
Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.
What it is
A model gateway is the single entry point for every AI model an organization uses. Applications and agents talk to the gateway; the gateway decides which provider and model to actually invoke. It's the chokepoint that makes provider choice a routing rule instead of a hardcoded dependency.
Why it matters
Without a gateway, switching from one provider to another is a code change in every application. With a gateway, it's a routing rule. The gateway is also where cost attribution, rate-limit handling, fallback logic, and trace emission live — none of which any individual application should be reinventing.
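To make the "routing rule" point concrete, here is a minimal sketch in which applications ask for a logical alias and a routing table maps it to a provider and model. The names `Route`, `ROUTES`, `resolve`, the alias `chat-default`, and the provider and model strings are all hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str


# Hypothetical routing table: logical alias -> concrete provider and model.
# In a real gateway this would more likely live in configuration than in code.
ROUTES: dict[str, Route] = {
    "chat-default": Route(provider="provider-a", model="large-v1"),
}


def resolve(alias: str) -> Route:
    """Look up the concrete provider and model for a logical alias."""
    return ROUTES[alias]


# Applications only ever name the alias.
before = resolve("chat-default")

# Switching providers is one rule change; no application code is touched.
ROUTES["chat-default"] = Route(provider="provider-b", model="large-v2")
after = resolve("chat-default")
```

The application-side call (`resolve("chat-default")`) is identical before and after the switch, which is the property the gateway is meant to provide.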
How it works
The standard pattern is a fixed sequence: receive the request, resolve the routing policy (privacy, cost, latency, quality), call the selected provider, handle errors and fall back if needed, attribute cost to the tenant and workflow, and emit an OpenTelemetry trace. See AI Gateway for the broader picture; Model Gateway is specifically the model-call leg.
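The sequence above can be sketched as a small in-memory gateway. This is an illustration under stated assumptions, not a reference implementation: `Request`, `Provider`, `Gateway`, and their fields are hypothetical names; the routing policy considers only privacy and cost; rate limiting is omitted; prices are flat per-call placeholders; and traces are appended to a plain list as a stand-in for OpenTelemetry spans.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Request:
    tenant: str
    workflow: str
    prompt: str
    sensitive: bool = False  # privacy flag consulted by the routing policy


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]    # stand-in for a real provider SDK call
    cost_per_call: float          # illustrative flat price, not a real rate
    allows_sensitive: bool = True


@dataclass
class Gateway:
    providers: list[Provider]
    costs: dict[tuple[str, str], float] = field(default_factory=dict)
    traces: list[dict] = field(default_factory=list)

    def candidates(self, req: Request) -> list[Provider]:
        """Routing policy: drop providers that fail the privacy check, cheapest first."""
        allowed = [p for p in self.providers if p.allows_sensitive or not req.sensitive]
        return sorted(allowed, key=lambda p: p.cost_per_call)

    def handle(self, req: Request) -> str:
        errors = []
        for provider in self.candidates(req):
            start = time.perf_counter()
            try:
                output = provider.call(req.prompt)
            except Exception as exc:
                # Fallback: record the failure and try the next candidate.
                errors.append(f"{provider.name}: {exc}")
                self._trace(req, provider, start, ok=False)
                continue
            # Cost attribution keyed by (tenant, workflow).
            key = (req.tenant, req.workflow)
            self.costs[key] = self.costs.get(key, 0.0) + provider.cost_per_call
            self._trace(req, provider, start, ok=True)
            return output
        raise RuntimeError("all providers failed: " + "; ".join(errors))

    def _trace(self, req: Request, provider: Provider, start: float, ok: bool) -> None:
        # Stand-in for emitting an OpenTelemetry span.
        self.traces.append({
            "tenant": req.tenant,
            "workflow": req.workflow,
            "provider": provider.name,
            "ok": ok,
            "duration_s": time.perf_counter() - start,
        })


def flaky(prompt: str) -> str:
    raise TimeoutError("upstream timeout")


def echo(prompt: str) -> str:
    return f"echo: {prompt}"


gateway = Gateway([Provider("cheap", flaky, 0.001), Provider("backup", echo, 0.002)])
result = gateway.handle(Request("acme", "support-bot", "hello"))
```

In this run the cheaper provider is tried first and fails, the gateway falls back to the second one, and only the successful call is charged to the `("acme", "support-bot")` tenant and workflow pair. Both attempts leave a trace record.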
Related resources
A single integration point in front of every AI model your applications use — for routing, key rotation, rate limits, fallback, cost attribution, and observability.
How an AI system decides which model to call for each step — based on privacy, cost, latency, quality, and what happens when a provider goes down.
A capability in the Group e-media information AI stack. This resource connects the subject to data substrate, agent runtime, evals, and operations.
The total spend on language-model APIs across an organization — input tokens, output tokens, embeddings, fine-tuning — and the practice of attributing, optimizing, and budgeting it.