Operations

Model Gateway

The component of an AI platform that routes every model call — choosing the provider, applying rate limits and fallback, attributing cost, and emitting traces — so applications never call providers directly.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

A model gateway is the single inbound API for every AI model an organization uses. Applications and agents talk to the gateway; the gateway decides which provider and model to actually invoke. It is the chokepoint that turns provider choice into a routing rule rather than a hardcoded dependency.
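As a rough illustration of provider choice as a routing rule, the sketch below has applications ask for a logical alias while the gateway owns the mapping to a concrete provider and model. The alias names, provider names, model names, and the `resolve` helper are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str  # hypothetical provider name
    model: str     # hypothetical model identifier


# Routing table owned by the gateway, keyed by the logical alias
# that applications request. All entries are illustrative.
ROUTES: dict[str, Route] = {
    "chat-default": Route(provider="provider-a", model="model-large"),
    "summarize-cheap": Route(provider="provider-b", model="model-small"),
}


def resolve(alias: str, routes: dict[str, Route] = ROUTES) -> Route:
    """Map a logical alias to the concrete provider and model to invoke."""
    try:
        return routes[alias]
    except KeyError:
        raise ValueError(f"no route configured for alias {alias!r}") from None
```

Under these assumptions, moving `chat-default` to another provider means changing one entry in `ROUTES`; callers that request the alias are not touched.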

Why it matters

Without a gateway, switching from one provider to another is a code change in every application. With a gateway, it's a routing rule. The gateway is also where cost attribution, rate-limit handling, fallback logic, and trace emission live — none of which any individual application should be reinventing.
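To make the cost-attribution point concrete, here is a minimal sketch of a ledger the gateway could keep so that applications do not each track spend themselves. The price table, the `Usage` record, and the `CostLedger` class are assumptions made for illustration; the prices are invented, not real provider rates.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens, keyed by (provider, model).
PRICE_PER_1K = {
    ("provider-a", "model-large"): {"input": 0.003, "output": 0.015},
    ("provider-b", "model-small"): {"input": 0.0005, "output": 0.0015},
}


@dataclass(frozen=True)
class Usage:
    tenant: str
    workflow: str
    provider: str
    model: str
    input_tokens: int
    output_tokens: int


class CostLedger:
    """Accumulates spend per (tenant, workflow) pair."""

    def __init__(self) -> None:
        self._totals: defaultdict[tuple[str, str], float] = defaultdict(float)

    def record(self, usage: Usage) -> float:
        """Price one call, add it to the tenant/workflow total, return its cost."""
        price = PRICE_PER_1K[(usage.provider, usage.model)]
        cost = (
            usage.input_tokens * price["input"]
            + usage.output_tokens * price["output"]
        ) / 1000
        self._totals[(usage.tenant, usage.workflow)] += cost
        return cost

    def total(self, tenant: str, workflow: str) -> float:
        """Return accumulated spend, or 0.0 if nothing was recorded."""
        return self._totals.get((tenant, workflow), 0.0)
```

Because every call passes through the gateway, one ledger like this sees all usage, whichever application made the call.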

How it works

The standard pattern has six steps: receive the request; resolve the routing policy (weighing privacy, cost, latency, and quality); call the selected provider; handle errors and fall back if needed; attribute cost to the tenant and workflow; and emit an OpenTelemetry trace. See AI Gateway for the broader picture; the Model Gateway is specifically the model-call leg.
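The pattern above can be sketched end to end in a few dozen lines. This is a simplified illustration under stated assumptions, not a reference implementation: providers are modelled as plain callables, a routing policy is an ordered list of provider names (preferred first, fallbacks after), cost attribution is reduced to tagging each record with tenant and workflow, and the `spans` list stands in for a real OpenTelemetry exporter. `Gateway`, `ProviderError`, and all other names here are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


class ProviderError(Exception):
    """Raised by a provider callable when a call fails (hypothetical)."""


# A provider is modelled as a callable: prompt -> completion text.
Provider = Callable[[str], str]


@dataclass
class Gateway:
    providers: dict[str, Provider]
    # Ordered provider names per policy: preferred first, fallbacks after.
    policies: dict[str, list[str]]
    # Stand-in for a trace exporter; a real gateway would emit spans
    # through an OpenTelemetry SDK instead of appending dicts to a list.
    spans: list[dict] = field(default_factory=list)

    def complete(self, prompt: str, policy: str, tenant: str, workflow: str) -> str:
        start = time.monotonic()
        attempts: list[str] = []
        last_error: Optional[Exception] = None
        for name in self.policies[policy]:  # resolve the routing policy
            attempts.append(name)
            try:
                result = self.providers[name](prompt)  # call the provider
            except ProviderError as exc:
                last_error = exc  # remember the failure, try the next one
                continue
            self._emit(tenant, workflow, policy, attempts, name, start, None)
            return result
        self._emit(tenant, workflow, policy, attempts, None, start, str(last_error))
        raise ProviderError(f"all providers failed for policy {policy!r}") from last_error

    def _emit(self, tenant, workflow, policy, attempts, provider, start, error) -> None:
        """Record one trace-like entry tagged for cost attribution."""
        self.spans.append({
            "tenant": tenant,
            "workflow": workflow,
            "policy": policy,
            "attempts": list(attempts),
            "provider": provider,  # None when every attempt failed
            "duration_s": time.monotonic() - start,
            "error": error,
        })


def flaky(prompt: str) -> str:
    raise ProviderError("rate limited")


def stable(prompt: str) -> str:
    return f"echo: {prompt}"


gateway = Gateway(
    providers={"primary": flaky, "backup": stable},
    policies={"low-latency": ["primary", "backup"]},
)
reply = gateway.complete("hi", policy="low-latency", tenant="acme", workflow="support")
```

In this run the `primary` provider fails, the gateway falls back to `backup`, and the recorded entry lists both attempts along with the tenant and workflow that the call should be billed to.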

Related resources