Operations

AI Gateway

A single integration point in front of every AI model your applications use. It handles routing, key rotation, rate limits, fallback, cost attribution, and observability.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

An AI gateway is a service that sits between your applications and the AI model providers (Anthropic, OpenAI, Google, AWS Bedrock, Azure, Mistral, your own private deployments). Applications send their request to the gateway; the gateway decides which provider and model to actually call, handles the response, and emits a trace. From the application's perspective, switching from one provider to another is a config change, not a code rewrite.
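To make the "config change, not a code rewrite" point concrete, here is a minimal sketch. It assumes a hypothetical in-process Gateway class, two stand-in adapters (call_provider_a, call_provider_b), and a "summarizer" alias; none of these names come from a real gateway product, and a real adapter would call the provider's SDK or HTTP API instead of returning a canned string.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical provider adapters: each takes a prompt and returns text.
# They are stand-ins so the sketch runs without network access.
def call_provider_a(prompt: str) -> str:
    return f"[provider-a] {prompt}"


def call_provider_b(prompt: str) -> str:
    return f"[provider-b] {prompt}"


ADAPTERS: Dict[str, Callable[[str], str]] = {
    "provider-a": call_provider_a,
    "provider-b": call_provider_b,
}


@dataclass
class Gateway:
    # Maps a logical alias (what applications ask for) to a provider key.
    routes: Dict[str, str]

    def complete(self, alias: str, prompt: str) -> str:
        provider = self.routes[alias]
        return ADAPTERS[provider](prompt)


# Application code only ever names the alias, never the provider.
gateway = Gateway(routes={"summarizer": "provider-a"})
before = gateway.complete("summarizer", "hello")

# Switching providers is an edit to the routing config;
# the application's call site is unchanged.
gateway.routes["summarizer"] = "provider-b"
after = gateway.complete("summarizer", "hello")
```

The design choice worth noting is the alias: applications depend on a stable logical name, and the mapping from that name to a provider lives in one place.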

Why it matters

Without a gateway, every team writes its own provider integration, manages its own API keys, and handles its own rate limits, and no one has visibility into total cost. A gateway centralizes those decisions and turns the choice of AI provider into a routing rule that the platform team controls.

How it works

The standard pattern is to route by policy (privacy class, cost budget, latency target, quality requirement), fail over on outage or rate limit, attribute cost per tenant or per workflow, log every request as a trace, and run safety classifiers on responses before they leave the gateway. Open-source gateways (Portkey, LiteLLM, Helicone) and managed offerings (Vercel AI Gateway, AWS Bedrock, Cloudflare AI Gateway) all implement variations of this pattern.
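The routing, failover, cost-attribution, and tracing steps can be sketched in a few dozen lines. This is an illustration under stated assumptions, not how any of the named gateways is implemented: PolicyGateway, Target, ProviderUnavailable, the flat cost_per_call figures, and the single "private" flag standing in for a privacy class are all hypothetical. Safety classification is left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ProviderUnavailable(Exception):
    """Raised by an adapter on outage or rate limit (hypothetical)."""


@dataclass
class Target:
    name: str
    call: Callable[[str], str]
    cost_per_call: float        # illustrative flat price, not a real rate
    allows_private_data: bool   # stand-in for a privacy-class policy


@dataclass
class PolicyGateway:
    targets: List[Target]       # ordered by preference
    cost_by_tenant: Dict[str, float] = field(default_factory=dict)
    traces: List[dict] = field(default_factory=list)

    def complete(self, tenant: str, prompt: str, private: bool) -> str:
        # Route by policy: private requests may only use approved targets.
        candidates = [
            t for t in self.targets if t.allows_private_data or not private
        ]
        for target in candidates:
            try:
                text = target.call(prompt)
            except ProviderUnavailable:
                # Record the failed attempt, then fail over to the next target.
                self.traces.append(
                    {"tenant": tenant, "target": target.name, "ok": False}
                )
                continue
            # Attribute cost to the tenant and record the successful attempt.
            self.cost_by_tenant[tenant] = (
                self.cost_by_tenant.get(tenant, 0.0) + target.cost_per_call
            )
            self.traces.append(
                {"tenant": tenant, "target": target.name, "ok": True}
            )
            return text
        raise ProviderUnavailable("no eligible target succeeded")


def flaky(prompt: str) -> str:
    raise ProviderUnavailable("rate limited")


def steady(prompt: str) -> str:
    return f"ok: {prompt}"


gw = PolicyGateway(targets=[
    Target("primary", flaky, 0.5, allows_private_data=True),
    Target("public-only", steady, 0.1, allows_private_data=False),
    Target("backup", steady, 0.25, allows_private_data=True),
])
result = gw.complete("tenant-1", "hi", private=True)
```

In this run the private request skips "public-only" by policy, fails on "primary", and succeeds on "backup", so the tenant is charged only for the call that succeeded and both attempts appear in the trace list.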
