Model Routing Policy
Per-step routing rules across providers and models, covering privacy, cost, latency, quality, and fallback, declared in the gateway rather than buried in workflow code.
One model is rarely optimal for an entire workflow. Routing turns a workflow from a single model dependency into a set of explicit trade-offs the team can tune as models, prices, and provider limits change.
What it solves
Decouples workflow logic from provider choice. A model deprecation, price change, or quality regression is a routing change, not a code rewrite.
How we build it
Each workflow step declares its routing requirements: latency budget, privacy class, quality floor, and cost ceiling. The gateway resolves those requirements against the current provider catalog. Per-step traces record which model served which call, so cost and quality are attributable to a specific route.
- Per-step routing requirements
- Provider catalog with current pricing and limits
- Fallback chains for outage or degradation
- Cost and quality attribution per route
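As an illustration of how these pieces could fit together, here is a minimal Python sketch, not the gateway's actual implementation. Every name in it is hypothetical (`StepRequirements`, `CatalogEntry`, `resolve`, `call_with_fallback`, the model names), and the latency, quality, and price figures are invented for the example. It assumes a route is chosen by filtering the catalog against a step's requirements and ordering the survivors cheapest first, so the first entry is the primary route and the rest form the fallback chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepRequirements:
    """What a workflow step declares (hypothetical field names)."""
    max_latency_ms: int
    privacy_class: str            # e.g. "internal" or "restricted"
    min_quality: float            # quality floor on an assumed 0..1 scale
    max_cost_per_1k_tokens: float


@dataclass(frozen=True)
class CatalogEntry:
    """One row of the provider catalog (illustrative values only)."""
    model: str
    p95_latency_ms: int
    allowed_privacy: frozenset
    quality_score: float
    cost_per_1k_tokens: float
    healthy: bool = True


def resolve(req, catalog):
    """Return eligible model names, cheapest first.

    Element 0 is the primary route; the remainder is the fallback chain.
    """
    eligible = [
        e for e in catalog
        if e.healthy
        and e.p95_latency_ms <= req.max_latency_ms
        and req.privacy_class in e.allowed_privacy
        and e.quality_score >= req.min_quality
        and e.cost_per_1k_tokens <= req.max_cost_per_1k_tokens
    ]
    eligible.sort(key=lambda e: (e.cost_per_1k_tokens, e.model))
    return [e.model for e in eligible]


def call_with_fallback(step, chain, invoke, trace):
    """Try each model in order, appending one trace record per attempt."""
    for model in chain:
        try:
            result = invoke(model)
        except Exception as exc:  # outage or degradation: move down the chain
            trace.append({"step": step, "model": model, "ok": False, "error": str(exc)})
            continue
        trace.append({"step": step, "model": model, "ok": True})
        return result
    raise RuntimeError(f"no route available for step {step!r}")


CATALOG = [
    CatalogEntry("model-small", 400, frozenset({"public", "internal"}), 0.70, 0.2),
    CatalogEntry("model-large", 1500, frozenset({"public", "internal"}), 0.90, 3.0),
    CatalogEntry("model-onprem", 900, frozenset({"public", "internal", "restricted"}), 0.80, 1.0),
]

summarize = StepRequirements(max_latency_ms=1000, privacy_class="internal",
                             min_quality=0.65, max_cost_per_1k_tokens=2.0)
chain = resolve(summarize, CATALOG)


def flaky_invoke(model):
    """Stand-in for a provider call; simulates an outage on the cheapest model."""
    if model == "model-small":
        raise TimeoutError("simulated outage")
    return f"served by {model}"


trace = []
answer = call_with_fallback("summarize", chain, flaky_invoke, trace)
```

In this sketch the workflow step only states what it needs. Removing `model-small` from the catalog, or marking it unhealthy, changes which model serves the step without touching the step's code, and the trace keeps one record per attempt so cost and quality can be attributed to the model that actually answered.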
What changes when it is in place
Provider swaps become an operations change. Cost optimization becomes a routing experiment. A regulator requiring regional inference becomes a routing rule.
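To make the regional-inference case concrete, the following Python sketch shows one way such a requirement might reduce to a single rule. It is an assumption-laden illustration rather than a description of the gateway: the catalog shape, the `candidates` helper, the `allowed_regions` parameter, and the region and model names are all hypothetical.

```python
from typing import Optional

# Hypothetical catalog: each entry records where inference runs.
REGION_CATALOG = [
    {"model": "model-small", "region": "us-east"},
    {"model": "model-small-eu", "region": "eu-west"},
    {"model": "model-onprem", "region": "eu-west"},
]


def candidates(catalog, allowed_regions: Optional[set] = None):
    """Return model names permitted by the region rule.

    allowed_regions=None means no regional restriction applies.
    """
    return [
        entry["model"]
        for entry in catalog
        if allowed_regions is None or entry["region"] in allowed_regions
    ]


# Before the requirement: every model is a candidate.
before = candidates(REGION_CATALOG)

# After the requirement: one rule value changes, workflow code does not.
after = candidates(REGION_CATALOG, allowed_regions={"eu-west"})
```

The point of the sketch is that the restriction lives in one rule value that the gateway applies, so the workflow steps themselves stay unchanged.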