Use case

Model Routing Policy

Per-step routing rules across providers and models, covering privacy, cost, latency, quality, and fallback, declared in the gateway rather than buried in workflow code.

Overview

One model is rarely optimal for an entire workflow. Routing turns a workflow from a single model dependency into a set of explicit trade-offs the team can tune as providers, prices, and model quality change.

What it solves

Routing policy decouples workflow logic from provider choice. A model deprecation, price change, or quality regression becomes a routing change, not a code rewrite.

How we build it

Each workflow step declares its routing requirements: a latency budget, a privacy class, a quality floor, and a cost ceiling. The gateway resolves those requirements against the current provider catalog and selects a model that satisfies them. Per-step traces record which model served each call, so cost and quality can be attributed to a specific route.

  • Per-step routing requirements
  • Provider catalog with current pricing and limits
  • Fallback chains for outage or degradation
  • Cost and quality attribution per route
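
As a rough illustration of the pieces listed above, the following Python sketch resolves one step's requirements against a small catalog, treats the eligible models (cheapest first) as the fallback chain, and records a trace entry for every attempt. Everything in it is an assumption made for illustration: the `Model` and `Requirements` fields, the numeric privacy scale, the cheapest-first ordering, and the `invoke` callback are hypothetical and do not describe any particular gateway's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """One catalog entry (hypothetical fields)."""
    name: str
    provider: str
    privacy_class: int    # assumed scale: higher means stricter data handling
    p95_latency_ms: int
    quality: float        # assumed evaluation score in [0, 1]
    cost_per_call: float


@dataclass(frozen=True)
class Requirements:
    """What a workflow step declares (hypothetical fields)."""
    latency_budget_ms: int
    privacy_class: int
    quality_floor: float
    cost_ceiling: float


def resolve(req, catalog):
    """Return every model that meets the requirements, cheapest first.

    The ordered result doubles as the fallback chain for the step.
    """
    eligible = [
        m for m in catalog
        if m.p95_latency_ms <= req.latency_budget_ms
        and m.privacy_class >= req.privacy_class
        and m.quality >= req.quality_floor
        and m.cost_per_call <= req.cost_ceiling
    ]
    return sorted(eligible, key=lambda m: m.cost_per_call)


def call_with_fallback(step, prompt, chain, invoke, trace):
    """Try each model in order, appending one trace entry per attempt."""
    for model in chain:
        try:
            output = invoke(model, prompt)
        except RuntimeError as exc:
            trace.append({"step": step, "model": model.name,
                          "ok": False, "error": str(exc)})
            continue
        trace.append({"step": step, "model": model.name,
                      "ok": True, "cost": model.cost_per_call})
        return output
    raise RuntimeError(f"no model in the chain could serve step {step!r}")


# Illustrative catalog; names and numbers are made up.
CATALOG = [
    Model("small-fast", "provider-a", 1, 300, 0.70, 0.001),
    Model("mid", "provider-b", 2, 800, 0.82, 0.004),
    Model("large", "provider-a", 2, 2500, 0.93, 0.020),
]

req = Requirements(latency_budget_ms=3000, privacy_class=2,
                   quality_floor=0.8, cost_ceiling=0.05)
chain = resolve(req, CATALOG)


def fake_invoke(model, prompt):
    """Stand-in for a provider call; simulates an outage at provider-b."""
    if model.provider == "provider-b":
        raise RuntimeError("provider-b unavailable")
    return f"{model.name}: {prompt}"


trace = []
result = call_with_fallback("summarize", "hello", chain, fake_invoke, trace)
```

In this sketch the workflow step only names its requirements. Which model serves the call, and what happens when a provider is down, is decided by the catalog and the resolver, and the trace keeps the failed and successful attempts side by side for later cost and quality attribution.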

What changes when it is in place

Provider swaps become an operations change. Cost optimization becomes a routing experiment. A regulator requiring regional inference becomes a routing rule.
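
To make the regulatory case concrete, here is a minimal Python sketch in which a regional-inference requirement is added as one entry in a policy table, and the workflow step itself is untouched. The policy shape, the `allowed_regions` key, the step name, and the catalog entries are all hypothetical.

```python
# Illustrative catalog; model names, regions, and prices are made up.
CATALOG = [
    {"model": "model-a", "region": "us", "cost_per_call": 0.002},
    {"model": "model-b", "region": "eu", "cost_per_call": 0.003},
    {"model": "model-c", "region": "eu", "cost_per_call": 0.005},
]


def route(step, policy, catalog):
    """Pick the cheapest catalog entry allowed by the step's policy rule."""
    rule = policy[step]
    allowed = rule.get("allowed_regions")  # None means no regional restriction
    candidates = [
        entry for entry in catalog
        if allowed is None or entry["region"] in allowed
    ]
    if not candidates:
        raise LookupError(f"no model satisfies the policy for step {step!r}")
    return min(candidates, key=lambda entry: entry["cost_per_call"])["model"]


# Before the requirement: the step has no regional rule.
policy = {"extract_entities": {}}
before = route("extract_entities", policy, CATALOG)

# After the requirement: one rule is added, the step code is unchanged.
policy["extract_entities"]["allowed_regions"] = {"eu"}
after = route("extract_entities", policy, CATALOG)
```

Under these assumptions the same step resolves to `model-a` before the rule and to `model-b` after it, which is the sense in which a compliance requirement becomes a routing change rather than a code change.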