Operations
Model Routing
A gateway strategy for choosing the right model per task based on privacy, cost, latency, quality, and failure mode.
Operating principle
Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.
Why route
One model is rarely optimal for every task. Classification, retrieval reasoning, summarization, coding, and final answer generation have different cost and quality profiles.
- Private or local models for sensitive, low-risk tasks
- Frontier models for high-complexity reasoning
- Fallbacks for outages or quality regressions
- Cost and token telemetry per route
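The four points above can be sketched as a small routing function with ordered fallbacks and per-route counters. This is a minimal illustration, not a reference implementation: the names `Task`, `Route`, `Telemetry`, `choose_routes`, and `run` are hypothetical, the prices are placeholders, the whitespace token count is a crude stand-in for a real tokenizer, and the rule that sensitive tasks never fall back to a remote model is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    kind: str         # e.g. "classification", "summarization", "coding"
    sensitive: bool   # True if the prompt contains data that must stay local
    complexity: str   # "low" or "high" (hypothetical two-level scale)
    prompt: str


@dataclass
class Route:
    name: str
    call: Callable[[str], str]   # hypothetical model client: prompt -> output
    cost_per_1k_tokens: float    # placeholder price, not a real rate


@dataclass
class Telemetry:
    """Per-route call, token, and cost counters."""
    calls: Dict[str, int] = field(default_factory=dict)
    tokens: Dict[str, int] = field(default_factory=dict)
    cost: Dict[str, float] = field(default_factory=dict)

    def record(self, route: Route, tokens: int) -> None:
        spent = tokens / 1000 * route.cost_per_1k_tokens
        self.calls[route.name] = self.calls.get(route.name, 0) + 1
        self.tokens[route.name] = self.tokens.get(route.name, 0) + tokens
        self.cost[route.name] = self.cost.get(route.name, 0.0) + spent


def choose_routes(task: Task, local: Route, frontier: Route) -> List[Route]:
    """Return routes in preference order; later entries are fallbacks."""
    if task.sensitive:
        # Assumption: sensitive data never leaves the local model,
        # so there is no remote fallback.
        return [local]
    if task.complexity == "high":
        return [frontier, local]
    return [local, frontier]


def run(task: Task, local: Route, frontier: Route, telemetry: Telemetry) -> str:
    """Try each candidate route in order and record usage for the one that answers."""
    last_error: Optional[Exception] = None
    for route in choose_routes(task, local, frontier):
        try:
            output = route.call(task.prompt)
        except Exception as exc:  # an outage or client error triggers the fallback
            last_error = exc
            continue
        # Crude token estimate: whitespace-separated words in prompt and output.
        tokens = len(task.prompt.split()) + len(output.split())
        telemetry.record(route, tokens)
        return output
    raise RuntimeError("all routes failed") from last_error
```

In this sketch a non-sensitive, high-complexity task goes to the frontier route first and falls back to the local route on an exception. Detecting a quality regression, as opposed to an outage, would need an evaluation signal that is not modeled here.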
Related resources
Workflow Evals
Evaluation suites that mutate prompts, models, retrieval policies, generated code, and node structure before promotion.
Governance
The policy layer for data access, tool permissions, human approvals, audit trails, and deployment boundaries.
Agent Observability
Trace-level visibility into model calls, retrieval, tools, decisions, approvals, costs, and failures.