Use case

Workflow Mutation

Generating workflow variants — different prompts, models, retrieval policies, node shapes, tool budgets — for the eval harness to score before any change reaches production.

Overview

A workflow change shipped without measurement is an opinion enforced by deploy. Mutation makes the change a candidate the harness can score.

What it solves

Turns workflow optimization into a measurable, reviewable process. Surfaces Pareto-dominant candidates without trial and error on production traffic.

How we build it

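Each mutable surface gets its own mutation operators: prompt rewrites, model swaps, retrieval-depth changes, node restructures, tool-budget tweaks. The harness runs the Cartesian product of those mutations (or a guided search) against the eval set and reports the Pareto-dominant candidates, that is, variants for which no other variant is at least as good on every scored metric and strictly better on at least one.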

  • Per-surface mutation operators
  • Guided search across surfaces
  • Pareto-dominant candidates reported
  • Human approval before promotion
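
The enumerate-score-filter loop described above can be sketched roughly as follows. This is a minimal illustration and not the harness itself: the surface names, the option values, the two metrics (quality and cost), and the `toy_score` function are all hypothetical stand-ins, and a real run would score each variant against the eval set instead.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical mutation surfaces and the options each operator can produce.
SURFACES = {
    "prompt": ["baseline", "concise_rewrite"],
    "model": ["small", "large"],
    "retrieval_depth": [3, 8],
}


def generate_variants(surfaces):
    """Yield one config dict per element of the Cartesian product."""
    names = list(surfaces)
    for combo in product(*(surfaces[name] for name in names)):
        yield dict(zip(names, combo))


@dataclass
class Scored:
    config: dict
    quality: int  # higher is better (assumed metric)
    cost: int     # lower is better (assumed metric)


def toy_score(config):
    """Stand-in for the eval harness: deterministic, made-up numbers."""
    quality, cost = 50, 10
    if config["prompt"] == "concise_rewrite":
        quality += 5
    if config["model"] == "large":
        quality += 20
        cost += 30
    if config["retrieval_depth"] == 8:
        quality += 10
        cost += 5
    return Scored(config, quality, cost)


def dominates(a, b):
    """True if a is at least as good as b on both metrics and better on one."""
    no_worse = a.quality >= b.quality and a.cost <= b.cost
    strictly_better = a.quality > b.quality or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


variants = list(generate_variants(SURFACES))
scored = [toy_score(v) for v in variants]
front = pareto_front(scored)

# The front is what a human reviewer would see before promotion.
for candidate in front:
    print(candidate.config, candidate.quality, candidate.cost)
```

With these toy numbers the prompt rewrite raises quality at no extra cost, so every baseline-prompt variant is dominated and drops out, while the surviving variants trade quality against cost. The pairwise filter is quadratic in the number of candidates, which is fine for a small product; a guided search would replace `generate_variants` when the product is too large to enumerate.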

What changes when it is in place

Workflow improvement becomes data-driven. Changes that survive the harness are the ones that ship; the rest stay in the variant table.