Use case
Resolution QA
Sample and score resolved threads, both agent-handled and human-handled, to calibrate quality, detect bad resolutions, and supply new cases for the eval set.
Overview
A resolved thread is not necessarily a good thread. Resolution QA is the discipline of sampling resolutions to confirm they are also correct.
What it solves
Closes the gap between 'thread closed' and 'customer satisfied with the actual answer'.
How we build it
Resolved threads are sampled with weights based on volume and risk class. Each sampled resolution is scored against a rubric by an LLM-as-judge, with periodic human calibration. Low-scoring resolutions surface for review; high-scoring ones become candidate gold cases. The judge's calibration against human reviewers is itself a tracked metric.
- Volume- and risk-weighted sampling
- Rubric scoring with human calibration
- Low-score review queue
- High-score promotion to gold
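A minimal sketch of how the sampling and routing steps could fit together, in Python. Everything named here is an assumption for illustration: the risk class labels, the RISK_WEIGHTS multipliers, the 0.5 and 0.9 score thresholds, and the ResolvedThread shape are hypothetical, not part of the described system. The sketch treats volume as implicit (a category with more threads contributes more candidates to the draw) and applies risk as an explicit per-thread weight.

```python
import random
from dataclasses import dataclass

# Hypothetical risk multipliers; real values would come from your own risk policy.
RISK_WEIGHTS = {"low": 1.0, "medium": 3.0, "high": 8.0}


@dataclass(frozen=True)
class ResolvedThread:
    thread_id: str
    risk_class: str   # "low", "medium", or "high" (assumed labels)
    handled_by: str   # "agent" or "human"


def sample_for_qa(threads, k, seed=None):
    """Pick up to k distinct threads, favoring higher risk classes.

    Uses Efraimidis-Spirakis weighted sampling without replacement: each
    thread draws key = u ** (1 / weight) and the k largest keys win.
    """
    rng = random.Random(seed)
    keyed = [
        (rng.random() ** (1.0 / RISK_WEIGHTS[t.risk_class]), t)
        for t in threads
    ]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in keyed[:k]]


def route(score, low=0.5, high=0.9):
    """Map a rubric score in [0, 1] to a follow-up action (thresholds assumed)."""
    if score < low:
        return "review_queue"      # low scores surface for human review
    if score >= high:
        return "gold_candidate"    # high scores become candidate gold cases
    return "no_action"
```

Passing a fixed seed makes a sampling run reproducible, which helps when a reviewer needs to audit why a given thread was or was not selected.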
What changes when it is in place
Resolution quality becomes a tracked, calibrated metric instead of a CSAT lottery.
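One way to make judge calibration a tracked number is to compare the judge's pass/fail verdicts with human verdicts on the same calibration sample. The sketch below assumes rubric scores in the range 0 to 1 and a hypothetical pass threshold of 0.7, and it uses Cohen's kappa as the agreement statistic. That choice is an illustration, not a statement of the metric actually used.

```python
def to_labels(scores, threshold=0.7):
    """Convert rubric scores to pass/fail labels (threshold is an assumption)."""
    return [s >= threshold for s in scores]


def cohens_kappa(judge_labels, human_labels):
    """Chance-corrected agreement between two sets of binary labels."""
    if len(judge_labels) != len(human_labels) or not judge_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(judge_labels)
    # Fraction of threads where judge and human gave the same verdict.
    observed = sum(j == h for j, h in zip(judge_labels, human_labels)) / n
    p_judge = sum(judge_labels) / n
    p_human = sum(human_labels) / n
    # Agreement expected if both raters labeled independently at these rates.
    expected = p_judge * p_human + (1 - p_judge) * (1 - p_human)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined, so this sketch reports full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)
```

Recomputing this value on each calibration batch and plotting it over time would show whether the judge is drifting away from human reviewers.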