Use case

Resolution QA

Sampling and scoring resolved threads, both agent-handled and human-handled, to calibrate quality, detect bad resolutions, and supply new cases for the eval set.

Overview

A resolved thread is not necessarily a good thread. Resolution QA is the discipline of sampling resolutions to confirm they are also correct.

What it solves

Closes the gap between 'thread closed' and 'customer satisfied with the actual answer'.

How we build it

Threads are sampled with a strategy weighted by volume and risk class. Each sampled resolution is scored against a rubric by an LLM-as-judge, with periodic human calibration. Low-scoring resolutions surface for review; high-scoring ones become candidate gold cases. The judge's calibration is itself a tracked metric.

Volume- and risk-weighted sampling
Rubric scoring with human calibration
Low-score review queue
High-score promotion to gold
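
The four steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the production implementation: the risk classes, their weights, the two score thresholds, and every name (Thread, sample_for_qa, route, judge_agreement) are assumptions made for the example. It also assumes that sampling from the full pool of resolved threads already tracks volume, so that only the risk weight has to be applied per thread, and that calibration is measured as how often the judge and a human reviewer route the same thread to the same bucket.

```python
import random
from dataclasses import dataclass

# Hypothetical risk classes and weights; the real values are a policy decision.
RISK_WEIGHT = {"low": 1.0, "medium": 3.0, "high": 10.0}

# Hypothetical thresholds on a 0..1 rubric score.
REVIEW_BELOW = 0.5
GOLD_AT_OR_ABOVE = 0.9


@dataclass(frozen=True)
class Thread:
    thread_id: str
    risk: str     # key into RISK_WEIGHT
    handler: str  # "agent" or "human"


def sample_for_qa(threads, k, rng=None):
    """Draw up to k threads without replacement, weighted by risk class.

    Drawing from the whole pool keeps the sample roughly proportional to
    volume; the risk weight then tilts it toward higher-risk threads.
    Uses the Efraimidis-Spirakis key u ** (1 / w): largest keys win.
    """
    rng = rng or random.Random()
    keyed = [(rng.random() ** (1.0 / RISK_WEIGHT[t.risk]), t) for t in threads]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in keyed[: max(k, 0)]]


def route(score):
    """Map a rubric score to the queue it feeds."""
    if score < REVIEW_BELOW:
        return "review_queue"
    if score >= GOLD_AT_OR_ABOVE:
        return "gold_candidate"
    return "no_action"


def judge_agreement(judge_scores, human_scores):
    """Share of doubly-scored threads that judge and human route identically.

    Both arguments map thread_id -> score. Returns None when no thread
    was scored by both, so an empty calibration set is not read as 0%.
    """
    shared = judge_scores.keys() & human_scores.keys()
    if not shared:
        return None
    matches = sum(route(judge_scores[t]) == route(human_scores[t]) for t in shared)
    return matches / len(shared)


if __name__ == "__main__":
    rng = random.Random(7)
    pool = [
        Thread(f"t{i}", "high" if i % 10 == 0 else "low", "agent")
        for i in range(200)
    ]
    for thread in sample_for_qa(pool, 5, rng):
        print(thread.thread_id, thread.risk)
```

Tracking judge_agreement over time is one simple way to make judge calibration a metric in its own right; a team that needs a chance-corrected measure could swap in a statistic such as Cohen's kappa without changing the rest of the flow.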

What changes when it is in place

Resolution quality becomes a tracked, calibrated metric instead of a CSAT lottery.