Agent runtime

Agent Harness

The scaffolding around an AI agent — prompt construction, tool dispatch, retry logic, trace emission, state management — that turns a model into a workflow participant.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

An agent harness is the code that wraps a language model and gives it the structure to be an agent: it assembles the prompt, presents available tools, parses the model's tool calls, dispatches them to real implementations, handles errors and retries, manages conversation state, and emits traces. Without a harness, you have a model that can generate text. With a harness, you have an agent that can do work.

Why it matters

The harness is where most of the actual engineering of an AI agent lives. Two teams using the same model with different harnesses produce dramatically different agents. The harness decides what the agent can see, what it can do, how it recovers from failure, and what's observable about its behavior.

How it works

Common substrates include LangGraph, the OpenAI Agents SDK, Mastra, Inngest, and custom in-house harnesses. The harness is where MCP tool calls are dispatched, where memory is loaded and stored, where retry policies live, and where trace spans are emitted. We treat the harness as the contract — versioned, tested, and gated by the same eval set as the prompts and models it serves.

Related resources