Data substrate

Source of Truth

The single, authoritative system or table where a piece of information is officially defined — the place every other copy of it must trace back to.

Operating principle

Production AI is not a prompt. It is a system of context, tools, permissions, traces, evals, and feedback loops.

What it is

A source of truth is the agreed-upon authoritative source for a piece of data. For customer records, it might be the CRM. For invoices, the billing system. For product specs, a specific repo. Every dashboard, AI retrieval pipeline, downstream system, and report should resolve back to the source of truth, not to a copy made last quarter that has silently diverged.

Why it matters

Without declared sources of truth, the same information lives in three places and the three disagree. Reports contradict each other; AI cites the stale copy; teams argue about whose number is right. Declaring sources of truth — and making everything else read from them — is half of what governed data looks like in practice.

How it works

Each data domain gets an owner who declares the source of truth, the access boundaries, and the freshness contract (how stale a downstream copy may be). Downstream copies (warehouse, lakehouse, search indexes) are derived through documented pipelines. The Source Graph encodes the relationship between each source of truth and its derived copies; lineage tools enforce it.
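
As a rough illustration, the declaration and trace-back described here could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the actual Source Graph: the class names, field names, system names ("crm", "warehouse.customers"), and pipeline names are all hypothetical, and keying sources by system name is a simplification that allows only one declaration per system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SourceOfTruth:
    """Hypothetical declaration made by a domain owner."""
    domain: str                  # e.g. "customer_records"
    system: str                  # authoritative system, e.g. "crm"
    owner: str                   # accountable owner of the domain
    allowed_readers: frozenset   # access boundary: roles that may read
    max_staleness: timedelta     # freshness contract for derived copies


@dataclass
class SourceGraph:
    """Hypothetical registry of sources and the copies derived from them."""
    sources: dict = field(default_factory=dict)       # system -> SourceOfTruth
    derived_from: dict = field(default_factory=dict)  # copy -> (parent, pipeline)

    def declare(self, sot):
        self.sources[sot.system] = sot

    def derive(self, copy, parent, pipeline):
        # A copy may only be derived from something already in the graph.
        if parent not in self.sources and parent not in self.derived_from:
            raise ValueError(f"{parent!r} is not a declared source or derived copy")
        self.derived_from[copy] = (parent, pipeline)

    def resolve(self, name):
        """Walk upstream until a declared source of truth is reached."""
        seen = set()
        while name not in self.sources:
            if name in seen or name not in self.derived_from:
                raise LookupError(f"{name!r} does not trace back to a source of truth")
            seen.add(name)
            name = self.derived_from[name][0]
        return self.sources[name]

    def is_fresh(self, copy, last_synced, now=None):
        """Check a copy's last sync time against its source's freshness contract."""
        now = now or datetime.now(timezone.utc)
        return now - last_synced <= self.resolve(copy).max_staleness


graph = SourceGraph()
graph.declare(SourceOfTruth(
    domain="customer_records",
    system="crm",
    owner="sales-ops",
    allowed_readers=frozenset({"analyst", "support"}),
    max_staleness=timedelta(hours=24),
))
graph.derive("warehouse.customers", parent="crm", pipeline="nightly_crm_export")
graph.derive("search.customers", parent="warehouse.customers", pipeline="index_customers")

# The search index is two hops away but still resolves to the CRM.
print(graph.resolve("search.customers").system)
```

In this sketch, a copy that was never registered through derive (an ad hoc spreadsheet export, say) raises LookupError on resolve, which is one way a lineage check could flag data that does not trace back to a declared source.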
