Source Graph
A navigable map of every system your data lives in — schemas, documents, code, tickets, events, owners, and permissions — so an AI agent can find the right source and respect the right access boundary.
What it is
A source graph is a map of every place your data lives — warehouse tables, application databases, documents, tickets, code, event streams — with the relationships between them, who owns each piece, and who is allowed to see it. Think of it as a navigable inventory: an AI agent (or a person) can ask 'where does the answer to this question live, and am I allowed to read it?' and get a real answer instead of a guess.
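As a rough illustration of the idea above, a source graph can be reduced to nodes that carry an owner and an access boundary, plus labeled edges between them. This is a minimal sketch, not a description of any particular implementation: the `Source` and `SourceGraph` names, their fields, and the example identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    """One node in the graph: a place where data lives (hypothetical shape)."""
    id: str
    kind: str                # e.g. "warehouse_table", "document", "ticket"
    owner: str               # team or person accountable for the source
    readers: frozenset[str]  # access boundary: principals allowed to read it


@dataclass
class SourceGraph:
    sources: dict[str, Source] = field(default_factory=dict)
    # Each edge is (from_id, relation, to_id), e.g. ("doc", "describes", "table").
    edges: set[tuple[str, str, str]] = field(default_factory=set)

    def add(self, source: Source) -> None:
        self.sources[source.id] = source

    def relate(self, from_id: str, relation: str, to_id: str) -> None:
        self.edges.add((from_id, relation, to_id))

    def can_read(self, principal: str, source_id: str) -> bool:
        """True only if the source is known and the principal is inside its boundary."""
        source = self.sources.get(source_id)
        return source is not None and principal in source.readers


# Example data; every identifier here is made up.
graph = SourceGraph()
graph.add(Source("warehouse.orders", "warehouse_table", "data-eng",
                 frozenset({"analyst-agent"})))
graph.add(Source("drive.pricing-policy", "document", "finance",
                 frozenset({"finance-agent"})))
graph.relate("drive.pricing-policy", "describes", "warehouse.orders")

print(graph.can_read("analyst-agent", "warehouse.orders"))      # True
print(graph.can_read("analyst-agent", "drive.pricing-policy"))  # False
```

Unknown sources are treated as unreadable here, on the assumption that a deny-by-default check is the safer behavior for an agent.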
How it differs from a knowledge graph or lineage graph
A knowledge graph models the business (accounts, products, contracts) and is built downstream of trustworthy operational data. A lineage graph models how data moves between systems. A source graph is the map an agent uses to navigate before either of those exists, or while both are still being built. It does not require domain modeling to be useful; each source needs only an owner and an access boundary.
Why it matters
Agents fail when they retrieve fragments without knowing where those fragments came from, who owns them, or how they relate to operational systems. A source graph gives retrieval a durable map and gives governance a single place to declare what an agent is allowed to see. Concretely, that means:
- System, schema, and document inventory
- Owner and access boundary on every node
- Permission-aware retrieval at query time
- Lineage between events, data, and decisions
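The lineage item in the list above amounts to walking edges backwards from a decision to the data and events behind it. The sketch below assumes lineage is stored as a simple "derived from" mapping; the `DERIVED_FROM` table, the `trace_upstream` function, and all node names are hypothetical.

```python
from collections import deque

# Hypothetical edges: each key is derived from the sources listed as its value.
DERIVED_FROM = {
    "decision.q3_price_change": ["dashboard.margin_by_sku"],
    "dashboard.margin_by_sku": ["warehouse.orders", "warehouse.costs"],
    "warehouse.orders": ["events.checkout_completed"],
    "warehouse.costs": [],
    "events.checkout_completed": [],
}


def trace_upstream(node: str, derived_from: dict[str, list[str]]) -> list[str]:
    """Return every source that feeds `node`, nearest first (breadth-first)."""
    seen = {node}
    order: list[str] = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in derived_from.get(current, []):
            if parent not in seen:  # guard against cycles and shared parents
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


print(trace_upstream("decision.q3_price_change", DERIVED_FROM))
# ['dashboard.margin_by_sku', 'warehouse.orders', 'warehouse.costs',
#  'events.checkout_completed']
```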
What we build
We connect warehouse models (dbt or SQLMesh definitions), operational APIs (OpenAPI specs), documents (Notion, Drive, Confluence), tickets (Jira, Linear), code repositories, and event streams into a single navigable graph. At retrieval time, the graph is queried to resolve which sources can answer a question and whether the requesting principal is allowed to see them.
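The retrieval-time lookup described above has two parts: find the sources relevant to a question, then split them by whether the requesting principal may read them. The following is a sketch under simplifying assumptions: relevance is reduced to a hypothetical `topics` tag on each source (a real system would more likely use search or embeddings), and the `Source` fields, `resolve` function, and catalog entries are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    id: str
    system: str              # e.g. "dbt", "jira", "confluence"
    topics: frozenset[str]   # hypothetical stand-in for "what this source can answer"
    owner: str
    readers: frozenset[str]  # principals inside the access boundary


def resolve(topic: str, principal: str,
            sources: list[Source]) -> tuple[list[str], list[str]]:
    """Split sources relevant to `topic` into (readable, withheld) for `principal`."""
    readable: list[str] = []
    withheld: list[str] = []
    for source in sources:
        if topic not in source.topics:
            continue  # not relevant to the question
        if principal in source.readers:
            readable.append(source.id)
        else:
            withheld.append(source.id)
    return readable, withheld


# Example catalog; all entries are made up.
CATALOG = [
    Source("dbt.fct_revenue", "dbt", frozenset({"revenue"}), "data-eng",
           frozenset({"analyst-agent", "finance-agent"})),
    Source("confluence.revenue-recognition", "confluence", frozenset({"revenue"}),
           "finance", frozenset({"finance-agent"})),
    Source("jira.PLAT-board", "jira", frozenset({"incidents"}), "platform",
           frozenset({"analyst-agent"})),
]

readable, withheld = resolve("revenue", "analyst-agent", CATALOG)
print(readable)  # ['dbt.fct_revenue']
print(withheld)  # ['confluence.revenue-recognition']
```

Returning the withheld list separately is a design choice for this sketch: it lets the caller tell the user that a relevant source exists but is outside their boundary, without exposing its contents.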
What it works with
The source graph sits inside Data Foundations and feeds every layer above it. Vector Search uses it to determine which indexes a given question is allowed to read. Agent Workflows use it to assemble context before they act. Conversation Intelligence uses it to write captured threads back into a governed table. The MCP Tool Registry consults it when a tool needs to know what an agent can see.
When you need it
Signals that a source graph would change your week: nobody can give a confident answer to 'where does this number come from'; AI pilots stall when the team realizes nobody knows which document is canonical; access-control reviews take days because the data map lives in tribal knowledge. If you can answer 'who owns this source and who is allowed to read it' for every source, you may not need this yet. If you cannot, you do.
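The closing test above ("who owns this source and who is allowed to read it, for every source") can be run as a simple audit over whatever inventory already exists. This sketch assumes the inventory is a list of dictionaries with `id`, `owner`, and `readers` keys; that shape, the `audit` function, and the sample entries are illustrative assumptions only.

```python
def audit(sources: list[dict]) -> dict[str, list[str]]:
    """List the ids of sources that lack an owner or a declared access boundary."""
    gaps: dict[str, list[str]] = {"missing_owner": [], "missing_readers": []}
    for source in sources:
        if not source.get("owner"):
            gaps["missing_owner"].append(source["id"])
        # An empty list is a valid boundary ("nobody may read"), so only a
        # missing or None value counts as undeclared.
        if source.get("readers") is None:
            gaps["missing_readers"].append(source["id"])
    return gaps


# Example inventory; all entries are made up.
INVENTORY = [
    {"id": "warehouse.orders", "owner": "data-eng", "readers": ["analyst-agent"]},
    {"id": "drive.old-pricing-deck", "owner": "", "readers": None},
    {"id": "linear.support-queue", "owner": "support"},
]

print(audit(INVENTORY))
# {'missing_owner': ['drive.old-pricing-deck'],
#  'missing_readers': ['drive.old-pricing-deck', 'linear.support-queue']}
```

If both lists come back empty, every source already has an owner and a boundary.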