Use case

Governed Datasets

Datasets with declared owners, access boundaries, freshness SLOs, and retention rules — the unit of agent-safe data sharing.

Overview

Agents that read from ungoverned tables eventually retrieve something they should not have seen. A governed dataset is a table or view with explicit ownership, an access boundary, a freshness SLO, and a retention policy attached as code.

What it solves

Removes the gap between 'who is allowed to see this' and 'who can read it at query time'. Governance moves from policy documents to enforced configuration the runtime reads on every retrieval.

How we build it

Each dataset declares: owner, classification (internal, confidential, regulated), allowed principals or roles, freshness target, retention window, and downstream consumers. The catalog (Unity Catalog, Polaris, Lake Formation, or DataHub) enforces access at query time. Retrieval pipelines inherit the same boundaries.

  • Per-dataset owner and classification
  • Role-based access enforced at query time
  • Freshness SLO and retention window
  • Downstream consumer registry

What changes when it is in place

Auditing 'who can see what' becomes a query against the catalog instead of a Slack thread. Onboarding a new agent or analyst becomes a role assignment, not a copy of last person's permissions.