AI Platform
One operating layer for the AI models, tools, and data your teams use. Cost, quality, and governance you can see. Provider choices — hosted, local, or self-hosted — you can change without rewriting the system.
One gateway · provider is a routing decision · switch without rewrites
What it is
Most companies that use AI have a few teams each calling a provider directly: one through a Python script, another through a SaaS tool, a third with API keys in a Google Doc. The AI Platform replaces that sprawl with one operating layer that every team and every agent talks to. Switching providers becomes a routing decision. Costs become a dashboard. Access to internal tools from AI becomes governed. The quality of every interaction becomes measurable. The architecture stays yours, including the choice of model providers, cloud, and private deployment.
Why it matters
Without a platform, three things go wrong at once. Costs are invisible until month-end when the bill arrives. Quality is anecdotal: you hear about it when a customer complains. And tools are dangerous: agents call business APIs with shared credentials, no audit, and no scoped permissions. A platform is what turns AI from a series of pilots into infrastructure you can run.
What we build
A gateway in front of model providers, with routing policies and fallback chains. Primary stacks are Claude (Anthropic) and OpenAI; we extend to Gemini, Bedrock, Azure OpenAI, Mistral, or open-weight runtimes (vLLM, Ollama) per engagement. A registry for MCP tools (the Model Context Protocol, Anthropic's open standard for exposing tools, resources, and prompts to agents) with typed JSON Schemas, permission scopes, and audit logging. Retrieval pipelines (chunking, embeddings, hybrid search, reranking). OpenTelemetry traces, token and cost telemetry, and an eval harness that gates promotion of any prompt, model, or workflow change.
- Model gateway with fallback and routing (Claude, OpenAI as primary)
- MCP tool registry with JSON Schema contracts and scopes
- Retrieval pipelines (chunking, embeddings — hosted or self-hosted — hybrid search, rerank)
- OpenTelemetry traces with cost, latency, and token telemetry
- Eval harness with promotion gates
- Per-tenant cost attribution
- Secret boundaries and scoped tool tokens
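To make the registry idea concrete, here is a minimal sketch of a scope check with audit logging. It is an illustration only, not the MCP SDK or the platform's actual API: `ToolEntry`, `Registry`, the `crm.lookup_customer` tool, the `crm:read` scope, and the agent and user identifiers are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolEntry:
    name: str
    version: str
    input_schema: dict    # JSON Schema describing the tool's arguments
    required_scope: str   # scope a caller must hold to invoke the tool


@dataclass
class Registry:
    tools: dict[str, ToolEntry] = field(default_factory=dict)
    # (agent_id, on_behalf_of) -> scopes granted to that pair
    grants: dict[tuple[str, str], set[str]] = field(default_factory=dict)
    audit_log: list[dict] = field(default_factory=list)

    def register(self, tool: ToolEntry) -> None:
        self.tools[tool.name] = tool

    def grant(self, agent_id: str, on_behalf_of: str, scope: str) -> None:
        self.grants.setdefault((agent_id, on_behalf_of), set()).add(scope)

    def authorize(self, agent_id: str, on_behalf_of: str, tool_name: str) -> bool:
        """Allow the call only if the tool exists and the caller holds its scope."""
        tool = self.tools.get(tool_name)
        held = self.grants.get((agent_id, on_behalf_of), set())
        allowed = tool is not None and tool.required_scope in held
        # Every decision, allowed or denied, is recorded for audit.
        self.audit_log.append({
            "agent": agent_id,
            "on_behalf_of": on_behalf_of,
            "tool": tool_name,
            "allowed": allowed,
        })
        return allowed


registry = Registry()
registry.register(ToolEntry(
    name="crm.lookup_customer",  # hypothetical tool
    version="1.0.0",
    input_schema={
        "type": "object",
        "properties": {"customer_id": {"type": "string"}},
        "required": ["customer_id"],
    },
    required_scope="crm:read",   # hypothetical scope
))
registry.grant("support-agent", "alice@example.com", "crm:read")

ok = registry.authorize("support-agent", "alice@example.com", "crm.lookup_customer")
denied = registry.authorize("billing-agent", "bob@example.com", "crm.lookup_customer")
```

In this sketch the grant is keyed on both the agent and the person it acts for, so an agent holding no grant for a given user is denied and the denial still lands in the audit log.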
How it works
Applications and agents talk to the platform, not directly to providers. That single integration point is where routing decisions are made (privacy class, cost budget, latency target, quality floor), where keys rotate without code changes, where rate-limit and outage handling are absorbed, and where each call emits a trace. Tools onboard as MCP servers with versioned schemas; the registry decides which agent, on whose behalf, with which scope, can call which tool. Retrieval runs against the indexes built by Data Foundations. Quality regressions are caught by the eval harness before they reach users; cost spikes are caught by per-tenant telemetry before they reach finance.
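The routing decision described above can be sketched as a filter over candidate routes. This is a simplified illustration under stated assumptions: the `Route` and `Request` types, the route names, the "public" and "restricted" privacy labels, and every cost, latency, and quality figure are hypothetical, and a real gateway would likely weigh more signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str                  # hypothetical route label
    self_hosted: bool          # True if data stays inside your network
    cost_per_1k_tokens: float  # illustrative figure, not a real price
    p95_latency_ms: int        # illustrative figure
    quality_score: float       # 0..1, for example from an eval run


@dataclass(frozen=True)
class Request:
    privacy_class: str         # "public" or "restricted" (assumed labels)
    max_cost_per_1k: float     # cost budget
    max_latency_ms: int        # latency target
    min_quality: float         # quality floor


def eligible_routes(routes: list[Route], req: Request) -> list[Route]:
    """Return routes that satisfy every constraint, cheapest first.

    The first entry is the primary route; the rest form the fallback chain.
    """
    candidates = [
        r for r in routes
        # Restricted data may only go to self-hosted routes.
        if (req.privacy_class != "restricted" or r.self_hosted)
        and r.cost_per_1k_tokens <= req.max_cost_per_1k
        and r.p95_latency_ms <= req.max_latency_ms
        and r.quality_score >= req.min_quality
    ]
    return sorted(candidates, key=lambda r: r.cost_per_1k_tokens)


ROUTES = [
    Route("hosted-large", False, 0.015, 1200, 0.92),
    Route("hosted-small", False, 0.002, 400, 0.80),
    Route("local-open-weight", True, 0.001, 900, 0.74),
]

public = eligible_routes(ROUTES, Request("public", 0.02, 1500, 0.78))
restricted = eligible_routes(ROUTES, Request("restricted", 0.02, 1500, 0.70))
```

Because the candidates live in data rather than in application code, swapping a provider in this sketch means editing `ROUTES`, while callers keep passing the same `Request`.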
What it works with
The platform sits on top of Data Foundations: the lakehouse provides the substrate, and the platform serves it. Agent Workflows run on top of the platform, using its gateway for model calls, its registry for tool invocations, and its traces for observability. Benchmarks reads platform traces to build its eval set; Self-Optimizing Agents writes variants back through the platform. Conversation Intelligence uses the platform for extraction. Without this layer, the layers above tend to invent their own routing, governance, and telemetry, badly and inconsistently.
Where we draw the line
The gateway belongs to you: provider relationships are routing rules in a config, and every model call, tool invocation, and retrieval is traced and inspectable. We orchestrate the models that exist and shift routing as the landscape changes; we do not train foundation models. The platform sits below frameworks like LangGraph, the OpenAI Agents SDK, and Mastra: they call into it, not the other way around.
When you should start
Signals: multiple teams calling providers directly with separate API keys; AI costs invisible until month-end; an internal assistant whose quality is measured by user complaints; tools exposed to AI through shared credentials with no audit; a regulator asking which data left your network and to which provider. Common starting points: retire shadow LLM usage onto one observable gateway, introduce MCP-based tool governance so agents stop sharing credentials, or add eval and cost telemetry to an in-flight assistant.
Related learning
How an AI system decides which model to call for each step — based on privacy, cost, latency, quality, and what happens when a provider goes down.
A governed catalog of every tool an AI agent can call — your APIs, your databases, your internal systems — with typed schemas, permission scopes, audit trails, and the standard protocol (MCP) that turns 'we exposed it to the LLM' into 'we know exactly who called what when'.
Trace-level visibility into every model call, retrieval, tool invocation, decision, approval, and failure inside an AI workflow — the substrate every other discipline (evals, optimization, governance) reads from.
The test suite your AI workflows have to pass before any change reaches users — measuring quality, latency, cost, and safety on real production data instead of vibes.
A capability in the Group e-media information AI stack. This resource connects the subject to data substrate, agent runtime, evals, and operations.