Now in private beta

AI agents that leave a paper trail.

Ingram Cloud runs a stateful, tool-using agent for every one of your end-users — and puts every run on the record. Each model call, tool, approval, and dollar, traced end to end and replayable. One REST API behind all of it.

Start building · Read the docs

Model-agnostic — bring your own OpenAI, Anthropic, or Gemini key.

run.trace  run_7fa2

00:00.0  run.started  priya · "where’s my refund for order #8821?"

00:00.4  tool.executing  orders.lookup(id: 8821)

00:01.1  tool.completed  shipped · refund eligible

00:01.2  approval.required  paused  refund.issue($48.00)

00:02.6  approval.resolved  approved by you

00:02.9  message.completed  "Your $48 refund is on its way."

00:03.0  run.completed  5 steps · 3,812 tokens · $0.0241

Auditable by default

Nothing your agent does is a black box.

Hosted AI usually means handing a prompt to an opaque service and hoping. Here, every run is a recorded sequence of steps you can open, replay, and cost — and everything that happens lands on one append-only feed.

Trace every run, end to end

Each run is a recorded sequence of steps — every model call, tool invocation, and decision, timed and costed. Replay any of them.

run.started → tool.executing → run.completed
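As a rough illustration, a trace like the one above can be thought of as an ordered list of timed steps. The type and function names below (`TraceStep`, `RunTrace`, `durationMs`, `replay`) are assumptions made for this sketch, not Ingram Cloud's actual schema.

```typescript
// Hypothetical shape of a recorded run; field names are assumptions.
type TraceStep = { atMs: number; event: string; detail?: string };
type RunTrace = { runId: string; steps: TraceStep[] };

// Time between the first and last recorded step.
function durationMs(trace: RunTrace): number {
  if (trace.steps.length === 0) return 0;
  const times = trace.steps.map((s) => s.atMs);
  return Math.max(...times) - Math.min(...times);
}

// Replay: visit the steps in time order without mutating the trace.
function replay(trace: RunTrace, visit: (step: TraceStep) => void): void {
  const ordered = [...trace.steps].sort((a, b) => a.atMs - b.atMs);
  for (const step of ordered) visit(step);
}

// The sample run shown at the top of the page, with offsets in milliseconds.
const exampleTrace: RunTrace = {
  runId: "run_7fa2",
  steps: [
    { atMs: 0, event: "run.started" },
    { atMs: 400, event: "tool.executing", detail: "orders.lookup(id: 8821)" },
    { atMs: 1100, event: "tool.completed" },
    { atMs: 1200, event: "approval.required", detail: "refund.issue($48.00)" },
    { atMs: 2600, event: "approval.resolved" },
    { atMs: 2900, event: "message.completed" },
    { atMs: 3000, event: "run.completed" },
  ],
};

replay(exampleTrace, (s) => console.log(`${s.atMs}ms ${s.event}`));
```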

Account for every token

Usage and dollar cost are attributed down to the individual smith (the per-user instance of your agent), so you always know which user spent what and can meter and bill it onward.

budget.threshold
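To make per-smith attribution concrete, here is a minimal sketch that totals cost per smith and flags the ones at or over a limit. The record shape and the names `UsageRecord`, `costBySmith`, and `overBudget` are hypothetical; the real usage API may look quite different.

```typescript
// Hypothetical usage record; field names are assumptions for illustration.
type UsageRecord = { smith: string; tokens: number; costUsd: number };

// Sum dollar cost per smith.
function costBySmith(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.smith, (totals.get(r.smith) ?? 0) + r.costUsd);
  }
  return totals;
}

// Smiths whose total cost has reached the limit, which is roughly the
// condition a budget.threshold event would signal.
function overBudget(totals: Map<string, number>, limitUsd: number): string[] {
  return Array.from(totals.entries())
    .filter(([, cost]) => cost >= limitUsd)
    .map(([smith]) => smith);
}

const sampleUsage: UsageRecord[] = [
  { smith: "priya", tokens: 4000, costUsd: 0.5 },
  { smith: "marco", tokens: 900, costUsd: 0.125 },
  { smith: "priya", tokens: 2100, costUsd: 0.25 },
];

console.log(overBudget(costBySmith(sampleUsage), 0.5));
```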

Approve before it acts

Gate sensitive tools behind a human. Runs pause on approval, wait for your sign-off, and resume at the exact step they left off.

approval.required → approval.resolved
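The pause-and-resume behavior can be pictured as a small state machine. This is a sketch under assumptions: the `Run` type, `callTool`, and `resolveApproval` are invented for illustration and do not describe the platform's internals.

```typescript
// Hypothetical run state: either running, or paused on a gated tool.
type Run =
  | { status: "running"; step: number }
  | { status: "paused"; step: number; pendingTool: string };

// A gated tool pauses the run at the current step; any other tool advances it.
function callTool(run: Run, tool: string, gated: Set<string>): Run {
  if (run.status !== "running") throw new Error("run is paused");
  return gated.has(tool)
    ? { status: "paused", step: run.step, pendingTool: tool }
    : { status: "running", step: run.step + 1 };
}

// Resolving the approval resumes at the same step. The tool is reported
// for execution only if the human approved it.
function resolveApproval(
  run: Run,
  approved: boolean,
): { run: Run; toolToExecute: string | null } {
  if (run.status !== "paused") throw new Error("nothing to approve");
  return {
    run: { status: "running", step: run.step },
    toolToExecute: approved ? run.pendingTool : null,
  };
}

const gatedTools = new Set(["refund.issue"]);
let current: Run = { status: "running", step: 0 };
current = callTool(current, "orders.lookup", gatedTools); // advances to step 1
current = callTool(current, "refund.issue", gatedTools);  // pauses at step 1
const resolved = resolveApproval(current, true);          // resumes at step 1
console.log(resolved);
```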

GET /v1/events  append-only

evt_9c4  run.completed  priya  12:04:21

evt_9c3  approval.resolved  priya  12:04:18

evt_9c2  approval.required  priya  12:04:17

evt_9c1  budget.threshold  marco  12:03:55

evt_9c0  deployment.inbound  alice  12:03:40

evt_9bf  run.started  priya  12:04:14
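Because the feed is append-only, a consumer can keep a cursor (the last event ID it handled) and process only what is newer. The sketch below assumes the feed arrives newest-first, as displayed above. The `FeedEvent` shape and `unseenEvents` helper are hypothetical, and the real endpoint's pagination parameters are not shown here.

```typescript
// Hypothetical event shape, mirroring the columns shown in the feed.
type FeedEvent = { id: string; type: string; user: string };

// Given a newest-first feed and the last handled event ID, return the
// unseen events oldest-first. If the cursor is null or not found in this
// page, every event in the page is treated as unseen.
function unseenEvents(feed: FeedEvent[], lastSeenId: string | null): FeedEvent[] {
  const cursor = lastSeenId === null ? -1 : feed.findIndex((e) => e.id === lastSeenId);
  const fresh = cursor === -1 ? feed : feed.slice(0, cursor);
  return [...fresh].reverse();
}

const page: FeedEvent[] = [
  { id: "evt_9c4", type: "run.completed", user: "priya" },
  { id: "evt_9c3", type: "approval.resolved", user: "priya" },
  { id: "evt_9c2", type: "approval.required", user: "priya" },
  { id: "evt_9c1", type: "budget.threshold", user: "marco" },
];

console.log(unseenEvents(page, "evt_9c2").map((e) => e.id));
```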

One design, many users

Design once. Run a private one per person.

You design an agent — its instructions, model, tools, and memory — and publish versioned snapshots. For each of your end-users, Ingram Cloud runs an isolated clone of it: its own memory, conversations, and connections.

We call that running clone a smith. Roll a new version out to the whole fleet at once, or pin and override a single one. The token carries the tenant, so data never crosses between users or projects.
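One way to picture fleet rollout versus pinning: each smith follows the agent's published version unless it carries its own pin. The types and the `effectiveVersion` helper are assumptions for illustration only.

```typescript
// Hypothetical model: an agent design with a published version, and
// smiths that may optionally pin a specific version.
type Agent = { name: string; publishedVersion: number };
type Smith = { user: string; pinnedVersion?: number };

// A pinned smith keeps its version; everyone else follows the fleet.
function effectiveVersion(agent: Agent, smith: Smith): number {
  return smith.pinnedVersion ?? agent.publishedVersion;
}

const concierge: Agent = { name: "support-concierge", publishedVersion: 7 };
const fleet: Smith[] = [
  { user: "priya" },
  { user: "marco", pinnedVersion: 6 }, // pinned override
  { user: "alice" },
];

// Publishing v8 moves every unpinned smith at once.
const next: Agent = { ...concierge, publishedVersion: 8 };
console.log(fleet.map((s) => `${s.user}: v${effectiveVersion(next, s)}`));
```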

Agent · the design

support-concierge

instructions · model · tools · memory

v7 · published

one each

Smiths · one per user

priya · live thread

marco · 2 channels

alice · 312 memories

The runtime

The state and reach agents need in production.

Memory, tools, models, and channels — managed for you, behind one API and one console.

Memory that persists per person

Every smith keeps its own three-tier memory — core facts, recall, and archival history — so conversations resume where they left off. One user's data can never surface in another's.

Tools & MCP, server-side

Connect any MCP server or reach for the built-ins. Smiths call tools on the server, and each one's OAuth connections are stored in isolation — never in your app.

Model-agnostic, BYOK

Bring your own provider keys and pick the model per agent. Move between OpenAI, Anthropic, and Gemini without touching a line of your code.

Every channel, out of the box

Slack, Telegram, WhatsApp, and email. An inbound message wakes the right smith; the reply goes back to the same conversation, on the same thread.

Developer experience

An API-first platform, not a UI.

Everything in the console runs on the public /v1 REST API, the same surface you build on. Drive agents from your backend, or drop in the OpenAI-compatible endpoint and keep the SDK you already use.

OpenAI-compatible /v1/chat/completions — keep your SDK
Infrastructure as Code with the Pulumi provider
Signed webhooks for every lifecycle event
Idempotent writes and a versioned, dated API
Per-project isolation with cryptographically scoped tokens
Meter and bill your own customers on top
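For the signed webhooks mentioned above, a receiver typically recomputes a signature over the raw request body and compares it in constant time. The scheme below (hex-encoded HMAC-SHA256 with a shared secret) is an assumption for illustration; check the docs for the actual header name and algorithm.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed scheme: hex-encoded HMAC-SHA256 of the raw request body.
function sign(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Compare in constant time. timingSafeEqual throws on unequal lengths,
// so the lengths are checked first.
function verify(secret: string, rawBody: string, signatureHex: string): boolean {
  const expected = Buffer.from(sign(secret, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const webhookSecret = "whsec_example"; // hypothetical secret
const body = JSON.stringify({ type: "run.completed", run: "run_7fa2" });
const signature = sign(webhookSecret, body);

console.log(verify(webhookSecret, body, signature));       // valid payload
console.log(verify(webhookSecret, body + " ", signature)); // tampered payload
```

Verify against the raw bytes you received, before any JSON parsing, since re-serializing can change whitespace and break the signature.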

app.ts

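// Drop-in: point the OpenAI-compatible provider at a smith and stream the reply.
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { streamText } from "ai";

// Assumption: the smith token is supplied through an environment variable.
const SMITH_TOKEN = process.env.SMITH_TOKEN ?? "";
const prompt = "Where's my refund for order #8821?";

const ingram = createOpenAICompatible({
  name: "ingram",
  baseURL: "https://api.cloud.ingram.tech/v1",
  apiKey: SMITH_TOKEN,
});

const { textStream } = streamText({
  model: ingram(""), // the smith's configured model
  prompt,
});

// Print the reply as it streams in.
for await (const chunk of textStream) {
  process.stdout.write(chunk);
}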

Put an agent in your product this week.

Create a project, mint a token, and stream your first reply in minutes — every run on the record from the first one. No infrastructure to stand up.
