Build a support agent

A worked example: a support assistant embedded on your website that serves both signed-in and logged-out visitors, with each visitor's conversation visible only to that visitor. It ties together agents, smiths, threads, and runs; read Modeling your users first for the why.

Architecture

browser widget  →  your backend  →  the API

The browser never holds an API token. Your server holds the tenant-admin token, decides who the visitor is, and proxies each message. That single rule is what keeps one visitor from ever addressing another's smith.
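A minimal Python sketch of that rule, under stated assumptions: the server-side session dict and the helper name build_upstream_request are hypothetical and not part of the API. The smith id and thread id come only from state your server controls, and the only field read from the browser's JSON is the message text. The request is built but not sent.

```python
import json
import os
import urllib.request

API_BASE = "https://api.cloud.ingram.tech/v1"
API_VERSION = "2026-05-01"


def build_upstream_request(session: dict, browser_body: dict) -> urllib.request.Request:
    """Build the run request for the visitor who owns this session.

    `session` is server-side state (hypothetical shape: smith_id, thread_id).
    `browser_body` is untrusted JSON from the widget; only "message" is read.
    """
    smith_id = session["smith_id"]  # never taken from the browser
    payload = {
        "input": [{"role": "user", "content": str(browser_body.get("message", ""))}],
        "thread_id": session["thread_id"],
        "stream": True,
    }
    return urllib.request.Request(
        f"{API_BASE}/smiths/{smith_id}/runs",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # The tenant-admin token lives in the server's environment only.
            "Authorization": f"Bearer {os.environ.get('IC_TOKEN', '')}",
            "IC-Api-Version": API_VERSION,
            "Content-Type": "application/json",
        },
    )
```

Even if a visitor tampers with the widget and posts someone else's smith id, the proxy never reads it, so the request still targets the smith recorded in that visitor's own session.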

1. Design the support agent

Define the behaviour once as an agent (or do this in the console under Build → Agents). Publishing the first version makes it live:

# Authorization: tenant-admin token (server-side only)
curl https://api.cloud.ingram.tech/v1/agents \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "slug": "support", "name": "Support bot",
        "instructions": "You are the support assistant for Acme. Be concise. If you do not know, say so and offer to open a ticket.",
        "enabled_hosted_tools": ["web_search"] }'
# → 201 { "id": "agt_…", "slug": "support", … }

curl https://api.cloud.ingram.tech/v1/agents/agt_…/versions \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "note": "first version" }'
# → 201 { "version": 1 }  (the first publish goes live automatically)

Every smith you point at this agent runs version 1 and tracks future rollouts.

2. Identify the visitor (server-side)

When the widget loads, your backend ensures a smith for whoever's there. Both cases are the same call — POST /v1/smiths upserts on external_id — they differ only in the key.

Signed-in — key to your own user id, and attach the agent:

# Authorization: tenant-admin token (server-side only)
curl https://api.cloud.ingram.tech/v1/smiths \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "external_id": "user_123", "display_name": "Ada Lovelace",
        "agent_id": "agt_…" }'

Logged-out — mint a random token, set it as a cookie, and key the smith to it under the anon: namespace. Turn memory off so a drive-by visitor doesn't accumulate anything:

# Authorization: tenant-admin token (server-side only)
curl https://api.cloud.ingram.tech/v1/smiths \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "external_id": "anon:9f2c7b1e…", "agent_id": "agt_…",
        "auto_memory": false }'

Keep the returned smt_… id in the visitor's session. Because the create upserts, calling it on every page load is safe — a returning visitor (same cookie) lands back on the same smith.
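The two cases can be sketched as one function that picks the key. This is an illustrative Python sketch, not the official client: the function name smith_payload, its arguments, and the 16-byte token length are assumptions. It only builds the JSON body for POST /v1/smiths and reports whether a new cookie needs to be set.

```python
import secrets
from typing import Optional, Tuple

ANON_PREFIX = "anon:"


def smith_payload(
    agent_id: str,
    user_id: Optional[str] = None,
    display_name: Optional[str] = None,
    anon_cookie: Optional[str] = None,
) -> Tuple[dict, Optional[str]]:
    """Return (body for POST /v1/smiths, cookie value to set or None)."""
    if user_id is not None:
        # Signed-in: key the smith to your own user id.
        body = {"external_id": user_id, "agent_id": agent_id}
        if display_name:
            body["display_name"] = display_name
        return body, None

    # Logged-out: reuse the existing cookie so the upsert lands on the same
    # smith; otherwise mint a fresh random token (length is an assumption).
    new_cookie = None
    token = anon_cookie
    if not token:
        token = secrets.token_hex(16)
        new_cookie = token
    body = {
        "external_id": f"{ANON_PREFIX}{token}",
        "agent_id": agent_id,
        "auto_memory": False,  # drive-by visitors accumulate nothing
    }
    return body, new_cookie
```

A returning logged-out visitor passes their cookie back in, gets the same external_id, and no new cookie is issued.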

3. Run a turn, one thread per conversation

Each chat session is a thread. Pick a thread_id when the conversation starts and reuse it for every turn, so the assistant keeps context; start a new one for a new conversation. Your backend forwards the visitor's message and streams the reply back to the widget:

# Authorization: tenant-admin token (server-side only)
curl -N https://api.cloud.ingram.tech/v1/smiths/smt_…/runs \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "input": [{ "role": "user", "content": "How do I reset my password?" }],
        "thread_id": "chat_7f3a", "stream": true }'

# event: run.started     data: {"v":1,"run_id":"run_…","smith_id":"smt_…",…}
# event: message.delta   data: {"v":1,"run_id":"run_…","delta":"Head"}
# event: message.delta   data: {"v":1,"run_id":"run_…","delta":" to"}
# …
# event: run.completed   data: {"v":1,"run_id":"run_…","stop_reason":"end_turn"}

Drop "stream": true for a single JSON response instead. The full event envelope (tool calls, approvals) is on Runs & streaming.
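If you do hand-roll the stream, the proxy needs to split it into events before relaying deltas to the widget. The Python sketch below assumes standard server-sent-events framing (separate event: and data: lines, with a blank line ending each event); the listing above shows each event on one line for brevity, so check the actual wire format on Runs & streaming. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, parsed_data) pairs from SSE text lines."""
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # SSE strips one leading space
            data_lines.append(value)
    if data_lines:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data_lines))


def collect_reply(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Concatenate message.delta chunks until run.completed arrives."""
    parts = []
    stop_reason = None
    for event, data in iter_sse_events(lines):
        if event == "message.delta":
            parts.append(data["delta"])
        elif event == "run.completed":
            stop_reason = data.get("stop_reason")
            break
    return "".join(parts), stop_reason
```

In a real proxy you would forward each delta to the widget as it arrives rather than collecting the whole reply; collect_reply is only there to show how the event types fit together.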

Building the widget with the Vercel AI SDK rather than hand-rolling the stream? Point your proxy route at the AI SDK adapter: your backend still holds the token and picks the smith, but streamText and useChat replace the bespoke SSE plumbing.

Accepting screenshots and files

Support turns often carry a screenshot or a PDF. Inline the bytes as a content part (an image for a screenshot, a file for a document) and the smith's model reads it directly, provided that model is vision- or document-capable:

# Authorization: tenant-admin token (server-side only)
curl https://api.cloud.ingram.tech/v1/smiths/smt_…/runs \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "thread_id": "chat_7f3a", "input": [{ "role": "user", "content": [
        { "type": "text", "text": "This error keeps popping up — see the screenshot." },
        { "type": "image", "image": "data:image/png;base64,iVBORw0KGgo…" },
        { "type": "file", "filename": "invoice.pdf",
          "data": "data:application/pdf;base64,JVBERi0…" }
      ] }] }'

The bytes are stored for auditability: the run's input keeps a file_id reference instead of the payload, and you (or the console) download the original at GET /v1/files/{file_id}/content. Sending the OpenAI-style image_url / file shape to /v1/chat/completions works the same way — see openai-compat.
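Building those content parts on the backend mostly comes down to base64-encoding the upload into a data URL. A small Python sketch follows; the helper names and the default filename are hypothetical, and the part shapes simply mirror the request above.

```python
import base64
from typing import Optional


def data_url(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def message_with_attachments(
    text: str,
    screenshot_png: Optional[bytes] = None,
    pdf: Optional[bytes] = None,
    pdf_name: str = "document.pdf",  # hypothetical default
) -> dict:
    """Build one user message whose content mixes text, image, and file parts."""
    content = [{"type": "text", "text": text}]
    if screenshot_png is not None:
        content.append({"type": "image", "image": data_url(screenshot_png, "image/png")})
    if pdf is not None:
        content.append({
            "type": "file",
            "filename": pdf_name,
            "data": data_url(pdf, "application/pdf"),
        })
    return {"role": "user", "content": content}
```

The returned dict goes into the input array of the run request, alongside the same thread_id you use for the rest of the conversation.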

4. Claim the conversation on login

If a logged-out visitor signs in mid-chat, keep their smith — its threads carry over — and stamp the real identity onto it (external_id is immutable, so you don't rename it):

# Authorization: tenant-admin token (server-side only)
curl -X PATCH https://api.cloud.ingram.tech/v1/smiths/smt_… \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01" \
  -H "Content-Type: application/json" \
  -d '{ "display_name": "Ada Lovelace", "customer_id": "cus_…",
        "auto_memory": true }'

Record user_123 → smt_… on your side so their next login resolves the same smith. Keep using the same thread_id and the conversation continues uninterrupted.
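The bookkeeping on your side might look like the Python sketch below. It is an assumption-laden illustration: user_to_smith stands in for a column in your own users table, claim_on_login is a hypothetical name, and preferring an already-recorded smith over a fresh anonymous one is a design choice the API does not dictate. The function returns the smith to use plus the PATCH body to send, if any.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for your own users table: user id -> smith id.
user_to_smith: Dict[str, str] = {}


def claim_on_login(
    user_id: str,
    display_name: str,
    session_smith_id: Optional[str],
) -> Tuple[Optional[str], Optional[dict]]:
    """Decide which smith a freshly signed-in user should use.

    Returns (smith_id, patch_body). patch_body is None when nothing needs
    stamping; smith_id is None when the caller should fall back to the
    signed-in upsert from step 2.
    """
    known = user_to_smith.get(user_id)
    if known is not None:
        # Returning user: resolve the smith recorded at an earlier login.
        return known, None
    if session_smith_id is not None:
        # First login mid-chat: keep the anonymous smith and stamp identity.
        user_to_smith[user_id] = session_smith_id
        patch = {"display_name": display_name, "auto_memory": True}
        return session_smith_id, patch
    return None, None
```

Because external_id is immutable, the claimed smith keeps its anon: key; the mapping you store is what ties it to the user from then on.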

5. Operate

Everything is segmentable by the anon: namespace you chose:

# Authorization: tenant-admin token (server-side only)
# Every logged-out visitor:
curl "https://api.cloud.ingram.tech/v1/smiths?external_id_prefix=anon:" \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01"

# Every anonymous conversation (the segment's run history):
curl "https://api.cloud.ingram.tech/v1/runs?external_id_prefix=anon:" \
  -H "Authorization: Bearer $IC_TOKEN" \
  -H "IC-Api-Version: 2026-05-01"

In the console, the Smiths page filter box takes the same prefix. To clean up a visitor, DELETE /v1/smiths/{id} soft-archives the smith and frees its external_id.
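When building these list URLs in backend code rather than by hand, it is safer to let a URL library encode the prefix. A short Python sketch, where segment_url is a hypothetical helper and only the two list endpoints shown above are accepted:

```python
from urllib.parse import urlencode

API_BASE = "https://api.cloud.ingram.tech/v1"


def segment_url(resource: str, prefix: str = "anon:") -> str:
    """Build the list URL for one external_id namespace."""
    if resource not in ("smiths", "runs"):
        raise ValueError(f"unsupported resource: {resource!r}")
    query = urlencode({"external_id_prefix": prefix})
    return f"{API_BASE}/{resource}?{query}"
```

urlencode percent-encodes the colon as %3A, which is equivalent to the literal colon in the curl examples once the server decodes the query string.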

Pitfalls

  • One smith per visitor, not one shared "anonymous" smith. A shared smith with a thread per visitor has no access boundary between those visitors — see Modeling your users.
  • Don't ship a token to the browser. Proxy through your backend, which holds the tenant-admin token and picks the smith.
  • Memory is a choice. Leave it off for stateless support; turn it on (and persist the cookie) only when returning-visitor recall is worth it.