Memory
Each smith has one memory system, owned by Ingram Cloud. The smith reads and writes it automatically during runs. You rarely manage it, but all of it is inspectable and editable, via API and console.
Memory is per smith. One person's facts can never surface in another person's conversation; that isolation is why you create one smith per end-user.
The three tiers
| Tier | What it is | When it's in context |
|---|---|---|
| Core blocks | small named texts: persona, profile, notes | always: pinned into every run |
| Archival | durable, searchable facts ("prefers morning meetings") | when the smith searches for them |
| Recall | the thread history | the recent turns of the active thread |
A smith maintains its core blocks and archival facts through memory tools it
carries automatically (when auto_memory is on). Recall comes from passing a
stable thread_id; see Runs & streaming.
Core blocks
# Authorization: tenant-admin token (server-side only), or a smith token with memories:*
curl https://api.cloud.ingram.tech/v1/smiths/smt_…/blocks
# → { "data": [ { "label": "persona", "value": "…", "char_limit": …, … } ] }
curl -X PATCH https://api.cloud.ingram.tech/v1/smiths/smt_…/blocks/notes \
-H "Authorization: Bearer $IC_TOKEN" \
-H "IC-Api-Version: 2026-05-01" \
-H "Content-Type: application/json" \
-d '{ "value": "Prefers terse answers.", "mode": "append" }' # or "replace"
Blocks are deliberately small (they ride in every prompt); the char limit is enforced.
Archival memories
# Authorization: tenant-admin token (server-side only), or a smith token with memories:*
curl https://api.cloud.ingram.tech/v1/smiths/smt_…/memories # list
curl -X POST …/v1/smiths/smt_…/memories \
-d '{ "content": "Works at Acme as a data engineer.",
"category": "profile", "subject": "occupation" }' # add
curl -X POST …/v1/smiths/smt_…/memories/search -d '{ "query": "where works" }'
curl -X PATCH …/v1/smiths/smt_…/memories/{mid} -d '{ "content": "…" }'
curl -X DELETE …/v1/smiths/smt_…/memories/{mid}
category and subject are your strings. Use them to namespace facts your
backend writes versus facts the smith writes (source records who wrote it).
Sleep-time consolidation
In the background, Ingram Cloud periodically distills recent conversation into better core blocks for smiths that have been active, so the standing picture of a person improves between sessions, not just during them.
Turning memory off
auto_memory: false (on the smith or its agent) disables the
whole system for that smith: no blocks pinned into context, no memory tools,
nothing remembered. Use it for stateless utility smiths, e.g. the dedicated
smith you run structured-output calls through.
In the console
A smith's Memory tab shows the core blocks (editable in place) and the archival table (category, subject, content, source). Edits take effect on the next run.