litentry · hanwencheng · May 28, 2026 · May 28, 2026
diff --git a/crates/agentkeys-mcp-server/README.md b/crates/agentkeys-mcp-server/README.md
@@ -151,7 +151,7 @@ imports directly — successfully drives this server through the full
 
 ## Three-act demo storyboard
 
-Per [`docs/research/agent-iam-strategy.md`](../../docs/research/agent-iam-strategy.md) §4.3:
+Per [`docs/agent-iam-strategy.md`](../../docs/agent-iam-strategy.md) §4.3:
 
 1. **Permissioned Memory** — `memory.get(actor=O_kevin_001, namespace="travel")`
    returns Chengdu trip context only; other namespaces (`family`, `profile`)

diff --git a/crates/agentkeys-mcp-server/tests/three_acts.rs b/crates/agentkeys-mcp-server/tests/three_acts.rs
@@ -1,6 +1,6 @@
 //! Three-act demo storyboard exercised end-to-end against the MockBackend.
 //!
-//! Reference: `docs/research/agent-iam-strategy.md` §4.3.
+//! Reference: `docs/agent-iam-strategy.md` §4.3.
 //!   Act 1 — Permissioned Memory (namespace-scoped read returns travel,
 //!           refuses cross-namespace)
 //!   Act 2 — Deterministic Denial (payment over daily cap)

diff --git a/docs/research/agent-iam-strategy.md b/docs/agent-iam-strategy.md
diff --git a/docs/arch.md b/docs/arch.md
@@ -1969,7 +1969,7 @@ When a user buys a vendor AI device (xiaozhi MagicLick, Doubao smart speaker, fu
    enforces that this vendor device can only read its own actor's S3 prefix.
 ```
 
-This is the missing piece between [iam.md §4.3](research/agent-iam-strategy.md) (three-act demo) and a real consumer product. Until M2 ships this, the demo runs with hardcoded vendor tokens + seeded in-memory fixtures (per #107's stage-1 simplifications).
+This is the missing piece between [iam.md §4.3](agent-iam-strategy.md) (three-act demo) and a real consumer product. Until M2 ships this, the demo runs with hardcoded vendor tokens + seeded in-memory fixtures (per #107's stage-1 simplifications).
 
 ### 22c.5 What the daemon does NOT become
 
@@ -1983,7 +1983,7 @@ Likewise the web UI is **not a trust plane** — per the #107 PR thread, the tru
 - [§11](#11-recovery--m-of-n-device-quorum-no-anchor-wallet-no-seed-phrase) — recovery quorum when master devices are lost
 - [§12](#12-sidecar-daemon) — daemon as the trust core under all three surfaces
 - [§22](#22-pluggable-surfaces) — MCP backend variant is one of the pluggable axes
-- [research/agent-iam-strategy.md §4.4](research/agent-iam-strategy.md) — Phase 1 vendor onboarding portal deliverable
+- [docs/agent-iam-strategy.md §4.4](agent-iam-strategy.md) — Phase 1 vendor onboarding portal deliverable
 - [#110](https://github.com/litentry/agentKeys/issues/110) — phased web UI roadmap (M1 parent dashboard → M2 onboarding → M3 multi-master → M4 backend wiring)
 - [#133](https://github.com/litentry/agentKeys/issues/133) — Phase 3 hooks that fire when tools are invoked through any of these surfaces
 - [#134](https://github.com/litentry/agentKeys/issues/134) — M6 distribution: GH Releases + Homebrew + `curl | sh` installer
@@ -1996,6 +1996,66 @@ Likewise the web UI is **not a trust plane** — per the #107 PR thread, the tru
 
 ---
 
+## 22d. IAM-guarantee delivery — hooks-first, proxy-fallback
+
+Recorded 2026-05-28. AgentKeys exposes IAM **tools** through the MCP server (§22c.1, §22c.2). Turning a tool into an IAM **guarantee** — a check the LLM cannot bypass — is a separate concern delivered through two enforcement seams, with explicit priority.
+
+### 22d.1 IAM tool vs IAM guarantee — the architecture-level distinction
+
+| | Definition | Whether the check runs is decided by... | Failure mode |
+|---|---|---|---|
+| **IAM tool** | Function in the LLM's tool registry, surfaced via MCP per §22c | The LLM (prompt + context + sampling) | LLM skips / is jailbroken → unauthorized action proceeds |
+| **IAM guarantee** | Non-LLM gate in the execution path, runtime-invoked deterministically | The runtime — hook system, proxy, or OS capability | Gate fails closed; action cannot proceed without an allow verdict |
+
+The seven Phase-1 MCP tools enumerated in [docs/agent-iam-strategy.md §4.2](agent-iam-strategy.md) (`identity.whoami`, `memory.get`, `memory.put`, `permission.check`, `cap.mint`, `cap.revoke`, `audit.append`) are **tools** by themselves. They become **guarantees** only when wrapped by one of the two enforcement seams below. This is the architecture-level framing for the §3.1 bounded-revocation commitment in the strategy doc: *high-risk = always-online permission check + fresh cap-token mint per call* is deliverable only when there's a non-LLM gate, which is what §22d.2 (hooks) and §22d.3 (proxy) provide.
+
+### 22d.2 Primary — hook reference configs ([#133](https://github.com/litentry/agentKeys/issues/133))
+
+Task Host runtimes (Claude Code, Codex, Hermes, OpenClaw) fire lifecycle hooks (`PreToolUse`, `PostToolUse`, `Stop`, `SessionEnd`) that synchronously invoke AgentKeys MCP tool calls. The runtime guarantees the hook fires; the LLM cannot bypass it.
+
+| Tier-1 host | Hooks? | Notable | Verified |
+|---|---|---|---|
+| Claude Code | ✅ richest (~24 events) | `~/.claude/settings.json` config; reference shape (`{decision, reason, hookSpecificOutput, …}`) | 2026-05-28 via [code.claude.com/docs/en/hooks](https://code.claude.com/docs/en/hooks) |
+| Codex (OpenAI) | ✅ 10 events | `~/.codex/hooks.json` or `~/.codex/config.toml`; thin `decision ↔ continue` shim needed | 2026-05-28 via [developers.openai.com/codex/hooks](https://developers.openai.com/codex/hooks) |
+| Hermes (NousResearch) | ✅ `pre_tool_call`, `post_tool_call`, `pre_llm_call`, `on_session_end`, `subagent_stop` | `~/.hermes/config.yaml`; **explicitly Claude-Code-compatible JSON shape** ("either shape accepted; normalised internally" per `agent/shell_hooks.py`) | 2026-05-28 via live `hermes hooks --help` + source |
+| OpenClaw | ⚠️ likely (inferred from Hermes lineage — `hermes claw` is the migration tool) | Probably mirrors Hermes config | inferred — verify with live install |
+
+Tier-1 coverage means one reference script bundle ports across four hosts with thin shims. Issue #133 is the canonical track: ship reference hook configs + `agentkeys hook check` CLI helper + cap-mint pre-warming for sub-50ms p99 latency.
+
+**Operator-facing delivery: `agentkeys wire <runtime>`** (per [`docs/agent-iam-strategy.md`](agent-iam-strategy.md) §3.7). The reference hook configs from #133 are not hand-installed by users. AgentKeys ships a single CLI command — `agentkeys wire hermes`, `agentkeys wire claude-code`, etc. — that idempotently writes the hook scripts (under `~/.<runtime>/agent-hooks/`), appends the `hooks:` block to the runtime's config, pre-approves first-use consent, fetches the LLM API key from the credential broker, and verifies via the runtime's own `hooks doctor` equivalent. Output follows the CLAUDE.md `ok proceeding / skip <reason> / fail <reason>` convention; re-runs are no-ops modulo drift. Per-runtime adapter trait lives in `crates/agentkeys-cli/src/wire/adapters/`. Full plan: [`docs/spec/plans/phase-1-fresh-user-wire-onboarding.md`](spec/plans/phase-1-fresh-user-wire-onboarding.md).
+
+### 22d.3 Fallback — OpenAI-compatible proxy (Phase 3b, lower priority)
+
+Tier-2 hosts have no lifecycle hook surface — xiaozhi-server (verified 2026-05-28: only plugin/MCP tool registration, no hooks), vendor mobile chatbots, plain `openai.ChatCompletion` scripts. For these, AgentKeys offers an OpenAI-compatible proxy that the host's LLM client points at via `OPENAI_BASE_URL`. The proxy inspects every prompt + `tool_calls` + completion, enforces policy, logs audit, then forwards upstream.
+
+Lower priority than hooks because:
+
+1. **Strategy §2.4 mission-creep risk** — proxy lives in the path of every byte; vendors will ask for retry/fallback/caching that edges toward Task Host territory.
+2. **Competitive crowding** — Vercel AI Gateway, Helicone, LangSmith, Portkey, OpenRouter, Cloudflare AI Gateway all occupy this space. We want this only when our authority position is established and we own the IAM-shaped differentiation.
+3. **Tier-1 hosts already cover the strategically-important runtimes** — one investment, four runtimes; broader reach for less per-host cost.
+
+Sequenced as **Phase 3b**, after #133 ships and at least one vendor pilot is on hooks.
+
+### 22d.4 Why this matters at the architecture level
+
+Per the [strategy §2.4](agent-iam-strategy.md) zero-orchestration hard line, AgentKeys must not become a Task Host. Both enforcement seams respect that boundary, but differently:
+
+- **Hooks** sit *inside* the Task Host (Claude Code, Codex, Hermes, OpenClaw). The Task Host owns lifecycle; AgentKeys provides the policy-check tool body. No request-path orchestration on AgentKeys' side.
+- **Proxy** sits between the LLM client and the LLM provider. AgentKeys touches every byte of the LLM conversation. The §2.4 risk is real — discipline is "don't add retry, don't add fallback, don't add caching; stay a policy + audit + memory-injection layer."
+
+The hooks-first ordering is therefore both about coverage (Tier-1 covers the high-value runtimes) AND about preserving the Authority Host posture (hooks have lower §2.4 risk than proxy).
+
+### 22d.5 Cross-references
+
+- [docs/agent-iam-strategy.md §3.6](agent-iam-strategy.md) — strategic-anchor record of the same decision
+- [docs/wiki/agent-iam-guarantee-glossary.md](wiki/agent-iam-guarantee-glossary.md) — standalone glossary with the hooks-vs-proxy trade-off table + verified hook-availability table across six runtimes. (The previous operator-facing summary at `docs/demo-aiosandbox-runbook.md` §6 was archived 2026-05-28; a new operator runbook lands with the `agentkeys wire` follow-up.)
+- [§22c.2](#22c2-backend-wiring--four-agent-runtimes) — four backend variants of the MCP server (Tools layer; the layer §22d wraps with guarantees)
+- [#133](https://github.com/litentry/agentKeys/issues/133) — Phase 3 LLM-host hook integration (the hook-track deliverable)
+- [#107](https://github.com/litentry/agentKeys/issues/107) — Phase 1 MCP server (the tool-layer deliverable; the substrate hooks call into)
+
+---
+
 ## 23. Cargo workspace
 
 ```

diff --git a/docs/archived/README.md b/docs/archived/README.md
@@ -12,6 +12,10 @@ Superseded by the current top-level docs:
 | `operator-runbook-pre-stage7.md` (was `../operator-runbook.md`) | [`../operator-runbook-stage7.md`](../operator-runbook-stage7.md) — Stage-7+ broker (post-issue-#71 OIDC-only mints, post-issue-#74-step-1 dev_key_service signer) |
 | `contradictions-stage4-2026-04.md` (was `../contradictions.md`) | Audit snapshot taken 2026-04-14 against Stage-4-implementation-complete + 17 open issues. The decisions it captured have either landed or been re-scoped; no live successor — Stage 7+ design discussions live under [`../spec/plans/issue-64/`](../spec/plans/issue-64/) and [`../spec/plans/issue-74-dev-key-service-plan.md`](../spec/plans/issue-74-dev-key-service-plan.md) |
 | `field-name-translation.md` (was `../field-name-translation.md`) | Stage-4-keychain-output design note. Subsumed by the Stage-7 daemon's session/wallet representation; kept for the historical "why we sed-pretty-printed `security(1)`" reasoning |
+| `demo-aiosandbox-runbook-rust-runtime-2026-05.md` (was `../demo-aiosandbox-runbook.md`) | The issue #103 Rust-runtime approach (custom `agentkeys-hermes-runtime` crate + daemon `--demo-memory` flag + extended sandbox image). Architecture content from §6 is preserved in [`../arch.md`](../arch.md) §22d + [`../wiki/agent-iam-guarantee-glossary.md`](../wiki/agent-iam-guarantee-glossary.md). Operator runbook for the replacement (real Hermes + `agentkeys wire`) lands with [`../spec/plans/phase-1-fresh-user-wire-onboarding.md`](../spec/plans/phase-1-fresh-user-wire-onboarding.md) |
+| `verify-issue-103-rust-runtime-2026-05.md` (was `../verify-issue-103.md`) | Verification script for the same Rust-runtime approach. Replaced by per-step verification inside the new wire-flow runbook (TBD) |
+| `setup-demo-aiosandbox-rust-runtime-2026-05.sh` (was `../../scripts/setup-demo-aiosandbox.sh`) | Idempotent provisioner for the Rust-runtime sandbox image + S3 bucket. Replaced by `agentkeys wire hermes` per [`../spec/plans/phase-1-fresh-user-wire-onboarding.md`](../spec/plans/phase-1-fresh-user-wire-onboarding.md) |
+| `aiosandbox-demo-rust-runtime-2026-05/` (was `../../docker/aiosandbox-demo/`) | Dockerfile + supervisord configs + nginx fragment for the Rust-runtime sandbox image. The new path runs stock `ghcr.io/agent-infra/sandbox:latest` + installs real Hermes inside, wired via `agentkeys wire hermes` |
 
 ## Archive policy
 

diff --git a/docs/archived/aiosandbox-demo-rust-runtime-2026-05/Dockerfile b/docs/archived/aiosandbox-demo-rust-runtime-2026-05/Dockerfile
@@ -0,0 +1,85 @@
+# Issue #103 — extended agent-infra/sandbox image for the AgentKeys
+# hardware-companion demo.
+#
+# Layers two supervisord programs on top of the stock sandbox:
+#   - agentkeys-daemon   (port 8089) — serves the demo memory blob
+#   - agentkeys-hermes-runtime (port 8090) — agent runtime
+#
+# The stock gem-server + nginx programs from the base image remain
+# in place; nginx fronts hermes-runtime on port 8080 for the public
+# /v1/chat path used by the ESP32 (see ./nginx-hermes.conf, applied
+# in step 5 of the deploy script scripts/setup-demo-aiosandbox.sh).
+#
+# Build context: workspace root (so the binaries-builder stage can
+# `cargo build` against the full workspace).
+#
+# Build:
+#   docker build -f docker/aiosandbox-demo/Dockerfile -t agentkeys/aiosandbox-demo:latest .
+#
+# IMPORTANT: pass `--security-opt seccomp=unconfined` to every `docker run`
+# invocation. The agent-infra/sandbox base image uses kernel syscalls that
+# Docker Desktop's default seccomp profile blocks on macOS/Linux, and the
+# container exits silently without it. The flag is safe for local demo use;
+# production deployments behind a hardened runtime (firecracker / gVisor /
+# k8s seccomp profile) can drop it.
+#
+# Run (local quickstart with bundled fixture, no S3):
+#   docker run --security-opt seccomp=unconfined --rm -p 8080:8080 -p 8090:8090 \
+#     -e AGENTKEYS_LLM_PROVIDER=stub \
+#     agentkeys/aiosandbox-demo:latest
+#
+# Run (with DashScope Qwen-Plus + S3-backed memory):
+#   docker run --security-opt seccomp=unconfined --rm -p 8080:8080 \
+#     -e AGENTKEYS_LLM_PROVIDER=dashscope \
+#     -e AGENTKEYS_LLM_API_KEY=sk-... \
+#     -e AGENTKEYS_DEMO_MEMORY_BUCKET=agentkeys-demo-memory \
+#     -e AGENTKEYS_DEMO_MEMORY_REGION=us-east-1 \
+#     -e AGENTKEYS_DEMO_ACTOR_TOKEN=demo_token_O_demo_001_changeme \
+#     agentkeys/aiosandbox-demo:latest
+
+# ─── stage 1: build the agentkeys Rust binaries ────────────────────────
+FROM rust:1.83-slim-bookworm AS builder
+
+RUN apt-get update \
+ && apt-get install -y --no-install-recommends \
+      pkg-config libssl-dev ca-certificates \
+ && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /src
+COPY Cargo.toml Cargo.lock ./
+COPY crates ./crates
+COPY tests ./tests
+
+RUN cargo build --release \
+      -p agentkeys-daemon \
+      -p agentkeys-hermes-runtime
+
+# ─── stage 2: layer onto the stock sandbox ─────────────────────────────
+FROM ghcr.io/agent-infra/sandbox:latest
+
+COPY --from=builder /src/target/release/agentkeys-daemon \
+                    /usr/local/bin/agentkeys-daemon
+COPY --from=builder /src/target/release/agentkeys-hermes-runtime \
+                    /usr/local/bin/agentkeys-hermes-runtime
+
+COPY docker/aiosandbox-demo/supervisord.d/agentkeys-daemon.conf \
+     /opt/gem/supervisord.d/agentkeys-daemon.conf
+COPY docker/aiosandbox-demo/supervisord.d/hermes-runtime.conf \
+     /opt/gem/supervisord.d/hermes-runtime.conf
+
+# Pre-create writable state dir owned by the stock gem user.
+RUN mkdir -p /home/gem/.agentkeys \
+ && chown gem:gem /home/gem/.agentkeys
+
+# Stock sandbox ships nginx fronting gem-server on 8080; we extend it
+# to forward /v1/chat to the hermes-runtime on 8090. The nginx vhost
+# is installed by scripts/setup-demo-aiosandbox.sh (post-deploy edit)
+# rather than baked in here so operators can swap the cert paths and
+# upstream port without rebuilding the image.
+
+# Ports:
+#   8080 — public nginx front-end (used by ESP32)
+#   8089 — agentkeys-daemon demo memory (loopback-only in prod, exposed
+#          here so docker-compose-style debugging is easy)
+#   8090 — agentkeys-hermes-runtime (also loopback in prod, behind nginx)
+EXPOSE 8080 8089 8090
diff --git a/docs/archived/aiosandbox-demo-rust-runtime-2026-05/nginx-hermes.conf b/docs/archived/aiosandbox-demo-rust-runtime-2026-05/nginx-hermes.conf
@@ -0,0 +1,27 @@
+# Nginx vhost fragment installed by scripts/setup-demo-aiosandbox.sh.
+#
+# Forwards the public /v1/chat path on port 8080 to hermes-runtime on
+# port 8090. Stock sandbox nginx already serves /healthz, /v1/api/* etc.
+# from gem-server on 8088 — we keep that intact and add /v1/chat as a
+# new location block.
+#
+# This fragment is applied to /etc/nginx/conf.d/ on the demo host (NOT
+# inside the container); the script lays down the file, runs
+# `nginx -t`, and reloads.
+
+location = /v1/chat {
+    # Body-size cap so a runaway client can't blow up memory.
+    client_max_body_size 16k;
+    client_body_timeout 10s;
+    proxy_pass http://127.0.0.1:8090/v1/chat;
+    proxy_set_header Host $host;
+    proxy_set_header X-Real-IP $remote_addr;
+    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+    proxy_set_header X-Forwarded-Proto $scheme;
+    proxy_read_timeout 60s;
+    proxy_send_timeout 60s;
+}
+
+location = /v1/chat/healthz {
+    proxy_pass http://127.0.0.1:8090/healthz;
+}
diff --git a/docs/archived/aiosandbox-demo-rust-runtime-2026-05/supervisord.d/agentkeys-daemon.conf b/docs/archived/aiosandbox-demo-rust-runtime-2026-05/supervisord.d/agentkeys-daemon.conf
@@ -0,0 +1,19 @@
+[program:agentkeys-daemon]
+command=/usr/local/bin/agentkeys-daemon --demo-memory --demo-memory-port 8089
+user=gem
+autostart=true
+autorestart=true
+startsecs=2
+startretries=10
+stdout_logfile=/var/log/agentkeys-daemon.log
+stdout_logfile_maxbytes=10MB
+stdout_logfile_backups=3
+stderr_logfile=/var/log/agentkeys-daemon.err.log
+stderr_logfile_maxbytes=10MB
+stderr_logfile_backups=3
+environment=
+    RUST_LOG="info",
+    AGENTKEYS_DEMO_ACTOR_OMNI="%(ENV_AGENTKEYS_DEMO_ACTOR_OMNI)s",
+    AGENTKEYS_DEMO_MEMORY_BUCKET="%(ENV_AGENTKEYS_DEMO_MEMORY_BUCKET)s",
+    AGENTKEYS_DEMO_MEMORY_REGION="%(ENV_AGENTKEYS_DEMO_MEMORY_REGION)s",
+    AGENTKEYS_DEMO_MEMORY_FIXTURE="%(ENV_AGENTKEYS_DEMO_MEMORY_FIXTURE)s"
diff --git a/docs/archived/aiosandbox-demo-rust-runtime-2026-05/supervisord.d/hermes-runtime.conf b/docs/archived/aiosandbox-demo-rust-runtime-2026-05/supervisord.d/hermes-runtime.conf
@@ -0,0 +1,28 @@
+[program:hermes-runtime]
+command=/usr/local/bin/agentkeys-hermes-runtime
+user=gem
+autostart=true
+autorestart=true
+startsecs=2
+startretries=10
+stdout_logfile=/var/log/hermes-runtime.log
+stdout_logfile_maxbytes=10MB
+stdout_logfile_backups=3
+stderr_logfile=/var/log/hermes-runtime.err.log
+stderr_logfile_maxbytes=10MB
+stderr_logfile_backups=3
+# hermes-runtime depends on the daemon at boot time to fetch the
+# memory profile. supervisord doesn't have hard dependencies, so we
+# rely on the runtime's `refresh_memory_best_effort` path: it logs a
+# warning and continues without memory if the daemon isn't ready yet.
+# A periodic refresh job (TODO follow-up) would close that gap.
+environment=
+    RUST_LOG="info",
+    AGENTKEYS_HERMES_PORT="8090",
+    AGENTKEYS_DAEMON_URL="http://localhost:8089",
+    AGENTKEYS_ACTOR_OMNI="%(ENV_AGENTKEYS_DEMO_ACTOR_OMNI)s",
+    AGENTKEYS_DEMO_ACTOR_TOKEN="%(ENV_AGENTKEYS_DEMO_ACTOR_TOKEN)s",
+    AGENTKEYS_LLM_PROVIDER="%(ENV_AGENTKEYS_LLM_PROVIDER)s",
+    AGENTKEYS_LLM_MODEL="%(ENV_AGENTKEYS_LLM_MODEL)s",
+    AGENTKEYS_LLM_API_KEY="%(ENV_AGENTKEYS_LLM_API_KEY)s",
+    AGENTKEYS_LLM_BASE_URL="%(ENV_AGENTKEYS_LLM_BASE_URL)s"