diff --git a/CLAUDE.md b/CLAUDE.md index ac81a22..9cea16e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -7,6 +7,14 @@ See `docs/spec/plans/development-stages.md` for the 8-stage build plan. See `docs/spec/plans/execution-plan.md` for the orchestration runbook (ralph, team, ultraqa). Do not read folder `docs/archived` +## Architecture-as-source-of-truth policy +[`docs/spec/architecture.md`](docs/spec/architecture.md) is the **single source of truth** for component inventory, key inventory (K1–K11), trust boundaries, identity model (HDKD actor tree), and per-actor binding ceremonies. **After editing any architectural doc** (broker plans, signer-protocol, demo doc, runbooks, plan files in `docs/spec/plans/`, heima-gaps), re-open `architecture.md` and verify it still matches; if it diverges, update arch.md in the same change. If the per-doc detail outgrows arch.md, link from arch.md outward — never duplicate. The wiki page at [`.omc/wiki/agent-role-and-usage-hdkd-per-agent-omni.md`](.omc/wiki/agent-role-and-usage-hdkd-per-agent-omni.md) is a focused operator reference for the agent role; it defers to arch.md. + +### Terminology-source-of-truth rule +**Never invent a new name for a concept that arch.md already names.** When a doc, runbook, CLI output, or commit message needs to refer to a wallet / omni / key / endpoint that exists in arch.md, use the arch.md spelling verbatim. If a component currently emits a different label (e.g. `agentkeys whoami` prints `session_wallet:` while arch.md / the OIDC JWT call the same field `agentkeys_user_wallet` / `JWT.agentkeys.wallet_address`), either (a) align the component to the arch.md name OR (b) document the alias in arch.md's "Canonical names" section as an explicit synonym — never let the divergence silently persist. Drift is auditable only if it's explicit. 
+ +When you discover a name divergence while making any change, fix it in the same commit (or open a follow-up issue if the rename ripples beyond the current scope — but call out the divergence in the commit message either way). The cure for terminology drift is "one name, one concept, written down in arch.md's canonical-names section"; the disease is operators having to read three docs to figure out whether `master_wallet` / `session_wallet` / `agentkeys_user_wallet` are the same thing. + ## Version Control Use `jj` (Jujutsu) for all version control. Never use raw `git` commands. @@ -19,9 +27,66 @@ Before changing any file in response to a reported failure, **reproduce the fail ## Land-the-fix policy Once a local repro proves a fix is correct, **land it the same turn**: edit every affected file (search repo-wide — never assume one file), commit, push to `origin/evm`. Do not stop at "verified locally" or "fixed in one place" — the next operator running the docs will hit the same bug if the fix isn't on `origin/evm`. Pair this with the diagnosis-before-edit policy: diagnose once, fix everywhere, push immediately. +## Runbook-fix-fold-back policy +When the user is walking through a runbook (`docs/cloud-setup.md`, `docs/stage7-demo-and-verification.md`, `docs/operator-runbook-stage7.md`, etc.) and hits a step that fails, **two things must land in the same turn**: + +1. The targeted fix to whatever broke (script default, env var, doc command, code). +2. **A revision to the runbook itself** so the next operator running it top-to-bottom will not hit the same failure. The fix lives wherever the bug was; the runbook revision lives wherever the operator first encounters the broken step. + +Examples of revisions to land alongside the underlying fix: +- A failing prerequisite check → upgrade the prereq sanity-check step to catch the same case (not just fix the missing prereq once). 
+- A wrong env var on the wrong machine → call out the laptop-vs-broker-host scope explicitly in the runbook step that uses it. +- A silent skipped action that downstream commands rely on → add a verify-and-fail-loud sanity check in the runbook between the action and its dependent. +- A confusing diagnostic that took two rounds to resolve → fold the diagnosis steps inline into the runbook (one-shot lookup table, not 3 round-trips with the operator). + +The goal: every operator-encountered failure makes the runbook strictly more robust before we move on. Never leave the runbook in a state where the same operator (or the next one) will hit the same trap. + +## No-hardcoded-values policy +**Do not bake hardcoded values (paths, hostnames, addresses, account IDs, ports, magic numbers) into scripts, code, or runbooks.** Use one of: + +- env var with default + override (preferred for operator-facing config) +- CLI flag with default +- config file (env file, TOML, etc.) sourced at startup +- constant in a single source-of-truth file with a clear name + +If a hardcoded value is genuinely temporary — e.g. you're sketching a fix and don't yet know how to parameterize it — **log it in [`hardcoded.md`](hardcoded.md)** with: file path + line number, what's hardcoded, why it's hardcoded today, and the concrete change that would unblock making it dynamic. The doc is the audit trail; if a value is hardcoded but not in `hardcoded.md`, the next operator (or future-you) can't tell it was deliberate vs an oversight. + +Hardcoded values that go unrecorded compound: each new operator adds defaults baked into a different layer, the runbook drifts from reality, and the project becomes un-deployable to anyone but the original author. The audit log is the cure — it forces an explicit decision instead of an accumulating series of "I'll fix it later"s. + +## Plan-completion policy +When the user references a plan (e.g. 
`docs/spec/plans/issue-XX-*.md`), **complete every numbered step in the plan's implementation-order table — not a self-selected subset**. If you cannot complete a step (interactive flow needs human, scope explosion, prerequisites missing), say so up front before starting work and get explicit approval to defer. Never silently drop steps and ship a partial plan as "done." + +The end-of-PR summary is mandatory and has two sections in this exact order: + +1. **What landed** — bulleted list of every plan step you finished, with file paths. +2. **What did NOT land** — every plan step you skipped, with the reason and what unblocks it. If the section is empty, say so explicitly ("All plan steps shipped."). + +Do not bury skipped work in a footnote, in a note partway through prose, or in a doc that the user has to dig for. The summary is the authoritative answer to "is this PR plan-complete?" — make it answerable from a glance. + +Also: never gloss over a partial implementation in a demo doc or runbook. If the demo walks through a flow that is only half-shipped, the doc must state which half is shipped and which still requires manual setup or a follow-up PR. Operators reading the doc cannot tell which is which from prose alone. + ## Remote broker host (single entry point) All remote-host changes (binary upgrades, systemd edits, nginx/certbot, env tweaks, mock-server redeploys) MUST go through `bash scripts/setup-broker-host.sh` — it's idempotent and auto-detects bootstrap vs upgrade. No ad-hoc `systemctl` edits or hand-built `scp`. +## AWS local-profile ↔ remote-IAM mapping +Operator workstations use lowercase AWS profile names; the access key/secret inside each profile authenticates as the corresponding remote IAM user (case differences like `agentKeys-admin` on AWS vs `agentkeys-admin` locally are cosmetic — the key is the binding, not the name). 
Source-of-truth (`awsp` output): + +| Local profile (laptop) | Remote IAM principal (AWS) | Use for | +|------------------------|---------------------------|---------| +| `agentkeys-admin` | `user/agentKeys-admin` | Account-owner ops: SES verify, S3 bucket admin, IAM put-role-policy, EC2 describe-instances, OIDC provider mgmt | +| `agentkeys-broker` | `user/agentkey-broker` | Broker-runtime-equivalent perms (rarely used from laptop; the broker EC2 has its own instance profile) | +| `agentkeys-daemon` | `user/agentkey-daemon` | Daemon-side AssumeRoleWithWebIdentity-equivalent (rarely used from laptop) | + +Switch with `awsp <profile>`; verify with `aws sts get-caller-identity`. + +### Per-profile default region is NOT uniform — always pass `--region "$REGION"` explicitly +**Critical trap (real 2026-05-12 incident):** `agentkeys-admin` defaults to `us-west-2` while `agentkeys-broker` / `agentkeys-daemon` default to `us-east-1` (where the broker EC2 + SES + S3 actually live). A bare `aws ec2 describe-instances --filters "Name=ip-address,Values=$EIP"` under `agentkeys-admin` searches `us-west-2`, the EC2 isn't there, the JMESPath returns empty, and the CLI exits 0 with no stderr — silently corrupting the downstream `--role-name ""` or `--instance-profile-name ""` call. + +**Rule for all operator-facing docs, scripts, and copy-paste blocks:** every regional AWS API call (`aws ec2`, `aws ses`, `aws s3api`, `aws sts assume-role-*`, `aws logs`, etc.) MUST pass `--region "$REGION"` explicitly. `$REGION` comes from `scripts/operator-workstation.env` (us-east-1). Never rely on the profile's default region — they're not consistent across the three profiles. Global IAM calls (`aws iam`) are region-less and don't need the flag. + +### Caller-ARN matching in scripts must be case-insensitive +Lowercase the caller_arn before matching, since the remote IAM user is `agentKeys-admin` (capital K) but operator scripts canonicalize on `agentkeys-admin`. 
Use `tr '[:upper:]' '[:lower:]'` (portable to /bin/bash 3.2) — not `${var,,}` (bash 4+). + ## Development Workflow (Anthropic Harness Pattern) On every session start: diff --git a/Cargo.lock b/Cargo.lock index f56d425..b668410 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -24,10 +24,13 @@ dependencies = [ "async-trait", "aws-config", "aws-credential-types", + "aws-sdk-s3", + "aws-sdk-sesv2", "aws-sdk-sts", "axum", "base64", "clap", + "futures-util", "getrandom 0.2.17", "hex", "hmac 0.12.1", @@ -50,6 +53,7 @@ dependencies = [ "tracing", "tracing-subscriber", "url", + "uuid", ] [[package]] @@ -78,18 +82,25 @@ dependencies = [ name = "agentkeys-core" version = "0.1.0" dependencies = [ + "agentkeys-mock-server", "agentkeys-types", "anyhow", "async-trait", + "axum", "base64", "ciborium", + "getrandom 0.2.17", "hex", "hmac 0.12.1", + "k256", "keyring", + "rand_core", "reqwest", + "rusqlite", "serde", "serde_json", "sha2 0.10.9", + "sha3", "tempfile", "thiserror", "tokio", @@ -149,15 +160,23 @@ dependencies = [ "ciborium", "clap", "ed25519-dalek", + "getrandom 0.2.17", "hex", + "hkdf", "hmac 0.12.1", "http-body-util", + "jsonwebtoken", + "k256", + "p256 0.13.2", "rand", + "rand_core", "reqwest", "rusqlite", "serde", "serde_json", "sha2 0.10.9", + "sha3", + "thiserror", "tokio", "tower 0.4.13", "tower-http 0.5.2", @@ -215,6 +234,12 @@ dependencies = [ "memchr", ] +[[package]] +name = "allocator-api2" +version = "0.2.21" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "683d7910e743518b0e34f1186f92494becacb047c7b6bf616c96772180fef923" + [[package]] name = "anstream" version = "1.0.0" @@ -489,7 +514,7 @@ dependencies = [ "fastrand 2.4.1", "hex", "http 1.4.0", - "sha1", + "sha1 0.10.6", "time", "tokio", "tracing", @@ -540,6 +565,7 @@ dependencies = [ "aws-credential-types", "aws-sigv4", "aws-smithy-async", + "aws-smithy-eventstream", "aws-smithy-http", "aws-smithy-runtime", "aws-smithy-runtime-api", @@ -548,7 +574,9 @@ dependencies = [ "bytes", 
"bytes-utils", "fastrand 2.4.1", + "http 0.2.12", "http 1.4.0", + "http-body 0.4.6", "http-body 1.0.1", "percent-encoding", "pin-project-lite", @@ -556,6 +584,65 @@ dependencies = [ "uuid", ] +[[package]] +name = "aws-sdk-s3" +version = "1.132.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5575840a3a6b11f6011463ebe359320dfe5b67babb5e9b06fed6ddf809a9ab40" +dependencies = [ + "aws-credential-types", + "aws-runtime", + "aws-sigv4", + "aws-smithy-async", + "aws-smithy-checksums", + "aws-smithy-eventstream", + "aws-smithy-http", + "aws-smithy-json", + "aws-smithy-observability", + "aws-smithy-runtime", + "aws-smithy-runtime-api", + "aws-smithy-types", + "aws-smithy-xml", + "aws-types", + "bytes", + "fastrand 2.4.1", + "hex", + "hmac 0.13.0", + "http 0.2.12", + "http 1.4.0", + "http-body 1.0.1", + "lru", + "percent-encoding", + "regex-lite", + "sha2 0.11.0", + "tracing", + "url", +] + +[[package]] +name = "aws-sdk-sesv2" +version = "1.118.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8d0642857f4fe76cd9a3d8c4f2b393546f7561f7725052dd9f268005fda92b7" +dependencies = [ + "aws-credential-types", + "aws-runtime", + "aws-smithy-async", + "aws-smithy-http", + "aws-smithy-json", + "aws-smithy-observability", + "aws-smithy-runtime", + "aws-smithy-runtime-api", + "aws-smithy-types", + "aws-types", + "bytes", + "fastrand 2.4.1", + "http 0.2.12", + "http 1.4.0", + "regex-lite", + "tracing", +] + [[package]] name = "aws-sdk-sso" version = "1.98.0" @@ -636,6 +723,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "68dc0b907359b120170613b5c09ccc61304eac3998ff6274b97d93ee6490115a" dependencies = [ "aws-credential-types", + "aws-smithy-eventstream", "aws-smithy-http", "aws-smithy-runtime-api", "aws-smithy-types", @@ -667,12 +755,45 @@ dependencies = [ "tokio", ] +[[package]] +name = "aws-smithy-checksums" +version = "0.64.7" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "10efbbcec1e044b81600e2fc562a391951d291152d95b482d5b7e7132299d762" +dependencies = [ + "aws-smithy-http", + "aws-smithy-types", + "bytes", + "crc-fast", + "hex", + "http 1.4.0", + "http-body 1.0.1", + "http-body-util", + "md-5", + "pin-project-lite", + "sha1 0.11.0", + "sha2 0.11.0", + "tracing", +] + +[[package]] +name = "aws-smithy-eventstream" +version = "0.60.20" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "faf09d74e5e32f76b8762da505a3cd59303e367a664ca67295387baa8c1d7548" +dependencies = [ + "aws-smithy-types", + "bytes", + "crc32fast", +] + [[package]] name = "aws-smithy-http" version = "0.63.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ba1ab2dc1c2c3749ead27180d333c42f11be8b0e934058fb4b2258ee8dbe5231" dependencies = [ + "aws-smithy-eventstream", "aws-smithy-runtime-api", "aws-smithy-types", "bytes", @@ -1219,6 +1340,42 @@ dependencies = [ "libc", ] +[[package]] +name = "crc" +version = "3.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9710d3b3739c2e349eb44fe848ad0b7c8cb1e42bd87ee49371df2f7acaf3e675" +dependencies = [ + "crc-catalog", +] + +[[package]] +name = "crc-catalog" +version = "2.5.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "217698eaf96b4a3f0bc4f3662aaa55bdf913cd54d7204591faa790070c6d0853" + +[[package]] +name = "crc-fast" +version = "1.9.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2fd92aca2c6001b1bf5ba0ff84ee74ec8501b52bbef0cac80bf25a6c1d87a83d" +dependencies = [ + "crc", + "digest 0.10.7", + "rustversion", + "spin", +] + +[[package]] +name = "crc32fast" +version = "1.5.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9481c1c90cbf2ac953f07c8d4a58aa3945c425b7185c9154d67a65e4230da511" +dependencies = [ + "cfg-if", +] + [[package]] name = "crossbeam-utils" version = "0.8.21" @@ 
-1659,6 +1816,12 @@ version = "0.1.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d9c4f5dac5e15c24eb999c26181a6ca40b39fe946cbe4c263c7209467bc83af2" +[[package]] +name = "foldhash" +version = "0.2.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "77ce24cb58228fbb8aa041425bb1050850ac19177686ea6e0f41a70416f56fdb" + [[package]] name = "foreign-types" version = "0.3.2" @@ -1913,7 +2076,18 @@ version = "0.15.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "9229cfe53dfd69f0609a49f65461bd93001ea1ef889cd5529dd176593f5338a1" dependencies = [ - "foldhash", + "foldhash 0.1.5", +] + +[[package]] +name = "hashbrown" +version = "0.16.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100" +dependencies = [ + "allocator-api2", + "equivalent", + "foldhash 0.2.0", ] [[package]] @@ -2508,6 +2682,15 @@ version = "0.4.29" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5e5032e24019045c762d3c0f28f5b6b8bbf38563a65908389bf7978758920897" +[[package]] +name = "lru" +version = "0.16.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7f66e8d5d03f609abc3a39e6f08e4164ebf1447a732906d39eb9b99b7919ef39" +dependencies = [ + "hashbrown 0.16.1", +] + [[package]] name = "matchers" version = "0.2.0" @@ -2523,6 +2706,16 @@ version = "0.7.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0e7465ac9959cc2b1404e8e2367b43684a6d13790fe23056cc8c6c5a6b7bcb94" +[[package]] +name = "md-5" +version = "0.11.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "69b6441f590336821bb897fb28fc622898ccceb1d6cea3fde5ea86b090c4de98" +dependencies = [ + "cfg-if", + "digest 0.11.2", +] + [[package]] name = "memchr" version = "2.8.0" @@ -3545,6 +3738,17 @@ dependencies = [ "digest 0.10.7", ] +[[package]] +name = "sha1" 
+version = "0.11.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "aacc4cc499359472b4abe1bf11d0b12e688af9a805fa5e3016f9a386dc2d0214" +dependencies = [ + "cfg-if", + "cpufeatures 0.3.0", + "digest 0.11.2", +] + [[package]] name = "sha2" version = "0.10.9" @@ -3666,6 +3870,12 @@ dependencies = [ "windows-sys 0.61.2", ] +[[package]] +name = "spin" +version = "0.10.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d5fe4ccb98d9c292d56fec89a5e07da7fc4cf0dc11e156b41793132775d3e591" + [[package]] name = "spki" version = "0.6.0" @@ -4179,6 +4389,7 @@ version = "1.23.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ddd74a9687298c6858e9b88ec8935ec45d22e8fd5e6394fa1bd4e99a87789c76" dependencies = [ + "getrandom 0.4.2", "js-sys", "wasm-bindgen", ] @@ -4740,7 +4951,7 @@ dependencies = [ "rand", "serde", "serde_repr", - "sha1", + "sha1 0.10.6", "static_assertions", "tracing", "uds_windows", diff --git a/crates/agentkeys-broker-server/Cargo.toml b/crates/agentkeys-broker-server/Cargo.toml index 90815d2..3274fca 100644 --- a/crates/agentkeys-broker-server/Cargo.toml +++ b/crates/agentkeys-broker-server/Cargo.toml @@ -30,6 +30,11 @@ hex = "0.4" aws-config = { version = "1", features = ["behavior-version-latest"] } aws-credential-types = "1" aws-sdk-sts = "1" +# Real SES sender for email-link auth. Optional, gated behind +# auth-email-link — without the feature the broker has no SES sender at +# all (StubEmailSender remains for tests). Pulled in by Pass 1 of +# Option B per docs/spec/plans/issue-74 (see commit log). 
+aws-sdk-sesv2 = { version = "1", optional = true } jsonwebtoken = "9" p256 = { version = "0.13", features = ["pkcs8", "pem", "ecdsa"] } pkcs8 = { version = "0.10", features = ["pem"] } @@ -58,7 +63,7 @@ default = ["auth-wallet-sig", "wallet-keystore", "audit-sqlite"] # US-006 adds k256+sha3 to auth-wallet-sig; Phase A.1 adds lettre+aws-sdk-sesv2 # to auth-email-link; Phase A.2's OAuth2 reuses unconditional jsonwebtoken+reqwest. auth-wallet-sig = ["dep:k256", "dep:sha3"] -auth-email-link = [] +auth-email-link = ["dep:aws-sdk-sesv2"] auth-oauth2 = ["dep:hmac", "dep:url"] auth-oauth2-google = ["auth-oauth2"] auth-oauth2-github = ["auth-oauth2"] # v1+ @@ -76,8 +81,15 @@ audit-solana = [] # v1; deferred test-stub = [] # existing — stubs STS/SES/RPC for offline tests [dev-dependencies] -agentkeys-broker-server = { path = ".", features = ["test-stub"] } +agentkeys-broker-server = { path = ".", features = ["test-stub", "auth-email-link"] } agentkeys-mock-server = { path = "../agentkeys-mock-server" } tower = { version = "0.4", features = ["util"] } http-body-util = "0.1" tempfile = "3" +# Integration test only — receiver side of the SES → S3 round-trip in +# tests/ses_email_flow.rs. Not needed at runtime. +aws-sdk-s3 = "1" +uuid = { version = "1", features = ["v4"] } +# FutureExt::catch_unwind on async — used by tests/ses_email_flow.rs to +# guarantee cleanup runs in async context regardless of test panic. 
+futures-util = "0.3" diff --git a/crates/agentkeys-broker-server/src/boot.rs b/crates/agentkeys-broker-server/src/boot.rs index 24d3c06..ede4cb7 100644 --- a/crates/agentkeys-broker-server/src/boot.rs +++ b/crates/agentkeys-broker-server/src/boot.rs @@ -370,25 +370,14 @@ fn build_registry( } #[cfg(feature = "auth-email-link")] "email_link" => { - use crate::plugins::auth::{EmailLinkAuth, StubEmailSender}; + use crate::plugins::auth::{ + EmailLinkAuth, EmailSender, SesEmailSender, StubEmailSender, + }; use crate::storage::{EmailRateLimitStore, EmailTokenStore}; - // HMAC key - let hmac_path = std::env::var(env::BROKER_EMAIL_HMAC_KEY_PATH).map_err(|_| { - boot_fail( - env::BROKER_EMAIL_HMAC_KEY_PATH, - "(unset)", - "required when email_link is in BROKER_AUTH_METHODS", - "email-hmac-key", - ) - })?; - let hmac_key = std::fs::read(&hmac_path).map_err(|e| { - boot_fail( - env::BROKER_EMAIL_HMAC_KEY_PATH, - &hmac_path, - format!("read failed: {}", e), - "email-hmac-key", - ) - })?; + // No HMAC key — magic-link is stateful (CSPRNG token → + // SHA256(token) keyed by request_id in EmailTokenStore → + // single-use within TTL). See arch.md §5a.1.M Stage 1 + + // EmailLinkAuth::new doc comment for the design rationale. let from_address = std::env::var(env::BROKER_EMAIL_FROM_ADDRESS).map_err(|_| { boot_fail( @@ -447,24 +436,62 @@ fn build_registry( .map(std::path::PathBuf::from) .unwrap_or_else(|_| parent.clone()); let ses_cache_path = data_dir.join("ses-verify.json"); - // Stub email sender for Phase A.1; real SES wiring lands - // as a fast-follow per V0.1-FOLLOWUPS R2-F8. - let sender = Arc::new(StubEmailSender::new()); + // Email sender backend selector — `BROKER_EMAIL_SENDER` env var. 
+ // "stub" (default, in-process Vec — same as v0.1) + // "ses" (real aws-sdk-sesv2 SendEmail; requires verified FROM + // identity per scripts/ses-verify-sender.sh) + let sender_backend = std::env::var(env::BROKER_EMAIL_SENDER) + .unwrap_or_else(|_| "stub".to_string()); + let sender: Arc<dyn EmailSender> = match sender_backend.as_str() { + "stub" => { + tracing::info!("email_link sender backend: stub (in-process)"); + Arc::new(StubEmailSender::new()) + } + "ses" => { + // SesEmailSender::new takes &SdkConfig (sync), but + // aws_config::defaults().load() is async. We're in a + // sync fn called from #[tokio::main] (multi-thread), + // so block_in_place + block_on is the legal escape. + let region = std::env::var(env::BROKER_AWS_REGION) + .unwrap_or_else(|_| "us-east-1".to_string()); + tracing::info!( + from = %from_address, + region = %region, + "email_link sender backend: ses (aws-sdk-sesv2)" + ); + let sdk_config = tokio::task::block_in_place(|| { + tokio::runtime::Handle::current().block_on(async { + aws_config::defaults(aws_config::BehaviorVersion::latest()) + .region(aws_config::Region::new(region)) + .load() + .await + }) + }); + Arc::new(SesEmailSender::new(&sdk_config, from_address.clone())) + } + other => { + return Err(boot_fail( + env::BROKER_EMAIL_SENDER, + other, + "must be 'stub' or 'ses'", + "email-sender-backend", + )); + } + }; let plugin = EmailLinkAuth::new( sender, Arc::clone(&token_store), Arc::clone(&rl_store), - from_address, + from_address.clone(), landing_base, - hmac_key, ses_cache_path, per_email, per_ip, ) .map_err(|e| { boot_fail( - env::BROKER_EMAIL_HMAC_KEY_PATH, - &hmac_path, + env::BROKER_EMAIL_FROM_ADDRESS, + &from_address, format!("EmailLinkAuth::new: {}", e), "email-link-construct", ) diff --git a/crates/agentkeys-broker-server/src/env.rs b/crates/agentkeys-broker-server/src/env.rs index 31ff24b..dc02e30 100644 --- a/crates/agentkeys-broker-server/src/env.rs +++ b/crates/agentkeys-broker-server/src/env.rs @@ -137,10 +137,17 @@ pub const 
BROKER_EVM_PER_IDENTITY_DAILY_TX_BUDGET: &str = "BROKER_EVM_PER_IDENTI // Email auth (Phase A.1) // --------------------------------------------------------------------------- -/// Required when `email_link` is in `BROKER_AUTH_METHODS`. Path to a 32+ byte HMAC key file. -pub const BROKER_EMAIL_HMAC_KEY_PATH: &str = "BROKER_EMAIL_HMAC_KEY_PATH"; /// Required when `email_link` is in `BROKER_AUTH_METHODS`. Verified SES sender email address. +/// +/// **No HMAC key var.** Magic-link tokens are stateful (CSPRNG → SHA256 → SQLite EmailTokenStore → +/// single-use within TTL). See `crates/agentkeys-broker-server/src/plugins/auth/email_link.rs` +/// `EmailLinkAuth::new` doc + `docs/spec/architecture.md` §5a.1.M Stage 1. pub const BROKER_EMAIL_FROM_ADDRESS: &str = "BROKER_EMAIL_FROM_ADDRESS"; +/// Optional. Email sender backend selector — `stub` (default, in-process Vec) or `ses` +/// (real `aws-sdk-sesv2` SendEmail). When `ses`, the FROM identity must be SES-verified +/// (see `scripts/ses-verify-sender.sh`). Picks the SES region from `BROKER_AWS_REGION` +/// (or AWS SDK default chain). +pub const BROKER_EMAIL_SENDER: &str = "BROKER_EMAIL_SENDER"; /// Optional. Operator URL the broker redirects to after a successful email-link verification. /// If unset, the broker shows a minimal built-in "Verified — return to your terminal" page. 
pub const BROKER_EMAIL_SUCCESS_REDIRECT_URL: &str = "BROKER_EMAIL_SUCCESS_REDIRECT_URL"; @@ -243,8 +250,8 @@ pub const fn all() -> &'static [(&'static str, &'static str, Group)] { (BROKER_EVM_FEE_PAYER_MIN_BALANCE, "Wei threshold below which EVM anchor → Unready.", Group::AuditEvm), (BROKER_EVM_PER_IDENTITY_DAILY_TX_BUDGET, "Per-OmniAccount daily EVM-tx budget.", Group::AuditEvm), // Auth / email - (BROKER_EMAIL_HMAC_KEY_PATH, "Path to 32+ byte HMAC key for email tokens.", Group::AuthEmail), (BROKER_EMAIL_FROM_ADDRESS, "Verified SES sender email.", Group::AuthEmail), + (BROKER_EMAIL_SENDER, "Email backend: 'stub' (default) or 'ses' (real aws-sdk-sesv2).", Group::AuthEmail), (BROKER_EMAIL_SUCCESS_REDIRECT_URL, "Optional operator success-page redirect URL.", Group::AuthEmail), (BROKER_EMAIL_RATE_LIMIT_PER_EMAIL_HOURLY, "Per-email per-hour bucket.", Group::AuthEmail), (BROKER_EMAIL_RATE_LIMIT_PER_IP_MINUTELY, "Per-IP per-minute bucket.", Group::AuthEmail), diff --git a/crates/agentkeys-broker-server/src/jwt/session.rs b/crates/agentkeys-broker-server/src/jwt/session.rs index 9ae92eb..d6e799f 100644 --- a/crates/agentkeys-broker-server/src/jwt/session.rs +++ b/crates/agentkeys-broker-server/src/jwt/session.rs @@ -11,7 +11,7 @@ use base64::engine::general_purpose::URL_SAFE_NO_PAD; use base64::Engine; use jsonwebtoken::{encode, Algorithm, EncodingKey, Header}; use p256::ecdsa::SigningKey; -use p256::pkcs8::{DecodePrivateKey, EncodePrivateKey, LineEnding}; +use p256::pkcs8::{DecodePrivateKey, EncodePrivateKey, EncodePublicKey, LineEnding}; use serde::{Deserialize, Serialize}; use crate::error::{BrokerError, BrokerResult}; @@ -157,6 +157,18 @@ impl SessionKeypair { encode(&header, claims, &key) .map_err(|e| BrokerError::Internal(format!("sign session jwt: {e}"))) } + + /// Export the public component of this session keypair as a PEM-encoded + /// SubjectPublicKeyInfo (SPKI) string. 
The signer service reads this at + /// boot to verify broker session JWTs without holding the private key. + pub fn public_key_pem(&self) -> BrokerResult<String> { + let signing_key = SigningKey::from_pkcs8_pem(&self.private_key_pem) + .map_err(|e| BrokerError::Internal(format!("decode pkcs8 pem for pubkey export: {e}")))?; + let verifying_key = signing_key.verifying_key(); + verifying_key + .to_public_key_pem(LineEnding::LF) + .map_err(|e| BrokerError::Internal(format!("encode public key pem: {e}"))) + } } #[cfg(test)] diff --git a/crates/agentkeys-broker-server/src/main.rs b/crates/agentkeys-broker-server/src/main.rs index 7da8ead..ae692e0 100644 --- a/crates/agentkeys-broker-server/src/main.rs +++ b/crates/agentkeys-broker-server/src/main.rs @@ -30,6 +30,15 @@ struct Args { /// In production, leave this off so misconfigured creds fail fast. #[arg(long)] skip_startup_check: bool, + + /// On boot, write the broker's session keypair **public key** (SPKI PEM, + /// mode 0644) to this path. The signer service (`--signer-only`) reads + /// it to verify bearer JWTs without holding the private key. + /// + /// Idempotent: re-runs overwrite the file (pubkey is stable unless the + /// broker keypair is regenerated via `keygen --purpose session`). + #[arg(long)] + export_session_pubkey_to: Option<std::path::PathBuf>, } #[derive(Subcommand)] @@ -80,6 +89,31 @@ async fn main() -> anyhow::Result<()> { // validates plugin selection, opens stores, builds registry. Any // failure here exits with a single-line BOOT_FAIL message. let boot_artifacts = run_tier1(&config)?; + + // Export session pubkey if requested (issue #74 step 1b). Must happen + // after Tier-1 so the session keypair is loaded. Overwrites on every + // boot (pubkey is stable unless keygen was re-run). 
+ if let Some(ref pubkey_path) = args.export_session_pubkey_to { + let pem = boot_artifacts + .session_keypair + .public_key_pem() + .map_err(|e| anyhow::anyhow!("export session pubkey: {e}"))?; + if let Some(parent) = pubkey_path.parent() { + std::fs::create_dir_all(parent) + .map_err(|e| anyhow::anyhow!("create dirs for pubkey export: {e}"))?; + } + std::fs::write(pubkey_path, &pem) + .map_err(|e| anyhow::anyhow!("write session pubkey to {pubkey_path:?}: {e}"))?; + // mode 0644 so the agentkeys-signer service (same user) can read it + #[cfg(unix)] + { + use std::os::unix::fs::PermissionsExt; + std::fs::set_permissions(pubkey_path, std::fs::Permissions::from_mode(0o644)) + .map_err(|e| anyhow::anyhow!("chmod 0644 {pubkey_path:?}: {e}"))?; + } + tracing::info!(path = %pubkey_path.display(), "wrote session pubkey PEM (signer can read it)"); + } + let tier2_profile = Tier2Profile::from_config(&config); tracing::info!( strict = tier2_profile.strict, @@ -183,9 +217,11 @@ async fn main() -> anyhow::Result<()> { /// Spawn the Tier-2 reachability probes that flip the AtomicBool flags /// on `Tier2State` as each external dependency becomes reachable. /// -/// Phase 0 ships only the backend probe (the only Tier-2 check whose -/// dependencies exist this early). SES + EVM probes land in Phase A.1 -/// and Phase C respectively, behind their feature gates. +/// Currently spawns the backend probe (always) and, when email-link auth +/// is compiled in and enabled, the SES sender-verify probe that also +/// persists `SesVerifyCache` to disk so the email-link plug-in's +/// `Readiness::ready()` flips from `Degraded` to `Ready`. The EVM probe +/// lands in Phase C. fn spawn_tier2_probes( state: Arc, profile: agentkeys_broker_server::boot::Tier2Profile, @@ -223,6 +259,87 @@ fn spawn_tier2_probes( } } }); + + #[cfg(feature = "auth-email-link")] + if profile.email_link_enabled { + spawn_ses_verify_probe(Arc::clone(&state), strict); + } +} + +/// SES sender-verify probe. 
Calls `verify_sender_ready()` on the +/// configured `EmailSender`, persists `SesVerifyCache` on success so the +/// plug-in's `Readiness` flips to `Ready`, and flips the `tier2/ses` +/// `AtomicBool`. Retries with exponential backoff on failure (capped at +/// 5 minutes); after a success, re-verifies every 12h so the cache stays +/// under the plug-in's 24h freshness TTL. +#[cfg(feature = "auth-email-link")] +fn spawn_ses_verify_probe(state: Arc, strict: bool) { + use std::sync::atomic::Ordering; + use std::time::{SystemTime, UNIX_EPOCH}; + + use agentkeys_broker_server::plugins::auth::SesVerifyCache; + + let Some(email_link) = state.email_link.clone() else { + tracing::error!( + "Tier-2 SES probe: email_link is in BROKER_AUTH_METHODS but the \ + concrete plug-in handle is missing from AppState — /readyz will \ + stay degraded. Indicates a build/config bug." + ); + return; + }; + + tokio::spawn(async move { + let mut backoff_seconds: u64 = 30; + loop { + match email_link.sender.verify_sender_ready().await { + Ok(()) => { + let now = SystemTime::now() + .duration_since(UNIX_EPOCH) + .map(|d| d.as_secs() as i64) + .unwrap_or(0); + let cache = SesVerifyCache { + last_verified_at: now, + sender_email: email_link.from_address.clone(), + }; + match cache.save(&email_link.ses_verify_cache_path) { + Ok(()) => { + state.tier2.ses_verified.store(true, Ordering::Relaxed); + tracing::info!( + sender = %email_link.from_address, + path = %email_link.ses_verify_cache_path.display(), + "Tier-2 SES probe: sender verified; cache persisted" + ); + } + Err(e) => { + tracing::error!( + error = %e, + path = %email_link.ses_verify_cache_path.display(), + "Tier-2 SES probe: verify succeeded but cache save failed; auth/email_link readiness will stay degraded" + ); + } + } + backoff_seconds = 30; + tokio::time::sleep(std::time::Duration::from_secs(12 * 3600)).await; + } + Err(e) => { + if strict { + tracing::error!( + error = %e, + "BROKER_REFUSE_TO_BOOT_STRICT=true and SES sender verify 
failed; exiting" + ); + std::process::exit(1); + } + tracing::warn!( + error = %e, + retry_seconds = backoff_seconds, + "Tier-2 SES probe: sender verify failed; /readyz will report unready until verified" + ); + tokio::time::sleep(std::time::Duration::from_secs(backoff_seconds)).await; + backoff_seconds = (backoff_seconds * 2).min(300); + } + } + } + }); } async fn shutdown_signal() { diff --git a/crates/agentkeys-broker-server/src/plugins/auth/email_link.rs b/crates/agentkeys-broker-server/src/plugins/auth/email_link.rs index 4ba0817..2763588 100644 --- a/crates/agentkeys-broker-server/src/plugins/auth/email_link.rs +++ b/crates/agentkeys-broker-server/src/plugins/auth/email_link.rs @@ -30,7 +30,6 @@ use std::time::{SystemTime, UNIX_EPOCH}; use async_trait::async_trait; use serde_json::json; -use crate::env; use crate::plugins::auth::{ AuthChallenge, AuthError, AuthResponse, ChallengeParams, IdentityType, UserAuthMethod, VerifiedIdentity, @@ -124,6 +123,154 @@ impl EmailSender for StubEmailSender { } } +// ─── Real SES sender (Pass 1 of Option B) ─────────────────────────────────── +// +// Production wiring of the EmailSender trait against AWS SES v2. Issued +// by `setup-broker-host.sh` via instance-profile creds; FROM is a verified +// identity in the broker host's account (typically noreply@). +// +// Failure modes map to EmailSendError variants: +// - SendEmail RPC fails / message rejected → EmailSendError::Send +// - GetEmailIdentity fails / SendingEnabled=false / VerificationStatus≠Success +// → EmailSendError::Verify +// - Constructor receives empty from_address → EmailSendError::Config (lazy) +// +// The integration test in tests/ses_email_flow.rs exercises this against +// the real AWS account by sending to a unique magic-link-test-{uuid}@ +// address that the SES inbound rule routes to the agentkeys-mail-* S3 bucket. + +const SES_SUBJECT: &str = "Your AgentKeys sign-in link"; + +/// Plaintext template — magic link is appended verbatim. 
Kept simple + +/// inlined (no template engine dep) so the body is auditable at a glance. +fn ses_body_text(landing_url: &str) -> String { + format!( + "Click the link below to finish signing in to AgentKeys.\n\n\ + {landing_url}\n\n\ + The link is single-use and expires in 10 minutes. If you didn't \ + request this, you can ignore this message.\n", + ) +} + +/// HTML template — minimal (no CSS, no images) to avoid spam-filter noise +/// and to keep the body identical in structure to the plaintext alternative. +fn ses_body_html(landing_url: &str) -> String { + format!( + "

<p>Click the link below to finish signing in to AgentKeys.</p>\ + <p><a href=\"{landing_url}\">{landing_url}</a></p>\ + <p>The link is single-use \ + and expires in 10 minutes. If you didn't request this, you can \ + ignore this message.</p>

", + ) +} + +#[cfg(feature = "auth-email-link")] +pub struct SesEmailSender { + client: aws_sdk_sesv2::Client, + from_address: String, +} + +#[cfg(feature = "auth-email-link")] +impl SesEmailSender { + /// Construct from a pre-loaded SDK config + verified FROM address. + /// Doesn't verify the address up front — `verify_sender_ready` does + /// that on a 24h cadence (matches StubEmailSender's contract). + pub fn new(sdk_config: &aws_config::SdkConfig, from_address: String) -> Self { + Self { + client: aws_sdk_sesv2::Client::new(sdk_config), + from_address, + } + } + + /// Test/internal accessor — returns the FROM address. Used by the + /// integration test to assert the constructor wired correctly. + pub fn from_address(&self) -> &str { + &self.from_address + } +} + +#[cfg(feature = "auth-email-link")] +#[async_trait] +impl EmailSender for SesEmailSender { + async fn send_magic_link(&self, to: &str, landing_url: &str) -> Result<(), EmailSendError> { + if self.from_address.is_empty() { + return Err(EmailSendError::Config("from_address is empty".into())); + } + use aws_sdk_sesv2::types::{Body, Content, Destination, EmailContent, Message}; + + let subject = Content::builder() + .data(SES_SUBJECT) + .charset("UTF-8") + .build() + .map_err(|e| EmailSendError::Send(format!("build subject: {e}")))?; + let text_part = Content::builder() + .data(ses_body_text(landing_url)) + .charset("UTF-8") + .build() + .map_err(|e| EmailSendError::Send(format!("build text body: {e}")))?; + let html_part = Content::builder() + .data(ses_body_html(landing_url)) + .charset("UTF-8") + .build() + .map_err(|e| EmailSendError::Send(format!("build html body: {e}")))?; + + let body = Body::builder().text(text_part).html(html_part).build(); + let message = Message::builder().subject(subject).body(body).build(); + let dest = Destination::builder().to_addresses(to).build(); + let content = EmailContent::builder().simple(message).build(); + + self.client + .send_email() + 
.from_email_address(&self.from_address) + .destination(dest) + .content(content) + .send() + .await + .map(|_| ()) + .map_err(|e| EmailSendError::Send(format!("ses SendEmail: {}", e.into_service_error()))) + } + + async fn verify_sender_ready(&self) -> Result<(), EmailSendError> { + // Single explicit per-address lookup. The operator must register + // the FROM identity explicitly via: + // + // aws sesv2 create-email-identity \ + // --email-identity $BROKER_EMAIL_FROM_ADDRESS + // + // (then click the verification link that SES routes to the inbound + // S3 bucket). See scripts/ses-verify-sender.sh for the helper. + // We deliberately do NOT fall back to the domain identity — domain + // verification grants sending rights but obscures intent; an + // explicit per-address identity makes the verified sender visible + // in `aws sesv2 list-email-identities`. + let resp = self + .client + .get_email_identity() + .email_identity(&self.from_address) + .send() + .await + .map_err(|e| { + EmailSendError::Verify(format!( + "ses GetEmailIdentity({}): {} — register via \ + `aws sesv2 create-email-identity --email-identity {}` \ + and click the verification link", + self.from_address, + e.into_service_error(), + self.from_address, + )) + })?; + + if !resp.verified_for_sending_status() { + return Err(EmailSendError::Verify(format!( + "{} exists in SES but verified_for_sending_status=false — \ + click the verification link from the SES bootstrap email", + self.from_address + ))); + } + Ok(()) + } +} + /// Persisted SES verification cache. Survives restart so debug-loops /// don't burn SES API budget (Codex P2 #8 mitigation, V0.1-FOLLOWUPS R2-F8). #[derive(serde::Serialize, serde::Deserialize, Debug, Clone)] @@ -163,42 +310,40 @@ pub struct EmailLinkAuth { pub rate_limit_store: Arc, pub from_address: String, pub landing_url_base: String, // e.g. 
"https://broker.example.com/auth/email/landing" - pub hmac_key: Vec, pub ses_verify_cache_path: PathBuf, pub per_email_hourly_limit: i64, pub per_ip_minutely_limit: i64, } impl EmailLinkAuth { - /// Construct from already-loaded dependencies. The `hmac_key` MUST - /// be at least 32 bytes (boot validates this; the constructor - /// re-checks to make accidental misuse a hard error). - #[allow(clippy::too_many_arguments)] // 9 deps; refactoring into a builder hides nothing + /// Construct from already-loaded dependencies. + /// + /// **No HMAC key.** Per `docs/spec/architecture.md` §5a.1.M Stage 1 + /// and the K1–K11 inventory in §3, the magic-link is stateful: + /// the token is generated CSPRNG, `SHA256(token)` is keyed by + /// `request_id` in `EmailTokenStore`, and the broker confirms + /// single-use within TTL on click. No HMAC signature is needed — + /// the security comes from token randomness, stateful TTL, and + /// consume-once. (Earlier `hmac_key` field was vestigial — never + /// used cryptographically — and was removed alongside the + /// BROKER_EMAIL_HMAC_KEY_PATH env var to align with arch.md.) 
+ #[allow(clippy::too_many_arguments)] // 8 deps; refactoring into a builder hides nothing pub fn new( sender: Arc, token_store: Arc, rate_limit_store: Arc, from_address: impl Into, landing_url_base: impl Into, - hmac_key: Vec, ses_verify_cache_path: PathBuf, per_email_hourly_limit: i64, per_ip_minutely_limit: i64, ) -> Result { - if hmac_key.len() < 32 { - return Err(AuthError::Internal(format!( - "{} must be >= 32 bytes, got {}", - env::BROKER_EMAIL_HMAC_KEY_PATH, - hmac_key.len() - ))); - } Ok(Self { sender, token_store, rate_limit_store, from_address: from_address.into(), landing_url_base: landing_url_base.into(), - hmac_key, ses_verify_cache_path, per_email_hourly_limit, per_ip_minutely_limit, @@ -406,7 +551,6 @@ mod tests { rate_limit_store, "broker@example.com", "https://broker.test/auth/email/landing", - vec![0u8; 32], tmp.path().join("ses-verify.json"), 5, 30, @@ -579,25 +723,6 @@ mod tests { assert!(p.ready().is_ready()); } - #[tokio::test] - async fn hmac_key_too_short_rejected() { - let token_store = Arc::new(EmailTokenStore::open_in_memory().unwrap()); - let rate_limit_store = Arc::new(EmailRateLimitStore::open_in_memory().unwrap()); - let sender: Arc = Arc::new(StubEmailSender::new()); - let res = EmailLinkAuth::new( - sender, - token_store, - rate_limit_store, - "broker@example.com", - "https://broker.test/auth/email/landing", - vec![0u8; 16], // < 32 bytes - std::path::PathBuf::from("/tmp/dummy.json"), - 5, - 30, - ); - assert!(res.is_err()); - } - #[tokio::test] async fn rate_limit_per_ip_enforced() { let (p, _s, _t) = make_plugin(); @@ -619,4 +744,52 @@ mod tests { .await; assert!(matches!(res, Err(AuthError::RateLimited(_)))); } + + // ─── SesEmailSender body composition (US-3) ────────────────────────── + // No AWS calls — pure string-composition checks. Guards the operator's + // "click the link" path: if the magic link doesn't appear in both + // alternatives, the recipient can't sign in regardless of SES delivery. 
+ + #[test] + fn ses_subject_is_non_empty() { + assert!(!SES_SUBJECT.is_empty()); + } + + #[test] + fn ses_text_body_contains_landing_url() { + let url = "https://broker.example/auth/email/landing#t=ABC.DEF"; + let body = ses_body_text(url); + assert!(body.contains(url), "text body must contain landing URL: {body}"); + assert!( + body.contains("AgentKeys") || body.contains("agentkeys"), + "text body should mention the product" + ); + } + + #[test] + fn ses_html_body_contains_landing_url_twice() { + // Once in href attribute, once as visible link text — keeps the + // body usable in clients that strip wrapping. + let url = "https://broker.example/auth/email/landing#t=XYZ.123"; + let body = ses_body_html(url); + let occurrences = body.matches(url).count(); + assert!( + occurrences >= 2, + "html body should contain landing URL at least twice (href + text), got {}: {}", + occurrences, + body + ); + } + + #[test] + fn ses_text_and_html_alternatives_both_present() { + // Sanity-check: body composers don't return the same string — + // SES wraps them as multipart/alternative so they must differ. 
+ let url = "https://example.test/landing#t=tok"; + assert_ne!( + ses_body_text(url), + ses_body_html(url), + "text and html alternatives must differ" + ); + } } diff --git a/crates/agentkeys-broker-server/src/plugins/auth/mod.rs b/crates/agentkeys-broker-server/src/plugins/auth/mod.rs index be9d965..19a4789 100644 --- a/crates/agentkeys-broker-server/src/plugins/auth/mod.rs +++ b/crates/agentkeys-broker-server/src/plugins/auth/mod.rs @@ -18,7 +18,9 @@ pub mod oauth2; pub mod wallet_sig; #[cfg(feature = "auth-email-link")] -pub use email_link::{EmailLinkAuth, EmailSendError, EmailSender, SesVerifyCache, StubEmailSender}; +pub use email_link::{ + EmailLinkAuth, EmailSendError, EmailSender, SesEmailSender, SesVerifyCache, StubEmailSender, +}; #[cfg(feature = "auth-oauth2")] pub use oauth2::{ OAuth2Auth, OAuth2Error, OAuth2Provider, StubOAuth2Provider, TokenExchangeOutcome, diff --git a/crates/agentkeys-broker-server/tests/email_flow.rs b/crates/agentkeys-broker-server/tests/email_flow.rs index b097e25..7648c4d 100644 --- a/crates/agentkeys-broker-server/tests/email_flow.rs +++ b/crates/agentkeys-broker-server/tests/email_flow.rs @@ -65,7 +65,6 @@ async fn spawn_broker() -> (String, Arc, Arc) { Arc::clone(&rl_store), "broker@example.test", format!("{}/auth/email/landing", TEST_ISSUER), - vec![0u8; 32], tmp.path().join("ses-verify.json"), 5, 30, diff --git a/crates/agentkeys-broker-server/tests/ses_email_flow.rs b/crates/agentkeys-broker-server/tests/ses_email_flow.rs new file mode 100644 index 0000000..d2e735a --- /dev/null +++ b/crates/agentkeys-broker-server/tests/ses_email_flow.rs @@ -0,0 +1,410 @@ +//! End-to-end SES → S3 round-trip integration test for SesEmailSender. +//! +//! Exercises the production sender path: build SesEmailSender against the +//! real AWS account, send a magic-link to a unique +//! `magic-link-test-{uuid}@` recipient, and poll the inbound +//! S3 bucket (provisioned per `docs/cloud-setup.md` §2.1) until the MIME +//! object lands. 
Then assert the body contains the unique token + landing +//! URL, and clean up every test object before exiting. +//! +//! ## Skipping +//! +//! Marked `#[ignore]` so `cargo test` skips it. Run explicitly: +//! +//! ```bash +//! awsp agentkeys-admin +//! RUN_SES_INTEGRATION_TESTS=1 ACCOUNT_ID=429071895007 \ +//! cargo test -p agentkeys-broker-server --features auth-email-link \ +//! --test ses_email_flow -- --ignored +//! ``` +//! +//! Without `RUN_SES_INTEGRATION_TESTS=1` the test still gets invoked by +//! `--ignored`, but early-returns with a `println!` skip notice so a CI +//! that runs `--ignored` without AWS creds doesn't false-fail. +//! +//! ## Cleanup invariant +//! +//! Whether the test passes, fails, or panics mid-flow, every S3 object +//! whose body contains the per-test UUID is deleted. Implemented by running +//! the test body under `catch_unwind` and then awaiting +//! `cleanup_test_objects`, so a panic doesn't leak a test message into +//! the bucket's 30-day TTL window. + +#![cfg(feature = "auth-email-link")] + +use std::time::Duration; + +use agentkeys_broker_server::plugins::auth::{EmailSender, SesEmailSender}; +use aws_sdk_s3::Client as S3Client; + +const ENV_GATE: &str = "RUN_SES_INTEGRATION_TESTS"; +const DEFAULT_REGION: &str = "us-east-1"; +const DEFAULT_MAIL_DOMAIN: &str = "bots.litentry.org"; +const DEFAULT_FROM_LOCAL: &str = "noreply-test"; // → noreply-test@ +const POLL_INTERVAL: Duration = Duration::from_secs(5); +const POLL_MAX_ATTEMPTS: usize = 12; // 60s total +const INBOUND_PREFIX: &str = "inbound/"; + +struct TestEnv { + region: String, + account_id: String, + mail_domain: String, + bucket: String, + from_address: String, +} + +impl TestEnv { + fn from_env_or_skip() -> Option<Self> { + if std::env::var(ENV_GATE).ok().as_deref() != Some("1") { + println!( + "ses_email_flow: SKIP — set {}=1 to run the live SES round-trip", + ENV_GATE + ); + return None; + } + let account_id = match std::env::var("ACCOUNT_ID") { + Ok(v) if !v.is_empty() => v, + _ => { + println!("ses_email_flow: SKIP — ACCOUNT_ID env
var required"); + return None; + } + }; + let region = std::env::var("AWS_REGION") + .or_else(|_| std::env::var("REGION")) + .unwrap_or_else(|_| DEFAULT_REGION.to_string()); + let mail_domain = + std::env::var("MAIL_DOMAIN").unwrap_or_else(|_| DEFAULT_MAIL_DOMAIN.to_string()); + let bucket = std::env::var("MAIL_BUCKET") + .unwrap_or_else(|_| format!("agentkeys-mail-{}", account_id)); + // BROKER_EMAIL_FROM_ADDRESS matches the env var the broker reads at + // runtime (per crates/agentkeys-broker-server/src/env.rs:143). Default + // to noreply-test@ — must be registered + verified per + // scripts/ses-verify-sender.sh before this test will pass. + let from_address = std::env::var("BROKER_EMAIL_FROM_ADDRESS") + .unwrap_or_else(|_| format!("{}@{}", DEFAULT_FROM_LOCAL, mail_domain)); + Some(Self { + region, + account_id, + mail_domain, + bucket, + from_address, + }) + } +} + +/// Explicit async cleanup. Two modes: +/// +/// 1. **Fast path** (happy case): the poll loop already located the +/// inbound object containing our token — `fast_key=Some(...)`. We +/// just `DeleteObject` that one key. ~1 RPC, sub-second. +/// +/// 2. **Slow path** (test panicked before poll found the key): scan +/// all of `inbound/`, GetObject + body-grep, delete any object whose +/// body contains the per-test UUID. O(N) GetObject calls — slow, +/// but only triggers on test failure. +/// +/// The per-token body match is production-safe because v4 UUIDs carry 122 +/// random bits (~10^-37 collision probability with any production email). +/// The cleanup ONLY deletes objects whose body contains this specific +/// test's UUID — every other inbound (production, other tests, SES +/// verification mails) is left intact.
+async fn cleanup_test_objects( + s3: &S3Client, + bucket: &str, + token: &str, + fast_key: Option, +) { + if let Some(key) = fast_key { + log("cleanup: fast-path delete of {}", &[&key]); + match s3.delete_object().bucket(bucket).key(&key).send().await { + Ok(_) => log("cleanup: deleted {} (fast path, 1 RPC)", &[&key]), + Err(e) => log("cleanup: delete {} failed: {}", &[&key, &format!("{e}")]), + } + return; + } + + // Slow scan only when the poll didn't find the key (test panicked early). + log( + "cleanup: SLOW path — poll didn't return a key, scanning all inbound/ for token={}", + &[token], + ); + let listed = match s3 + .list_objects_v2() + .bucket(bucket) + .prefix(INBOUND_PREFIX) + .send() + .await + { + Ok(r) => r, + Err(e) => { + log("cleanup: list_objects_v2 failed: {} (skipping)", &[&format!("{e}")]); + return; + } + }; + let total = listed.contents().len(); + log( + "cleanup: bucket has {} object(s); scanning for token (this is slow)", + &[&total.to_string()], + ); + let mut deleted = 0usize; + for obj in listed.contents() { + let Some(key) = obj.key() else { continue }; + let body = match s3.get_object().bucket(bucket).key(key).send().await { + Ok(o) => match o.body.collect().await { + Ok(b) => String::from_utf8_lossy(&b.to_vec()).to_string(), + Err(_) => continue, + }, + Err(_) => continue, + }; + if body.contains(token) { + match s3.delete_object().bucket(bucket).key(key).send().await { + Ok(_) => { + log("cleanup: deleted {}", &[key]); + deleted += 1; + } + Err(e) => log("cleanup: delete {} failed: {}", &[key, &format!("{e}")]), + } + } + } + log( + "cleanup: slow-scan done — deleted {} object(s) matching token", + &[&deleted.to_string()], + ); +} + +#[tokio::test(flavor = "multi_thread")] +#[ignore = "live AWS round-trip — requires RUN_SES_INTEGRATION_TESTS=1 + agentkeys-admin creds"] +async fn ses_send_and_receive_round_trip() { + let Some(env) = TestEnv::from_env_or_skip() else { + return; + }; + + let token = uuid::Uuid::new_v4().to_string(); + 
let recipient = format!("magic-link-test-{}@{}", token, env.mail_domain); + let from_address = env.from_address.clone(); + let landing_url = format!("https://test.example/landing?token={}", token); + + log("account={} region={}", &[&env.account_id, &env.region]); + log("bucket={}", &[&env.bucket]); + log("from={} → to={}", &[&from_address, &recipient]); + log("token={}", &[&token]); + + let sdk_config = aws_config::defaults(aws_config::BehaviorVersion::latest()) + .region(aws_config::Region::new(env.region.clone())) + .load() + .await; + + let sender = SesEmailSender::new(&sdk_config, from_address.clone()); + assert_eq!(sender.from_address(), from_address); + + // Pre-flight: confirm the FROM identity is verified for sending. + log("verify_sender_ready: calling SES GetEmailIdentity({})", &[&from_address]); + sender + .verify_sender_ready() + .await + .expect("FROM identity not verified for sending — run scripts/ses-verify-sender.sh"); + log("verify_sender_ready: ok", &[]); + + let s3 = S3Client::new(&sdk_config); + + // Shared slot the poll loop writes into when it finds the matching + // inbound object. Cleanup reads it post-catch_unwind to fast-path + // a single DeleteObject (vs scanning the whole `inbound/` prefix). + let found_key: std::sync::Arc<std::sync::Mutex<Option<String>>> = + std::sync::Arc::new(std::sync::Mutex::new(None)); + + // Run the send + poll + assert flow inside catch_unwind so we can + // ALWAYS run cleanup before propagating any panic. AssertUnwindSafe + // is needed because S3Client + the captured &env contain interior + // mutability and references — neither implements UnwindSafe by + // default. Test failure semantics are unchanged: a panic inside the + // body still fails the test, just AFTER cleanup has run.
+ use futures_util::FutureExt; + let body_result = std::panic::AssertUnwindSafe(run_send_and_poll( + &sender, + &s3, + &env, + &token, + &recipient, + &landing_url, + found_key.clone(), + )) + .catch_unwind() + .await; + + let fast_key = found_key.lock().unwrap().take(); + cleanup_test_objects(&s3, &env.bucket, &token, fast_key).await; + + if let Err(panic) = body_result { + std::panic::resume_unwind(panic); + } + log("test ok — all steps complete", &[]); +} + +/// Test body extracted so it can run inside catch_unwind without polluting +/// the outer cleanup path. Sends the magic link, polls S3 for the inbound +/// MIME object, asserts the body contains the token + landing URL. +/// +/// Writes the found key into `found_key_slot` so the outer cleanup path +/// can fast-path a single DeleteObject (vs scanning the entire bucket). +async fn run_send_and_poll( + sender: &SesEmailSender, + s3: &S3Client, + env: &TestEnv, + token: &str, + recipient: &str, + landing_url: &str, + found_key_slot: std::sync::Arc<std::sync::Mutex<Option<String>>>, +) { + log("send_magic_link: calling SES SendEmail…", &[]); + sender + .send_magic_link(recipient, landing_url) + .await + .expect("SES SendEmail failed"); + log("send_magic_link: ok — polling for inbound delivery to S3", &[]); + + // Poll S3 for an inbound object whose body contains our unique token. + // To keep iteration fast even when the bucket has thousands of stale + // objects, sort by LastModified desc and examine only the most recent + // EXAMINE_PER_ATTEMPT objects each iteration.
+ const EXAMINE_PER_ATTEMPT: usize = 20; + let mut found_body: Option = None; + 'poll: for attempt in 1..=POLL_MAX_ATTEMPTS { + log( + "attempt {}/{} — list_objects_v2 prefix={}", + &[&attempt.to_string(), &POLL_MAX_ATTEMPTS.to_string(), INBOUND_PREFIX], + ); + let listed = match s3 + .list_objects_v2() + .bucket(&env.bucket) + .prefix(INBOUND_PREFIX) + .send() + .await + { + Ok(r) => r, + Err(e) => { + log( + "attempt {}: list_objects_v2 ERROR: {}", + &[&attempt.to_string(), &format!("{e}")], + ); + tokio::time::sleep(POLL_INTERVAL).await; + continue 'poll; + } + }; + let total = listed.contents().len(); + // Newest first. + let mut objs: Vec<_> = listed.contents().to_vec(); + objs.sort_by(|a, b| b.last_modified().cmp(&a.last_modified())); + let recent = &objs[..objs.len().min(EXAMINE_PER_ATTEMPT)]; + log( + "attempt {}: bucket has {} object(s); examining {} most recent", + &[ + &attempt.to_string(), + &total.to_string(), + &recent.len().to_string(), + ], + ); + + for (i, obj) in recent.iter().enumerate() { + let Some(key) = obj.key() else { continue }; + let object = match s3.get_object().bucket(&env.bucket).key(key).send().await { + Ok(o) => o, + Err(e) => { + log( + " [{}/{}] {} get_object ERROR: {}", + &[ + &(i + 1).to_string(), + &recent.len().to_string(), + key, + &format!("{e}"), + ], + ); + continue; + } + }; + let bytes = match object.body.collect().await { + Ok(b) => b.to_vec(), + Err(e) => { + log( + " [{}/{}] {} body.collect ERROR: {}", + &[ + &(i + 1).to_string(), + &recent.len().to_string(), + key, + &format!("{e}"), + ], + ); + continue; + } + }; + let body_str = String::from_utf8_lossy(&bytes).to_string(); + let hit = body_str.contains(token); + log( + " [{}/{}] {} size={}B contains_token={}", + &[ + &(i + 1).to_string(), + &recent.len().to_string(), + key, + &bytes.len().to_string(), + if hit { "YES" } else { "no" }, + ], + ); + if hit { + log("attempt {}: FOUND token in {}", &[&attempt.to_string(), key]); + // Publish the key so cleanup can 
fast-path a single DeleteObject. + *found_key_slot.lock().unwrap() = Some(key.to_string()); + found_body = Some(body_str); + break; + } + } + if found_body.is_some() { + break 'poll; + } + log( + "attempt {}: token not in {} most recent objects, sleeping {}s", + &[ + &attempt.to_string(), + &recent.len().to_string(), + &POLL_INTERVAL.as_secs().to_string(), + ], + ); + tokio::time::sleep(POLL_INTERVAL).await; + } + + let body = found_body.unwrap_or_else(|| { + panic!( + "inbound MIME object containing test token {} did not arrive in {}s. \ + Possible causes: SES in sandbox + recipient unverified; SES suppressed \ + the address; SES receipt rule not active for {} (check: \ + aws ses describe-active-receipt-rule-set --region {})", + token, + POLL_INTERVAL.as_secs() * POLL_MAX_ATTEMPTS as u64, + env.mail_domain, + env.region, + ) + }); + assert!( + body.contains(token), + "MIME body must contain unique token {token}" + ); + assert!( + body.contains(landing_url) || body.contains(&landing_url.replace('=', "=3D")), + "MIME body must contain landing URL {landing_url} (allowing for quoted-printable encoding)" + ); + log("send_and_poll: ok", &[]); +} + +/// Unbuffered logger used throughout this test. Stdout in `cargo test +/// --nocapture` is piped (not a TTY) so println! is fully buffered and +/// hides per-attempt progress until the test completes — eprintln! + +/// explicit flush gives instant feedback. 
+fn log(template: &str, args: &[&str]) { + use std::io::Write; + let mut out = template.to_string(); + for arg in args { + if let Some(pos) = out.find("{}") { + out.replace_range(pos..pos + 2, arg); + } + } + eprintln!("ses_email_flow: {}", out); + let _ = std::io::stderr().flush(); +} diff --git a/crates/agentkeys-cli/Cargo.toml b/crates/agentkeys-cli/Cargo.toml index b796b7e..90cd0c2 100644 --- a/crates/agentkeys-cli/Cargo.toml +++ b/crates/agentkeys-cli/Cargo.toml @@ -15,7 +15,7 @@ path = "src/lib.rs" agentkeys-types = { workspace = true } agentkeys-core = { workspace = true } agentkeys-provisioner = { path = "../agentkeys-provisioner" } -clap = { version = "4", features = ["derive"] } +clap = { version = "4", features = ["derive", "env"] } tokio = { workspace = true } serde_json = { workspace = true } serde = { workspace = true } diff --git a/crates/agentkeys-cli/src/lib.rs b/crates/agentkeys-cli/src/lib.rs index 77c743b..36b463d 100644 --- a/crates/agentkeys-cli/src/lib.rs +++ b/crates/agentkeys-cli/src/lib.rs @@ -2,9 +2,11 @@ use std::collections::HashMap; use std::sync::Arc; use agentkeys_core::backend::{BackendError, CredentialBackend}; +use agentkeys_core::init_flow; use agentkeys_core::mock_client::MockHttpClient; pub use agentkeys_core::session_store; use agentkeys_core::session_store::SessionStore; +use agentkeys_core::signer_client::{HttpSignerClient, SignerClient, SignerClientError}; use agentkeys_provisioner::{ aws_creds::fetch_via_broker_default_ttl, run_provision, ProvisionError, Provisioner, }; @@ -110,6 +112,16 @@ impl CommandContext { self } + /// Override the session namespace. Empty strings fall back to the + /// `"master"` default so a forgotten `AGENTKEYS_SESSION_ID=` shell + /// export doesn't silently write to `~/.agentkeys//session.json`. 
+ pub fn with_session_id(mut self, session_id: String) -> Self { + if !session_id.is_empty() { + self.session_id = session_id; + } + self + } + pub fn with_session(mut self, session: Session) -> Self { self.session_override = Some(session); self @@ -157,17 +169,97 @@ impl CommandContext { } } -pub async fn cmd_init(ctx: &CommandContext, mock_token: Option) -> Result<(String, Session)> { - let token_str = mock_token.unwrap_or_else(|| "mock-default".to_string()); +/// `agentkeys init` modes per issue #74 step 1. +/// +/// The legacy `--mock-token` flag has been hard-cut from the CLI surface +/// per the plan's CEO-review §8 ("no deprecation runway, clean slate this +/// PR"). The internal mock-token path stays as `ImportLegacyMock` for unit +/// tests only — `agentkeys-cli/src/main.rs` does NOT route to it. +pub enum InitMode { + /// Email-link auth: drives `POST /v1/auth/email/request` + polls + /// `GET /v1/auth/email/status/` until the operator clicks the + /// magic link. On success, derives the EVM wallet via + /// `POST /dev/derive-address`, links it to the email-omni via + /// `POST /v1/wallet/link`, runs the SIWE round-trip with the signer + /// signing on behalf of the email-omni, and saves the resulting + /// EVM-omni session JWT. + Email { + email: String, + broker_url: String, + signer_url: String, + chain_id: u64, + poll_timeout_seconds: u64, + }, + + /// OAuth2/Google auth: same chain as `Email` but bootstraps via + /// `POST /v1/auth/oauth2/start` + `GET /v1/auth/oauth2/status/`. + /// The CLI prints the authorization URL — the operator opens it in a + /// browser, completes the flow, and the CLI's poll loop catches the + /// callback. + Oauth2Google { + broker_url: String, + signer_url: String, + chain_id: u64, + poll_timeout_seconds: u64, + }, + + /// Hermetic test seam — accepts a mock token and creates a legacy + /// session via the backend's `/session/create` endpoint. No CLI flag + /// exposes this; only `cli_tests.rs` constructs it. 
Production + /// deployments cannot use this mode at all. + #[doc(hidden)] + ImportLegacyMock(String), +} + +pub async fn cmd_init(ctx: &CommandContext, mode: InitMode) -> Result<(String, Session)> { + match mode { + InitMode::ImportLegacyMock(token) => init_legacy_mock(ctx, token).await, + InitMode::Email { + email, + broker_url, + signer_url, + chain_id, + poll_timeout_seconds, + } => { + init_via_email_link( + ctx, + &email, + &broker_url, + &signer_url, + chain_id, + poll_timeout_seconds, + ) + .await + } + InitMode::Oauth2Google { + broker_url, + signer_url, + chain_id, + poll_timeout_seconds, + } => { + init_via_oauth2_google( + ctx, + &broker_url, + &signer_url, + chain_id, + poll_timeout_seconds, + ) + .await + } + } +} +/// Test-only: legacy `/session/create` path. Production cannot reach this +/// (CLI surface drops `--mock-token`). +async fn init_legacy_mock(ctx: &CommandContext, token: String) -> Result<(String, Session)> { if ctx.verbose { eprintln!("[verbose] POST {}/session/create", ctx.backend_url); - eprintln!("[verbose] auth_token: {}", token_str); + eprintln!("[verbose] auth_token: {}", token); } let backend = ctx.backend(); let (session, wallet) = backend - .create_session(AuthToken::Mock(token_str)) + .create_session(AuthToken::Mock(token)) .await .map_err(wrap_backend_error)?; @@ -183,6 +275,72 @@ pub async fn cmd_init(ctx: &CommandContext, mock_token: Option) -> Resul Ok((output, session)) } +/// Email-link bootstrap delegates to `init_flow::init_via_email_link`. +async fn init_via_email_link( + ctx: &CommandContext, + email: &str, + broker_url: &str, + signer_url: &str, + chain_id: u64, + poll_timeout_seconds: u64, +) -> Result<(String, Session)> { + eprintln!("Magic link sent to {email}. 
Click the link in your inbox; the CLI is polling…"); + let result = init_flow::init_via_email_link( + broker_url, + signer_url, + email, + chain_id, + std::time::Duration::from_secs(poll_timeout_seconds), + ) + .await + .map_err(|e| anyhow!("{}", e))?; + + ctx.session_store() + .save(&result.session, &ctx.session_id) + .context("save EVM session to keychain")?; + let msg = format!( + "Initialized via email-link.\n identity omni: {}\n derived wallet: {}\n evm omni: {}", + result.identity_omni, result.derived_wallet, result.evm_omni + ); + Ok((msg, result.session)) +} + +/// OAuth2/Google bootstrap delegates to `init_flow::start_oauth2_google` + +/// `complete_oauth2_google`. +async fn init_via_oauth2_google( + ctx: &CommandContext, + broker_url: &str, + signer_url: &str, + chain_id: u64, + poll_timeout_seconds: u64, +) -> Result<(String, Session)> { + let start = init_flow::start_oauth2_google(broker_url) + .await + .map_err(|e| anyhow!("{}", e))?; + eprintln!("Open this URL in your browser to authenticate with Google:"); + eprintln!(" {}", start.authorization_url); + eprintln!("(Polling for callback…)"); + + let result = init_flow::complete_oauth2_google( + broker_url, + signer_url, + &start.request_id, + chain_id, + std::time::Duration::from_secs(poll_timeout_seconds), + ) + .await + .map_err(|e| anyhow!("{}", e))?; + + ctx.session_store() + .save(&result.session, &ctx.session_id) + .context("save EVM session to keychain")?; + let msg = format!( + "Initialized via OAuth2-Google.\n identity omni: {}\n derived wallet: {}\n evm omni: {}", + result.identity_omni, result.derived_wallet, result.evm_omni + ); + Ok((msg, result.session)) +} + /// Resolve the effective wallet address for a command. 
/// - `None` → use the session's own wallet (default agent) /// - `Some("0x...")` → parse directly as wallet address @@ -924,7 +1082,7 @@ pub async fn cmd_provision( Ok(env) => env, Err(e) => { return Err(anyhow!( - "Problem: Could not fetch AWS credentials from broker.\nCause: {}.\nFix: Verify --broker-url / AGENTKEYS_BROKER_URL is reachable, your session token is current, and the broker's /readyz endpoint returns 200.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/operator-runbook.md", + "Problem: Could not fetch AWS credentials from broker.\nCause: {}.\nFix: Verify --broker-url / AGENTKEYS_BROKER_URL is reachable, your session token is current, and the broker's /readyz endpoint returns 200.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/operator-runbook-stage7.md", e )); } @@ -999,6 +1157,180 @@ pub async fn cmd_inbox_list(ctx: &CommandContext, agent: Option<&str>) -> Result Ok(addresses.iter().map(|a| a.to_string()).collect::<Vec<_>>().join("\n")) } +/// `agentkeys signer derive` — call `/dev/derive-address` on the configured +/// signer for `omni_account` and print the derived EVM address. +/// +/// The CLI treats the signer as opaque RPC: this command does not assume +/// HKDF-vs-TEE; it only enforces the wire contract from +/// `docs/spec/signer-protocol.md`. Issue #74 step 2 swaps the implementation +/// behind `signer_url`; this command keeps working unchanged. +/// +/// The saved session JWT is attached as a bearer token so the signer can +/// verify the request. If no session is saved, the command fails with a +/// clear message to run `agentkeys init` first. 
+pub async fn cmd_signer_derive( + ctx: &CommandContext, + signer_url: &str, + omni_account: &str, +) -> Result<String> { + let session = ctx + .load_session() + .context("load session (run `agentkeys init` first)")?; + let client = HttpSignerClient::new(signer_url).with_session_jwt(session.token); + let derived = client + .derive_address(omni_account) + .await + .map_err(format_signer_error)?; + if ctx.json_output { + Ok(serde_json::to_string_pretty(&json!({ + "address": derived.address, + "key_version": derived.key_version, + })) + .unwrap()) + } else { + Ok(format!( + "address={} key_version={}", + derived.address, derived.key_version + )) + } +} + +/// `agentkeys signer sign` — call `/dev/sign-message` on the configured +/// signer for `omni_account || message_utf8`, returning the canonical +/// 65-byte EIP-191 signature plus the derived address. +/// +/// The saved session JWT is attached as a bearer token so the signer can +/// verify the request. If no session is saved, the command fails with a +/// clear message to run `agentkeys init` first. +pub async fn cmd_signer_sign( + ctx: &CommandContext, + signer_url: &str, + omni_account: &str, + message: &str, +) -> Result<String> { + let session = ctx + .load_session() + .context("load session (run `agentkeys init` first)")?; + let client = HttpSignerClient::new(signer_url).with_session_jwt(session.token); + let signed = client + .sign_eip191(omni_account, message.as_bytes()) + .await + .map_err(format_signer_error)?; + if ctx.json_output { + Ok(serde_json::to_string_pretty(&json!({ + "signature": signed.signature, + "address": signed.address, + "key_version": signed.key_version, + })) + .unwrap()) + } else { + Ok(format!( + "signature={} address={} key_version={}", + signed.signature, signed.address, signed.key_version + )) + } +} + +/// `agentkeys whoami` — read-only summary of the current session and the +/// signer-derived wallet address (if a signer URL is supplied and the +/// session carries an `omni_account` claim). 
+/// +/// In v0 the legacy session does not carry an omni_account, so this command +/// requires `--omni-account` explicitly when `--signer-url` is set. After +/// the daemon flow lands fully (issue #74 step 1 completion), the omni +/// will come from the session itself. +pub async fn cmd_whoami( + ctx: &CommandContext, + signer_url: Option<&str>, + omni_account: Option<&str>, +) -> Result<String> { + let session = ctx + .load_session() + .context("load session (run `agentkeys init` first)")?; + + let mut out = serde_json::Map::new(); + out.insert("session_wallet".into(), json!(session.wallet.0)); + if let Some(scope) = &session.scope { + out.insert( + "scope_services".into(), + json!(scope + .services + .iter() + .map(|s| s.0.clone()) + .collect::<Vec<_>>()), + ); + out.insert("scope_read_only".into(), json!(scope.read_only)); + } + + if let Some(url) = signer_url { + let omni = omni_account.ok_or_else(|| { + anyhow!("--signer-url requires --omni-account (will be derived from session in a later issue-74 step)") + })?; + let client = HttpSignerClient::new(url).with_session_jwt(session.token.clone()); + let derived = client + .derive_address(omni) + .await + .map_err(format_signer_error)?; + out.insert("omni_account".into(), json!(omni)); + out.insert("derived_address".into(), json!(derived.address)); + out.insert("key_version".into(), json!(derived.key_version)); + } + + if ctx.json_output { + Ok(serde_json::to_string_pretty(&serde_json::Value::Object(out)).unwrap()) + } else { + let mut lines = Vec::new(); + lines.push(format!("session_wallet: {}", session.wallet.0)); + if let Some(scope) = &session.scope { + let svc: Vec<&str> = scope.services.iter().map(|s| s.0.as_str()).collect(); + lines.push(format!("scope: [{}] read_only={}", svc.join(", "), scope.read_only)); + } + if let Some(url) = signer_url { + lines.push(format!("signer_url: {}", url)); + if let Some(o) = omni_account { + lines.push(format!("omni_account: {}", o)); + } + if let Some(v) = out.get("derived_address") { 
+ lines.push(format!("derived_address: {}", v.as_str().unwrap_or("?"))); + } + if let Some(v) = out.get("key_version") { + lines.push(format!("key_version: {}", v)); + } + } + Ok(lines.join("\n")) + } +} + +fn format_signer_error(e: SignerClientError) -> anyhow::Error { + match e { + SignerClientError::SignerDisabled(m) => anyhow!( + "Error: SIGNER_DISABLED\n {}\n\n Fix: set DEV_KEY_SERVICE_MASTER_SECRET on the mock-server (or attest the TEE worker once issue #74 step 2 ships).", + m + ), + SignerClientError::Unauthorized(m) => anyhow!( + "Error: SIGNER_UNAUTHORIZED\n {}\n\n Fix: run `agentkeys init` to obtain a fresh session JWT.", + m + ), + SignerClientError::InvalidOmniAccount(m) => { + anyhow!("Error: INVALID_OMNI_ACCOUNT\n {}", m) + } + SignerClientError::InvalidMessageHex(m) => { + anyhow!("Error: INVALID_MESSAGE_HEX\n {}", m) + } + SignerClientError::Internal(m) => anyhow!("Error: SIGNER_INTERNAL\n {}", m), + SignerClientError::Transport(m) => anyhow!( + "Error: SIGNER_UNREACHABLE\n {}\n\n Fix: confirm --signer-url is reachable.", + m + ), + SignerClientError::Unexpected { status, error, message } => anyhow!( + "Error: SIGNER_UNEXPECTED\n status={} error={:?} message={:?}", + status, + error, + message + ), + } +} + pub fn cmd_feedback() -> String { let url = "https://github.com/agentkeys/agentkeys/discussions"; let opened = std::process::Command::new("open").arg(url).status().is_ok() diff --git a/crates/agentkeys-cli/src/main.rs b/crates/agentkeys-cli/src/main.rs index f1fc0c7..8d54ecf 100644 --- a/crates/agentkeys-cli/src/main.rs +++ b/crates/agentkeys-cli/src/main.rs @@ -1,7 +1,7 @@ use agentkeys_cli::{ cmd_approve, cmd_feedback, cmd_inbox_list, cmd_inbox_provision, cmd_init, cmd_link, - cmd_provision, cmd_read, cmd_recover, cmd_revoke, cmd_run, cmd_scope, cmd_store, cmd_teardown, - cmd_usage, CommandContext, + cmd_provision, cmd_read, cmd_recover, cmd_revoke, cmd_run, cmd_scope, cmd_signer_derive, + cmd_signer_sign, cmd_store, cmd_teardown, cmd_usage, 
cmd_whoami, CommandContext, InitMode, }; @@ -12,7 +12,7 @@ use clap::{Parser, Subcommand}; name = "agentkeys", version, about = "Credential management for AI agents", - long_about = "agentkeys — secure credential storage and injection for AI agents.\n\nThe --agent flag on store/read/run accepts a 0x... wallet, a linked alias, or a linked email. Omit it to default to the current session wallet.\n\nExamples:\n agentkeys init --mock-token mytoken\n agentkeys store openrouter sk-or-... (session wallet)\n agentkeys store --agent 0xAGENT openrouter sk-or-... (specific wallet)\n agentkeys read --agent my-bot openrouter (linked alias)\n agentkeys run -- python my_agent.py (session wallet)\n agentkeys run --agent 0xAGENT -- python my_agent.py (specific wallet)\n agentkeys usage 0xAGENT\n agentkeys revoke 0xAGENT\n agentkeys teardown 0xAGENT" + long_about = "agentkeys — secure credential storage and injection for AI agents.\n\nThe --agent flag on store/read/run accepts a 0x... wallet, a linked alias, or a linked email. Omit it to default to the current session wallet.\n\nExamples:\n agentkeys init --email alice@example.com --broker-url https://broker.example --signer-url https://signer.example\n agentkeys init --oauth2-google --broker-url https://broker.example --signer-url https://signer.example\n agentkeys store openrouter sk-or-... (session wallet)\n agentkeys store --agent 0xAGENT openrouter sk-or-... (specific wallet)\n agentkeys read --agent my-bot openrouter (linked alias)\n agentkeys run -- python my_agent.py (session wallet)\n agentkeys usage 0xAGENT\n agentkeys revoke 0xAGENT\n agentkeys teardown 0xAGENT" )] struct Cli { #[arg(long, default_value = "http://localhost:8090", help = "Backend URL")] @@ -31,6 +31,14 @@ struct Cli { )] broker_url: Option, + #[arg( + long, + env = "AGENTKEYS_SESSION_ID", + default_value = "master", + help = "Session namespace under ~/.agentkeys//session.json. Defaults to \"master\". 
Use distinct ids to hold multiple concurrent sessions (e.g. --session-id=alice and --session-id=bob) without overwriting each other." + )] + session_id: String, + #[command(subcommand)] command: Commands, } @@ -38,12 +46,36 @@ struct Cli { #[derive(Subcommand)] enum Commands { #[command( - about = "Initialize a new session", - long_about = "Authenticate with the backend and store the session token in the OS keychain.\n\nExamples:\n agentkeys init\n agentkeys init --mock-token my-test-token" + about = "Initialize a new session via email-link or OAuth2/Google", + long_about = "Authenticate the operator's identity, derive the managed EVM wallet via the dev_key_service signer, link it to the broker, and save the resulting EVM session JWT in the OS keychain. The legacy --mock-token path was hard-cut in issue #74 step 1; the only production paths are --email and --oauth2-google.\n\nExamples:\n agentkeys init --email alice@example.com --broker-url https://broker.example --signer-url https://signer.example\n agentkeys init --oauth2-google --broker-url https://broker.example --signer-url https://signer.example" )] Init { - #[arg(long, help = "Use a mock authentication token (for testing)")] - mock_token: Option<String>, + /// Email address for the email-link flow. Mutually exclusive with --oauth2-google. + #[arg(long, conflicts_with = "oauth2_google")] + email: Option<String>, + + /// Initiate the OAuth2/Google flow. Mutually exclusive with --email. + #[arg(long = "oauth2-google", conflicts_with = "email")] + oauth2_google: bool, + + /// Broker URL (the server hosting `/v1/auth/{email,oauth2,wallet}/{request,start,verify,status}`). + #[arg(long, env = "AGENTKEYS_BROKER_URL")] + broker_url: Option<String>, + + /// Signer URL (the server hosting `/dev/derive-address` + `/dev/sign-message` + /// per docs/spec/signer-protocol.md). Defaults to --backend if unset. + #[arg(long, env = "AGENTKEYS_SIGNER_URL")] + signer_url: Option<String>, + + /// SIWE chain_id. 
Defaults to 84532 (Base Sepolia) which the + /// broker's wallet_sig plug-in already accepts in tests. + #[arg(long, default_value_t = 84532)] + chain_id: u64, + + /// How long to wait for the operator to complete the email-link + /// click or OAuth2 callback before failing the init. + #[arg(long, default_value_t = 300)] + poll_timeout_seconds: u64, }, #[command( @@ -189,6 +221,53 @@ enum Commands { #[command(subcommand)] action: InboxAction, }, + + #[command( + about = "Show the active session, scope, and (optionally) signer-derived wallet", + long_about = "Read-only summary of the current session.\n\nWith --signer-url and --omni-account, also calls the signer to print the derived EVM address. Useful for verifying the signer wire is reachable and the omni→address mapping is what you expect.\n\nExamples:\n agentkeys whoami\n agentkeys whoami --signer-url http://localhost:8090 --omni-account <64hex>" + )] + Whoami { + #[arg(long, env = "AGENTKEYS_SIGNER_URL", help = "URL of the signer service (dev_key_service or TEE worker)")] + signer_url: Option<String>, + #[arg(long, help = "OmniAccount (64-hex-char SHA256 digest) to resolve via the signer")] + omni_account: Option<String>, + }, + + #[command( + about = "Talk to the signer edge (dev_key_service or TEE worker)", + long_about = "Subcommands that exercise the wire contract from docs/spec/signer-protocol.md. 
The CLI treats the signer as opaque RPC; the same commands work against the HKDF dev backend and the future TEE backend.\n\nExamples:\n agentkeys signer derive --signer-url http://localhost:8090 --omni-account <64hex>\n agentkeys signer sign --signer-url http://localhost:8090 --omni-account <64hex> --message 'siwe-msg'" + )] + Signer { + #[command(subcommand)] + action: SignerAction, + }, +} + +#[derive(Subcommand)] +enum SignerAction { + #[command( + about = "Derive the EVM address for an OmniAccount via the signer", + long_about = "Calls /dev/derive-address on the configured signer.\n\nExamples:\n agentkeys signer derive --signer-url http://localhost:8090 --omni-account <64hex>" + )] + Derive { + #[arg(long, env = "AGENTKEYS_SIGNER_URL", help = "URL of the signer service")] + signer_url: String, + #[arg(long, help = "OmniAccount (64-hex-char SHA256 digest)")] + omni_account: String, + }, + + #[command( + about = "Sign a UTF-8 message under the keypair derived from an OmniAccount", + long_about = "Calls /dev/sign-message on the configured signer. 
The message is sent as UTF-8 bytes — the signer wraps them in EIP-191.\n\nExamples:\n agentkeys signer sign --signer-url http://localhost:8090 --omni-account <64hex> --message 'hello'" + )] + Sign { + #[arg(long, env = "AGENTKEYS_SIGNER_URL", help = "URL of the signer service")] + signer_url: String, + #[arg(long, help = "OmniAccount (64-hex-char SHA256 digest)")] + omni_account: String, + #[arg(long, help = "Message to sign (sent as UTF-8 bytes)")] + message: String, + }, } #[derive(Subcommand)] @@ -216,11 +295,55 @@ enum InboxAction { async fn main() { let cli = Cli::parse(); let ctx = CommandContext::new(&cli.backend, cli.verbose, cli.json) - .with_broker_url(cli.broker_url.clone()); + .with_broker_url(cli.broker_url.clone()) + .with_session_id(cli.session_id.clone()); let result: anyhow::Result<String> = match &cli.command { - Commands::Init { mock_token } => { - cmd_init(&ctx, mock_token.clone()).await.map(|(msg, _session)| msg) + Commands::Init { + email, + oauth2_google, + broker_url, + signer_url, + chain_id, + poll_timeout_seconds, + } => { + let broker_opt = broker_url.clone().or_else(|| ctx.broker_url.clone()); + let signer = signer_url.clone().unwrap_or_else(|| ctx.backend_url.clone()); + let mode_result: anyhow::Result<InitMode> = match (email, *oauth2_google) { + (Some(addr), false) => broker_opt + .ok_or_else(|| { + anyhow::anyhow!( + "agentkeys init: missing --broker-url (or AGENTKEYS_BROKER_URL)" + ) + }) + .map(|broker| InitMode::Email { + email: addr.clone(), + broker_url: broker, + signer_url: signer.clone(), + chain_id: *chain_id, + poll_timeout_seconds: *poll_timeout_seconds, + }), + (None, true) => broker_opt + .ok_or_else(|| { + anyhow::anyhow!( + "agentkeys init: missing --broker-url (or AGENTKEYS_BROKER_URL)" + ) + }) + .map(|broker| InitMode::Oauth2Google { + broker_url: broker, + signer_url: signer.clone(), + chain_id: *chain_id, + poll_timeout_seconds: *poll_timeout_seconds, + }), + (Some(_), true) => unreachable!("clap conflicts_with prevents both"), + 
(None, false) => Err(anyhow::anyhow!( + "agentkeys init: pass --email or --oauth2-google (the legacy --mock-token flag was hard-cut in issue #74 step 1)" + )), + }; + match mode_result { + Ok(mode) => cmd_init(&ctx, mode).await.map(|(msg, _session)| msg), + Err(e) => Err(e), + } } Commands::Store { agent, service, key } => cmd_store(&ctx, agent.as_deref(), service, key).await, Commands::Read { agent, service } => cmd_read(&ctx, agent.as_deref(), service).await, @@ -255,6 +378,17 @@ async fn main() { cmd_inbox_list(&ctx, agent.as_deref()).await } }, + Commands::Whoami { signer_url, omni_account } => { + cmd_whoami(&ctx, signer_url.as_deref(), omni_account.as_deref()).await + } + Commands::Signer { action } => match action { + SignerAction::Derive { signer_url, omni_account } => { + cmd_signer_derive(&ctx, signer_url, omni_account).await + } + SignerAction::Sign { signer_url, omni_account, message } => { + cmd_signer_sign(&ctx, signer_url, omni_account, message).await + } + }, }; match result { diff --git a/crates/agentkeys-cli/tests/cli_tests.rs b/crates/agentkeys-cli/tests/cli_tests.rs index 9f12d57..e6a712e 100644 --- a/crates/agentkeys-cli/tests/cli_tests.rs +++ b/crates/agentkeys-cli/tests/cli_tests.rs @@ -2,7 +2,7 @@ use std::sync::Arc; use agentkeys_cli::{ cmd_inbox_list, cmd_inbox_provision, cmd_init, cmd_link, cmd_provision, cmd_read, cmd_revoke, - cmd_run, cmd_scope, cmd_store, cmd_teardown, cmd_usage, CommandContext, + cmd_run, cmd_scope, cmd_store, cmd_teardown, cmd_usage, CommandContext, InitMode, }; use agentkeys_core::backend::CredentialBackend; use agentkeys_core::session_store::SessionStore; @@ -37,7 +37,7 @@ async fn init_session_with_store( let ctx = CommandContext::new("unused", false, false) .with_backend(backend.clone() as Arc) .with_session_store(store.clone()); - let (output, session) = cmd_init(&ctx, Some("test-token-unique".to_string())) + let (output, session) = cmd_init(&ctx, InitMode::ImportLegacyMock("test-token-unique".to_string())) 
.await .unwrap(); let wallet = output.split("Wallet: ").nth(1).unwrap().trim().to_string(); @@ -161,7 +161,7 @@ async fn cmd_revoke_self_clears_local_session() { .with_backend(backend.clone() as Arc) .with_session_store(store.clone()); - let (_, session) = cmd_init(&ctx_init, Some("selfrevoke-token".to_string())) + let (_, session) = cmd_init(&ctx_init, InitMode::ImportLegacyMock("selfrevoke-token".to_string())) .await .unwrap(); @@ -227,7 +227,7 @@ async fn cmd_revoke_with_own_wallet_clears_local_session() { let ctx_init = CommandContext::new("unused", false, false) .with_backend(backend.clone() as Arc) .with_session_store(store.clone()); - let (_, session) = cmd_init(&ctx_init, Some("self-by-wallet-token".to_string())) + let (_, session) = cmd_init(&ctx_init, InitMode::ImportLegacyMock("self-by-wallet-token".to_string())) .await .unwrap(); @@ -270,7 +270,7 @@ async fn cmd_revoke_with_other_wallet_keeps_local_session() { let ctx_init = CommandContext::new("unused", false, false) .with_backend(backend.clone() as Arc) .with_session_store(store.clone()); - let (_, parent_session) = cmd_init(&ctx_init, Some("revoke-other-token".to_string())) + let (_, parent_session) = cmd_init(&ctx_init, InitMode::ImportLegacyMock("revoke-other-token".to_string())) .await .unwrap(); @@ -379,7 +379,7 @@ async fn cli_link_alias() { let (store, _tmp) = test_store(); let bare_ctx = CommandContext::new(&base_url, false, false) .with_session_store(store.clone()); - let (output, session) = cmd_init(&bare_ctx, Some("test-token-unique".to_string())) + let (output, session) = cmd_init(&bare_ctx, InitMode::ImportLegacyMock("test-token-unique".to_string())) .await .unwrap(); let wallet = output.split("Wallet: ").nth(1).unwrap().trim().to_string(); @@ -482,7 +482,7 @@ async fn cli_error_format_unreachable() { // cmd_init will fail at HTTP level because the URL is unreachable. 
let context = CommandContext::new("http://127.0.0.1:19999", false, false) .with_session_store(store); - let result = cmd_init(&context, Some("test".to_string())).await; + let result = cmd_init(&context, InitMode::ImportLegacyMock("test".to_string())).await; assert!(result.is_err()); let err = result.unwrap_err().to_string(); assert!( @@ -710,7 +710,7 @@ async fn cmd_store_resolves_alias() { let (store, _tmp) = test_store(); let bare_ctx = CommandContext::new(&base_url, false, false) .with_session_store(store.clone()); - let (output, session) = cmd_init(&bare_ctx, Some("test-token-alias".to_string())).await.unwrap(); + let (output, session) = cmd_init(&bare_ctx, InitMode::ImportLegacyMock("test-token-alias".to_string())).await.unwrap(); let wallet = output.split("Wallet: ").nth(1).unwrap().trim().to_string(); let context = CommandContext::new(&base_url, false, false) @@ -748,7 +748,7 @@ async fn cmd_read_unknown_identity_errors_cleanly() { let (store, _tmp) = test_store(); let bare_ctx = CommandContext::new(&base_url, false, false) .with_session_store(store.clone()); - let (_output, session) = cmd_init(&bare_ctx, Some("test-token-unknown".to_string())).await.unwrap(); + let (_output, session) = cmd_init(&bare_ctx, InitMode::ImportLegacyMock("test-token-unknown".to_string())).await.unwrap(); let context = CommandContext::new(&base_url, false, false) .with_session(session) @@ -788,7 +788,7 @@ async fn start_scope_test_server() -> (String, String, String, SessionStore, tem let (store, tmp) = test_store(); let bare_ctx = CommandContext::new(&base_url, false, false) .with_session_store(store.clone()); - let (_output, _session) = cmd_init(&bare_ctx, Some("scope-test-unique".to_string())) + let (_output, _session) = cmd_init(&bare_ctx, InitMode::ImportLegacyMock("scope-test-unique".to_string())) .await .unwrap(); diff --git a/crates/agentkeys-core/Cargo.toml b/crates/agentkeys-core/Cargo.toml index 21fc7b2..f3760c1 100644 --- a/crates/agentkeys-core/Cargo.toml +++ 
b/crates/agentkeys-core/Cargo.toml @@ -21,3 +21,10 @@ anyhow = { workspace = true } [dev-dependencies] tempfile = "3" +agentkeys-mock-server = { path = "../agentkeys-mock-server" } +axum = { version = "0.7", features = ["json"] } +k256 = { version = "0.13", features = ["ecdsa", "sha2"] } +sha3 = "0.10" +rusqlite = { version = "0.31", features = ["bundled"] } +rand_core = { version = "0.6", features = ["std"] } +getrandom = "0.2" diff --git a/crates/agentkeys-core/src/init_flow.rs b/crates/agentkeys-core/src/init_flow.rs new file mode 100644 index 0000000..a65ab72 --- /dev/null +++ b/crates/agentkeys-core/src/init_flow.rs @@ -0,0 +1,437 @@ +//! First-time bootstrap helpers for issue #74 step 1. +//! +//! Both `agentkeys-cli`'s `cmd_init` and `agentkeys-daemon`'s startup +//! routine drive the same chain on a cold start: +//! +//! 1. Authenticate the operator's identity (email-link or OAuth2/Google). +//! 2. From the resulting identity-omni session JWT, ask the dev_key_service +//! to derive the managed EVM wallet. +//! 3. Link that wallet at the broker (`POST /v1/wallet/link`) so any linked +//! identity can recover the same wallet later. +//! 4. Run a SIWE round-trip with the dev_key_service signing on behalf of +//! the identity-omni; receive an EVM-omni session JWT. +//! 5. Hand the EVM-omni session JWT back to the caller so it can persist +//! in the keychain (CLI) or seed the MCP server (daemon). +//! +//! The helpers below have no I/O side effects beyond HTTP calls — they +//! never touch `session_store`. Persistence is the caller's choice. + +use std::time::{Duration, Instant}; + +use agentkeys_types::{Session, WalletAddress}; +use serde_json::json; +use thiserror::Error; + +use crate::signer_client::{HttpSignerClient, SignerClient, SignerClientError}; + +/// Result of a successful first-time init flow. +#[derive(Debug, Clone)] +pub struct InitResult { + /// EVM-omni session JWT — what the daemon uses going forward. 
+ pub session: Session, + /// Identity omni computed from the verified identity (email or OAuth2). + /// Daemon callers stash this so subsequent SIWE round-trips know which + /// omni to drive the signer with. + pub identity_omni: String, + /// EVM omni from the broker's `/v1/auth/wallet/verify` response. + pub evm_omni: String, + /// Derived wallet address (lowercase hex, 0x-prefixed). + pub derived_wallet: String, + /// `("email", "alice@…")` or `("oauth2_google", "")`. + pub identity_type: String, + pub identity_value: String, +} + +#[derive(Debug, Error)] +pub enum InitFlowError { + #[error("transport: {0}")] + Transport(String), + #[error("broker rejected {endpoint}: status={status} body={body}")] + BrokerRejected { + endpoint: String, + status: u16, + body: String, + }, + #[error("auth flow timed out after {0}s")] + Timeout(u64), + #[error("auth flow ended without success: status={0}")] + AuthFailed(String), + #[error("signer error: {0}")] + Signer(#[from] SignerClientError), + #[error("address mismatch: derive returned {derived}, sign returned {signed}")] + AddressMismatch { derived: String, signed: String }, + #[error("missing field {field} in {endpoint} response")] + MissingField { + endpoint: &'static str, + field: &'static str, + }, +} + +type FlowResult<T> = Result<T, InitFlowError>; + +/// Email-link bootstrap. +pub async fn init_via_email_link( + broker_url: &str, + signer_url: &str, + email: &str, + chain_id: u64, + poll_timeout: Duration, +) -> FlowResult<InitResult> { + let http = reqwest::Client::new(); + let broker = broker_url.trim_end_matches('/'); + + // 1. Request a magic link. + let req = post_json( + &http, + &format!("{broker}/v1/auth/email/request"), + json!({ "email": email }), + ) + .await?; + let request_id = string_field(&req, "/v1/auth/email/request", "request_id")?; + + // 2. Poll until verified. + let (identity_session_jwt, identity_omni) = poll_auth_status( + &http, + broker, + "email", + &request_id, + poll_timeout, + ) + .await?; + + // 3-5. 
Derive + link + SIWE round-trip. + let result = finish_init( + &http, + broker, + signer_url, + &identity_session_jwt, + &identity_omni, + chain_id, + "email", + email, + ) + .await?; + Ok(result) +} + +/// OAuth2/Google bootstrap. Returns `(authorization_url, request_id)` after +/// `/v1/auth/oauth2/start`; the caller prints the URL and waits for the +/// operator. Then call `complete_oauth2_google(...)` with the request_id. +/// +/// Two-step shape (vs single-call `init_via_email_link`) so the caller can +/// surface the URL to the operator and handle interrupt cleanly between +/// the start and poll. +pub async fn start_oauth2_google(broker_url: &str) -> FlowResult<Oauth2StartResult> { + let http = reqwest::Client::new(); + let broker = broker_url.trim_end_matches('/'); + let body = post_json( + &http, + &format!("{broker}/v1/auth/oauth2/start"), + json!({ "provider": "google" }), + ) + .await?; + let request_id = string_field(&body, "/v1/auth/oauth2/start", "request_id")?; + let authorization_url = string_field(&body, "/v1/auth/oauth2/start", "authorization_url")?; + Ok(Oauth2StartResult { + request_id, + authorization_url, + }) +} + +#[derive(Debug, Clone)] +pub struct Oauth2StartResult { + pub request_id: String, + pub authorization_url: String, +} + +/// Complete an OAuth2/Google flow that was kicked off via `start_oauth2_google`. +pub async fn complete_oauth2_google( + broker_url: &str, + signer_url: &str, + request_id: &str, + chain_id: u64, + poll_timeout: Duration, +) -> FlowResult<InitResult> { + let http = reqwest::Client::new(); + let broker = broker_url.trim_end_matches('/'); + let (identity_session_jwt, identity_omni) = + poll_auth_status(&http, broker, "oauth2", request_id, poll_timeout).await?; + + // For OAuth2/Google the broker's status response includes + // identity_value=. We pull it from the same call. 
+ let identity_value = identity_value_from_status(&http, broker, "oauth2", request_id).await?; + + finish_init( + &http, + broker, + signer_url, + &identity_session_jwt, + &identity_omni, + chain_id, + "oauth2_google", + &identity_value, + ) + .await +} + +#[allow(clippy::too_many_arguments)] +async fn finish_init( + http: &reqwest::Client, + broker: &str, + signer_url: &str, + identity_session_jwt: &str, + identity_omni: &str, + chain_id: u64, + identity_type: &str, + identity_value: &str, +) -> FlowResult<InitResult> { + let derived = derive_via_signer(signer_url, identity_omni, identity_session_jwt).await?; + link_wallet_at_broker(http, broker, identity_session_jwt, "evm", &derived).await?; + let (evm_session_jwt, evm_omni, wallet_addr) = siwe_round_trip( + http, + broker, + signer_url, + identity_omni, + &derived, + chain_id, + identity_session_jwt, + ) + .await?; + let session = build_session_from_jwt(&evm_session_jwt, &wallet_addr); + Ok(InitResult { + session, + identity_omni: identity_omni.to_string(), + evm_omni, + derived_wallet: derived, + identity_type: identity_type.to_string(), + identity_value: identity_value.to_string(), + }) +} + +async fn poll_auth_status( + http: &reqwest::Client, + broker: &str, + provider: &str, + request_id: &str, + poll_timeout: Duration, +) -> FlowResult<(String, String)> { + let url = format!("{broker}/v1/auth/{provider}/status/{request_id}"); + let deadline = Instant::now() + poll_timeout; + loop { + let resp = http + .get(&url) + .send() + .await + .map_err(|e| InitFlowError::Transport(format!("GET {url}: {e}")))?; + let body: serde_json::Value = resp + .json() + .await + .map_err(|e| InitFlowError::Transport(format!("parse JSON: {e}")))?; + match body["status"].as_str() { + Some("verified") => { + let session_jwt = + string_field(&body, "/v1/auth/{provider}/status", "session_jwt")?; + let omni = + string_field(&body, "/v1/auth/{provider}/status", "omni_account")?; + return Ok((session_jwt, omni)); + } + Some("expired") | 
Some("rejected") => { + return Err(InitFlowError::AuthFailed( + body["status"].as_str().unwrap_or("?").to_string(), + )); + } + _ => {} + } + if Instant::now() >= deadline { + return Err(InitFlowError::Timeout(poll_timeout.as_secs())); + } + tokio::time::sleep(Duration::from_secs(2)).await; + } +} + +async fn identity_value_from_status( + http: &reqwest::Client, + broker: &str, + provider: &str, + request_id: &str, +) -> FlowResult<String> { + let url = format!("{broker}/v1/auth/{provider}/status/{request_id}"); + let body: serde_json::Value = http + .get(&url) + .send() + .await + .map_err(|e| InitFlowError::Transport(format!("GET {url}: {e}")))? + .json() + .await + .map_err(|e| InitFlowError::Transport(format!("parse JSON: {e}")))?; + string_field(&body, "/v1/auth/{provider}/status", "identity_value") +} + +async fn derive_via_signer( + signer_url: &str, + omni_account: &str, + session_jwt: &str, +) -> FlowResult<String> { + // Signer (post-issue-#74 step 1b) requires the broker's session JWT + // as a Bearer token on every /dev/* request. Standalone commands + // (cli::cmd_signer_derive) chain .with_session_jwt() from the + // keychain; the in-flow init_via_email_link path also has the + // identity-session JWT in hand (just minted by the broker after + // the magic-link click), so chain it here too. 
+ let client = HttpSignerClient::new(signer_url).with_session_jwt(session_jwt.to_string()); + let derived = client.derive_address(omni_account).await?; + Ok(derived.address) +} + +async fn link_wallet_at_broker( + http: &reqwest::Client, + broker: &str, + session_jwt: &str, + identity_type: &str, + identity_value: &str, +) -> FlowResult<()> { + let url = format!("{broker}/v1/wallet/link"); + let resp = http + .post(&url) + .header("authorization", format!("Bearer {session_jwt}")) + .json(&json!({ + "identity_type": identity_type, + "identity_value": identity_value, + })) + .send() + .await + .map_err(|e| InitFlowError::Transport(format!("POST {url}: {e}")))?; + if !resp.status().is_success() { + let status = resp.status().as_u16(); + let body = resp.text().await.unwrap_or_default(); + return Err(InitFlowError::BrokerRejected { + endpoint: "/v1/wallet/link".into(), + status, + body, + }); + } + Ok(()) +} + +async fn siwe_round_trip( + http: &reqwest::Client, + broker: &str, + signer_url: &str, + identity_omni: &str, + derived_addr: &str, + chain_id: u64, + session_jwt: &str, +) -> FlowResult<(String, String, String)> { + let start = post_json( + http, + &format!("{broker}/v1/auth/wallet/start"), + json!({ "address": derived_addr, "chain_id": chain_id }), + ) + .await?; + let request_id = string_field(&start, "/v1/auth/wallet/start", "request_id")?; + let siwe_message = string_field(&start, "/v1/auth/wallet/start", "siwe_message")?; + + // Signer requires the broker's session JWT (same one threaded + // through derive_via_signer above) for the SIWE-message sign call. 
+ let signer = HttpSignerClient::new(signer_url).with_session_jwt(session_jwt.to_string()); + let signed = signer + .sign_eip191(identity_omni, siwe_message.as_bytes()) + .await?; + if signed.address.to_lowercase() != derived_addr.to_lowercase() { + return Err(InitFlowError::AddressMismatch { + derived: derived_addr.to_string(), + signed: signed.address, + }); + } + + let verify = post_json( + http, + &format!("{broker}/v1/auth/wallet/verify"), + json!({ "request_id": request_id, "signature": signed.signature }), + ) + .await?; + let evm_session_jwt = string_field(&verify, "/v1/auth/wallet/verify", "session_jwt")?; + let evm_omni = string_field(&verify, "/v1/auth/wallet/verify", "omni_account")?; + let wallet_addr = verify["wallet_address"] + .as_str() + .unwrap_or(derived_addr) + .to_string(); + Ok((evm_session_jwt, evm_omni, wallet_addr)) +} + +async fn post_json( + http: &reqwest::Client, + url: &str, + body: serde_json::Value, +) -> FlowResult<serde_json::Value> { + let resp = http + .post(url) + .json(&body) + .send() + .await + .map_err(|e| InitFlowError::Transport(format!("POST {url}: {e}")))?; + let status = resp.status(); + if !status.is_success() { + let body = resp.text().await.unwrap_or_default(); + return Err(InitFlowError::BrokerRejected { + endpoint: url.to_string(), + status: status.as_u16(), + body, + }); + } + resp.json::<serde_json::Value>() + .await + .map_err(|e| InitFlowError::Transport(format!("parse JSON from {url}: {e}"))) +} + +fn string_field( + body: &serde_json::Value, + endpoint: &'static str, + field: &'static str, +) -> FlowResult<String> { + body[field] + .as_str() + .map(|s| s.to_string()) + .ok_or(InitFlowError::MissingField { endpoint, field }) +} + +fn build_session_from_jwt(session_jwt: &str, wallet_addr: &str) -> Session { + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .map(|d| d.as_secs()) + .unwrap_or(0); + Session { + token: session_jwt.to_string(), + wallet: WalletAddress(wallet_addr.to_string()), + scope: None, + created_at: now, 
+ ttl_seconds: 18_000, + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn build_session_from_jwt_populates_required_fields() { + let s = build_session_from_jwt("eyJ.fake.jwt", "0xdeadbeef"); + assert_eq!(s.token, "eyJ.fake.jwt"); + assert_eq!(s.wallet.0, "0xdeadbeef"); + assert!(s.scope.is_none()); + assert_eq!(s.ttl_seconds, 18_000); + assert!(s.created_at > 0); + } + + #[test] + fn missing_field_error_carries_endpoint_and_field() { + let body = serde_json::json!({}); + match string_field(&body, "/x", "y") { + Err(InitFlowError::MissingField { endpoint, field }) => { + assert_eq!(endpoint, "/x"); + assert_eq!(field, "y"); + } + other => panic!("unexpected: {other:?}"), + } + } +} diff --git a/crates/agentkeys-core/src/lib.rs b/crates/agentkeys-core/src/lib.rs index 57b26d7..f0df0a6 100644 --- a/crates/agentkeys-core/src/lib.rs +++ b/crates/agentkeys-core/src/lib.rs @@ -1,6 +1,8 @@ pub mod auth_request; pub mod backend; +pub mod init_flow; pub mod mock_client; pub mod otp; pub mod payment; pub mod session_store; +pub mod signer_client; diff --git a/crates/agentkeys-core/src/signer_client.rs b/crates/agentkeys-core/src/signer_client.rs new file mode 100644 index 0000000..7a111c4 --- /dev/null +++ b/crates/agentkeys-core/src/signer_client.rs @@ -0,0 +1,285 @@ +//! Daemon-side RPC client for the signer edge. +//! +//! The daemon never holds private key material. Instead, it asks the signer +//! to (a) reveal the EVM address derived from a given `omni_account` and +//! (b) sign EIP-191 messages under that derived key. The wire contract is +//! pinned by `docs/spec/signer-protocol.md`; the v0 implementation in +//! `agentkeys-mock-server::dev_key_service` is HKDF-backed; issue #74 step 2 +//! replaces it with a TEE worker behind the same wire shape. +//! +//! Daemon code MUST treat the signer as an opaque RPC dependency (no +//! assumptions about derivation, no caching of signing keys). The +//! 
`SignerClient` trait is the swap-point: tests inject a TEE-stub fixture, +//! prod code injects the HTTP client. + +use async_trait::async_trait; +use thiserror::Error; + +/// Wire-protocol error codes from `signer-protocol.md`. Daemon code matches +/// on these (and the transport variants) to drive retry / surface logic. +#[derive(Debug, Error)] +pub enum SignerClientError { + /// 400 `invalid_omni_account` — bug in caller; not retriable. + #[error("invalid_omni_account: {0}")] + InvalidOmniAccount(String), + + /// 400 `invalid_message_hex` — bug in caller; not retriable. + #[error("invalid_message_hex: {0}")] + InvalidMessageHex(String), + + /// 503 `signer_disabled` — operator must set + /// `DEV_KEY_SERVICE_MASTER_SECRET` (dev) or attest the TEE (prod). + #[error("signer_disabled: {0}")] + SignerDisabled(String), + + /// 401 `unauthorized` — bearer JWT missing, expired, or omni_account mismatch. + /// Caller should re-init to obtain a fresh session JWT. + #[error("unauthorized: {0}")] + Unauthorized(String), + + /// 500 `internal` from the signer — bug; surface to operator. + #[error("signer_internal: {0}")] + Internal(String), + + /// HTTP layer failure (DNS, TCP, TLS, timeout, malformed body). + #[error("transport: {0}")] + Transport(String), + + /// Server returned a status / `error` code not covered by the contract. + #[error("unexpected_response: status={status} error={error:?} message={message:?}")] + Unexpected { + status: u16, + error: Option<String>, + message: Option<String>, + }, +} + +/// Successful response from `/dev/derive-address`. +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct DerivedAddress { + /// Lowercase 0x-prefixed 40-char hex EVM address. + pub address: String, + /// Derivation domain version. Daemon SHOULD record this alongside the + /// address; a mid-session change implies master-secret rotation. + pub key_version: u8, +} + +/// Successful response from `/dev/sign-message`. 
+#[derive(Debug, Clone, PartialEq, Eq)] +pub struct SignedMessage { + /// 0x-prefixed 130-char hex `r || s || v` with `v ∈ {0, 1}`. + pub signature: String, + /// MUST equal the address `derive_address` returned for the same + /// `omni_account`. Daemon MAY assert this invariant on every sign call. + pub address: String, + pub key_version: u8, +} + +/// The daemon's view of the signer. Two methods, both pure RPC. +#[async_trait] +pub trait SignerClient: Send + Sync { + /// Resolve `omni_account` (64 lowercase hex chars) to its derived EVM + /// address. Idempotent and side-effect-free. + async fn derive_address(&self, omni_account: &str) -> Result<DerivedAddress, SignerClientError>; + + /// EIP-191-sign `message_bytes` under the keypair derived from + /// `omni_account`. Returns the canonical 65-byte signature. + /// + /// Implementations MUST verify (or trust the wire promise that) + /// `signed.address` equals `derive_address(omni_account).address`. The + /// daemon's SIWE round-trip relies on this equality. + async fn sign_eip191( + &self, + omni_account: &str, + message_bytes: &[u8], + ) -> Result<SignedMessage, SignerClientError>; +} + +/// HTTP implementation of `SignerClient` — talks to the dev_key_service +/// (or a TEE worker) over the `/dev/*` routes documented in +/// `signer-protocol.md`. +pub struct HttpSignerClient { + base_url: String, + http: reqwest::Client, + /// When set, added as `Authorization: Bearer ` on every `/dev/*` request. + /// Required when the signer listener has JWT bearer auth enabled + /// (issue #74 step 1b: `--signer-only` mode). + session_jwt: Option<String>, +} + +impl HttpSignerClient { + /// `base_url` must NOT include a trailing slash. The client appends + /// `/dev/derive-address` and `/dev/sign-message`. 
+ pub fn new(base_url: impl Into<String>) -> Self { + Self { + base_url: base_url.into().trim_end_matches('/').to_string(), + http: reqwest::Client::new(), + session_jwt: None, + } + } + + /// Custom `reqwest::Client` injection — used by tests that need a + /// pre-configured connection pool or custom timeout. + pub fn with_http_client(base_url: impl Into<String>, http: reqwest::Client) -> Self { + Self { + base_url: base_url.into().trim_end_matches('/').to_string(), + http, + session_jwt: None, + } + } + + /// Attach a session JWT that will be sent as `Authorization: Bearer ` + /// on every `/dev/*` request. Required when the signer listener runs in + /// `--signer-only` mode (issue #74 step 1b). + pub fn with_session_jwt(mut self, jwt: String) -> Self { + self.session_jwt = Some(jwt); + self + } +} + +#[async_trait] +impl SignerClient for HttpSignerClient { + async fn derive_address(&self, omni_account: &str) -> Result<DerivedAddress, SignerClientError> { + let url = format!("{}/dev/derive-address", self.base_url); + let mut req = self + .http + .post(&url) + .json(&serde_json::json!({ "omni_account": omni_account })); + if let Some(jwt) = &self.session_jwt { + req = req.header("Authorization", format!("Bearer {jwt}")); + } + let resp = req + .send() + .await + .map_err(|e| SignerClientError::Transport(format!("POST {url}: {e}")))?; + let status = resp.status().as_u16(); + let body: serde_json::Value = resp + .json() + .await + .map_err(|e| SignerClientError::Transport(format!("parse JSON: {e}")))?; + + if status == 200 { + let address = body["address"] + .as_str() + .ok_or_else(|| SignerClientError::Unexpected { + status, + error: None, + message: Some("missing 'address'".into()), + })? 
+ .to_string(); + let key_version = body["key_version"].as_u64().unwrap_or(0) as u8; + return Ok(DerivedAddress { address, key_version }); + } + Err(map_error(status, &body)) + } + + async fn sign_eip191( + &self, + omni_account: &str, + message_bytes: &[u8], + ) -> Result<SignedMessage, SignerClientError> { + let url = format!("{}/dev/sign-message", self.base_url); + let mut req = self + .http + .post(&url) + .json(&serde_json::json!({ + "omni_account": omni_account, + "message_hex": hex::encode(message_bytes), + })); + if let Some(jwt) = &self.session_jwt { + req = req.header("Authorization", format!("Bearer {jwt}")); + } + let resp = req + .send() + .await + .map_err(|e| SignerClientError::Transport(format!("POST {url}: {e}")))?; + let status = resp.status().as_u16(); + let body: serde_json::Value = resp + .json() + .await + .map_err(|e| SignerClientError::Transport(format!("parse JSON: {e}")))?; + + if status == 200 { + let signature = body["signature"] + .as_str() + .ok_or_else(|| SignerClientError::Unexpected { + status, + error: None, + message: Some("missing 'signature'".into()), + })? + .to_string(); + let address = body["address"] + .as_str() + .ok_or_else(|| SignerClientError::Unexpected { + status, + error: None, + message: Some("missing 'address'".into()), + })? + .to_string(); + let key_version = body["key_version"].as_u64().unwrap_or(0) as u8; + return Ok(SignedMessage { signature, address, key_version }); + } + Err(map_error(status, &body)) + } +} + +/// Translate a non-2xx response body into a typed `SignerClientError`, +/// honoring the stable `error` codes from `signer-protocol.md`. 
+fn map_error(status: u16, body: &serde_json::Value) -> SignerClientError { + let code = body["error"].as_str().unwrap_or(""); + let message = body["message"].as_str().unwrap_or("").to_string(); + match (status, code) { + (400, "invalid_omni_account") => SignerClientError::InvalidOmniAccount(message), + (400, "invalid_message_hex") => SignerClientError::InvalidMessageHex(message), + (401, "unauthorized") => SignerClientError::Unauthorized(message), + (503, "signer_disabled") => SignerClientError::SignerDisabled(message), + (500, "internal") => SignerClientError::Internal(message), + _ => SignerClientError::Unexpected { + status, + error: if code.is_empty() { None } else { Some(code.to_string()) }, + message: if message.is_empty() { None } else { Some(message) }, + }, + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn map_error_recognizes_signer_disabled() { + let body = serde_json::json!({"error":"signer_disabled","message":"unset"}); + match map_error(503, &body) { + SignerClientError::SignerDisabled(m) => assert_eq!(m, "unset"), + other => panic!("unexpected: {other:?}"), + } + } + + #[test] + fn map_error_recognizes_invalid_omni_account() { + let body = serde_json::json!({"error":"invalid_omni_account","message":"too short"}); + match map_error(400, &body) { + SignerClientError::InvalidOmniAccount(m) => assert_eq!(m, "too short"), + other => panic!("unexpected: {other:?}"), + } + } + + #[test] + fn map_error_falls_back_for_unknown_codes() { + let body = serde_json::json!({"error":"weird","message":"???"}); + match map_error(418, &body) { + SignerClientError::Unexpected { status, error, message } => { + assert_eq!(status, 418); + assert_eq!(error.as_deref(), Some("weird")); + assert_eq!(message.as_deref(), Some("???")); + } + other => panic!("unexpected: {other:?}"), + } + } + + #[test] + fn http_signer_client_strips_trailing_slash() { + let c = HttpSignerClient::new("http://localhost:8090/"); + assert_eq!(c.base_url, "http://localhost:8090"); + 
} +} diff --git a/crates/agentkeys-core/tests/signer_conformance.rs b/crates/agentkeys-core/tests/signer_conformance.rs new file mode 100644 index 0000000..b8c25b5 --- /dev/null +++ b/crates/agentkeys-core/tests/signer_conformance.rs @@ -0,0 +1,329 @@ +//! TEE-stub conformance test: prove that `SignerClient` works identically +//! against the HKDF-backed `dev_key_service` and a stripped-down TEE-stub +//! that implements the same `signer-protocol.md` wire contract via an +//! in-memory ECDSA keypair (no HKDF). +//! +//! This is the load-bearing test for issue #74 step 1 → step 2 swap. If +//! someone breaks the wire shape in either direction, this test fails. +//! When the real TEE worker lands (issue #74 step 2), it joins this suite +//! verbatim; daemon and CLI code do not change. + +use agentkeys_core::signer_client::{HttpSignerClient, SignerClient, SignerClientError}; +use agentkeys_mock_server::{ + create_router as mock_router, db, dev_key_service::DevKeyService, state::AppState, +}; +use axum::{ + extract::State, + http::StatusCode, + response::IntoResponse, + routing::post, + Json, Router, +}; +use k256::ecdsa::{Signature, SigningKey, VerifyingKey}; +use serde::Deserialize; +use serde_json::{json, Value}; +use sha3::{Digest, Keccak256}; +use std::collections::HashMap; +use std::sync::{Arc, Mutex}; + +// ---------------------------------------------------------------------- +// TEE-stub: same wire as dev_key_service, but in-memory keypair per omni. +// ---------------------------------------------------------------------- + +#[derive(Clone, Default)] +struct TeeStubState { + /// One per-omni keypair, lazily instantiated. The real TEE worker would + /// generate these inside the enclave; the stub uses fresh OS-RNG keys + /// so we explicitly do NOT cross-validate addresses against the HKDF + /// backend — the conformance check is on shape, not identity. 
+ keys: Arc<Mutex<HashMap<String, SigningKey>>>, +} + +impl TeeStubState { + fn key_for(&self, omni: &str) -> SigningKey { + let mut map = self.keys.lock().unwrap(); + map.entry(omni.to_string()) + .or_insert_with(|| SigningKey::random(&mut k256_rand::OsRngWrapper)) + .clone() + } +} + +// k256 0.13 needs a `RngCore + CryptoRng` adapter; build a tiny one that +// wraps `getrandom`. +mod k256_rand { + use rand_core::{CryptoRng, RngCore}; + pub struct OsRngWrapper; + impl RngCore for OsRngWrapper { + fn next_u32(&mut self) -> u32 { + let mut b = [0u8; 4]; + self.fill_bytes(&mut b); + u32::from_le_bytes(b) + } + fn next_u64(&mut self) -> u64 { + let mut b = [0u8; 8]; + self.fill_bytes(&mut b); + u64::from_le_bytes(b) + } + fn fill_bytes(&mut self, dest: &mut [u8]) { + getrandom::getrandom(dest).expect("OS RNG failed"); + } + fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), rand_core::Error> { + self.fill_bytes(dest); + Ok(()) + } + } + impl CryptoRng for OsRngWrapper {} +} + +fn address_for(sk: &SigningKey) -> String { + let vk: &VerifyingKey = sk.verifying_key(); + let encoded = vk.to_encoded_point(false); + let pubkey_bytes = encoded.as_bytes(); + let mut h = Keccak256::new(); + h.update(&pubkey_bytes[1..]); + let pubkey_hash = h.finalize(); + format!("0x{}", hex::encode(&pubkey_hash[12..])) +} + +fn parse_omni(s: &str) -> Result<(), (StatusCode, Json<Value>)> { + if s.len() != 64 { + return Err(( + StatusCode::BAD_REQUEST, + Json(json!({ + "error":"invalid_omni_account", + "message":"must be 64 hex chars" + })), + )); + } + if hex::decode(s).is_err() { + return Err(( + StatusCode::BAD_REQUEST, + Json(json!({ + "error":"invalid_omni_account", + "message":"not valid hex" + })), + )); + } + Ok(()) +} + +#[derive(Deserialize)] +struct DeriveReq { + omni_account: String, +} + +#[derive(Deserialize)] +struct SignReq { + omni_account: String, + message_hex: String, +} + +async fn tee_derive( + State(state): State<TeeStubState>, + Json(body): Json<DeriveReq>, +) -> impl IntoResponse { + if let Err(e) = 
parse_omni(&body.omni_account) { + return e.into_response(); + } + let sk = state.key_for(&body.omni_account); + let address = address_for(&sk); + ( + StatusCode::OK, + Json(json!({ + "address": address, + "key_version": 1, + })), + ) + .into_response() +} + +async fn tee_sign( + State(state): State<TeeStubState>, + Json(body): Json<SignReq>, +) -> impl IntoResponse { + if let Err(e) = parse_omni(&body.omni_account) { + return e.into_response(); + } + let message_bytes = match hex::decode(body.message_hex.trim_start_matches("0x")) { + Ok(b) => b, + Err(e) => { + return ( + StatusCode::BAD_REQUEST, + Json(json!({ + "error":"invalid_message_hex", + "message":format!("not valid hex: {e}") + })), + ) + .into_response(); + } + }; + + let sk = state.key_for(&body.omni_account); + let address = address_for(&sk); + + let prefix = format!("\x19Ethereum Signed Message:\n{}", message_bytes.len()); + let mut h = Keccak256::new(); + h.update(prefix.as_bytes()); + h.update(&message_bytes); + let digest = h.finalize(); + let (sig, recovery_id) = sk + .sign_prehash_recoverable(&digest) + .expect("tee-stub sign"); + let mut sig_bytes = sig.to_bytes().to_vec(); + sig_bytes.push(recovery_id.to_byte()); + let signature = format!("0x{}", hex::encode(&sig_bytes)); + + ( + StatusCode::OK, + Json(json!({ + "signature": signature, + "address": address, + "key_version": 1, + })), + ) + .into_response() +} + +fn build_tee_stub_router() -> Router { + Router::new() + .route("/dev/derive-address", post(tee_derive)) + .route("/dev/sign-message", post(tee_sign)) + .with_state(TeeStubState::default()) +} + +fn build_hkdf_router() -> Router { + let conn = rusqlite::Connection::open_in_memory().unwrap(); + db::init_schema(&conn).unwrap(); + let signer = DevKeyService::from_master_secret([0xCEu8; 32]); + let state = Arc::new(AppState::new(conn).with_dev_signer(Some(signer))); + mock_router(state) +} + +async fn spawn(router: Router) -> String { + let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await.unwrap(); + 
let addr = listener.local_addr().unwrap(); + tokio::spawn(async move { axum::serve(listener, router).await.unwrap() }); + format!("http://{addr}") +} + +// ---------------------------------------------------------------------- +// Shared assertions — every conforming signer backend MUST pass these. +// ---------------------------------------------------------------------- + +async fn assert_address_determinism(client: &dyn SignerClient) { + let omni = "ab".repeat(32); + let a = client.derive_address(&omni).await.unwrap(); + let b = client.derive_address(&omni).await.unwrap(); + assert_eq!(a.address, b.address); + assert!(a.address.starts_with("0x")); + assert_eq!(a.address.len(), 42); + assert_eq!(a.address, a.address.to_lowercase()); + assert_eq!(a.key_version, 1); +} + +async fn assert_sign_address_matches_derive(client: &dyn SignerClient) { + let omni = "ab".repeat(32); + let derived = client.derive_address(&omni).await.unwrap(); + let signed = client.sign_eip191(&omni, b"siwe-test-message").await.unwrap(); + assert_eq!(derived.address, signed.address); + assert_eq!(derived.key_version, signed.key_version); +} + +async fn assert_signature_recovers(client: &dyn SignerClient) { + let omni = "ab".repeat(32); + let message = b"recoverable-message"; + let signed = client.sign_eip191(&omni, message).await.unwrap(); + + let raw = hex::decode(signed.signature.trim_start_matches("0x")).unwrap(); + assert_eq!(raw.len(), 65); + assert!(raw[64] == 0 || raw[64] == 1, "v must be canonical {{0,1}}"); + + let recovery_id = k256::ecdsa::RecoveryId::try_from(raw[64]).unwrap(); + let signature = Signature::from_slice(&raw[..64]).unwrap(); + + let prefix = format!("\x19Ethereum Signed Message:\n{}", message.len()); + let mut h = Keccak256::new(); + h.update(prefix.as_bytes()); + h.update(message); + let digest = h.finalize(); + + let vk = VerifyingKey::recover_from_prehash(&digest, &signature, recovery_id).unwrap(); + let encoded = vk.to_encoded_point(false); + let pubkey_bytes = 
encoded.as_bytes(); + let mut h2 = Keccak256::new(); + h2.update(&pubkey_bytes[1..]); + let pubkey_hash = h2.finalize(); + let recovered = format!("0x{}", hex::encode(&pubkey_hash[12..])); + assert_eq!(recovered, signed.address); +} + +async fn assert_invalid_omni_returns_typed_error(client: &dyn SignerClient) { + let res = client.derive_address("deadbeef").await; + match res { + Err(SignerClientError::InvalidOmniAccount(_)) => {} + other => panic!("expected InvalidOmniAccount, got {other:?}"), + } +} + +async fn assert_invalid_message_hex_returns_typed_error(_client: &dyn SignerClient) { + // The HttpSignerClient hex-encodes the message bytes for us, so we can't + // generate this error through the typed surface. Instead, hand-craft an + // HTTP request directly to confirm the wire shape — done in + // `dev_key_service_routes.rs`. Here we just leave a marker: every + // conforming backend MUST surface 400 invalid_message_hex if a raw HTTP + // POST sends a non-hex message_hex. No-op in this test layer. +} + +async fn assert_different_omnis_yield_different_addresses(client: &dyn SignerClient) { + let a = client.derive_address(&"11".repeat(32)).await.unwrap(); + let b = client.derive_address(&"22".repeat(32)).await.unwrap(); + assert_ne!(a.address, b.address); +} + +async fn run_full_suite(label: &str, client: &dyn SignerClient) { + println!("[conformance] running suite against {label}"); + assert_address_determinism(client).await; + assert_sign_address_matches_derive(client).await; + assert_signature_recovers(client).await; + assert_invalid_omni_returns_typed_error(client).await; + assert_invalid_message_hex_returns_typed_error(client).await; + assert_different_omnis_yield_different_addresses(client).await; + println!("[conformance] {label} passed all assertions"); +} + +// ---------------------------------------------------------------------- +// Each backend gets its own #[tokio::test] so a regression on one isn't +// masked by an early-exit on the other. 
+// ---------------------------------------------------------------------- + +#[tokio::test] +async fn hkdf_dev_key_service_passes_conformance_suite() { + let url = spawn(build_hkdf_router()).await; + let client = HttpSignerClient::new(url); + run_full_suite("hkdf-dev-key-service", &client).await; +} + +#[tokio::test] +async fn tee_stub_passes_conformance_suite() { + let url = spawn(build_tee_stub_router()).await; + let client = HttpSignerClient::new(url); + run_full_suite("tee-stub", &client).await; +} + +#[tokio::test] +async fn both_backends_emit_signer_disabled_error_envelope() { + // Spin a mock-server WITHOUT a dev signer; assert the typed error. + let conn = rusqlite::Connection::open_in_memory().unwrap(); + db::init_schema(&conn).unwrap(); + let state = Arc::new(AppState::new(conn)); + let router = mock_router(state); + let url = spawn(router).await; + let client = HttpSignerClient::new(url); + + match client.derive_address(&"ab".repeat(32)).await { + Err(SignerClientError::SignerDisabled(m)) => { + assert!(m.contains("DEV_KEY_SERVICE_MASTER_SECRET")); + } + other => panic!("expected SignerDisabled, got {other:?}"), + } +} diff --git a/crates/agentkeys-daemon/src/main.rs b/crates/agentkeys-daemon/src/main.rs index 9a4389d..e2ed229 100644 --- a/crates/agentkeys-daemon/src/main.rs +++ b/crates/agentkeys-daemon/src/main.rs @@ -1,6 +1,8 @@ use std::sync::Arc; +use std::time::Duration; use agentkeys_core::backend::CredentialBackend; +use agentkeys_core::init_flow; use agentkeys_core::mock_client::MockHttpClient; use agentkeys_core::session_store; use agentkeys_types::WalletAddress; @@ -54,6 +56,35 @@ struct Args { /// pre-sourced (pre-Stage-7 path). #[arg(long, env = "AGENTKEYS_BROKER_URL")] broker_url: Option<String>, + + /// Issue #74 step 1: bootstrap a fresh daemon via the email-link → + /// dev_key_service → SIWE flow. Triggers on first start when no + /// `daemon-*` session is on disk; ignored if a saved session loads. 
+ #[arg(long, conflicts_with = "init_oauth2_google")] + init_email: Option<String>, + + /// Issue #74 step 1: bootstrap a fresh daemon via the OAuth2/Google → + /// dev_key_service → SIWE flow. Same first-start semantics as + /// `--init-email`. + #[arg(long = "init-oauth2-google", conflicts_with = "init_email")] + init_oauth2_google: bool, + + /// URL of the dev_key_service signer (`/dev/derive-address` + + /// `/dev/sign-message` per docs/spec/signer-protocol.md). Required + /// when `--init-email` or `--init-oauth2-google` is set; defaults to + /// `--backend` if unset. + #[arg(long, env = "AGENTKEYS_SIGNER_URL")] + signer_url: Option<String>, + + /// SIWE chain_id for the signer-flow bootstrap. Default mirrors + /// the broker's wallet_sig plug-in test vectors (Base Sepolia). + #[arg(long, default_value_t = 84532)] + init_chain_id: u64, + + /// How long to wait for the operator to complete email-link click + /// or OAuth2 callback before failing init. + #[arg(long, default_value_t = 300)] + init_poll_timeout_seconds: u64, } #[tokio::main] @@ -213,27 +244,58 @@ async fn main() -> anyhow::Result<()> { (sess, agent_id) } None => { - // PAIR FLOW — no stored session found. Resolve --parent lazily - // here (codex PR #22 P3) so transient backend failures on the - // --session / --recover --method paths don't crash startup. - // `--parent` binds the pair request to a specific master so - // the backend refuses approval from any other master. 
- let parent_wallet = resolve_parent_if_set(&args.backend, args.parent.as_deref()).await?; - let result = pairing::run_pair_flow( - &*backend, - args.pair_timeout, - parent_wallet.as_ref(), - ) - .await - .context("pair flow failed")?; - let agent_id = result.wallet.clone(); - let sid = args - .session_id - .clone() - .unwrap_or_else(|| format!("daemon-{}", agent_id.0)); - session_store::save_session(&result.session, &sid) - .context("save paired session")?; - (result.session, agent_id) + // Issue #74 step 1: signer-flow bootstrap — when --init-email + // or --init-oauth2-google is set AND no session is saved, + // run the email/OAuth2 → dev_key_service → SIWE chain. + // Otherwise fall through to the legacy pair flow (master/ + // child paradigm). + if args.init_email.is_some() || args.init_oauth2_google { + let result = run_signer_flow_init(&args).await?; + let agent_id = WalletAddress(result.session.wallet.0.clone()); + let sid = args + .session_id + .clone() + .unwrap_or_else(|| format!("daemon-{}", agent_id.0)); + session_store::save_session(&result.session, &sid) + .context("save signer-flow session")?; + // Audit: structured tracing log so journalctl / + // log-aggregator captures the init event. The daemon + // does not have a SQL audit table of its own; the + // broker's audit (mint-time) and the structured log + // here together cover "did the daemon ever auth?" + info!( + target: "agentkeys.daemon.init", + identity_type = %result.identity_type, + identity_value = %result.identity_value, + identity_omni = %result.identity_omni, + evm_omni = %result.evm_omni, + derived_wallet = %result.derived_wallet, + "agentkeys-daemon bootstrapped via signer flow" + ); + (result.session, agent_id) + } else { + // PAIR FLOW — no stored session found. Resolve --parent lazily + // here (codex PR #22 P3) so transient backend failures on the + // --session / --recover --method paths don't crash startup. 
+ // `--parent` binds the pair request to a specific master so + // the backend refuses approval from any other master. + let parent_wallet = resolve_parent_if_set(&args.backend, args.parent.as_deref()).await?; + let result = pairing::run_pair_flow( + &*backend, + args.pair_timeout, + parent_wallet.as_ref(), + ) + .await + .context("pair flow failed")?; + let agent_id = result.wallet.clone(); + let sid = args + .session_id + .clone() + .unwrap_or_else(|| format!("daemon-{}", agent_id.0)); + session_store::save_session(&result.session, &sid) + .context("save paired session")?; + (result.session, agent_id) + } } } }; @@ -257,6 +319,54 @@ async fn main() -> anyhow::Result<()> { Ok(()) } +/// Drive the issue-#74-step-1 bootstrap chain. Reads `--init-email` / +/// `--init-oauth2-google` / `--signer-url` / `--broker-url` / +/// `--init-chain-id` / `--init-poll-timeout-seconds` from `args` and +/// returns the resulting `InitResult` (session + identity provenance). +async fn run_signer_flow_init(args: &Args) -> anyhow::Result<init_flow::InitResult> { + let broker_url = args.broker_url.clone().ok_or_else(|| { + anyhow::anyhow!( + "agentkeys-daemon --init-email/--init-oauth2-google requires --broker-url (or AGENTKEYS_BROKER_URL)" + ) + })?; + let signer_url = args.signer_url.clone().unwrap_or_else(|| args.backend.clone()); + let poll_timeout = Duration::from_secs(args.init_poll_timeout_seconds); + + if let Some(ref email) = args.init_email { + eprintln!( + "agentkeys-daemon: bootstrapping via email-link for {email}; click the magic link in your inbox" + ); + init_flow::init_via_email_link( + &broker_url, + &signer_url, + email, + args.init_chain_id, + poll_timeout, + ) + .await + .map_err(|e| anyhow::anyhow!("email-link bootstrap failed: {e}")) + } else if args.init_oauth2_google { + let start = init_flow::start_oauth2_google(&broker_url) + .await + .map_err(|e| anyhow::anyhow!("oauth2/start failed: {e}"))?; + eprintln!( + "agentkeys-daemon: open this URL in your browser to complete 
OAuth2/Google:\n {}", + start.authorization_url + ); + init_flow::complete_oauth2_google( + &broker_url, + &signer_url, + &start.request_id, + args.init_chain_id, + poll_timeout, + ) + .await + .map_err(|e| anyhow::anyhow!("oauth2 bootstrap failed: {e}")) + } else { + unreachable!("caller guards on init_email or init_oauth2_google being set") + } +} + /// True IFF `s` is a strict `0x` + 40 hex-digit wallet literal. Aliases like /// `0x-office` or `0x+bar` (both legal per `cmd_link`) fail this check and /// go through the identity-resolution path instead (codex PR #22 P2 — diff --git a/crates/agentkeys-mock-server/Cargo.toml b/crates/agentkeys-mock-server/Cargo.toml index d7591a8..2c7ffe0 100644 --- a/crates/agentkeys-mock-server/Cargo.toml +++ b/crates/agentkeys-mock-server/Cargo.toml @@ -23,7 +23,10 @@ tower-http = { version = "0.5", features = ["cors"] } ed25519-dalek = { version = "2", features = ["rand_core"] } rand = "0.8" hmac = "0.12" +hkdf = "0.12" sha2 = "0.10" +sha3 = "0.10" +k256 = { version = "0.13", features = ["ecdsa", "sha2"] } ciborium = "0.2" hex = "0.4" clap = { version = "4", features = ["derive"] } @@ -33,7 +36,14 @@ base64 = "0.22" tower = { version = "0.4", features = ["util"] } http-body-util = "0.1" async-trait = { workspace = true } +thiserror = { workspace = true } +jsonwebtoken = "9" [dev-dependencies] reqwest = { version = "0.12", features = ["json", "blocking"] } tokio = { workspace = true } +# Test-only: mint test JWTs against an in-test ES256 keypair so the JWT-auth +# path (`--signer-only` mode) can be exercised hermetically. +p256 = { version = "0.13", features = ["pkcs8", "pem", "ecdsa"] } +rand_core = { version = "0.6", features = ["std"] } +getrandom = "0.2" diff --git a/crates/agentkeys-mock-server/src/dev_key_service.rs b/crates/agentkeys-mock-server/src/dev_key_service.rs new file mode 100644 index 0000000..b81b139 --- /dev/null +++ b/crates/agentkeys-mock-server/src/dev_key_service.rs @@ -0,0 +1,410 @@ +//! 
============================================================================ +//! DEV ONLY — REPLACE WITH TEE WORKER (issue #74 step 2) +//! ============================================================================ +//! +//! HKDF-backed signer for development and CI. The master secret lives in a +//! plain environment variable, which is fine for local dev and the demo +//! deployment but is unacceptable for any environment where compromise of +//! the host shell environment would be a security incident. +//! +//! Production deployments MUST replace this module with a TEE-backed +//! signer (issue #74 step 2). The wire shape is locked by +//! `docs/spec/signer-protocol.md` so the swap is mechanical. +//! +//! What this module does: +//! 1. Loads a 32-byte master secret from `DEV_KEY_SERVICE_MASTER_SECRET` +//! (hex). Refuses to enable if the env var is unset or malformed. +//! 2. Derives a deterministic secp256k1 keypair from `omni_account` via +//! HKDF-SHA256 using a versioned info string +//! (`[key_version_byte] || "agentkeys-evm-wallet" || omni_bytes`). +//! 3. Computes the EVM address from the derived public key (keccak256 of +//! uncompressed pubkey, last 20 bytes, lowercase hex). +//! 4. Signs arbitrary byte messages under the EIP-191 envelope and returns +//! the canonical 65-byte `r || s || v` signature with `v ∈ {0, 1}`. +//! +//! The signing key is never persisted, never logged, never returned over +//! the wire. The address and signatures are the only externally visible +//! products. +//! +//! See `docs/spec/signer-protocol.md` for the v0 wire contract. + +use hkdf::Hkdf; +use k256::ecdsa::SigningKey; +use sha2::Sha256; +use sha3::{Digest, Keccak256}; + +/// Stable salt input to the HKDF extract step. Pinning the salt locks the +/// derivation domain to "agentkeys signer v0" — distinct from any other +/// HKDF use of the same master secret in any unrelated AgentKeys subsystem. 
+const HKDF_SALT: &[u8] = b"agentkeys-signer-v0"; + +/// Info-string suffix appended after the version byte. Pinning this keeps +/// the v0 derivation domain stable; never change without a `KEY_VERSION` +/// bump. +const HKDF_INFO_SUFFIX: &[u8] = b"agentkeys-evm-wallet"; + +/// Current key-derivation version. Future master-secret rotation bumps this +/// byte, producing a different address from the same omni_account while +/// keeping the wire shape identical. Reserved range: +/// * `0x01..=0x7f` for production rotations +/// * `0x80..=0xff` for staging / testing +pub const KEY_VERSION: u8 = 0x01; + +/// Required env var name. Production builds (when the TEE worker exists) +/// MUST refuse to honor this env var; the TEE worker has its own sealed +/// secret and ignores it. +pub const MASTER_SECRET_ENV_VAR: &str = "DEV_KEY_SERVICE_MASTER_SECRET"; + +/// Errors that the signer can surface to the HTTP layer. +#[derive(Debug, thiserror::Error)] +pub enum SignerError { + #[error("invalid_omni_account: {0}")] + InvalidOmniAccount(String), + + #[error("invalid_message_hex: {0}")] + InvalidMessageHex(String), + + #[error("internal: {0}")] + Internal(String), +} + +impl SignerError { + /// Stable machine-readable code, matching `signer-protocol.md`'s error + /// envelope. + pub fn code(&self) -> &'static str { + match self { + SignerError::InvalidOmniAccount(_) => "invalid_omni_account", + SignerError::InvalidMessageHex(_) => "invalid_message_hex", + SignerError::Internal(_) => "internal", + } + } + + /// HTTP status the handler should return. + pub fn http_status(&self) -> u16 { + match self { + SignerError::InvalidOmniAccount(_) | SignerError::InvalidMessageHex(_) => 400, + SignerError::Internal(_) => 500, + } + } +} + +/// HKDF-backed dev signer. **DEV ONLY.** +/// +/// Holds the 32-byte master secret in process memory. Construct one per +/// process at boot via `DevKeyService::from_env()` and share it through +/// `Arc` if multiple call sites need it. 
+pub struct DevKeyService { + master_secret: [u8; 32], +} + +impl DevKeyService { + /// **DEV ONLY.** Load the master secret from + /// `DEV_KEY_SERVICE_MASTER_SECRET` (hex). Returns `Ok(None)` if the env + /// var is unset or empty (callers translate this to 503 `signer_disabled` per + /// the wire contract). Returns `Err` if the env var is set but + /// malformed (wrong length, non-hex) — that is an operator error and + /// should fail the boot, not silently disable the signer. + pub fn from_env() -> Result<Option<Self>, String> { + let raw = match std::env::var(MASTER_SECRET_ENV_VAR) { + Ok(s) if s.is_empty() => return Ok(None), + Ok(s) => s, + Err(_) => return Ok(None), + }; + let bytes = hex::decode(raw.trim_start_matches("0x")) + .map_err(|e| format!("{MASTER_SECRET_ENV_VAR} is not valid hex: {e}"))?; + if bytes.len() != 32 { + return Err(format!( + "{MASTER_SECRET_ENV_VAR} must decode to 32 bytes, got {}", + bytes.len() + )); + } + let mut master_secret = [0u8; 32]; + master_secret.copy_from_slice(&bytes); + Ok(Some(Self { master_secret })) + } + + /// **DEV ONLY.** Construct directly from a 32-byte master secret (used + /// by tests; production must go through `from_env()`). + pub fn from_master_secret(master_secret: [u8; 32]) -> Self { + Self { master_secret } + } + + /// **DEV ONLY.** Derive the secp256k1 signing key for an `omni_account` + /// per the v0 derivation rule: + /// `HKDF-SHA256(ikm=master_secret, salt="agentkeys-signer-v0", + /// info=[KEY_VERSION] || "agentkeys-evm-wallet" || omni_bytes, + /// okm=32)`. + /// + /// On the vanishingly rare chance the 32-byte HKDF output is rejected + /// by `SigningKey::from_slice` (probability ≈ 2⁻¹²⁸), we append a + /// one-byte retry counter to the HKDF info string and expand again, up to + /// `MAX_HKDF_RETRIES` times. In practice this never fires. 
+ fn derive_signing_key(&self, omni_bytes: &[u8; 32]) -> Result<SigningKey, SignerError> { + const MAX_HKDF_RETRIES: u8 = 16; + + let hk = Hkdf::<Sha256>::new(Some(HKDF_SALT), &self.master_secret); + + for retry in 0..MAX_HKDF_RETRIES { + // Build info: [KEY_VERSION] || "agentkeys-evm-wallet" || omni_bytes || + // optional retry counter (only when retry > 0) + let mut info = Vec::with_capacity(1 + HKDF_INFO_SUFFIX.len() + 32 + 1); + info.push(KEY_VERSION); + info.extend_from_slice(HKDF_INFO_SUFFIX); + info.extend_from_slice(omni_bytes); + if retry > 0 { + info.push(retry); + } + + let mut okm = [0u8; 32]; + hk.expand(&info, &mut okm) + .map_err(|e| SignerError::Internal(format!("HKDF expand failed: {e}")))?; + + match SigningKey::from_slice(&okm) { + Ok(sk) => return Ok(sk), + Err(_) => continue, + } + } + + Err(SignerError::Internal( + "HKDF output rejected as secp256k1 scalar after 16 retries (vanishingly rare; bug?)".into(), + )) + } + + /// **DEV ONLY.** Derive the EVM address (lowercase hex, + /// `0x` + 40 chars) for an `omni_account`. + pub fn derive_address(&self, omni_account: &str) -> Result<String, SignerError> { + let omni_bytes = parse_omni_account(omni_account)?; + let sk = self.derive_signing_key(&omni_bytes)?; + Ok(address_for_signing_key(&sk)) + } + + /// **DEV ONLY.** Sign `message_bytes` under EIP-191 with the keypair + /// derived from `omni_account`. Returns the canonical 65-byte signature + /// (`r || s || v`, `v ∈ {0, 1}`) as a 0x-prefixed lowercase hex string, + /// alongside the address that the signature recovers to. + pub fn sign_eip191( + &self, + omni_account: &str, + message_bytes: &[u8], + ) -> Result<(String, String), SignerError> { + let omni_bytes = parse_omni_account(omni_account)?; + let sk = self.derive_signing_key(&omni_bytes)?; + let address = address_for_signing_key(&sk); + + // EIP-191: keccak256("\x19Ethereum Signed Message:\n" || len || message). 
+ let prefix = format!("\x19Ethereum Signed Message:\n{}", message_bytes.len()); + let mut hasher = Keccak256::new(); + hasher.update(prefix.as_bytes()); + hasher.update(message_bytes); + let digest = hasher.finalize(); + + // Sign and recover the recovery id. k256's + // `sign_prehash_recoverable` returns a low-s normalized signature + // and a recovery id in {0, 1}. + let (sig, recovery_id) = sk + .sign_prehash_recoverable(&digest) + .map_err(|e| SignerError::Internal(format!("signing failed: {e}")))?; + + let mut sig_bytes = sig.to_bytes().to_vec(); + sig_bytes.push(recovery_id.to_byte()); + debug_assert_eq!(sig_bytes.len(), 65, "EIP-191 signature must be 65 bytes"); + + let signature_hex = format!("0x{}", hex::encode(&sig_bytes)); + Ok((signature_hex, address)) + } +} + +/// Parse an `omni_account` from the wire format (64 lowercase hex chars, +/// no `0x` prefix per `signer-protocol.md`) into its raw 32 bytes. Tolerates +/// uppercase hex but rejects any other deviation. +fn parse_omni_account(omni_account: &str) -> Result<[u8; 32], SignerError> { + if omni_account.len() != 64 { + return Err(SignerError::InvalidOmniAccount(format!( + "must be 64 hex chars, got {}", + omni_account.len() + ))); + } + let bytes = hex::decode(omni_account) + .map_err(|e| SignerError::InvalidOmniAccount(format!("not valid hex: {e}")))?; + let mut out = [0u8; 32]; + out.copy_from_slice(&bytes); + Ok(out) +} + +/// EVM address from a secp256k1 verifying key: keccak256 of the +/// uncompressed public key (skipping the leading 0x04 marker), take the +/// last 20 bytes, return `0x` + 40 lowercase hex chars. 
+fn address_for_signing_key(sk: &SigningKey) -> String { + let vk = sk.verifying_key(); + let encoded_point = vk.to_encoded_point(false); + let pubkey_bytes = encoded_point.as_bytes(); + debug_assert_eq!(pubkey_bytes.len(), 65, "uncompressed secp256k1 pubkey is 65 bytes"); + debug_assert_eq!(pubkey_bytes[0], 0x04, "uncompressed marker"); + + let mut hasher = Keccak256::new(); + hasher.update(&pubkey_bytes[1..]); + let pubkey_hash = hasher.finalize(); + format!("0x{}", hex::encode(&pubkey_hash[12..])) +} + +#[cfg(test)] +mod tests { + use super::*; + use k256::ecdsa::{RecoveryId, Signature, VerifyingKey}; + + fn fixed_master_secret() -> [u8; 32] { + // Deterministic test fixture; do NOT use this in any environment. + let mut s = [0u8; 32]; + for (i, b) in s.iter_mut().enumerate() { + *b = i as u8; + } + s + } + + fn fixed_signer() -> DevKeyService { + DevKeyService::from_master_secret(fixed_master_secret()) + } + + fn fixed_omni() -> String { + // 64 hex chars, all 0xab. + "ab".repeat(32) + } + + #[test] + fn derive_address_is_deterministic() { + let s = fixed_signer(); + let a1 = s.derive_address(&fixed_omni()).unwrap(); + let a2 = s.derive_address(&fixed_omni()).unwrap(); + assert_eq!(a1, a2); + assert!(a1.starts_with("0x")); + assert_eq!(a1.len(), 42); + // lowercase + assert_eq!(a1, a1.to_lowercase()); + } + + #[test] + fn different_omni_yields_different_address() { + let s = fixed_signer(); + let a = s.derive_address(&fixed_omni()).unwrap(); + let b = s.derive_address(&"cd".repeat(32)).unwrap(); + assert_ne!(a, b); + } + + #[test] + fn different_master_secret_yields_different_address() { + let s1 = DevKeyService::from_master_secret([0x11; 32]); + let s2 = DevKeyService::from_master_secret([0x22; 32]); + let a1 = s1.derive_address(&fixed_omni()).unwrap(); + let a2 = s2.derive_address(&fixed_omni()).unwrap(); + assert_ne!(a1, a2); + } + + #[test] + fn rejects_short_omni() { + let s = fixed_signer(); + let res = s.derive_address("deadbeef"); + 
assert!(matches!(res, Err(SignerError::InvalidOmniAccount(_)))); + } + + #[test] + fn rejects_non_hex_omni() { + let s = fixed_signer(); + let res = s.derive_address(&"z".repeat(64)); + assert!(matches!(res, Err(SignerError::InvalidOmniAccount(_)))); + } + + #[test] + fn sign_address_matches_derive_address() { + let s = fixed_signer(); + let omni = fixed_omni(); + let derived = s.derive_address(&omni).unwrap(); + let (_sig, signed_addr) = s.sign_eip191(&omni, b"hello").unwrap(); + assert_eq!(derived, signed_addr); + } + + #[test] + fn signature_is_65_bytes_canonical_v() { + let s = fixed_signer(); + let (sig_hex, _addr) = s.sign_eip191(&fixed_omni(), b"hello").unwrap(); + assert!(sig_hex.starts_with("0x")); + let raw = hex::decode(sig_hex.trim_start_matches("0x")).unwrap(); + assert_eq!(raw.len(), 65); + // canonical v ∈ {0, 1} + assert!(raw[64] == 0 || raw[64] == 1, "v byte = {}", raw[64]); + } + + #[test] + fn signature_recovers_to_derived_address() { + let s = fixed_signer(); + let omni = fixed_omni(); + let message = b"siwe-test-message"; + let (sig_hex, derived_addr) = s.sign_eip191(&omni, message).unwrap(); + + // Reproduce the broker's ecrecover path. 
+ let raw = hex::decode(sig_hex.trim_start_matches("0x")).unwrap(); + let recovery_id = RecoveryId::try_from(raw[64]).unwrap(); + let signature = Signature::from_slice(&raw[..64]).unwrap(); + + let prefix = format!("\x19Ethereum Signed Message:\n{}", message.len()); + let mut h = Keccak256::new(); + h.update(prefix.as_bytes()); + h.update(message); + let digest = h.finalize(); + + let vk = VerifyingKey::recover_from_prehash(&digest, &signature, recovery_id).unwrap(); + let encoded_point = vk.to_encoded_point(false); + let pubkey_bytes = encoded_point.as_bytes(); + let mut h2 = Keccak256::new(); + h2.update(&pubkey_bytes[1..]); + let pubkey_hash = h2.finalize(); + let recovered = format!("0x{}", hex::encode(&pubkey_hash[12..])); + + assert_eq!(recovered, derived_addr); + } + + /// Combined serial test for `from_env`. Tests that mutate process-global + /// env vars cannot run in parallel — a sibling test inside the same + /// binary would observe the wrong state. We sequence all three branches + /// (unset, malformed, valid) inside a single test and use a process-wide + /// `Mutex` to serialize against any future `from_env` call sites. + #[test] + fn from_env_unset_then_invalid_then_valid() { + use std::sync::Mutex; + static ENV_LOCK: Mutex<()> = Mutex::new(()); + let _guard = ENV_LOCK.lock().unwrap(); + + let prev = std::env::var(MASTER_SECRET_ENV_VAR).ok(); + + // Branch 1: unset → Ok(None). + std::env::remove_var(MASTER_SECRET_ENV_VAR); + assert!(matches!(DevKeyService::from_env(), Ok(None))); + + // Branch 2: malformed (too short hex) → Err. + std::env::set_var(MASTER_SECRET_ENV_VAR, "deadbeef"); + assert!(DevKeyService::from_env().is_err()); + + // Branch 3: valid 32-byte hex → Ok(Some(svc)) and derive succeeds. + std::env::set_var(MASTER_SECRET_ENV_VAR, "00".repeat(32)); + let svc = DevKeyService::from_env().unwrap().unwrap(); + let _ = svc.derive_address(&fixed_omni()).unwrap(); + + // Restore prior env state. 
+ match prev { + Some(p) => std::env::set_var(MASTER_SECRET_ENV_VAR, p), + None => std::env::remove_var(MASTER_SECRET_ENV_VAR), + } + } + + #[test] + fn signer_error_codes_match_protocol() { + assert_eq!( + SignerError::InvalidOmniAccount("x".into()).code(), + "invalid_omni_account" + ); + assert_eq!( + SignerError::InvalidMessageHex("x".into()).code(), + "invalid_message_hex" + ); + assert_eq!(SignerError::Internal("x".into()).code(), "internal"); + } +} diff --git a/crates/agentkeys-mock-server/src/handlers/dev_keys.rs b/crates/agentkeys-mock-server/src/handlers/dev_keys.rs new file mode 100644 index 0000000..383be44 --- /dev/null +++ b/crates/agentkeys-mock-server/src/handlers/dev_keys.rs @@ -0,0 +1,191 @@ +//! HTTP handlers for the dev_key_service signer. +//! +//! See `docs/spec/signer-protocol.md` for the wire contract. Both endpoints +//! return 503 `signer_disabled` when `state.dev_signer` is `None` +//! (i.e. `DEV_KEY_SERVICE_MASTER_SECRET` was unset at boot). When enabled, +//! they delegate to `DevKeyService` for derivation/signing. +//! +//! JWT bearer auth: when `state.broker_session_pubkey` is `Some`, every request +//! MUST carry `Authorization: Bearer ` signed by the broker's session keypair. +//! The JWT's `agentkeys.omni_account` claim MUST match the request body's +//! `omni_account` field. When the pubkey is `None` (legacy/test mode), auth +//! is skipped. + +use axum::{extract::State, http::HeaderMap, http::StatusCode, response::IntoResponse, Json}; +use jsonwebtoken::{decode, Algorithm, Validation}; +use serde::{Deserialize, Serialize}; +use serde_json::{json, Value}; + +use crate::dev_key_service::{SignerError, KEY_VERSION}; +use crate::state::SharedState; + +#[derive(Deserialize)] +pub struct DeriveAddressRequest { + pub omni_account: String, +} + +#[derive(Deserialize)] +pub struct SignMessageRequest { + pub omni_account: String, + pub message_hex: String, +} + +/// Minimal JWT claims we care about for verification. 
+#[derive(Debug, Serialize, Deserialize)] +struct SessionClaims { + exp: u64, + agentkeys: AgentKeysClaims, +} + +#[derive(Debug, Serialize, Deserialize)] +struct AgentKeysClaims { + omni_account: String, +} + +/// Verify the bearer JWT and assert `claims.agentkeys.omni_account == body_omni`. +/// Returns `Ok(())` on success. +/// Returns `Err((StatusCode::UNAUTHORIZED, Json(...)))` on any failure. +/// +/// Skipped entirely when `state.broker_session_pubkey` is `None`. +fn verify_session_jwt( + state: &SharedState, + headers: &HeaderMap, + body_omni: &str, +) -> Result<(), (StatusCode, Json<Value>)> { + let Some(decoding_key) = state.broker_session_pubkey.as_ref() else { + return Ok(()); + }; + + let token = extract_bearer(headers).ok_or_else(|| { + ( + StatusCode::UNAUTHORIZED, + Json(json!({ + "error": "unauthorized", + "message": "missing Authorization: Bearer header", + })), + ) + })?; + + // The signer doesn't know the broker's issuer URL, so `iss` is not + // required here; the broker already validated it when it minted the token. + // We verify signature + expiry + audience, then the omni_account claim below. + let mut validation2 = Validation::new(Algorithm::ES256); + validation2.set_audience(&["agentkeys:broker"]); + validation2.validate_exp = true; + // Don't require iss — we don't know the broker URL here. 
+ validation2.set_required_spec_claims(&["exp", "aud"]); + + let token_data = decode::<SessionClaims>(token, decoding_key, &validation2).map_err(|e| { + ( + StatusCode::UNAUTHORIZED, + Json(json!({ + "error": "unauthorized", + "message": format!("invalid session JWT: {e}"), + })), + ) + })?; + + if token_data.claims.agentkeys.omni_account != body_omni { + return Err(( + StatusCode::UNAUTHORIZED, + Json(json!({ + "error": "unauthorized", + "message": "JWT omni_account claim does not match request body", + })), + )); + } + + Ok(()) +} + +fn extract_bearer(headers: &HeaderMap) -> Option<&str> { + let val = headers.get("authorization")?.to_str().ok()?; + val.strip_prefix("Bearer ").map(str::trim) +} + +pub async fn derive_address( + State(state): State<SharedState>, + headers: HeaderMap, + Json(body): Json<DeriveAddressRequest>, +) -> impl IntoResponse { + if let Err(e) = verify_session_jwt(&state, &headers, &body.omni_account) { + return e.into_response(); + } + let Some(signer) = state.dev_signer.as_ref() else { + return signer_disabled().into_response(); + }; + match signer.derive_address(&body.omni_account) { + Ok(address) => ( + StatusCode::OK, + Json(json!({ + "address": address, + "key_version": KEY_VERSION, + })), + ) + .into_response(), + Err(e) => signer_error(e).into_response(), + } +} + +pub async fn sign_message( + State(state): State<SharedState>, + headers: HeaderMap, + Json(body): Json<SignMessageRequest>, +) -> impl IntoResponse { + if let Err(e) = verify_session_jwt(&state, &headers, &body.omni_account) { + return e.into_response(); + } + let Some(signer) = state.dev_signer.as_ref() else { + return signer_disabled().into_response(); + }; + + let message_bytes = match hex::decode(body.message_hex.trim_start_matches("0x")) { + Ok(b) => b, + Err(e) => { + return signer_error(SignerError::InvalidMessageHex(format!( + "not valid hex: {e}" + ))) + .into_response(); + } + }; + + match signer.sign_eip191(&body.omni_account, &message_bytes) { + Ok((signature, address)) => ( + StatusCode::OK, + Json(json!({ + "signature": signature, + 
"address": address, + "key_version": KEY_VERSION, + })), + ) + .into_response(), + Err(e) => signer_error(e).into_response(), + } +} + +fn signer_disabled() -> (StatusCode, Json<Value>) { + ( + StatusCode::SERVICE_UNAVAILABLE, + Json(json!({ + "error": "signer_disabled", + "message": "dev_key_service disabled — set DEV_KEY_SERVICE_MASTER_SECRET to enable", + })), + ) +} + +fn signer_error(e: SignerError) -> (StatusCode, Json<Value>) { + let status = + StatusCode::from_u16(e.http_status()).unwrap_or(StatusCode::INTERNAL_SERVER_ERROR); + ( + status, + Json(json!({ + "error": e.code(), + "message": e.to_string(), + })), + ) +} diff --git a/crates/agentkeys-mock-server/src/handlers/mod.rs b/crates/agentkeys-mock-server/src/handlers/mod.rs index 92055f8..fc137a7 100644 --- a/crates/agentkeys-mock-server/src/handlers/mod.rs +++ b/crates/agentkeys-mock-server/src/handlers/mod.rs @@ -1,6 +1,7 @@ pub mod audit; pub mod auth_request; pub mod credential; +pub mod dev_keys; pub mod identity; pub mod inbox; pub mod rendezvous; diff --git a/crates/agentkeys-mock-server/src/lib.rs b/crates/agentkeys-mock-server/src/lib.rs index a4a0e89..e0b91a6 100644 --- a/crates/agentkeys-mock-server/src/lib.rs +++ b/crates/agentkeys-mock-server/src/lib.rs @@ -1,5 +1,6 @@ pub mod auth; pub mod db; +pub mod dev_key_service; pub mod error; pub mod handlers; pub mod state; @@ -7,11 +8,24 @@ pub mod test_client; use axum::{ Router, - routing::{delete, get, post, put}, + routing::{get, post, delete, put}, }; use state::SharedState; +/// Signer-only router: serves `/dev/*` + `/healthz` exclusively. +/// Used when `--signer-only` is set, so that the dedicated signer listener +/// (`signer.litentry.org` → :8092) never accidentally serves session/credential +/// endpoints. JWT bearer auth is enforced when `state.broker_session_pubkey` +/// is set. 
+pub fn create_signer_router(state: SharedState) -> Router { + Router::new() + .route("/dev/derive-address", post(handlers::dev_keys::derive_address)) + .route("/dev/sign-message", post(handlers::dev_keys::sign_message)) + .route("/healthz", get(|| async { "ok" })) + .with_state(state) +} + pub fn create_router(state: SharedState) -> Router { Router::new() // Session @@ -49,6 +63,11 @@ pub fn create_router(state: SharedState) -> Router { .route("/mock/inbox/deliver", post(handlers::inbox::deliver_inbox)) .route("/mock/inbox/messages", get(handlers::inbox::list_messages)) .route("/mock/inbox/list", get(handlers::inbox::list_inboxes)) + // Dev key service (signer edge — see docs/spec/signer-protocol.md). + // 503 `signer_disabled` when `DEV_KEY_SERVICE_MASTER_SECRET` is unset. + // Issue #74 step 2 replaces this with a TEE worker; wire shape stays. + .route("/dev/derive-address", post(handlers::dev_keys::derive_address)) + .route("/dev/sign-message", post(handlers::dev_keys::sign_message)) // `/healthz` (Kubernetes convention) — what the broker's Tier-2 // reachability probe hits. Single endpoint, single name across the // codebase. Pre-Stage-7 `/health` alias was dropped; any caller that diff --git a/crates/agentkeys-mock-server/src/main.rs b/crates/agentkeys-mock-server/src/main.rs index a06031b..92d40ec 100644 --- a/crates/agentkeys-mock-server/src/main.rs +++ b/crates/agentkeys-mock-server/src/main.rs @@ -1,11 +1,35 @@ -use agentkeys_mock_server::{create_router, db, state::AppState}; +use agentkeys_mock_server::{ + create_router, create_signer_router, db, dev_key_service::DevKeyService, state::AppState, +}; use clap::Parser; +use jsonwebtoken::DecodingKey; +use std::path::PathBuf; use std::sync::Arc; #[derive(Parser)] struct Args { #[arg(long, default_value = "8090")] port: u16, + + /// When set, the server runs in signer-only mode: it serves ONLY + /// `/dev/derive-address`, `/dev/sign-message`, and `/healthz`. 
+ /// All other endpoints (session, credential, audit, etc.) are absent. + /// Intended for the dedicated `signer.litentry.org` listener (:8092). + #[arg(long)] + signer_only: bool, + + /// Path to the broker's ES256 session public key PEM file. + /// When provided together with `--signer-only`, the signer reads this key + /// at boot and uses it to verify the `Authorization: Bearer ` header + /// on every `/dev/*` request. + /// + /// Default: `/var/lib/agentkeys/.agentkeys/broker/session-keypair.pub.pem` + /// (the path the broker writes when started with `--export-session-pubkey-to`). + #[arg( + long, + default_value = "/var/lib/agentkeys/.agentkeys/broker/session-keypair.pub.pem" + )] + broker_session_pubkey_path: PathBuf, } #[tokio::main] @@ -15,13 +39,83 @@ async fn main() { let conn = rusqlite::Connection::open_in_memory().unwrap(); db::init_schema(&conn).unwrap(); - let state = Arc::new(AppState::new(conn)); - let app = create_router(state); + // Load the dev signer from `DEV_KEY_SERVICE_MASTER_SECRET`. Unset → + // `/dev/*` returns 503; malformed → fail boot loud (operator error). + let dev_signer = match DevKeyService::from_env() { + Ok(opt) => { + if opt.is_some() { + eprintln!( + "[mock-server] dev_key_service ENABLED (DEV ONLY — replace with TEE worker per issue #74 step 2)" + ); + } else { + eprintln!( + "[mock-server] dev_key_service disabled (set DEV_KEY_SERVICE_MASTER_SECRET to enable)" + ); + } + opt + } + Err(e) => { + eprintln!("[mock-server] FATAL: invalid DEV_KEY_SERVICE_MASTER_SECRET: {e}"); + std::process::exit(2); + } + }; + + // In signer-only mode, load the broker's session pubkey for JWT bearer + // verification. If the file is missing, fail boot loud — the operator + // must ensure the broker has written the pubkey before starting the signer. 
+ let broker_session_pubkey = if args.signer_only { + match load_broker_pubkey(&args.broker_session_pubkey_path) { + Ok(key) => { + eprintln!( + "[mock-server] signer-only mode: broker session pubkey loaded from {}", + args.broker_session_pubkey_path.display() + ); + Some(key) + } + Err(e) => { + eprintln!( + "[mock-server] FATAL: cannot load broker session pubkey from {}: {e}", + args.broker_session_pubkey_path.display() + ); + std::process::exit(2); + } + } + } else { + None + }; + + let state = Arc::new( + AppState::new(conn) + .with_dev_signer(dev_signer) + .with_broker_session_pubkey(broker_session_pubkey), + ); - let listener = tokio::net::TcpListener::bind(format!("0.0.0.0:{}", args.port)) - .await - .unwrap(); - println!("Mock server running on port {}", args.port); + let bind_addr = if args.signer_only { + // Signer-only listener binds to loopback — nginx fronts it publicly. + format!("127.0.0.1:{}", args.port) + } else { + format!("0.0.0.0:{}", args.port) + }; + + let app = if args.signer_only { + eprintln!( + "[mock-server] signer-only mode: serving /dev/* + /healthz on {}", + bind_addr + ); + create_signer_router(state) + } else { + create_router(state) + }; + + let listener = tokio::net::TcpListener::bind(&bind_addr).await.unwrap(); + println!("Mock server running on {}", bind_addr); axum::serve(listener, app).await.unwrap(); } + +/// Load a PEM-encoded EC public key for use as a JWT decoding key. 
+fn load_broker_pubkey(path: &PathBuf) -> Result<DecodingKey, String> { + let pem = std::fs::read(path).map_err(|e| format!("read {}: {e}", path.display()))?; + DecodingKey::from_ec_pem(&pem) + .map_err(|e| format!("parse EC PEM from {}: {e}", path.display())) +} diff --git a/crates/agentkeys-mock-server/src/state.rs b/crates/agentkeys-mock-server/src/state.rs index 2acc7ec..e8f40a6 100644 --- a/crates/agentkeys-mock-server/src/state.rs +++ b/crates/agentkeys-mock-server/src/state.rs @@ -1,11 +1,23 @@ use ed25519_dalek::{SigningKey, VerifyingKey}; +use jsonwebtoken::DecodingKey; use rusqlite::Connection; use std::sync::{Arc, Mutex}; +use crate::dev_key_service::DevKeyService; + pub struct AppState { pub db: Mutex<Connection>, pub shielding_signing_key: SigningKey, pub shielding_public_key: VerifyingKey, + /// Dev signer for `/dev/derive-address` and `/dev/sign-message`. + /// `None` when `DEV_KEY_SERVICE_MASTER_SECRET` is unset; the handlers + /// then return 503 `signer_disabled` per `signer-protocol.md`. + pub dev_signer: Option<DevKeyService>, + /// Broker session keypair public key for JWT bearer verification on `/dev/*`. + /// `None` in legacy mock-server mode (no auth on `/dev/*`). + /// When set (signer-only mode), every `/dev/*` request MUST carry a valid + /// session JWT signed by the broker. + pub broker_session_pubkey: Option<DecodingKey>, } impl AppState { @@ -17,8 +29,25 @@ impl AppState { db: Mutex::new(conn), shielding_signing_key: signing_key, shielding_public_key: verifying_key, + dev_signer: None, + broker_session_pubkey: None, } } + + /// Builder: attach a dev signer (or leave it `None` to keep the `/dev/*` + /// endpoints disabled). + pub fn with_dev_signer(mut self, signer: Option<DevKeyService>) -> Self { + self.dev_signer = signer; + self + } + + /// Builder: attach the broker session pubkey for JWT bearer verification. + /// When set, every `/dev/*` request must carry a valid session JWT. + /// When `None` (default), JWT verification is skipped (legacy/test mode). 
+ pub fn with_broker_session_pubkey(mut self, key: Option<DecodingKey>) -> Self { + self.broker_session_pubkey = key; + self + } } pub type SharedState = Arc<AppState>; diff --git a/crates/agentkeys-mock-server/tests/dev_key_service_routes.rs b/crates/agentkeys-mock-server/tests/dev_key_service_routes.rs new file mode 100644 index 0000000..2cd8afc --- /dev/null +++ b/crates/agentkeys-mock-server/tests/dev_key_service_routes.rs @@ -0,0 +1,468 @@ +//! Integration tests for `/dev/derive-address` and `/dev/sign-message` +//! per `docs/spec/signer-protocol.md`. +//! +//! These tests build the router directly (no real TCP) so the env-var seam +//! that gates the dev signer can be controlled per case without touching +//! the process environment. + +use agentkeys_mock_server::{ + create_router, create_signer_router, db, dev_key_service::DevKeyService, state::AppState, +}; +use axum::body::Body; +use axum::http::{Method, Request, StatusCode}; +use axum::Router; +use http_body_util::BodyExt; +use jsonwebtoken::{decode, encode, Algorithm, DecodingKey, EncodingKey, Header, Validation}; +use p256::ecdsa::SigningKey; +use p256::pkcs8::{EncodePrivateKey, EncodePublicKey, LineEnding}; +use serde::{Deserialize, Serialize}; +use serde_json::{json, Value}; +use std::sync::Arc; +use tower::ServiceExt; + +// ── JWT helpers for tests ────────────────────────────────────────────────── + +/// Generate a fresh P-256 keypair for use in JWT tests. 
+fn gen_ec_keypair() -> (EncodingKey, DecodingKey) { + let signing_key = SigningKey::random(&mut p256_rand::OsRngWrapper); + let private_pem = signing_key + .to_pkcs8_pem(LineEnding::LF) + .expect("encode private key") + .to_string(); + let public_pem = signing_key + .verifying_key() + .to_public_key_pem(LineEnding::LF) + .expect("encode public key"); + let enc = EncodingKey::from_ec_pem(private_pem.as_bytes()).expect("enc key"); + let dec = DecodingKey::from_ec_pem(public_pem.as_bytes()).expect("dec key"); + (enc, dec) +} + +mod p256_rand { + use rand_core::{CryptoRng, RngCore}; + pub struct OsRngWrapper; + impl RngCore for OsRngWrapper { + fn next_u32(&mut self) -> u32 { + let mut b = [0u8; 4]; + self.fill_bytes(&mut b); + u32::from_le_bytes(b) + } + fn next_u64(&mut self) -> u64 { + let mut b = [0u8; 8]; + self.fill_bytes(&mut b); + u64::from_le_bytes(b) + } + fn fill_bytes(&mut self, dest: &mut [u8]) { + getrandom::getrandom(dest).expect("OS RNG"); + } + fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), rand_core::Error> { + self.fill_bytes(dest); + Ok(()) + } + } + impl CryptoRng for OsRngWrapper {} +} + +#[derive(Debug, Serialize, Deserialize)] +struct TestClaims { + exp: u64, + aud: String, + agentkeys: AgentKeysClaims, +} + +#[derive(Debug, Serialize, Deserialize)] +struct AgentKeysClaims { + omni_account: String, +} + +/// Mint a valid JWT for `omni_account` with a TTL of 300s. +fn mint_test_jwt(enc: &EncodingKey, omni_account: &str) -> String { + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_secs(); + let claims = TestClaims { + exp: now + 300, + aud: "agentkeys:broker".to_string(), + agentkeys: AgentKeysClaims { + omni_account: omni_account.to_string(), + }, + }; + let mut header = Header::new(Algorithm::ES256); + header.kid = Some("ak-session-test".to_string()); + encode(&header, &claims, enc).expect("encode jwt") +} + +/// Mint an expired JWT (exp in the past). 
+fn mint_expired_jwt(enc: &EncodingKey, omni_account: &str) -> String { + let claims = TestClaims { + exp: 1_000_000_001, // 2001 — always in the past + aud: "agentkeys:broker".to_string(), + agentkeys: AgentKeysClaims { + omni_account: omni_account.to_string(), + }, + }; + let mut header = Header::new(Algorithm::ES256); + header.kid = Some("ak-session-test".to_string()); + encode(&header, &claims, enc).expect("encode expired jwt") +} + +// ── Router helpers ───────────────────────────────────────────────────────── + +fn router_without_signer() -> Router { + let conn = rusqlite::Connection::open_in_memory().unwrap(); + db::init_schema(&conn).unwrap(); + let state = Arc::new(AppState::new(conn)); + create_router(state) +} + +fn router_with_signer(master_secret: [u8; 32]) -> Router { + let conn = rusqlite::Connection::open_in_memory().unwrap(); + db::init_schema(&conn).unwrap(); + let signer = DevKeyService::from_master_secret(master_secret); + let state = Arc::new(AppState::new(conn).with_dev_signer(Some(signer))); + create_router(state) +} + +/// Build a signer-only router with JWT auth enabled. 
+fn router_signer_only_with_auth( + master_secret: [u8; 32], + dec: DecodingKey, +) -> Router { + let conn = rusqlite::Connection::open_in_memory().unwrap(); + db::init_schema(&conn).unwrap(); + let signer = DevKeyService::from_master_secret(master_secret); + let state = Arc::new( + AppState::new(conn) + .with_dev_signer(Some(signer)) + .with_broker_session_pubkey(Some(dec)), + ); + create_signer_router(state) +} + +async fn post_json(app: Router, path: &str, body: Value) -> (StatusCode, Value) { + post_json_with_header(app, path, body, None).await +} + +async fn post_json_with_header( + app: Router, + path: &str, + body: Value, + authorization: Option<&str>, +) -> (StatusCode, Value) { + let mut builder = Request::builder() + .method(Method::POST) + .uri(path) + .header("content-type", "application/json"); + if let Some(auth) = authorization { + builder = builder.header("authorization", auth); + } + let req = builder + .body(Body::from(serde_json::to_string(&body).unwrap())) + .unwrap(); + let resp = app.oneshot(req).await.unwrap(); + let status = resp.status(); + let bytes = resp.into_body().collect().await.unwrap().to_bytes(); + let json: Value = serde_json::from_slice(&bytes).unwrap_or(Value::Null); + (status, json) +} + +fn fixed_omni() -> String { + "ab".repeat(32) +} + +// ── Original tests (no JWT auth — legacy router) ─────────────────────────── + +#[tokio::test] +async fn derive_address_returns_503_when_signer_disabled() { + let app = router_without_signer(); + let (status, body) = post_json( + app, + "/dev/derive-address", + json!({ "omni_account": fixed_omni() }), + ) + .await; + assert_eq!(status, StatusCode::SERVICE_UNAVAILABLE); + assert_eq!(body["error"], "signer_disabled"); + assert!(body["message"] + .as_str() + .unwrap() + .contains("DEV_KEY_SERVICE_MASTER_SECRET")); +} + +#[tokio::test] +async fn sign_message_returns_503_when_signer_disabled() { + let app = router_without_signer(); + let (status, body) = post_json( + app, + "/dev/sign-message", 
+ json!({ + "omni_account": fixed_omni(), + "message_hex": hex::encode(b"hello"), + }), + ) + .await; + assert_eq!(status, StatusCode::SERVICE_UNAVAILABLE); + assert_eq!(body["error"], "signer_disabled"); +} + +#[tokio::test] +async fn derive_address_is_deterministic_across_calls() { + let master = [0x42u8; 32]; + let omni = fixed_omni(); + + let (s1, b1) = post_json( + router_with_signer(master), + "/dev/derive-address", + json!({ "omni_account": omni }), + ) + .await; + let (s2, b2) = post_json( + router_with_signer(master), + "/dev/derive-address", + json!({ "omni_account": omni }), + ) + .await; + assert_eq!(s1, StatusCode::OK); + assert_eq!(s2, StatusCode::OK); + assert_eq!(b1["address"], b2["address"]); + let addr = b1["address"].as_str().unwrap(); + assert!(addr.starts_with("0x")); + assert_eq!(addr.len(), 42); + assert_eq!(addr, addr.to_lowercase()); + assert_eq!(b1["key_version"], 1); +} + +#[tokio::test] +async fn derive_address_rejects_short_omni() { + let app = router_with_signer([0u8; 32]); + let (status, body) = post_json( + app, + "/dev/derive-address", + json!({ "omni_account": "deadbeef" }), + ) + .await; + assert_eq!(status, StatusCode::BAD_REQUEST); + assert_eq!(body["error"], "invalid_omni_account"); +} + +#[tokio::test] +async fn sign_message_address_matches_derive_response() { + let master = [0x33u8; 32]; + let omni = fixed_omni(); + + let (s1, derive) = post_json( + router_with_signer(master), + "/dev/derive-address", + json!({ "omni_account": omni }), + ) + .await; + let (s2, sign) = post_json( + router_with_signer(master), + "/dev/sign-message", + json!({ + "omni_account": omni, + "message_hex": hex::encode(b"siwe-test"), + }), + ) + .await; + assert_eq!(s1, StatusCode::OK); + assert_eq!(s2, StatusCode::OK); + assert_eq!(derive["address"], sign["address"]); + assert_eq!(derive["key_version"], sign["key_version"]); +} + +#[tokio::test] +async fn sign_message_returns_canonical_65_byte_signature() { + let app = router_with_signer([0u8; 32]); + 
let (status, body) = post_json( + app, + "/dev/sign-message", + json!({ + "omni_account": fixed_omni(), + "message_hex": hex::encode(b"hello"), + }), + ) + .await; + assert_eq!(status, StatusCode::OK); + let sig = body["signature"].as_str().unwrap(); + assert!(sig.starts_with("0x")); + let raw = hex::decode(sig.trim_start_matches("0x")).unwrap(); + assert_eq!(raw.len(), 65); + let v = raw[64]; + assert!(v == 0 || v == 1, "v byte must be canonical {{0,1}}, got {v}"); +} + +#[tokio::test] +async fn sign_message_rejects_invalid_message_hex() { + let app = router_with_signer([0u8; 32]); + let (status, body) = post_json( + app, + "/dev/sign-message", + json!({ + "omni_account": fixed_omni(), + "message_hex": "not-hex-zzz", + }), + ) + .await; + assert_eq!(status, StatusCode::BAD_REQUEST); + assert_eq!(body["error"], "invalid_message_hex"); +} + +#[tokio::test] +async fn different_master_secrets_produce_different_addresses() { + let omni = fixed_omni(); + let (_, a) = post_json( + router_with_signer([0x11u8; 32]), + "/dev/derive-address", + json!({ "omni_account": omni }), + ) + .await; + let (_, b) = post_json( + router_with_signer([0x22u8; 32]), + "/dev/derive-address", + json!({ "omni_account": omni }), + ) + .await; + assert_ne!(a["address"], b["address"]); +} + +// ── JWT bearer auth tests (signer-only router) ───────────────────────────── + +#[tokio::test] +async fn signer_only_missing_jwt_returns_401_unauthorized() { + let (enc, dec) = gen_ec_keypair(); + let _ = enc; // generated but only dec used here + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let (status, body) = post_json( + app, + "/dev/derive-address", + json!({ "omni_account": fixed_omni() }), + ) + .await; + assert_eq!(status, StatusCode::UNAUTHORIZED); + assert_eq!(body["error"], "unauthorized"); + assert!(body["message"].as_str().unwrap().contains("Authorization")); +} + +#[tokio::test] +async fn signer_only_valid_jwt_matching_omni_returns_200() { + let (enc, dec) = gen_ec_keypair(); 
+ let omni = fixed_omni(); + let jwt = mint_test_jwt(&enc, &omni); + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let (status, body) = post_json_with_header( + app, + "/dev/derive-address", + json!({ "omni_account": omni }), + Some(&format!("Bearer {jwt}")), + ) + .await; + assert_eq!(status, StatusCode::OK, "body: {body:?}"); + assert!(body["address"].as_str().unwrap().starts_with("0x")); +} + +#[tokio::test] +async fn signer_only_wrong_jwt_returns_401() { + let (_enc, dec) = gen_ec_keypair(); + let (wrong_enc, _wrong_dec) = gen_ec_keypair(); + let omni = fixed_omni(); + let jwt = mint_test_jwt(&wrong_enc, &omni); + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let (status, body) = post_json_with_header( + app, + "/dev/derive-address", + json!({ "omni_account": omni }), + Some(&format!("Bearer {jwt}")), + ) + .await; + assert_eq!(status, StatusCode::UNAUTHORIZED); + assert_eq!(body["error"], "unauthorized"); +} + +#[tokio::test] +async fn signer_only_expired_jwt_returns_401() { + let (enc, dec) = gen_ec_keypair(); + let omni = fixed_omni(); + let jwt = mint_expired_jwt(&enc, &omni); + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let (status, body) = post_json_with_header( + app, + "/dev/derive-address", + json!({ "omni_account": omni }), + Some(&format!("Bearer {jwt}")), + ) + .await; + assert_eq!(status, StatusCode::UNAUTHORIZED); + assert_eq!(body["error"], "unauthorized"); +} + +#[tokio::test] +async fn signer_only_omni_mismatch_returns_401() { + let (enc, dec) = gen_ec_keypair(); + let omni = fixed_omni(); + let different_omni = "cd".repeat(32); + let jwt = mint_test_jwt(&enc, &different_omni); // JWT claims different omni + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let (status, body) = post_json_with_header( + app, + "/dev/derive-address", + json!({ "omni_account": omni }), // body uses original omni — mismatch + Some(&format!("Bearer {jwt}")), + ) + .await; + assert_eq!(status, 
StatusCode::UNAUTHORIZED); + assert_eq!(body["error"], "unauthorized"); + assert!(body["message"] + .as_str() + .unwrap() + .contains("omni_account")); +} + +#[tokio::test] +async fn signer_only_valid_jwt_sign_message_returns_200() { + let (enc, dec) = gen_ec_keypair(); + let omni = fixed_omni(); + let jwt = mint_test_jwt(&enc, &omni); + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let (status, body) = post_json_with_header( + app, + "/dev/sign-message", + json!({ + "omni_account": omni, + "message_hex": hex::encode(b"test-message"), + }), + Some(&format!("Bearer {jwt}")), + ) + .await; + assert_eq!(status, StatusCode::OK, "body: {body:?}"); + assert!(body["signature"].as_str().unwrap().starts_with("0x")); +} + +#[tokio::test] +async fn signer_only_healthz_needs_no_jwt() { + let (_enc, dec) = gen_ec_keypair(); + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let req = Request::builder() + .method(Method::GET) + .uri("/healthz") + .body(Body::empty()) + .unwrap(); + let resp = app.oneshot(req).await.unwrap(); + assert_eq!(resp.status(), StatusCode::OK); +} + +#[tokio::test] +async fn signer_only_session_endpoint_absent() { + let (_enc, dec) = gen_ec_keypair(); + let app = router_signer_only_with_auth([0x42u8; 32], dec); + let req = Request::builder() + .method(Method::POST) + .uri("/session/create") + .header("content-type", "application/json") + .body(Body::from("{}")) + .unwrap(); + let resp = app.oneshot(req).await.unwrap(); + // signer-only router has no /session route → 404 + assert_eq!(resp.status(), StatusCode::NOT_FOUND); +} diff --git a/docs/archived/README.md b/docs/archived/README.md index 2361332..1ea199c 100644 --- a/docs/archived/README.md +++ b/docs/archived/README.md @@ -9,6 +9,9 @@ Superseded by the current top-level docs: | `development-stages-v1-2026-04.md` (1623 lines, Stage 0→9 full history) | [`../spec/plans/development-stages.md`](../spec/plans/development-stages.md) — concise Shipped/Active/Planned summary | | 
`manual-test-stage4.md`, `manual-test-stage5.md`, `manual-test-stage6.md`, `stage5-workspace-email-setup.md` | [`../dev-setup.md`](../dev-setup.md) — single developer onboarding + demo guide | | `manual-test-issue-{12..17}.md`, `manual-test-report-issues-12-17.md` | One-shot per-issue manual tests from Stage 4 — results folded into the Stage 4 test suite; kept for audit trail only | +| `operator-runbook-pre-stage7.md` (was `../operator-runbook.md`) | [`../operator-runbook-stage7.md`](../operator-runbook-stage7.md) — Stage-7+ broker (post-issue-#71 OIDC-only mints, post-issue-#74-step-1 dev_key_service signer) | +| `contradictions-stage4-2026-04.md` (was `../contradictions.md`) | Audit snapshot taken 2026-04-14 against Stage-4-implementation-complete + 17 open issues. The decisions it captured have either landed or been re-scoped; no live successor — Stage 7+ design discussions live under [`../spec/plans/issue-64/`](../spec/plans/issue-64/) and [`../spec/plans/issue-74-dev-key-service-plan.md`](../spec/plans/issue-74-dev-key-service-plan.md) | +| `field-name-translation.md` (was `../field-name-translation.md`) | Stage-4-keychain-output design note. 
Subsumed by the Stage-7 daemon's session/wallet representation; kept for the historical "why we sed-pretty-printed `security(1)`" reasoning | ## Archive policy diff --git a/docs/contradictions.md b/docs/archived/contradictions-stage4-2026-04.md similarity index 100% rename from docs/contradictions.md rename to docs/archived/contradictions-stage4-2026-04.md diff --git a/docs/field-name-translation.md b/docs/archived/field-name-translation.md similarity index 100% rename from docs/field-name-translation.md rename to docs/archived/field-name-translation.md diff --git a/docs/operator-runbook.md b/docs/archived/operator-runbook-pre-stage7.md similarity index 100% rename from docs/operator-runbook.md rename to docs/archived/operator-runbook-pre-stage7.md diff --git a/docs/stage7-wip.md b/docs/archived/stage7-wip-pre-arch-rewrite.md similarity index 98% rename from docs/stage7-wip.md rename to docs/archived/stage7-wip-pre-arch-rewrite.md index 22cdf8c..311f00d 100644 --- a/docs/stage7-wip.md +++ b/docs/archived/stage7-wip-pre-arch-rewrite.md @@ -27,7 +27,7 @@ Both `mint-*` endpoints write a row to the broker's append-only SQLite audit DB ## Configuration -The broker reads AWS credentials from the SDK default chain (instance profile → named profile → static keys, in that order). See [`operator-runbook.md` §2](./operator-runbook.md#2-aws-credentials) for the full credential story. +The broker reads AWS credentials from the SDK default chain (instance profile → named profile → static keys, in that order). See [`operator-runbook-stage7.md`](./operator-runbook-stage7.md) for the full credential story. | Env var | Default | Notes | |---|---|---| @@ -241,7 +241,7 @@ If `.issuer` doesn't match the URL byte-for-byte, fix `BROKER_OIDC_ISSUER` on th ## Operations -- **Start, supervise, rotate, audit** → [`operator-runbook.md`](./operator-runbook.md). +- **Start, supervise, rotate, audit** → [`operator-runbook-stage7.md`](./operator-runbook-stage7.md). 
- **Cloud-account provisioning + OIDC federation** → [`cloud-setup.md`](./cloud-setup.md). - **Don't expose `:8091` ingress.** Host firewall must drop `:8091` from anywhere except `127.0.0.1`. Nginx is the only legitimate caller. - **Cert renewal.** Certbot's renewal timer ships with the package (`sudo systemctl list-timers | grep certbot`). AWS doesn't pin the cert; thumbprint persistence comes from the LE intermediate CA. diff --git a/docs/cloud-setup.md b/docs/cloud-setup.md index 686ddbc..f1b8398 100644 --- a/docs/cloud-setup.md +++ b/docs/cloud-setup.md @@ -13,7 +13,8 @@ The runbook is split by concern, not by stage: | [§3 IAM users + role](#3-iam-identities) | `agentkeys-{admin,broker,daemon}` + `agentkeys-data-role` | Once per account | | [§4 OIDC federation](#4-oidc-federation-stage-7) | Register the broker as an OIDC provider, swap to PrincipalTag-scoped trust | After §1–§3 + a publicly-reachable broker | | [§5 EC2 broker host](#5-ec2-broker-host-optional) | EIP, A record, security group | Only if you're hosting the broker on AWS | -| [§6 Cleanup](#6-cleanup) | Tear-down recipe | When you want to delete it all | +| [§6 Signer host](#6-signer-host) | DNS A record + TLS cert + nginx flip for `signer.` | After §5 — needs `$EIP` | +| [§7 Cleanup](#7-cleanup) | Tear-down recipe | When you want to delete it all | **Cloud-portability:** §1 (DNS) and §2 (inbound mail) are the cloud-replaceable layers — Tencent Cloud SimpleDM + COS would slot in here unchanged at the §3+ boundary. See [§2.2](#22-future-tencent-cloud-simpledm--cos). @@ -96,6 +97,10 @@ aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \ Done as part of [§5 EC2 broker host](#5-ec2-broker-host-optional), once you know the host's public IP. If the broker lives outside AWS (DigitalOcean, Hetzner, etc.), upsert the A record now using the host's static IP — the rest of the runbook is identical. 
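For a broker hosted outside AWS, the A-record upsert described here could look roughly like the sketch below. It reuses the `change-resource-record-sets` call with an `UPSERT` action and a 300-second TTL, as seen elsewhere in this runbook. `STATIC_IP`, the example hostname, and the example address are illustrative assumptions, not values from this runbook, and the `aws` call is left commented out so the snippet only prints the change batch.

```bash
# Assumed placeholders: replace with your real broker hostname and static IP.
BROKER_HOST="broker.example.com"
STATIC_IP="203.0.113.10"   # hypothetical variable; 203.0.113.0/24 is a documentation-only range

# Build the Route 53 change batch. UPSERT creates the record or overwrites
# an existing one, so re-running with the same values is harmless.
CHANGE_BATCH=$(printf '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"%s.","Type":"A","TTL":300,"ResourceRecords":[{"Value":"%s"}]}}]}' \
  "$BROKER_HOST" "$STATIC_IP")
echo "$CHANGE_BATCH"

# Apply once the values are real (needs the admin profile and $PARENT_ZONE_ID):
# aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \
#   --change-batch "$CHANGE_BATCH"
```

Printing the batch before applying it makes it easy to confirm the trailing dot on the record name and to check that the address is a routable public IP rather than a private or proxy-assigned one.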
+### 1.3 Signer subdomain — A record + TLS cert (issue #74 step 1b) + +Done as part of [§6 Signer host](#6-signer-host), once `$EIP` is known from [§5.1](#51-allocate--attach-an-elastic-ip). + --- ## 2. Inbound mail backend @@ -129,11 +134,11 @@ aws s3api create-bucket \ --region "$REGION" --bucket "$BUCKET" \ $([ "$REGION" != "us-east-1" ] && echo "--create-bucket-configuration LocationConstraint=$REGION") -aws s3api put-public-access-block --bucket "$BUCKET" \ +aws s3api put-public-access-block --region "$REGION" --bucket "$BUCKET" \ --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true # 30-day TTL on inbound objects (throwaway-inbox model) -aws s3api put-bucket-lifecycle-configuration --bucket "$BUCKET" \ +aws s3api put-bucket-lifecycle-configuration --region "$REGION" --bucket "$BUCKET" \ --lifecycle-configuration "$(jq -n '{ Rules: [{ID:"inbound-30d-ttl", Status:"Enabled", Filter:{Prefix:"inbound/"}, Expiration:{Days:30}}] }')" @@ -263,12 +268,122 @@ aws ec2 associate-iam-instance-profile --region "$REGION" \ --iam-instance-profile Name=$ROLE_NAME ``` +### 3.4a `ses:SendEmail` grant on the broker's runtime role (Pass 2 prereq) + +The broker calls SES v2 `SendEmail` with its **own** runtime credentials +(instance profile), NOT via the assumed `agentkeys-data-role`. Without +`ses:SendEmail` on the broker's role the operator hits: + +``` +broker rejected /v1/auth/email/request: status=502 body= +{"error":"backend_unreachable","message":"… ses SendEmail: + unhandled error (AccessDeniedException)"} +``` + +The IAM action is `ses:SendEmail` (sesv2) — NOT `ses:SendRawEmail` (v1 +only; different code path the broker doesn't use). + +**Step 1: discover the actual role name attached to your broker host.** +The canonical name is `agentkeys-broker-host` (created by §3.4 above). 
+The discovery command below stays as-is so the runbook is robust to +operators who landed on a non-canonical name during early provisioning +(historically: `S3-full-access`, fully retired 2026-05-12 via the role +rename in [PR #75 follow-up](#)). Find it: + +```bash +# REQUIRED: admin profile + operator env loaded. +awsp agentkeys-admin +set -a; source scripts/operator-workstation.env; set +a + +# CRITICAL: pass --region "$REGION". The agentkeys-admin profile +# defaults to us-west-2, but the broker EC2 lives in us-east-1 (from +# operator-workstation.env). Without --region, describe-instances +# searches us-west-2, finds nothing, returns empty silently (no error), +# and the downstream put-role-policy silently runs with --role-name "". +# See CLAUDE.md → AWS local-profile ↔ remote-IAM mapping. +INSTANCE_PROFILE_ARN=$(aws ec2 describe-instances \ + --region "$REGION" \ + --filters "Name=ip-address,Values=$EIP" \ + --query 'Reservations[].Instances[].IamInstanceProfile.Arn' \ + --output text) + +if [[ -z "$INSTANCE_PROFILE_ARN" || "$INSTANCE_PROFILE_ARN" == "None" ]]; then + echo "ABORT: no EC2 instance with EIP=$EIP found in region $REGION." >&2 + echo "Caller: $(aws sts get-caller-identity --query Arn --output text)" >&2 + unset ROLE +else + ROLE=$(aws iam get-instance-profile \ + --instance-profile-name "${INSTANCE_PROFILE_ARN##*/}" \ + --query 'InstanceProfile.Roles[0].RoleName' --output text) + echo "broker runtime role: $ROLE" +fi +``` + +**Step 2: grant `ses:SendEmail` + `ses:GetEmailIdentity` (least-privilege).** + +The broker calls `ses:GetEmailIdentity` at startup via `verify_sender_ready` +to confirm the sender is verified, and `ses:SendEmail` per request. +Both grants are scoped to the verified domain identity (and any +per-address subset) — nothing wider. 
+ +```bash +aws iam put-role-policy --role-name "$ROLE" \ + --policy-name BrokerSendEmail \ + --policy-document "$(jq -n \ + --arg region "$REGION" --arg acct "$ACCOUNT_ID" --arg domain "$MAIL_DOMAIN" '{ + Version: "2012-10-17", + Statement: [{ + Effect: "Allow", + Action: ["ses:SendEmail", "ses:GetEmailIdentity"], + Resource: [ + "arn:aws:ses:\($region):\($acct):identity/\($domain)", + "arn:aws:ses:\($region):\($acct):identity/*@\($domain)" + ] + }] + }')" +``` + +No broker restart needed — sesv2 picks up creds per-call. Verify: + +```bash +aws iam get-role-policy --role-name "$ROLE" --policy-name BrokerSendEmail \ + --query 'PolicyDocument.Statement[*].Action' +# → [["ses:SendEmail", "ses:GetEmailIdentity"]] +``` + +**Step 3 (security audit): strip any over-broad legacy attached policies.** + +Some legacy deploys ship with `AmazonS3FullAccess` (or similar wide +permissions) attached to the broker's instance role from initial +provisioning. The broker process at runtime ONLY uses `aws-sdk-sts` +(STS GetCallerIdentity startup probe) + `aws-sdk-sesv2` (this section's +grants) — it never accesses S3 with its own creds. Per-user S3 access +is via JWT-assumed `agentkeys-data-role` (§3.2), NOT the broker's +runtime role. + +A broker compromise with `AmazonS3FullAccess` would expose every +inbound email in the SES bucket (verification tokens, magic links, +user-data buckets if any). Strip it: + +```bash +# List currently attached policies on the broker's role: +aws iam list-attached-role-policies --role-name "$ROLE" + +# Detach AmazonS3FullAccess if present: +aws iam detach-role-policy --role-name "$ROLE" \ + --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess + +# Verify only BrokerSendEmail (inline, this section) remains: +aws iam list-role-policies --role-name "$ROLE" # → ["BrokerSendEmail"] +aws iam list-attached-role-policies --role-name "$ROLE" # → [] +``` + ### 3.5 S3 bucket policy Now that `agentkeys-data-role` exists, attach the bucket policy. 
The static-IAM-user variant: SES writes inbound, role reads everything. ```bash -aws s3api put-bucket-policy --bucket "$BUCKET" \ +aws s3api put-bucket-policy --region "$REGION" --bucket "$BUCKET" \ --policy "$(jq -n --arg bucket "$BUCKET" --arg acct "$ACCOUNT_ID" '{ Version: "2012-10-17", Statement: [ @@ -380,7 +495,7 @@ Replaces `AllowDaemonRead` from §3.5. The cloud now enforces "the assumed sessi The daemon's read perms split into two statements because `s3:prefix` is a request-time condition that **only applies to `s3:ListBucket`** (the prefix filter on listings) — `s3:GetObject` doesn't carry a prefix parameter, so combining the two actions under one `s3:prefix` condition triggers `MalformedPolicy: Conditions do not apply to combination of actions and resources in statement`. For `GetObject` the resource ARN itself enforces the prefix via `${aws:PrincipalTag/...}` expansion. ```bash -aws s3api put-bucket-policy --bucket "$BUCKET" \ +aws s3api put-bucket-policy --region "$REGION" --bucket "$BUCKET" \ --policy "$(jq -n --arg bucket "$BUCKET" --arg acct "$ACCOUNT_ID" '{ Version: "2012-10-17", Statement: [ @@ -397,20 +512,31 @@ aws s3api put-bucket-policy --bucket "$BUCKET" \ Action: "s3:ListBucket", Resource: "arn:aws:s3:::\($bucket)", Condition: { - StringLike: {"s3:prefix": "${aws:PrincipalTag/agentkeys_user_wallet}/*"} + StringLike: {"s3:prefix": "bots/${aws:PrincipalTag/agentkeys_user_wallet}/*"} } }, { Sid: "AllowDaemonGetOwnObjects", Effect: "Allow", Principal: {AWS: "arn:aws:iam::\($acct):role/agentkeys-data-role"}, Action: "s3:GetObject", - Resource: "arn:aws:s3:::\($bucket)/${aws:PrincipalTag/agentkeys_user_wallet}/*" + Resource: "arn:aws:s3:::\($bucket)/bots/${aws:PrincipalTag/agentkeys_user_wallet}/*" } ] }')" ``` -`StringLike "${tag}/*"` (not `StringEquals "${tag}/"`) lets the daemon list sub-prefixes like `/inbox/` and `/sent/2026-05/`, not just the exact root `/`. 
Matches the shape in [`docs/spec/ses-email-architecture.md` §10.4](spec/ses-email-architecture.md) and [`wiki/tag-based-access`](../wiki/tag-based-access.md). +**`bots/` is the per-actor data namespace** — sibling to SES's +`inbound/`, and to future system prefixes like `audit/`, `dkim/`, +`config/`. Keeping every actor's data under a single parent prefix +lets lifecycle rules, encryption defaults, replication, and ops audits +scope cleanly to "user data" without sweeping in system prefixes. +Matches arch.md §6 (`bots/A/file` in the runtime sequence diagram). +Both the policy resource ARN (`bucket/bots/${tag}/*`) and the +`s3:prefix` condition (`bots/${tag}/*`) carry the `bots/` parent — +omit it on either and the other half of the policy denies even legit +reads. + +`StringLike "bots/${tag}/*"` (not `StringEquals "bots/${tag}/"`) lets the daemon list sub-prefixes like `bots//inbox/` and `bots//sent/2026-05/`, not just the exact root `bots//`. Matches the shape in [`docs/spec/ses-email-architecture.md` §10.4](spec/ses-email-architecture.md) and [`wiki/tag-based-access`](../wiki/tag-based-access.md). ### 4.4.1 Strip the §3 broad-bucket grant from the role's inline policy @@ -612,7 +738,84 @@ The script writes systemd units, an HTTP-only nginx config, then prints the cert --- -## 6. Cleanup +## 6. Signer host + +| Concern | Today | Future | +|---|---|---| +| Process | `agentkeys-signer.service` (Rust, `agentkeys-mock-server --signer-only`, loopback `:8092`) | TEE worker (issue #74 step 2) | +| Host | **Same EC2 box as the broker** — co-located behind the same nginx, provisioned by the same `setup-broker-host.sh` run | Separate machine (or enclave); only the A record + cert move | +| Public hostname | `signer.` (e.g. 
`signer.litentry.org`) — exported as `SIGNER_HOST` / `AGENTKEYS_SIGNER_URL` in [`scripts/operator-workstation.env`](../scripts/operator-workstation.env) | `signer.` (unchanged) | +| Endpoints | `/dev/derive-address`, `/dev/sign-message`, `/healthz` only — every request bearer-JWT-authed against the broker session pubkey ([`signer-protocol.md`](spec/signer-protocol.md)) | unchanged | +| Master secret (K3) | `/etc/agentkeys/dev-key-service.env` (mode 0600, owner `agentkeys`) — auto-generated on first `setup-broker-host.sh` run, **never rotated** (rotation invalidates every previously-derived wallet) | TEE-sealed; same wire shape | + +### 6.1 DNS A record + +```bash +# === ON OPERATOR WORKSTATION === +SIGNER_HOST="signer.${BROKER_HOST#*.}" + +# If $EIP isn't already set from §5.1, re-derive from AWS — NEVER from +# `dig`. Local resolvers behind Cloudflare WARP / Zscaler / Tailscale / +# corporate VPNs return RFC 2544 "TEST-NET-2" (198.18.0.0/15) for +# proxied hostnames, which silently breaks Let's Encrypt validation. +[ -z "$EIP" ] && EIP=$(aws ec2 describe-addresses --region "$REGION" \ + --query 'Addresses[?AssociationId!=`null`].PublicIp' --output text) +echo "EIP=$EIP" # MUST be a routable public IP, not 198.18.x.x / 10.x.x.x / 100.64.x.x + +aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \ + --change-batch "$(jq -n --arg name "${SIGNER_HOST}." --arg ip "$EIP" '{ + Changes: [{Action:"UPSERT", ResourceRecordSet:{Name:$name, Type:"A", TTL:300, ResourceRecords:[{Value:$ip}]}}] + }')" + +# Verify via Cloudflare DoH (your local resolver will keep lying if proxied). 
+until [ "$(curl -s "https://cloudflare-dns.com/dns-query?name=${SIGNER_HOST}&type=A" \ + -H 'accept: application/dns-json' | jq -r '.Answer[0].data')" = "$EIP" ]; do + echo "waiting for Route 53 propagation (TTL 300s)…"; sleep 5 +done +echo "DNS ready: ${SIGNER_HOST} → ${EIP}" +``` + +### 6.2 TLS cert + nginx flip + +> **`$SIGNER_HOST` is laptop-only** (lives in `operator-workstation.env`). +> On the broker host, derive it from the nginx vhost that `setup-broker-host.sh` +> just wrote — the snippet below does it inline so the commands work in a +> fresh broker shell with no env vars set. + +```bash +# === ON BROKER HOST === +# 1. First pass writes the HTTP-only nginx vhost for signer.. +sudo bash scripts/setup-broker-host.sh --yes + +# Sanity-check + read the hostname back out of the vhost. +ls /etc/nginx/sites-enabled/agentkeys-signer +SIGNER_HOST=$(awk '/server_name/ && /signer\./ {gsub(";",""); print $2}' \ + /etc/nginx/sites-available/agentkeys-signer | head -1) +echo "SIGNER_HOST=$SIGNER_HOST" + +# 2. Issue the LE cert. If the prompt only lists broker., the +# signer vhost wasn't written — re-pull + re-run step 1. +sudo certbot --nginx -d "$SIGNER_HOST" + +# 3. Re-run to flip the signer vhost onto :443 ssl. +sudo bash scripts/setup-broker-host.sh --yes +``` + +### 6.3 Verify + +```bash +# === ON OPERATOR WORKSTATION === +curl -sS "https://$SIGNER_HOST/healthz" +# ok + +# Defense-in-depth: signer vhost rejects everything except /dev/* + /healthz. +curl -sS -o /dev/null -w '%{http_code}\n' "https://$SIGNER_HOST/session/create" +# 404 +``` + +--- + +## 7. 
Cleanup ```bash # OIDC federation (if §4 ran) @@ -638,7 +841,7 @@ aws iam delete-role --role-name agentkeys-broker-host 2>/dev/null aws ses set-active-receipt-rule-set --rule-set-name "" --region "$REGION" aws sesv2 delete-email-identity --region "$REGION" --email-identity "$DOMAIN" aws s3 rm "s3://$BUCKET" --recursive -aws s3api delete-bucket --bucket "$BUCKET" +aws s3api delete-bucket --region "$REGION" --bucket "$BUCKET" # DNS records on the parent zone are NOT auto-deleted — you'll need to # remove the DKIM CNAMEs, MX, SPF, DMARC, and broker A record by hand diff --git a/docs/dev-setup.md b/docs/dev-setup.md index e4edc1e..e4d5f98 100644 --- a/docs/dev-setup.md +++ b/docs/dev-setup.md @@ -145,7 +145,7 @@ Run through [`cloud-setup.md`](./cloud-setup.md) §1–§3 once per AWS account. - S3 bucket `agentkeys-mail-` with receipt rule writing inbound to `inbound/` - Route 53 records: three DKIM CNAMEs, MX, SPF, DMARC -Manage the daemon user's long-lived AWS keys via a **named profile** in `~/.aws/credentials` (mode 0600). The broker uses the AWS SDK's default credential chain — `AWS_PROFILE` (set by `awsp` or your shell), the shared credentials file, or an EC2 instance profile via IMDS. **No long-lived AWS keys live in env vars.** See [`operator-runbook.md` §2](./operator-runbook.md#2-aws-credentials) for the full credential story. +Manage the daemon user's long-lived AWS keys via a **named profile** in `~/.aws/credentials` (mode 0600). The broker uses the AWS SDK's default credential chain — `AWS_PROFILE` (set by `awsp` or your shell), the shared credentials file, or an EC2 instance profile via IMDS. **No long-lived AWS keys live in env vars.** See [`operator-runbook-stage7.md`](./operator-runbook-stage7.md) for the full credential story. ### 5.2 Run the broker server @@ -173,7 +173,7 @@ The broker: 3. Returns 1-hour temp creds to the caller. 4. Logs every mint to `BROKER_AUDIT_DB_PATH` (SQLite, one row per mint). 
-For runbook detail (start / supervise / rotate / monitor / migrate to hosted), see [`docs/operator-runbook.md`](./operator-runbook.md). +For runbook detail (start / supervise / rotate / monitor / migrate to hosted), see [`docs/operator-runbook-stage7.md`](./operator-runbook-stage7.md). For the automated remote-host bootstrap, see [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh). ### 5.3 Hand off bearer tokens to your developers @@ -256,7 +256,7 @@ The longer-term plan (Stage 5b) is to detect drift automatically from telemetry - [`spec/plans/development-stages.md`](./spec/plans/development-stages.md) — Shipped / Active / Planned roadmap - [`cloud-setup.md`](./cloud-setup.md) — one-time AWS infra (DNS, SES, S3, IAM, OIDC federation) - [`stage7-wip.md`](./stage7-wip.md) — broker server design + acceptance test -- [`operator-runbook.md`](./operator-runbook.md) — start, supervise, rotate, monitor the broker +- [`operator-runbook-stage7.md`](./operator-runbook-stage7.md) — start, supervise, rotate, monitor the broker - [`spec/credential-backend-interface.md`](./spec/credential-backend-interface.md) — 15-method trait contract - [`spec/ses-email-architecture.md`](./spec/ses-email-architecture.md) — Stage 6 email pipeline deep-dive - [`spec/threat-model-key-custody.md`](./spec/threat-model-key-custody.md) — what the broker is defending against diff --git a/docs/spec/architecture.md b/docs/spec/architecture.md index b3d3d11..9380114 100644 --- a/docs/spec/architecture.md +++ b/docs/spec/architecture.md @@ -1,384 +1,738 @@ -# AgentKeys — Component Architecture and Language Choices +# AgentKeys — Architecture (broker, signer, daemon, key flows) + +**Audience:** anyone who needs to reason about AgentKeys end-to-end — +new contributors, security reviewers, ops, design partners. Use this +as the single visual + textual reference. Diagrams are Mermaid where +possible so they render in GitHub and copy cleanly into Figma. + +**Status:** canonical (post-issue-#74). 
Supersedes `docs/stage7-wip.md` +(archived). Component inventory and language choices were absorbed +from the prior `architecture.md` revision. + +**Companion docs (canonical for their narrow surface; this doc links +to them rather than duplicating):** + +- [`signer-protocol.md`](signer-protocol.md) — `/dev/*` wire contract +- [`threat-model-key-custody.md`](threat-model-key-custody.md) — + retroactive-confidentiality + key custody position +- [`heima-gaps-vs-desired-architecture.md`](heima-gaps-vs-desired-architecture.md) + — what current-Heima is missing vs the desired AgentKeys + architecture +- [`credential-backend-interface.md`](credential-backend-interface.md) + — 15-method `CredentialBackend` trait +- [`plans/issue-74-dev-key-service-plan.md`](plans/issue-74-dev-key-service-plan.md) + — dev_key_service signer (issue #74 step 1) +- [`plans/issue-74-step-1c-device-key-auth.md`](plans/issue-74-step-1c-device-key-auth.md) + — device-key auth on `/dev/*` (issue #74 step 1c, planned) -**Date:** 2026-04-09 (revised against ceo-plan.md Round 13 runtime reality check) -**Scope:** Cross-cutting architecture document covering all components of AgentKeys, the language chosen for each, the trust boundaries between them, and the Cargo workspace layout. +--- -**Parent docs (read first for context):** -- [`./design-spec.md`](design-spec.md) — product vision, MVP criteria, why Rust end-to-end was chosen -- [`/Users/hanwencheng/Projects/project-life/.omc/specs/deep-interview-agentkeys.md`](../../../../.omc/specs/deep-interview-agentkeys.md) — full prior-interview spec (11 rounds, 19% ambiguity, PASSED) +## 1. Component map + +```mermaid +flowchart LR + subgraph WS["Operator workstation"] + CLI["agentkeys CLI
(Rust)"] + end + + subgraph SBX["Agent sandbox"] + DMN["agentkeys-daemon<br/>(Rust, MCP server)"] + PRV["provisioner orchestrator<br/>(Rust)"] + BRO["browser scraper<br/>(TypeScript + Playwright)"] + DMN -->|spawns subprocess| PRV + PRV -->|spawns subprocess| BRO + end + + subgraph BH["Broker host (EC2)"] + BRK["agentkeys-broker-server<br/>(Rust, Axum :8091)"] + SIG["agentkeys-mock-server --signer-only<br/>(Rust, Axum :8092)<br/>= dev_key_service"] + BCK["agentkeys-mock-server<br/>(Rust, Axum :8090, loopback)<br/>= legacy session/credential backend"] + end + + subgraph CLOUD["AWS"] + STS["AWS STS<br/>(AssumeRoleWithWebIdentity)"] + S3["S3 / SES / etc<br/>(PrincipalTag-gated)"] + end + + CLI -->|init: email/OAuth2 + SIWE| BRK + CLI -->|init: derive wallet| SIG + DMN -->|mint OIDC JWT| BRK + DMN -->|sign-message
per call| SIG + DMN -->|AssumeRoleWithWebIdentity| STS + STS --> S3 + BRK -->|tier-2 reachability probe| BCK + CLI -. saved session JWT .-> DMN +``` -**Sibling architecture docs:** -- [`./1-step-analysis.md`](./1-step-analysis.md) — auth-layer sub-analysis (session keys, wallet identity, kernel hardening, user flows) -- [`./open-source-posture.md`](./open-source-posture.md) — open/closed split, licensing, reproducible builds, security-audit roadmap -- [`./heima-open-questions.md`](./heima-open-questions.md) — Kai meeting agenda for the Heima TEE worker reality check +**Three independent trust boundaries, three independent products:** -**Companion research:** -- [`./heima-cli-exploration.md`](./heima-cli-exploration.md) — 1Password CLI feature comparison +| Service | Public hostname (typical) | Holds | Role | +|---|---|---|---| +| Broker | `broker.litentry.org` | ES256 OIDC keypair, ES256 session keypair, audit DB | Mints session JWTs after identity ceremony; mints OIDC JWTs from session JWTs; never holds AWS principals at runtime | +| Signer (`dev_key_service`) | `signer.litentry.org` (post-step-1b) | `DEV_KEY_SERVICE_MASTER_SECRET` (32 bytes hex) | Derives EVM wallets from `omni_account` and signs EIP-191 messages on the operator's behalf. Replaceable with a TEE worker post-step-2. | +| Backend (mock-server) | `127.0.0.1:8090` (loopback only) | Legacy session/credential SQLite | Tier-2 reachability target for the broker; legacy `/session/*` + `/credential/*` endpoints used by the daemon's pair-flow | + +**Why three?** Compromise of any one process must NOT enable +impersonating the others. Broker compromise can't extract the master +secret (it's on the signer). Signer compromise can't mint session +JWTs (the keypair is on the broker). Backend compromise can't sign +EVM messages and can't mint cloud creds. The split is enforced by +process boundary and (at production deployment) by separate listener ++ host firewall. --- -## 1. 
The commitment: Strategy 2 (pragmatic Rust + targeted TypeScript) +## 2. Trust boundaries (where keys live, who can see them) + +```mermaid +flowchart TB + subgraph TB1["Trust boundary 1 — Master workstation"] + OS_KC["OS keychain<br/>session JWT (K6)<br/>device privkey K10 (post-step-1c)"] + PA["Platform authenticator<br/>(Secure Enclave / TPM / StrongBox)<br/>K11 — sealed in hardware"] + EVM_W["MetaMask / hardware wallet<br/>(only if identity_type = evm)"] + end + + subgraph TB1A["Trust boundary 1A — Agent machine"] + AGENT_KC["OS keychain OR file backend<br/>session JWT (K6) +<br/>device privkey K10<br/>NO K11"] + end + + subgraph TB2["Trust boundary 2 — Broker process"] + SESS_KP["session ES256 keypair<br/>(BROKER_SESSION_KEYPAIR_PATH)"] + OIDC_KP["OIDC ES256 keypair<br/>(BROKER_OIDC_KEYPAIR_PATH)"] + AUDIT_DB["audit SQLite<br/>(BROKER_AUDIT_DB_PATH)"] + end + + subgraph TB3["Trust boundary 3 — Signer process (dev_key_service)"] + MASTER["DEV_KEY_SERVICE_MASTER_SECRET<br/>(/etc/agentkeys/dev-key-service.env)"] + SIGNER_KP["per-omni derived secp256k1 keys<br/>(in memory only, derived on demand,<br/>never persisted, never logged, never returned)"] + end + + subgraph TB4["Trust boundary 4 — Backend (mock-server)"] + SES_DB["session + credential SQLite
(legacy)"] + end + + subgraph TB5["Trust boundary 5 — AWS"] + AWS_KMS["IAM roles, KMS, S3 policies"] + end + + OS_KC -. session_jwt .-> SESS_KP + OS_KC -. derive_address(omni) .-> SIGNER_KP + PA -. WebAuthn enroll/get (binding only) .-> SESS_KP + EVM_W -. SIWE signature .-> SESS_KP + AGENT_KC -. session_jwt .-> SESS_KP + AGENT_KC -. /dev/sign-message .-> SIGNER_KP + OS_KC -. mint link-code .-> AGENT_KC + OIDC_KP -. OIDC JWT .-> AWS_KMS +``` + +**Compromise-blast-radius table:** + +| Boundary breached | What attacker gains | What they CANNOT do | +|---|---|---| +| **Master workstation** (host root, but no hardware presence) | Stolen session JWT (replay until exp); stolen K10 device key (sign on operator's behalf until rotation) | **Cannot complete WebAuthn ceremony** to bind a new device or rotate K10 — K11 sealed in Secure Enclave/TPM requires biometric/PIN. Cannot derive wallets for other operators; cannot mint session JWTs for new identities. | +| **Master workstation** (full compromise WITH hardware presence — e.g. attacker physically at machine and unlocks biometric) | Above, plus: rebind K10 to attacker-controlled pubkey, rotate device key, mint link codes for new agents | Same as above — bounded to this operator's omni; cannot reach other operators' material | +| **Agent machine** (sandbox VM, host root) | Stolen K10; stolen session JWT (replay until session-JWT TTL expires) | Cannot rebind without master-issued link code; master link-code issuance is gated by master J1 (which is gated by master K11). Cannot escalate to master compromise. | +| Broker process | Mint session JWTs for any omni; mint OIDC JWTs (gated by JWT auth, defeated by full broker compromise) | Cannot derive wallets; cannot sign EIP-191 messages; cannot AssumeRole (no AWS principal at broker). **Post-step-1c: cannot forge device signatures** because per-request K10 signature is verified at signer — broker compromise alone cannot make the signer accept an attacker request. 
| +| Signer process (current step-1) | Derive any wallet from any omni; sign any EIP-191 message for any omni | Cannot mint session JWTs; cannot mint OIDC JWTs; cannot reach AWS | +| Signer process (post-step-1c) | Above, AND can verify (but not forge) device-signed requests | Same as above; per-request device signatures still gate the call surface | +| Backend (mock-server) | Stale legacy session bearer; credential ciphertext (today's mock storage) | Cannot affect Stage 7 mint paths (broker verifies session JWTs locally post-issue-#71) | +| AWS account | Game over for that operator's data scope | None of the above; AWS compromise is its own incident class | + +**Note on signer-process compromise.** Today's `dev_key_service` is +the **dev-stage** placeholder. Compromising the signer host = full +master-secret leak = every wallet for every operator is forge-able +forever. The TEE worker (issue #74 step 2) closes this: master secret +is sealed inside the enclave; host root no longer suffices. +Step-1c device-key auth additionally bounds the impact of broker +compromise on the signer call surface. + +--- -The design-spec says **Rust end-to-end**. After enumerating all components, that commitment is **correct for every component inside the trust boundary** but would fight the ecosystem for **browser automation scripts**, where TypeScript + Playwright is meaningfully better than any Rust option. +## 3. Key inventory + +The complete list of cryptographic material in the system. Use this +as the source-of-truth when designing the Figma trust-flow diagram. 
+ +| # | Key | Type | Lives in | Role | Lifecycle | +|---|---|---|---|---|---| +| K1 | Broker session keypair | ES256 (P-256) | Broker process; pinned file at `BROKER_SESSION_KEYPAIR_PATH` (mode 0600); pubkey exported to `*.pub.pem` (mode 0644) for signer | Signs session JWTs (issued post-identity-ceremony, bound to omni + wallet) | Generated at first broker boot; preserved across re-deploys; manual rotation procedure TBD | +| K2 | Broker OIDC keypair | ES256 (P-256) | Broker process; pinned file at `BROKER_OIDC_KEYPAIR_PATH` (mode 0600); pubkey published at `/.well-known/jwks.json` | Signs OIDC JWTs minted by `/v1/mint-oidc-jwt` (consumed by AWS STS / GCP WIF / Tencent CAM via `AssumeRoleWithWebIdentity`) | Generated at first broker boot; rotation requires re-registering the OIDC provider in cloud IAM | +| K3 | Dev-signer master secret | 32 raw bytes (hex-encoded) | `/etc/agentkeys/dev-key-service.env` (mode 0600, owner agentkeys); auto-generated by `setup-broker-host.sh` | HKDF input for deriving per-actor-omni secp256k1 wallets (one per node in the HDKD actor tree — see §4) | Generated once on first broker-host setup; **never rotate** (rotation invalidates every previously-derived wallet); replaced by sealed enclave secret post-step-2 | +| K4 | Per-actor derived wallet | secp256k1 | Signer process (in memory only, derived on demand from K3 + actor_omni; never persisted, never logged, never returned over wire) | The managed EVM wallet for one node in the HDKD actor tree (master OR a specific agent). Different actor omni → different wallet → different AWS PrincipalTag → different S3 prefix. Used by signer to sign EIP-191 messages on that actor's behalf. 
| Deterministic; same `(K3, actor_omni)` always → same wallet; lifecycle == lifecycle of K3 | +| K5 | EVM-wallet (operator-held) | secp256k1 | Operator's MetaMask / hardware wallet / `cast wallet` | Identity authenticator for `identity_type = evm`; signs SIWE messages directly (this path bypasses K3/K4 entirely) | Operator-managed; outside AgentKeys' lifecycle | +| K6 | Session JWT | JWT (ES256 by K1) | Operator's OS keychain (via `agentkeys-core::session_store`) on the workstation; in daemon memory at runtime | Bearer credential for `/v1/mint-oidc-jwt`, `/v1/wallet/*`, post-step-1b also for `/dev/*` | TTL = `BROKER_SESSION_JWT_TTL_SECONDS` (default 18000s = 5h); re-mint requires re-running the identity ceremony | +| K7 | OIDC JWT | JWT (ES256 by K2) | Daemon memory only (transient — fetched per mint) | Web-identity token for `AssumeRoleWithWebIdentity` against AWS STS | TTL = `BROKER_OIDC_JWT_TTL_SECONDS` (bounded `[60, 3600]`, default 300s) | +| K8 | AWS temp credentials | STS access key + secret + session token | Daemon memory only (transient — refetched per provision/mint) | Direct AWS API access scoped by PrincipalTag = wallet | 1-hour TTL (STS default); short by design | +| K9 | DKIM keypair (per outbound domain) | Ed25519 | Stage 6 design — currently TEE-only, not yet implemented | **DKIM = DomainKeys Identified Mail (RFC 6376).** A per-domain signing key used to sign outbound email headers; the matching public key is published as a DNS TXT record at `._domainkey.`. Receiving mail servers fetch the pubkey via DNS, verify the signature, and use the result to decide whether the message originated from a server authorized for that domain — input to spam filtering, deliverability, and brand-impersonation defense. AgentKeys needs K9 because Stage 6 sends mail FROM operator-controlled sub-domains (e.g. 
for OpenRouter signups via plus-aliased addresses) and we hold the signing key ourselves rather than delegating to SES (so AWS never sees the plaintext content) — see [`heima-gaps §4`](heima-gaps-vs-desired-architecture.md). | TBD per Stage 6 spec ([`heima-gaps §4`](heima-gaps-vs-desired-architecture.md)) | +| K10 | Device key (planned, step-1c) | secp256k1 | **Master**: OS keychain (TouchID-backed on macOS, etc.) on the operator's workstation. **Agent**: OS keychain when available, else file backend at `~/.agentkeys/daemon-/session.json` (mode 0600) — see §5a.4.2. Pubkey registered at the broker as a session JWT claim (`agentkeys_device_pubkey`). | Per-request signature on `/dev/sign-message` calls — eliminates broker-as-SPOF for signer auth | Generated at init stage 0 (per §5); bound by master init per §5a.1 OR agent bootstrap per §5a.2; rotated by `agentkeys device rotate` per §5a.3.2 or by re-init; TTL = session JWT TTL | +| K11 | WebAuthn platform-authenticator credential (planned v0.2, master only) | Per-RP credential (typically EC P-256 on macOS Secure Enclave / Windows TPM / Android StrongBox) | **Master only.** Sealed inside the platform authenticator's hardware boundary; cannot be exfiltrated even by host-OS root. Credential ID published at the broker as a session JWT claim (`agentkeys_webauthn_cred`). | Hardware-attested **user-presence proof at master binding ceremonies** (init per §5a.1, new-device per §5a.3.1, rotation per §5a.3.2). NOT used per-request — K10 covers per-request signing without biometric. | Created at master init; survives K10 rotations; revoked by removing the credential from the broker's bound list or by destroying the platform authenticator | + +**Notation throughout the rest of this doc:** the K1–K11 indices +above are referenced directly so any flow can be unambiguously +mapped back to which key signed/verified/wrapped what. + +### 3a. 
Canonical names (one concept, one canonical spelling) + +Pinned to disambiguate the same value showing up under different +labels across components. **Use the canonical column** in every new +doc, runbook, CLI output, and commit message; the alias column lists +every spelling that exists today so a reader chasing one of them can +find their way back. Per `CLAUDE.md` → +"Terminology-source-of-truth rule", if you introduce a name not in +this table, either add the alias row here or rename the call site to +match the canonical name in the same change. + +| Canonical name | Identity | Aliases seen in the codebase / docs (NOT to introduce new ones) | +|-----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `master_wallet` | K4 instance bound to one actor's actor_omni at init/SIWE-verify. Source = `JWT.agentkeys.wallet_address` of the persisted session JWT (K6). | `wallet_address` (JWT claim shape), `agentkeys_user_wallet` (OIDC JWT claim + AWS PrincipalTag key), `session_wallet` (CLI `agentkeys whoami` field), `MASTER_WALLET` (demo doc shell var), `session.wallet.0` (Rust field). | +| `derived_address(omni)` | K4 instance computed on demand by `/dev/derive-address` for any omni — `HKDF(K3, omni)`. NOT persisted to a session JWT; NOT in AWS PrincipalTag. | `derived_address` (CLI `whoami` field), `ADDR_A` / `ADDR_B` (demo doc shell vars for the specific case `omni=actor_omni`), `SIGNER_DERIVE_ADDR` (`demo-show.sh` internal var). | +| `actor_omni` | The durable per-actor omni — `SHA256("agentkeys"||"evm"||master_wallet)` once SIWE-bound. 
Carried in `JWT.agentkeys.omni_account`. | `omni_account` (JWT claim + CLI `whoami` field), `OMNI_A` / `OMNI_B` (demo doc shell vars), `evm_omni` (init-flow return field, transient name pre-SIWE). | +| `identity_omni` | The transient identity omni — `SHA256("agentkeys"||identity_type||identity_value)`. Used internally by the broker between init and SIWE-verify; never in a post-SIWE JWT. | `identity_omni_email` / `identity_omni_oauth2` (demo doc when narrowing to a specific identity type), `identity omni` (init-flow CLI log line). | +| `K3` (= `master_secret`) | The 32 bytes in `/etc/agentkeys/dev-key-service.env` that every K4 is HKDF-derived from. Single per-broker-host. | `DEV_KEY_SERVICE_MASTER_SECRET` (env var name), `master_secret` (signer-side log). | +| `session JWT` (= K6) | The bearer token at `~/.agentkeys//session.json` (or OS keychain). Signed by K1. | `session_jwt` (JSON field name in broker responses), `evm_session_jwt` (init-flow internal var post-SIWE), `SESSION_JWT_A` / `SESSION_JWT_B` (demo doc shell vars). | +| `OIDC JWT` (= K7) | Per-mint short-lived JWT signed by K2; consumed by `AssumeRoleWithWebIdentity`. | `oidc_jwt`, `JWT_A` / `JWT_B` (demo doc shell vars). | + +The most common confusion this table resolves: **`master_wallet` +(persisted in the session JWT, used by AWS PrincipalTag) ≠ +`derived_address(actor_omni)` (recomputed on each `/dev/derive-address` +call, never reaches AWS).** Both are valid K4 instances; only the +first is what AWS sees in `${aws:PrincipalTag/agentkeys_user_wallet}`. +The post-SIWE `actor_omni` itself is *not a wallet* — it's the 32-byte +SHA256 input that defines which K4 the signer derives. -**Strategy 2 locks in:** -- **Rust** for everything in the trust boundary (CLI, daemon, core library, MCP adapter, CLI adapter, mock backend client, provisioner orchestrator). -- **TypeScript + Playwright** for browser automation scripts inside the agent sandbox. 
-- **TypeScript** for the audit indexer (Subsquid, post-MVP) and Web GUI frontend (Tauri hybrid, post-MVP). +--- -**Single monorepo, single Cargo workspace, multiple crates:** +## 4. Identity model -| Repo | GitHub | Contents | -|------|--------|----------| -| `agentkeys` | agentkeys/agentkeys | Hub: docs, architecture, Kai spec, issue tracking, README | -| `agentkeys-core` | agentkeys/agentkeys-core | `CredentialBackend` trait, shared types, mock backend HTTP client | -| `agentkeys-cli` | agentkeys/agentkeys-cli | Master CLI binary (depends on core via Cargo git dep) | -| `agentkeys-daemon` | agentkeys/agentkeys-daemon | Sandbox daemon binary (depends on core via Cargo git dep) | -| `agentkeys-mock-server` | agentkeys/agentkeys-mock-server | Temporary v0-only mock backend binary (depends on core) | -| `agentkeys-provisioner` | agentkeys/agentkeys-provisioner | Rust orchestrator library (depends on core) | -| `provisioner-scripts` | agentkeys/provisioner-scripts | TypeScript + Playwright scrapers (npm package) | +The system has two omni concepts that compose into an HDKD actor tree: -Cross-repo dependencies use Cargo `[dependencies] agentkeys-core = { git = "..." }`. All repos in the same local directory for development. +```mermaid +flowchart LR + ID["raw identity
(email, OAuth2 sub, EVM addr, passkey)"] + ID_OMNI["identity omni<br/>= SHA256('agentkeys' || id_type || id_value)<br/>(transient — auth-event handle)"] + M_OMNI["MASTER actor omni<br/>(root of HDKD tree)<br/>= SHA256('agentkeys' || 'evm' || master_wallet)"] + M_WALLET["wallet_master<br/>= HKDF(K3, M_OMNI)"] + A_OMNI["AGENT actor omnis<br/>O_master//agent-A, //agent-B, ..."] + A_WALLET["wallet_agent_A
= HKDF(K3, O_master//agent-A)"] -**Rust proportion of the codebase: ~75-80%**, including **100% of the security-critical path**. Every line of code that touches a session key, a wallet private key, an OS keychain entry, or a chain signing operation is in Rust. The cross-language boundaries are all at natural process/sandbox boundaries; no in-process polyglot. + ID -->|"identity ceremony"| ID_OMNI + ID_OMNI -->|"derive + link + SIWE"| M_OMNI + M_OMNI --> M_WALLET + M_OMNI -->|"HDKD //label"| A_OMNI + A_OMNI --> A_WALLET +``` -## 2. Component inventory +**Identity omni vs actor omni — different roles, different lifespans:** -| # | Component | Where it runs | Primary job | +- **Identity omni** = `SHA256("agentkeys" || identity_type || identity_value)`. Derived from the authenticator (email, OAuth2 sub, EVM addr, passkey). **Transient handle** for one auth event — the broker uses it to drive the wallet-binding round-trip, then discards it. Multiple identity omnis can map to the same master actor omni (a user with linked email + OAuth has two identity omnis but one master). +- **Actor omni** = `SHA256("agentkeys" || "evm" || lower(wallet))`. Derived from a wallet address. The **durable identity** the system reasons about: session JWTs, OIDC claims, audit attribution, AWS PrincipalTag are all keyed on actor omni. + +For `identity_type = evm` (operator authenticates via their own EVM wallet via SIWE), the identity omni and master actor omni are equal — identity IS the wallet, no signer derivation needed. + +### HDKD tree of actors (per-agent omni model) + +Actor omnis form an HDKD tree rooted at the master. Every node has its own derived wallet: + +``` +O_master wallet_master = HKDF(K3, O_master) +├── O_master//agent-A wallet_agent_A = HKDF(K3, O_master//agent-A) +├── O_master//agent-B wallet_agent_B = HKDF(K3, O_master//agent-B) +│ └── O_master//agent-B//task-1 (future — sub-actors under agents) +└── ... 
+``` + +Hard derivation (`//N`) — child secret cannot be derived without the parent's master secret. Substrate / SLIP-0010 standard. Each node's wallet is a different EVM address; AWS PrincipalTag is per-actor-wallet for prefix isolation. + +**Why per-agent omni (not shared with master):** +1. Per-agent compromise containment — leaked agent K10 touches only that agent's wallet/prefix. +2. First-class audit attribution — audit rows carry `acting_omni`, `parent_chain`, `derivation_path`. +3. Atomic revocation — revoke `O_master//agent-A` alone; master and other agents untouched. +4. Tree topology IS the data model — no binding-table abstraction needed. + +The shared-omni-with-multiple-device-pubkeys model is a v1c shipping shortcut; v1.0 = HDKD per-agent omni. v1c is a degenerate v1.0 tree (no children). + +--- + +## 4a. Mental model — four orthogonal axes + +The system separates four concepts that earlier drafts collapsed: + +| Axis | What it answers | Realized by | Lifecycle | |---|---|---|---| -| 1 | `agentkeys` CLI | User's Mac/PC/Linux | `init`, `store`, `read`, `run`, `approve`, `revoke`, `teardown`, `usage`, `link`, `feedback` | -| 2 | `agentkeys-daemon` | Inside agent sandbox (as `gem` UID on stock sandbox), also desktop / Mac mini / Raspberry Pi per [#12](https://github.com/litentry/agentKeys/issues/12) | Stores session in **OS keychain when available** (wallet-namespaced per [#12](https://github.com/litentry/agentKeys/issues/12)), file fallback (`~/.agentkeys/daemon-/session.json`, mode 0600) in sandboxes. Runtime key copy held in `memfd_secret`. Exposes MCP + CLI sockets; hosts provisioner as MCP tool | -| 3 | MCP adapter | Same process as #2 | Speaks MCP protocol on stdio/socket, translates to daemon internal API | -| 4 | CLI adapter | Same process as #2 | Line-protocol on Unix socket for `agentkeys read` etc. 
| -| 5 | Heima RPC client library | Linked into #1 and #2 | session-signed extrinsics over wss, scale-codec, signing | -| 6 | x402 / EVM library | Linked into #1 | ERC-20 USDC transfers, x402 HTTP payment headers, wallet signing | -| 7 | Provisioner orchestrator (Rust) | Inside agent sandbox, subprocess of daemon | Exposed as MCP tool `agentkeys.provision` on daemon; spawns browser automation, encrypts credentials to backend | -| 8 | Browser automation scripts (TypeScript) | Inside agent sandbox, child of #7 | Playwright/CDP flows for OpenRouter (v0), more services later | -| 9 | Ephemeral email integration (TypeScript) | Inside agent sandbox, child of #7 | Reads verification codes from burner email backends | -| 10 | Audit log indexer | Post-MVP, own host | Subsquid/Subquery indexing Heima extrinsics for `agentkeys usage` | -| 11 | Web GUI | Post-MVP, user's device, local-first | Master management UI, live audit, wallet balance (Tauri shell) | -| 12 | Heima TEE worker extensions | Kai's code, Gramine-SGX | New AgentKeys module (pending Kai conversation) | -| 13 | New Heima pallets | Substrate runtime | `pallet-secrets-vault` if Q2 of the Kai meeting says we build it | -| M | Mock backend service (v0-only) | Small VPS | Mirrors Heima API contract: session mgmt, credential storage, audit, rendezvous relay, auth-request primitive. Axum + SQLite. Deleted when Heima integration lands in v0.1. | -| 14 | `@agentkeys/daemon` npm package | Any environment a cloud LLM can install into | TypeScript wrapper + bundled prebuilt Rust binary. Ships the daemon to cloud LLM sandboxes via `npx @agentkeys/daemon`. | - -## 3. Language choice per component - -| # | Component | Language | Reasoning | +| **Identity** | Who is the human? | Identity omni (email / OAuth / EVM / passkey) | Recoverable via linked authenticators; identity omnis are ephemeral, masters are durable | +| **Actor** | Master, or which agent? 
| Actor omni — a node in the HDKD tree (`O_master`, `O_master//agent-A`) | Master derived from identity at first init; agents derived from master via `//

=…`. Forces `--derive`. | Capturing the seven fields into shell vars for §2/§4 (`eval "$(...)"`). | + +Two flags adjust behavior across modes: + +- `--no-derive` skips the `signer derive` round-trip; the `ADDR` field + ends up empty. Useful when the signer is offline or you only need + JWT-side fields. +- A positional `` (default `master`) selects which + `~/.agentkeys//session.json` to read. `AGENTKEYS_SESSION_ID` + has the same effect. + +```bash +# === ON OPERATOR WORKSTATION === +bash scripts/agentkeys-demo-show.sh alice +bash scripts/agentkeys-demo-show.sh --json bob | jq .actor.omni +bash scripts/agentkeys-demo-show.sh --no-derive alice +``` + +#### Capture (`OMNI`, `ADDR`) pairs for §2 + §4 via `--export` + +`--export ` is the canonical way to feed §2's SIWE round-trip +and §4's S3 isolation proof. Two `eval` calls populate the seven +per-session vars for both A and B labels; the rest of the demo just +references `$OMNI_A` / `$ADDR_A` / `$ADDR_B` etc. without re-decoding +the JWT. Idempotent — the script reads the file + calls `signer derive` +deterministically, so re-running overwrites the same shell vars with +the same values. + +```bash +# === ON OPERATOR WORKSTATION === +eval "$(bash scripts/agentkeys-demo-show.sh --export A alice)" +eval "$(bash scripts/agentkeys-demo-show.sh --export B bob)" + +# Stick the alice session as the default for the rest of §2. Without +# this, every `agentkeys signer sign`/`derive` call below falls back to +# --session-id master, which is likely an older expired session (see +# §14.8). Retarget to "$SESSION_ID_B" right before §2.4's bob block. +export AGENTKEYS_SESSION_ID="$SESSION_ID_A" +``` + +`--export` emits shell vars only — it does NOT route follow-up +`agentkeys` calls. The CLI's `--session-id` flag defaults to `master`, +so an unset `AGENTKEYS_SESSION_ID` silently reads +`~/.agentkeys/master/session.json` even after `eval … --export A alice`. 
+The explicit `export` line above pins routing for the rest of the +section; §2.4 retargets to bob the same way. + +Per-session vars (label `A` shown; `B` is symmetric): + +| Var | Source | Used by | +|--------------------|-----------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------| +| `SESSION_ID_A` | The session-id the script was called with (`alice`). | Routing follow-up `agentkeys --session-id` calls. | +| `OMNI_A` | `JWT.agentkeys.omni_account` — durable EVM actor omni. | Every `/dev/*` call (signer's strict JWT check requires the request omni to match the JWT claim). | +| `ADDR_A` | `signer.derive(OMNI_A) = HKDF(K3, OMNI_A)`. | §2's SIWE round-trip; §4's S3 isolation proof tags traffic with this via §2.3's freshly-minted JWT. | +| `MASTER_WALLET_A` | `JWT.agentkeys.wallet_address` from init — the wallet the broker linked + SIWE-verified at init. | Audit only post-init; not used by §2 or §4. | +| `IDENTITY_TYPE_A` | `JWT.agentkeys.identity_type` — `"evm"` post-SIWE for the email-link flow. | The `omni()` helper in §0.3 + the SHA256 cross-check. | +| `IDENTITY_VALUE_A` | `JWT.agentkeys.identity_value` — same as `MASTER_WALLET_A` post-SIWE. | Same. | +| `IDENTITY_OMNI_A` | Locally recomputed `SHA256("agentkeys" \|\| IDENTITY_TYPE_A \|\| IDENTITY_VALUE_A)`. | Cross-check — the JWT does NOT carry this post-SIWE. 
| + +Sanity-check both sessions are distinct (any of these failing means +the recipient defaults collided — see the callout below): + +```bash +[[ "$OMNI_A" != "$OMNI_B" ]] && echo "actor-omni split ok" +[[ "$ADDR_A" != "$ADDR_B" ]] && echo "ADDR split ok" +[[ "$MASTER_WALLET_A" != "$MASTER_WALLET_B" ]] && echo "wallet split ok" +``` + +> **Symptom: `MASTER_WALLET_A == MASTER_WALLET_B` after two distinct +> `--session-id` inits.** Both inits hit the same recipient email, +> producing the same `identity_omni_email`, and HKDF(K3, …) +> deterministically returned the same wallet. Since the 2026-05-13 +> fix, calling `init-email-demo.sh --session-id ` defaults the +> recipient to `@$MAIL_DOMAIN`, which is guaranteed-unique per +> session-id. If you see a collision today: (a) you passed the same +> positional recipient to both runs (`--session-id alice demo-2` +> twice), or (b) you set `$RECIPIENT` in your shell and it's +> overriding both. The script's recipient + `identity_omni (email)` +> log lines make the collision visible BEFORE SES SendEmail fires. + +> **Why `--session-id` matters.** The signer's strict JWT-omni check +> means each session JWT only authorizes `/dev/*` calls for ITS own +> actor_omni. Without `--session-id`, a second `agentkeys init` run +> overwrites `~/.agentkeys/master/session.json` and the first +> `(omni, wallet)` pair is lost. With `--session-id alice` + +> `--session-id bob` the two sessions live side by side and §4 can +> drive each in turn (`agentkeys --session-id alice ...` vs +> `--session-id bob ...`). + +> **Why `ADDR_A` is `signer derive(OMNI_A)` and NOT `JWT.wallet_address`.** +> §2.2 below calls `agentkeys signer sign --omni-account $OMNI_A` and +> ecrecover on the resulting signature recovers to `HKDF(K3, OMNI_A)` — +> i.e. to `ADDR_A`. 
For §2.1's SIWE message (which puts `ADDR_A` in the +> body) to survive `/v1/auth/wallet/verify`, the message-address MUST +> equal the signature-recovered address, so `ADDR_A` has to be +> `HKDF(K3, OMNI_A)`. §2.3 then mints a FRESH session JWT with +> `wallet_address=ADDR_A`, and §4 mints OIDC from that JWT — so AWS +> sees `ADDR_A` (= `HKDF(K3, OMNI_A)`) in the PrincipalTag, not +> `MASTER_WALLET_A`. `MASTER_WALLET_A` (= `HKDF(K3, identity_omni_email)`) +> only matters if you skip §2 entirely and mint OIDC directly from the +> init JWT — see the "Which one does AWS see?" paragraph above for the +> mechanical explanation. + +> **macOS Keychain prompts during `agentkeys` calls?** The CLI defaults +> to `KeyringMode::Auto` — Keychain first, file fallback. On a fresh +> machine that's fine, but if you've run earlier dev cycles the +> Keychain can hold a stale entry that returns +> `SIGNER_UNAUTHORIZED: invalid session JWT: InvalidToken` from +> `agentkeys signer derive` even while the file at +> `~/.agentkeys//session.json` is fresh and valid. Force file mode +> for the entire demo: +> ```bash +> export AGENTKEYS_SESSION_STORE=file +> ``` +> `operator-workstation.env` sets this for you when you `set -a; +> source` it. Verify with a raw curl using the file's JWT — if that +> succeeds while the CLI fails, your Keychain definitely has a stale +> entry: +> ```bash +> JWT=$(jq -r .token ~/.agentkeys/alice/session.json) +> curl -sS -H "Authorization: Bearer $JWT" -H 'content-type: application/json' \ +> -d "$(jq -n --arg o "$OMNI_A" '{omni_account: $o}')" \ +> "$AGENTKEYS_SIGNER_URL/dev/derive-address" | jq . +> ``` +> A `{"address":"0x...","key_version":1}` response means the JWT and +> signer wire are good and only the CLI's Keychain read is broken. + +`ADDR_A` and `ADDR_B` are 0x-prefixed 40-char lowercase hex EVM +addresses. 
They're stable across daemon reinstalls as long as the K3 +master secret doesn't rotate; that's the property that makes the +"recover-via-any-linked-identity" model work without ever moving a +private key. + +The keys never need on-chain funds — Stage 7's SIWE auth is off-chain +signing only. --- @@ -144,34 +843,12 @@ curl -s $OIDC_ISSUER/readyz | jq # "checks": [], # "ready": ["tier2/backend", "audit/sqlite", …] # } -# -# Degraded case (still serving, dependency impaired): -# { -# "status": "degraded", -# "degraded": true, -# "checks": [{"name":"…","status":"degraded","reason":"…","docs":"…"}], -# "ready": ["tier2/backend", …] -# } -# -# Unready case (HTTP 503): -# { -# "status": "unready", -# "degraded": false, -# "checks": [{"name":"tier2/backend","status":"unready", -# "reason":"BROKER_BACKEND_URL/healthz not yet reachable since boot", -# "docs":"https://docs.agentkeys.dev/operator-runbook-stage7#backend-reachability"}], -# "ready": [] -# } ``` The body is always self-describing — `status` is one of `ready`, `degraded`, `unready` — so `curl … | jq -r .status` is a single-shot -verdict. The HTTP status code agrees: `200` for ready/degraded, -`503` for unready. - -If `/readyz` returns `503` (unready), paste the `docs:` URL from the -checks array into the [operator runbook](operator-runbook-stage7.md) -— every check has its own anchor with the recovery procedure. +verdict. If `/readyz` returns `503`, paste the `docs:` URL from the +checks array into the [operator runbook](operator-runbook-stage7.md). 
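
A minimal sketch of turning that single-shot verdict into a gate for scripts. `readyz_gate` is a hypothetical helper name, not something shipped in the repo. It takes the `status` string as an argument so the `curl` call stays separate, and treating `degraded` as safe to continue is an assumption of this sketch:

```bash
# Hypothetical helper (not part of the repo): map a /readyz `status`
# string to an exit code. 0 = continue, 1 = stop.
readyz_gate() {
  case "$1" in
    ready)    echo "broker ready" ;;
    degraded) echo "broker degraded; inspect .checks before relying on it" ;;
    unready)  echo "broker unready; follow the docs URL in .checks" >&2; return 1 ;;
    *)        echo "unexpected status: '$1'" >&2; return 1 ;;
  esac
}

# Usage, assuming $OIDC_ISSUER is exported as in the steps above:
#   readyz_gate "$(curl -sS "$OIDC_ISSUER/readyz" | jq -r .status)" || exit 1
```

An empty or unparseable body falls into the catch-all branch, so a broker that is down entirely also stops the run.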
```bash curl -sS --fail-with-body $OIDC_ISSUER/.well-known/openid-configuration | jq @@ -183,22 +860,12 @@ curl -sS --fail-with-body $OIDC_ISSUER/.well-known/openid-configuration | jq # } curl -sS --fail-with-body $OIDC_ISSUER/.well-known/jwks.json | jq '.keys[0]' -# { -# "kty": "EC", -# "crv": "P-256", -# "x": "<43-char base64url>", -# "y": "<43-char base64url>", -# "kid": "v1-", -# "alg": "ES256", -# "use": "sig" -# } ``` **Critical invariant:** `issuer` in the discovery doc MUST equal `$OIDC_ISSUER` byte-for-byte. AWS IAM compares the JWT `iss` claim -against the registered OIDC provider URL exactly — trailing slash, host, -scheme, path all matter. If they don't match, every -`AssumeRoleWithWebIdentity` will return `InvalidIdentityToken`. +against the registered OIDC provider URL exactly. If they don't match, +every `AssumeRoleWithWebIdentity` will return `InvalidIdentityToken`. ```bash [[ "$(curl -sS --fail-with-body $OIDC_ISSUER/.well-known/openid-configuration | jq -r .issuer)" \ @@ -211,18 +878,144 @@ Verify from AWS IAM's perspective: aws iam get-open-id-connect-provider \ --open-id-connect-provider-arn $OIDC_PROVIDER_ARN \ --query '{Url:Url, ClientIDList:ClientIDList, Thumbprints:ThumbprintList}' -# { -# "Url": "broker.litentry.org", ← AWS strips the https:// -# "ClientIDList": ["sts.amazonaws.com"], -# "Thumbprints": ["<40 hex>"] -# } ``` --- -## 2. SIWE wallet auth round-trip +## 2. Managed-wallet SIWE auth via the dev_key_service + +This is the new flow that replaces the pre-issue-#74 `cast wallet +sign` walkthrough. The operator provides only an identity (email or +OAuth2/Google); the broker mints an identity-omni session JWT, the +backend derives the wallet, signs the SIWE challenge on the operator's +behalf, and the broker mints an EVM-omni session JWT. The broker sees +a normal SIWE round-trip — it cannot tell whether the signer is +HKDF-backed (today) or TEE-backed (issue #74 step 2). 
+ +**Three ways to drive this section.** Pick one, then jump to §3: + +| Path | When to use | What it runs | +|----------------------------|------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------| +| `init-email-demo.sh` (§0.4) | Default for demos, CI, doc verification — no human-in-the-loop click needed. | The script auto-clicks the magic link by polling `s3://$MAIL_BUCKET/inbound/`. §0.4 already ran this for alice + bob. | +| Manual `agentkeys init --email` (§2.0) | You want the magic link in an inbox you control (real demo to a stakeholder, or smoke-testing real SES delivery). | Same `/v1/auth/email/request` + `/v1/auth/email/verify` chain, but you click the link in your mail client. Requires `--email `. | +| Manual SIWE walkthrough (§2.1–§2.5) | Debugging a step the one-command path hides, or explaining the trust model to a reviewer. | Exactly the chain `init --email` runs internally, exposed call-by-call. Functionally redundant after §0.4 or §2.0 — read it for understanding, don't expect it to produce a new session. | + +### 2.0 Recommended path: `agentkeys init --email` + +Issue #74 step 1 + Pass 2 of Option B (closed [issue #80](https://github.com/litentry/agentKeys/issues/80)) +ship a single-command bootstrap that drives the entire chain end-to-end +against real SES delivery. Use this for any real demo or production deployment. + +> **Already done by §0.4 if you ran `init-email-demo.sh --session-id +> alice` (and bob).** That script runs `agentkeys init --email` against +> a deliverable `@$MAIL_DOMAIN` recipient, polls +> `s3://$MAIL_BUCKET/inbound/` for the SES inbound, parses the +> `#t=` fragment, and POSTs `/v1/auth/email/verify` — +> programmatically replicating the browser-side click. 
By the time it +> exits, alice's `~/.agentkeys/alice/session.json` holds a fully +> SIWE'd JWT and §2.1–§2.5 below would re-do the same chain manually. +> **For automation, skip to [§3](#3-mint-oidc-jwt-for-sts).** Read +> §2.1–§2.5 only when you want to inspect each wire frame or are +> debugging a step the script normally hides. + +> **Prereq if you haven't done §0.4 yet:** the two-step setup from +> §0.4 — `bash scripts/ses-verify-sender.sh` (one-time SES sender +> registration) + `sudo bash scripts/setup-broker-host.sh --yes` on the +> broker host (Pass 2 build with `auth-email-link` + `email_link` in +> `BROKER_AUTH_METHODS`). + +If you're driving the init manually (because you want a real +operator-controlled inbox rather than the `@bots.litentry.org` alias), +the equivalent one-command form is: + +```bash +# === ON OPERATOR WORKSTATION === +agentkeys --session-id alice init \ + --email \ + --broker-url $OIDC_ISSUER \ + --signer-url $BACKEND_URL +# Magic link sent via real SES (FROM noreply-test@bots.litentry.org). +# Click the link in your inbox; the CLI is polling… +# (operator clicks the magic link) +# Initialized via email-link. +# identity omni: <64 hex> +# derived wallet: 0x… +# evm omni: <64 hex> +``` + +The automated equivalent — same result, no click required — is what +§0.4 already runs: + +```bash +bash scripts/agentkeys-init-email-demo.sh --session-id alice +# (auto-prints a "Next: capture eval-able shell vars" hint at the end — +# copy-paste the eval line below to populate $ADDR_A / $OMNI_A / …) +eval "$(bash scripts/agentkeys-demo-show.sh --export A alice)" +export AGENTKEYS_SESSION_ID=alice +``` -### 2.1 Request a SIWE challenge +Pick whichever fits the run: the script for unattended demos / CI / +docs verification, the manual `--email ` form when you want the +magic link delivered to an inbox you control. + +> **Why the second line matters.** `init-email-demo.sh` runs in a +> subprocess, so it can't `export` variables into your parent shell. 
+> The human-mode session detail it prints at the end is text, not +> assignments. Without the `eval … --export A alice` line, your shell +> either has no `$ADDR_A` / `$OMNI_A` (and §2.1's +> `/v1/auth/wallet/start` fails JSON-validation on an empty address) +> or — worse — carries stale `$ADDR_A` from a previous run against a +> different session/identity. Stale `$ADDR_A` produces the +> `ADDRESS DRIFT — master secret rotated mid-session?` failure at the +> end of §2.2 (the sanity check `[[ "$SIG_ADDR" == "$ADDR_A" ]]` +> compares the just-now signer-returned address against your shell's +> `$ADDR_A`; they only match when both come from the *current* alice +> session). The §0.4 callout earlier already pins this — the eval line +> above is the same line, repeated here for the operator who jumped +> straight into §2 without running §0.4 top-to-bottom. + +> **Don't substitute a placeholder email** like `alice@demo.example` +> when you've already run `init-email-demo.sh --session-id alice`. The +> placeholder produces a *different* `identity_omni_email` → different +> `MASTER_WALLET` → different `actor_omni`, and the second init +> overwrites `~/.agentkeys/alice/session.json`. Your shell still holds +> the §0.4 `$OMNI_A` / `$ADDR_A` from the bots-alias identity, so the +> §2.2 strict JWT-omni check fails with a mismatch +> (`request.omni ≠ JWT.omni_account`). Either skip §2.0 entirely (use +> §0.4's script), or pass `--email ` with a domain +> SES can actually deliver to and re-run §0.4's `--export A alice` +> afterwards to refresh the shell vars. + +The `--session-id alice` writes to `~/.agentkeys/alice/session.json` +instead of the default `master`. 
Subsequent `agentkeys signer …` calls +in §2.1–§2.5 need either the same `--session-id alice` flag or +`export AGENTKEYS_SESSION_ID=alice` once at the top of the shell — +otherwise the CLI silently reads `master`, which is usually a stale +older session (see [§14.8](#148-agentkeys-signer-sign-returns-error-signer_unauthorized--invalid-session-jwt-expiredsignature)). + +For OAuth2/Google instead of email-link: + +```bash +agentkeys --session-id alice init \ + --oauth2-google \ + --broker-url $OIDC_ISSUER \ + --signer-url $BACKEND_URL +# Open this URL in your browser to authenticate with Google: +# https://accounts.google.com/o/oauth2/v2/auth?… +# (Polling for callback…) +``` + +The same flow is available on the daemon side via +`agentkeys-daemon --init-email ` and +`agentkeys-daemon --init-oauth2-google` (see §16.7 for an end-to-end +provision against a real broker). + +`§2.1`–`§2.5` below walk through the same chain manually, so you can +inspect each wire frame without trusting the CLI to do the right +thing. Use those sections for debugging or for explaining the trust +model to a reviewer. + +### 2.1 Request a SIWE challenge for `ADDR_A` ```bash # === ON OPERATOR WORKSTATION === @@ -250,19 +1043,33 @@ The SIWE message is constructed per EIP-4361 with the broker's `$BROKER_HOST` as the domain field. The signature you produce next has the EIP-191 `\x19Ethereum Signed Message:\n` prefix wrapped around this exact text — re-deriving any whitespace differently breaks -verification. +verification, so always pull `SIWE_MSG` straight from the response. -### 2.2 Sign the SIWE message +### 2.2 Sign the SIWE message via the dev_key_service -`cast wallet sign` does the EIP-191 wrap automatically when called -without `--no-hash`. The `--no-hash` flag means "the bytes ARE the -EIP-191 envelope already, just sign them" — which is **not** what we -want here. +`agentkeys signer sign` calls `POST /dev/sign-message` with `OMNI_A` +and the SIWE message bytes. 
The signer wraps them in EIP-191 and +returns the canonical 65-byte signature. The CLI never sees the +private key. ```bash -SIG_A=$(cast wallet sign --private-key $PK_A "$SIWE_MSG") +SIG_A=$(agentkeys --json signer sign \ + --signer-url $BACKEND_URL \ + --omni-account $OMNI_A \ + --message "$SIWE_MSG" | jq -r .signature) echo "SIG_A=${SIG_A:0:32}… length=${#SIG_A}" -# SIG_A=0x<130-hex-chars> +# SIG_A=0x<130 hex chars> +``` + +Sanity — the signer's `address` reply MUST match `ADDR_A`: + +```bash +SIG_ADDR=$(agentkeys --json signer sign \ + --signer-url $BACKEND_URL \ + --omni-account $OMNI_A \ + --message "$SIWE_MSG" | jq -r .address) +[[ "$SIG_ADDR" == "$ADDR_A" ]] && echo "sign↔derive address match" \ + || echo "ADDRESS DRIFT — master secret rotated mid-session?" ``` ### 2.3 Submit the signature, get back a session JWT @@ -287,16 +1094,21 @@ printf '%s' "$VERIFY" | jq SESSION_JWT_A=$(printf '%s' "$VERIFY" | jq -r .session_jwt) echo "SESSION_JWT_A=${SESSION_JWT_A:0:32}… length=${#SESSION_JWT_A}" -OMNI_A=$(printf '%s' "$VERIFY" | jq -r .omni_account) -echo "OMNI_A=$OMNI_A" +OMNI_EVM_A=$(printf '%s' "$VERIFY" | jq -r .omni_account) +echo "OMNI_EVM_A=$OMNI_EVM_A" +echo "OMNI_A =$OMNI_A (the omni you used to drive the signer)" ``` -The `omni_account` is `SHA256("agentkeys" || "evm" || lower(wallet))` -— deterministic from the wallet address, namespace-isolated from any -other identity provider, never reused across wallet rotations. If -you decode `$SESSION_JWT_A` (`echo $SESSION_JWT_A | cut -d. -f2 | base64 --d`) you'll see `omni_account`, `wallet`, `iss`, `iat`, `exp` claims and -a `kid` in the header pointing at the session keypair. +> **Two omnis at play — both correct.** +> - `$OMNI_A` is the operator's **identity omni** (the one you used to +> call the signer). The broker never sees this directly. +> - `$OMNI_EVM_A` is the **wallet omni** the broker derives from the +> verified EVM address. The session JWT is bound to this one. 
+> +> They link 1:1 in this demo because the wallet is deterministically +> derived from `OMNI_A`. In production, `agentkeys whoami` would +> show both via the linked-identities table after the daemon calls +> `/v1/wallet/link(OMNI_A → ADDR_A)`. See §7.1 below. > **Session JWT is broker-internal.** It is signed by the *session* > keypair (`purpose=session`), not the OIDC keypair. AWS IAM never @@ -304,87 +1116,234 @@ a `kid` in the header pointing at the session keypair. > session JWT can't impersonate the broker to AWS, and a stolen OIDC > JWT can't be replayed as a session token. -### 2.4 Repeat for wallet B +### 2.4 Repeat for `ADDR_B` + +**Run this FIRST** — refresh the shell vars for bob's *current* +session and pin the CLI to read bob's session file. Without it, the +`START_B` call below sends a stale `$ADDR_B` from a previous run and +§2.4 ends with `HTTP 401 — signature does not recover to claimed +address` (the SIWE message claims an address derived from +`$ADDR_B_stale`, but `$OMNI_B_stale` doesn't agree — see [§14.4](#144-siwe-verify-returns-signature-does-not-recover-to-claimed-address--or-address-drift--master-secret-rotated-mid-session-at-end-of-22)): + +```bash +eval "$(bash scripts/agentkeys-demo-show.sh --export B bob)" +export AGENTKEYS_SESSION_ID="$SESSION_ID_B" +``` + +The `eval` line is **idempotent** — re-running it after every fresh +`init-email-demo.sh --session-id bob` is the canonical fix when bob's +session got re-minted (e.g. expired JWT, K3 rotation, switched +broker hosts). The script's own end-of-run hint prints the exact same +line; this is just here for the operator who jumped straight from +§2.3 into §2.4 without scrolling back. 
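
One way to catch an empty or malformed `$ADDR_B` / `$OMNI_B` before the `START_B` call is a shape pre-flight. This is a sketch only: `check_demo_vars` is a hypothetical helper, and the shapes it enforces (0x plus 40 lowercase hex for the address; 64 hex characters, assumed lowercase with no `0x` prefix, for the omni) are inferred from how this walkthrough describes those values. It cannot detect a stale but well-formed value; only re-running the `eval` line fixes that.

```bash
# Hypothetical pre-flight (not part of the repo). Checks shape only,
# not freshness: a stale-but-valid address still passes.
check_demo_vars() {
  local addr="$1" omni="$2"
  [[ "$addr" =~ ^0x[0-9a-f]{40}$ ]] || { echo "bad address: '$addr'" >&2; return 1; }
  [[ "$omni" =~ ^[0-9a-f]{64}$ ]]   || { echo "bad omni: '$omni'" >&2; return 1; }
}

# Usage:
#   check_demo_vars "$ADDR_B" "$OMNI_B" || echo "re-run the eval line first"
```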
```bash START_B=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/wallet/start \ -H 'content-type: application/json' \ -d "$(jq -n --arg a "$ADDR_B" '{address:$a, chain_id:84532}')") -echo "START_B=${START_B:0:32}… length=${#START_B}" - REQ_ID_B=$(printf '%s' "$START_B" | jq -r .request_id) -echo "REQ_ID_B=$REQ_ID_B" SIWE_MSG_B=$(printf '%s' "$START_B" | jq -r .siwe_message) -echo "SIWE_MSG_B=${SIWE_MSG_B:0:32}… length=${#SIWE_MSG_B}" -SIG_B=$(cast wallet sign --private-key $PK_B "$SIWE_MSG_B") + +SIG_B=$(agentkeys --json signer sign \ + --signer-url $BACKEND_URL \ + --omni-account $OMNI_B \ + --message "$SIWE_MSG_B" | jq -r .signature) echo "SIG_B=${SIG_B:0:32}… length=${#SIG_B}" VERIFY_B=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/wallet/verify \ -H 'content-type: application/json' \ -d "$(jq -n --arg r "$REQ_ID_B" --arg s "$SIG_B" \ '{request_id:$r, signature:$s}')") -echo "VERIFY_B=${VERIFY_B:0:32}… length=${#VERIFY_B}" - SESSION_JWT_B=$(printf '%s' "$VERIFY_B" | jq -r .session_jwt) -echo "SESSION_JWT_B=${SESSION_JWT_B:0:32}… length=${#SESSION_JWT_B}" -OMNI_B=$(printf '%s' "$VERIFY_B" | jq -r .omni_account) -echo "OMNI_B=$OMNI_B" -echo "OMNI_A=$OMNI_A" -echo "OMNI_B=$OMNI_B" +OMNI_EVM_B=$(printf '%s' "$VERIFY_B" | jq -r .omni_account) +echo "OMNI_EVM_A=$OMNI_EVM_A" +echo "OMNI_EVM_B=$OMNI_EVM_B" +``` + +`OMNI_EVM_A` ≠ `OMNI_EVM_B`, because each is a SHA256 hash derived from a +different wallet address. + +### 2.5 `agentkeys whoami` — sanity at-a-glance + +`whoami` is a read-only `/dev/derive-address` call — it surfaces the +omni → address mapping under whichever session is currently pinned. +It inherits `$AGENTKEYS_SESSION_ID` from the shell: `alice` if you came +straight from §0.4, `bob` if you ran §2.4 (which re-exports it). Re-export +`alice` before the first call below, or +override per-call with `--session-id `. 
+ +```bash +agentkeys whoami \ + --signer-url $BACKEND_URL \ + --omni-account $OMNI_A +# session_wallet: 0x ← JWT.agentkeys.wallet_address from ~/.agentkeys/alice/session.json +# signer_url: https://signer… +# omni_account: ← OMNI_A +# derived_address: 0x ← HKDF(K3, OMNI_A) = ADDR_A +# key_version: 1 + +# For bob, retarget the session-id once and rerun: +agentkeys --session-id "$SESSION_ID_B" whoami \ + --signer-url $BACKEND_URL \ + --omni-account $OMNI_B ``` -`OMNI_A` ≠ `OMNI_B` — confirmed by hash function. +Field-by-field, in arch.md §3a canonical names: + +| CLI label | arch.md canonical name | What the CLI computes | +|--------------------|--------------------------------|----------------------------------------------------------------------------------------------------------------------| +| `session_wallet` | `master_wallet` | Loaded from `~/.agentkeys/$SESSION_ID/session.json` → `JWT.agentkeys.wallet_address`. The init-flow's wallet. | +| `omni_account` | `actor_omni` | Echoed from the `--omni-account` flag. | +| `derived_address` | `derived_address(actor_omni)` | Server-side `HKDF(K3, actor_omni)` — what `/dev/derive-address` returns for this omni. Equals `$ADDR_A` post-export. | + +`session_wallet` and `derived_address` are **two different K4 +wallets** — both signable, both deterministic, derived from two +different omnis (`identity_omni` at init vs `actor_omni` post-SIWE). +After §2.3, the §3 OIDC mint stamps `derived_address(actor_omni)` +(NOT `session_wallet`) into `agentkeys_user_wallet`, because §3 reads +`$SESSION_JWT_A` from §2.3's fresh verify response, not the on-disk +session.json. See the "Which wallet ends up in AWS PrincipalTag?" +callout in §0.4 for the full mechanical reason. --- ## 3. Mint OIDC JWT for STS -The session JWT is broker-internal. To talk to AWS STS you need a -separate OIDC JWT signed by the OIDC keypair, with claims AWS knows how -to consume. +The session JWT is broker-internal. 
AWS STS speaks a different JWT +(signed by K2, the OIDC keypair) carrying the PrincipalTag claim. +Exchange the session JWT for an OIDC JWT — once for alice, once for +bob — and decode each to capture the wallet that ended up in +`agentkeys_user_wallet`. **That decoded wallet IS the value §4's S3 +prefix uses** — no path-specific naming, no mental substitution. ```bash +# === ON OPERATOR WORKSTATION === +# Prereq: $SESSION_JWT_A from §2.3's VERIFY, $SESSION_JWT_B from +# §2.4's VERIFY_B. If you skipped §2 entirely, read both from disk +# (footnote at section end). + JWT_A=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/mint-oidc-jwt \ -H "Authorization: Bearer $SESSION_JWT_A" | jq -r .jwt) -echo "JWT_A=${JWT_A:0:32}… length=${#JWT_A}" +JWT_B=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/mint-oidc-jwt \ + -H "Authorization: Bearer $SESSION_JWT_B" | jq -r .jwt) + +# Decode each JWT's body once, extract the wallet AWS will tag the +# assumed-role session with. These are the canonical names §4 uses. +decode_aws_wallet() { + echo "$1" | cut -d. -f2 | tr '_-' '/+' \ + | python3 -c "import base64,sys; s=sys.stdin.read().strip(); print(base64.urlsafe_b64decode(s+'='*(-len(s)%4)).decode())" \ + | jq -r .agentkeys_user_wallet +} +WALLET_A=$(decode_aws_wallet "$JWT_A") +WALLET_B=$(decode_aws_wallet "$JWT_B") +echo "WALLET_A=$WALLET_A WALLET_B=$WALLET_B" +# WALLET_A=0x… WALLET_B=0x… (the two wallets your bucket policy will gate on) +``` -echo "$JWT_A" -# eyJ… (header.payload.signature) +Confirm the `aws.amazon.com/tags` claim is present on `JWT_A` — STS +needs it to stamp the PrincipalTag: -# Decode and verify the claim shape AWS cares about: -echo "$JWT_A" | cut -d. -f2 \ - | tr '_-' '/+' \ - | { read p; printf '%s%s' "$p" "$(printf '====' | head -c $(( (4 - ${#p} % 4) % 4 )))" | base64 -d 2>/dev/null; } \ - | jq +```bash +echo "$JWT_A" | cut -d. 
-f2 | tr '_-' '/+' \ + | python3 -c "import base64,sys; s=sys.stdin.read().strip(); print(base64.urlsafe_b64decode(s+'='*(-len(s)%4)).decode())" \ + | jq '{aud, sub, agentkeys_user_wallet, tags: ."https://aws.amazon.com/tags"}' # { -# "iss": "https://broker.litentry.org", -# "sub": "agentkeys:agent:0x…", # "aud": "sts.amazonaws.com", -# "exp": , -# "iat": , -# "agentkeys_user_wallet": "0x…", -# "https://aws.amazon.com/tags": { -# "principal_tags": {"agentkeys_user_wallet": ["0x…"]}, +# "sub": "agentkeys:agent:0x…", +# "agentkeys_user_wallet": "0x…", +# "tags": { +# "principal_tags": {"agentkeys_user_wallet": ["0x…"]}, # "transitive_tag_keys": ["agentkeys_user_wallet"] # } # } ``` -The `https://aws.amazon.com/tags` claim is what makes -`PrincipalTag`-scoped isolation work — AWS STS reads it during -`AssumeRoleWithWebIdentity` and stamps the assumed session with that -tag. The role's trust policy requires this tag to be present (set up -in `cloud-setup.md §4.3`). - -JWT TTL is 5 min. If you wait too long, rerun this step. +JWT TTL is **5 min**. If §4 errors with `InvalidIdentityToken`, the +JWT expired — rerun the two `mint-oidc-jwt` curls (the session JWTs +last 5h, so you usually don't need to re-do §2). + +> **Where `$WALLET_A` actually points to.** §3 doesn't pick the +> wallet — it just *reports* whichever wallet the broker stamped into +> your session JWT at init/SIWE time. Concretely: +> - If `$SESSION_JWT_A` came from §2.3's manual SIWE (`$VERIFY` → +> `.session_jwt`), `$WALLET_A` = `$ADDR_A` = arch.md +> `derived_address(actor_omni)`. +> - If `$SESSION_JWT_A` came from the on-disk init JWT +> (`~/.agentkeys//session.json`), `$WALLET_A` = `$MASTER_WALLET_A` +> = arch.md `master_wallet`. +> +> Either is valid — §4 just uses `$WALLET_A` directly, no +> conditional. The wallet you committed to at §2/§0.4 is the wallet +> S3 will gate on. 
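
The two cases in the callout above can be told apart mechanically. A sketch, assuming `$WALLET_A`, `$ADDR_A`, and `$MASTER_WALLET_A` are all populated in the current shell. `wallet_source` and `lower` are hypothetical helpers, and lowercasing before comparison is a defensive assumption rather than documented broker behaviour:

```bash
# Hypothetical helpers (not part of the repo): report which wallet the
# OIDC JWT carries by comparing it with the two candidates.
lower() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

wallet_source() {
  local wallet addr master
  wallet=$(lower "$1"); addr=$(lower "$2"); master=$(lower "$3")
  if [[ -z "$wallet" ]]; then
    echo "unknown"; return 1          # nothing decoded from the JWT
  elif [[ "$wallet" == "$addr" ]]; then
    echo "derived_address(actor_omni)"  # session JWT came from manual SIWE
  elif [[ "$wallet" == "$master" ]]; then
    echo "master_wallet"                # session JWT came from the on-disk init JWT
  else
    echo "unknown"; return 1          # likely stale shell vars
  fi
}

# Usage:
#   wallet_source "$WALLET_A" "$ADDR_A" "$MASTER_WALLET_A"
```

An `unknown` result usually means the shell variables and the JWT come from different sessions.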
+ +> **Skipped §2 entirely?** Read the session JWTs from disk: +> ```bash +> SESSION_JWT_A=$(jq -r .token ~/.agentkeys/alice/session.json) +> SESSION_JWT_B=$(jq -r .token ~/.agentkeys/bob/session.json) +> ``` +> (Or `security find-generic-password -s agentkeys -a alice -w | jq -r .token` on macOS +> Keychain mode — check by listing `~/.agentkeys/alice/.keyring_managed`: +> present-and-non-empty ⇒ Keychain, otherwise file.) Then resume with +> the two `mint-oidc-jwt` curls above. --- ## 4. Cloud-enforced isolation proof -This is the climax of the demo. We assume `agentkeys-data-role` with -JWT_A, then attempt to read both wallet A's prefix (allowed) and wallet -B's prefix (denied **by AWS, not by app code**). +Assume `agentkeys-data-role` with `JWT_A`, then attempt to read both +alice's prefix (`bots/$WALLET_A/`) and bob's prefix (`bots/$WALLET_B/`). +The first succeeds, the second is denied **by AWS, not by app code**. + +The S3 prefix shape (`bots//…`) matches arch.md §6's +sequence diagram — `bots/` is the per-actor data namespace, sibling to +SES's `inbound/`, future `audit/`, etc. Keeping user data under a +single parent prefix lets lifecycle rules, encryption defaults, and +replication scope cleanly to "user data" without touching the +bucket's system prefixes. The bucket policy from +[`cloud-setup.md` §4.4](cloud-setup.md#44-upgrade-bucket-policy-to-principaltag-scoped) +grants access conditioned on +`bots/${aws:PrincipalTag/agentkeys_user_wallet}/*`. + +### 4.0 One-shot run: `agentkeys-isolation-demo.sh` + +This script is the executable form of §3 + §4.1–§4.3. 
It reads alice ++ bob's saved sessions (running `init-email-demo.sh` first if either +isn't on disk), mints both OIDC JWTs, decodes `$WALLET_A` / +`$WALLET_B` from the `agentkeys_user_wallet` claim, assumes the data +role as alice, seeds `bots/$WALLET_A/` + `bots/$WALLET_B/` via admin, +then asserts: + +- 4a: `list bots/$WALLET_A/` → success (alice's own prefix) +- 4b: `get bots/$WALLET_B/hello.txt` → AccessDenied (bob's prefix) + +```bash +# === ON OPERATOR WORKSTATION === +# Prereqs: operator-workstation.env sourced; awsp agentkeys-admin (for the +# seed step); bucket policy applied per cloud-setup.md §4.4; role inline +# policy stripped per cloud-setup.md §4.4.1. +bash scripts/agentkeys-isolation-demo.sh +# ==> WALLET_A=0x… +# ==> WALLET_B=0x… +# ✓ alice reads bots// — allowed (expected) +# ✓ alice DENIED on bots// — cloud-enforced isolation works +# ✓ §4 isolation proof PASSED +``` + +Flags: + +- `--reinit-alice` / `--reinit-bob` / `--reinit-both` — force a fresh + init (replaces the on-disk session JWT) before the proof. Default + reuses existing sessions. + +Exit codes: + +- `0` proof passed +- `1` precondition missing (env vars, tools, sessions) +- `2` alice's own-prefix read failed (false-negative — check + cloud-setup.md §4.4 bucket policy + §4.4.1 role inline strip) +- `3` bob's peer-prefix read succeeded (false-positive — **isolation + broken**, §4.4.1 wasn't applied so the role's broad `s3:GetObject` + overrides the bucket-policy PrincipalTag check) + +§4.1–§4.3 below are the same chain, broken into copy-paste steps for +when you want to inspect each wire frame manually. 
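
For CI, the exit codes listed for `agentkeys-isolation-demo.sh` can be mapped to a one-line explanation. A sketch only: `explain_isolation_exit` is a hypothetical wrapper, and each message paraphrases the exit-code list rather than anything the script itself prints:

```bash
# Hypothetical wrapper (not part of the repo): translate the demo
# script's exit code into a short hint for CI logs.
explain_isolation_exit() {
  case "$1" in
    0) echo "proof passed" ;;
    1) echo "precondition missing: check env vars, tools, sessions" ;;
    2) echo "own-prefix read failed: check bucket policy (cloud-setup.md §4.4)" ;;
    3) echo "peer-prefix read succeeded: ISOLATION BROKEN, apply cloud-setup.md §4.4.1" ;;
    *) echo "unrecognised exit code: $1" ;;
  esac
}

# Usage:
#   bash scripts/agentkeys-isolation-demo.sh; rc=$?
#   explain_isolation_exit "$rc"; exit "$rc"
```

Re-exiting with the original code keeps the distinction between a false-negative (`2`) and a real isolation failure (`3`) visible to whatever invoked the job.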
### 4.1 Assume the role with JWT_A @@ -394,75 +1353,65 @@ CREDS=$(aws sts assume-role-with-web-identity \ --role-arn arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role \ --role-session-name "demo-A-$(date +%s)" \ --web-identity-token "$JWT_A") -echo "CREDS=${CREDS:0:32}… length=${#CREDS}" printf '%s' "$CREDS" | jq '.Credentials | {AKID:.AccessKeyId, Exp:.Expiration}' - -export AWS_ACCESS_KEY_ID=$(printf '%s' "$CREDS" | jq -r .Credentials.AccessKeyId) -echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:0:32}… length=${#AWS_ACCESS_KEY_ID}" -export AWS_SECRET_ACCESS_KEY=$(printf '%s' "$CREDS" | jq -r .Credentials.SecretAccessKey) -echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:0:32}… length=${#AWS_SECRET_ACCESS_KEY}" -export AWS_SESSION_TOKEN=$(printf '%s' "$CREDS" | jq -r .Credentials.SessionToken) -echo "AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN:0:32}… length=${#AWS_SESSION_TOKEN}" - -# Confirm: you are NOT your admin profile any more. -aws sts get-caller-identity -# { -# "UserId": "AROA…:demo-A-…", -# "Arn": "arn:aws:sts::ACCOUNT:assumed-role/agentkeys-data-role/demo-A-…" -# } ``` -### 4.2 Seed test objects (one-shot, with admin creds) +### 4.2 Seed test objects (admin profile, no PrincipalTag check) -If wallet A's prefix is empty, the read in step 4.3 succeeds vacuously -and proves nothing. Pop two objects in (one per wallet) using your -admin profile — clear out the assumed-role env first. +Two objects, one per tenant prefix. Admin bypasses the bucket policy +via account ownership, so this works regardless of the per-actor +isolation. 
```bash unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN awsp agentkeys-admin -WALLET_A_LC=$(echo "$ADDR_A" | tr '[:upper:]' '[:lower:]') -echo "WALLET_A_LC=$WALLET_A_LC" -WALLET_B_LC=$(echo "$ADDR_B" | tr '[:upper:]' '[:lower:]') -echo "WALLET_B_LC=$WALLET_B_LC" -aws s3api put-object --bucket "$BUCKET" \ - --key "bots/${WALLET_A_LC}/hello.txt" --body /dev/null -aws s3api put-object --bucket "$BUCKET" \ - --key "bots/${WALLET_B_LC}/hello.txt" --body /dev/null +# AWS CLI's --body needs a seekable regular file (rejects /dev/null +# on macOS — character device, not a regular file). Use a tmp file: +EMPTY=$(mktemp) && trap 'rm -f "$EMPTY"' EXIT + +aws s3api put-object --region "$REGION" --bucket "$BUCKET" \ + --key "bots/${WALLET_A}/hello.txt" --body "$EMPTY" +aws s3api put-object --region "$REGION" --bucket "$BUCKET" \ + --key "bots/${WALLET_B}/hello.txt" --body "$EMPTY" ``` ### 4.3 Re-export the assumed-role creds and probe both prefixes ```bash export AWS_ACCESS_KEY_ID=$(printf '%s' "$CREDS" | jq -r .Credentials.AccessKeyId) -echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:0:32}… length=${#AWS_ACCESS_KEY_ID}" export AWS_SECRET_ACCESS_KEY=$(printf '%s' "$CREDS" | jq -r .Credentials.SecretAccessKey) -echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:0:32}… length=${#AWS_SECRET_ACCESS_KEY}" export AWS_SESSION_TOKEN=$(printf '%s' "$CREDS" | jq -r .Credentials.SessionToken) -echo "AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN:0:32}… length=${#AWS_SESSION_TOKEN}" -# 4a — your own prefix: SUCCESS +# Confirm: you are NOT your admin profile any more. 
+aws sts get-caller-identity +# { +# "Arn": "arn:aws:sts:::assumed-role/agentkeys-data-role/demo-A-…" +# } + +# 4a — alice's prefix: SUCCESS aws s3api list-objects-v2 --bucket "$BUCKET" \ - --prefix "bots/${WALLET_A_LC}/" --query 'Contents[*].Key' -# [ "bots/0x…/hello.txt" ] + --prefix "bots/${WALLET_A}/" --query 'Contents[*].Key' +# [ "bots//hello.txt" ] -aws s3api get-object --bucket "$BUCKET" \ - --key "bots/${WALLET_A_LC}/hello.txt" /tmp/got-A.txt +aws s3api get-object --region "$REGION" --bucket "$BUCKET" \ + --key "bots/${WALLET_A}/hello.txt" /tmp/got-A.txt # { "ContentLength": 0, ... } -# 4b — the OTHER wallet's prefix: AccessDenied (CLOUD-ENFORCED) -aws s3api get-object --bucket "$BUCKET" \ - --key "bots/${WALLET_B_LC}/hello.txt" /tmp/got-B.txt +# 4b — bob's prefix: AccessDenied (CLOUD-ENFORCED, no app code involved) +aws s3api get-object --region "$REGION" --bucket "$BUCKET" \ + --key "bots/${WALLET_B}/hello.txt" /tmp/got-B.txt # An error occurred (AccessDenied) when calling the GetObject operation: # Access Denied ``` -**Step 4b is the property the static-IAM path cannot prove.** No app -code participated in the deny — S3's policy engine evaluated -`${aws:PrincipalTag/agentkeys_user_wallet}` (which is `WALLET_A_LC`) -against the resource ARN's `bots/${WALLET_B_LC}/` and refused. +**Step 4b is the property the static-IAM path cannot prove.** S3's +policy engine evaluated `${aws:PrincipalTag/agentkeys_user_wallet}` +(= `$WALLET_A`, stamped by STS from `$JWT_A`'s tags claim) against the +resource ARN's `bots/${WALLET_B}/` and refused. Swap to `JWT_B` in +§4.1 and you'd see the mirror — bob can read `bots/${WALLET_B}/` and +gets denied on `bots/${WALLET_A}/`. ### 4.4 Diagnosing intermediate states @@ -509,23 +1458,18 @@ unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN # 1. Ask the broker for an OIDC JWT (lightweight call — broker just signs). 
JWT=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/mint-oidc-jwt \ -H "Authorization: Bearer $SESSION_JWT_A" | jq -r .jwt) -echo "JWT=${JWT:0:32}… length=${#JWT}" # 2. Exchange it for AWS creds CLIENT-SIDE. No broker creds participate. CREDS=$(aws sts assume-role-with-web-identity \ --role-arn arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role \ --role-session-name "demo-A-$(date +%s)" \ --web-identity-token "$JWT") -echo "CREDS=${CREDS:0:32}… length=${#CREDS}" export AWS_ACCESS_KEY_ID=$(printf '%s' "$CREDS" | jq -r .Credentials.AccessKeyId) -echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:0:32}… length=${#AWS_ACCESS_KEY_ID}" export AWS_SECRET_ACCESS_KEY=$(printf '%s' "$CREDS" | jq -r .Credentials.SecretAccessKey) -echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:0:32}… length=${#AWS_SECRET_ACCESS_KEY}" export AWS_SESSION_TOKEN=$(printf '%s' "$CREDS" | jq -r .Credentials.SessionToken) -echo "AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN:0:32}… length=${#AWS_SESSION_TOKEN}" # 3. Use the temp creds. PrincipalTag-scoped per cloud-setup.md §4.4. 
-aws s3 ls "s3://$BUCKET/bots/$(echo $ADDR_A | tr A-Z a-z)/" +aws s3 ls "s3://$BUCKET/bots/${ADDR_A}/" ``` Inside `agentkeys-provisioner`, the `fetch_via_broker_default_ttl()` @@ -580,7 +1524,7 @@ export AWS_REGION=us-east-1 agentkeys-daemon \ --backend $BACKEND_URL \ --broker-url $AGENTKEYS_BROKER_URL \ - --session $YOUR_SESSION_TOKEN + --session $SESSION_JWT_A ``` Inside the daemon, the call site is @@ -610,7 +1554,6 @@ GRANT=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/grant/create \ expires_at: (now + 3600 | floor), max_uses: 100 }')") -echo "GRANT=${GRANT:0:32}… length=${#GRANT}" printf '%s' "$GRANT" | jq # { @@ -637,7 +1580,6 @@ curl -sS --fail-with-body $OIDC_ISSUER/v1/grant/list \ ```bash GRANT_ID=$(printf '%s' "$GRANT" | jq -r .grant_id) -echo "GRANT_ID=$GRANT_ID" curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/grant/revoke \ -H "Authorization: Bearer $SESSION_JWT_A" \ -H 'content-type: application/json' \ @@ -660,15 +1602,23 @@ on the broker host once every daemon has a grant. ## 7. Wallet linking + recovery (Phase B) -### 7.1 Master links a secondary identity (e.g. email) +After issue #74 step 1 the canonical recovery model is "any linked +identity unlocks the same derived wallet." The daemon links its +identity-omni (e.g. `OMNI_A` from email) to the derived wallet so +re-authenticating as that email recovers the same EVM address. + +### 7.1 Master links the identity-omni to the derived wallet ```bash curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/wallet/link \ -H "Authorization: Bearer $SESSION_JWT_A" \ -H 'content-type: application/json' \ - -d "$(jq -n '{identity_type:"email", identity_value:"hanwen@example.com"}')" + -d "$(jq -n '{identity_type:"email", identity_value:"alice@demo.example"}')" ``` +After this call the broker's `IdentityLinkStore` knows that +`("email", "alice@demo.example")` ↔ `OMNI_EVM_A` ↔ `ADDR_A`. 
+ ### 7.2 List linked identities ```bash @@ -681,19 +1631,27 @@ curl -sS --fail-with-body $OIDC_ISSUER/v1/wallet/links \ ```bash curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/wallet/recover/lookup \ -H 'content-type: application/json' \ - -d '{"identity_type":"email","identity_value":"hanwen@example.com"}' | jq + -d '{"identity_type":"email","identity_value":"alice@demo.example"}' | jq # {"omni_account": "<64 hex>"} ``` The lookup is unauthenticated *by design* — `omni_account` is a -SHA256 hash, discovery does not enable impersonation. Actual recovery -still requires the master to sign in fresh and call `/v1/grant/create` -on a new daemon address. See [operator-runbook-stage7.md → Recovery +SHA256 hash, discovery does not enable impersonation. Recovery still +requires the daemon to (a) re-authenticate as the linked identity, +(b) get the same `omni_account` back, and (c) ask the dev_key_service +to derive the wallet (the master secret has not rotated, so the +derivation is stable). See [operator-runbook-stage7.md → Recovery flow](operator-runbook-stage7.md#recovery-flow). --- -## 8. Email-link auth (Phase A.1) +## 8. Email-link auth (Phase A.1) — alternative entry point + +Email-link is the canonical way to bootstrap `OMNI_A` in a real +deployment instead of computing it offline like §0.3 does. After +verification, the broker mints a session JWT bound to `omni_email`, +and the daemon then derives the wallet via `/dev/derive-address`. +Same dev_key_service flow from there on out. Requires `BROKER_AUTH_METHODS=…,email_link` and `BROKER_EMAIL_*` env vars set (see runbook). SES sender identity must be verified. @@ -702,7 +1660,7 @@ vars set (see runbook). SES sender identity must be verified. # 1. Request a magic link. curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/email/request \ -H 'content-type: application/json' \ - -d '{"email":"hanwen@example.com"}' + -d '{"email":"alice@demo.example"}' # {"request_id":"em_…","status":"sent"} # 2. 
Click the link in the email. The broker's /auth/email/landing @@ -713,44 +1671,52 @@ curl -sS --fail-with-body $OIDC_ISSUER/v1/auth/email/status/em_… | jq # { # "status": "verified", # "session_jwt": "eyJ…", -# "omni_account": "<64 hex>", +# "omni_account": "<64 hex of OMNI_A>", # "identity_type": "email", -# "identity_value": "hanwen@example.com" +# "identity_value": "alice@demo.example" # } + +# 4. The session JWT is now an `omni_email` session. Derive the wallet: +EMAIL_SESSION_JWT=... # from step 3 +agentkeys --session-id alice signer derive \ + --signer-url $BACKEND_URL \ + --omni-account $(omni email "alice@demo.example") +# 5. Then run §2.1 onwards using that derived address. ``` +§8 is a manual alternative to §2.0's one-command `agentkeys init +--email`. If you're driving it raw like this, persist the +`session_jwt` from step 3 into `~/.agentkeys/alice/session.json` +(matching `--session-id alice`) before running step 4 — or skip +step 4 entirely and inline the JWT as `Authorization: Bearer +$EMAIL_SESSION_JWT` against `$BACKEND_URL/dev/derive-address`. + ### 8.1 Debugging — inspecting the inbound email at S3 If the magic-link click never completes verification, the email probably arrived but the link the broker rendered doesn't match the URL pattern the auth handler regex-matches. Use [`scripts/inspect-inbound-email.sh`](../scripts/inspect-inbound-email.sh) -to dump the most-recent inbound email from `s3://$BUCKET/inbound/` -with the same quoted-printable normalization the broker applies: +to dump the most-recent inbound email from `s3://$BUCKET/inbound/`. 
```bash # === ON OPERATOR WORKSTATION === awsp agentkeys-admin -set -a; source scripts/operator-workstation.env; set +a # if not done in §0 - ./scripts/inspect-inbound-email.sh # latest ./scripts/inspect-inbound-email.sh --all # list all keys + headers ./scripts/inspect-inbound-email.sh inbound/ # specific key ``` -The script prints raw + normalized bodies, all `href`s, all -`https://` URLs deduped, and specifically the URLs that match the -auth handler's regex. If the last block returns `(NONE — regex would -miss this email!)`, the broker's URL-extraction regex needs an -update for the new sender format. (This script is the Stage 7 -replacement for the archived `stage6-inspect-email.sh`.) - The session JWT NEVER appears in the browser-facing landing-page response — only on the CLI poll, per Plan §3.5.4 security posture. --- -## 9. OAuth2/Google auth (Phase A.2) +## 9. OAuth2/Google auth (Phase A.2) — alternative entry point + +Same shape as §8 but the bootstrap is a Google OAuth2 round-trip +instead of email. Once the omni_oauth2 session JWT lands, the daemon +derives the same EVM wallet via the dev_key_service. Requires `BROKER_OAUTH2_*` env vars, a Google Cloud Console OAuth web client, and the broker's redirect URI registered exactly. See @@ -761,11 +1727,6 @@ client, and the broker's redirect URI registered exactly. See curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/oauth2/start \ -H 'content-type: application/json' \ -d '{"provider":"google"}' | jq -# { -# "request_id":"oa2-…", -# "authorization_url":"https://accounts.google.com/o/oauth2/v2/auth?…", -# "poll_url":"/v1/auth/oauth2/status/oa2-…" -# } # 2. Open authorization_url in a browser, sign in. Google redirects # to /auth/oauth2/callback on the broker. 
@@ -774,8 +1735,19 @@ curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/oauth2/start \ curl -sS --fail-with-body $OIDC_ISSUER/v1/auth/oauth2/status/oa2-… | jq # {"status":"verified", "session_jwt":"eyJ…", "omni_account":"…", # "identity_type":"oauth2_google", "identity_value":""} + +# 4. Derive the wallet: +agentkeys --session-id alice signer derive \ + --signer-url $BACKEND_URL \ + --omni-account $(omni oauth2_google "") ``` +Same caveat as §8: §9 is a manual alternative to §2.0's +`agentkeys --session-id alice init --oauth2-google`. The shorthand +mints + persists the session JWT for you; the raw flow above needs +the step-3 JWT inlined as `Authorization: Bearer` or persisted into +`~/.agentkeys/alice/session.json` before step 4 reads it. + `prompt=select_account` is hardcoded into the auth URL so Google always forces the account chooser — defends against the silent-wrong-account scenario (multi-account browsers). @@ -795,6 +1767,9 @@ sudo sqlite3 /var/lib/agentkeys/.agentkeys/broker/audit.sqlite \ ``` Columns of interest: +- `omni_account` — `OMNI_EVM_A` for derived-wallet mints (post issue + #74 the wallet is the public side; the identity omni stays on the + daemon). - `status` — `confirmed` after `sqlite_primary` or `sqlite`-only policy completes; `pending` → `confirmed | quarantined` for `dual_strict` policy (Phase C). @@ -803,6 +1778,11 @@ Columns of interest: - `grant_id` — non-empty when the mint was authorized by an explicit grant; empty during the Phase-0→B migration window. +The dev_key_service itself has **no audit log** in v0 — it is +single-process, every `/dev/sign-message` call is the daemon's own. +Issue #74 step 2 (TEE worker) adds enclave-side per-omni signing +counters. + --- ## 11. EVM audit anchor (Phase C — structural only in v0) @@ -818,7 +1798,6 @@ To exercise the structural layer: ```bash # === ON BROKER HOST === -# Set Phase C env vars (see runbook §EVM Audit Anchor). 
sudo systemctl edit agentkeys-broker # [Service] # Environment=BROKER_AUDIT_ANCHORS=sqlite,evm_testnet @@ -845,7 +1824,7 @@ exercise this end-to-end against the stub. ### 12.1 Prometheus metrics ```bash -# === ON BROKER HOST (or curl from anywhere if exposed) === +# === ON BROKER HOST === sudo systemctl edit agentkeys-broker # Environment=BROKER_METRICS_ENABLED=true sudo systemctl restart agentkeys-broker @@ -856,9 +1835,7 @@ curl -sS --fail-with-body https://broker.litentry.org/metrics | head -30 # agentkeys_broker_mints_total 14 # agentkeys_broker_mints_failed_total 0 # agentkeys_broker_audit_writes_total 14 -# agentkeys_broker_audit_writes_failed_total 0 # agentkeys_broker_auth_attempts_total 23 -# agentkeys_broker_auth_failed_unauthorized_total 1 # agentkeys_broker_idempotency_hits_total 3 # … ``` @@ -871,7 +1848,6 @@ disabled to avoid leaking counter shapes to unauthenticated probers. ```bash KEY=$(uuidgen | tr '[:upper:]' '[:lower:]') -echo "KEY=${KEY:0:32}… length=${#KEY}" # First call — mints + caches. curl -i -X POST $OIDC_ISSUER/v1/mint-aws-creds \ @@ -916,9 +1892,19 @@ bash harness/stage-7-issue-64-done.sh This composes every per-phase smoke + the load-bearing invariant test + the env-var-table drift check + both build matrices (v0-default and -v0-testnet feature combos). Exits 0 if Stage 7 is shippable. Any -failure prints the failing phase name and points at the relevant -sub-script. +v0-testnet feature combos). Exits 0 if Stage 7 is shippable. + +Issue #74's signer-protocol conformance test runs as part of the +default `cargo test` path: + +```bash +cargo test -p agentkeys-mock-server --test dev_key_service_routes +cargo test -p agentkeys-core --test signer_conformance +``` + +The conformance test exercises both the HKDF-backed dev_key_service +and an in-memory TEE-stub that implements the same wire shape — the +swap-point invariant is now a tested CI gate. --- @@ -927,20 +1913,84 @@ sub-script. 
### 14.1 BOOT_FAIL on first start Tier-1 refuse-to-boot prints a single-line `BOOT_FAIL: =: -; see runbook §` to stderr. The anchor is a Markdown -heading slug in [`docs/operator-runbook-stage7.md`](operator-runbook-stage7.md). -Common ones: +; see runbook §` to stderr. Common ones: | Anchor | Cause | Fix | |---|---|---| | `oidc-issuer` | `BROKER_OIDC_ISSUER` is `http://` and `BROKER_DEV_MODE` is unset | Set TLS in front of the broker, point issuer at the public HTTPS URL. | -| `oidc-keypair` / `session-keypair` | Keypair file missing | `agentkeys-broker-server keygen --purpose --out PATH` (commit `d9bf541`); or rerun `setup-broker-host.sh --upgrade` which auto-mints (commit `765ea9b`). | +| `oidc-keypair` / `session-keypair` | Keypair file missing | `agentkeys-broker-server keygen --purpose --out PATH`; or rerun `setup-broker-host.sh --upgrade` which auto-mints. | | `audit-policy` | Bad `BROKER_AUDIT_POLICY` value | Must be `dual_strict` / `sqlite_primary` / `evm_primary`. | -| `auth-method-not-compiled` | Plugin name in env var not registered | Rebuild with the matching `--features` flag (e.g. `auth-email-link`) or remove the name. | +| `auth-method-not-compiled` | Plugin name in env var not registered | Rebuild with the matching `--features` flag. | | `auth-method-empty` / `audit-anchor-empty` | Empty list | Defaults: `wallet_sig` / `sqlite`. | -| `backend-reachability` | Tier-2 backend `/healthz` not yet probed | Auto-clears once mock-server is up. With `BROKER_REFUSE_TO_BOOT_STRICT=true`, this is a hard fail instead. | +| `backend-reachability` | Tier-2 backend `/healthz` not yet probed | Auto-clears once mock-server is up. | + +### 14.2 `/dev/derive-address` returns HTTP 503 `signer_disabled` + +The backend's `DEV_KEY_SERVICE_MASTER_SECRET` env var is unset or +empty. From the broker host: + +```bash +sudo systemctl show agentkeys-backend | grep DEV_KEY_SERVICE +# Should print: Environment=DEV_KEY_SERVICE_MASTER_SECRET=… +# If blank, redo §0.1 of this guide. 
+``` + +### 14.3 `agentkeys signer sign` returns `Error: SIGNER_UNREACHABLE` + +The CLI cannot reach `--signer-url`. Verify, in order: + +1. `curl -sS https://signer./healthz` returns `ok` from the + workstation. If TLS errors, the cert hasn't been issued yet — + run `sudo certbot --nginx -d signer.` on the broker host + (per §0.2). +2. `sudo systemctl status agentkeys-signer` on the broker host + shows `active (running)`. If `failed`, check + `journalctl -u agentkeys-signer -n 50` — most likely + `/var/lib/agentkeys/.agentkeys/broker/session-keypair.pub.pem` + is missing (the broker writes it on boot via + `--export-session-pubkey-to`; restart `agentkeys-broker` then + `agentkeys-signer`). +3. The DNS A record for `signer.` resolves to the broker host + IP — `dig +short signer.` should return the EC2 EIP. + +### 14.4 SIWE verify returns `signature does not recover to claimed address` — OR `ADDRESS DRIFT — master secret rotated mid-session?` at end of §2.2 + +Both symptoms have the same family of causes — `$ADDR_A` (or `$OMNI_A`) +in your shell doesn't match the just-now-live alice/bob session. In +practice 9 out of 10 hits are **stale shell vars from a previous run**, +not actual K3 rotation. + +Most common diagnosis path — run this triplet and compare: + +```bash +echo "OMNI_A (shell) = $OMNI_A" +echo "ADDR_A (shell) = $ADDR_A" +DERIVE_NOW=$(agentkeys --json signer derive \ + --signer-url $BACKEND_URL --omni-account $OMNI_A | jq -r .address) +echo "derive(OMNI_A) = $DERIVE_NOW ← what signer returns RIGHT NOW" +JWT_OMNI=$(jq -r .token ~/.agentkeys/$AGENTKEYS_SESSION_ID/session.json \ + | cut -d. 
-f2 | tr '_-' '/+' \ + | { read p; printf '%s%s' "$p" "$(printf '====' | head -c $(( (4 - ${#p} % 4) % 4 )))" \ + | base64 -d 2>/dev/null; } | jq -r '.agentkeys.omni_account') +echo "JWT.omni_account = $JWT_OMNI ← what's persisted on disk" +``` + +Then match against the failure mode: -### 14.2 `AssumeRoleWithWebIdentity` returns InvalidIdentityToken +| Symptom | Cause | Fix | +|------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `OMNI_A` (shell) `!=` `JWT.omni_account` (on disk) | Shell `$OMNI_A` is stale — set by a previous `--export` against a different session. Re-init happened after `--export`. | Re-run `eval "$(bash scripts/agentkeys-demo-show.sh --export A $AGENTKEYS_SESSION_ID)"`. Then re-do §2.1 (SIWE start) — your old `$SIWE_MSG` is also stale because it embeds the old `$ADDR_A`. | +| `DERIVE_NOW != ADDR_A` (shell) | Shell `$ADDR_A` is stale — same root cause as above. | Same fix. | +| `ADDR_A == MASTER_WALLET_A` (= JWT.wallet_address) | You substituted `$MASTER_WALLET_A` for `$ADDR_A` somewhere — easy mistake reading demo-show's human-mode output. | Re-run the eval line; `--export A` is the only mode that reliably sets `$ADDR_A = HKDF(K3, OMNI_A)`. | +| `DERIVE_NOW != SIG_ADDR` (where `SIG_ADDR` = §2.2's check) | Real K3 rotation — `setup-broker-host.sh` regenerated `/etc/agentkeys/dev-key-service.env`, or `agentkeys-backend` restarted with a new `DEV_KEY_SERVICE_MASTER_SECRET`. | All previously-derived wallets are invalidated. Re-init via `init-email-demo.sh --session-id alice`, re-export, restart from §2.1. 
To keep K3 stable across runs, the setup script preserves the env file; only `--force` rotates it. | +| SIWE message bytes mutated mid-flow | `$SIWE_MSG` was re-quoted or re-printed (zsh `echo` corrupts `\n` escapes; see the printf note in §0). | Always pass `$SIWE_MSG` straight from `printf '%s' "$START" \| jq -r .siwe_message`. Never `echo "$SIWE_MSG"` into the sign call. | + +The two stale-shell-vars rows are by far the most common causes, typically when an +operator runs `init-email-demo.sh --session-id alice` twice in a row, +or runs it after a previous `--export A bob`. **Run the eval line every +time a fresh init lands**; it's idempotent and cheap. + +### 14.5 `AssumeRoleWithWebIdentity` returns InvalidIdentityToken - **Issuer mismatch.** Confirm `discovery.issuer == $OIDC_ISSUER` byte-for-byte. @@ -949,18 +1999,17 @@ Common ones: - **Audience mismatch.** AWS expects `aud=sts.amazonaws.com`. Decode the JWT and confirm. - **Stale OIDC provider.** If the broker's `kid` rotated and AWS - cached the old JWKS, re-register the provider: - `aws iam delete-open-id-connect-provider …` then re-create per + cached the old JWKS, re-register the provider per `cloud-setup.md §4.2`. -### 14.3 S3 GetObject returns AccessDenied for own prefix +### 14.6 S3 GetObject returns AccessDenied for own prefix The JWT isn't carrying the `https://aws.amazon.com/tags` claim. Decode and check (per §4.4 above). If the claim is present, confirm the role's trust policy has `sts:TagSession` and the `aws:RequestTag/...` condition (per `cloud-setup.md §4.3`). -### 14.4 Broker exits 0 cleanly after ~24h +### 14.7 Broker exits 0 cleanly after ~24h Designed behavior — the broker has a 24h max-uptime serve loop. The systemd unit ships with `Restart=always` (commit @@ -968,6 +2017,46 @@ systemd unit ships with `Restart=always` (commit systemd restarts it automatically. Verify with `sudo journalctl -u agentkeys-broker --since "1 day ago" | grep -E "max-uptime|listening"`. 
+### 14.8 `agentkeys signer sign` returns `Error: SIGNER_UNAUTHORIZED invalid session JWT: ExpiredSignature` + +The CLI's `--session-id` flag defaults to `master`. If you ran +`bash scripts/agentkeys-init-email-demo.sh --session-id alice` (which +writes `~/.agentkeys/alice/session.json`) but then called +`agentkeys signer sign …` without threading the session-id, the CLI +read `~/.agentkeys/master/session.json` instead, almost certainly an +older session whose JWT has since expired. + +Diagnose: + +```bash +# Confirm which file the CLI would read by default (master) vs. the one +# init-email-demo.sh just wrote (alice). +ls -la ~/.agentkeys/master/session.json ~/.agentkeys/alice/session.json +# Decode the JWT exp claim from each; the older one is what the bare +# `agentkeys signer sign` was using. +for f in ~/.agentkeys/{master,alice}/session.json; do + echo "=== $f ===" + payload="$(jq -r '.token' "$f" | awk -F. '{print $2}')" + pad=$(( (4 - ${#payload} % 4) % 4 )) + # Append exactly $pad '=' characters (none when pad is 0). + printf '%s' "$payload$(printf '%*s' "$pad" '' | tr ' ' '=')" | tr '_-' '/+' \ + | base64 -d 2>/dev/null | jq '{exp_iso: (.exp | todate)}' +done +``` + +Fix: pin the right session for the rest of this shell: + +```bash +export AGENTKEYS_SESSION_ID=alice # or whichever --session-id you initialized +``` + +This matches the same pattern §0.4 and §2.4 use. The bare per-call +alternative is `agentkeys --session-id alice signer sign …` but the +env-var sticks across §2 + §4, which is what the demo assumes. + +If `alice`'s JWT is also expired (init was >5h ago), re-run +`bash scripts/agentkeys-init-email-demo.sh --session-id alice` to mint +a fresh one. `ttl_seconds` is 18000 (5h) by default. + --- ## 15. What's intentionally not yet live @@ -975,34 +2064,55 @@ systemd restarts it automatically. 
Verify with These ship behind their own user-stories or hardening passes; the structural plumbing is in place but the live integration isn't wired: +- **TEE-backed signer (issue #74 step 2).** Today's + `dev_key_service` keeps the master secret in a plain env var — fine + for dev / demo / single-operator deployments, **not** for any + environment where compromise of the host shell would be a security + incident. Step 2 swaps it for a TEE worker behind the same wire + shape. Daemon and CLI code do not change. See + [`docs/spec/signer-protocol.md`](spec/signer-protocol.md) for the + attestation handshake the TEE backend will add (`GET /dev/attestation`). - **Live EVM audit anchor.** The `EvmStubAnchor` round-trips without network. Real transaction submission + receipt polling lands in Phase E hardening (V0.1-FOLLOWUPS). - **TEE-derived OIDC signer.** The on-disk ES256 keypair is the v0.1 - signer. Plan §8 (TEE) replaces it without changing JWKS/JWT/STS shape. + signer for the broker's OIDC keypair (separate from the + dev_key_service master secret). Plan §8 (TEE) replaces it without + changing JWKS/JWT/STS shape. - **`BROKER_REQUIRE_EXPLICIT_GRANT=true` default-on.** Today the Phase-0 NoGrant migration window is open; flip the default once every daemon has been issued a grant. - **Histogram metrics + per-handler counter bumps.** Counter shapes ship; latency histograms land in V0.1-FOLLOWUPS. -- **Retire `/v1/mint-aws-creds` entirely (issue #71 Option A - closing step).** Provisioner / MCP / daemon now use - `/v1/mint-oidc-jwt` + client-side `AssumeRoleWithWebIdentity` - (landed in this guide's commit set). The endpoint stays for callers - who want server-side gates (audit + grants + idempotency); once - every operator's pipeline confirms the new path works in - production, the route can be dropped. +- **Retire `/v1/mint-aws-creds` entirely.** The provisioner / MCP / + daemon use `/v1/mint-oidc-jwt` + client-side + `AssumeRoleWithWebIdentity` (issue #71 Option A). 
The route stays + for callers who want server-side gates; once every operator's + pipeline confirms the new path works in production, the route can + be dropped. +- **Retire `/v1/auth/exchange` and backend `/session/validate`.** + Issue #74 step 1's CLI/daemon rewrite (this PR) removed every + in-tree caller of the legacy `/session/create` → bearer → + `/v1/auth/exchange` chain — production code now goes through + email/OAuth2 → omni → derive → SIWE → session-JWT. The shim itself + still exists for backward-compat with any out-of-tree caller; a + cleanup PR will delete the route, the validator + (`broker-server/src/auth.rs::validate_bearer_token`), and the env + vars (`BROKER_BACKEND_URL`, `BROKER_BACKEND_TIMEOUT_SECONDS`) once + external callers have migrated. See [`docs/spec/plans/issue-64/V0.1-FOLLOWUPS.md`](spec/plans/issue-64/V0.1-FOLLOWUPS.md) -for the prioritized backlog. +for the prioritized backlog and +[`docs/spec/plans/issue-74-dev-key-service-plan.md`](spec/plans/issue-74-dev-key-service-plan.md) +for the post-issue-#74 roadmap. --- ## 16. Live walkthrough on broker.litentry.org -This section is the copy-paste runbook for verifying the migration -end-to-end against the **live** broker at `https://broker.litentry.org`. -Each block is tagged with where it runs. +Copy-paste runbook for verifying the migration end-to-end against the +**live** broker at `https://broker.litentry.org`. Each block is +tagged with where it runs. ### 16.1 Pull + redeploy on the broker host @@ -1014,14 +2124,18 @@ git fetch origin git checkout evm git pull --ff-only -# Redeploy via the systemd-aware upgrade script. After the OIDC-only -# migration the broker no longer needs DAEMON_ACCESS_KEY_ID env vars; -# the systemd unit can run with no AWS creds. -sudo bash scripts/setup-broker-host.sh --upgrade - -# Verify the broker is up. -sudo systemctl --no-pager status agentkeys-broker -sudo journalctl -u agentkeys-broker -n 50 --no-pager +# Idempotent re-deploy. 
Same script handles bootstrap and upgrade — +# no `--upgrade` flag needed. Issue #74 step 1 made the script +# auto-generate /etc/agentkeys/dev-key-service.env on first run and +# preserve it on subsequent runs (rotating it would invalidate every +# previously-derived wallet). +sudo bash scripts/setup-broker-host.sh --yes + +# Verify the broker + backend are up. +sudo systemctl --no-pager status agentkeys-broker agentkeys-backend +sudo journalctl -u agentkeys-broker -n 50 --no-pager +sudo journalctl -u agentkeys-backend -n 10 --no-pager +# Look for: [mock-server] dev_key_service ENABLED (DEV ONLY — replace with TEE worker per issue #74 step 2) ``` ### 16.2 Verify broker is creds-free @@ -1032,10 +2146,7 @@ sudo systemctl show agentkeys-broker | grep -E "^Environment=" | tr ' ' '\n' \ | grep -E "AWS_|DAEMON_|BROKER_DAEMON_" || echo "OK: no AWS_* / DAEMON_* env vars" ``` -The expected output is `OK: no AWS_* / DAEMON_* env vars`. If the -unit still has `Environment=AWS_PROFILE=...` from a pre-migration -deployment, drop the line and `sudo systemctl daemon-reload && -sudo systemctl restart agentkeys-broker`. +The expected output is `OK: no AWS_* / DAEMON_* env vars`. ### 16.3 Public health checks (no creds needed) @@ -1044,10 +2155,8 @@ sudo systemctl restart agentkeys-broker`. curl -sS -o /dev/null -w 'HTTP %{http_code}\n' https://broker.litentry.org/healthz # HTTP 200 -# `/readyz` is self-describing — body has `status: ready | degraded | -# unready` and a `checks` array. HTTP 200 = ready/degraded, 503 = unready. 
curl -sS https://broker.litentry.org/readyz | jq -r .status -# ready ← anything else: `curl -s …/readyz | jq` for the full body +# ready curl -sS --fail-with-body https://broker.litentry.org/.well-known/openid-configuration | jq -r .issuer # https://broker.litentry.org @@ -1056,41 +2165,72 @@ curl -sS --fail-with-body https://broker.litentry.org/.well-known/jwks.json | jq # {"kty":"EC","crv":"P-256","alg":"ES256","kid":"v1-…"} ``` -### 16.4 SIWE wallet auth → session JWT +### 16.4 Managed-wallet SIWE auth via the dev_key_service -Generate two test wallets, sign in as wallet A, capture session JWT. -Same as §2 above against the live broker. Repeat for wallet B if you -want to demo the isolation property in §16.6. +Point the workstation at the public signer hostname (§0.2): + +```bash +# === ON OPERATOR WORKSTATION === +export AGENTKEYS_SIGNER_URL=https://signer.litentry.org +export BACKEND_URL=$AGENTKEYS_SIGNER_URL +curl -sS $BACKEND_URL/healthz # → ok + +# Make sure follow-up `agentkeys signer sign` calls read the session +# this section initted (not the default `master`, which is usually +# stale — see §14.8). +export AGENTKEYS_SESSION_ID=alice +``` + +Compute omnis + derive wallets + run SIWE round-trip — exactly §0.3 +through §2.4 above, just with `$OIDC_ISSUER=https://broker.litentry.org` +and `$BACKEND_URL=https://signer.litentry.org`. No tunnel; the signer +listener is fronted by nginx with TLS (issued via certbot per §0.2). + +```bash +omni() { printf '%s%s%s' "agentkeys" "$1" "$2" | shasum -a 256 | awk '{print $1}'; } +OMNI_A=$(omni email "alice@demo.example") +OMNI_B=$(omni email "bob@demo.example") + +ADDR_A=$(agentkeys --json signer derive --signer-url $BACKEND_URL --omni-account $OMNI_A | jq -r .address) +ADDR_B=$(agentkeys --json signer derive --signer-url $BACKEND_URL --omni-account $OMNI_B | jq -r .address) + +# SIWE round-trip for A. 
+START=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/wallet/start \ + -H 'content-type: application/json' \ + -d "$(jq -n --arg a "$ADDR_A" '{address:$a, chain_id:84532}')") +REQ_ID=$(printf '%s' "$START" | jq -r .request_id) +SIWE_MSG=$(printf '%s' "$START" | jq -r .siwe_message) +SIG_A=$(agentkeys --json signer sign --signer-url $BACKEND_URL --omni-account $OMNI_A --message "$SIWE_MSG" | jq -r .signature) +VERIFY=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/auth/wallet/verify \ + -H 'content-type: application/json' \ + -d "$(jq -n --arg r "$REQ_ID" --arg s "$SIG_A" '{request_id:$r, signature:$s}')") +SESSION_JWT_A=$(printf '%s' "$VERIFY" | jq -r .session_jwt) +echo "SESSION_JWT_A=${SESSION_JWT_A:0:32}…" +``` + +Repeat for B. Or, for the demo's purposes, only A is needed for the +mint paths in §16.5, and the seed objects + isolation proof in §16.6 +exercise both prefixes. ### 16.5 Mint OIDC JWT + AssumeRoleWithWebIdentity (the new auto-provision path) ```bash # === ON OPERATOR WORKSTATION === -# (Assumes operator-workstation.env was sourced in §0 — $OIDC_ISSUER, -# $DATA_ROLE_ARN, $ACCOUNT_ID are already set.) awsp agentkeys-admin -# Get the OIDC JWT. JWT=$(curl -sS --fail-with-body -X POST $OIDC_ISSUER/v1/mint-oidc-jwt \ -H "Authorization: Bearer $SESSION_JWT_A" | jq -r .jwt) -echo "JWT=${JWT:0:32}… length=${#JWT}" echo "JWT prefix: ${JWT:0:40}…" -# Exchange it for AWS creds — UNAUTHENTICATED to AWS (the JWT authenticates). 
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN AWS_PROFILE CREDS=$(aws sts assume-role-with-web-identity \ --role-arn "$DATA_ROLE_ARN" \ --role-session-name "live-demo-$(date +%s)" \ --web-identity-token "$JWT") -echo "CREDS=${CREDS:0:32}… length=${#CREDS}" export AWS_ACCESS_KEY_ID=$(printf '%s' "$CREDS" | jq -r .Credentials.AccessKeyId) -echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:0:32}… length=${#AWS_ACCESS_KEY_ID}" export AWS_SECRET_ACCESS_KEY=$(printf '%s' "$CREDS" | jq -r .Credentials.SecretAccessKey) -echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:0:32}… length=${#AWS_SECRET_ACCESS_KEY}" export AWS_SESSION_TOKEN=$(printf '%s' "$CREDS" | jq -r .Credentials.SessionToken) -echo "AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN:0:32}… length=${#AWS_SESSION_TOKEN}" -# Confirm — the assumed role identity, NOT your admin profile. aws sts get-caller-identity # { # "UserId": "AROA…:live-demo-…", @@ -1102,18 +2242,14 @@ aws sts get-caller-identity ```bash # === ON OPERATOR WORKSTATION (still with assumed-role creds) === -WALLET_A_LC=$(echo "$ADDR_A" | tr '[:upper:]' '[:lower:]') -echo "WALLET_A_LC=$WALLET_A_LC" -WALLET_B_LC=$(echo "$ADDR_B" | tr '[:upper:]' '[:lower:]') -echo "WALLET_B_LC=$WALLET_B_LC" # Wallet A's prefix — SUCCESS. aws s3api list-objects-v2 --bucket "$BUCKET" \ - --prefix "bots/${WALLET_A_LC}/" --query 'Contents[*].Key' + --prefix "bots/${ADDR_A}/" --query 'Contents[*].Key' # Wallet B's prefix — AccessDenied (cloud-enforced). 
-aws s3api get-object --bucket "$BUCKET" \ - --key "bots/${WALLET_B_LC}/hello.txt" /tmp/got-B.txt +aws s3api get-object --region "$REGION" --bucket "$BUCKET" \ + --key "bots/${ADDR_B}/hello.txt" /tmp/got-B.txt # An error occurred (AccessDenied) when calling the GetObject operation ``` @@ -1123,20 +2259,58 @@ aws s3api get-object --bucket "$BUCKET" \ # === ON OPERATOR WORKSTATION === unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN -# The daemon reads these env vars and threads them through to the -# provisioner's fetch_via_broker_default_ttl(). export AGENTKEYS_BROKER_URL=https://broker.litentry.org export AGENTKEYS_DATA_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role +export AGENTKEYS_SIGNER_URL=$BACKEND_URL # public signer URL from §0.2 export AWS_REGION=us-east-1 -# Run the provisioner-driven scraper. The subprocess receives -# AWS_ACCESS_KEY_ID/SECRET/SESSION_TOKEN via env injection — those creds -# are minted by the daemon calling /v1/mint-oidc-jwt + AssumeRoleWithWebIdentity. -agentkeys-cli provision --service openrouter +# Bootstrap the alice session via the new flow. The CLI prompts you +# to click the magic link; once verified, it derives + links + SIWEs +# and saves the EVM session JWT under ~/.agentkeys/alice/session.json +# (or the OS keychain). --session-id alice keeps this isolated from +# any prior `master` session. +agentkeys --session-id alice init \ + --email alice@demo.example \ + --broker-url $AGENTKEYS_BROKER_URL \ + --signer-url $AGENTKEYS_SIGNER_URL + +# Pin the alice session for the provisioner subprocess too — without +# this, the provisioner falls back to --session-id master and reads +# whatever stale JWT lives there (see §14.8). +export AGENTKEYS_SESSION_ID=alice + +# Now run the provisioner. AWS temp creds get minted via +# /v1/mint-oidc-jwt + AssumeRoleWithWebIdentity using the saved +# EVM session JWT. 
+agentkeys provision openrouter # … scraper runs, fetches the verification email from S3 using the # injected temp creds … ``` +For a long-lived headless daemon (e.g. on a server), use +`agentkeys-daemon --init-email ` instead — same flow, but the +daemon stays running afterward to serve MCP via stdio: + +```bash +agentkeys-daemon \ + --session-id alice \ + --backend $BACKEND_URL \ + --broker-url $AGENTKEYS_BROKER_URL \ + --signer-url $AGENTKEYS_SIGNER_URL \ + --init-email alice@demo.example \ + --stdio +# agentkeys-daemon: bootstrapping via email-link for alice@demo.example; click the magic link in your inbox +# (operator clicks the magic link in their inbox) +# (daemon then enters MCP-stdio loop) +``` + +The daemon's `--session-id` mirrors the CLI's: it pins which +`~/.agentkeys//session.json` the long-running process reads + writes. +Omitting it falls back to a `daemon-` auto-discovered fallback +(see `agentkeys-daemon --help`) — fine for the very-first run on a +clean machine, but explicit `--session-id alice` keeps the daemon +session aligned with the CLI tenant for the operator-tracing case. + ### 16.8 Audit log inspection ```bash @@ -1153,10 +2327,9 @@ sudo sqlite3 /var/lib/agentkeys/.agentkeys/broker/audit.sqlite \ After the OIDC-only migration, the daemon-side path is invisible to the broker's audit log (the broker only sees `/v1/mint-oidc-jwt` calls). Use AWS CloudTrail's `AssumeRoleWithWebIdentity` events for -the STS-side audit trail. - -If you need server-side audit row coverage of the actual mint, hit -`/v1/mint-aws-creds` instead — it audits before returning creds. +the STS-side audit trail. If you need server-side audit row coverage +of the actual mint, hit `/v1/mint-aws-creds` instead — it audits before +returning creds. --- @@ -1170,13 +2343,27 @@ awsp agentkeys-admin aws sts get-caller-identity # confirm: back to admin ``` +(No tunnel to tear down post-step-1b — the signer is reached via +its public hostname, not via SSH.) 
+ The broker keeps running. To tear down the cloud-side state -(provider, role, bucket policy), follow `cloud-setup.md §6`. +(provider, role, bucket policy), follow `cloud-setup.md §7`. + +> **Do NOT casually rotate `DEV_KEY_SERVICE_MASTER_SECRET`** — +> rotating invalidates every previously-derived wallet for every +> linked identity. The TEE worker (issue #74 step 2) will define a +> formal rotation runbook with key-version bumps; the dev backend +> intentionally has none. --- ## Cross-references +- [`docs/spec/signer-protocol.md`](spec/signer-protocol.md) — v0 + wire contract for the signer edge (`/dev/derive-address`, + `/dev/sign-message`, error envelope, future attestation handshake). +- [`docs/spec/plans/issue-74-dev-key-service-plan.md`](spec/plans/issue-74-dev-key-service-plan.md) + — the canonical issue #74 plan. - [`docs/operator-runbook-stage7.md`](operator-runbook-stage7.md) — authoritative env-var inventory, BOOT_FAIL anchors, recovery procedures, OAuth2/email setup details. @@ -1186,8 +2373,5 @@ The broker keeps running. To tear down the cloud-side state the canonical Stage 7 plan (§6 Refuse-to-boot tiers; §3.5 plugin trait surface; §3.5.4 OAuth2 security posture; §3.5.6 dual-keypair rationale). -- [`docs/spec/plans/issue-64/PHASE-0-CHECKPOINT.md`](spec/plans/issue-64/PHASE-0-CHECKPOINT.md) - — Phase-0-isolated localhost checkpoint that this guide - generalizes to a real cloud deployment. - [`harness/stage-7-issue-64-done.sh`](../harness/stage-7-issue-64-done.sh) — programmatic equivalent of §13 above (the gate CI runs). diff --git a/hardcoded.md b/hardcoded.md new file mode 100644 index 0000000..599fbdf --- /dev/null +++ b/hardcoded.md @@ -0,0 +1,99 @@ +# Hardcoded values audit log + +Per `CLAUDE.md` "No-hardcoded-values policy": every hardcoded value in +the codebase that hasn't been parameterized to env vars / CLI flags / +config files must be logged here, with the trade-off explanation + +the concrete change that would unblock making it dynamic. 
+ +The intent is **not** to eliminate every hardcoded value — some +(system user names, well-known file paths, RFC-defined constants) are +correctly hardcoded forever. The intent is to make every "I'll fix it +later" a deliberate decision instead of an oversight. + +--- + +## Format + +Each entry: file path + line, what's hardcoded, why, what would unblock +parameterization. + +--- + +## Operator-deployment-pinned values (litentry-account-specific) + +These pin the canonical demo/prod deployment to litentry's AWS account ++ DNS zones. Operators forking the project must edit these (or override +via env). Logged here so a fork-attempt operator finds the full list. + +### `scripts/operator-workstation.env` + +| Line | Value | Why hardcoded | Unblock | +|---|---|---|---| +| 25 | `ACCOUNT_ID=429071895007` | Default to litentry's AWS account so the runbook is copy-pasteable. | Operators forking already override by editing this file (it's the canonical override point). No further parameterization needed. | +| 28 | `REGION=us-east-1` | SES inbound is region-restricted to `us-east-1` / `us-west-2` / `eu-west-1` per AWS docs; defaulting to `us-east-1` matches `cloud-setup.md §0`. | Operator override by editing the env file. | +| 32 | `BROKER_HOST=broker.litentry.org` | Litentry's broker hostname. | Operator override by editing the env file. | +| 84 | `MAIL_DOMAIN=bots.litentry.org` | Litentry's email subdomain (verified per `cloud-setup.md §1.1`). | Operator override by editing the env file. | +| 97 | `BROKER_EMAIL_FROM_ADDRESS=noreply-test@${MAIL_DOMAIN}` | Default sender for the integration test + broker. Computed from `MAIL_DOMAIN` so a fork operator only edits one place. | Single point of truth — already correct. | + +### `scripts/broker.env` + +| Line | Value | Why hardcoded | Unblock | +|---|---|---|---| +| 35 | `ACCOUNT_ID=429071895007` | Litentry's AWS account ID. Single source of truth — derived ARNs (BROKER_DATA_ROLE_ARN below) reference `${ACCOUNT_ID}`. 
| Operator override by editing the env file. | +| 41 | `BROKER_DATA_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role` | Derived from `ACCOUNT_ID` via bash expansion at source-time. Role name fixed by cloud-setup.md §3.2. | OK — single source of truth via `ACCOUNT_ID`. | +| 47 | `BROKER_OIDC_ISSUER=https://broker.litentry.org` | Must match the broker's public hostname byte-for-byte (AWS validates JWT iss claim). | Operator override by editing the env file. | +| 71 | `BROKER_EMAIL_FROM_ADDRESS=noreply-test@bots.litentry.org` | Default SES sender. | Operator override by editing the env file. | + +### `scripts/setup-broker-host.sh` + +| Line | Value | Why hardcoded | Unblock | +|---|---|---|---| +| 67 | `REGION="us-east-1"` | Default if not passed via `--region` / unit-detected. Same rationale as operator-workstation.env line 28. | `--region` CLI flag already exists. OK. | +| 84 | `BROKER_EMAIL_FROM_ADDRESS="${BROKER_EMAIL_FROM_ADDRESS:-noreply-test@bots.litentry.org}"` | Default sender if not passed via `--email-from` / env. | `--email-from` CLI flag already exists. OK. | + +--- + +## Deployment-architecture-pinned values + +These are pinned for the canonical broker-host layout. Changing them +requires also changing the systemd units, nginx configs, and the +broker's expectations at startup. + +### Loopback ports + +| File | Line | Value | Why hardcoded | Unblock | +|---|---|---|---|---| +| `scripts/setup-broker-host.sh` | various | broker `:8091`, backend `:8090`, signer `:8092` | The 3-port split is the architectural separation between the public broker, the internal backend, and the dedicated signer (per `architecture.md` §10). Changing requires re-coordinated edits to systemd units, nginx server blocks, and the broker's `--port` flag. | Add `--broker-port` / `--backend-port` / `--signer-port` flags + env var alternates. Low-priority — the canonical layout is the only deployment shape. 
| + +### System user + paths + +| File | Line | Value | Why hardcoded | Unblock | +|---|---|---|---|---| +| `scripts/setup-broker-host.sh` | various | `agentkeys` system user / `agentkeys` group | The systemd units, file ownership, and ProtectSystem sandbox all reference this user. | Renaming would require an in-place migration (chown every file). Not worth parameterizing. | +| `scripts/setup-broker-host.sh` | 532 | `/etc/agentkeys/dev-key-service.env` | K3 master-secret env file path. The backend + signer systemd units `EnvironmentFile=` this exact path. | Could be made `--secret-env-path` flag. Low-priority — the canonical path is the only deployment shape. | +| `scripts/setup-broker-host.sh` | various | `/var/lib/agentkeys/.agentkeys/broker/session-keypair.pub.pem` | The broker writes here; the signer reads from here. Hard-coded into both. | Could be `--session-pubkey-path` flag. Low-priority. | + +--- + +## Code-level constants + +| File | Line | Value | Why hardcoded | Unblock | +|---|---|---|---|---| +| `crates/agentkeys-broker-server/src/plugins/auth/email_link.rs` | 46 | `TOKEN_TTL_SECONDS: i64 = 600` | Magic-link TTL (10 min) per Plan §3.5.3. | Could be `BROKER_EMAIL_TOKEN_TTL_SECONDS` env var. Reasonable to leave as constant unless an operator needs longer/shorter window. | +| `crates/agentkeys-broker-server/src/plugins/auth/email_link.rs` | various | per-email rate limit default 5/hr, per-IP default 30/min | Operational defaults. Already env-overridable via `BROKER_EMAIL_RATE_LIMIT_PER_EMAIL_HOURLY` + `BROKER_EMAIL_RATE_LIMIT_PER_IP_MINUTELY`. | Already parameterized. OK. | +| `crates/agentkeys-broker-server/tests/ses_email_flow.rs` | 36 | `DEFAULT_REGION: &str = "us-east-1"` | Test default if `AWS_REGION` env unset. | Already env-overridable. OK. | +| `crates/agentkeys-broker-server/tests/ses_email_flow.rs` | 37 | `DEFAULT_MAIL_DOMAIN: &str = "bots.litentry.org"` | Test default if `MAIL_DOMAIN` env unset. | Already env-overridable. OK. 
| +| `crates/agentkeys-broker-server/tests/ses_email_flow.rs` | 38 | `DEFAULT_FROM_LOCAL: &str = "noreply-test"` | Test default if `BROKER_EMAIL_FROM_ADDRESS` env unset. | Already env-overridable. OK. | +| `crates/agentkeys-broker-server/tests/ses_email_flow.rs` | 41 | `POLL_MAX_ATTEMPTS: usize = 12` (60s total) | Empirical SES → S3 inbound delivery latency budget. | Could be `SES_TEST_TIMEOUT_S` env var. Reasonable to leave as constant. | + +--- + +## Open trade-offs (decision pending) + +### Email-link HMAC removal (commit `b8481fe`) + +`EmailLinkAuth` previously held a vestigial `hmac_key` field that was loaded + length-validated but never used cryptographically. Removed in `b8481fe` to align with `architecture.md §3` K-table (no HMAC key listed) and §5a.1.M Stage 1 (magic-link is stateful). + +**Trade-off**: in a multi-broker-replica deployment with shared SQLite, stateless HMAC tokens become attractive again (avoids a DB round-trip per verify). v0.1 is single-broker so this doesn't apply, but v0.2+ with replica scaling should revisit. + +**Unblock**: tracked in [issue #81 — v0.2+ email-auth enhancement: WebAuthn binding integration + stateless HMAC tokens for multi-broker scale](https://github.com/litentry/agentKeys/issues/81). Re-introduction will add **K12** (Email-token HMAC key) to `architecture.md §3` and revert the relevant pieces of `b8481fe` with proper architectural documentation this time. The same issue also tracks the v0.2 WebAuthn binding ceremony at email_link Stage 2 (currently v1c-interim ships bespoke per-identity PoP shapes). diff --git a/harness/stage-5a-live-demo-handoff.sh b/harness/stage-5a-live-demo-handoff.sh index d6d0325..0e2936b 100755 --- a/harness/stage-5a-live-demo-handoff.sh +++ b/harness/stage-5a-live-demo-handoff.sh @@ -59,8 +59,22 @@ if ! ls "${HOME}/Library/Caches/ms-playwright/chromium_headless_shell-"* >/dev/n fail "Playwright chromium not installed under \$HOME=$HOME. 
Run: npx playwright install chromium --with-deps" fi -say "1. Initialize master session" -$BIN --backend $BACKEND init --mock-token stage5-live-demo || fail "init" +say "1. Initialize master session (issue #74 step 1: signer-flow bootstrap)" +# --mock-token was hard-cut in issue #74 step 1. The new bootstrap chain is +# email/OAuth2 → identity-omni session JWT → /dev/derive-address → +# /v1/wallet/link → SIWE round-trip via dev_key_service → EVM session JWT. +# AGENTKEYS_BROKER_URL must point at a broker that advertises email_link +# auth (BROKER_AUTH_METHODS includes "email_link") and AGENTKEYS_SIGNER_URL +# at the backend serving /dev/derive-address + /dev/sign-message +# (defaults to --backend; the mock-server hosts both). +: "${AGENTKEYS_BROKER_URL:?AGENTKEYS_BROKER_URL must be set for the new init flow (issue #74 step 1)}" +$BIN --backend $BACKEND \ + init \ + --email "$AGENTKEYS_SIGNUP_EMAIL" \ + --broker-url "$AGENTKEYS_BROKER_URL" \ + --signer-url "${AGENTKEYS_SIGNER_URL:-$BACKEND}" \ + --poll-timeout-seconds "${INIT_POLL_TIMEOUT_SECONDS:-300}" \ + || fail "init (email-link → dev_key_service → SIWE)" say "2. Env snapshot (masking secrets)" env | grep -E 'AGENTKEYS_(EMAIL|SIGNUP)_' | sed 's/\(PASSWORD=\).*/\1***REDACTED***/' diff --git a/scripts/agentkeys-demo-show.sh b/scripts/agentkeys-demo-show.sh new file mode 100755 index 0000000..f6cea38 --- /dev/null +++ b/scripts/agentkeys-demo-show.sh @@ -0,0 +1,209 @@ +#!/usr/bin/env bash +# scripts/agentkeys-demo-show.sh — one-line rich-output inspector for an +# agentkeys session JWT, plus the signer-derive smoke-test wallet. 
+# +# Companion to `agentkeys-init-email-demo.sh` — after init lands a session +# under `~/.agentkeys//session.json`, this script extracts and +# pretty-prints every value §0.4 of stage7-demo-and-verification.md needs +# to drive `agentkeys signer derive` / `signer sign` / S3-isolation calls, +# in ONE invocation: +# +# - identity_omni (from agentkeys.identity_value, recomputed) +# - identity_type ("email" / "oauth2_google") +# - actor_omni (JWT.agentkeys.omni_account — the durable EVM omni) +# - master_wallet (JWT.agentkeys.wallet_address — bound to actor_omni +# via SIWE at init; this is the wallet AWS PrincipalTag +# matches against, i.e. the wallet for §4 S3 prefix) +# - signer_derive_addr (a SECOND wallet = HKDF(K3, actor_omni); useful +# as a signer-wire smoke test but NOT what AWS sees — +# see §0.4 for the key-topology explanation) +# - jwt_expires_at + ttl_remaining (so you know to re-init before §4) +# +# Usage: +# bash scripts/agentkeys-demo-show.sh # default: master session +# bash scripts/agentkeys-demo-show.sh alice # ~/.agentkeys/alice/session.json +# AGENTKEYS_SESSION_ID=alice bash scripts/agentkeys-demo-show.sh +# bash scripts/agentkeys-demo-show.sh --no-derive # skip the signer wire-test +# bash scripts/agentkeys-demo-show.sh --json # one-shot machine-readable +# +# Prereqs (operator workstation): jq, base64; for --derive (default): +# AGENTKEYS_SIGNER_URL set (sourced from operator-workstation.env), and +# the `agentkeys` CLI on $PATH. + +set -euo pipefail + +SESSION_ID="${AGENTKEYS_SESSION_ID:-master}" +DO_DERIVE=1 +JSON_OUTPUT=0 +EXPORT_PREFIX="" + +while [[ $# -gt 0 ]]; do + case "$1" in + --no-derive) DO_DERIVE=0; shift ;; + --json) JSON_OUTPUT=1; shift ;; + --export) + # --export emits eval-able VAR=value lines so the doc / + # an operator script can capture all six fields in one `eval $(...)`. + # Prefix is uppercased + suffixed with _ — e.g. 
--export A emits + # OMNI_A=… ADDR_A=… MASTER_WALLET_A=… IDENTITY_TYPE_A=… IDENTITY_VALUE_A=… + [[ $# -lt 2 ]] && { printf -- '--export requires a prefix label\n' >&2; exit 2; } + EXPORT_PREFIX="$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')" + DO_DERIVE=1 # caller wants ADDR — force derive + shift 2 ;; + --export=*) + EXPORT_PREFIX="$(printf '%s' "${1#*=}" | tr '[:lower:]' '[:upper:]')" + DO_DERIVE=1; shift ;; + -h|--help) + sed -n '2,/^set -euo/p' "$0" | sed '$d' | sed 's/^# \{0,1\}//' + exit 0 ;; + --*) printf 'unknown flag: %s\n' "$1" >&2; exit 2 ;; + *) SESSION_ID="$1"; shift ;; + esac +done + +SESSION_FILE="$HOME/.agentkeys/$SESSION_ID/session.json" +if [[ ! -f "$SESSION_FILE" ]]; then + printf 'no session file at %s\n' "$SESSION_FILE" >&2 + printf ' run: bash scripts/agentkeys-init-email-demo.sh --session-id %s\n' "$SESSION_ID" >&2 + exit 1 +fi + +# Decode JWT body (URL-safe base64, padded). awk + base64 is portable to +# macOS (/bin/bash 3.2, no GNU coreutils). The signer's strict JWT-omni +# check (issue #74 step 1b) means the canonical omni for any subsequent +# /dev/* call is whatever appears here — DO NOT recompute from email +# address (omni("email", addr) is wrong; the JWT post-SIWE carries the +# EVM-omni, not the identity-omni). +JWT_BODY=$(jq -r .token "$SESSION_FILE" | awk -F. 
'{ + p=$2; pad = 4 - length(p) % 4; + if (pad < 4) for (i=0; i/dev/null) + +if [[ -z "$JWT_BODY" ]]; then + printf 'failed to decode JWT body from %s — file may be corrupt or empty\n' "$SESSION_FILE" >&2 + exit 1 +fi + +ACTOR_OMNI=$(printf '%s' "$JWT_BODY" | jq -r '.agentkeys.omni_account') +MASTER_WALLET=$(printf '%s' "$JWT_BODY" | jq -r '.agentkeys.wallet_address') +IDENTITY_TYPE=$(printf '%s' "$JWT_BODY" | jq -r '.agentkeys.identity_type') +IDENTITY_VALUE=$(printf '%s' "$JWT_BODY" | jq -r '.agentkeys.identity_value') +EXP=$(printf '%s' "$JWT_BODY" | jq -r '.exp') +NOW=$(date +%s) +TTL_REMAINING=$(( EXP - NOW )) + +# Recompute the identity_omni locally (transient — not in the JWT post-SIWE). +# Matches crates/agentkeys-broker-server/src/identity/omni_account.rs. +IDENTITY_OMNI=$(printf 'agentkeys%s%s' "$IDENTITY_TYPE" "$IDENTITY_VALUE" \ + | shasum -a 256 | awk '{print $1}') + +SIGNER_DERIVE_ADDR="" +SIGNER_NOTE="" +if [[ "$DO_DERIVE" -eq 1 ]]; then + if ! command -v agentkeys >/dev/null 2>&1; then + SIGNER_NOTE="(agentkeys CLI not on PATH — skipped)" + elif ! agentkeys --help 2>&1 | grep -q -- "--session-id"; then + SIGNER_NOTE="(stale 'agentkeys' at $(command -v agentkeys) — missing --session-id flag; rebuild with: bash scripts/install-agentkeys-cli.sh)" + elif [[ -z "${AGENTKEYS_SIGNER_URL:-}" && -z "${BACKEND_URL:-}" ]]; then + SIGNER_NOTE="(AGENTKEYS_SIGNER_URL unset — source operator-workstation.env to enable)" + else + derive_json=$(agentkeys --session-id "$SESSION_ID" --json signer derive \ + --omni-account "$ACTOR_OMNI" 2>&1) || { + SIGNER_NOTE="(signer derive failed: $derive_json)" + derive_json="" + } + if [[ -n "$derive_json" ]]; then + SIGNER_DERIVE_ADDR=$(printf '%s' "$derive_json" | jq -r '.address // empty' 2>/dev/null || true) + [[ -z "$SIGNER_DERIVE_ADDR" ]] && SIGNER_NOTE="(could not parse address from derive response: $derive_json)" + fi + fi +fi + +if [[ -n "$EXPORT_PREFIX" ]]; then + # Emit eval-able shell assignments. 
q-escape values so they survive + # `eval` even if they contain unexpected chars (none of these fields + # should, but defensive — JWT bodies are operator-controlled). + q() { printf '%q' "$1"; } + printf 'SESSION_ID_%s=%s\n' "$EXPORT_PREFIX" "$(q "$SESSION_ID")" + printf 'OMNI_%s=%s\n' "$EXPORT_PREFIX" "$(q "$ACTOR_OMNI")" + printf 'ADDR_%s=%s\n' "$EXPORT_PREFIX" "$(q "$SIGNER_DERIVE_ADDR")" + printf 'MASTER_WALLET_%s=%s\n' "$EXPORT_PREFIX" "$(q "$MASTER_WALLET")" + printf 'IDENTITY_TYPE_%s=%s\n' "$EXPORT_PREFIX" "$(q "$IDENTITY_TYPE")" + printf 'IDENTITY_VALUE_%s=%s\n' "$EXPORT_PREFIX" "$(q "$IDENTITY_VALUE")" + printf 'IDENTITY_OMNI_%s=%s\n' "$EXPORT_PREFIX" "$(q "$IDENTITY_OMNI")" + if [[ -n "$SIGNER_NOTE" && -z "$SIGNER_DERIVE_ADDR" ]]; then + printf 'echo %s >&2\n' "$(q "[demo-show:$SESSION_ID] derive skipped: $SIGNER_NOTE")" + fi + exit 0 +fi + +if [[ "$JSON_OUTPUT" -eq 1 ]]; then + jq -n \ + --arg session_id "$SESSION_ID" \ + --arg session_file "$SESSION_FILE" \ + --arg identity_type "$IDENTITY_TYPE" \ + --arg identity_value "$IDENTITY_VALUE" \ + --arg identity_omni "$IDENTITY_OMNI" \ + --arg actor_omni "$ACTOR_OMNI" \ + --arg master_wallet "$MASTER_WALLET" \ + --arg signer_derive_addr "$SIGNER_DERIVE_ADDR" \ + --arg signer_note "$SIGNER_NOTE" \ + --argjson exp "$EXP" \ + --argjson ttl_remaining "$TTL_REMAINING" \ + '{session_id:$session_id, session_file:$session_file, + identity: {type:$identity_type, value:$identity_value, omni:$identity_omni}, + actor: {omni:$actor_omni, master_wallet:$master_wallet}, + signer_derive: {address:$signer_derive_addr, note:$signer_note}, + jwt: {exp:$exp, ttl_remaining:$ttl_remaining}}' + exit 0 +fi + +bold() { printf '\033[1m%s\033[0m' "$*"; } +cyan() { printf '\033[1;36m%s\033[0m' "$*"; } +green() { printf '\033[1;32m%s\033[0m' "$*"; } +yellow(){ printf '\033[1;33m%s\033[0m' "$*"; } +dim() { printf '\033[2m%s\033[0m' "$*"; } + +ttl_msg="" +if (( TTL_REMAINING < 0 )); then ttl_msg=$(yellow "EXPIRED $(( -TTL_REMAINING ))s 
ago") +elif (( TTL_REMAINING < 300 )); then ttl_msg=$(yellow "${TTL_REMAINING}s — re-init soon") +else ttl_msg=$(green "${TTL_REMAINING}s remaining") +fi + +echo +bold "session_id "; echo ": $SESSION_ID" +bold "session_file "; echo ": $SESSION_FILE" +echo +cyan "── identity (transient — what the human authenticated as) ──"; echo +bold " type "; echo ": $IDENTITY_TYPE" +bold " value "; echo ": $IDENTITY_VALUE" +bold " identity_omni "; echo ": $IDENTITY_OMNI" +dim " = SHA256(\"agentkeys\" || \"$IDENTITY_TYPE\" || \"$IDENTITY_VALUE\")"; echo +dim " (computed locally; NOT present in the post-SIWE JWT — see §0.3)"; echo +echo +cyan "── actor (durable — what AWS / signer / audit see) ──"; echo +bold " actor_omni "; echo ": $ACTOR_OMNI" +dim " (= JWT.agentkeys.omni_account)"; echo +bold " master_wallet "; echo ": $MASTER_WALLET" +dim " (= JWT.agentkeys.wallet_address — the wallet linked at init; audit only)"; echo +echo +cyan "── signer-wire smoke test (NOT used for AWS) ──"; echo +if [[ -n "$SIGNER_DERIVE_ADDR" ]]; then + bold " derive(actor_omni)"; printf ': %s ' "$SIGNER_DERIVE_ADDR"; dim '(HKDF(K3, actor_omni); proves /dev/derive-address wire works)'; echo + if [[ "$SIGNER_DERIVE_ADDR" == "$MASTER_WALLET" ]]; then + yellow " (matches master_wallet — unexpected for email/oauth2; expected only for identity_type=evm)"; echo + else + dim " (≠ master_wallet — expected: master_wallet came from HKDF(K3, identity_omni) at init)"; echo + fi +elif [[ -n "$SIGNER_NOTE" ]]; then + bold " derive(actor_omni)"; echo ": $SIGNER_NOTE" +fi +echo +cyan "── JWT lifetime ──"; echo +bold " exp "; printf ': %s ' "$EXP" +date -r "$EXP" '+(%Y-%m-%d %H:%M:%S %Z)' 2>/dev/null \ + || date -d "@$EXP" '+(%Y-%m-%d %H:%M:%S %Z)' 2>/dev/null || echo +bold " ttl_remaining "; echo ": $ttl_msg" +echo diff --git a/scripts/agentkeys-init-email-demo.sh b/scripts/agentkeys-init-email-demo.sh new file mode 100755 index 0000000..0823bb0 --- /dev/null +++ b/scripts/agentkeys-init-email-demo.sh @@ -0,0 +1,410 
@@ +#!/usr/bin/env bash +# scripts/agentkeys-init-email-demo.sh — fully automated end-to-end demo +# of `agentkeys init --email` against a verified bots.litentry.org alias. +# +# Why: stage 7 demo uses `alice@demo.example` (RFC 2606 example domain, +# undeliverable) so the magic link is sent into the void and the CLI +# polls forever. This script uses an actual SES-routable address at +# bots.litentry.org, polls S3 inbound for the magic-link arrival, +# extracts the broker landing URL, parses the #t= URL fragment, +# and POSTs to /v1/auth/email/verify — replicating exactly what the +# browser-side JS in /auth/email/landing does. Then it waits for the +# foreground `agentkeys init` to confirm and exit. +# +# Prereqs (set on operator workstation): +# awsp agentkeys-admin # admin profile (S3 ListBucket) +# set -a; source scripts/operator-workstation.env; set +a +# # ACCOUNT_ID, REGION, MAIL_DOMAIN, +# # MAIL_BUCKET, OIDC_ISSUER, BACKEND_URL +# +# Usage: +# bash scripts/agentkeys-init-email-demo.sh # auto-pick demo-N alias, session="master" +# bash scripts/agentkeys-init-email-demo.sh demo-1 # use specific local-part +# bash scripts/agentkeys-init-email-demo.sh --session-id alice # writes ~/.agentkeys/alice/session.json +# bash scripts/agentkeys-init-email-demo.sh --session-id alice demo-1 +# RECIPIENT=alice@bots.litentry.org bash scripts/agentkeys-init-email-demo.sh +# AGENTKEYS_SESSION_ID=alice bash scripts/agentkeys-init-email-demo.sh +# +# The default rotates between `demo-1@bots.litentry.org` and +# `demo-2@bots.litentry.org` so consecutive runs don't collide on the +# email_request_status row keyed by the request_id (single-use TTL). +# Override with $RECIPIENT or a positional arg. +# +# **Multi-tenant sessions** (for the §4 isolation proof + general test +# isolation): pass `--session-id ` (or set `AGENTKEYS_SESSION_ID`) +# to write under `~/.agentkeys//session.json` instead of the default +# `~/.agentkeys/master/session.json`. 
Two back-to-back runs with distinct +# session-ids leave both sessions live — no need to re-init to switch +# between them. Subsequent `agentkeys --session-id ...` commands +# read from the matching dir; `bash scripts/agentkeys-demo-show.sh ` +# prints the (omni, wallet) pair for that session. +# +# Idempotent: if the script crashes mid-run, re-running cleans the +# previous attempt's S3 inbound object on the way through. + +set -euo pipefail + +# This script does NOT need root. It only makes AWS API calls (operator +# admin profile creds, in your shell env) and runs the user-space +# `agentkeys` binary (writes session JWT to YOUR OS keychain, not +# root's). Running with sudo strips the env vars you sourced from +# operator-workstation.env and the script dies on the first +# ${VAR:?...} guard with a misleading "env var required" error. +if [[ -n "${SUDO_USER:-}" ]]; then + printf '\033[1;31mxx\033[0m do NOT run this with sudo — sudo strips your env vars,\n' >&2 + printf ' and the script needs to inherit your operator-workstation.env values.\n' >&2 + printf ' Re-run as your normal user:\n' >&2 + printf ' bash scripts/agentkeys-init-email-demo.sh %s\n' "$*" >&2 + exit 1 +fi + +REGION="${REGION:?REGION env var required (source operator-workstation.env)}" +MAIL_DOMAIN="${MAIL_DOMAIN:?MAIL_DOMAIN env var required}" +MAIL_BUCKET="${MAIL_BUCKET:?MAIL_BUCKET env var required}" +OIDC_ISSUER="${OIDC_ISSUER:?OIDC_ISSUER env var required (broker URL)}" +BACKEND_URL="${BACKEND_URL:?BACKEND_URL env var required (signer URL)}" + +POLL_INTERVAL=5 +POLL_MAX_ATTEMPTS=24 # 2 min — magic-link delivery is usually <30s +INBOUND_PREFIX="inbound/" + +log() { printf '\033[1;36m==>\033[0m %s\n' "$*"; } +warn() { printf '\033[1;33m!!\033[0m %s\n' "$*" >&2; } +die() { printf '\033[1;31mxx\033[0m %s\n' "$*" >&2; exit 1; } + +require() { command -v "$1" >/dev/null 2>&1 || die "missing required tool: $1"; } +require aws +require jq +require curl +require agentkeys + +# ─── Preflight: agentkeys 
binary must support --session-id (added 2026-05-12) ─ +# The script ONLY works when the on-PATH `agentkeys` binary knows about the +# top-level --session-id flag — otherwise AGENTKEYS_SESSION_ID is silently +# ignored, the session lands under ~/.agentkeys/master/ regardless of what +# --session-id you passed, and demo-show.sh later fails with "no session file +# at ~/.agentkeys//session.json". +# +# Fail loud + tell the operator EXACTLY what to run to get a fresh binary. +# NOTE: `cargo install --path crates/agentkeys-cli --force` installs to +# ~/.cargo/bin/, but if ~/.local/bin/ comes EARLIER in $PATH (the §0 +# default), the stale ~/.local/bin/agentkeys still shadows the new one +# even after a successful cargo install. Use the helper script instead — +# it installs to ~/.local/bin/ directly (overwriting the shadowing +# binary in place) and runs the same capability check this preflight +# does, so a green exit there means this preflight will also pass. +if ! agentkeys --help 2>&1 | grep -q -- "--session-id"; then + resolved="$(command -v agentkeys)" + cargo_bin="$HOME/.cargo/bin/agentkeys" + shadow_msg="" + if [[ "$resolved" != "$cargo_bin" && -x "$cargo_bin" ]]; then + if "$cargo_bin" --help 2>&1 | grep -q -- "--session-id"; then + shadow_msg=" + Heads-up: a FRESH agentkeys at $cargo_bin already has --session-id, but + $resolved is shadowing it because $(dirname "$resolved") comes earlier + in \$PATH. The install script overwrites $resolved with the new binary." + fi + fi + die "stale 'agentkeys' binary at $resolved — missing --session-id flag. + Rebuild + reinstall (idempotent — safe to re-run on every git pull): + bash scripts/install-agentkeys-cli.sh + then re-run this script. 
(Verify with: agentkeys --help | grep session-id)${shadow_msg}" +fi + +# ─── Argument parsing: --session-id + optional positional recipient ───── +SESSION_ID="${AGENTKEYS_SESSION_ID:-master}" +positional=() +while [[ $# -gt 0 ]]; do + case "$1" in + --session-id) + [[ $# -lt 2 ]] && die "--session-id requires a value" + SESSION_ID="$2"; shift 2 ;; + --session-id=*) SESSION_ID="${1#*=}"; shift ;; + --) shift; while [[ $# -gt 0 ]]; do positional+=("$1"); shift; done ;; + --*) die "unknown flag: $1" ;; + *) positional+=("$1"); shift ;; + esac +done +set -- "${positional[@]:-}" + +# The CLI reads AGENTKEYS_SESSION_ID at parse time; exporting here makes +# the background `agentkeys init` write under ~/.agentkeys/$SESSION_ID/. +export AGENTKEYS_SESSION_ID="$SESSION_ID" + +# ─── Recipient selection ───────────────────────────────────────────────────── +# Precedence: $RECIPIENT > positional arg > $SESSION_ID-derived > demo-N rotation. +# +# The session-id-derived path is critical for "different sessions must produce +# different wallets". HKDF(K3, identity_omni) is deterministic — same omni in, +# same wallet out. identity_omni = SHA256("agentkeys"||type||value), so identical +# recipients map to identical wallets across runs. The legacy demo-1/demo-2 +# rotation (last fallback) collided on back-to-back runs that hit the same epoch +# parity, breaking the §4 two-actor isolation proof. +if [[ -n "${RECIPIENT:-}" ]]; then + recipient="$RECIPIENT" +elif [[ $# -ge 1 && -n "${1:-}" ]]; then + case "$1" in + *@*) recipient="$1" ;; + *) recipient="$1@$MAIL_DOMAIN" ;; + esac +elif [[ "$SESSION_ID" != "master" ]]; then + # Each --session-id gets a unique recipient deterministically. Two runs + # `--session-id alice` + `--session-id bob` are GUARANTEED to produce + # different wallets, no rotation guesswork. + recipient="$SESSION_ID@$MAIL_DOMAIN" +else + # Legacy default path (no --session-id, no positional, no $RECIPIENT). 
+ # Kept for back-compat with pre-multi-tenant doc snippets that just + # called the script bare. Rotates demo-1 / demo-2 by epoch parity. + if (( $(date +%s) % 2 == 0 )); then + recipient="demo-1@$MAIL_DOMAIN" + else + recipient="demo-2@$MAIL_DOMAIN" + fi +fi + +# Show the SHA256 inputs inline so the operator can reproduce the math. +# identity_type for the magic-link flow is "email"; identity_value is the +# lowercased recipient. The broker mints the FIRST JWT with this omni; +# post-SIWE the FINAL JWT carries the evm actor omni instead (see §0.3). +identity_omni_email=$(printf 'agentkeysemail%s' "$(printf '%s' "$recipient" | tr '[:upper:]' '[:lower:]')" \ + | shasum -a 256 | awk '{print $1}') + +log "Session id : $SESSION_ID (writes ~/.agentkeys/$SESSION_ID/session.json)" +log "Recipient : $recipient" +log " identity_omni (email) = $identity_omni_email" +log " = SHA256(\"agentkeys\" || \"email\" || \"$(printf '%s' "$recipient" | tr '[:upper:]' '[:lower:]')\")" +log "Broker URL : $OIDC_ISSUER" +log "Mail bucket : $MAIL_BUCKET" + +# ─── Preflight: AWS caller identity (admin profile required for ListBucket) ─ +caller_arn=$(aws sts get-caller-identity --query 'Arn' --output text 2>&1) \ + || die "aws sts get-caller-identity failed: $caller_arn + Run: awsp agentkeys-admin then re-run this script." +case "$caller_arn" in + *":user/agentkey-broker"*) + die "wrong AWS profile: $caller_arn lacks s3:ListBucket on $MAIL_BUCKET. + Run: awsp agentkeys-admin then re-run this script." ;; +esac +log "Caller ARN : $caller_arn" + +# ─── Preflight: the broker session JWT will be re-minted by `agentkeys init`, +# so any stale session in the keychain is fine — the CLI overwrites it. ── +# (No precheck needed; documented for clarity.) + +# ─── Snapshot inbound BEFORE sending so we can identify the new object ────── +# The bucket has 400+ historical objects (test runs, prior demos). We +# only care about objects that arrive AFTER our SendEmail. 
Snapshot the +# pre-existing key set; later we filter the post-list against this. +log "Snapshotting existing inbound/ keys (filter for NEW arrivals)" +# Build a string-based set of pre-existing keys: space-separated, with +# leading + trailing spaces, so a substring check `*" $k "*` is exact. +# Bash-3.2-compatible (declare -A / associative arrays would be +# cleaner but require bash 4+, and macOS ships /bin/bash 3.2 forever +# due to Apple's GPLv3 freeze). `aws --output text` returns keys +# TAB-separated; `tr '\t' ' '` normalizes them. SES-generated S3 keys +# are alphanumeric (no spaces), so the substring delimiter is safe. +pre_keys_text=$( { aws s3api list-objects-v2 \ + --bucket "$MAIL_BUCKET" --prefix "$INBOUND_PREFIX" \ + --region "$REGION" \ + --query 'Contents[*].Key' --output text 2>/dev/null \ + || true; } | tr '\t' ' ') +PRE_KEYS_SET=" $pre_keys_text " # leading + trailing space for exact match +pre_count=$(printf '%s\n' $pre_keys_text | grep -c . || true) +log " $pre_count existing object(s) — only newer arrivals will be inspected" + +# ─── Fire `agentkeys init --email` in the background ──────────────────────── +# It will print "Magic link sent..." then poll the broker's +# /v1/auth/email/status endpoint. When we click the link, the broker +# flips status → verified and the CLI completes. +log "Starting agentkeys init in background" +init_log=$(mktemp) +trap 'rm -f "$init_log"' EXIT +agentkeys init --email "$recipient" \ + --broker-url "$OIDC_ISSUER" \ + --signer-url "$BACKEND_URL" \ + > "$init_log" 2>&1 & +init_pid=$! +log " init PID : $init_pid (log: $init_log)" + +# Give SES SendEmail a few seconds to actually fire before we start polling. +sleep 3 + +# ─── Poll S3 inbound for the new magic-link email ────────────────────────── +# Match strategy: any key NOT in pre_keys is a candidate; download body, +# look for the recipient address (may be QP-encoded) AND the broker +# landing URL prefix (also may be QP-encoded). The first matching key +# wins. 
SES inbound objects have UUID-like keys with no useful metadata.
+log "Polling s3://$MAIL_BUCKET/$INBOUND_PREFIX for the magic-link email"
+landing_url=""
+matched_key=""
+
+# Two possible encodings for the URL in the body:
+#   - 7bit/8bit (pure-ASCII, the common case for our magic-link URLs):
+#     URL has a LITERAL '=' between 't' and the base64url token.
+#   - quoted-printable (SES picks this when MIME parts have non-ASCII):
+#     '=' is encoded as '=3D' and lines may soft-wrap with '=\n'.
+# Handle both: undo soft-wraps + match either form, then normalize.
+# Raw MIME bodies use CRLF line endings, so strip '\r' first; otherwise
+# a soft-wrap ends in '=\r' and the `=$` pattern never matches.
+extract_landing_url() {
+  local body="$1"
+  printf '%s' "$body" \
+    | tr -d '\r' \
+    | sed 's/=$//' \
+    | tr -d '\n' \
+    | grep -oE "${OIDC_ISSUER}/auth/email/landing#t=(3D)?[A-Za-z0-9_-]+" \
+    | head -1 \
+    | sed 's/#t=3D/#t=/'
+}
+
+for attempt in $(seq 1 "$POLL_MAX_ATTEMPTS"); do
+  # Fast-fail: if agentkeys init died before the email arrives (e.g.
+  # broker rejected the request, signer unauthorized, SES misconfig),
+  # dump the init log and die immediately instead of waiting the full
+  # 2-min poll budget for an email that will never come.
+  if ! kill -0 "$init_pid" 2>/dev/null; then
+    warn "agentkeys init exited before magic link arrived in S3 — dumping log:"
+    cat "$init_log" >&2 || true
+    die "init died early (likely broker rejection); see log above"
+  fi
+
+  current_keys=$( { aws s3api list-objects-v2 \
+      --bucket "$MAIL_BUCKET" --prefix "$INBOUND_PREFIX" \
+      --region "$REGION" \
+      --query 'Contents[*].Key' --output text 2>/dev/null \
+      || true; } | tr '\t' ' ')
+  # Build set difference: keys in current but not in PRE_KEYS_SET.
+  # Bash-3.2-compatible substring check against the leading+trailing-
+  # space-padded snapshot string. 
+  new_keys=()
+  for k in $current_keys; do
+    [[ -z "$k" ]] && continue
+    case "$PRE_KEYS_SET" in
+      *" $k "*) continue ;;
+    esac
+    new_keys+=("$k")
+  done
+  new_count=${#new_keys[@]}
+  log " attempt $attempt/$POLL_MAX_ATTEMPTS — $new_count new object(s)"
+
+  for key in "${new_keys[@]}"; do
+    [[ -z "$key" ]] && continue
+    body=$(aws s3 cp "s3://$MAIL_BUCKET/$key" - --region "$REGION" 2>/dev/null || true)
+    [[ -z "$body" ]] && continue
+    url=$(extract_landing_url "$body")
+    if [[ -n "$url" ]]; then
+      landing_url="$url"
+      matched_key="$key"
+      log " matched: s3://$MAIL_BUCKET/$key"
+      break
+    fi
+  done
+
+  [[ -n "$landing_url" ]] && break
+  sleep "$POLL_INTERVAL"
+done
+
+if [[ -z "$landing_url" ]]; then
+  warn "magic-link email did not arrive in $((POLL_INTERVAL * POLL_MAX_ATTEMPTS))s"
+  warn "Killing background agentkeys init (PID $init_pid)"
+  kill "$init_pid" 2>/dev/null || true
+  warn "init log:"
+  cat "$init_log" >&2 || true
+  die "no magic-link URL — check broker logs + SES inbound rule"
+fi
+
+# ─── Extract the token from the URL fragment + POST to /v1/auth/email/verify ─
+# This is what the browser-side JS in /auth/email/landing does. The
+# fragment-based delivery means a plain `curl` of the landing URL would
+# just fetch the static HTML without the token (fragments don't ride in
+# HTTP requests). We have to lift the token out of the URL and POST it.
+token="${landing_url##*#t=}"
+if [[ -z "$token" || "$token" == "$landing_url" ]]; then
+  die "could not parse #t= fragment from landing URL: $landing_url"
+fi
+
+log "Clicking the magic link (POST /v1/auth/email/verify with token)"
+verify_response=$(curl -sS -X POST \
+    -H 'content-type: application/json' \
+    -d "$(jq -n --arg t "$token" '{token: $t}')" \
+    "$OIDC_ISSUER/v1/auth/email/verify" 2>&1)
+log " verify response: $verify_response"
+
+# Clean up the consumed S3 object so the bucket doesn't keep accreting. 
+aws s3 rm "s3://$MAIL_BUCKET/$matched_key" --region "$REGION" >/dev/null \
+    || warn "failed to remove $matched_key from S3 (orphan)"
+
+# ─── Wait for the background init to complete ──────────────────────────────
+# It polls /v1/auth/email/status; once the broker flips to verified,
+# init proceeds to derive the wallet via the signer and saves the
+# session JWT in the OS keychain. Should complete within ~5s.
+log "Waiting for agentkeys init to confirm (max 30s)"
+for i in $(seq 1 30); do
+  if ! kill -0 "$init_pid" 2>/dev/null; then
+    break
+  fi
+  sleep 1
+done
+
+if kill -0 "$init_pid" 2>/dev/null; then
+  warn "agentkeys init still running after 30s — sending SIGTERM"
+  kill "$init_pid" 2>/dev/null || true
+  sleep 2
+  warn "init log:"
+  cat "$init_log" >&2 || true
+  die "agentkeys init did not complete after the magic-link click"
+fi
+
+if wait "$init_pid"; then
+  log "agentkeys init completed successfully:"
+  cat "$init_log"
+else
+  warn "agentkeys init exited non-zero:"
+  cat "$init_log" >&2
+  die "init failed — see log above"
+fi
+
+log "DONE — end-to-end magic-link demo passed for $recipient"
+
+# ─── Auto-invoke the rich-output inspector ──────────────────────────────────
+# Saves the operator the next "now what does the session look like?" step.
+# Skip if the helper isn't co-located (e.g. ad-hoc copy of this script).
+SHOW="$(dirname "$0")/agentkeys-demo-show.sh"
+if [[ -x "$SHOW" ]]; then
+  echo
+  log "Session detail (= scripts/agentkeys-demo-show.sh $SESSION_ID):"
+  AGENTKEYS_SESSION_ID="$SESSION_ID" bash "$SHOW" "$SESSION_ID" || \
+    warn "demo-show failed (non-fatal — the session was saved successfully above)"
+fi
+
+# ─── Tell the operator how to capture eval-able shell vars ─────────────────
+# The demo-show output above is human-readable only — it does NOT export
+# $OMNI / $ADDR / $MASTER_WALLET into the parent shell (this script runs
+# in a subprocess, and the human-mode renderer prints to stdout as text,
+# not as `KEY=value` assignments). 
+#
+# Without the eval line below, the operator's shell either has no
+# $ADDR_