broker: retire /v1/mint-aws-creds (issue #71 Option A end-state) #72

@hanwencheng

Context

Issue #71 outlined a two-stage migration:

  • Option B (landed in b0c6515 / c54a69b / a9f3330 on evm) — pivot /v1/mint-aws-creds internals to AssumeRoleWithWebIdentity so it survives cloud-setup.md §4 federation. Wire shape preserved. Deployments that ran §4 stop returning AccessDenied from the integrated path.
  • Option A (this issue) — caller-side migration + retire the route — daemons fetch /v1/mint-oidc-jwt and do AssumeRoleWithWebIdentity client-side. The endpoint goes away.

The caller-side migration is already done as part of the Option B commits:

  • crates/agentkeys-provisioner/src/aws_creds.rs::fetch_via_broker_default_ttl — fetches OIDC JWT, does STS client-side via aws-sdk-sts with anonymous credentials.
  • crates/agentkeys-mcp/src/lib.rs::McpHandler::broker_env_for_provision — uses the new helper.
  • crates/agentkeys-cli/src/lib.rs::broker_env_for_provision — same.

Production daemons no longer call /v1/mint-aws-creds. The route still exists for callers who want server-side gates (audit + grants + idempotency + multi-anchor coordination) but has no in-tree caller as of a9f3330.
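The daemon-side flow described above (fetch a JWT from the broker, then exchange it at STS without the broker holding any AWS principal) can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in aws_creds.rs: the traits `JwtMinter` and `WebIdentityExchanger`, the `TempCreds` and `FetchError` types, the `fetch_creds` function, and the `FakeBroker` / `FakeSts` doubles are all hypothetical names introduced here. The real helper uses aws-sdk-sts and HTTP.

```rust
/// Temporary credentials from an STS-style exchange (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct TempCreds {
    pub access_key_id: String,
    pub secret_access_key: String,
    pub session_token: String,
}

#[derive(Debug, PartialEq)]
pub enum FetchError {
    Broker(String),
    Sts(String),
}

/// Step 1: stands in for a call to the broker's /v1/mint-oidc-jwt route.
pub trait JwtMinter {
    fn mint_oidc_jwt(&self, bearer: &str) -> Result<String, FetchError>;
}

/// Step 2: stands in for STS AssumeRoleWithWebIdentity, called by the daemon.
pub trait WebIdentityExchanger {
    fn assume_role_with_web_identity(
        &self,
        role_arn: &str,
        session_name: &str,
        jwt: &str,
        ttl_secs: u32,
    ) -> Result<TempCreds, FetchError>;
}

/// The broker only signs the JWT; the credential exchange happens client-side.
pub fn fetch_creds<M: JwtMinter, X: WebIdentityExchanger>(
    minter: &M,
    sts: &X,
    bearer: &str,
    role_arn: &str,
    session_name: &str,
    ttl_secs: u32,
) -> Result<TempCreds, FetchError> {
    let jwt = minter.mint_oidc_jwt(bearer)?;
    sts.assume_role_with_web_identity(role_arn, session_name, &jwt, ttl_secs)
}

/// Test double for the broker (assumption: rejects an empty bearer).
pub struct FakeBroker;

impl JwtMinter for FakeBroker {
    fn mint_oidc_jwt(&self, bearer: &str) -> Result<String, FetchError> {
        if bearer.is_empty() {
            return Err(FetchError::Broker("empty bearer".to_string()));
        }
        Ok(format!("jwt-for:{}", bearer))
    }
}

/// Test double for STS: echoes its inputs into fake credentials.
pub struct FakeSts;

impl WebIdentityExchanger for FakeSts {
    fn assume_role_with_web_identity(
        &self,
        _role_arn: &str,
        session_name: &str,
        jwt: &str,
        _ttl_secs: u32,
    ) -> Result<TempCreds, FetchError> {
        Ok(TempCreds {
            access_key_id: format!("fake-key-for:{}", session_name),
            secret_access_key: "fake-secret".to_string(),
            session_token: format!("session-for:{}", jwt),
        })
    }
}
```

Calling `fetch_creds(&FakeBroker, &FakeSts, "bearer-123", role_arn, "daemon-1", 900)` runs both steps in order. A broker failure short-circuits before STS is ever contacted.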

Goal

Delete the route and its handler. Broker becomes a pure JWT signer — zero AWS principals at runtime, single mint path. Compromise blast radius drops to "OIDC signing key only."

What's in scope

  1. Drop the route registration in crates/agentkeys-broker-server/src/lib.rs:39:
    .route("/v1/mint-aws-creds", post(handlers::mint::mint_aws_creds))
  2. Drop the entire mint_aws_creds + mint_v2 handler in crates/agentkeys-broker-server/src/handlers/mint.rs (~700 LOC including: body parsing, EIP-191 per-call sig verification, grant resolution / consume, audit anchor write loop, response shaping, helpers).
  3. Delete crates/agentkeys-broker-server/tests/mint_v2_flow.rs (the only test suite that exercises this endpoint).
  4. Decide what happens to the policy gates that lived inside mint_v2:
    • Audit anchor write per mint. /v1/mint-oidc-jwt already audits the JWT mint via state.audit.record_mint(...). Plus AWS CloudTrail records every AssumeRoleWithWebIdentity call with the assumed role + session name. Two audit sources are arguably better than one. Multi-anchor (sqlite + EVM) coordination has no daemon-side equivalent today — it goes away with the endpoint.
    • Phase B explicit-grant enforcement. try_consume(grant_id) was the policy gate. Options:
      • (a) Move grant check to /v1/mint-oidc-jwt time. Requires the JWT request to carry an intent (service + scope_path) that the grant table can match. Currently /v1/mint-oidc-jwt takes only the bearer.
      • (b) Encode grant outcome into the JWT (scope claim, max_uses → JWT TTL) and let AWS bucket policy enforce. Limits granularity.
      • (c) Drop server-side grant enforcement entirely; rely on AWS PrincipalTag + bucket policy for isolation.
    • Idempotency-Key dedup. Currently keyed on body hash + key. Options:
      • (a) Move to /v1/mint-oidc-jwt keyed on bearer hash + key. Functional, but adds little because the JWT is already short-lived (5 min default).
      • (b) Drop. Daemons can dedup client-side via the JWT cache.
    • Per-OmniAccount rate limiting (MintRateLimiter::check_mint). Move to /v1/mint-oidc-jwt. Same code, different call site.
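To make the re-homing question concrete, here is a rough sketch of what option (a) for grants plus the rate-limit move could look like at JWT mint time. It assumes the JWT request is extended to carry an intent (service + scope_path). Everything here is illustrative: the `Intent`, `Grant`, `GateError`, and `Gates` types, the prefix-matching rule, and the fixed-window counter are assumptions. The `check_mint` and `try_consume` names are borrowed from the issue text, but the signatures are invented and do not reflect the existing broker code.

```rust
use std::collections::HashMap;

/// What the caller intends to do with the JWT (hypothetical request extension).
#[derive(Debug, Clone, PartialEq)]
pub struct Intent {
    pub service: String,
    pub scope_path: String,
}

/// A stored grant; the real grant table's fields are not shown in the issue.
#[derive(Debug)]
pub struct Grant {
    pub service: String,
    pub scope_prefix: String,
    pub remaining_uses: u32,
}

#[derive(Debug, PartialEq)]
pub enum GateError {
    RateLimited,
    NoMatchingGrant,
    GrantExhausted,
}

pub struct Gates {
    grants: HashMap<String, Grant>,        // grant_id -> grant
    mints_in_window: HashMap<String, u32>, // account -> attempts in current window
    max_mints_per_window: u32,
}

impl Gates {
    pub fn new(max_mints_per_window: u32) -> Self {
        Gates {
            grants: HashMap::new(),
            mints_in_window: HashMap::new(),
            max_mints_per_window,
        }
    }

    pub fn add_grant(&mut self, grant_id: &str, grant: Grant) {
        self.grants.insert(grant_id.to_string(), grant);
    }

    /// Fixed-window limiter; window reset is omitted. Counts every attempt,
    /// including ones that later fail the grant check.
    pub fn check_mint(&mut self, account: &str) -> Result<(), GateError> {
        let count = self.mints_in_window.entry(account.to_string()).or_insert(0);
        if *count >= self.max_mints_per_window {
            return Err(GateError::RateLimited);
        }
        *count += 1;
        Ok(())
    }

    /// Match the intent against the grant, then spend one use.
    pub fn try_consume(&mut self, grant_id: &str, intent: &Intent) -> Result<(), GateError> {
        let grant = self
            .grants
            .get_mut(grant_id)
            .ok_or(GateError::NoMatchingGrant)?;
        if grant.service != intent.service
            || !intent.scope_path.starts_with(grant.scope_prefix.as_str())
        {
            return Err(GateError::NoMatchingGrant);
        }
        if grant.remaining_uses == 0 {
            return Err(GateError::GrantExhausted);
        }
        grant.remaining_uses -= 1;
        Ok(())
    }

    /// Gate order assumed here: rate limit first, then grant; sign only on Ok.
    pub fn authorize_jwt_mint(
        &mut self,
        account: &str,
        grant_id: &str,
        intent: &Intent,
    ) -> Result<(), GateError> {
        self.check_mint(account)?;
        self.try_consume(grant_id, intent)
    }
}
```

In this sketch the mint handler would call `authorize_jwt_mint` before signing and sign only on `Ok(())`. Whether failed grant checks should count against the rate limit is a policy choice that the issue leaves open.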

Acceptance criteria

  • crates/agentkeys-broker-server/src/lib.rs no longer registers /v1/mint-aws-creds.
  • crates/agentkeys-broker-server/src/handlers/mint.rs deleted (or shrunk to just the helpers /v1/mint-oidc-jwt reuses).
  • crates/agentkeys-broker-server/tests/mint_v2_flow.rs deleted.
  • Phase B grant enforcement and rate-limit checks move to /v1/mint-oidc-jwt per the chosen option above.
  • Multi-anchor audit policy (sqlite + EVM) decision documented — kept (re-homed at JWT mint), or dropped with explicit note in the runbook.
  • cargo build -p agentkeys-broker-server clean for both feature combos.
  • cargo test -p agentkeys-broker-server --features audit-evm,auth-email-link,auth-oauth2-google passes.
  • cargo test --workspace passes.
  • cargo clippy --workspace --all-features -- -D warnings clean.
  • bash harness/stage-7-issue-64-done.sh exits 0.
  • docs/operator-runbook-stage7.md AWS IAM Trust §'Mint-time STS path' rewritten — single path only.
  • docs/stage7-demo-and-verification.md §5 rewritten — drop the 'two paths' framing.
  • Live walkthrough on https://broker.litentry.org confirms /v1/mint-aws-creds returns 404 and the daemon-side path still works end-to-end.

Migration sequence (recommended)

  1. Decide the gate-rehoming policy (the four bullets under item 4 of "What's in scope" above). This is the architectural question; the rest is mechanical.
  2. Move the gate code to /v1/mint-oidc-jwt (or document its drop).
  3. Delete the route + handler + tests in one commit.
  4. Doc updates in the same commit or a follow-up.
  5. Operator redeploys; verify live.

Out of scope

  • TEE-derived OIDC signer (tracked separately, plan §8 / heima-gaps §3).
  • Live EVM audit anchor (currently EvmStubAnchor — Phase E hardening).
  • The 3 pre-existing failing npm tests in provisioner-scripts/src/lib/email.test.ts (real-S3 calls failing due to local IAM perms — unrelated).

Why now / why not yet

Why now: Production daemons no longer use the endpoint. Keeping it is dead weight.

Why not yet: The gate-rehoming decision is real architecture work. Doing this without thinking about audit/grants/idempotency is how you delete a working policy enforcement layer by accident. The two-path system is fine to live in for a release or two while the rehoming is designed.
