v0.2+ email-auth enhancement: WebAuthn binding integration + stateless HMAC tokens for multi-broker scale #81

@hanwencheng

Background

PR #75 ships v0.1 of email-link auth: stateful CSPRNG magic-link tokens, single-broker deployment, no HMAC (per architecture.md §3 K-table + §5a.1.M Stage 1). Issue #80 closed the original "broker can't be initialized via CLI" gap.

This issue tracks the two known v0.2+ enhancements that v0.1 deliberately deferred, surfaced during PR #75 design discussion. Both are documented as "Open trade-offs" in hardcoded.md but need a tracked issue so they don't silently regress.


Enhancement A — Integrate WebAuthn binding into magic-link Stage 2

Current state (v1c-interim, per architecture.md §5a.1.M + step-1c plan)

The architecture's target for master init is the two-stage ceremony:

  • Stage 1 — Identity ceremony: operator clicks magic-link → broker confirms (email, binding_nonce).
  • Stage 2 — Binding ceremony: WebAuthn enrollment binds D_pub (K10 device key) atomically inside the WebAuthn challenge → broker mints J0 with claims agentkeys_device_pubkey=D_pub + agentkeys_webauthn_cred=K11_id.

v1c-interim currently ships bespoke per-identity PoP shapes (a pop_sig field for email/oauth2; a SIWE-payload Device Pubkey commit for evm) instead of WebAuthn at Stage 2; see the step-1c plan. The wire shapes work but are not uniform across identity types.

v0.2 target

Collapse all three identity-type binding flows into one WebAuthn ceremony:

1. CLI: agentkeys init --email alice@example.com --broker-url B
2. Broker: POST /v1/auth/email/request → mints magic-link token (TTL 10 min)
3. Broker: SES send → alice's inbox
4. Operator: click → browser opens https://broker/auth/email/landing#t=<token>
5. Browser-side landing page:
   a. Read token from URL fragment (never sent in path/query)
   b. Call navigator.credentials.create() with publicKey.challenge set to the token; this triggers Touch ID / Windows Hello / Android StrongBox
   c. POST /v1/auth/email/verify {webauthn_attestation, D_pub, token}
6. Broker:
   a. Verify WebAuthn attestation against the challenge (binds D_pub atomically — attacker can't substitute D_pub without breaking the WebAuthn signature)
   b. Verify token (consume-once, TTL check)
   c. Mint J0: claims include agentkeys_device_pubkey=D_pub, agentkeys_webauthn_cred=K11_id
7. CLI polls /v1/auth/email/status/{request_id} → gets J0
8. CLI proceeds to J0 → J1 bridge per §5a.1.M

What needs to land

  • Browser-side landing page calls navigator.credentials.create (today: shows "Verified — return to your terminal")
  • /v1/auth/email/verify accepts {webauthn_attestation, D_pub, token} instead of just {token}
  • Broker validates the attestation (e.g. via webauthn-rs crate) AND that the challenge byte-equals the token — atomically binds D_pub to this magic-link round-trip
  • J0 minting writes agentkeys_webauthn_cred=K11_id claim
  • CLI polls + receives J0 with both K10 + K11 claims (already supported by agentkeys_device_pubkey claim path; just needs the parallel WebAuthn claim)
  • Same flow for --oauth2-google (single uniform Stage 2 across identity types)
  • Update docs/spec/plans/issue-74-step-1c-device-key-auth.md status: v1c-interim → v0.2-shipped
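
The "challenge byte-equals the token" check from the list above can be sketched as follows. This is a minimal illustration in Python, not the broker's code: the function names are hypothetical, and it covers only the challenge comparison. Attestation signature, origin, and RP ID validation are assumed to be handled by a WebAuthn library such as webauthn-rs. It relies on the WebAuthn rule that clientDataJSON carries the challenge as unpadded base64url.

```python
import base64
import hmac
import json


def b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url, the encoding used inside clientDataJSON."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def challenge_matches_token(client_data_json: bytes, token: bytes) -> bool:
    """True only for a create() ceremony whose challenge equals the magic-link token.

    Hypothetical helper: it does NOT verify the attestation signature or origin.
    """
    client_data = json.loads(client_data_json)
    if client_data.get("type") != "webauthn.create":
        return False
    challenge = b64url_decode(client_data.get("challenge", ""))
    # Constant-time comparison avoids leaking how many leading bytes matched.
    return hmac.compare_digest(challenge, token)
```

A verify handler would call this before consuming the token, so a mismatched challenge never burns a valid magic link.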

Architectural impact

  • K-table §3: K11 description already covers WebAuthn credential — no schema change.
  • §5a.1.M sequenceDiagram already shows the WebAuthn ceremony as the target — implementation just catches up.
  • Mitigates email-bypass attack: in v1c, an attacker with email access (e.g. shared mailbox) can complete the ceremony without hardware presence. v0.2 requires WebAuthn → biometric/PIN unlock at the bound device.

Enhancement B — Stateless HMAC tokens for multi-broker-replica scale

Current state (v0.1, per PR #75)

Single-broker deployment. Magic-link tokens are stateful: broker stores SHA256(token) in EmailTokenStore SQLite, looks up on click, marks consumed. No HMAC.

Why v0.1 didn't need HMAC

  • One broker process owns the SQLite — no cross-replica coordination needed.
  • Threat model: SQLite is local file under same UID as broker → attacker compromising one likely has the other → HMAC defense-in-depth is theoretical for this deployment.
  • HMAC was previously implemented as a vestigial dead field (loaded + length-validated but never used cryptographically); removed in b8481fe to align with architecture.md §3 K-table.

v0.2+ multi-broker scenarios

When the broker scales horizontally (HA, multi-region, blue-green deploys), v0.1's stateful-only design breaks in three ways:

  1. Replica routing: token issued by broker-A, click hits broker-B → broker-B has no row → 404. Mitigated only by sticky sessions (ALB-level), which don't survive failover.
  2. Failover: broker-A dies between issuance and click → in-flight tokens lost.
  3. Cross-region read latency: if SQLite is replaced with a shared DB (RDS/DynamoDB), every magic-link click costs a cross-region round-trip.

Recommended v0.2+ design: hybrid HMAC + consume-once

Stateless integrity + minimal shared state:

Token = base64url( payload ) || "." || base64url( HMAC-SHA256(K12, payload) ), where payload is the serialized {request_id, email, expires_at, nonce}
  • Issuance: broker generates random nonce, signs (request_id, email, expires_at, nonce) with K12 (shared HMAC key, replicated to all broker replicas).
  • Click: any broker validates HMAC + expires_at locally (no DB lookup). Then a single small write to a shared consume-once store (Redis SETNX, DynamoDB conditional put, or Postgres unique constraint) marks the nonce consumed.
  • Cross-region: HMAC verify is local; consume-once is the only shared-state op (and it's small + can be eventually-consistent within a region).

Architectural impact

  • K-table §3: add K12 — Email-token HMAC key (32 bytes, shared across broker replicas, mounted from secrets manager). Sibling to K8 (broker session keypair).
  • §5a.1.M Stage 1: amend "Broker emails magic link; operator clicks; broker confirms single-use within TTL" → "Broker emails HMAC-signed magic link; operator clicks; ANY broker replica verifies HMAC locally, then consume-once write to shared store within TTL."
  • New env var: BROKER_EMAIL_HMAC_KEY_PATH (re-introduced — but this time documented in K-table, not vestigial).
  • New deployment requirement: shared K12 (e.g. AWS Secrets Manager, mounted via instance role at all broker hosts).
  • New deployment requirement: shared consume-once store (Redis / DynamoDB / Postgres — operator choice).

What needs to land

  • Add K12 to architecture.md §3 K-table.
  • Update §5a.1.M Stage 1 to describe the HMAC + consume-once flow.
  • Re-introduce BROKER_EMAIL_HMAC_KEY_PATH env var (with proper architectural documentation this time).
  • Re-introduce HMAC sign + verify in EmailLinkAuth (commit b8481fe removed it; revert is straightforward).
  • Add a ConsumeOnceStore trait with implementations for SQLite (single-broker, today) + Redis + DynamoDB.
  • setup-broker-host.sh: re-add the email-hmac.key mint step (only when --multi-broker flag is set, otherwise stays stateful-SQLite).
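
The ConsumeOnceStore abstraction named above could take roughly the following shape. The sketch is in Python for brevity, whereas the real trait would live in the Rust broker crate; the method name, signature, and table layout are hypothetical. The SQLite variant leans on a primary-key constraint for atomicity, and Redis or DynamoDB variants would need an equivalent conditional write.

```python
import sqlite3
import threading
from typing import Protocol


class ConsumeOnceStore(Protocol):
    def consume(self, nonce: str, expires_at: int) -> bool:
        """Return True exactly once per nonce and False on every later call."""
        ...


class InMemoryConsumeOnceStore:
    """Single-process variant, useful for tests."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self._lock = threading.Lock()

    def consume(self, nonce: str, expires_at: int) -> bool:
        with self._lock:
            if nonce in self._seen:
                return False
            self._seen.add(nonce)
            return True


class SqliteConsumeOnceStore:
    """Single-broker variant backed by a PRIMARY KEY constraint."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS consumed_nonces ("
            " nonce TEXT PRIMARY KEY,"
            " expires_at INTEGER NOT NULL)"
        )

    def consume(self, nonce: str, expires_at: int) -> bool:
        try:
            with self._db:  # commits on success, rolls back on error
                self._db.execute(
                    "INSERT INTO consumed_nonces (nonce, expires_at) VALUES (?, ?)",
                    (nonce, expires_at),
                )
        except sqlite3.IntegrityError:
            return False  # nonce already recorded, so the token was used before
        return True
```

Storing expires_at next to the nonce lets a periodic job delete rows once the corresponding token can no longer pass the expiry check.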

Why one issue covers both

The two enhancements are logically independent:

  • WebAuthn binding (Enhancement A) is a Stage 2 change that's orthogonal to the token transport mechanism.
  • HMAC stateless tokens (Enhancement B) is a Stage 1 change that doesn't affect Stage 2.

But both touch the same files (crates/agentkeys-broker-server/src/plugins/auth/email_link.rs, boot.rs, setup-broker-host.sh, architecture.md §3 + §5a.1.M), so landing them together avoids two rounds of churn through the same code.

Acceptance criteria for closing this issue

  • Enhancement A: WebAuthn ceremony lands at email_link Stage 2; v1c-interim PoP shapes deprecated; demo doc shows the unified Stage-1 + Stage-2 flow.
  • Enhancement B: K12 lands in architecture K-table; HMAC sign+verify in EmailLinkAuth; consume-once store abstraction; setup-broker-host.sh --multi-broker flag wires it; all single-broker behavior preserved.
  • hardcoded.md "Open trade-offs" section updated: HMAC re-introduction landed, link to this closed issue.
  • PR #75's K10 + K11 claims path (already shipped) is used unchanged by Enhancement A.
