feat(compat): runtime SDK↔backend version guard at ACP startup #408

Merged: NiteshDhanpal merged 6 commits into next from feat/backend-version-guard on Jun 18, 2026

@NiteshDhanpal NiteshDhanpal commented Jun 17, 2026

What

A runtime SDK↔backend contract-version guard. On ACP/worker startup it reads the backend's reported contract version (/openapi.json info.version) and fails fast with an actionable error if the backend is older than this SDK supports — instead of the mismatch surfacing later as opaque 500s / missing-field errors deep in a request.

agentex-sdk 0.13.0 requires agentex backend >= 0.X.Y, but http://… reports 0.W.Z.
Upgrade the backend, or pin agentex-sdk to a version compatible with backend 0.W.Z.
(Set AGENTEX_SKIP_VERSION_CHECK=1 to bypass at your own risk.)

Why

This is the deploy-time complement to the build-time cross-version compat tests (#407). They cover different moments:

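|                   | When              | Checks                                                      |
|-------------------|-------------------|-------------------------------------------------------------|
| #407 compat tests | CI / release      | is the client compatible with the supported server window? |
| this guard        | runtime / startup | is the server we're pointed at within that window?         |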

Same source of truth — the supported window (min-supported..current); #321 provides the version the backend reports. It directly addresses the agentex-sdk 0.13 friction (e.g. Cengage): a client on an older backend got late, opaque failures (dropped task_id/agent_id → 500s) with no startup signal. This turns that into one actionable line at boot, and covers the whole "SDK needs a newer backend" class — not just this one break.

How

  • agentex/lib/core/compat/version_guard.py
    • assert_backend_compatible(base_url) — fetch /openapi.json info.version, compare to MIN_BACKEND_CONTRACT via a SemVer §11 precedence key (a prerelease like 0.1.0-rc.1 correctly sorts below the stable 0.1.0 floor), raise IncompatibleBackendError if older.
    • MIN_BACKEND_CONTRACT — kept in sync with tests/compat min-supported (#407); version axis from #321 tags.
    • AGENTEX_SKIP_VERSION_CHECK=1 escape hatch; warns (does not crash) on unreachable/unknown/unparseable version (transient blip or contract-less server shouldn't kill startup).
  • Wired into both agent entry points, before register_agent and gated on AGENTEX_BASE_URL, so a bad pairing fails startup with a clear message rather than serving broken traffic:
    • BaseACPServer lifespan (sync + async ACP servers).
    • AgentexWorker._register_agent (Temporal worker — it never goes through the ACP lifespan, so it needs its own guard).
  • Unit tests (tests/test_version_guard.py): parse, prerelease precedence, compatible-passes, incompatible-raises, prerelease-below-stable-floor-raises, skip-env, unknown-version-no-crash, no-base-url-noop. ✅ 8 passing.
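
The decision flow described in the list above can be sketched as follows. This is a minimal, synchronous illustration and not the code in version_guard.py: the check_backend function, its parameters, and its return strings are hypothetical, the version parse handles only MAJOR.MINOR.PATCH (prerelease ordering is left out), and the fetch of /openapi.json is assumed to have happened already, with None standing in for an unreachable or version-less backend.

```python
from __future__ import annotations

import logging
from typing import Mapping

logger = logging.getLogger("version_guard_sketch")

TRUTHY = ("1", "true", "yes", "on")


class IncompatibleBackendError(RuntimeError):
    """Raised when the backend reports a contract older than the SDK floor."""


def _core(version: str) -> tuple[int, int, int] | None:
    # Simplified parse (assumption): MAJOR.MINOR.PATCH only, optional leading "v".
    parts = version.strip().removeprefix("v").split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        return None
    return int(parts[0]), int(parts[1]), int(parts[2])


def check_backend(
    reported: str | None,
    *,
    minimum: str,
    sdk_version: str,
    base_url: str,
    env: Mapping[str, str],
) -> str:
    """Return "skipped", "unknown", or "compatible"; raise if the backend is too old."""
    if env.get("AGENTEX_SKIP_VERSION_CHECK", "").strip().lower() in TRUTHY:
        return "skipped"
    floor = _core(minimum)
    if floor is None:
        raise ValueError(f"minimum version {minimum!r} is not MAJOR.MINOR.PATCH")
    got = _core(reported) if reported is not None else None
    if got is None:
        # Unreachable, missing, or unparseable version: warn and let startup proceed.
        logger.warning("could not determine backend version at %s; proceeding", base_url)
        return "unknown"
    if got < floor:
        raise IncompatibleBackendError(
            f"agentex-sdk {sdk_version} requires agentex backend >= {minimum}, "
            f"but {base_url} reports {reported}. Upgrade the backend, or pin "
            f"agentex-sdk to a version compatible with backend {reported}. "
            "(Set AGENTEX_SKIP_VERSION_CHECK=1 to bypass at your own risk.)"
        )
    return "compatible"
```

Only the "backend is older" branch raises; every uncertain outcome degrades to a warning, which matches the stated goal that a transient blip or a contract-less server should not kill startup.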

Open / follow-ups (draft)

  • Pin MIN_BACKEND_CONTRACT to a real #321 release tag once those land (today it's seeded at 0.1.0, mirroring tests/compat min-supported); ideally derive it from the same manifest so the two can't drift. Interim: a drift-lock test asserting MIN_BACKEND_CONTRACT == min-supported.yaml info.version.
  • Consider also asserting an upper bound (server newer than the SDK's current) as a soft warning.
  • Surface the declared minimum in release notes / SDK metadata (the published compatibility matrix) — separate process change.
  • Overlaps with #410's _server_compat.py (per-response header check). This guard is the boot-time fail-fast via /openapi.json (works today, before traffic); #410 is the per-response soft warning (dormant until a server sets x-agentex-version). They're complementary — reconcile before un-drafting.
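
The interim drift-lock test from the first follow-up could look roughly like the sketch below. The assert_no_drift helper is hypothetical, and the spec is passed in as an already-parsed mapping so the example stays self-contained; a real test would presumably load min-supported.yaml from tests/compat with a YAML parser first.

```python
from __future__ import annotations

from typing import Any, Mapping

# Seed value mentioned in the follow-ups; the real constant lives in version_guard.py.
MIN_BACKEND_CONTRACT = "0.1.0"


def assert_no_drift(min_backend_contract: str, min_supported_spec: Mapping[str, Any]) -> None:
    """Fail if the SDK floor and the min-supported spec disagree on info.version."""
    spec_version = min_supported_spec["info"]["version"]
    if min_backend_contract != spec_version:
        raise AssertionError(
            f"MIN_BACKEND_CONTRACT ({min_backend_contract}) drifted from "
            f"min-supported info.version ({spec_version})"
        )


# Stand-in for the parsed contents of min-supported.yaml (assumption).
assert_no_drift(MIN_BACKEND_CONTRACT, {"info": {"version": "0.1.0"}})
```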

Draft for design review — happy to adjust the wiring (e.g. derive MIN_BACKEND_CONTRACT from tests/compat/server_specs/manifest.json) before un-drafting.

🤖 Generated with Claude Code

Greptile Summary

  • Adds a runtime SDK/backend compatibility guard that reads the backend contract version from /openapi.json.
  • Compares the reported backend version against MIN_BACKEND_CONTRACT with prerelease-aware SemVer ordering.
  • Wires the guard into ACP server startup and Temporal worker registration before agent registration.
  • Adds tests covering parsing, compatibility outcomes, skip-env behavior, unknown/unreachable backends, and worker startup wiring.

Confidence Score: 5/5

The compatibility guard is narrowly scoped to startup checks and is covered across the direct guard behavior plus ACP/worker wiring.

No issues were identified in the reviewed changes; tests cover the expected version comparison, bypass, warning, and startup integration paths.

What T-Rex did

  • Before the artifact, MODULE_PATH_EXISTS was False and the base import failed.
  • After the artifact, old and prerelease backends raise IncompatibleBackendError for SDK version 0.13.0 with required backend >= 0.1.0; the message includes the reported URL and version, upgrade/pin advice, and the skip-env bypass text. The compatible, skip, unparseable/missing/unreachable, and None-base-URL cases pass without raising.
  • The acp-startup-guard artifacts show that the ACP startup guard could not be verified in either run: before, a missing scale_gp dependency blocked the head commit, and after, the same dependency boundary prevented proving the old/compatible/skip/no-base-url cases.
  • Before the latest changes, the base commit had no worker guard, so old backend 0.0.9 reached register_agent with register_calls=1.
  • After the changes, the head commit requests /openapi.json and raises an actionable IncompatibleBackendError for backend 0.0.9. A compatible backend 0.1.0 and a skip-env old backend both reach register_agent, and with AGENTEX_BASE_URL absent the run exits with register_calls=0.

Ran code and verified through T-Rex

Reviews (3): Last reviewed commit: "fix(compat): anchor version regex + add ..."

Complements the build-time cross-version compat tests (#407): on ACP/worker
startup, read the backend's reported contract version (/openapi.json info.version)
and fail fast with an actionable error if the backend is older than
MIN_BACKEND_CONTRACT — instead of the mismatch surfacing later as opaque 500s /
missing-field errors (the agentex-sdk 0.13 friction).

- agentex/lib/core/compat/version_guard.py: assert_backend_compatible() +
  MIN_BACKEND_CONTRACT (kept in sync with tests/compat min-supported) +
  AGENTEX_SKIP_VERSION_CHECK escape hatch; warns (no crash) on unreachable/unknown.
- wired into BaseACPServer lifespan (runs before register_agent when AGENTEX_BASE_URL set).
- unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread src/agentex/lib/core/compat/version_guard.py Outdated
NiteshDhanpal and others added 3 commits June 17, 2026 12:35
SemVer §11: a prerelease precedes its stable release (0.1.0-rc.1 < 0.1.0).
The old _parse dropped the suffix, so a release-candidate backend compared
equal to a stable floor and slipped past the guard even though it may lack
the final contract. Parse the prerelease and compare via a SemVer precedence
key; a prerelease of a higher version (0.2.0-rc.1) still clears a 0.1.0 floor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The stable vs prerelease key branches returned different tuple shapes
((maj,min,patch,(1,)) vs (...,(0,list))), so pyright couldn't prove < was
defined on the union. Make the 4th element a uniform (rank, identifiers)
pair — stable rank 1 > prerelease rank 0 — keeping the ordering identical.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
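
The two commits above (prerelease ordering, then the uniform key shape for pyright) can be illustrated with the sketch below. It is an assumption-laden reconstruction, not the PR's _parse: the precedence_key name is hypothetical, input is assumed to be well-formed MAJOR.MINOR.PATCH[-prerelease], and build metadata is not handled. The fourth element is always a (rank, identifiers) pair, with stable rank 1 above prerelease rank 0.

```python
from __future__ import annotations

# One prerelease identifier: (kind, numeric value, text). Numeric identifiers
# get kind 0 and sort below alphanumeric ones (kind 1), per SemVer §11.
Identifier = tuple[int, int, str]
PrecedenceKey = tuple[int, int, int, tuple[int, tuple[Identifier, ...]]]


def _identifier_key(identifier: str) -> Identifier:
    if identifier.isascii() and identifier.isdigit():
        return (0, int(identifier), "")
    return (1, 0, identifier)


def precedence_key(version: str) -> PrecedenceKey:
    """Build a sortable key; assumes well-formed input without build metadata."""
    core, _, prerelease = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not prerelease:
        # Stable release: rank 1 with no identifiers, so it sorts above any prerelease.
        return (major, minor, patch, (1, ()))
    identifiers = tuple(_identifier_key(i) for i in prerelease.split("."))
    return (major, minor, patch, (0, identifiers))


# A release candidate sits below its own stable release...
assert precedence_key("0.1.0-rc.1") < precedence_key("0.1.0")
# ...but a prerelease of a higher version still clears a 0.1.0 floor.
assert precedence_key("0.2.0-rc.1") > precedence_key("0.1.0")
```

Because both branches return the same tuple shape, a type checker can see that `<` is defined on every pair of keys, and Python's tuple comparison gives the "longer identifier list wins when the prefix is equal" rule for free.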
Async/Temporal agents run a separate worker process that never goes through
the ACP server lifespan, so the guard there wouldn't cover them. Wire it into
AgentexWorker._register_agent (same AGENTEX_BASE_URL gate, before register_agent).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
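
The worker wiring from the commit above follows a simple order: gate on AGENTEX_BASE_URL, run the guard, then register. The sketch below shows that ordering with injected callables; register_with_guard and its parameters are hypothetical stand-ins, not the signature of AgentexWorker._register_agent.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable, Mapping


async def register_with_guard(
    env: Mapping[str, str],
    guard: Callable[[str], Awaitable[None]],
    register: Callable[[], Awaitable[None]],
) -> bool:
    """Run the version guard before registration; return True if registration ran."""
    base_url = env.get("AGENTEX_BASE_URL")
    if not base_url:
        # No backend configured: skip both the guard and registration.
        return False
    # If the guard raises, the exception propagates and register() is never called.
    await guard(base_url)
    await register()
    return True


async def _demo() -> list[str]:
    calls: list[str] = []

    async def guard(url: str) -> None:
        calls.append(f"guard:{url}")

    async def register() -> None:
        calls.append("register")

    await register_with_guard({"AGENTEX_BASE_URL": "http://backend"}, guard, register)
    return calls


demo_calls = asyncio.run(_demo())
```

Keeping the guard ahead of registration is what makes an incompatible pairing fail at startup instead of after the agent has begun serving traffic.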
Comment thread src/agentex/lib/core/compat/version_guard.py Outdated

@max-parke-scale max-parke-scale left a comment

Re-reviewed at 5bbb9356 — worker-startup wiring and prerelease ordering both handled now, nice. Verified the version axis works: scale-agentex main serves info.version from _version.py (#321), so it's self-describing and moves with releases. One inline ask left.

🧑‍💻🤖 — posted via Claude Code

return os.environ.get(name, "").strip().lower() in ("1", "true", "yes", "on")


async def fetch_backend_version(base_url: str, *, timeout: float = 5.0) -> str | None:

Nothing tests this — every test mocks fetch_backend_version out. Worth a respx test for the parse paths (missing info, missing version, non-2xx, non-JSON → all should degrade to None).

🧑‍💻🤖 — posted via Claude Code

Contributor Author

Addressed in 4d9e2c2. fetch_backend_version is now exercised for real via httpx.MockTransport (the function actually runs — request build, status check, JSON parse — just no network):

  • success + asserts URL is …/openapi.json and method GET
  • missing info.version, info absent, info: null → None
  • HTTP 404 / 503 → None
  • non-JSON body → None
  • httpx.ConnectError → None

Plus end-to-end assert_backend_compatible through the real fetch (old → raises, new → passes, unreachable → proceeds). Writing these caught a real bug in the first test helper (it recursed infinitely), which the mock-everything tests would never have surfaced.
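
The degrade-to-None paths listed in this reply can be illustrated without any HTTP client by isolating the parsing step. The sketch below is an assumption about how that step might be factored, using only the standard library; version_from_response is a hypothetical helper, and the PR's own tests exercise fetch_backend_version through httpx.MockTransport instead.

```python
from __future__ import annotations

import json


def version_from_response(status_code: int, body: bytes) -> str | None:
    """Extract info.version from an /openapi.json response, or None if unavailable."""
    if not 200 <= status_code < 300:
        return None  # non-2xx (e.g. 404, 503)
    try:
        document = json.loads(body)
    except ValueError:
        return None  # non-JSON or undecodable body
    if not isinstance(document, dict):
        return None
    info = document.get("info")
    if not isinstance(info, dict):
        return None  # info absent or null
    version = info.get("version")
    return version if isinstance(version, str) else None
```

A connection error has no response to parse, so in this factoring it would be caught by the caller and mapped to None there.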

…tests

- Anchor _VERSION_RE at both ends so a malformed tail (0.1.0rc1, 0.1.0foo,
  0.1.0.1) is rejected to None ('unknown, proceed') instead of silently
  parsing as stable 0.1.0 and satisfying MIN_BACKEND_CONTRACT.
- Test fetch_backend_version for real via httpx.MockTransport (success/URL,
  missing version, missing/null info, 404/503, non-JSON, connection error)
  plus end-to-end assert_backend_compatible through the real fetch.
- Test the regex anchors explicitly (leading/trailing junk rejected;
  whitespace + leading v permitted).
- Test AgentexWorker._register_agent wiring: guard runs before register_agent,
  incompatible backend blocks registration, no AGENTEX_BASE_URL skips both.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
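
The anchoring fix in the commit above can be sketched as follows. The pattern is an illustrative guess and not the PR's actual _VERSION_RE; parse_version is a hypothetical name. The point it demonstrates is that anchoring at both ends turns a malformed tail into None rather than a silent match on the 0.1.0 prefix.

```python
from __future__ import annotations

import re

# Anchored at both ends: optional surrounding whitespace, optional leading "v",
# MAJOR.MINOR.PATCH, optional "-prerelease" made of dot-separated identifiers.
_VERSION_RE = re.compile(
    r"^\s*v?([0-9]+)\.([0-9]+)\.([0-9]+)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?\s*\Z"
)

# Unanchored variant, shown only to illustrate the bug being fixed.
_LOOSE_RE = re.compile(r"v?([0-9]+)\.([0-9]+)\.([0-9]+)")


def parse_version(text: str) -> tuple[int, int, int, str | None] | None:
    """Return (major, minor, patch, prerelease) or None for anything malformed."""
    match = _VERSION_RE.match(text)
    if match is None:
        return None
    major, minor, patch, prerelease = match.groups()
    return int(major), int(minor), int(patch), prerelease


# The loose pattern accepts the prefix of a malformed string; the anchored one does not.
assert _LOOSE_RE.match("0.1.0rc1") is not None
assert parse_version("0.1.0rc1") is None
```

Returning None here feeds the "unknown, proceed" path, so a malformed version produces a warning instead of being treated as a stable release that satisfies MIN_BACKEND_CONTRACT.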

@max-parke-scale max-parke-scale left a comment

LGTM

@NiteshDhanpal NiteshDhanpal merged commit 433c999 into next Jun 18, 2026
69 of 71 checks passed
@NiteshDhanpal NiteshDhanpal deleted the feat/backend-version-guard branch June 18, 2026 04:10
@stainless-app stainless-app Bot mentioned this pull request Jun 18, 2026