Skip to content

fix(node): reconnect and report app version after restart#613

Merged
vincentkoc merged 9 commits into
masterfrom
fix-node-version-surface
Jun 1, 2026
Merged

fix(node): reconnect and report app version after restart#613
vincentkoc merged 9 commits into
masterfrom
fix-node-version-surface

Conversation

@vincentkoc

@vincentkoc vincentkoc commented May 31, 2026

Copy link
Copy Markdown
Member

Summary

  • replace hardcoded 1.0.0 Windows node/tray handshake version fields with AppVersionInfo.Version
  • reconnect already-paired Windows nodes on startup using the persisted node device token instead of requiring bootstrap/shared credentials
  • preserve credential precedence so startup does not downgrade a paired node from its device token back to bootstrap/shared tokens
  • fix the native exec approval prompt loop so real button clicks wake the pending approval path
  • avoid double exec-approval prompts for approved shell wrappers while still honoring explicit inner deny rules
  • add regression coverage for node-only reconnect, SSH tunnel startup, startup fallback wiring, and shell-wrapper exec approvals

Validation

  • git diff --check
  • .agents/skills/autoreview/scripts/autoreview --mode local -> clean on the branch after the earlier review fixes
  • CI Build and Test run 26730526429 on 21984db: repo-hygiene, test, e2etests, build win-x64, and build win-arm64 all passed
  • staged current x64 PR artifact on an Azure crabbox: product version 0.6.0-PullRequest613.737; PR artifact is expected unsigned
  • node identity/version check on current artifact: active paired node stayed on the same device identity, connected=true, pending=0, version reports 0.6.0-PullRequest613.737
  • restart hardening checks on current artifact:
    • Windows tray restart through repaired task launcher: same active node reconnected, pending=0, version .737
    • Linux gateway restart: same active node reconnected, pending=0, version .737
    • Windows VM reboot + tray task start: same active node reconnected, pending=0, version .737
    • credential files stayed on node device token; no shared/bootstrap token downgrade observed
  • exec approval interactive matrix on current artifact using real mouse input:
    • Deny clicked and returned policy denial instead of hanging
    • Allow once clicked and node.invoke system.run returned stdout PR613_ALLOW_ONCE_737; no rule persisted
    • Always allow clicked and node.invoke system.run returned stdout PR613_ALWAYS_737; exactly one wrapper allow rule persisted
    • Repeated the exact always-allowed command without starting the clicker; it succeeded without a new prompt
  • restored the crabbox approval policy to its original defaultAction=deny, 22-rule state after testing

Known Follow-Up

  • The extracted-artifact scheduled task has no reboot trigger, so the reboot smoke is intentionally "VM reboot + tray task start", not autostart validation.
  • A stale disconnected paired node with version 1.0.0 still exists in the gateway registry, but no duplicate pending request was created and the active node stayed stable.
  • PR artifacts are unsigned. The Azure Trusted Signing release workflow remains the path for signed alpha/beta release artifacts.

@clawsweeper

clawsweeper Bot commented May 31, 2026

Copy link
Copy Markdown

Codex review: needs changes before merge. Reviewed May 31, 2026, 10:29 PM ET / 02:29 UTC.

Summary
The branch updates gateway/node handshake version fields to AppVersionInfo, adds node-only reconnect from persisted node tokens, rewrites exec approval prompts to a native Win32 window, and adds regression tests.

Reproducibility: yes. Source inspection gives a high-confidence path: configure EnableMcpServer=true, EnableNodeMode=false, keep an active gateway record with a persisted node token, and startup reaches ConnectNodeOnlyAsync because ShouldInitializeNodeService() is true.

Review metrics: 2 noteworthy metrics.

  • Files changed: 13 files, +970/-69. The patch spans connection lifecycle, credential startup, native UI prompting, and tests, so the mode-boundary risk is not settled by one unit test.
  • MCP-only persisted-token coverage: 0 focused regression checks. The added tests do not cover the exact upgrade state where MCP-only mode and a stored node token coexist.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦞 diamond lobster
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Gate node-only startup reconnect on EnableNodeMode and add the MCP-only persisted-node-token regression.
  • [P2] Rerun the AGENTS-required build, shared tests, and tray tests after the fix.

Risk before merge

  • [P1] Merging as-is can make an MCP-only setup with a persisted node token open a remote gateway node WebSocket on restart, contradicting the documented local-only mode.
  • [P1] The PR changes startup credential routing around persisted node device tokens, so the final patch needs explicit coverage that node-token reconnect only runs when gateway node mode is enabled.

Maintainer options:

  1. Fix the MCP-only gate before merge (recommended)
    Require the gateway-node setting for node-only reconnect and add regression coverage proving MCP-only with a persisted node token stays local-only.
  2. Accept a mode contract change
    Maintainers could intentionally redefine MCP-only to allow gateway node reconnect, but that would require explicit docs and user-facing upgrade handling.
  3. Pause if the mode matrix is still unsettled
    If maintainers are not ready to decide the gateway-node versus MCP-only boundary, pause the PR rather than merging a silent startup behavior change.
Copy recommended automerge instruction
@clawsweeper automerge

Special instructions:
Gate startup node-only gateway reconnect on EnableNodeMode rather than ShouldInitializeNodeService, and add regression coverage for EnableMcpServer=true/EnableNodeMode=false with a persisted node token proving ConnectNodeOnlyAsync is not called while MCP local-only startup still works.

Next step before merge

  • [P2] A narrow automated repair can update the startup predicate and add the missing MCP-only persisted-token regression without deciding a new product direction.

Security
Needs attention: The diff can cross the documented MCP-only local boundary by using a persisted node credential to reconnect to a gateway when gateway node mode is disabled.

Review findings

  • [P1] Gate node-only reconnect on EnableNodeMode — src/OpenClaw.Tray.WinUI/App.xaml.cs:1440-1445
Review details

Best possible solution:

Keep the useful reconnect/version/prompt work, but gate startup node-only gateway reconnect on EnableNodeMode and add a regression check for MCP-only plus persisted node token before merge.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection gives a high-confidence path: configure EnableMcpServer=true, EnableNodeMode=false, keep an active gateway record with a persisted node token, and startup reaches ConnectNodeOnlyAsync because ShouldInitializeNodeService() is true.

Is this the best way to solve the issue?

No. The PR's overall repair direction is useful, but the node-only reconnect path must use a gateway-node-specific predicate instead of the broader node-service predicate that includes MCP-only mode.

Full review comments:

  • [P1] Gate node-only reconnect on EnableNodeMode — src/OpenClaw.Tray.WinUI/App.xaml.cs:1440-1445
    This condition uses ShouldInitializeNodeService(), which is true for EnableMcpServer=true even when EnableNodeMode=false. In the documented MCP-only mode, a user with a persisted node token would restart into ConnectNodeOnlyAsync and open a gateway node WebSocket despite the mode being local-only; gate this reconnect on the gateway-node setting and add the MCP-only persisted-token regression.
    Confidence: 0.96

Overall correctness: patch is incorrect
Overall confidence: 0.93

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 6d14b00f6cfc.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes detailed after-fix live artifact proof for restart reconnects, credential stability, and exec approval button behavior, though the source-level MCP-only gate defect still needs a code fix.
  • add status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (live_output): The PR body includes detailed after-fix live artifact proof for restart reconnects, credential stability, and exec approval button behavior, though the source-level MCP-only gate defect still needs a code fix.
  • remove status: 📣 needs proof: Current PR status label is status: ⏳ waiting on author.

Label justifications:

  • P1: The PR can break a documented local-only MCP workflow by starting a gateway node connection with a persisted token on restart.
  • merge-risk: 🚨 compatibility: Existing MCP-only users with stored node identity state could see startup behavior change from local-only to gateway-connected.
  • merge-risk: 🚨 security-boundary: The affected mode is documented as no-gateway local MCP, so opening a remote gateway node connection crosses a security boundary.
  • merge-risk: 🚨 auth-provider: The risky path is driven by persisted node device token credential resolution during startup.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦞 diamond lobster and patch quality is 🧂 unranked krab.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (live_output): The PR body includes detailed after-fix live artifact proof for restart reconnects, credential stability, and exec approval button behavior, though the source-level MCP-only gate defect still needs a code fix.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes detailed after-fix live artifact proof for restart reconnects, credential stability, and exec approval button behavior, though the source-level MCP-only gate defect still needs a code fix.
Evidence reviewed

Security concerns:

  • [medium] Preserve MCP-only local boundary — src/OpenClaw.Tray.WinUI/App.xaml.cs:1440
    ShouldInitializeNodeService() includes MCP-only mode, so this new node-token reconnect path can use a stored node device token to create a gateway node connection in a mode documented as local-only.
    Confidence: 0.95

Acceptance criteria:

  • [P1] ./build.ps1.
  • [P1] dotnet test ./tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj --no-restore.
  • [P1] dotnet test ./tests/OpenClaw.Tray.Tests/OpenClaw.Tray.Tests.csproj --no-restore.
  • [P1] dotnet test ./tests/OpenClaw.Connection.Tests/OpenClaw.Connection.Tests.csproj --no-restore.

What I checked:

  • Repository policy applied: AGENTS.md says connection, pairing, node, MCP, and tray UX changes must use the architecture docs, and explicitly states MCP-only mode must start local NodeService without requiring a gateway credential. (AGENTS.md:36, 6d14b00f6cfc)
  • MCP-only contract: The current docs define EnableMcpServer=true, EnableNodeMode=false as local MCP server only with no gateway required. (docs/MCP_MODE.md:103, 6d14b00f6cfc)
  • PR source finding: The PR adds node-only reconnect when a node credential exists and ShouldInitializeNodeService() is true; that helper is also true for MCP-only mode, so the branch can start a gateway node connection in local-only MCP mode. (src/OpenClaw.Tray.WinUI/App.xaml.cs:1440, 21984db59c8e)
  • Coverage gap: The added tray contract test checks that node-only reconnect exists, but it does not cover the MCP-only plus persisted-node-token case that would prove ConnectNodeOnlyAsync is not called when EnableNodeMode=false. (tests/OpenClaw.Tray.Tests/AppRefactorContractTests.cs:75, 21984db59c8e)
  • History provenance: Blame shows the broad ShouldInitializeNodeService helper and startup connection bridge came from the initial config/tray import, while the current PR wires that helper into the new node-only gateway reconnect path. (src/OpenClaw.Tray.WinUI/App.xaml.cs:1855, 9de9b5ba0f8a)

Likely related people:

  • vincentkoc: The current PR changes the startup reconnect path, and current main also has 64b650c by Vincent Koc hardening Windows node bootstrap reconnects across the same connection and node credential files. (role: recent reconnect and credential contributor; confidence: high; commits: 64b650cb3313, 21984db59c8e; files: src/OpenClaw.Connection/GatewayConnectionManager.cs, src/OpenClaw.Shared/WindowsNodeClient.cs, src/OpenClaw.Tray.WinUI/App.xaml.cs)
  • Ranjesh: Blame points the startup connection bridge and ShouldInitializeNodeService predicate to 9de9b5b, which introduced the tray configuration and MCP/node mode surface now being extended. (role: introduced startup and MCP mode surface; confidence: medium; commits: 9de9b5ba0f8a; files: src/OpenClaw.Tray.WinUI/App.xaml.cs, docs/MCP_MODE.md, docs/CONNECTION_ARCHITECTURE.md)
  • Scott Hanselman: 0d4fcbd recently hardened GatewayConnectionManager, App startup wiring, and related contract tests, so this person is a useful adjacent routing candidate for startup mode regressions. (role: adjacent connection hardening contributor; confidence: medium; commits: 0d4fcbd50ad5; files: src/OpenClaw.Connection/GatewayConnectionManager.cs, src/OpenClaw.Tray.WinUI/App.xaml.cs, tests/OpenClaw.Tray.Tests/AppRefactorContractTests.cs)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels May 31, 2026
@vincentkoc vincentkoc changed the title fix(node): report companion app version in handshakes fix(node): reconnect and report app version after restart May 31, 2026
@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 31, 2026
@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. merge-risk: 🚨 auth-provider 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 availability 🚨 Merging this PR could cause crashes, hangs, restart loops, stalls, or process outages. and removed rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels May 31, 2026
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 31, 2026
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 31, 2026
@clawsweeper clawsweeper Bot added merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. and removed merge-risk: 🚨 availability 🚨 Merging this PR could cause crashes, hangs, restart loops, stalls, or process outages. labels May 31, 2026
@clawsweeper clawsweeper Bot added status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 31, 2026
@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. P1 Urgent regression or broken agent/channel workflow affecting real users now. and removed status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. labels May 31, 2026
@clawsweeper clawsweeper Bot added status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. proof: sufficient Contributor real behavior proof is sufficient. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed proof: sufficient Contributor real behavior proof is sufficient. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 1, 2026
@vincentkoc vincentkoc marked this pull request as ready for review June 1, 2026 02:32
@vincentkoc vincentkoc merged commit 1d58d59 into master Jun 1, 2026
27 checks passed
@vincentkoc vincentkoc deleted the fix-node-version-surface branch June 1, 2026 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 auth-provider 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. P1 Urgent regression or broken agent/channel workflow affecting real users now. proof: sufficient Contributor real behavior proof is sufficient. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant