fix: make managed-agent spawn and teardown portable to Windows #1097
Merged
Several spawn, teardown, and shim paths were #[cfg(unix)]-only and silently no-op (or returned a falsey stub) on Windows, breaking the desktop build four ways. All four are in the same Windows-portability theme.

#2/#4 (MCP PermissionDenied on C:\Windows): buzz-agent spawns the MCP with env_clear() then re-adds only an allowlist that omitted the Windows temp/profile vars. Stripped of TMP/TEMP/USERPROFILE, std::env::temp_dir() falls back to C:\Windows and Shim::install() can't write there. Pass the Windows vars through (cfg-gated). The shim itself had two more Unix-isms in the same install path: the PATH separator was hardcoded ':' (now std::env::join_paths) and the non-unix multicall copies dropped the .exe extension PATHEXT needs to exec them. Multicall dispatch now matches on file_stem() so the .exe copies route correctly.

#1 (stray console + orphaned process tree): the buzz-acp child spawned with no CREATE_NO_WINDOW (console popped) and the non-unix stop path was Child::kill(), which kills only the harness and orphans the 24 workers + MCP servers. A Win32 Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE now owns the tree and reaps it when the handle drops, the Windows mirror of the Unix process-group teardown. The after-restart path (PID only, no handle) falls back to taskkill /T. The Windows primitives live in a new process_lifecycle module.

#4 (program not found): the create-path default agent command was the bare `goose`, not on PATH on a stock Windows install. It now catalog-resolves the bundled `buzz-agent`, the same shape mesh_llm::preset already uses.

#3 (updater silently does nothing): when the updater plugin is unavailable the hook collapsed to `idle`, re-rendering the same button, indistinguishable from a no-op. It now sets a visible `unavailable` state and warns to the log so the firing branch is diagnosable on Will's Windows build.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
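For readers who want the three #2/#4 fixes in one place, here is a minimal std-only sketch. It is not the code from `mcp.rs` or `shim.rs`: the function names are hypothetical, the `include_windows` flag stands in for the real `#[cfg(windows)]` gate so the logic can be tried on any host, and the base allowlist entries other than `TMPDIR` are assumptions.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};

/// Base allowlist. Only `TMPDIR` is confirmed by the PR text; `PATH` and
/// `HOME` are assumed here for illustration.
const PASSTHROUGH_ENV: &[&str] = &["PATH", "HOME", "TMPDIR"];

/// Windows additions named in the PR description.
const PASSTHROUGH_ENV_WINDOWS: &[&str] =
    &["TMP", "TEMP", "USERPROFILE", "LOCALAPPDATA", "APPDATA"];

/// Keep only allowlisted variables from a snapshot of the parent environment.
/// Matching is case-sensitive, which is a simplification: Windows treats
/// environment variable names case-insensitively.
fn filter_env(parent: &[(&str, &str)], include_windows: bool) -> Vec<(String, String)> {
    parent
        .iter()
        .filter(|(name, _)| {
            PASSTHROUGH_ENV.contains(name)
                || (include_windows && PASSTHROUGH_ENV_WINDOWS.contains(name))
        })
        .map(|(name, value)| (name.to_string(), value.to_string()))
        .collect()
}

/// Prepend `shim_dir` to an existing PATH value using the platform separator
/// (':' on Unix, ';' on Windows) instead of a hardcoded ':'.
fn prepend_to_path(shim_dir: &Path, current: &OsStr) -> Result<OsString, env::JoinPathsError> {
    let mut dirs: Vec<PathBuf> = vec![shim_dir.to_path_buf()];
    dirs.extend(env::split_paths(current));
    env::join_paths(dirs)
}

/// File name for a multicall copy: `rg.exe` on Windows, `rg` elsewhere.
fn shim_file_name(tool: &str) -> String {
    format!("{tool}{}", env::consts::EXE_SUFFIX)
}

/// Name used to pick a multicall match arm. `file_stem()` yields `rg` for
/// both `rg` and `rg.exe`, where `file_name()` would yield `rg.exe`.
fn multicall_name(argv0: &Path) -> Option<&str> {
    argv0.file_stem().and_then(|stem| stem.to_str())
}
```

One caveat of the `file_stem()` approach: it strips whatever follows the last dot, so it only suits tool names that contain no dots of their own.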
The doc comment claimed buzz-acp connects to the relay before spawning its workers, making the spawn-to-assign window structurally empty. Source contradicts this: in crates/buzz-acp/src/lib.rs the agent pool is built (agent_pool_ready, line 1061) before the relay connect (line 1098), and Will's Windows log confirms that order. The window is closed by assign-latency (microsecond synchronous Win32 calls beating buzz-acp's tens-of-ms startup), not by child ordering. Comment-only change.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The #[cfg(windows)] code paths (Job Object kill-tree, multicall shim, MCP env passthrough) were never compiled by CI: no Linux job builds the MSVC target, and aws-lc-sys needs windows.h, so they shipped verified only by inspection.

Add a windows-latest job that runs clippy, a workspace cargo check, and the Tauri-crate check + test against x86_64-pc-windows-msvc, gating exactly the Windows arms. Uses dtolnay/rust-toolchain rather than hermit (which the Linux jobs use) because hermit does not provide MSVC; mirrors release.yml's release-windows toolchain. Sidecar stubs are created before any Tauri compile because Tauri validates externalBin at compile time.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The new windows-rust CI job's Tauri-crate test step (--all-targets) was the first compiler to build these tests against MSVC, and it failed with 10 E0433 errors: migration_tests.rs and migration_team_dir_tests.rs call std::os::unix::fs::symlink directly with no cfg guard. Production code is already correctly #[cfg(unix)] / #[cfg(not(unix))] gated, so the workspace check and clippy passed; only the test compile reached these targets.

These tests assert Unix symlink semantics (create symlink, heal/replace it, read through it); there is nothing to verify on Windows, where the production path copies instead. Gate each symlink-using test plus the two helpers they exclusively use (setup_sync_layout, sync_files) so the helpers do not trip dead_code under -D warnings on Windows.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
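The gating pattern this commit describes might look roughly like the sketch below. The names `sync_file`, `points_at`, and `symlink_is_created` are hypothetical and do not come from migration_tests.rs; the point is only that the Unix-only test and the helper it exclusively uses carry the same `#[cfg(unix)]` gate, so neither reaches the MSVC compiler.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Unix arm: make `dst` a symlink to `src`.
#[cfg(unix)]
fn sync_file(src: &Path, dst: &Path) -> io::Result<()> {
    std::os::unix::fs::symlink(src, dst)
}

/// Non-Unix arm: copy instead, as the commit says production does on Windows.
#[cfg(not(unix))]
fn sync_file(src: &Path, dst: &Path) -> io::Result<()> {
    fs::copy(src, dst).map(|_| ())
}

/// Helper used only by symlink assertions. Gating it alongside its callers
/// keeps it from becoming dead code when the Unix-only tests are compiled out.
#[cfg(unix)]
fn points_at(link: &Path, target: &Path) -> bool {
    fs::read_link(link)
        .map(|p| p.as_path() == target)
        .unwrap_or(false)
}

/// Unix-only test: without the gate, the `points_at` call (and any direct
/// `std::os::unix` path) would fail to resolve on Windows.
#[cfg(unix)]
#[test]
fn symlink_is_created() {
    let dir = std::env::temp_dir().join(format!("sync_sketch_unit_{}", std::process::id()));
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir).unwrap();
    let (src, dst) = (dir.join("src.txt"), dir.join("dst.txt"));
    fs::write(&src, "hello").unwrap();
    sync_file(&src, &dst).unwrap();
    assert!(points_at(&dst, &src));
    fs::remove_dir_all(&dir).unwrap();
}
```

Reading `dst` back works on either arm, so a platform-neutral assertion on file contents can stay ungated while the symlink-specific assertions are compiled only on Unix.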
zizmor's superfluous-actions check flagged the dtolnay/rust-toolchain step: windows-latest preinstalls rustup, which honors the repo-root rust-toolchain.toml (channel 1.95.0, profile = default). That profile already provides clippy, and the runner's host triple is x86_64-pc-windows-msvc, so both the explicit toolchain install and the targets/components inputs were no-ops. release.yml's release-windows job keeps its own copy (out of scope, not flagged).

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
632e866 to 5690b2f
The windows-rust rust-cache had no workspaces key, so it defaulted to the repo root and never cached desktop/src-tauri's separate target dir. The Check/Test (Tauri crate) steps are the heaviest compile on the job and rebuilt cold every run. Mirror the Desktop E2E Relay pattern to cache both workspaces, matching every other job that builds the crate.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
reconcile_team_dirs_in_file built the rewritten path with a single
target_dir.join("agents/teams"). On Windows, join does not split the
embedded '/', so it persisted a mixed-separator path
(...app.dev\agents/teams\id) into managed-agents.json — unlike fresh
writes, which build the path per-component and stay all-native. Split
the join so reconcile emits the same native-separator path the rest of
the system stores. Tests now build expectations via shared team_dir /
pack_dir helpers using the same per-component join, so they assert real
production output on both Unix and Windows.
Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
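A small sketch of the difference, assuming hypothetical helper names (`team_dir` appears in the commit text, but its real signature is not shown there). `Path::join` appends its argument as given, so on Windows an embedded `/` survives into the stored string, while a per-component join produces native separators only.

```rust
use std::path::{Path, PathBuf};

/// Single join with an embedded '/'. On Windows the '/' is kept verbatim,
/// producing a mixed-separator string such as `...\agents/teams\id`.
fn team_dir_single_join(target_dir: &Path, id: &str) -> PathBuf {
    target_dir.join("agents/teams").join(id)
}

/// Per-component join: every separator is inserted by `join`, so the result
/// is all-native on both Unix and Windows.
fn team_dir(target_dir: &Path, id: &str) -> PathBuf {
    target_dir.join("agents").join("teams").join(id)
}

/// The value that would be persisted. Comparing these strings is stricter
/// than comparing `Path` values, because `Path` equality is component-wise.
fn persisted(path: &Path) -> String {
    path.to_string_lossy().into_owned()
}
```

On Unix the two helpers produce identical strings, which is why a test that hardcodes `/` in its expectation cannot catch the mixed-separator case; building the expectation through the same per-component helper lets one assertion hold on both platforms.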
tlongwell-block pushed a commit that referenced this pull request Jun 18, 2026
…te-response * origin/main: (194 commits) Fold agent core memory into the session system prompt (#1112) feat(cli): add patches and issues commands for NIP-34 git collaboration (#1073) fix(desktop): stop random timeline message loss + page reconnect replay (#1105) Update README.md fix(desktop): keep thread replies from scrolling channel (#1109) fix(buzz-acp): accept siblings under allowlist author gate (#1108) feat(deploy): add production Helm chart for Buzz (#990) fix(desktop): keep MembersSidebar input usable while an add is in flight (#1106) chore(release): release version 0.3.25 (#1102) fix(desktop): stop dimming deferred message lists (#1104) Smooth channel loading: single-surface timeline state machine (#1099) feat: surface base + persona system prompts in observer feed (#1103) ci: move reminder e2e to a dedicated backend-integration job (#1098) fix: give agent-observer sub a replay-capable limit (#1100) fix: make managed-agent spawn and teardown portable to Windows (#1097) fix(desktop): constrain message timeline width with min-w-0 (#1092) feat(desktop): reminders notifications, snooze, overlay, and inbox view mode (#1093) feat(prompt): add memory hygiene and hoist universal engineering discipline to base prompt (#1085) fix(desktop): correct thread-unread badge flicker, stale clear, phantom count, mention gate, and nested count (#1080) Fix mention chip alignment (#1094) ... # Conflicts: # crates/buzz-cli/src/commands/workflows.rs
tlongwell-block pushed a commit that referenced this pull request Jun 18, 2026
…te-response * origin/main: (194 commits) Fold agent core memory into the session system prompt (#1112) feat(cli): add patches and issues commands for NIP-34 git collaboration (#1073) fix(desktop): stop random timeline message loss + page reconnect replay (#1105) Update README.md fix(desktop): keep thread replies from scrolling channel (#1109) fix(buzz-acp): accept siblings under allowlist author gate (#1108) feat(deploy): add production Helm chart for Buzz (#990) fix(desktop): keep MembersSidebar input usable while an add is in flight (#1106) chore(release): release version 0.3.25 (#1102) fix(desktop): stop dimming deferred message lists (#1104) Smooth channel loading: single-surface timeline state machine (#1099) feat: surface base + persona system prompts in observer feed (#1103) ci: move reminder e2e to a dedicated backend-integration job (#1098) fix: give agent-observer sub a replay-capable limit (#1100) fix: make managed-agent spawn and teardown portable to Windows (#1097) fix(desktop): constrain message timeline width with min-w-0 (#1092) feat(desktop): reminders notifications, snooze, overlay, and inbox view mode (#1093) feat(prompt): add memory hygiene and hoist universal engineering discipline to base prompt (#1085) fix(desktop): correct thread-unread badge flicker, stale clear, phantom count, mention gate, and nested count (#1080) Fix mention chip alignment (#1094) ... Co-authored-by: Tyler Longwell <tlongwell@squareup.com> Signed-off-by: Tyler Longwell <tlongwell@squareup.com> # Conflicts: # crates/buzz-cli/src/commands/workflows.rs
Several managed-agent spawn, teardown, and shim paths were `#[cfg(unix)]`-only and silently no-op (or returned a falsey stub) on Windows, breaking the desktop build four ways. All four fixes share one Windows-portability theme.

#2 / #4: MCP `PermissionDenied` on `C:\Windows`

`buzz-agent` spawns the MCP with `cmd.env_clear()` then re-adds only an allowlist (`PASSTHROUGH_ENV`) that had `TMPDIR` but none of the Windows temp/profile vars. Stripped of `TMP`/`TEMP`/`USERPROFILE`, `std::env::temp_dir()` falls all the way back to `C:\Windows`, where `Shim::install()` can't create its tempdir → `PermissionDenied (os error 5)` → every MCP init dies.

- `crates/buzz-agent/src/mcp.rs`: cfg-gated `PASSTHROUGH_ENV_WINDOWS` adds `TMP`, `TEMP`, `USERPROFILE` (load-bearing for `temp_dir()`; `USERPROFILE` is the always-set floor) plus `LOCALAPPDATA`, `APPDATA` (child-tool config: git, etc.).

The shim install path had two more Unix-isms that would silently break `buzz`/`git` shell-outs even after `temp_dir()` was fixed:

- `crates/buzz-dev-mcp/src/shim.rs`: PATH was built with a hardcoded `:` separator, now `std::env::split_paths`/`join_paths` (platform separator). The `#[cfg(not(unix))]` multicall copy dropped the `.exe` extension PATHEXT needs to treat the file as runnable, now appended.
- `crates/buzz-dev-mcp/src/lib.rs`: multicall dispatch matches on `file_stem()` instead of `file_name()`, so the `.exe` copies (`rg.exe`, `buzz.exe`, ...) route to the correct match arm.

#1: stray `buzz-acp.exe` console + orphaned process tree

The `buzz-acp` child spawned with no `CREATE_NO_WINDOW` flag (a console window popped and lingered), and the non-unix stop path was `Child::kill()`, which kills only the harness and orphans the 24 agent workers + MCP servers it spawned.

- `desktop/src-tauri/src/managed_agents/process_lifecycle.rs` (new, `#[cfg(windows)]`): a Win32 Job Object (`JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`) owns the harness tree and reaps it when the handle drops, the Windows mirror of the Unix `process_group(0)` teardown. The after-restart path (PID-only, no handle) falls back to `taskkill /T`.
- `desktop/src-tauri/src/managed_agents/runtime.rs`: `CREATE_NO_WINDOW` at the spawn site; `stop_managed_agent_process` drops the job handle on Windows (falls back to `Child::kill()` if assignment failed); `terminate_process` delegates to `taskkill_tree`; the augmented-PATH builder also moves off the hardcoded `:` separator.
- `desktop/src-tauri/src/managed_agents/types.rs`: `ManagedAgentProcess` carries the `#[cfg(windows)]` job handle.

The Job Object dies with its owner, so a co-located sibling instance's agents are never affected; no env-read PEB walk or identity-matched sweep is needed.
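The PID-only fallback and the console suppression described above could be sketched roughly as follows. This is not the code in `process_lifecycle.rs`: the function names are hypothetical, `/T` comes from the PR text while `/PID` and `/F` (force) are assumed flags, and the Job Object itself is left out because it needs Win32 bindings rather than std alone.

```rust
use std::process::Command;

/// Build (but do not run) a `taskkill` invocation that targets a PID and,
/// via `/T`, the child processes it started.
fn taskkill_tree_command(pid: u32) -> Command {
    let pid = pid.to_string();
    let mut cmd = Command::new("taskkill");
    cmd.args(["/PID", pid.as_str(), "/T", "/F"]);
    hide_console(&mut cmd);
    cmd
}

/// Windows arm: set the Win32 process-creation flag that suppresses the
/// console window for the spawned child.
#[cfg(windows)]
fn hide_console(cmd: &mut Command) {
    use std::os::windows::process::CommandExt;
    const CREATE_NO_WINDOW: u32 = 0x0800_0000;
    cmd.creation_flags(CREATE_NO_WINDOW);
}

/// Other platforms: nothing to do.
#[cfg(not(windows))]
fn hide_console(_cmd: &mut Command) {}
```

A caller on Windows would run it with `taskkill_tree_command(pid).status()`. Unlike the Job Object, this path depends on the PID still referring to the original process, which is presumably why the PR reserves it for the after-restart case where no handle survives.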
#4: `agent_cmd=goose acp`, "program not found"

With an empty `managed-agents.json`, a freshly-created agent fell through to the platform default `goose`, which isn't on PATH on a stock Windows install → all 24 workers failed with `program not found`.

- `desktop/src-tauri/src/managed_agents/discovery.rs`: new `default_agent_command()` catalog-resolves the bundled `buzz-agent`, the same shape `mesh_llm::preset` already uses, so the default can't drift from the provider definition. `buzz-agent` takes no `acp` arg, so there's no arg leakage.
- `desktop/src-tauri/src/commands/agents.rs`, `types.rs`: create path uses the resolver; the dead `DEFAULT_AGENT_COMMAND = "goose"` const is removed.

#3: "Check for Updates" silently does nothing

`v0.3.24` is the latest tag, so the build is genuinely up-to-date. But when the updater plugin is unavailable, the hook collapsed to `idle`, re-rendering the same "Check for Updates" button, indistinguishable from a no-op.

- `desktop/src/features/settings/hooks/use-updater.ts`: the unavailable branch now `console.warn`s (so the firing branch is diagnosable in the Windows app log) and sets a visible `unavailable` state instead of `idle`.
- `desktop/src/features/settings/UpdateChecker.tsx`: renders a clear "Automatic updates aren't available on this build" row.

Verification
Verified on the macOS dev host: `just desktop-check`, `just desktop-test` (922 TS tests), `just desktop-tauri-test` (547 Rust tests incl. the new `default_agent_command` test and `windows_passthrough_includes_temp_dir_vars`), `cargo clippy`/`cargo fmt` clean across the touched workspace crates and the Tauri crate.

The `#[cfg(windows)]` Job Object, `CREATE_NO_WINDOW`, shim install, and the updater branch are not exercised by the host test suite and were not type-checked by a Windows compiler (no local MSVC toolchain). They were verified by inspecting `windows-sys 0.61` symbol signatures. Windows CI / a manual run on Will's box is the real gate for those surfaces.

A separate minor issue was noted but intentionally not addressed: devtools is unreachable on release Windows builds because `windows_subsystem="windows"` non-debug compiles the inspector out.