fix(buzz-dev-mcp): resolve non-WSL bash so the MCP shell works on Windows#1119
Merged
Conversation
…dows
The MCP shell spawned a bare `Command::new("bash")`, which on Windows
re-enters PATH search and resolves `System32\bash.exe` (the WSL launcher).
That fails at spawn with 0x8007072c, so the agent's only channel-posting
transport was fully dead on Windows.
resolve_bash() now returns an absolute, non-WSL bash path and spawns it
directly: GIT_BASH override -> installed Git for Windows -> bundled bash
staged next to the exe -> PATH scan excluding %SystemRoot%. Unix is a no-op.
The System32 exclusion compares path components case-INsensitively because
Windows paths are case-insensitive but `Path::starts_with` is not — a PATH
entry spelled `C:\WINDOWS\System32` would otherwise leak WSL's bash.
To make a bare (Git-less) host work, the build bundles a genuine bash: the
PortableGit MSYS runtime (`usr/` + `bin/`) with the separable `mingw64/`
git-program subtree dropped. The runtime is kept whole — bash loads its
DLLs lazily, so a hand-trimmed copy would pass an existence check yet fail
mid-command. The `git-bash` install-root dir name is a single path contract
shared byte-identical across shell.rs, build-release-config.mjs, and
bundle-sidecars.sh.
The download/extract/drop logic lives in scripts/stage-windows-bash.sh so
the Windows PR CI job can stage the tree and spawn the staged bash on a real
coreutils pipeline — gating the SFX extraction, the post-`mingw64`-drop spawn,
and the lazily-loaded DLL closure that would otherwise ship unexercised until
a tagged release.
Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
9466e28 to
211e588
Compare
The idempotency skip trusted a single file (bin/bash.exe) as proof of a whole stage. A cp -a interrupted between the sibling bin/ and usr/ subtrees can leave bash.exe present while usr/ is incomplete, and the skip would then accept that partial runtime and ship it. Unreachable in CI (fresh $RUNNER_TEMP) and release (fresh checkout), so not a shipping defect — eliminated proactively. A .stage-complete marker written last, only after cp -a and the bash.exe integrity check both pass, is positive proof the whole tree landed and is immune to which specific files a future partial leaves behind. An interrupted stage never writes it, so the skip falls through to a clean re-extract (the pre-extract rm -rf already cleans the partial). Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
… docs rust-1.95.0 clippy on the Windows toolchain flags two doc comments in shell.rs where a numbered list is immediately followed by a left-margin prose line, parsing the prose as a malformed lazy continuation of the last list item. Under -D warnings this failed the Windows Rust job at the Clippy step, skipping Check/Test/the bundled-bash smoke gate. A blank /// line between each list and its trailing paragraph is the standard markdown separation; doc-only, no behavior change. Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The three 'absolute path outside workspace is allowed' tests hardcoded /etc/hosts as the out-of-workspace fixture. On Windows that resolves to C:\etc\hosts, which does not exist, so the assertions got a path-not-found error (os error 3) instead of the intended outcome. Each test now writes a real file into a second tempdir outside the workspace root, giving a genuinely-absolute existing path on both platforms. Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The bundled-bash resolver and CI smoke-step targeted git-bash\bin\bash.exe, which is git-for-windows's 47KB compat launcher shim, not bash. The shim validates its install root by requiring a sibling mingw64\bin marker that Option 2 deliberately drops, so it printed "Top-level not found" and exited before running any command. Retarget the bundled spawn to the real MSYS2 bash at git-bash\usr\bin\bash.exe, which boots from its co-located msys-2.0.dll and needs only usr/. The installed-Git branch stays on bin\bash.exe: a real Git-for-Windows install has mingw64/, so its launcher is the correct entry. Also swap the smoke probe's `rev` (not bundled in PortableGit at all) for `tr a-z A-Z`, which is bundled in usr/bin/ with the same msys-2.0.dll closure, and assert the file we actually run (usr/bin/bash.exe) in the staging integrity check rather than the shim. Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The timeout_ms runaway-command guard was a no-op on Windows: the kill path was three cfg(not(unix)) stubs, so on timeout TerminateProcess killed the bash parent but orphaned its MSYS-forked grandchildren (e.g. sleep). The orphans held the stdout/stderr pipes open, blocking the reap until they self-exited — surfacing as shell::tests::timeout_fires eating ~986s of CI, and meaning a real agent's bash -c hitting its timeout was never killed. Replace the PID-keyed guard and free kill functions with a platform- abstracted KillGroup: Unix keeps the killpg behavior byte-for-byte; Windows assigns the child to a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE so terminating the job kills the whole tree atomically — the structural mirror of killpg. The job HANDLE is held in KillGroup for the whole run so Drop is the last-resort reaper; closing it earlier would fire KILL_ON_JOB_CLOSE and kill the child the instant spawn returned. Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The shell tool's async future holds a KillGroup across an .await and is spawned as a Send future, but on Windows KillGroup wraps a raw HANDLE (*mut c_void) which is neither Send nor Sync — so the Windows clippy gate failed to compile (5 errors, all from this one missing bound). A job-object handle is a thread-safe kernel reference, so the impls are sound; the Unix path was unaffected and already Send + Sync. Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The Job Object kill path is raw windows-sys FFI and is inherently unsafe, but the crate was #![forbid(unsafe_code)] — a hard wall an inner #[allow] cannot relax. Split the lint by target: Unix keeps the hard no-unsafe guarantee (forbid), Windows uses deny so the KillGroup FFI items can carry a scoped #[allow(unsafe_code)]. No safe Win32 API exists for job objects, mirroring the git-sign-nostr unsafe-FFI precedent. Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The MCP shell tool is the only channel-posting transport an agent has on Buzz, and it was fully dead on Windows.
shell.rsspawned a bareCommand::new("bash"), which on Windows re-enters PATH search and resolvesSystem32\bash.exe— the WSL launcher — failing at spawn with0x8007072cbefore any command runs.What changed
resolve_bash()now returns an absolute, non-WSLbashpath and spawns it directly. Probe order (first hit wins), Windows-only — Unix is a no-op that preserves the existing bare-bashbehavior:GIT_BASHenv override (explicit operator escape hatch)%ProgramFiles%\Git,%LocalAppData%\Programs\Git)std::env::split_paths, excluding%SystemRoot%so WSL'sbash.exeis never selectedThe
%SystemRoot%exclusion compares path components case-insensitively (viais_under_dir): Windows paths are case-insensitive butPath::starts_withis not, so a PATH entry spelledC:\WINDOWS\System32against a%SystemRoot%ofC:\Windowswould otherwise slip past the guard and re-resolve WSL's bash. The comparison is component-wise (not a lowercased substring match) so a sibling likeC:\Windows2is not falsely treated as inside%SystemRoot%.If none resolve, the resolver returns an actionable error before spawn instead of the cryptic
0x8007072c.Bundled bash
The app is self-contained, so it cannot assume Git for Windows is installed. There is no standalone "bash for Windows" upstream — working Windows bash ships only inside git-for-windows. The build downloads PortableGit
2.54.0and keeps only the MSYS2 bash runtime (usr/+bin/), dropping themingw64/git-program subtree (git.exeetc., which the MCP shell never invokes) as one separable unit.The retained runtime is kept whole — bash loads
msys-2.0.dlland other libraries lazily, and load-bearing pieces (terminfo,gawklibs) live alongside the docs underusr/, so a hand-trimmed copy could pass an existence check yet fail mid-command with a cryptic error. Droppingmingw64/removes a separable subtree without ever touching the bash runtime.The download/extract/drop logic lives in
scripts/stage-windows-bash.sh, a self-contained script with no release-binary precondition so CI can call it directly (see below).scripts/bundle-sidecars.shinvokes it for Windows targets.The
git-bashinstall-root directory name is a single path contract shared byte-identical across three sites, each carrying a cross-reference comment:scripts/bundle-sidecars.shdesktop/src-tauri/binaries/git-bashdesktop/scripts/build-release-config.mjsbundle.resourcessource → install-root targetgit-bashcrates/buzz-dev-mcp/src/shell.rsgit-bash\bin\bash.exerelative to the executable at runtimeCI
The
windows-rustjob gains two gates on realwindows-latest:cargo test -p buzz-dev-mcpruns thecfg(windows)bash-resolution tests (including the case-insensitive%SystemRoot%exclusion) — the package was never tested on a Windows runner before, so these had no coverage.scripts/stage-windows-bash.shand spawns the stagedbash.exeon a real coreutils pipeline (echo hello | rev). This exercises the PortableGit download, the SFX-oextraction, themingw64/drop, and the lazily-loaded MSYS DLL closure — behaviors that previously ran only inrelease.ymlon a tag, i.e. unexercised until a release reached users.Notes
The UTF-16 garbling in the original bug report was the WSL launcher's own error output; genuine MSYS2 bash writes UTF-8, so no output decode is added — one would corrupt real command output. The garbling disappears once a real bash is spawned.