Launch the MSBuild server process with Server GC #14043
The MSBuild server (the long-lived `/nodemode:8` process used when MSBUILDUSESERVER=1) hosts the build itself; under a multithreaded (/mt) build it runs all project work on threads in a single process, so Server GC's higher throughput pays off there. GC mode is fixed at CLR startup, so it is set via DOTNET_gcServer in the server's launch environment (MSBuildClient.TryLaunchServer). This is scoped to the server process only: sidecar TaskHosts (nodemode 2) and worker nodes (nodemode 1) are launched through other code paths that never set the knob, and the server resets its environment to the client's on the first build command, so they keep the default Workstation GC. An explicit user-set DOTNET_gcServer is honored, and MSBUILDDISABLESERVERGC=1 opts out. Adds tests verifying the server uses Server GC while TaskHost and worker nodes do not, and that the opt-out works. Documents the new variable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
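A minimal sketch of the launch-environment mechanism described in this commit message, assuming a plain `System.Diagnostics.Process` launch. The class and method names are hypothetical, and this is not the `MSBuildClient.TryLaunchServer` code; it only illustrates that the knob is set on the child's launch environment rather than on the parent process.

```csharp
using System.Diagnostics;

public static class GcServerLaunchSketch
{
    // Hypothetical helper: builds start info for a child process that should
    // start with Server GC. GC mode is fixed at CLR startup, so the variable
    // has to be present in the child's environment before it launches.
    public static ProcessStartInfo CreateStartInfo(string fileName, string arguments)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            // The Environment collection is only applied when the shell is not used.
            UseShellExecute = false,
        };

        // Affects only this child's launch environment; the current process
        // and any other children it starts are left untouched.
        startInfo.Environment["DOTNET_gcServer"] = "1";
        return startInfo;
    }
}
```

Passing the returned object to `Process.Start` would start the child with the variable set, while the parent's own environment stays as it was.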
The MSBuild server is experimental, so unconditionally launch it with Server GC (no MSBUILDDISABLESERVERGC escape hatch, no ChangeWave). Drops the opt-out test accordingly and documents the behavior in MSBuild-Server.md instead of the environment-variables list. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR updates MSBuild Server startup to launch the long-lived server process (/nodemode:8 when MSBUILDUSESERVER=1) with Server GC enabled by injecting DOTNET_gcServer=1 into the server process launch environment, and adds unit tests + documentation to validate and describe the behavior.
Changes:
- Inject `DOTNET_gcServer=1` into the MSBuild server process launch environment in `MSBuildClient.TryLaunchServer`.
- Extend `ProcessIdTask` to report `GCSettings.IsServerGC` and add new MSBuild Server unit tests asserting GC mode for server / taskhost / worker.
- Document the Server GC behavior in `documentation/MSBuild-Server.md`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Build/BackEnd/Client/MSBuildClient.cs | Adds DOTNET_gcServer override to the server process launch environment. |
| src/MSBuild.UnitTests/MSBuildServer_Tests.cs | Adds a new IsServerGC probe output + tests covering server/taskhost/worker GC mode. |
| documentation/MSBuild-Server.md | Documents that the server node is launched with Server GC via DOTNET_gcServer. |
Benchmark evidence (part 1, updated): 4-config comparison across 21 workloads
Full local repro is in a zip on the PR thread (harness + scenario generators + raw
Design: 4 configurations per scenario, all vs one baseline
The change only affects the server process, and the server only does build work under
Methodology (rigor)
Results — positive % = faster than baseline C1
Aggregate
Honest read
For the PR specifically, the relevant column is C3→C4 (server with vs without Server GC). The win is real and significant where
Note that these numbers are without SDK task migrations, because this is a repo bootstrap, not net11 from the VMR.
Benchmark evidence (part 2, definitive): 63 scenarios, 1,407 measurements
Expanded the study to 63 diverse workloads (templates, synthetic project-count/file-count sweeps, fan-in / deep-chain / layered dependency graphs, and 15 real OSS repos), 1,407 valid timed builds. Contested scenarios were boosted to n=20 iterations to tighten confidence intervals. Same 4-config method and baseline as part 1 (
Headline: the part-1 "regressions" were statistical noise; the wins are real
In part 1 a couple of small/fast synthetic scenarios looked like significant regressions at n=5. With n=20 they all collapse to within-noise, while the genuine wins held up and gained significance:
Isolated GC effect (C3→C4: server with Workstation vs Server GC): the apples-to-apples comparison for this PR
Positive % = branch (Server GC) faster. Across all 63 scenarios:
Selected real repos (C3→C4):
Control:
Repro package
Full local repro (harness, scenario generators, raw
Contents & step-by-step instructions in README.md inside the zip. The large artifacts (the two bootstraps, the 24 cloned repos, and the generated synthetic solutions, together ~18 GB) are not included; the README documents how to regenerate them (build two bootstraps from this branch vs main, run the generators, clone via
Zip will be attached to the PR by @JanProvaznik.
/review
✅ Expert Code Review (command) completed successfully! Performance & Allocation Awareness review complete.
Review summary
| # | Dimension | Verdict |
|---|---|---|
| 1 | Backwards Compatibility | 🟡 1 MODERATE |
| 4 | Test Coverage | 🟡 1 MODERATE |
| 16 | Idiomatic C# | 🔵 1 NIT |
✅ 21/24 dimensions clean.
- Backwards Compatibility: `DOTNET_gcServer=1` injected unconditionally; no escape hatch for memory-constrained environments. Users who have explicitly set `DOTNET_gcServer=0` will be silently overridden. The ~300 MB peak working-set increase documented in the benchmarks could trigger OOM kills in 2 GB CI containers.
- Test Coverage: No test for the scenario where the parent environment has `DOTNET_gcServer=0` (user opt-out). This is the exact path where the override semantics differ from inheriting a parent setting.
- Idiomatic C#: Minor: variable typed as `IDictionary<string, string?>?` and immediately replaced by a concrete `Dictionary<string, string?>`.
Overall impression: The approach is sound — using DOTNET_gcServer in the child's launch environment is the correct mechanism, the scoping via SetEnvironment is well-analysed, the tests cover the three critical isolation scenarios, and the documentation is accurate. The main open question is the opt-out story: since this is an experimental feature the PR explicitly opts out of ChangeWaves, but the unconditional memory increase may affect memory-constrained CI users. Adding a MSBUILDSERVERGC=0 (or honouring an existing DOTNET_gcServer=0) escape hatch, and a test covering that path, would make this change production-ready.
Generated by Expert Code Review (command) for issue #14043 · 2.8K AIC · ⌖ 13 AIC · ⊞ 29.9K ambient context
The Server GC benefit applies only when the server process itself does the project work, which happens under /mt. Without /mt the server merely orchestrates and delegates to separate worker nodes, so Server GC there adds memory cost for no throughput gain. Detect /mt (and MSBUILDFORCEMULTITHREADED) from the launching invocation's command line in MSBuildClient.TryLaunchServer and only then inject DOTNET_gcServer into the server launch environment. Tests: split the server-GC coverage into multithreaded (Server GC) and non-multithreaded (Workstation GC) cases, and run the TaskHost case under /mt so the server genuinely has Server GC while the sidecar TaskHost stays Workstation. The probe task is marked [MSBuildMultiThreadableTask] so under /mt it runs in-process in the server (observing the server's GC mode) rather than being routed to a sidecar TaskHost. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Honor an explicit user-set DOTNET_gcServer: only inject the default when the user has not set it, so DOTNET_gcServer=0 (e.g. memory-constrained containers) is respected. No new MSBuild-specific opt-out flag is added.
- Preserve the base overrides dictionary's comparer when copying (use a separate local for the shared base vs the launch-specific copy) instead of forcing OrdinalIgnoreCase.
- Fix 'Github' -> 'GitHub' in the added test message string.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
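The gating rules from the last two commits (inject only for a multithreaded build, never override a value the user set, keep the base dictionary's comparer) can be sketched roughly as follows. This is an illustration under assumptions, not the code in `MSBuildClient`: the type name, the method signature, and the delegate used to read the ambient environment are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ServerLaunchEnvironment
{
    private const string GcServerVariable = "DOTNET_gcServer";

    // Hypothetical helper: returns a launch-specific copy of the overrides,
    // leaving the shared base dictionary unmodified.
    public static Dictionary<string, string?> Build(
        IDictionary<string, string?>? baseOverrides,
        bool multiThreaded,
        Func<string, string?> getEnvironmentVariable)
    {
        // Reuse the base dictionary's comparer when it has one, instead of
        // forcing a particular comparer on the copy.
        IEqualityComparer<string>? comparer =
            (baseOverrides as Dictionary<string, string?>)?.Comparer;

        var launch = baseOverrides is null
            ? new Dictionary<string, string?>(comparer)
            : new Dictionary<string, string?>(baseOverrides, comparer);

        // An explicit value from the user (ambient environment or existing
        // override) wins, so DOTNET_gcServer=0 is respected.
        bool userSet =
            !string.IsNullOrEmpty(getEnvironmentVariable(GcServerVariable))
            || launch.ContainsKey(GcServerVariable);

        // Only a multithreaded build makes the server do the project work.
        if (multiThreaded && !userSet)
        {
            launch[GcServerVariable] = "1";
        }

        return launch;
    }
}
```

In real use the delegate would typically be `Environment.GetEnvironmentVariable`; taking it as a parameter keeps the sketch testable without mutating the process environment.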
Thanks for the automated review. Addressed in the latest commits:
New behavior (since these reviews): Server GC is now gated on
Honor user-set
Comparer NIT: the copy now preserves the base overrides dictionary's comparer (separate
"Github" → "GitHub": fixed in the added test string.
Test coverage: the server-GC tests now cover multithreaded server = Server GC, non-multithreaded server = Workstation GC, and TaskHost-under-
Replace the bespoke command-line scan in MSBuildClient (IsMultiThreadedBuildRequested) with the canonical determination already used by the in-proc build path. XMake's CanRunServerBasedOnCommandLineSwitches already does the full GatherAllSwitches parse (expanding response files); have it also compute multithreaded via MSBuildApp.IsMultiThreadedEnabled and plumb the bool through MSBuildClientApp.Execute into MSBuildClient. This fixes a gap where the old scan missed /mt provided via a response file, and removes duplicated switch-parsing logic. A new MSBuildClient(commandLine, msbuildLocation, multiThreaded) overload carries the flag; the existing two-arg constructor is preserved. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
I suspect this oversells it--the scheduler and initial evaluation stuff could still benefit from Server GC. We're choosing this scoping to limit the blast radius of "Server GC has unexpected negative effects".
rainersigwald left a comment
The only other concern I have is that I'd love to be able to disambiguate this change from others in MT benchmark tracking--we should be able to be faster even without this change (but with this change . . .)
Co-authored-by: Rainer Sigwald <raines@microsoft.com>
Summary
Launch the MSBuild server process (the long-lived `/nodemode:8` process used when `MSBUILDUSESERVER=1`) with Server GC.
The server hosts the build itself: under a multithreaded (`/mt`) build it runs all project work on threads in a single process, so Server GC's higher throughput is beneficial. This is scoped to the server process only; sidecar TaskHosts and worker nodes keep the default Workstation GC.
The MSBuild server is experimental, so this is an unconditional change: no opt-out env var and no ChangeWave.
Why this approach
GC mode is locked at CLR startup and can't be changed at runtime. The server node is a freshly spawned process whose launch environment we control in `MSBuildClient.TryLaunchServer`, so we inject `DOTNET_gcServer=1` there. We deliberately do not set `System.GC.Server` in `MSBuild.runtimeconfig.json`, because that file is shared by all roles (entry/worker/server/TaskHost) and would flip the sidecars too.
`DOTNET_gcServer` only affects .NET (CoreCLR) and, since .NET 9, takes precedence over `runtimeconfig.json`. On .NET Framework it is a no-op (Framework GC is `app.config`-driven), so this is effectively a .NET-Core-server optimization.
Scoping / no leak to other nodes
Only the server launch sets the knob. Sidecar TaskHost (nodemode 2) and worker (nodemode 1) nodes are launched via separate code paths that never set it. Additionally, on the first build command the server resets its own environment to the client's (`OutOfProcServerNode` → `CommunicationsUtilities.SetEnvironment`), which removes `DOTNET_gcServer` before the server ever spawns children, so it can't leak via inheritance either. (As a side effect, `DOTNET_gcServer` reads back as null inside the server even though `IsServerGC` is true; the GC mode was already locked at startup.)
Benchmark (local)
OrchardCore full solution (`OrchardCore.slnx`, 233 projects), `Rebuild -mt -m`, 16 cores, warm reused server, drift-controlled (baseline measured first and last):
~10–13% faster. Every Server-GC iteration beat every Workstation iteration (no distribution overlap), and the last Workstation block was slower than the first, so machine drift worked against the result. Cost: ~+300 MB peak working set, as expected for Server GC.
Tests
`src/MSBuild.UnitTests/MSBuildServer_Tests.cs` (gated `#if NET`; each test uses a unique `MSBUILDNODEHANDSHAKESALT` so no leftover server is reused, clears ambient GC env vars, and registers spawned PIDs for cleanup):
- `ServerProcessUsesServerGC`: server process reports `IsServerGC == true`.
- `TaskHostProcessDoesNotUseServerGC`: a TaskHost spawned by a Server-GC server reports `false`.
- `WorkerNodeDoesNotUseServerGC`: an out-of-proc worker reports `false`.
`GCSettings.IsServerGC` is surfaced through a new `[Output]` on the existing `ProcessIdTask`.
Notes
- `DOTNET_gcServer` is a no-op there); tests are `#if NET`.
- `documentation/MSBuild-Server.md`.
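The GC-mode checks in the tests listed above rest on `GCSettings.IsServerGC`, which reports the mode the runtime locked in at startup. The probe below is a hypothetical standalone helper, not the `ProcessIdTask` change from this PR; it reports the process ID, the effective GC mode, and the current value of the environment variable, which can differ from the effective mode once the environment has been reset.

```csharp
using System;
using System.Runtime;

public static class GcModeProbe
{
    // Hypothetical helper: describes the current process's GC mode.
    public static string Describe()
    {
        // The variable may read back as null even when Server GC is active,
        // because the GC mode was fixed when the runtime started.
        string envValue = Environment.GetEnvironmentVariable("DOTNET_gcServer") ?? "<null>";

        return $"pid={Environment.ProcessId} IsServerGC={GCSettings.IsServerGC} DOTNET_gcServer={envValue}";
    }
}
```

Printing `GcModeProbe.Describe()` from inside each process role (server, TaskHost, worker) would be one way to confirm by hand which roles actually run with Server GC.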