feat(aspire): tear down Aspire on test-run abort via session cancellation token#6292
Conversation
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 6 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Code Review — feat(aspire): tear down Aspire on test-run abort via engine cancellation token
The PR addresses a real gap: Aspire startup was invisible to the run-abort signal, leaving containers leaking and blocking shutdown for the full timeout. The approach is sound. A few concerns worth discussing before merging.
Medium: GlobalContext.CancellationToken naming creates ambiguity
The new public property (GlobalContext.CancellationToken) is named identically to TestContext.CancellationToken, which already exists for per-test cancellation. Both are CancellationToken properties on "context" objects, but with fundamentally different scopes — session vs. per-test.
Ironically, the fixture's own wrapper is called RunCancellationToken, which is already clearer. Leaking CancellationToken onto GlobalContext (vs. the fixture's RunCancellationToken) creates a naming inconsistency in the same PR. Users scanning both contexts will not immediately understand the scope difference.
Suggested names: SessionCancellationToken, RunAbortToken, or (to match the fixture) RunCancellationToken.
This is a public API change, and renaming after the fact requires a deprecation cycle, so it's worth getting right now.
Medium: Fire-and-forget in abort callback silently swallows teardown exceptions
_ = fixture.StopAndDisposeAsync();The comment notes that DisposeAsync awaits the same memoized task, but that's only true if DisposeAsync is actually reached. Under a forced-kill scenario (e.g., the abort timeout fires before session cleanup completes), neither path observes the exception. Teardown errors disappear silently.
Consider logging errors within the callback instead of relying on the outer await:
_abortRegistration = RunCancellationToken.Register(static async state =>
{
var fixture = (AspireFixture<TAppHost>)state!;
fixture.LogProgress("Test run aborted — tearing down Aspire application...");
try { await fixture.StopAndDisposeAsync(); }
catch (Exception ex) { fixture.LogProgress($"Teardown error on abort: {ex.Message}"); }
}, this);(Or use a non-async callback and Task.Run + logging — the key is not swallowing the exception.)
Minor: Abort teardown registered after BuildAsync, missing a window
RegisterAbortTeardown() is called after BuildAsync succeeds. If the run is aborted during BuildAsync, the linked RunCancellationToken will interrupt the build (good), but no abort callback is registered yet. Cleanup then depends entirely on TUnit calling DisposeAsync on the failed initializer.
StopAndDisposeCoreAsync() already handles _app == null gracefully, so RegisterAbortTeardown() could safely be moved to just after the OTLP receiver setup (before CreateAsync). This would eliminate the gap without introducing risk.
Positive observations
- Memoized single-shot teardown (
_teardownTask ??= StopAndDisposeCoreAsync()inside a lock) is the right pattern here. It correctly serialises both the abort-callback path and theDisposeAsyncpath onto the same work. _abortRegistration.Dispose()beforeStopAndDisposeAsync()inDisposeAsyncis correct — it prevents a racing late-firing callback while the memoised task already handles idempotency.CancellationTokenSource.CreateLinkedTokenSource+CancelAfteris cleaner thannew CancellationTokenSource(timeout)and correctly composes the two signals.OperationCanceledExceptioncatch ordering — checkingRunCancellationToken.IsCancellationRequestedfirst so an abort is surfaced as cancellation rather than a misleadingTimeoutExceptionis the right call.- PublicAPI snapshots updated across all four target-framework variants — no gaps.
GlobalContext.CancellationTokendefaults toNoneoutside the engine, so the abort registration becomes a no-op and existing behaviour is unchanged. Well documented.
Code Review — feat(aspire): tear down Aspire on test-run abort via engine cancellation tokenThe PR addresses a real gap: Aspire startup was invisible to the run-abort signal, leaving containers leaking and blocking shutdown for the full timeout. The approach is sound. A few concerns worth discussing before merging. Medium:
|
caff369 to
9e1bbfc
Compare
There was a problem hiding this comment.
Follow-up Review — feat(aspire): tear down Aspire on test-run abort via session cancellation token
The implementation looks correct and the feature fills a real gap. However, the three issues raised in the two previous Claude reviews remain unaddressed. Noting them again here with updated context.
Still Open: TestSessionContext.CancellationToken naming ambiguity
The property is still named CancellationToken on TestSessionContext. TestContext already has a CancellationToken (per-test scope), so users who use both types will see identically-named properties with fundamentally different lifetimes and semantics.
This is a public API change — renaming it after release requires a deprecation cycle. The fixture's own RunCancellationToken (in the same PR) is already a clearer name. SessionCancellationToken would make the scope unambiguous at a glance.
Why it matters: A user who wires up TestSessionContext.Current?.CancellationToken in their own fixture but means the per-test token will get silent, hard-to-debug lifetime bugs because the name doesn't distinguish the scopes.
Still Open: Fire-and-forget in abort callback swallows teardown errors
_ = fixture.StopAndDisposeAsync();The memoized task means DisposeAsync won't repeat the work — but if the process is killed before DisposeAsync is reached (the abort-timeout scenario this PR is specifically trying to defend against), exceptions from StopAndDisposeCoreAsync are lost completely. The user sees no trace of what went wrong with the container teardown.
Suggested fix:
_abortRegistration = RunCancellationToken.Register(static state =>
{
var fixture = (AspireFixture<TAppHost>)state!;
fixture.LogProgress("Test run aborted — tearing down Aspire application...");
_ = Task.Run(async () =>
{
try { await fixture.StopAndDisposeAsync(); }
catch (Exception ex) { fixture.LogProgress($"Teardown error on abort: {ex.Message}"); }
});
}, this);Why Task.Run here: CancellationToken.Register callbacks should not be async (the returned Task is discarded by the registration infrastructure). Wrapping in Task.Run keeps the fire-and-forget behaviour while giving teardown errors somewhere to surface.
Still Open: Abort teardown registration window after BuildAsync
RegisterAbortTeardown() is called after BuildAsync completes. If the run is aborted during BuildAsync, the RunCancellationToken correctly interrupts the build but no abort callback exists yet — cleanup then depends on TUnit calling DisposeAsync on the failed initializer, which may or may not happen promptly.
StopAndDisposeCoreAsync already null-checks _app, so moving RegisterAbortTeardown() to before the CreateAsync call is safe and closes the gap entirely. This is a low-risk, one-line move.
Positive observations (unchanged from prior reviews)
- Memoized single-shot teardown via
_teardownTask ??= StopAndDisposeCoreAsync()inside alockis the right pattern — both the abort-callback andDisposeAsyncpaths converge correctly. _abortRegistration.Dispose()beforeStopAndDisposeAsync()inDisposeAsyncprevents a late-racing callback.CreateLinkedTokenSource+CancelAfteris cleaner thannew CancellationTokenSource(timeout)and correctly composes both signals.- Catch ordering (
RunCancellationToken.IsCancellationRequestedchecked before the timeout branch) means abort surfaces as cancellation, not a misleadingTimeoutException. - All four PublicAPI snapshot files updated — no gaps.
- Token defaults to
Noneoutside the engine so existing tests are unaffected.
The core approach is solid. The naming concern is the one I'd most strongly recommend addressing before merge given it's a public API that can't be changed cleanly after the fact.
9e1bbfc to
7596cef
Compare
feat(aspire): tear down Aspire on test-run abort via session token AspireFixture startup (CreateAsync/BuildAsync/StartAsync and the resource-wait) previously only observed its own ResourceTimeout, so a run abort (Ctrl+C / IDE stop / ProcessExit) could neither interrupt a stuck startup nor stop the application before the process was forcefully killed, leaking containers. Expose the engine-wide abort token on TestSessionContext (populated once in TUnitTestFramework.ExecuteRequestAsync, covering both reflection and source-generated modes) so session-scoped fixtures can observe run cancellation. AspireFixture links it into all startup/wait operations via a virtual RunCancellationToken and registers a single-shot teardown that stops and disposes the application as soon as the run is aborted; DisposeAsync awaits the same task so teardown runs exactly once. The token lives on TestSessionContext (not a per-test context) because a session- or class-scoped fixture outlives any single test; a per-test token would tear the shared application down when its first consuming test ends. RunCancellationToken is virtual for the rare per-test fixture. Abort is distinguished from a genuine resource timeout so it surfaces as cancellation rather than a misleading TimeoutException. @
7596cef to
d85e4ac
Compare
There was a problem hiding this comment.
Third Review — feat(aspire): tear down Aspire on test-run abort via session cancellation token
The current commit (7596cef) has been reviewed previously (twice). All three flagged issues remain unaddressed. Summarising concisely.
1. Naming ambiguity — still open
TestSessionContext.CancellationToken conflicts with the already-public TestContext.CancellationToken (per-test scope). Identically-named properties on parallel context objects with silently different lifetimes is a trap. The fixture's own RunCancellationToken is already the clearer name; SessionCancellationToken on TestSessionContext would also be unambiguous.
This is a public API — renaming it post-merge requires a deprecation cycle.
2. Fire-and-forget swallows teardown exceptions — still open
_ = fixture.StopAndDisposeAsync(); // abort callbackDisposeAsync awaits the memoised task, so teardown exceptions are observed — but only if DisposeAsync is called. This PR's core motivation is the forced-kill scenario where the abort-timeout fires before session cleanup completes. In exactly that scenario, DisposeAsync is never reached and teardown errors disappear silently. A try/catch inside a Task.Run within the callback would surface them via LogProgress.
3. Registration window after BuildAsync — still open
RegisterAbortTeardown() is called after BuildAsync returns. An abort during BuildAsync correctly interrupts the build via the linked RunCancellationToken, but has no callback yet to kick off StopAndDisposeAsync. StopAndDisposeCoreAsync already null-checks _app, so moving the registration to just before CreateAsync is a safe one-line change that closes this window.
Assessment
The overall design — memoised single-shot teardown, linked token + CancelAfter, abort-vs-timeout catch ordering, _abortRegistration.Dispose() before the shared path, CancellationToken.None default outside the engine — is solid. The implementation is correct in the common case.
Issue 1 is the only one I'd block on before merge, because it is an irreversible public API decision. Issues 2 and 3 are lower-risk improvements but would make the error-handling behaviour match the PR's stated goal (graceful teardown even under forced kill).
The engine cancellation token only listened to OS signals (Console.CancelKeyPress for SIGINT, ProcessExit for SIGTERM). When the host drives cancellation itself -- IDE stop, --abort, or a CI runner cancelling the job -- it cancels MTP's ExecuteRequestContext.CancellationToken, which the engine token never observed. As a result, session-scoped fixtures that watch SessionCancellationToken (e.g. AspireFixture) could keep booting after the run was cancelled instead of tearing down promptly. Link context.CancellationToken into the engine token in Initialise. When it fires, the engine token (and thus SessionCancellationToken) fires too. The platform path opts out of the forceful-exit timer via Cancel(armForcefulExit: false) -- the platform owns process shutdown when it is the canceller, and killing the process underneath it would truncate result reporting. The Ctrl+C path keeps the timer (it suppresses default termination and must self-exit).
There was a problem hiding this comment.
Fourth Review — feat(aspire): tear down Aspire on test-run abort via session cancellation token
All three issues raised in the previous three reviews have been addressed in the latest commits.
✅ Issue 1 Resolved: Naming ambiguity
TestSessionContext.SessionCancellationToken — the scope is now unambiguous at a glance. No conflict with TestContext.CancellationToken (per-test). This is the right name for a public API that can't be cleanly renamed post-release.
✅ Issue 2 Resolved: Fire-and-forget now logs teardown errors
_ = Task.Run(async () =>
{
try { await fixture.StopAndDisposeAsync(); }
catch (Exception ex) { fixture.LogProgress($"Teardown error on abort: {ex.Message}"); }
});Teardown errors in the forced-kill path now surface via LogProgress rather than disappearing silently.
✅ Issue 3 Resolved: Registration window before CreateAsync closed
RegisterAbortTeardown() is now called before DistributedApplicationTestingBuilder.CreateAsync, so an abort during early AppHost creation will trigger the callback. The null-safe StopAndDisposeCoreAsync implementation handles the case where _app hasn't been assigned yet.
Second commit: engine-link of MTP platform token
The new EngineCancellationToken.Initialise(CancellationToken platformCancellationToken) change is clean:
_platformRegistrationis correctly disposed inDispose(). A defaultCancellationTokenRegistration(whenCanBeCanceled == false) disposes as a no-op, so the unconditional dispose is safe.- The
armForcefulExit: falsepath for platform-driven cancellation is well-reasoned — the platform owns process lifecycle when it is the canceller, and the forceful-exit timer is only necessary when the engine suppressed the OS default termination (Ctrl+C path). - The guard
Interlocked.CompareExchangealready prevents double-initialisation; the platform registration is set up within that guard, so there is no race.
Overall
This is a solid, well-motivated change. The design — memoized single-shot teardown, linked-token + CancelAfter, abort-vs-timeout catch ordering, SessionCancellationToken defaulting to None outside the engine — is correct throughout. All previously flagged issues are resolved. Ready to merge.
@
Problem
AspireFixturestartup (CreateAsync/BuildAsync/StartAsyncand the resource-wait) only observed its ownResourceTimeout. It never saw the engines run-abort token, so a request to abort the test run (Ctrl+C, IDE stop,ProcessExit) could not:ResourceTimeout, e.g. 60s), andChange
Expose the engine token on
TestSessionContext. TUnit.Core had no ambient session-scoped cancellation token (the engine token lived only inTUnitServiceProvider). Addedpublic CancellationToken CancellationToken { get; internal set; }toTestSessionContext, populated once inTUnitTestFramework.ExecuteRequestAsync. Both reflection and source-generated modes funnel through that method, so a single wiring covers both. Session-scopedIAsyncInitializerfixtures (containers, distributed apps) can now observe run cancellation.AspireFixture honors it. New
protected virtual RunCancellationToken(defaults to the session token). Linked intoCreateAsync/BuildAsync/StartAsyncand the resource-wait viaCreateLinkedTokenSource(...)+CancelAfter(ResourceTimeout), so an abort interrupts startup promptly.Teardown on abort.
RegisterAbortTeardown()fires a single-shotStopAndDisposeAsync()when the token trips, giving the app a head start on graceful shutdown within the forceful-exit window.DisposeAsyncawaits the same memoized task, so teardown runs exactly once regardless of which path fires first.An abort is distinguished from a genuine resource timeout (the linked token trips both) so it surfaces as cancellation rather than a misleading
TimeoutException.Why the session token (not a per-test one)
AspireFixtureis used as[ClassDataSource<...>(Shared = SharedType.PerTestSession)]and outlives any single test. ItsInitializeAsyncruns lazily, triggered by the first consuming test. A per-test token would be cancelled (and disposed) when that first test ends, tearing the shared application down underneath every later test. The session tokens lifetime matches the fixtures.RunCancellationTokenisvirtualso a genuinely per-test fixture (Shared = None) can fold in its own token.Notes
TestSessionContext.CancellationTokengetter. All 4 Core PublicAPI snapshots updated; net8/9/10 pass.None(CanBeCanceled == false) so the abort registration is a no-op and behavior is unchanged.