Improve error handling and lifecycle management in application services#116
Open
Improve error handling and lifecycle management in application services#116
Conversation
Root cause: When the host restarts after the startup/shutdown test, Playwright can fail to navigate (e.g. due to an orphaned Chromium from the previous run). This triggers BackgroundService to call StopApplication(), cancelling stoppingToken. However, ApplicationScopeContainer.InvokeInScopeAsync was awaiting the scope TaskCompletionSource without the cancellation token, so ApplicationLifecycle hung indefinitely waiting for a Blazor circuit that would never start. After 10 seconds the shutdown timeout killed the process with no meaningful error logged. Fixes: - ApplicationScopeContainer.InvokeInScopeAsync: use WaitAsync(combinedToken) instead of bare await so the scope wait is cancelled when shutdown is requested, preventing the 10-second hang - ApplicationLifecycle: log unhandled exceptions (previously swallowed silently) and add scope wait/release log messages for startup debugging - PlaywrightBrowserLifetimeService: wrap Tracing.StopAsync in try/catch so CleanupAsync always runs even if saving the trace file throws, preventing Chromium from being orphaned between runs - AdaptiveRemoteHost.Builder.Start: detect host process exit during the TCP retry loop and fail fast instead of waiting the full 120-second timeout - SimulatedEnvironment.SetLogLocation: stop accumulating --log:FilePath args in the shared builder settings on every scenario; instead apply the log path only at start time via a per-start override - SimulatedEnvironment.EnsureHostStarted: dispose the old (exited) host before starting a new one to avoid leaking the process handle https://claude.ai/code/session_0166sDkzTih5jmGpyeuYBBND
Test Results312 tests ±0 302 ✅ - 4 1m 3s ⏱️ -19s For more details on these failures, see this check. Results for commit b6643e2. ± Comparison against base commit 2d54cae. This pull request removes 3 and adds 2 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
…essages The WaitingForScope and ScopeReleased log messages added to ApplicationLifecycle.ExecuteAsync need to be included in the strict VerifyMessages assertions in the unit tests. - WaitingForScope appears first in every test (logged at ExecuteAsync entry) - ScopeReleased appears after InvokeInScopeAsync returns on the normal/ handled-error path (service init failures are caught internally by ScopedLifecycleContainer and the scope call returns normally) - ScopeReleased does NOT appear on the cancellation path (StopAsync_CancelsInitializeMethodsThatAreWaiting), where OperationCanceledException is caught instead
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR enhances error handling, logging, and lifecycle management across the AdaptiveRemote application. It adds better exception handling during browser cleanup, improves test host initialization with settings overrides, and refines application lifecycle logging and cancellation token handling.
Key Changes
Browser Tracing Cleanup: Wrapped the browser tracing stop operation in a try-catch block to gracefully handle exceptions during cleanup and log them appropriately via
Playwright_ErrorWhileCleaningUp.Test Host Builder Improvements:
overrideSettingsparameter to theStart()method to allow runtime configuration overridesStartWithSettings()for cleaner separation of concernsSimulated Environment Fixes:
SetLogLocation()to store the log location for use during the next host start rather than immediately reconfiguring the builderEnsureHostStarted()to properly dispose previous host instances and pass log location settings through theStart()method overrideApplication Lifecycle Logging:
ApplicationLifecycle_UnhandledError,ApplicationLifecycle_WaitingForScope, andApplicationLifecycle_ScopeReleasedExecuteAsyncto log unhandled errors before cleanupCancellation Token Handling: Added
WaitAsync()call when awaiting the application scope task to properly respect cancellation tokens and prevent indefinite waits during shutdownNotable Implementation Details
The changes maintain backward compatibility while improving robustness. The test host builder changes allow for more flexible configuration during startup, and the lifecycle improvements provide better observability and graceful shutdown behavior.
https://claude.ai/code/session_0166sDkzTih5jmGpyeuYBBND