Avoid race that can cause Kestrel's RequestAborted to not fire#62385
Merged
Avoid race that can cause Kestrel's RequestAborted to not fire#62385
Conversation
Member
Author
|
I think we should also backport this to .NET 8 and 9. |
Member
|
Add a test that a HTTP/3 stream isn't reused when aborted? |
- With HTTP/1.1 there's a narrow race where if we cancel the _abortedCts on a background thread due to the underlying connection closing between requests, it can be missed because CTS.TryReset is not thread safe with concurrent cancellation https://github.com/dotnet/runtime/blob/2dbdff1a7aebd64b4b265d99b173cd77f088c36e/src/libraries/System.Private.CoreLib/src/System/Threading/CancellationTokenSource.cs#L477-L479
5630fc3 to
21db906
Compare
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a race condition in Kestrel where the RequestAborted token could fail to fire despite the underlying connection closing. The issue stems from the non-thread-safe nature of CancellationTokenSource.TryReset() when concurrent Cancel() calls are made by background threads.
Key changes:
- Made
_connectionAborteda terminal state that is never reset to prevent race conditions - Added
CanReuseproperty to HTTP/3 streams to determine reusability based on connection state - Updated stream reuse logic to check
CanReusebefore reusing HTTP/3 streams - Added comprehensive tests to verify state is not reused for aborted requests
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/Servers/Kestrel/Core/src/Internal/Http/HttpProtocol.cs |
Modified Reset() method to make _connectionAborted terminal and only reset CancellationTokenSource when connection is not aborted |
src/Servers/Kestrel/Core/src/Internal/Http2/Http2Stream.cs |
Removed reset of _connectionAborted flag in OnReset() to maintain terminal state |
src/Servers/Kestrel/Core/src/Internal/Http3/Http3Stream.cs |
Added CanReuse property and removed _connectionAborted reset in OnReset() |
src/Servers/Kestrel/Core/src/Internal/Http3/Http3Connection.cs |
Updated stream reuse logic to check CanReuse property before reusing streams |
src/Servers/Kestrel/test/Interop.FunctionalTests/Http3/Http3RequestTests.cs |
Added tests for client and server-aborted requests to verify state is not reused |
src/Servers/Kestrel/test/Interop.FunctionalTests/Http3/Http3RequestTests.cs
Show resolved
Hide resolved
BrennanConroy
approved these changes
Aug 15, 2025
wtgodbe
added a commit
that referenced
this pull request
Aug 18, 2025
* Avoid race that can cause Kestrel's RequestAborted to not fire (#62385) * Send Keep-Alive Ping Immediately When Previous Ping Is Overdue (#63195) * Make new validations consistent with System.ComponentModel.DataAnnotations behavior (#63231) * Add support for type-level validation attributes, update validation ordering * Code review fix, test fix * Fix trimming annotation * Fix trimming annotation * Separate caches for property and type attributes * Fix typo Co-authored-by: Brennan <brecon@microsoft.com> * Fix typo Co-authored-by: Brennan <brecon@microsoft.com> * Fix and simplify the emitted code * Update src/Validation/test/Microsoft.Extensions.Validation.GeneratorTests/ValidationsGeneratorTestBase.cs --------- Co-authored-by: Brennan <brecon@microsoft.com> Co-authored-by: Safia Abdalla <safia@microsoft.com> * Search for slnx files when setting solution-relative content root (#61305) * Address API review feedback for what was IApiEndpointMetadata (#63283) --------- Co-authored-by: Stephen Halter <halter73@gmail.com> Co-authored-by: Reagan Yuan <reaganyuan27@gmail.com> Co-authored-by: Ondřej Roztočil <roztocil@outlook.com> Co-authored-by: Brennan <brecon@microsoft.com> Co-authored-by: Safia Abdalla <safia@microsoft.com> Co-authored-by: Jacob Bundgaard <jbu@templafy.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Before this change, Kestrel had a narrow race where it could fail to trigger the RequestAborted token despite the underlying connection closing. Because CancellationTokenSource.TryReset() is not thread-safe, concurrent calls to Cancel() made by CancelRequestAbortedTokenCallback() on a background thread can end up being ignored.
https://github.com/dotnet/runtime/blob/2dbdff1a7aebd64b4b265d99b173cd77f088c36e/src/libraries/System.Private.CoreLib/src/System/Threading/CancellationTokenSource.cs#L477-L479
I observed exactly this when debugging a hanging test from Send MCP-Protocol-Version header in Streamable HTTP client transport · modelcontextprotocol/csharp-sdk@735be0f by looking at the memory dump from testresults-windows-latest-Release. There you can see that there's a SSE request hanging because RequestAborted never firing despite ConnectionContext.ConnectionClosed firing and Http1Connection._connectionAborted being set.
I think this is definitely the right change for HTTP/1.1, and I copied the CanReuse logic from HTTP/2 for HTTP/3 so it hopefully shouldn't cause issues there, but please double check this @JamesNK. If the new logic does cause issues, it likely means the previous logic was even more broken for HTTP/3, because that would mean _connectionAborted could be Reset() to false while CancelRequestAbortedTokenCallback() from a call to HttpContext.Abort() from a previous request was still scheduled on the thread pool.
To fix this, I made _connectionAborted a terminal state and never reset _abortedCts to null. I think the only alternatives would be to somehow synchronize with the CancelRequestAbortedTokenCallback() and delay resetting until that completes or to not pool altogether.