Fix NativeAOT GC hole issue #129598

Open
janvorli wants to merge 1 commit into dotnet:main from janvorli:fix-nativeaot-eh-gc-hole

@janvorli

Member

There is a GC hole when:

  • an exception is rethrown from a funclet
  • the exception escapes that funclet
  • a finally is executed for this secondary exception
  • GC runs while the call chain of this finally is being executed
  • a reference held in a nonvolatile register is pushed in the prolog of one of the functions in the finally call chain
  • the nonvolatile register holds a live reference somewhere up in the call chain of the parent of the catch handler that catches the secondary exception
  • the nonvolatile register is not pushed anywhere between the parent of the catch and the frame where the nonvolatile register holds a live GC reference

In this case, if the GC relocates that reference, it is updated in the stack frame of the finally call chain, but not in the location referenced by the REGDISPLAY in the ExInfo of the secondary exception. So when we resume after the catch, the stale reference is loaded into the nonvolatile register, and it then propagates up the call chain until it reaches the frame where the register is supposed to hold a live GC reference.
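The sequence above can be illustrated with a toy model. This is a minimal sketch and not the runtime's data structures: the slot names `regdisplay_home` and `prolog_spill`, the addresses, and the function `run_scenario` are all hypothetical, and it assumes the nonvolatile register is loaded from its REGDISPLAY home before the finally funclet is called.

```python
# Toy model of the GC hole. Memory is a dict of slot name -> value, and a
# GC reference is just an integer address. All names are hypothetical.

OLD_ADDR, NEW_ADDR = 0x1000, 0x2000  # object address before / after relocation


def run_scenario():
    memory = {
        # Location referenced by the REGDISPLAY of the secondary exception.
        "regdisplay_home": OLD_ADDR,
    }
    # Assumption: the register is loaded from its home before the funclet runs.
    register = memory["regdisplay_home"]

    # A function in the finally call chain pushes the register in its prolog.
    memory["prolog_spill"] = register

    # GC relocates the object. It updates the spill slot in the finally
    # call chain, but not the location referenced by the REGDISPLAY.
    memory["prolog_spill"] = NEW_ADDR

    # The epilog pops the updated value back into the register.
    register = memory.pop("prolog_spill")

    # Resuming after the catch reloads the register from the REGDISPLAY home.
    resumed_register = memory["regdisplay_home"]
    return register, resumed_register
```

In this model the register holds the relocated address when the funclet returns, while the value restored on resume is still the pre-relocation address.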

The fix is to write the nonvolatile registers back to the locations referenced by the REGDISPLAY passed to RhpCallFinallyFunclet after the finally funclet returns.
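A rough sketch of that write-back, again as a toy model rather than the actual assembly stub: registers and memory are plain dicts, `regdisplay` maps a register name to the memory slot it refers to, and `call_finally_with_write_back` and `relocating_finally` are hypothetical names. The register list is the System V AMD64 set of callee-saved integer registers; the Windows x64 convention additionally preserves rdi, rsi, and XMM6 to XMM15.

```python
# Callee-saved integer registers under the System V AMD64 ABI (excluding rsp).
PRESERVED = ("rbx", "rbp", "r12", "r13", "r14", "r15")


def call_finally_with_write_back(registers, regdisplay, memory, funclet):
    """Run the funclet, then store each preserved register to its home."""
    funclet(registers, memory)
    for name in PRESERVED:
        home = regdisplay.get(name)  # slot the REGDISPLAY refers to, if any
        if home is not None:
            memory[home] = registers[name]


def relocating_finally(registers, memory):
    # Stands in for a finally call chain during which GC relocated the
    # object and an epilog restored the updated value into rbx.
    registers["rbx"] = 0x2000


memory = {"home_rbx": 0x1000}
registers = {name: 0 for name in PRESERVED}
registers["rbx"] = memory["home_rbx"]
regdisplay = {"rbx": "home_rbx"}

call_finally_with_write_back(registers, regdisplay, memory, relocating_finally)
```

After the call, `memory["home_rbx"]` holds the relocated address, so a later resume that reloads rbx from that slot no longer sees the stale value.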

Close #129010

@janvorli janvorli requested review from jakobbotsch and jkotas June 18, 2026 23:04
@janvorli janvorli self-assigned this Jun 18, 2026
Copilot AI review requested due to automatic review settings June 18, 2026 23:04
@janvorli

Member Author

@jakobbotsch thank you so much for reproducing it with time travel debugging; it would have been hard to reason about without that!

@dotnet-policy-service

Contributor

Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the NativeAOT AMD64 exception-handling stubs so that after executing a finally funclet, the current values of preserved (non-volatile) registers are written back to the locations described by the passed-in REGDISPLAY (and, on Windows x64, the preserved XMM register values are written back into REGDISPLAY). This keeps the REGDISPLAY state consistent if a GC occurs during the finally call chain and relocates references that were temporarily spilled.

Changes:

  • Add preserved-register write-back in the System V AMD64 RhpCallFinallyFunclet2 path.
  • Add preserved-register + XMM6–XMM15 write-back in the Windows AMD64 RhpCallFinallyFunclet2 path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/coreclr/nativeaot/Runtime/amd64/ExceptionHandling.S | Writes back AMD64 SysV preserved integer registers to the homes referenced by REGDISPLAY after the finally funclet returns. |
| src/coreclr/nativeaot/Runtime/amd64/ExceptionHandling.asm | Writes back Windows x64 preserved integer registers (and XMM6–XMM15) to REGDISPLAY state after the finally funclet returns. |

@jkotas

jkotas commented Jun 19, 2026

Member

Can we add a regression test for this?

Could you please change "Funclets are not required to preserve non-volatile registers." in https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/clr-abi.md to "Funclets are not required to preserve non-volatile registers that are saved by main method body."

Thank you both!

Development

Successfully merging this pull request may close these issues.

[ci-scan] Test failure: NumberHandlingTests_Metadata.Number_AsCollectionElement_RoundTrip on NativeAOT windows-x64

3 participants