Fix NativeAOT GC hole issue#129598
Conversation
There is a GC hole when: * an exception is rethrown from a funclet * the exception escapes that funclet * a finally is executed for this secondary exception * GC runs while the call chain of this finally is being executed * A reference in non-volatile register is pushed in a prolog of one of the functions in the finally call chain * the nonvolatile register holds a live reference up somewhere up in the call chain of the parent of the catch handler that catches the secondary exception * the nonvolatile register is not pushed anywhere between the parent of the catch and the frame where the nonvolatile register holds a live GC reference In this case, if GC relocates that reference, it is updated in the stack frame of the finally call chain, but not in the location referenced by the REGDISPLAY in the ExInfo of the secondary exception. So when we resume after catch, the stale reference is placed in the nonvolatile register and then it bubbles up the call chain until it reaches the frame where the register is supposed to hold live GC reference. The fix is to save the nonvolatile registers after returning from a finally funclet back to the location referenced by the REGDISPLAY passed to the RhpCallFinallyFunclet. Close dotnet#129010
|
@jakobbotsch thank you so much for reproducing it with time travel debugging, it would be hard to reason about it without that! |
|
Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib |
There was a problem hiding this comment.
Pull request overview
This PR updates the NativeAOT AMD64 exception-handling stubs so that after executing a finally funclet, the current values of preserved (non-volatile) registers are written back to the locations described by the passed-in REGDISPLAY (and, on Windows x64, the preserved XMM register values are written back into REGDISPLAY). This keeps the REGDISPLAY state consistent if a GC occurs during the finally call chain and relocates references that were temporarily spilled.
Changes:
- Add preserved-register write-back in the System V AMD64
RhpCallFinallyFunclet2path. - Add preserved-register + XMM6–XMM15 write-back in the Windows AMD64
RhpCallFinallyFunclet2path.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/coreclr/nativeaot/Runtime/amd64/ExceptionHandling.S | Writes back AMD64 SysV preserved integer registers to the homes referenced by REGDISPLAY after the finally funclet returns. |
| src/coreclr/nativeaot/Runtime/amd64/ExceptionHandling.asm | Writes back Windows x64 preserved integer registers (and XMM6–XMM15) to REGDISPLAY state after the finally funclet returns. |
|
Can we add a regression test for this? Could you please change "Funclets are not required to preserve non-volatile registers." in https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/clr-abi.md to "Funclets are not required to preserve non-volatile registers that are saved by main method body." Thank you both! |
There is a GC hole when:
In this case, if GC relocates that reference, it is updated in the stack frame of the finally call chain, but not in the location referenced by the
REGDISPLAYin theExInfoof the secondary exception. So when we resume after catch, the stale reference is placed in the nonvolatile register and then it bubbles up the call chain until it reaches the frame where the register is supposed to hold live GC reference.The fix is to save the nonvolatile registers after returning from a finally funclet back to the location referenced by the
REGDISPLAYpassed to theRhpCallFinallyFunclet.Close #129010