
[ARM32] Eliminate red zone usage in runtime stubs #129398

Open
cshung wants to merge 14 commits into dotnet:main from cshung:feature-avoid-red-zone

Conversation

@cshung

@cshung cshung commented Jun 14, 2026

Contributor

On ARM32 Linux, the area below SP is not guaranteed to be preserved across signal delivery. The runtime previously used the red zone (writing below SP without adjusting it) in several stubs, which can cause silent corruption or crashes when a signal is delivered at the wrong moment.

This PR eliminates all red zone usage in ARM32 runtime stubs by replacing sub-SP reads/writes with explicit stack adjustments (push/pop):

  • NativeAOT interop thunks (ThunksMapping.cpp) — use ldr pc dispatch directly from r12, no stack intermediate. This also shrinks THUNK_SIZE from 20 to 12 bytes.
  • NativeAOT UniversalTransition — caller pushes args onto stack before branching; prolog reads them from known stack offsets after saving argument registers.
  • NativeAOT interface dispatch (DispatchResolve.S, StubDispatch.S) — PROLOG_PUSH/EPILOG_POP instead of red zone stores.
  • CoreCLR VTableCallStub — pre-indexed str / post-indexed ldr (actual push/pop).
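The hazard described above can be illustrated with a toy model. This is only a sketch: `ToyStack` and `deliverSignal` are hypothetical names, and the "signal" here simply overwrites every word below `sp`, which is a simplification of what a real handler frame does.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model of a full-descending stack; `sp` indexes the lowest live word.
struct ToyStack {
    std::uint32_t words[16] = {};
    std::size_t sp = 8;
};

// Hypothetical stand-in for signal delivery: the handler frame is built
// immediately below sp and overwrites whatever was stored there.
constexpr void deliverSignal(ToyStack& s) {
    for (std::size_t i = 0; i < s.sp; ++i) s.words[i] = 0xDEADBEEFu;
}

// Red-zone style: store below sp without moving sp (like str r1, [sp, #-4]).
constexpr bool redZoneSurvives(std::uint32_t value) {
    ToyStack s{};
    s.words[s.sp - 1] = value;
    deliverSignal(s);
    return s.words[s.sp - 1] == value;
}

// Push/pop style: move sp first, so the slot is at or above sp when the
// signal lands (like str r1, [sp, #-4]! followed by ldr r1, [sp], #4).
constexpr bool pushSurvives(std::uint32_t value) {
    ToyStack s{};
    s.sp -= 1;
    s.words[s.sp] = value;
    deliverSignal(s);
    const std::uint32_t restored = s.words[s.sp];
    s.sp += 1;
    return restored == value;
}

static_assert(!redZoneSurvives(42), "a store below sp is clobbered in this model");
static_assert(pushSurvives(42), "a pushed value is preserved in this model");
```

In this model the only difference between the two variants is whether `sp` moves before the store, which is the property the stubs in this PR rely on.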

On ARM32 Linux, the area below SP is not guaranteed to be preserved
across signal delivery. Replace red zone reads/writes with explicit
stack adjustments (push/pop) in:

- NativeAOT interop thunks (ldr pc dispatch, no stack intermediate)
- NativeAOT UniversalTransition (caller pushes args onto stack)
- NativeAOT interface dispatch stubs (PROLOG_STACK_ALLOC instead of
  sub-SP stores)
- CoreCLR VTableCallStub (pre-indexed str/post-indexed ldr)

Guarded by FEATURE_AVOID_RED_ZONE, enabled for ARM32 non-Windows
targets.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@cshung cshung requested a review from MichalStrehovsky as a code owner June 14, 2026 22:53
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Jun 14, 2026
@dotnet-policy-service

Contributor

Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib
See info in area-owners.md if you want to be subscribed.

Comment thread src/coreclr/vm/CMakeLists.txt Outdated
@MichalPetryka

Contributor

Windows ARM32 has a well-defined red zone guarantee

Windows ARM32 is no longer supported.

Windows ARM32 is no longer supported, so every ARM32 target is Linux.
The red zone avoidance is always needed — remove the preprocessor guard
and delete the old red zone code paths entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/coreclr/nativeaot/Runtime/ThunksMapping.cpp Outdated
Comment thread src/coreclr/nativeaot/Runtime/ThunksMapping.cpp Outdated
Comment thread src/coreclr/vm/arm/virtualcallstubcpu.hpp Outdated
Comment thread src/coreclr/vm/arm/virtualcallstubcpu.hpp Outdated
@cshung cshung force-pushed the feature-avoid-red-zone branch 2 times, most recently from 7cc9b73 to 59bc77c Compare June 15, 2026 17:31
Comment thread src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S Outdated
cshung and others added 2 commits June 15, 2026 18:16
The ldr pc dispatch needs only 12 bytes (mov r12 + ldr pc), no padding
required. This increases thunks per page from 204 to 341 (67% more).
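The thunks-per-page figures follow from integer division of the page size by the thunk size. A small check, assuming a 4096-byte page (the page size is not stated in the commit message; `thunksPerPage` and `percentGain` are illustrative names):

```cpp
#include <cstddef>

// Assumption: thunk stubs are packed into a single 4 KiB page.
constexpr std::size_t kPageSize = 4096;

// Number of whole thunks that fit in one page.
constexpr std::size_t thunksPerPage(std::size_t thunkSize) {
    return kPageSize / thunkSize;
}

// Relative gain in percent, rounded to the nearest integer.
constexpr std::size_t percentGain(std::size_t before, std::size_t after) {
    return ((after - before) * 100 + before / 2) / before;
}

static_assert(thunksPerPage(20) == 204, "20-byte thunks");
static_assert(thunksPerPage(12) == 341, "12-byte thunks");
static_assert(percentGain(204, 341) == 67, "about 67% more thunks per page");
```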

Also shorten verbose comments per review feedback.

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- StubDispatch: use PROLOG_PUSH/EPILOG_POP {r1,r2} instead of manual
  STACK_ALLOC + str/ldr
- UniversalTransition: replace interleaved ldr/push dance with a single
  PROLOG_PUSH {r0-r3} then load caller args from known stack offsets
- Clean up stale red zone comments

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@cshung cshung force-pushed the feature-avoid-red-zone branch from 59bc77c to 87288df Compare June 15, 2026 18:27
Comment thread src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S
Comment thread src/coreclr/runtime/arm/StubDispatch.S
Comment thread src/coreclr/runtime/arm/StubDispatch.S
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
@jkotas

jkotas commented Jun 15, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas jkotas left a comment

Member

Thanks!

@jkotas

jkotas commented Jun 16, 2026

Member

Segfaults in many linux arm32 NAOT tests

pushd .
chmod +rwx Microsoft.Extensions.Configuration.FileExtensions.Tests ^&^& ./Microsoft.Extensions.Configuration.FileExtensions.Tests -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing -xml testResults.xml 
popd
===========================================================================================================
/root/helix/work/workitem/e /root/helix/work/workitem/e
DOTNET_DbgEnableMiniDump is set and the createdump binary does not exist: ./createdump
./RunTests.sh: line 173:    18 Segmentation fault      (core dumped) ./Microsoft.Extensions.Configuration.FileExtensions.Tests -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing -xml testResults.xml $RSP_FILE
/root/helix/work/workitem/e
----- end Mon Jun 15 09:53:04 PM UTC 2026 ----- exit code 139 ----------------------------------------------------------

Could you please take a look?

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
@jkotas

jkotas commented Jun 17, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas

jkotas commented Jun 17, 2026

Member

@MichalStrehovsky PTAL

Copilot AI left a comment

Contributor

Pull request overview

This PR updates several ARM32 stubs and NativeAOT transitions to avoid writing below sp (red zone) by switching to explicit stack adjustments (push/pop / stack alloc), and updates related thunk/transition conventions accordingly.

Changes:

  • CoreCLR ARM32 interface/vtable-related stubs: replace red-zone saves/restores with stack-based sequences.
  • NativeAOT ARM32 thunk and interop paths: shrink thunk stubs by branching via ldr pc while preserving r12 as the thunk data pointer, and adjust RhCommonStub accordingly.
  • NativeAOT ARM32 universal transition: change extra-argument passing to caller-pushed stack args and update the corresponding stack frame layout and unwind helper logic.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/vm/arm/virtualcallstubcpu.hpp | Updates VTableCall stub encoding/size logic to use push/pop-style stack ops instead of red zone. |
| src/coreclr/runtime/arm/StubDispatch.S | Replaces red-zone register spills in cached interface dispatch stubs and adjusts slow-path arg passing to universal transition. |
| src/coreclr/nativeaot/Runtime/ThunksMapping.cpp | Changes ARM thunk stub shape and size to branch via ldr pc and keep r12 as data pointer. |
| src/coreclr/nativeaot/Runtime/StackFrameIterator.cpp | Updates ARM universal transition stack frame layout to account for caller-pushed extra args. |
| src/coreclr/nativeaot/Runtime/EHHelpers.cpp | Adjusts ARM unwind helper to compensate for new interface dispatch stack usage on null-this AV. |
| src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S | Switches universal transition extra args from red zone to caller-pushed stack args and updates prolog/epilog accordingly. |
| src/coreclr/nativeaot/Runtime/arm/InteropThunksHelpers.S | Updates RhCommonStub to consume r12 directly (no red-zone load). |
| src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S | Replaces red-zone spills with stack pushes and updates slow-path argument setup for universal transition. |

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/ThunksMapping.cpp
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@jkotas

jkotas commented Jun 17, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S
@jkotas

jkotas commented Jun 17, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas

jkotas commented Jun 18, 2026

Member

All Arm32 test failures are known

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S
Comment thread src/coreclr/nativeaot/Runtime/EHHelpers.cpp
uint64_t m_fpArgRegs[8]; // ChildSP+008 CallerSP-078 (0x40 bytes) (d0-d7)
uint64_t m_returnBlock[4]; // ChildSP+048 CallerSP-038 (0x20 bytes)
uintptr_t m_intArgRegs[4]; // ChildSP+068 CallerSP-018 (0x10 bytes) (r0-r3)
uintptr_t m_callerPushedArgs[2]; // ChildSP+078 CallerSP-008 (0x8 bytes) (extra arg + target fn)
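
The offsets in the comments above can be cross-checked with a host-side sketch. Two assumptions: 32-bit fields are spelled `uint32_t` so the layout matches ARM32 on any host, and `m_headWords` is a hypothetical placeholder for the 8 bytes at ChildSP+000 that the quoted lines do not show.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch of the frame layout quoted above, not the runtime's actual type.
struct UniversalTransitionFrameSketch {
    std::uint32_t m_headWords[2];        // hypothetical placeholder for ChildSP+000..007
    std::uint64_t m_fpArgRegs[8];        // d0-d7
    std::uint64_t m_returnBlock[4];
    std::uint32_t m_intArgRegs[4];       // r0-r3
    std::uint32_t m_callerPushedArgs[2]; // extra arg + target fn
};

// Distance below CallerSP for a field at the given ChildSP-relative offset.
constexpr std::size_t belowCallerSp(std::size_t childSpOffset) {
    return sizeof(UniversalTransitionFrameSketch) - childSpOffset;
}

static_assert(offsetof(UniversalTransitionFrameSketch, m_fpArgRegs) == 0x08, "ChildSP+008");
static_assert(offsetof(UniversalTransitionFrameSketch, m_returnBlock) == 0x48, "ChildSP+048");
static_assert(offsetof(UniversalTransitionFrameSketch, m_intArgRegs) == 0x68, "ChildSP+068");
static_assert(offsetof(UniversalTransitionFrameSketch, m_callerPushedArgs) == 0x78, "ChildSP+078");
static_assert(belowCallerSp(0x08) == 0x78 && belowCallerSp(0x48) == 0x38, "CallerSP-078, CallerSP-038");
static_assert(belowCallerSp(0x68) == 0x18 && belowCallerSp(0x78) == 0x08, "CallerSP-018, CallerSP-008");
```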

Member

Does src/coreclr/nativeaot/Common/src/Internal/Runtime/TransitionBlock.cs need a matching ARM32 layout update?

@jkotas jkotas Jun 18, 2026

Member

Hmmm, that would be a messy change with a lot of eventual fallout. It creates holes in structs.

@cshung Could you please change the prolog of universal transition to create the same layout as before, and revert this change?

Something like:

        // Caller pushed 8 bytes: [sp]=extra arg, [sp+4]=target fn
        .pad #8
        PROLOG_PUSH  "{r0-r1}"
        ldr          r12, [sp, #12]         // Capture target function (caller's [sp+4], now at sp+8+4)
        ldr          r1, [sp, #8]           // Capture extra arg (caller's [sp], now at sp+8)
        str          r3, [sp, #12]          // Now we can store remaining arg registers into the space used for the hidden args
        str          r2, [sp, #8]
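
The intent of this prolog, reusing the two caller-pushed slots so that r0-r3 end up contiguous, can be modelled on a host. This is a sketch under assumptions: `shuffle` and `ShuffleResult` are hypothetical names, the stack is an array of 32-bit words, and slot positions are derived from the number of words the prolog pushes instead of being hard-coded.

```cpp
#include <cstddef>
#include <cstdint>

struct ShuffleResult {
    std::uint32_t intArgSlots[4]; // stack words at [sp], [sp+4], [sp+8], [sp+12]
    std::uint32_t target;         // value captured in r12
    std::uint32_t extraArg;       // value captured in r1
};

constexpr ShuffleResult shuffle(std::uint32_t r0, std::uint32_t r1,
                                std::uint32_t r2, std::uint32_t r3,
                                std::uint32_t extraArg, std::uint32_t targetFn) {
    std::uint32_t stack[8] = {};
    std::size_t sp = 8;                // word index; the stack grows downward

    // Caller: leaves [sp] = extra arg and [sp+4] = target fn.
    stack[--sp] = targetFn;
    stack[--sp] = extraArg;

    // Prolog: push {r0-r1}; r0 lands at the lowest address.
    stack[--sp] = r1;
    stack[--sp] = r0;
    const std::size_t pushedWords = 2;

    // Capture the hidden args now that r1 has been saved.
    const std::uint32_t r12 = stack[sp + pushedWords + 1];
    r1 = stack[sp + pushedWords];

    // Reuse the hidden-arg slots for the remaining argument registers.
    stack[sp + pushedWords + 1] = r3;
    stack[sp + pushedWords] = r2;

    return ShuffleResult{{stack[sp], stack[sp + 1], stack[sp + 2], stack[sp + 3]}, r12, r1};
}

constexpr ShuffleResult kDemo = shuffle(10, 11, 12, 13, 0xE, 0xF);
static_assert(kDemo.intArgSlots[0] == 10 && kDemo.intArgSlots[1] == 11, "r0, r1 saved");
static_assert(kDemo.intArgSlots[2] == 12 && kDemo.intArgSlots[3] == 13, "r2, r3 fill the reused slots");
static_assert(kDemo.target == 0xF && kDemo.extraArg == 0xE, "hidden args captured");
```

The resulting four words match a contiguous `m_intArgRegs[4]`, which is why no extra `m_callerPushedArgs` field would be needed with this approach.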

Co-authored-by: Michal Strehovský <MichalStrehovsky@users.noreply.github.com>

Labels

arch-arm32 area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member


5 participants