Remove Wrapper delegate kind, switch ARM32 VSD hidden arg to R12 by Copilot · Pull Request #129403 · dotnet/runtime

Copilot · 2026-06-15T03:44:24Z

main PR

Description

CoreCLR’s Wrapper delegate kind was retained only to satisfy an ARM32 VSD ABI quirk (r4 hidden cell). This change removes that special delegate path by moving ARM32 VSD hidden-parameter passing to r12 and deleting wrapper-delegate infrastructure across VM/JIT/AOT/contracts and System.Private.CoreLib.

ARM32 VSD calling-convention alignment
- Updates ARM32 stub-dispatch plumbing to carry IndirectionCellAndFlags in r12.
- Aligns JIT virtual-stub parameter selection on ARM32 to REG_R12.
- Applies follow-up ARM32 VSD worker handoff fixes:
  - LookupHolder::Initialize and the ResolveStub slow entry push the incoming hidden r12 value to the stack and load the dispatch token into r12 before branching to the worker.
  - ResolveWorkerAsmStub now uses a fake-prolog/transition-block shape (matching the existing ARM delay-load pattern) so argument registers are preserved correctly while passing the dispatch token and hidden cell/flags to VSD_ResolveWorker.
  - Updates ARM stub length constants to match the instruction sequence changes.
- Post-feedback correctness fixes:
  - Corrects ResolveStub::slowEntryPointLen to match the emitted slow-entry instruction count.
  - Removes unnecessary slow-entry alignment NOP padding and keeps slowEntryPointLen at 6, which is already even and satisfies the subsequent alignment invariant ((n & 1) == 0).
Wrapper delegate infrastructure removal
- Removes wrapper-delegate creation/detection/runtime paths from COMDelegate.
- Removes remaining wrapper handling from MulticastDelegate.CoreCLR.cs.
- Removes wrapper-specific IL stub kind/plumbing and associated metadata fields.
- Removes wrapper-specific JIT call flags and late-arg handling.
Cross-layer contract cleanup
- Deletes obsolete wrapperDelegateInvoke and offsetOfWrapperDelegateIndirectCell from EE/JIT interfaces and SuperPMI agnostic records.
- Keeps VM/JIT/AOT call-info structures consistent after field removal.
- Revs the JIT/EE interface GUID in jiteeversionguid.h to match the interface change.
IL stub enum cleanup
- Deletes the unused wrapper IL stub value and renumbers subsequent IL stub values in VM (method.hpp, dllimport.h) and corresponding cDAC contract mapping.
ARM32 ResolveStub correctness fix
- Fixes ResolveStub::_failEntryPoint to avoid clobbering r12 (which now carries IndirectionCellAndFlags).
- Uses r4 as decrement scratch for _pCounter update and preserves callee-saved r4 via stack save/restore in the fail path.
JIT cleanup after feedback
- Removes an unused VirtualStubParamInfo constructor argument in src/coreclr/jit/compiler.h.
- Updates the call site in src/coreclr/jit/compiler.cpp accordingly.
- Moves GTF_CALL_M_STACK_ARRAY to the removed wrapper-flag slot (0x00004000) so call flag values remain contiguous.
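As a rough illustration of what "contiguous" means for the call flags, the sketch below checks that a set of single-bit flags leaves no unused slot between them. Only the value 0x00004000 for the stack-array flag comes from the description above; the neighbouring flag names and values are hypothetical and do not reflect the actual gentree.h layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag set for illustration. Only 0x00004000 is taken from the PR text.
enum CallFlags : uint32_t
{
    CALL_M_EXAMPLE_A   = 0x00001000, // hypothetical neighbour
    CALL_M_EXAMPLE_B   = 0x00002000, // hypothetical neighbour
    CALL_M_STACK_ARRAY = 0x00004000, // reuses the slot freed by the removed wrapper flag
    CALL_M_EXAMPLE_C   = 0x00008000, // hypothetical neighbour
};

// True when exactly one bit is set.
constexpr bool IsSingleBit(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

// True when the set bits of `mask` form one unbroken run (no unused slot in between).
constexpr bool IsContiguousMask(uint32_t mask)
{
    if (mask == 0)
        return false;
    uint32_t lowest = mask & (~mask + 1); // isolate the lowest set bit
    uint32_t sum    = mask + lowest;      // an unbroken run carries into a single bit (or wraps to 0)
    return (sum & (sum - 1)) == 0;
}
```

With the four flags above, `IsContiguousMask` holds; leaving the 0x00004000 slot empty would make it fail, which is the gap the flag move avoids.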
Delegate VM cleanup after feedback
- Removes the temporary refRealDelegate local and associated GCPROTECT_BEGIN/END in COMDelegate::BindToMethod.
- Replaces uses with (*pRefThis) directly.
- Deletes the unused _ReturnAddress intrinsic declaration from comdelegate.cpp.
R2R compatibility boundary
- Bumps ReadyToRun major/minimum version to 24 for the ARM32 VSD register contract change.
- Syncs version constants across native and managed header definitions.
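The effect of bumping both the major and the minimum version can be sketched as a simple range check: precompiled code built against the old r4 contract falls outside the accepted range and is not used. This is a minimal sketch under assumed names (`ImageHeader`, `CanUsePrecompiledCode`, and the two constants are hypothetical); only the value 24 comes from the description above, and the real check in the runtime may be shaped differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant names; the value 24 is the one stated in the PR description.
constexpr uint16_t kCurrentMajorVersion = 24;
constexpr uint16_t kMinimumMajorVersion = 24;

// Hypothetical, simplified view of the version fields in a precompiled image header.
struct ImageHeader
{
    uint16_t majorVersion;
    uint16_t minorVersion;
};

// Accept precompiled code only when its major version lies in [minimum, current].
constexpr bool CanUsePrecompiledCode(const ImageHeader& header)
{
    return header.majorVersion >= kMinimumMajorVersion &&
           header.majorVersion <= kCurrentMajorVersion;
}
```

Under these assumptions an image stamped with major version 23 is rejected, so its code would be compiled again against the r12 contract instead of being reused.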
Documentation update
- Updates docs/design/coreclr/botr/clr-abi.md hidden-parameter section: ARM32 VSD hidden parameter is now R12.

// src/coreclr/jit/compiler.h (ARM32)
#elif defined(TARGET_ARM)
    reg     = REG_R12;
    regMask = RBM_R12;

Customer Impact

Without this change, ARM32 virtual stub dispatch can mishandle the hidden argument and dispatch token on edge paths, and wrapper-delegate-specific infrastructure remains in place even though it is no longer needed. The result is a correctness risk and an unnecessary maintenance burden. The slow-entry assertion fixes also prevent ARM checked-build assertion failures in ResolveHolder::Initialize without adding instruction padding.

Regression

No known product regression is being fixed; this is ABI/calling-convention cleanup plus correctness hardening in ARM32 VSD paths. The latest updates fix regressions introduced during this PR’s ARM32 stub reshaping (slow-entry length mismatch and follow-up assertion failures).

Testing

Baseline build completed: ./build.sh clr+libs+host
Targeted ARM validation completed: ./build.sh clr.runtime -arch arm -c Release
Additional ARM build attempt: ./build.sh clr -arch arm -c Release (failed in ILCompiler_publish.csproj due to a missing libjitinterface_arm.so, unrelated to the ARM32 asm helper change)
Follow-up validation for assertion fixes: ./build.sh clr.runtime -arch arm -c Checked (passed)

Risk

Medium. The change touches low-level CoreCLR ARM32 stub assembly and calling-convention plumbing, but is narrowly scoped to VSD worker handoff and aligned with existing fake-prolog transition-block patterns used elsewhere in ARM stubs. The final slow-entry fix is low risk and removes unnecessary NOP padding while keeping emitted metadata and alignment invariants correct.

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

MichalPetryka · 2026-06-15T04:38:06Z

@copilot Remove all wrapper handling code from MulticastDelegate.CoreCLR.cs too.

MichalPetryka · 2026-06-15T04:41:42Z

@copilot Remove all wrapper handling code from MulticastDelegate.CoreCLR.cs too.

@jkotas Can you repost this since only maintainers can interact with copilot?

jkotas · 2026-06-15T04:44:08Z

@copilot Remove all wrapper handling code from MulticastDelegate.CoreCLR.cs too.

dotnet-policy-service · 2026-06-15T04:44:19Z

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.

- Simplify DispatchHolder AV comment per feedback
- Move ResolveHolder MethodTable load after push to preserve r4

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

…nstants Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

jkotas · 2026-06-18T00:20:39Z

@copilot Build failure

Assertion failed.
IsLowReg(destination) && IsLowReg(source)
at ILCompiler.DependencyAnalysis.ReadyToRun.ImportThunk.EmitCode(NodeFactory factory, ARMEmitter& instructionEncoder, Boolean relocsOnly) in /_/src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/Target_ARM/ImportThunk.cs:line 50

Copilot · 2026-06-18T00:28:26Z

@copilot Build failure

Assertion failed.
...

Fixed in 21e17b1. The issue was that R12 is a high register (R8-R15) and the simple form of EmitLDR requires both source and destination to be low registers (R0-R7). Changed the delay-load/VSD path to use R0 as scratch instead of R12, while keeping R12 for the hidden indirection cell as required by the new ABI.

jkotas · 2026-06-18T00:34:49Z

@copilot Please revert the last commit. It is the wrong fix for the issue. The thunk must not clobber the r0 - r3 registers. Instead, fix the instruction encoder to support the r12 register as needed.

… scratch" This reverts commit 21e17b1. Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

Use 32-bit encoding with offset 0 for high registers while keeping compact 16-bit encoding for low registers (R0-R7). Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

Copilot · 2026-06-18T00:42:45Z

@copilot Please revert the last commit. It is the wrong fix for the issue. The thunk must not clobber the r0 - r3 registers. Instead, fix the instruction encoder to support the r12 register as needed.

Reverted in 78e5b3e and fixed the instruction encoder in 1d5ac1c. ARMEmitter.EmitLDR now uses 32-bit encoding with offset 0 for high registers (R8-R15) including R12, while keeping the compact 16-bit encoding for low registers (R0-R7). Build verified successfully.
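The encoder change described above can be sketched in C++ as follows. This is not the actual `ARMEmitter.EmitLDR` (which is C# and may be structured differently); `EncodeLdrZeroOffset` is a hypothetical helper that only covers `ldr rt, [rn]` with offset 0 and assumes both registers are in r0-r12. The bit patterns are the standard Thumb encodings of LDR (immediate): T1 for the 16-bit form and T3 for the 32-bit form.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Low registers are r0-r7; r8-r15 are high registers.
constexpr bool IsLowReg(unsigned reg)
{
    return reg <= 7;
}

// Hypothetical helper: encode `ldr rt, [rn]` (offset 0) as Thumb halfwords.
// Assumes rt and rn are in r0-r12; SP and PC have special rules not handled here.
std::vector<uint16_t> EncodeLdrZeroOffset(unsigned rt, unsigned rn)
{
    if (IsLowReg(rt) && IsLowReg(rn))
    {
        // 16-bit encoding T1: 0110 1 imm5 Rn Rt, with imm5 = 0.
        return { static_cast<uint16_t>(0x6800u | (rn << 3) | rt) };
    }

    // 32-bit encoding T3: 1111 1000 1101 Rn, then Rt imm12, with imm12 = 0.
    return { static_cast<uint16_t>(0xF8D0u | rn),
             static_cast<uint16_t>(rt << 12) };
}
```

For example, `EncodeLdrZeroOffset(0, 1)` yields the single halfword 0x6808, while `EncodeLdrZeroOffset(12, 12)` yields 0xF8DC followed by 0xC000. The high-register form costs one extra 16-bit WORD, which is why stub length constants change when a register assignment moves to r12.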

Copilot

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 2 comments.

…hunk.cs Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

…g; use it in ImportThunk for R12 section index Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

Use r4 instead of r12 to load ResolveCacheElem.token during the cache loop in ResolveHolder::Initialize. This prevents the hidden IndirectionCellAndFlags in r12 from being clobbered on a pMT-match / token-mismatch path that falls through to _slowEntryPoint.
- Replace 32-bit `ldr r12, [r6 + #token]` (2 WORDs) with 16-bit `ldr r4, [r6, #token]` (1 WORD, safe since r4 is reloaded per loop)
- Change `cmp r12, r5` → `cmp r4, r5`
- Adjust preceding pMT `bne nextEntry` offset from 8 → 7 (1 fewer WORD)
- Decrement resolveEntryPointLen from 32 → 31 to match

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
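The bookkeeping in this commit (one instruction shrinks by a WORD, so a branch spanning it and the total length both drop by one) can be sketched with a small model. The layout below is hypothetical and much shorter than the real resolve entry point; it only illustrates the arithmetic behind 8 → 7 and 32 → 31.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Each entry is one instruction's size in 16-bit WORDs (1 = 16-bit, 2 = 32-bit encoding).
// Any layout passed to these helpers here is hypothetical, not the real ResolveStub.
using Layout = std::vector<std::size_t>;

// WORDs occupied by the instructions in the half-open index range [first, last).
std::size_t SpanInWords(const Layout& sizes, std::size_t first, std::size_t last)
{
    return std::accumulate(sizes.begin() + first, sizes.begin() + last, std::size_t{0});
}

// Total stub length in WORDs.
std::size_t LengthInWords(const Layout& sizes)
{
    return SpanInWords(sizes, 0, sizes.size());
}
```

Taking a made-up layout `{1, 2, 1, 2, 1}` and shrinking the second instruction from 2 WORDs to 1 reduces `LengthInWords` from 7 to 6 and `SpanInWords(layout, 0, 3)` from 4 to 3. Every hard-coded offset or length that covers the changed instruction has to move by the same amount.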

jkotas · 2026-06-18T05:19:37Z

    // The following macro relies on this entry point being DWORD-aligned. We've already asserted that the
    // overall stub is aligned above, just need to check that the preceding stubs occupy an even number of
    // WORD slots.
    _ASSERTE((n & 1) == 0);


@copilot resolveEntryPointLen is odd now, but this assert checks that it is even. Delete the nop for alignment above and adjust resolveEntryPointLen as appropriate

Fixed in the latest commit: removed the alignment NOP from the slow-entry path of the resolve entry point and updated resolveEntryPointLen from 31 to 30. With the NOP gone the instruction count is 30 (even), satisfying the (n & 1) == 0 assert without needing any padding.
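Why the assert cares about parity can be shown with a short sketch. It assumes the usual sizes of 2 bytes per WORD and 4 bytes per DWORD, and a stub base address that is already DWORD-aligned, as the quoted comment states. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same predicate as the quoted assert: the preceding code occupies an even number of WORDs.
constexpr bool HasEvenWordCount(std::size_t n)
{
    return (n & 1) == 0;
}

// True when an entry point that follows `precedingWords` 16-bit WORDs lands on a
// 4-byte (DWORD) boundary, given the stub's base address.
constexpr bool IsEntryDwordAligned(std::uintptr_t base, std::size_t precedingWords)
{
    return (base + precedingWords * 2) % 4 == 0;
}
```

With an aligned base such as 0x1000, 31 preceding WORDs put the next entry point at an address that is only 2-byte aligned, so the check fails; 30 WORDs keep it on a 4-byte boundary, which matches the final value of resolveEntryPointLen.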

…eEntryPointLen to 30 Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

Initial plan

c7c6b25

Copilot AI assigned Copilot and jkotas Jun 15, 2026

Copilot AI requested review from Copilot and removed request for Copilot June 15, 2026 03:44

Copilot started work on behalf of jkotas June 15, 2026 03:44

github-actions Bot added the area-Infrastructure label Jun 15, 2026

github-project-automation Bot added this to Runtime Infra Jun 15, 2026

Remove Wrapper delegate kind and switch ARM32 VSD hidden arg to r12

877724a

Copilot AI requested review from Copilot and removed request for Copilot June 15, 2026 04:32

Copilot AI changed the title ~~[WIP] Remove need for the special Wrapper delegate kind~~ Remove Wrapper delegate kind; switch ARM32 VSD hidden arg to R12 Jun 15, 2026

Copilot finished work on behalf of jkotas June 15, 2026 04:33

Copilot AI requested a review from jkotas June 15, 2026 04:33