Skip to content

Remove Wrapper delegate kind, switch ARM32 VSD hidden arg to R12#129403

Draft
Copilot wants to merge 53 commits into
mainfrom
copilot/remove-wrapper-delegate-kind
Draft

Remove Wrapper delegate kind, switch ARM32 VSD hidden arg to R12#129403
Copilot wants to merge 53 commits into
mainfrom
copilot/remove-wrapper-delegate-kind

Conversation

Copilot AI commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

main PR

Description

CoreCLR’s Wrapper delegate kind was retained only to satisfy an ARM32 VSD ABI quirk (r4 hidden cell). This change removes that special delegate path by moving ARM32 VSD hidden-parameter passing to r12 and deleting wrapper-delegate infrastructure across VM/JIT/AOT/contracts and System.Private.CoreLib.

  • ARM32 VSD calling-convention alignment

    • Updates ARM32 stub-dispatch plumbing to carry IndirectionCellAndFlags in r12.
    • Aligns JIT virtual-stub parameter selection on ARM32 to REG_R12.
    • Applies follow-up ARM32 VSD worker handoff fixes:
      • LookupHolder::Initialize and ResolveStub slow-entry push incoming hidden r12 to the stack and load dispatch token into r12 before branching to the worker.
      • ResolveWorkerAsmStub now uses a fake-prolog/transition-block shape (matching the existing ARM delay-load pattern) so argument registers are preserved correctly while passing the dispatch token and hidden cell/flags to VSD_ResolveWorker.
      • Updates ARM stub length constants to match the instruction sequence changes.
    • Post-feedback correctness fixes:
      • Corrects ResolveStub::slowEntryPointLen to match the emitted slow-entry instruction count.
      • Removes unnecessary slow-entry alignment NOP padding and keeps slowEntryPointLen at 6, which is already even and satisfies the subsequent alignment invariant ((n & 1) == 0).
  • Wrapper delegate infrastructure removal

    • Removes wrapper-delegate creation/detection/runtime paths from COMDelegate.
    • Removes remaining wrapper handling from MulticastDelegate.CoreCLR.cs.
    • Removes wrapper-specific IL stub kind/plumbing and associated metadata fields.
    • Removes wrapper-specific JIT call flags and late-arg handling.
  • Cross-layer contract cleanup

    • Deletes obsolete wrapperDelegateInvoke and offsetOfWrapperDelegateIndirectCell from EE/JIT interfaces and SuperPMI agnostic records.
    • Keeps VM/JIT/AOT call-info structures consistent after field removal.
    • Revs the JIT/EE interface GUID in jiteeversionguid.h to match the interface change.
  • IL stub enum cleanup

    • Deletes the unused wrapper IL stub value and renumbers subsequent IL stub values in VM (method.hpp, dllimport.h) and corresponding cDAC contract mapping.
  • ARM32 ResolveStub correctness fix

    • Fixes ResolveStub::_failEntryPoint to avoid clobbering r12 (which now carries IndirectionCellAndFlags).
    • Uses r4 as decrement scratch for _pCounter update and preserves callee-saved r4 via stack save/restore in the fail path.
  • JIT cleanup after feedback

    • Removes an unused VirtualStubParamInfo constructor argument in src/coreclr/jit/compiler.h.
    • Updates the call site in src/coreclr/jit/compiler.cpp accordingly.
    • Moves GTF_CALL_M_STACK_ARRAY to the removed wrapper-flag slot (0x00004000) so call flag values remain contiguous.
  • Delegate VM cleanup after feedback

    • Removes the temporary refRealDelegate local and associated GCPROTECT_BEGIN/END in COMDelegate::BindToMethod.
    • Replaces uses with (*pRefThis) directly.
    • Deletes the unused _ReturnAddress intrinsic declaration from comdelegate.cpp.
  • R2R compatibility boundary

    • Bumps ReadyToRun major/minimum version to 24 for the ARM32 VSD register contract change.
    • Syncs version constants across native and managed header definitions.
  • Documentation update

    • Updates docs/design/coreclr/botr/clr-abi.md hidden-parameter section: ARM32 VSD hidden parameter is now R12.
// src/coreclr/jit/compiler.h (ARM32)
#elif defined(TARGET_ARM)
    reg     = REG_R12;
    regMask = RBM_R12;

Customer Impact

Without this change, ARM32 virtual stub dispatch can mis-handle hidden argument/token flow in edge paths and wrapper-delegate-specific infrastructure remains in place even though it is no longer needed. This can lead to correctness risk and unnecessary maintenance burden. The slow-entry assertion fixes also prevent ARM checked-build assertion failures in ResolveHolder::Initialize without unnecessary instruction padding.

Regression

No known product regression is being fixed; this is ABI/calling-convention cleanup plus correctness hardening in ARM32 VSD paths. The latest updates fix regressions introduced during this PR’s ARM32 stub reshaping (slow-entry length mismatch and follow-up assertion failures).

Testing

  • Baseline build completed: ./build.sh clr+libs+host
  • Targeted ARM validation completed: ./build.sh clr.runtime -arch arm -c Release
  • Additional ARM build attempt: ./build.sh clr -arch arm -c Release (failed in ILCompiler_publish.csproj due missing libjitinterface_arm.so, unrelated to the ARM32 asm helper change)
  • Follow-up validation for assertion fixes: ./build.sh clr.runtime -arch arm -c Checked (passed)

Risk

Medium. The change touches low-level CoreCLR ARM32 stub assembly and calling-convention plumbing, but is narrowly scoped to VSD worker handoff and aligned with existing fake-prolog transition-block patterns used elsewhere in ARM stubs. The final slow-entry fix is low risk and removes unnecessary NOP padding while keeping emitted metadata and alignment invariants correct.

Package authoring no longer needed in .NET 9

IMPORTANT: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version.
Keep in mind that we still need package authoring in .NET 8 and older versions.

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot June 15, 2026 04:32
Copilot AI changed the title [WIP] Remove need for the special Wrapper delegate kind Remove Wrapper delegate kind; switch ARM32 VSD hidden arg to R12 Jun 15, 2026
Copilot AI requested a review from jkotas June 15, 2026 04:33
@MichalPetryka

Copy link
Copy Markdown
Contributor

@copilot Remove all wrapper handling code from MulticastDelegate.CoreCLR.cs too.

Comment thread src/coreclr/inc/corinfo.h
Comment thread src/coreclr/tools/aot/ILCompiler.RyuJit/JitInterface/CorInfoImpl.RyuJit.cs Outdated
Comment thread src/coreclr/vm/method.hpp Outdated
@MichalPetryka

Copy link
Copy Markdown
Contributor

@copilot Remove all wrapper handling code from MulticastDelegate.CoreCLR.cs too.

@jkotas Can you repost this since only maintainers can interact with copilot?

Comment thread src/coreclr/vm/dllimport.h Outdated
@jkotas

jkotas commented Jun 15, 2026

Copy link
Copy Markdown
Member

@copilot Remove all wrapper handling code from MulticastDelegate.CoreCLR.cs too.

@dotnet-policy-service

Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Comment thread src/coreclr/vm/arm/asmhelpers.S Outdated
Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot June 15, 2026 05:03
Comment thread src/coreclr/vm/arm/asmhelpers.S Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/vm/arm/stubs.cpp
Comment thread src/coreclr/vm/arm/stubs.cpp Outdated
Copilot AI and others added 3 commits June 17, 2026 15:16
- Simplify DispatchHolder AV comment per feedback
- Move ResolveHolder MethodTable load after push to preserve r4

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Comment thread src/coreclr/vm/arm/exceparm.cpp Outdated
…nstants

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Comment thread src/coreclr/vm/arm/exceparm.cpp Outdated
Comment thread src/coreclr/vm/arm/cgencpu.h Outdated
Copilot AI and others added 2 commits June 17, 2026 17:10
Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
@jkotas

jkotas commented Jun 18, 2026

Copy link
Copy Markdown
Member

@copilot Build failure

Assertion failed.
IsLowReg(destination) && IsLowReg(source)
at ILCompiler.DependencyAnalysis.ReadyToRun.ImportThunk.EmitCode(NodeFactory factory, ARMEmitter& instructionEncoder, Boolean relocsOnly) in /_/src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/Target_ARM/ImportThunk.cs:line 50

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@copilot Build failure

Assertion failed.
...

Fixed in 21e17b1. The issue was that R12 is a high register (R8-R15) and the simple form of EmitLDR requires both source and destination to be low registers (R0-R7). Changed the delay-load/VSD path to use R0 as scratch instead of R12, while keeping R12 for the hidden indirection cell as required by the new ABI.

@jkotas

jkotas commented Jun 18, 2026

Copy link
Copy Markdown
Member

@copilot Please revert the last commit. It is wrong fix for the issue. The thunk must not clobber r0 - r3 registers. Instead, fix the instruction encoder to support r12 register as needed.

Copilot AI and others added 2 commits June 18, 2026 00:41
… scratch"

This reverts commit 21e17b1.

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Use 32-bit encoding with offset 0 for high registers while keeping
compact 16-bit encoding for low registers (R0-R7).

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@copilot Please revert the last commit. It is wrong fix for the issue. The thunk must not clobber r0 - r3 registers. Instead, fix the instruction encoder to support r12 register as needed.

Reverted in 78e5b3e and fixed the instruction encoder in 1d5ac1c. ARMEmitter.EmitLDR now uses 32-bit encoding with offset 0 for high registers (R8-R15) including R12, while keeping the compact 16-bit encoding for low registers (R0-R7). Build verified successfully.

Comment thread src/coreclr/tools/Common/Compiler/DependencyAnalysis/Target_ARM/ARMEmitter.cs Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/vm/arm/stubs.cpp
Copilot AI and others added 3 commits June 18, 2026 01:06
…hunk.cs

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
…g; use it in ImportThunk for R12 section index

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Use r4 instead of r12 to load ResolveCacheElem.token during the cache
loop in ResolveHolder::Initialize. This prevents the hidden
IndirectionCellAndFlags in r12 from being clobbered on a pMT-match /
token-mismatch path that falls through to _slowEntryPoint.

- Replace 32-bit `ldr r12, [r6 + #token]` (2 WORDs) with 16-bit
  `ldr r4, [r6, #token]` (1 WORD, safe since r4 is reloaded per loop)
- Change `cmp r12, r5` → `cmp r4, r5`
- Adjust preceding pMT `bne nextEntry` offset from 8 → 7 (1 fewer WORD)
- Decrement resolveEntryPointLen from 32 → 31 to match

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
// The following macro relies on this entry point being DWORD-aligned. We've already asserted that the
// overall stub is aligned above, just need to check that the preceding stubs occupy an even number of
// WORD slots.
_ASSERTE((n & 1) == 0);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot resolveEntryPointLen is odd now, but this assert checks that it is even. Delete the nop for alignment above and adjust resolveEntryPointLen as appropriate

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest commit: removed the alignment NOP from the slow-entry path of the resolve entry point and updated resolveEntryPointLen from 31 to 30. With the NOP gone the instruction count is 30 (even), satisfying the (n & 1) == 0 assert without needing any padding.

…eEntryPointLen to 30

Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants