JIT: don't forward sub calls under TYP_REF GT_STOREIND #125178

Open
jakobbotsch wants to merge 1 commit into dotnet:main from jakobbotsch:forward-sub-under-stores


@jakobbotsch
Member

Write barrier register constraints for ref-typed store indirections result in register shuffling when the value is a call node. Skip forward substitution in this case as a profitability heuristic.

Addresses regressions I saw in #125141.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 4, 2026 14:43
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 4, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR adjusts JIT forward-substitution heuristics to avoid substituting call expressions directly under GT_STOREIND<TYP_REF> stores, aiming to reduce register shuffling caused by write barrier register constraints (noted as regressing after #125141).

Changes:

  • Adds a profitability guard in fgForwardSubStatement to skip forward substitution when the substitution would place a GT_CALL under a GT_STOREIND<TYP_REF>.
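The shape of that guard can be sketched in isolation. The following is a minimal model, not the JIT's code: `Oper`, `Type`, `Node`, and `skipForwardSub` are hypothetical stand-ins for the real `GenTree` machinery, and only the decision logic described above is reproduced.

```cpp
// Hypothetical stand-ins for the JIT's node kinds and types; not the real GenTree API.
enum class Oper { Call, StoreInd, Other };
enum class Type { Ref, Int };

struct Node
{
    Oper oper;
    Type type;
};

// Returns true when forward substitution should be skipped: the candidate is a
// call and the node it would land under is a ref-typed store indirection.
bool skipForwardSub(const Node& candidate, const Node* parent)
{
    if (candidate.oper != Oper::Call)
    {
        return false;
    }

    return (parent != nullptr) && (parent->oper == Oper::StoreInd) && (parent->type == Type::Ref);
}
```

In this model a call under a ref-typed store is rejected, while a call under a non-ref store, a call with no known parent, or a non-call candidate is still allowed through.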

}

// Don't forward sub calls under TYP_REF GT_STOREIND nodes;
// the write barrier will constraint the address to a register that the call
Copilot AI Mar 4, 2026

Grammar: "will constraint" should be "will constrain".

Suggested change
- // the write barrier will constraint the address to a register that the call
+ // the write barrier will constrain the address to a register that the call

Comment on lines +769 to +779
    // Don't forward sub calls under TYP_REF GT_STOREIND nodes;
    // the write barrier will constraint the address to a register that the call
    // will trash, resulting in unfortunate shuffling.
    //
    if (fwdSubNode->IsCall())
    {
        GenTree* const parentNode = fsv.GetParentNode();
        if ((parentNode != nullptr) && parentNode->OperIs(GT_STOREIND) && parentNode->TypeIs(TYP_REF))
        {
            JITDUMP(" call under TYP_REF GT_STOREIND; write barrier constraints\n");
            return false;
Copilot AI Mar 4, 2026

The heuristic is keyed only on GT_STOREIND + TYP_REF, but STOREIND<ref> does not necessarily produce a write barrier (e.g., when GTF_IND_TGT_NOT_HEAP is set or when the address is a GT_LCL_ADDR, gcIsWriteBarrierCandidate returns WBF_NoBarrier). As written, this will also block profitable forwarding in non-write-barrier cases; consider tightening the condition to only skip when the store is actually (or very likely) a write-barrier candidate.

Suggested change
-    // Don't forward sub calls under TYP_REF GT_STOREIND nodes;
-    // the write barrier will constraint the address to a register that the call
-    // will trash, resulting in unfortunate shuffling.
-    //
-    if (fwdSubNode->IsCall())
-    {
-        GenTree* const parentNode = fsv.GetParentNode();
-        if ((parentNode != nullptr) && parentNode->OperIs(GT_STOREIND) && parentNode->TypeIs(TYP_REF))
-        {
-            JITDUMP(" call under TYP_REF GT_STOREIND; write barrier constraints\n");
-            return false;
+    // Don't forward sub calls under TYP_REF GT_STOREIND nodes that will actually
+    // require a write barrier; the barrier will constrain the address to a
+    // register that the call will trash, resulting in unfortunate shuffling.
+    //
+    if (fwdSubNode->IsCall())
+    {
+        GenTree* const parentNode = fsv.GetParentNode();
+        if ((parentNode != nullptr) && parentNode->OperIs(GT_STOREIND) && parentNode->TypeIs(TYP_REF))
+        {
+            GenTree* const addr = parentNode->AsIndir()->Addr();
+            if (gcIsWriteBarrierCandidate(parentNode, addr) != WBF_NoBarrier)
+            {
+                JITDUMP(" call under TYP_REF GT_STOREIND; write barrier constraints\n");
+                return false;
+            }

@jakobbotsch jakobbotsch marked this pull request as ready for review March 4, 2026 16:35
@jakobbotsch
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Diffs. Primarily arm32 seems to be affected, maybe because of the few registers it has. I can ifdef it if you'd prefer or we can keep it as is.

@jakobbotsch jakobbotsch requested a review from AndyAyersMS March 4, 2026 16:36
Member

@AndyAyersMS AndyAyersMS left a comment

LGTM.

Copilot seems to think you can be a bit less restrictive.


// Don't forward sub calls under TYP_REF GT_STOREIND nodes;
// the write barrier will constraint the address to a register that the call
// will trash, resulting in unfortunate shuffling.
Contributor

// will trash, resulting in unfortunate shuffling.

Put this under #if HAS_FIXED_REGISTER_SET?

@jakobbotsch
Member Author

The issue is more specifically that on arm32, a call's result is returned in the same register in which the first argument is passed. That creates a kind of conflict that LSRA is not good at resolving, and it ends up spilling.
I think I will limit the change to arm32.

@jakobbotsch
Member Author

Although, it seems the same is true for arm64, so I am not sure why it does not suffer from the problem. Need to study it a bit more...

@jakobbotsch
Member Author

Ah, the write barrier on arm64 uses a custom calling convention that takes the source/destination in x14/x15, so it avoids the problem.
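The difference between the two targets can be illustrated with a small model. This is only a sketch under assumptions: `Conventions`, `arm32Model`, `arm64Model`, and `returnClashesWithBarrierArg` are hypothetical names, the arm32 barrier is assumed to take its arguments in r0 and r1 as an ordinary call would, and the order in which x14 and x15 are assigned on arm64 is also an assumption, since the thread only says that the pair is used.

```cpp
#include <string>

// Hypothetical, simplified view of the conventions discussed in this thread;
// these are not the JIT's real data structures.
struct Conventions
{
    std::string callReturnReg;       // where an ordinary call leaves its result
    std::string barrierFirstArgReg;  // first register the write barrier helper reads
    std::string barrierSecondArgReg; // second register the write barrier helper reads
};

// Assumption: on arm32 the barrier uses the normal argument registers r0 and r1,
// and a call returns its result in r0.
Conventions arm32Model()
{
    return {"r0", "r0", "r1"};
}

// Assumption: on arm64 a call returns in x0, while the barrier uses the custom
// pair x14/x15 (which of the two holds the address is not stated in the thread).
Conventions arm64Model()
{
    return {"x0", "x14", "x15"};
}

// True when a call result lands in the register the barrier wants for its first
// argument, which is the conflict described for arm32.
bool returnClashesWithBarrierArg(const Conventions& c)
{
    return c.callReturnReg == c.barrierFirstArgReg;
}
```

Under these assumptions the arm32 model reports a clash and the arm64 model does not, which matches the observation that the diffs are concentrated on arm32.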
