JIT: don't forward sub calls under TYP_REF GT_STOREIND#125178
JIT: don't forward sub calls under TYP_REF GT_STOREIND#125178jakobbotsch wants to merge 1 commit intodotnet:mainfrom
Conversation
Write barrier register constraints for ref-typed store indirections result in register shuffling when the value is a call node. Skip forward substitution in this case as a profitability heuristic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR adjusts JIT forward-substitution heuristics to avoid substituting call expressions directly under GT_STOREIND<TYP_REF> stores, aiming to reduce register shuffling caused by write barrier register constraints (noted as regressing after #125141).
Changes:
- Adds a profitability guard in
fgForwardSubStatementto skip forward substitution when the substitution would place aGT_CALLunder aGT_STOREIND<TYP_REF>.
| } | ||
|
|
||
| // Don't forward sub calls under TYP_REF GT_STOREIND nodes; | ||
| // the write barrier will constraint the address to a register that the call |
There was a problem hiding this comment.
Grammar: "will constraint" should be "will constrain".
| // the write barrier will constraint the address to a register that the call | |
| // the write barrier will constrain the address to a register that the call |
| // Don't forward sub calls under TYP_REF GT_STOREIND nodes; | ||
| // the write barrier will constraint the address to a register that the call | ||
| // will trash, resulting in unfortunate shuffling. | ||
| // | ||
| if (fwdSubNode->IsCall()) | ||
| { | ||
| GenTree* const parentNode = fsv.GetParentNode(); | ||
| if ((parentNode != nullptr) && parentNode->OperIs(GT_STOREIND) && parentNode->TypeIs(TYP_REF)) | ||
| { | ||
| JITDUMP(" call under TYP_REF GT_STOREIND; write barrier constraints\n"); | ||
| return false; |
There was a problem hiding this comment.
The heuristic is keyed only on GT_STOREIND + TYP_REF, but STOREIND<ref> does not necessarily produce a write barrier (e.g., when GTF_IND_TGT_NOT_HEAP is set or when the address is a GT_LCL_ADDR, gcIsWriteBarrierCandidate returns WBF_NoBarrier). As written, this will also block profitable forwarding in non-write-barrier cases; consider tightening the condition to only skip when the store is actually (or very likely) a write-barrier candidate.
| // Don't forward sub calls under TYP_REF GT_STOREIND nodes; | |
| // the write barrier will constraint the address to a register that the call | |
| // will trash, resulting in unfortunate shuffling. | |
| // | |
| if (fwdSubNode->IsCall()) | |
| { | |
| GenTree* const parentNode = fsv.GetParentNode(); | |
| if ((parentNode != nullptr) && parentNode->OperIs(GT_STOREIND) && parentNode->TypeIs(TYP_REF)) | |
| { | |
| JITDUMP(" call under TYP_REF GT_STOREIND; write barrier constraints\n"); | |
| return false; | |
| // Don't forward sub calls under TYP_REF GT_STOREIND nodes that will actually | |
| // require a write barrier; the barrier will constrain the address to a | |
| // register that the call will trash, resulting in unfortunate shuffling. | |
| // | |
| if (fwdSubNode->IsCall()) | |
| { | |
| GenTree* const parentNode = fsv.GetParentNode(); | |
| if ((parentNode != nullptr) && parentNode->OperIs(GT_STOREIND) && parentNode->TypeIs(TYP_REF)) | |
| { | |
| GenTree* const addr = parentNode->AsIndir()->Addr(); | |
| if (gcIsWriteBarrierCandidate(parentNode, addr) != WBF_NoBarrier) | |
| { | |
| JITDUMP(" call under TYP_REF GT_STOREIND; write barrier constraints\n"); | |
| return false; | |
| } |
|
cc @dotnet/jit-contrib PTAL @AndyAyersMS Diffs. Primarily arm32 seems to be affected, maybe because of the few registers it has. I can ifdef it if you'd prefer or we can keep it as is. |
AndyAyersMS
left a comment
There was a problem hiding this comment.
LGTM.
Copilot seems to think you can be a bit less restrictive.
|
|
||
| // Don't forward sub calls under TYP_REF GT_STOREIND nodes; | ||
| // the write barrier will constraint the address to a register that the call | ||
| // will trash, resulting in unfortunate shuffling. |
There was a problem hiding this comment.
// will trash, resulting in unfortunate shuffling.
Put this under #if HAS_FIXED_REGISTER_SET?
|
The issue is more specifically that on arm32, results are returned in the same register as the first argument is passed, which results in a kind of conflict that LSRA is not good at resolving (and ends up spilling for). |
|
Although, it seems the same is true for arm64, so I am not sure why it does not suffer from the problem. Need to study it a bit more... |
|
Ah, the write barrier on arm64 uses a custom calling convention that takes the source/destination in x14/x15, so it avoids the problem. |
Write barrier register constraints for ref-typed store indirections result in register shuffling when the value is a call node. Skip forward substitution in this case as a profitability heuristic.
Addresses regressions I saw in #125141.