Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
- Reorder invariant nodes in simple scenarios in stackifier
- Jitdump when moving nodes in stackifier
- When regallocwasm creates a new store node, lower it
- Apply regallocwasm fix from andy
- Checkpoint
- Checkpoint
- Add comment
- Speculatively implement the dstOnStack optimization (code that hits it doesn't compile yet)
867f769 to 4c8682d
Pull request overview
This PR advances the WASM RyuJIT backend’s block-store support by adding cpobj codegen, addressing a stackifier NIY case, and ensuring newly generated stores are lowered post-rewrite to keep the pipeline consistent.
Changes:
- Implement `CodeGen::genCodeForCpObj` for `GT_STORE_BLK` cpobj unrolling on WASM.
- Extend WASM regalloc to track multi-use operands for `GT_STORE_BLK` and to re-lower stores created by `RewriteLocalStackStore`.
- Improve lowering/stackifier behavior: mark cpobj operands as multiply-used and relax one stackifier NIY by moving invariant nodes.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/coreclr/jit/regallocwasm.h | Adds CollectReferencesForBlockStore declaration for block store multi-use tracking. |
| src/coreclr/jit/regallocwasm.cpp | Implements block-store reference collection and re-lowers stores created during local-store rewrite. |
| src/coreclr/jit/lowerwasm.cpp | Marks cpobj operands as MultiplyUsed and adjusts stackifier handling for invariant nodes. |
| src/coreclr/jit/gentree.cpp | Improves WASM dump text for native block-store opcodes (memory.copy vs memory.fill). |
| src/coreclr/jit/compiler.h | Adds WasmRegAlloc friend access for invoking lowering post-rewrite. |
| src/coreclr/jit/codegenwasm.cpp | Implements cpobj unrolled copying sequence (load/store vs helper call) and adds pointer-sized instruction aliases. |
Comments suppressed due to low confidence (5)
src/coreclr/jit/codegenwasm.cpp:2368
- The call to `genEmitHelperCall(CORINFO_HELP_ASSIGN_BYREF, ...)` is missing the required SP argument. `CodeGen::genEmitHelperCall` explicitly notes that for WASM helper calls, the stack-pointer argument must be first on the value stack (below any other args). Not pushing SP here would mismatch the helper signature and can explain the reported crash when calling the write barrier helper.
Suggestion: push GetStackPointerReg() (via local.get) as the first argument before pushing the destination/source byrefs, for each helper call site (or refactor to a small helper that emits the correct argument sequence).
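The ordering suggested above can be sketched with a small stand-alone model. `MockEmitter` and `EmitByrefAssignHelperCall` are hypothetical names invented for illustration, not the JIT's emitter API, and the destination-then-source order of the byrefs is an assumption; the only point shown is that SP is pushed before any other argument.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the emitter: records wasm-like instructions as text.
struct MockEmitter
{
    std::vector<std::string> code;

    void LocalGet(const std::string& local) { code.push_back("local.get " + local); }
    void Call(const std::string& helper) { code.push_back("call " + helper); }
};

// Hypothetical wrapper so every call site pushes arguments in the same order.
// Assumed order: SP first (lowest on the value stack), then dst, then src.
void EmitByrefAssignHelperCall(MockEmitter&       emit,
                               const std::string& spLocal,
                               const std::string& dstLocal,
                               const std::string& srcLocal)
{
    assert(!spLocal.empty()); // the SP argument must not be omitted

    emit.LocalGet(spLocal);  // stack pointer sits below all other args
    emit.LocalGet(dstLocal); // destination byref
    emit.LocalGet(srcLocal); // source byref
    emit.Call("CORINFO_HELP_ASSIGN_BYREF");
}
```

Routing all helper call sites through one wrapper of this kind would keep the SP push from being forgotten at an individual site.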
case PackOperAndType(GT_LE, TYP_DOUBLE):
ins = INS_f64_le;
break;
case PackOperAndType(GT_GE, TYP_FLOAT):
src/coreclr/jit/codegenwasm.cpp:2324
`genCodeForCpObj` calls `genConsumeRegs(cpObjNode)`, which only updates liveness for the GT_STORE_BLK node itself and does not consume/update liveness for its operands. This differs from patterns like `genCodeForStoreInd`, and can lead to incorrect liveness (and thus wrong code) for `Addr()`/`Data()`.
Suggestion: consume the destination address and source address/value explicitly (e.g., using genConsumeAddress(dstAddr) and genConsumeRegs(...) in the correct execution order, or genConsumeOperands if appropriate) before emitting the copy sequence, and then do the usual life update for the store node.
// So we can re-express say GT_GE (UN) as !GT_LT
//
src/coreclr/jit/codegenwasm.cpp:2373
`gcPtrCount` is initialized to the total GC pointer slot count, but it is only decremented in the write-barrier path. When `dstOnStack` is true, GC pointer slots are copied via the non-WB load/store path and `gcPtrCount` will never reach 0, causing the final `assert(gcPtrCount == 0)` to fire in debug builds.
Suggestion: either decrement gcPtrCount whenever layout->IsGCPtr(i) (regardless of dstOnStack), or move the gcPtrCount accounting + assert under the !dstOnStack branch (similar to other target implementations).
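The first option above (decrement for every GC slot) can be modelled in isolation. This is a simplified sketch, not the code in `codegenwasm.cpp`: `CopyStats` and `CopySlots` are hypothetical names, and the layout is reduced to a vector of flags where `true` marks a GC pointer slot.

```cpp
#include <cassert>
#include <vector>

struct CopyStats
{
    unsigned plainCopies  = 0; // slots copied with a plain load/store
    unsigned barrierCalls = 0; // slots copied through the write-barrier helper
};

// Hypothetical model of the per-slot loop; gcSlots[i] is true for a GC pointer slot.
CopyStats CopySlots(const std::vector<bool>& gcSlots, bool dstOnStack)
{
    CopyStats stats;

    unsigned gcPtrCount = 0;
    for (bool isGC : gcSlots)
    {
        gcPtrCount += isGC ? 1 : 0;
    }

    for (bool isGC : gcSlots)
    {
        if (isGC)
        {
            gcPtrCount--; // counted for every GC slot, regardless of dstOnStack
            if (!dstOnStack)
            {
                stats.barrierCalls++; // heap destination: needs the write barrier
                continue;
            }
        }
        stats.plainCopies++; // non-GC slot, or GC slot copied to the stack
    }

    assert(gcPtrCount == 0); // holds on both paths with the accounting above
    return stats;
}
```

With the decrement placed inside the `!dstOnStack` branch instead, the final assert would fire for any stack destination whose layout contains a GC pointer.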
instruction ins;
switch (PackOperAndType(op, treeNode->gtOp1->TypeGet()))
{
case PackOperAndType(GT_EQ, TYP_FLOAT):
ins = INS_f32_eq;
break;
case PackOperAndType(GT_EQ, TYP_DOUBLE):
ins = INS_f64_eq;
break;
case PackOperAndType(GT_NE, TYP_FLOAT):
ins = INS_f32_ne;
break;
case PackOperAndType(GT_NE, TYP_DOUBLE):
ins = INS_f64_ne;
break;
case PackOperAndType(GT_LT, TYP_FLOAT):
ins = INS_f32_lt;
break;
case PackOperAndType(GT_LT, TYP_DOUBLE):
ins = INS_f64_lt;
break;
case PackOperAndType(GT_LE, TYP_FLOAT):
ins = INS_f32_le;
break;
case PackOperAndType(GT_LE, TYP_DOUBLE):
ins = INS_f64_le;
break;
case PackOperAndType(GT_GE, TYP_FLOAT):
ins = INS_f32_ge;
break;
case PackOperAndType(GT_GE, TYP_DOUBLE):
ins = INS_f64_ge;
break;
src/coreclr/jit/codegenwasm.cpp:2364
- The offset computation for the write-barrier helper path hard-codes `INS_i32_const`/`INS_i32_add`, but the address locals may be `i64` when `TARGET_64BIT` (and you already abstracted other pointer-sized ops via `INS_I_*`). This will produce invalid wasm or incorrect values on 64-bit.
Suggestion: use the pointer-sized const/add instructions (INS_I_const/INS_I_add) and the appropriate emitAttr for the constant (or select i32 vs i64 based on TARGET_64BIT) so the address arithmetic matches the address type.
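As a rough illustration of the selection described above, the sketch below picks the const/add pair from the pointer width. It uses wasm mnemonics as strings instead of the JIT's instruction enum, and `PtrSizedIns`, `SelectPtrSizedIns`, and `EmitAddrPlusOffset` are hypothetical names that do not exist in the PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pair of pointer-sized instructions, modelled as wasm mnemonics.
struct PtrSizedIns
{
    std::string constIns;
    std::string addIns;
};

// Choose i64 forms for 64-bit targets and i32 forms otherwise.
PtrSizedIns SelectPtrSizedIns(bool target64Bit)
{
    return target64Bit ? PtrSizedIns{"i64.const", "i64.add"} : PtrSizedIns{"i32.const", "i32.add"};
}

// Emit "addr + offset" so the arithmetic type matches the address local's type.
std::vector<std::string> EmitAddrPlusOffset(const std::string& addrLocal, unsigned offset, bool target64Bit)
{
    const PtrSizedIns ins = SelectPtrSizedIns(target64Bit);
    return {
        "local.get " + addrLocal,
        ins.constIns + " " + std::to_string(offset),
        ins.addIns,
    };
}
```

Mixing an `i64` address local with `i32.const`/`i32.add` fails wasm validation, which is why the pair is chosen in one place here.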
case PackOperAndType(GT_LT, TYP_DOUBLE):
ins = INS_f64_lt;
break;
case PackOperAndType(GT_LE, TYP_FLOAT):
ins = INS_f32_le;
break;
src/coreclr/jit/codegenwasm.cpp:2310
`noway_assert(source->IsLocal())`/`noway_assert(dstAddr->IsLocal())` will hard-fail compilation in non-DEBUG builds if either operand is not a local. Nothing in `LowerBlockStore` guarantees `Addr()` is a local, and other targets handle arbitrary address expressions here.
Suggestion: remove these noway_asserts and rely on GetMultiUseOperandReg (with MultiplyUsed marking where needed) to support non-local address expressions; if some shapes are truly unsupported for now, prefer an explicit NYI/IMPL_LIMITATION gate instead of a noway_assert in release builds.
genTreeOps op = treeNode->OperGet();
@dotnet/jit-contrib Not ready to merge but I don't think it can reach ready without human review. Thanks to Single for walking me through a lot of the tricky parts.
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
SingleAccretion left a comment
LGTM modulo SetMultiplyUsed nit.
AndyAyersMS left a comment
LGTM overall, just a few formatting nits you can fix in a follow-up PR.
Address unresolved comments from #124846
This is sufficient to compile the following to valid WASM (it crashes when attempting to call the write barrier helper):
And the following just plain works, since copies to the stack don't need a write barrier:
The following now works thanks to implementing isContainableMemoryOp and fixing a related bug:
Fixes #124903