Stackalloc localloc conditional2 #127980
Draft
AndyAyersMS wants to merge 47 commits into
…onditional2

# Conflicts:
# src/coreclr/jit/codegenarm.cpp
# src/coreclr/jit/codegenarm64.cpp
# src/coreclr/jit/codegenloongarch64.cpp
# src/coreclr/jit/codegenriscv64.cpp
# src/coreclr/jit/codegenxarch.cpp
# src/coreclr/jit/gentree.cpp
# src/coreclr/jit/gentree.h
# src/coreclr/jit/jitconfigvalues.h
# src/coreclr/jit/lsraarm.cpp
# src/coreclr/jit/lsraarm64.cpp
# src/coreclr/jit/lsraloongarch64.cpp
# src/coreclr/jit/lsrariscv64.cpp
# src/coreclr/jit/lsraxarch.cpp
# src/coreclr/jit/morph.cpp
# src/coreclr/jit/objectalloc.cpp
# src/coreclr/jit/objectalloc.h
The merge of upstream/main into StackallocLocallocConditional2 left the body of MorphNewArrNodeIntoStackAlloc in HEAD's old form (taking GenTree* len) while the header was updated to main's new signature (unsigned length, unsigned blockSize, returning unsigned int). Replace the body's signature and preamble with main's version; the rest of the function (including the return lclNum already added during the merge) is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The five other codegen backends (xarch, arm, arm64, loongarch64, riscv64) zero LCLHEAP allocations when either info.compInitMem is set or the GTF_LCLHEAP_MUSTINIT flag is present on the LCLHEAP node. The wasm backend was missing the flag check, so an LCLHEAP marked MUSTINIT (e.g. the runtime length stack-array path that flows through helperexpansion.cpp) would not be zeroed when compInitMem is false. Validated by building clr.wasmjit subset. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce an inline helper on Compiler that returns true iff a given LCLHEAP node must zero its allocation, encapsulating the common 'info.compInitMem || (tree->gtFlags & GTF_LCLHEAP_MUSTINIT)' check used by every codegen and LSRA backend. Replace the 12 inlined occurrences across xarch, arm, arm64, loongarch64, riscv64, and wasm with calls to the helper. Validated by building clr.jit and clr.wasmjit subsets. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
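The check that this commit centralizes can be sketched roughly as follows. This is a minimal illustration, not the JIT's actual code: the `GenTree`, `CompilerInfo`, and `Compiler` structs are hypothetical stand-ins, the bit value chosen for `GTF_LCLHEAP_MUSTINIT` is an assumption, and the real helper's parameter type may differ. Only the flag name, the helper name, and the shape of the condition come from the commit message.

```cpp
#include <cstdint>

// Assumed bit value, for illustration only; the real flag is defined in gentree.h.
constexpr uint32_t GTF_LCLHEAP_MUSTINIT = 0x80000000;

// Hypothetical stand-in for a JIT tree node: only the flags word is modeled.
struct GenTree
{
    uint32_t gtFlags;
};

// Hypothetical stand-in for the per-method info block.
struct CompilerInfo
{
    bool compInitMem; // true when the method has init-mem semantics
};

// Hypothetical stand-in for the Compiler class.
struct Compiler
{
    CompilerInfo info;

    // Returns true iff the given LCLHEAP node must zero its allocation:
    // either the whole method zero-initializes locals, or this particular
    // node carries the MUSTINIT flag.
    constexpr bool gtMustZeroLocalloc(const GenTree& tree) const
    {
        return info.compInitMem || ((tree.gtFlags & GTF_LCLHEAP_MUSTINIT) != 0);
    }
};
```

Under these assumptions, a backend that tests only `info.compInitMem` would skip zeroing for a flagged node whenever `compInitMem` is false, which is the gap the wasm fix above describes. Routing every backend through one helper removes the chance of the two conditions drifting apart.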
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
This PR extends CoreCLR JIT object stack allocation for newarr to support cases where the array length isn’t a compile-time constant by routing allocation through a localloc-based expansion (with a runtime size check that can fall back to heap allocation). It also adds plumbing to ensure such locallocs are zero-initialized even when the method doesn’t have init-mem semantics.
Changes:
- Add an objectalloc “localloc” path for stack-allocated arrays with non-constant (and optionally in-loop) lengths, controlled by new JIT config switches.
- Teach stack-array helper expansion to generate a runtime size computation + conditional localloc vs heapalloc split, using a new well-known arg for element size.
- Introduce GTF_LCLHEAP_MUSTINIT and propagate “must zero” semantics through LSRA and codegen via Compiler::gtMustZeroLocalloc.
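The runtime decision described in the overview (compute a size, then pick localloc or the heap) might look roughly like the sketch below. Every constant and name here is a hypothetical placeholder: the header size, the threshold, and the `AllocKind` enum are assumptions for illustration, and the sketch further assumes that a negative length is simply routed to the heap path so that the allocation helper can report the error. The PR itself expresses this as a CFG split in the JIT's IR rather than as source-level code.

```cpp
#include <cstdint>

// All three values are assumptions for illustration, not taken from the PR.
constexpr uint64_t kPointerSize       = 8;   // stands in for TARGET_POINTER_SIZE on a 64-bit target
constexpr uint64_t kArrayHeaderSize   = 16;  // hypothetical array header size
constexpr uint64_t kMaxStackArraySize = 512; // hypothetical runtime size threshold

enum class AllocKind
{
    Stack, // localloc path
    Heap   // fallback to the normal allocation helper
};

// Standard align-up; alignment must be a power of two.
constexpr uint64_t AlignUp(uint64_t size, uint64_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Total bytes needed: header plus payload, rounded up to pointer size.
// The arithmetic is done in 64 bits so a 32-bit length cannot overflow it.
constexpr uint64_t StackArraySize(uint32_t length, uint32_t elemSize)
{
    return AlignUp(kArrayHeaderSize + uint64_t(length) * elemSize, kPointerSize);
}

// Runtime check: stack-allocate only when the length is non-negative and the
// rounded size fits under the threshold; otherwise fall back to the heap.
constexpr AllocKind ChooseAllocation(int32_t length, uint32_t elemSize)
{
    return (length >= 0 && StackArraySize(uint32_t(length), elemSize) <= kMaxStackArraySize)
               ? AllocKind::Stack
               : AllocKind::Heap;
}
```

With these placeholder values, `ChooseAllocation(4, 4)` needs 32 bytes and selects the stack path, while `ChooseAllocation(1000, 4)` needs 4016 bytes and falls back to the heap. The element size is passed separately from the length, which mirrors the role the overview gives to the new well-known arg.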
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/coreclr/jit/objectalloc.h | Adds state/config fields and a localloc morph helper declaration; extends stack-allocation eligibility API with lengthKnown. |
| src/coreclr/jit/objectalloc.cpp | Implements localloc morphing for runtime-sized (and in-loop) newarr, and adds an eligibility guard for runtime-sized arrays with GC elements. |
| src/coreclr/jit/morph.cpp | Adds display support for new WellKnownArg::StackArrayElemSize. |
| src/coreclr/jit/lsraxarch.cpp | Updates LCLHEAP reg/zero-init decisions to use gtMustZeroLocalloc / GTF_LCLHEAP_MUSTINIT. |
| src/coreclr/jit/lsrariscv64.cpp | Switches localloc “needs init” checks to gtMustZeroLocalloc. |
| src/coreclr/jit/lsraloongarch64.cpp | Switches localloc “needs init” checks to gtMustZeroLocalloc. |
| src/coreclr/jit/lsraarm64.cpp | Switches localloc “needs init” checks to gtMustZeroLocalloc. |
| src/coreclr/jit/lsraarm.cpp | Switches localloc “needs init” checks to gtMustZeroLocalloc. |
| src/coreclr/jit/jitmetadatalist.h | Adds a new JIT metric counter for localloc-allocated arrays. |
| src/coreclr/jit/jitconfigvalues.h | Adds config knobs for enabling localloc-based stack array allocation and permitting it in loops. |
| src/coreclr/jit/helperexpansion.cpp | Implements runtime-sized stack array expansion into size-check + localloc/heapalloc CFG split, and header initialization for localloc case. |
| src/coreclr/jit/gentree.h | Adds GTF_LCLHEAP_MUSTINIT and WellKnownArg::StackArrayElemSize. |
| src/coreclr/jit/gentree.cpp | Adds formatting for the new well-known arg in arg debug messages. |
| src/coreclr/jit/compiler.h | Adds Compiler::gtMustZeroLocalloc helper to centralize localloc zero-init requirements. |
| src/coreclr/jit/codegenxarch.cpp | Uses gtMustZeroLocalloc in localloc codegen, and records localloc usage. |
| src/coreclr/jit/codegenwasm.cpp | Uses gtMustZeroLocalloc in localloc codegen. |
| src/coreclr/jit/codegenriscv64.cpp | Uses gtMustZeroLocalloc in localloc codegen. |
| src/coreclr/jit/codegenloongarch64.cpp | Uses gtMustZeroLocalloc in localloc codegen. |
| src/coreclr/jit/codegenarm64.cpp | Uses gtMustZeroLocalloc in localloc codegen. |
| src/coreclr/jit/codegenarm.cpp | Uses gtMustZeroLocalloc in localloc codegen. |
| src/coreclr/jit/codegen.h | Adds a genLocallocUsed flag to CodeGen state. |
basicBlockHasBackwardJump and basicBlockInHandler in MorphAllocObjNodes are no longer used after the morph loop body was refactored into MorphAllocObjNodeHelperArr (which queries the block flags directly). MSVC tolerates this via /wd4189; clang/gcc on Linux/Mac warn. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The pointer-size round-up in fgExpandStackArrayAllocation was using (size + TARGET_POINTER_SIZE) & ~(TARGET_POINTER_SIZE - 1) which over-allocates by one pointer when the size is already aligned, and can push already-aligned sizes over the runtime stack threshold. The standard align-up formula is (size + TPS - 1) & ~(TPS - 1). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
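The difference between the two expressions in this commit message is easy to check in isolation. The sketch below assumes an 8-byte pointer in place of TARGET_POINTER_SIZE; the function names are hypothetical and exist only for the comparison.

```cpp
#include <cstddef>

// Stands in for TARGET_POINTER_SIZE; 8 is assumed for a 64-bit target.
constexpr std::size_t kPointerSize = 8;

// The old expression: adds a whole pointer before masking, so a size that is
// already aligned gets bumped to the next multiple.
constexpr std::size_t RoundUpOld(std::size_t size)
{
    return (size + kPointerSize) & ~(kPointerSize - 1);
}

// The standard align-up: adds (alignment - 1) before masking, so an
// already-aligned size is returned unchanged.
constexpr std::size_t RoundUpFixed(std::size_t size)
{
    return (size + kPointerSize - 1) & ~(kPointerSize - 1);
}

// size | old | fixed
//   16 |  24 |    16   <- over-allocation by one pointer
//   17 |  24 |    24
//   24 |  32 |    24   <- over-allocation by one pointer
```

The two forms agree on every unaligned size and differ only on aligned ones, which is why the bug shows up as a size check failing at the threshold boundary rather than as corrupt memory.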
Comment on lines +3090 to +3092

    GenTree* const zeroInit = gtNewStoreLclVarNode(frameRunningTotalLclNum, gtNewIconNode(0, TYP_I_IMPL));
    Statement* const zeroInitStmt = fgNewStmtFromTree(zeroInit);
    fgInsertStmtAtBeg(fgFirstBB, zeroInitStmt);
Comment on lines +3116 to +3120

    if (size->isContainedIntOrIImmed())
    {
        // The size node being a contained constant means that Lower has taken care of
        // zeroing the memory if compInitMem is true.
        needsZeroing = false;
        initMem = false;
Keep conditional localloc locals possibly heap-pointing so heap fallback references remain GC-reportable, and avoid platform-specific allocation-kind assertions outside Windows. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ditional2-local

# Conflicts:
# src/coreclr/inc/jiteeversionguid.h
Preserve MUSTINIT zeroing for arm64 contained locallocs, update frame-running-total statement side effects, and keep conditional localloc heap-fallback locals from being treated as definitely stack-only. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ensure in-loop stack allocation candidates embedded in uses are split and expanded through the conditional localloc helper instead of being skipped. Recompute split CFG weights so checked profile validation stays consistent. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Handle embedded conditional localloc uses by introducing a result temp when the newarr is not already stored at the statement root. Keep native-int subtraction results typed as native ints when propagating byref types, and keep GC-element arrays out of conditional localloc because their GC slots cannot be reported from variable localloc storage. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Stop scanning a block after conditional localloc expansion moves the current statement into new CFG blocks. Continue scanning after fixed stack-array expansions so their pseudo arguments are expanded before lowering. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep conditional localloc replacements typed as byrefs and materialize the x86 align8 adjusted address before replacement. Also avoid continuing the current block statement scan after a localloc expansion moves the statement to new CFG blocks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment on lines +2 to +8

    <PropertyGroup>
      <!-- Needed for CLRTestEnvironmentVariable -->
      <RequiresProcessIsolation>true</RequiresProcessIsolation>
      <DebugType>None</DebugType>
      <Optimize>True</Optimize>
      <JitOptimizationSensitive>true</JitOptimizationSensitive>
    </PropertyGroup>
Comment on lines +207 to +210

    [ActiveIssue("needs triage", TestRuntimes.Mono)]
    [Fact]
    public static int TestNegative() => CallTestAndVerifyAllocation(VariableLengthNegative, 0, AllocationKind.Undefined, throws: true);
Comment on lines +211 to +214

    [ActiveIssue("needs triage", TestRuntimes.Mono)]
    [Fact]
    public static int TestIntMin() => CallTestAndVerifyAllocation(VariableLengthIntMin, 0, AllocationKind.Undefined, throws: true);
Comment on lines +215 to +217

    [ActiveIssue("needs triage", TestRuntimes.Mono)]
    [Fact]
    public static int TestHuge() => CallTestAndVerifyAllocation(VariableLengthHuge, 0, AllocationKind.Undefined, throws: true);
…onditional2-local
This was referenced Jul 2, 2026
No description provided.