Support Write-Thru of EH variables in LSRA#543
Conversation
|
@dotnet/jit-contrib PTAL |
360d67b to
67c62ea
Compare
sandreenko
left a comment
There was a problem hiding this comment.
I expect that change has an impact on throughput and memory consumption, how big is it?
The implementation looks good, but I am scared to have so many places where we check writeThru, that means it will be very easy to forget some of them.
src/coreclr/src/jit/lsra.h
Outdated
There was a problem hiding this comment.
should not it be declared as a bitfield?
There was a problem hiding this comment.
I moved this field from the base type (Referenceable) to Interval, and changing it to a bitfield seemed to have a minor negative impact.
src/coreclr/src/jit/compiler.h
Outdated
There was a problem hiding this comment.
the same question why it is not a bitfield.
There was a problem hiding this comment.
We don't currently have any bitfields on Compiler, and given that there is only one instance per compilation, it seems that the efficiency of querying a byte overrides the storage impact.
src/coreclr/src/jit/instr.cpp
Outdated
src/coreclr/src/jit/instr.cpp
Outdated
There was a problem hiding this comment.
what do we actually need to know here:
- is the value valid in memory?
- is the value valid in a register?
looks like the second, so maybe rename isSpilledValue to validInReg?
There was a problem hiding this comment.
Makes sense. With the removal of legacy jit, I'm pretty sure this code is only used for lclVar spilling, so I think I'll try to simplify it.
src/coreclr/src/jit/lclvars.cpp
Outdated
There was a problem hiding this comment.
What will happen if lsra::enregisterLocalVars and compiler::lvaEnregEHVars have different values, for example enregisterLocalVars==false and lvaEnregEHVars==true, will we try allocate registers for such variables or not?
If now why do we need a separate flag?
There was a problem hiding this comment.
Although I hadn't added it yet, I anticipated having a COMPlus variable to separately control this (which I'm hopefully about to push as an update to this PR). However, LSRA uses the condition above to set enregisterLocalVars so it wouldn't be possible to have lvaEnregEHVars be true and enregisterLocalVars be false.
There was a problem hiding this comment.
maybe // if only register homed go to the next one?
There was a problem hiding this comment.
How about // If this arg is never on the stack, go to the next one.?
There was a problem hiding this comment.
Could you please update that comment so it includes normal spills, not only lvLiveInOutOfHndlr, because, at least for me, that was not obvious after the first read.
There was a problem hiding this comment.
Agree, this comment is a bit confusing...
There was a problem hiding this comment.
This still seems confusing, the comment is talking about GTF_SPILL but the code you changed here does not look at this flag.
There was a problem hiding this comment.
I've pushed another change to this comment, hopefully making it clear.
There was a problem hiding this comment.
I think that's better, yes. @sandreenko do you agree?
src/coreclr/src/jit/lsra.cpp
Outdated
There was a problem hiding this comment.
That change confuses me, but since hasEHBoundaryIn was not used before that PR there could not be any regressions from that.
There was a problem hiding this comment.
What this does is change the block to have no predecessor, so that it doesn't have to deal with mismatched locations of EH vars across the boundary. However, this change actually causes some small diffs when the EH write-thru is disabled, because it creates mismatches on the non-EH edges. I'm looking into how complex it would be to eliminate this change, but it may be excessively complex. The crossgen diffs over all jits & altjits for frameworks & tests shows only 4 bytes of diff each over two large methods for arm32, and 40 bytes of diff for one 4020 byte method for x64/ux altjit.
src/coreclr/src/jit/lsra.cpp
Outdated
There was a problem hiding this comment.
I think that was part was already merged in #679, why does github show that as a diff?
There was a problem hiding this comment.
I believe it's because this is actually showing the original change that I extracted for that PR. I haven't yet pushed a rebased version of this PR.
src/coreclr/src/jit/lsrabuild.cpp
Outdated
There was a problem hiding this comment.
Could you please explain how this PR relates to the deletion of block != compiler->fgFirstBB condition here?
There was a problem hiding this comment.
With this PR, we have blocks with an incoming EH boundary that may have live-in candidate vars. Previously those vars would have no live-in candidate vars, by construction, so we didn't need to worry about whether they needed dummy defs. Now that we have live-in EH vars, we don't want dummy defs for those (they should always be on stack on entry), so they are treated as having no pred block (like fgFirstBB), and that becomes the right condition for determining when we don't want dummy defs.
The Throughput tuning commit brings this from roughly a .05% throughput loss on x64 (in the noise on x86) to a .05% improvement on x64 and a .007% improvement (barely above the noise) on x86 for crossgen of SPC.dll
I'm not sure I fully understand. This is pretty fundamental, as these variables need to be handled differently. |
AndyAyersMS
left a comment
There was a problem hiding this comment.
Some initial comments. Still need to look through lsra.cpp ...
src/coreclr/src/jit/compiler.h
Outdated
There was a problem hiding this comment.
Can you update the text for this when we dump the local var table? In existing code this is debug only and only gets dumped if the local is DNER. Would be nice to have it show up more prominently.
There was a problem hiding this comment.
I've added an " EH" annotation unconditionally, and left the 'H' character in the DNER dump.
There was a problem hiding this comment.
Agree, this comment is a bit confusing...
There was a problem hiding this comment.
Should we consider any of these lvLiveInOutofHndlr asserts as candidates for noway_assert?
There was a problem hiding this comment.
I've pretty much left the assert\noway_assert distinction as-is - these are generally just either expanding or contracting the condition for an assert.
src/coreclr/src/jit/lclvars.cpp
Outdated
There was a problem hiding this comment.
Is there some other way to accomplish this? I never like to see us bias this sort of accounting if we can avoid it.
Say down the road we want to use weighted ref counts for other optimizations -- for instance trying to allocate the most frequently accessed locals at small FP or SP offsets -- we'd want these weighted ref counts to reflect our best model of reality.
There was a problem hiding this comment.
This is tricky because of the way we overload the ref counts for use by optimization and register allocation - the register allocator uses the weight to determine the value of allocating a register.
Note that this is only the IsLir() path.
There was a problem hiding this comment.
Would be good to post diffs for these (maybe in a gist). Also when these "manual write-thru" changes were added there was some kind of perf testing -- have you looked at trying to revisit those tests?
There was a problem hiding this comment.
Ditto for the other Fx edits below...
There was a problem hiding this comment.
I am unaware of perf testing that was done when those mitigations were added - @stephentoub do you know how those might have been tested?
|
Possible follow-up: seems like this change might allow us to get rid of the Also if you're interested in stress-testing, you might try the |
|
@AndyAyersMS - thanks. Right now I'm working on getting this checked in disabled. Turns out there are some minor regressions due to the way block boundaries are handled. I'm trying to get those eliminated. At that point it would be interesting to work on the stress testing; I think the other is best to leave for later. |
src/coreclr/src/jit/lsrabuild.cpp
Outdated
There was a problem hiding this comment.
Can you (perhaps in a comment) explain this part of the change?
src/coreclr/src/jit/lsrabuild.cpp
Outdated
There was a problem hiding this comment.
I take it this is the matching bit of logic for line 1185?
There was a problem hiding this comment.
Yes, I'll add a pointer
|
Have now looked over everything but lsra.cpp; not sure how much help I can be with the changes there. |
cf0b78a to
9cbe9af
Compare
|
@dotnet/jit-contrib - This is ready for review. It is disabled by default and there are zero diffs. |
|
I plan to rebase and run jitstress jitstressregs once my two stress fixes are merged. |
9cbe9af to
0bb9330
Compare
|
/azp run runtime-coreclr jitstress2-jitstressregs |
|
No pipelines are associated with this pull request. |
|
/azp list |
There was a problem hiding this comment.
Would these changes from else to if (isInMemory) be better as else if (isInMemory) with another else with a noway_assert()?
There was a problem hiding this comment.
No; the whole point of this change is that a variable can now be valid both in register and in memory, because when we "write-thru" a definition of an EH exposed variable it can remain live in the register as well.
|
@dotnet/jit-contrib ping |
AndyAyersMS
left a comment
There was a problem hiding this comment.
Left a few more notes; I think there are still some unaddressed comments from earlier too.
There was a problem hiding this comment.
I don't think you need this, the runtime doesn't care...
There was a problem hiding this comment.
Ah, yes - thanks.
src/coreclr/src/jit/lsra.h
Outdated
There was a problem hiding this comment.
Cut and paste issue in comment?
There was a problem hiding this comment.
Looks like same just above too, in calleeSaveRegs
Mark EH variables (those that are live in or out of exception regions) only as lvLiveInOutOfHndlr, not necessarily lvDoNotEnregister During register allocation, mark these as write-thru, and mark all defs as write-thru, ensuring that the stack value is always valid. Mark those defs with GTF_SPILLED (this the "reload" flag and is not currently used for pure defs) to indicate that it should be kept in the register. Mark blocks that enter EH regions as having no predecessor, and set the location of all live-in vars to be on the stack. Change genFnPrologCalleeRegArgs to store EH vars also to the stack if they have a register assignment.
…cal register RefPositions during allocation.
0bb9330 to
c8c0f17
Compare
I believe I addressed all of the previous comments, along with the latest; if not, could you point out what's left? |
Looks like just this one. |
| { | ||
| #ifdef DEBUG | ||
| if (VarSetOps::IsMember(this, codeGen->gcInfo.gcVarPtrSetCur, bornVarIndex)) | ||
| if (!varDsc->lvLiveInOutOfHndlr) |
There was a problem hiding this comment.
Add a comment explaining what/why we are doing here..
|
|
||
| #if defined(TARGET_ARM) | ||
| if (storeType == TYP_DOUBLE) | ||
| if ((storeType == TYP_DOUBLE) && !regArgTab[argNum].writeThru) |
There was a problem hiding this comment.
Add a comment regarding why we special case writeThru here
|
I believe that I've addressed all the PR feedback. |
AndyAyersMS
left a comment
There was a problem hiding this comment.
LGTM.
Bruce and I were talking about creating a rolling test where we could enable off by default changes like this so they don't regress while we're working on getting them to be on by default.
Mark EH variables (those that are live in or out of exception regions) only as lvLiveInOutOfHndlr, not necessarily lvDoNotEnregister.
During register allocation, mark these as write-thru, and mark all defs as write-thru, ensuring that the stack value is always valid.
Mark those defs with GTF_SPILLED (this the "reload" flag and is not currently used for pure defs) to indicate that it should be kept in the register.
Mark blocks that enter EH regions as having no predecessor, and set the location of all live-in vars to be on the stack.
Change genFnPrologCalleeRegArgs to store EH vars also to the stack if they have a register assignment.