Support Write-Thru of EH variables in LSRA #543

Merged
CarolEidt merged 7 commits into dotnet:master from CarolEidt:EHWriteThru on Feb 19, 2020

Conversation

@CarolEidt
Contributor

Mark EH variables (those that are live in or out of exception regions) only as lvLiveInOutOfHndlr, not necessarily lvDoNotEnregister.
During register allocation, mark these as write-thru, and mark all defs as write-thru, ensuring that the stack value is always valid.
Mark those defs with GTF_SPILLED (this is the "reload" flag and is not currently used for pure defs) to indicate that the value should be kept in the register.
Mark blocks that enter EH regions as having no predecessor, and set the location of all live-in vars to be on the stack.
Change genFnPrologCalleeRegArgs to store EH vars also to the stack if they have a register assignment.
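
As a rough illustration of the write-thru idea in the list above, here is a minimal toy model. It is a sketch only: `EhVar`, `writeThruDef`, `useOnNormalPath`, and `useOnHandlerEntry` are hypothetical names invented for this example and are not part of the JIT. It assumes a variable with one stack home and at most one register copy.

```cpp
#include <cassert>

// Toy model only. EhVar and the helpers below are hypothetical names,
// not types or functions from the JIT.
struct EhVar
{
    bool hasReg    = false; // true if the allocator gave the variable a register
    int  regValue  = 0;     // value held in that register (meaningful only if hasReg)
    int  stackSlot = 0;     // value in the variable's stack home
};

// A write-thru def updates the register copy (if there is one) and always
// stores to the stack, so the stack home stays valid after every def.
inline void writeThruDef(EhVar& v, int value)
{
    if (v.hasReg)
    {
        v.regValue = value;
    }
    v.stackSlot = value;
    // Invariant of the model: the register copy never disagrees with the stack.
    assert(!v.hasReg || (v.regValue == v.stackSlot));
}

// On the normal path a use can read the register copy when one exists.
inline int useOnNormalPath(const EhVar& v)
{
    return v.hasReg ? v.regValue : v.stackSlot;
}

// On entry to a handler, register contents are not relied on in this model,
// so the value is always read from the stack home.
inline int useOnHandlerEntry(const EhVar& v)
{
    return v.stackSlot;
}
```

In this model a def costs one extra store, and in exchange uses outside the handler can stay in a register.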

@CarolEidt CarolEidt added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 5, 2019
@CarolEidt
Contributor Author

@dotnet/jit-contrib PTAL

Contributor

@sandreenko sandreenko left a comment

I expect that change has an impact on throughput and memory consumption, how big is it?

The implementation looks good, but I am scared to have so many places where we check writeThru, that means it will be very easy to forget some of them.

Contributor

Shouldn't it be declared as a bitfield?

Contributor Author

I moved this field from the base type (Referenceable) to Interval, and changing it to a bitfield seemed to have a minor negative impact.

Contributor

The same question: why is it not a bitfield?

Contributor Author

We don't currently have any bitfields on Compiler, and given that there is only one instance per compilation, it seems that the efficiency of querying a byte overrides the storage impact.
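
The trade-off described here can be sketched as follows. This is an illustration under assumptions, not code from `Compiler`: the struct and field names are placeholders, and the exact sizes and generated instructions depend on the compiler and ABI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flag sets. The field names are placeholders, not Compiler members.
struct FlagsAsBytes
{
    bool flagA; // each flag occupies its own byte
    bool flagB;
    bool flagC;
    bool flagD;
};

struct FlagsAsBits
{
    bool flagA : 1; // flags share storage, one bit each
    bool flagB : 1;
    bool flagC : 1;
    bool flagD : 1;
};

// Querying a byte-sized flag is typically a single byte load.
inline bool queryByte(const FlagsAsBytes& f)
{
    return f.flagC;
}

// Querying a bit-field flag typically needs a load plus a mask or bit test,
// and a write needs a read-modify-write of the shared storage unit.
inline bool queryBit(const FlagsAsBits& f)
{
    return f.flagC;
}

inline std::size_t sizeAsBytes()
{
    return sizeof(FlagsAsBytes);
}

inline std::size_t sizeAsBits()
{
    return sizeof(FlagsAsBits);
}
```

With a single instance per compilation, the few bytes the bit-field layout saves are paid for on every query, which is the argument made above for keeping a plain byte.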

Contributor

typo: memroy

Contributor

what do we actually need to know here:

  1. is the value valid in memory?
  2. is the value valid in a register?

looks like the second, so maybe rename isSpilledValue to validInReg?

Contributor Author

Makes sense. With the removal of legacy jit, I'm pretty sure this code is only used for lclVar spilling, so I think I'll try to simplify it.

Contributor

What will happen if lsra::enregisterLocalVars and compiler::lvaEnregEHVars have different values? For example, if enregisterLocalVars==false and lvaEnregEHVars==true, will we try to allocate registers for such variables or not?
If not, why do we need a separate flag?

Contributor Author

Although I hadn't added it yet, I anticipated having a COMPlus variable to separately control this (which I'm hopefully about to push as an update to this PR). However, LSRA uses the condition above to set enregisterLocalVars so it wouldn't be possible to have lvaEnregEHVars be true and enregisterLocalVars be false.

Contributor

maybe // if only register homed go to the next one?

Contributor Author

How about // If this arg is never on the stack, go to the next one.?

Contributor

Could you please update that comment so it includes normal spills, not only lvLiveInOutOfHndlr, because, at least for me, that was not obvious after the first read.

Member

Agree, this comment is a bit confusing...

Member

This still seems confusing, the comment is talking about GTF_SPILL but the code you changed here does not look at this flag.

Contributor Author

I've pushed another change to this comment, hopefully making it clear.

Member

I think that's better, yes. @sandreenko do you agree?

Contributor

That change confuses me, but since hasEHBoundaryIn was not used before that PR there could not be any regressions from that.

Contributor Author

What this does is change the block to have no predecessor, so that it doesn't have to deal with mismatched locations of EH vars across the boundary. However, this change actually causes some small diffs when the EH write-thru is disabled, because it creates mismatches on the non-EH edges. I'm looking into how complex it would be to eliminate this change, but it may be excessively complex. The crossgen diffs over all jits & altjits for frameworks & tests show only 4 bytes of diff each over two large methods for arm32, and 40 bytes of diff for one 4020-byte method for x64/ux altjit.

Contributor

I think that part was already merged in #679; why does GitHub show it as a diff?

Contributor Author

I believe it's because this is actually showing the original change that I extracted for that PR. I haven't yet pushed a rebased version of this PR.

Contributor

Could you please explain how this PR relates to the deletion of block != compiler->fgFirstBB condition here?

Contributor Author

With this PR, we have blocks with an incoming EH boundary that may have live-in candidate vars. Previously those blocks would have no live-in candidate vars, by construction, so we didn't need to worry about whether they needed dummy defs. Now that we have live-in EH vars, we don't want dummy defs for those (they should always be on the stack on entry), so these blocks are treated as having no pred block (like fgFirstBB), and that becomes the right condition for determining when we don't want dummy defs.
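
The condition described above might be modelled roughly like this. It is a simplified sketch, not the LSRA code: `Block`, `treatedAsHavingNoPred`, `mayNeedDummyDefs`, and `ehVarLocationOnEntry` are hypothetical names, and the real allocator tracks far more state per block.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the allocator's view of a block.
enum class VarLocation
{
    Stack,
    Register
};

struct Block
{
    bool isFirstBlock;    // models "block == fgFirstBB"
    bool hasEHBoundaryIn; // models a block entered across an EH boundary
};

// Blocks entered across an EH boundary are handled like the first block:
// there is no predecessor whose variable locations they inherit.
inline bool treatedAsHavingNoPred(const Block& b)
{
    return b.isFirstBlock || b.hasEHBoundaryIn;
}

// Dummy defs are only considered for blocks that do have a predecessor.
inline bool mayNeedDummyDefs(const Block& b)
{
    return !treatedAsHavingNoPred(b);
}

// A live-in EH var is on the stack on entry to an EH-boundary block;
// otherwise it keeps whatever location it had at the end of the predecessor.
inline VarLocation ehVarLocationOnEntry(const Block& b, VarLocation locationAtEndOfPred)
{
    return b.hasEHBoundaryIn ? VarLocation::Stack : locationAtEndOfPred;
}
```

In this model the old `block != fgFirstBB` test is replaced by a check for whether the block has a predecessor at all, which covers both cases.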

@CarolEidt
Contributor Author

> I expect that change has an impact on throughput and memory consumption, how big is it?

The "Throughput tuning" commit brings this from roughly a 0.05% throughput loss on x64 (in the noise on x86) to a 0.05% improvement on x64 and a 0.007% improvement (barely above the noise) on x86 for crossgen of SPC.dll.

> The implementation looks good, but I am scared to have so many places where we check writeThru, that means it will be very easy to forget some of them.

I'm not sure I fully understand. This is pretty fundamental, as these variables need to be handled differently.

Member

@AndyAyersMS AndyAyersMS left a comment

Some initial comments. Still need to look through lsra.cpp ...

Member

Can you update the text for this when we dump the local var table? In existing code this is debug only and only gets dumped if the local is DNER. Would be nice to have it show up more prominently.

Contributor Author

I've added an " EH" annotation unconditionally, and left the 'H' character in the DNER dump.

Member

Should we consider any of these lvLiveInOutOfHndlr asserts as candidates for noway_assert?

Contributor Author

I've pretty much left the assert/noway_assert distinction as-is; these changes are generally just either expanding or contracting the condition for an assert.

Member

Is there some other way to accomplish this? I never like to see us bias this sort of accounting if we can avoid it.

Say down the road we want to use weighted ref counts for other optimizations -- for instance trying to allocate the most frequently accessed locals at small FP or SP offsets -- we'd want these weighted ref counts to reflect our best model of reality.

Contributor Author

This is tricky because of the way we overload the ref counts for use by optimization and register allocation - the register allocator uses the weight to determine the value of allocating a register.
Note that this is only the IsLir() path.

Member

Would be good to post diffs for these (maybe in a gist). Also when these "manual write-thru" changes were added there was some kind of perf testing -- have you looked at trying to revisit those tests?

Member

Ditto for the other Fx edits below...

Contributor Author

I am unaware of perf testing that was done when those mitigations were added - @stephentoub do you know how those might have been tested?

Member

@AndyAyersMS AndyAyersMS Dec 17, 2019

There are some perf tests in dotnet/coreclr#15629.

@AndyAyersMS
Member

AndyAyersMS commented Dec 19, 2019

Possible follow-up: seems like this change might allow us to get rid of the lvVolatileHint computations we do in lvaMarkLclRefs and the code to copy volatile hinted locals in optAddCopies.

Also if you're interested in stress-testing, you might try the mutate-test tool from jitutils, it can introduce EH into all methods in a test case.

@CarolEidt
Contributor Author

@AndyAyersMS - thanks. Right now I'm working on getting this checked in disabled. Turns out there are some minor regressions due to the way block boundaries are handled. I'm trying to get those eliminated. At that point it would be interesting to work on the stress testing; I think the other is best to leave for later.

Member

Can you (perhaps in a comment) explain this part of the change?

Member

I take it this is the matching bit of logic for line 1185?

Contributor Author

Yes, I'll add a pointer

@AndyAyersMS
Member

Have now looked over everything but lsra.cpp; not sure how much help I can be with the changes there.

@CarolEidt CarolEidt force-pushed the EHWriteThru branch 3 times, most recently from cf0b78a to 9cbe9af Compare January 31, 2020 21:22
@CarolEidt
Contributor Author

@dotnet/jit-contrib - This is ready for review. It is disabled by default and there are zero diffs.

@CarolEidt
Contributor Author

I plan to rebase and run jitstress jitstressregs once my two stress fixes are merged.

@CarolEidt
Contributor Author

/azp run runtime-coreclr jitstress2-jitstressregs

@azure-pipelines

No pipelines are associated with this pull request.

@echesakov
Contributor

/azp list

Contributor

Would these changes from else to if (isInMemory) be better as else if (isInMemory) with another else with a noway_assert()?

Contributor Author

No; the whole point of this change is that a variable can now be valid both in register and in memory, because when we "write-thru" a definition of an EH exposed variable it can remain live in the register as well.
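
A small sketch of why the two checks have to be independent `if`s rather than an `else if` chain. The names (`LiveState`, `liveLocations`) are hypothetical and the model is reduced to two booleans. It is not the code under review.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced model of where a variable's current value is valid.
struct LiveState
{
    bool isInReg;    // the value is valid in a register
    bool isInMemory; // the value is valid in its stack home
};

// Collects every location that must be reported for the variable.
inline std::vector<std::string> liveLocations(const LiveState& s)
{
    std::vector<std::string> locations;
    if (s.isInReg)
    {
        locations.push_back("register");
    }
    // Deliberately a second `if`, not `else` or `else if`: after a write-thru
    // def the value is valid in the register and on the stack at the same time.
    if (s.isInMemory)
    {
        locations.push_back("stack");
    }
    return locations;
}
```

With an `else if`, the both-valid case in this model would report only the register and silently drop the stack location.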

@CarolEidt
Contributor Author

@dotnet/jit-contrib ping

Member

@AndyAyersMS AndyAyersMS left a comment

Left a few more notes; I think there are still some unaddressed comments from earlier too.

Member

Typo 'by aby'

Member

I don't think you need this, the runtime doesn't care...

Contributor Author

Ah, yes - thanks.

Member

Cut and paste issue in comment?

Member

Looks like same just above too, in calleeSaveRegs

…cal register RefPositions during allocation.
@CarolEidt
Contributor Author

> I think there are still some unaddressed comments from earlier too.

I believe I addressed all of the previous comments, along with the latest; if not, could you point out what's left?

@AndyAyersMS
Member

> could you point out what's left?

Looks like just this one.

{
#ifdef DEBUG
if (VarSetOps::IsMember(this, codeGen->gcInfo.gcVarPtrSetCur, bornVarIndex))
if (!varDsc->lvLiveInOutOfHndlr)

Contributor

Add a comment explaining what we are doing here and why.


#if defined(TARGET_ARM)
- if (storeType == TYP_DOUBLE)
+ if ((storeType == TYP_DOUBLE) && !regArgTab[argNum].writeThru)

Contributor

Add a comment explaining why we special-case writeThru here.

@CarolEidt
Contributor Author

I believe that I've addressed all the PR feedback.

Member

@AndyAyersMS AndyAyersMS left a comment

LGTM.

Bruce and I were talking about creating a rolling test where we could enable off by default changes like this so they don't regress while we're working on getting them to be on by default.

@CarolEidt CarolEidt merged commit 3be5238 into dotnet:master Feb 19, 2020
@CarolEidt CarolEidt deleted the EHWriteThru branch July 16, 2020 16:58
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI


6 participants