ARM64-SVE: Implement IF_SVE_DW_2A, IF_SVE_DW_2B, IF_SVE_EB_1B #97800
amanasifkhalid merged 3 commits into dotnet:main
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Diff results for #97800
Throughput diffs
Throughput diffs for linux/arm64 ran on linux/x64
Overall (-0.00% to +0.01%)
MinOpts (+0.00% to +0.04%)
    }

    case IF_SVE_EB_1B: // ........xx...... ...........ddddd -- SVE broadcast integer immediate (unpredicated)
        // ins is MOV for this encoding, as it is the preferred disassembly, so pass FMOV to emitInsCodeSve
This is annoying, and slightly different to aliasing in other instructions.
Alternatively, keep it as FMOV in emitIns_R() and then convert to MOV in emitDispInsHelp().
I'm not sure either way is better. Happy to keep as is in here.
> Alternatively, keep it as FMOV in emitIns_R() and then convert to MOV in emitDispInsHelp().
I'd like to keep the instruction as FMOV in emitIns_R, but in emitDispInsHelp, in order to print it as a MOV, we would have to add a check for this case at the beginning of the method so we don't print the instruction before modifying it. My current approach is hacky, but it allows us to contain this behavior to the relevant switch case.
This is fine to me. Comment describes why so it's all good.
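To illustrate the trade-off discussed in this thread, here is a minimal sketch of the "store MOV, encode as FMOV" approach. The enum values, `InstrDesc`, `encodingKeyFor`, and `displayName` are hypothetical stand-ins, not the JIT's real types or functions; the point is only that the alias handling stays inside one switch case.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins for the JIT's instruction and format enums.
enum Instruction { INS_sve_mov, INS_sve_fmov };
enum InsFormat { IF_OTHER, IF_SVE_EB_1B };

struct InstrDesc
{
    Instruction ins; // what the disassembler will print
    InsFormat   fmt; // which encoding format this descriptor uses
};

// Returns the instruction whose encoding-table row should be consulted.
Instruction encodingKeyFor(const InstrDesc& id)
{
    switch (id.fmt)
    {
        case IF_SVE_EB_1B:
            // The descriptor holds MOV (the preferred disassembly), but in this
            // sketch the encoding is registered under FMOV, so swap it here only.
            assert(id.ins == INS_sve_mov);
            return INS_sve_fmov;

        default:
            return id.ins;
    }
}

// Display needs no special case: it prints whatever the descriptor stores.
std::string displayName(const InstrDesc& id)
{
    return (id.ins == INS_sve_mov) ? "mov" : "fmov";
}
```

The alternative raised above (store FMOV and rewrite it in the display helper) would move the special case out of the encoder and into the printing path instead, where it has to run before anything is printed.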
@kunalspathak this change does have some TP impact. I suspect this is from me adding the
TIHan left a comment
LGTM.
I wouldn't worry too much about the TP regressions as I think I solved a chunk of it in #97739 by separating the SVE format cases in emitOutputInstr into their own function.
We could also separate out the SVE instruction cases in the emitIns functions into their own emitInsSve which could further help mitigate any TP regressions, but we don't have to do that now.
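A rough sketch of the kind of split described here, assuming a toy dispatcher rather than the real emitter: the non-SVE cases stay in the main switch, and every SVE format falls through to a separate function. `outputInstr`, `outputSveInstr`, and the returned strings are hypothetical; only the format names come from this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of formats; IF_NON_SVE stands in for all existing cases.
enum InsFormat { IF_NON_SVE, IF_SVE_DW_2A, IF_SVE_DW_2B, IF_SVE_EB_1B };

// All SVE format cases live here, outside the main dispatcher.
std::string outputSveInstr(InsFormat fmt)
{
    switch (fmt)
    {
        case IF_SVE_DW_2A:
        case IF_SVE_DW_2B:
        case IF_SVE_EB_1B:
            return "sve";

        default:
            assert(!"unexpected format");
            return "unknown";
    }
}

// The main switch only handles non-SVE formats and delegates everything else,
// so adding more SVE cases does not grow this function.
std::string outputInstr(InsFormat fmt)
{
    switch (fmt)
    {
        case IF_NON_SVE:
            return "base";

        default:
            return outputSveInstr(fmt);
    }
}
```

Whether this helps throughput in practice depends on how the compiler lowers the switches, so it would need to be measured as in the TP diffs above.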
Thanks for the reviews!
I didn't realize we'd accumulated that much TP overhead from the SVE switch cases; >1% TP improvement is a lot...
The 1% TP improvement may not be accurate, because it runs the ARM64 path on windows/x64 and still counts the instructions generated for x64 too. The TP results to really look at are the ones for linux/arm64 running on linux/arm64, which improved modestly.
I'd be curious as well. I already did it to
Part of #94549. Note that the preferred disassembly for FMOV <Zd>.<T>, #0.0 is MOV <Zd>.<T>, 0, per the SVE docs: FMOV is a pseudo-instruction of DUP, which is aliased by MOV.
cstool output:
JitDisasm output:
cc @dotnet/arm64-contrib
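As a worked illustration of the aliasing above, the sketch below assembles the 32-bit word for FMOV <Zd>.<T>, #0.0 from the field layout shown in the IF_SVE_EB_1B format comment (element size in bits 23-22, Zd in bits 4-0). The base opcode constant is an assumption taken to be the DUP (immediate) encoding with a zero immediate; it, the restriction to H/S/D element sizes, and the helper name `encodeFmovZero` should be checked against the Arm architecture reference and are not taken from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Element size field, placed at bits 23-22 (the "xx" in the format comment).
// Assumption: only the floating-point element sizes H, S and D are valid here.
enum ElemSize : uint32_t { SIZE_H = 1, SIZE_S = 2, SIZE_D = 3 };

// Assumed base opcode for DUP <Zd>.<T>, #0 with the size and Zd fields cleared.
constexpr uint32_t kDupImmZeroBase = 0x2538C000u;

// Builds the instruction word for FMOV <Zd>.<T>, #0.0, i.e. DUP <Zd>.<T>, #0.
uint32_t encodeFmovZero(ElemSize size, uint32_t zd)
{
    assert(zd < 32); // Zd is a 5-bit register number
    return kDupImmZeroBase | (static_cast<uint32_t>(size) << 22) | zd;
}
```

Because the word is identical whichever mnemonic is used, the choice between FMOV, DUP, and MOV only affects what the disassembler prints.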