Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Issue Details
This PR does two things:
An example for 2nd:
unsafe struct MyStruct // fixed-size buffers require an unsafe context
{
public fixed byte Data[30]; // and other sizes
}
MyStruct InitMemory() => new MyStruct();
; Method Program:InitMemory():Program+MyStruct:this
G_M45940_IG01:
4883EC38 sub rsp, 56
C5F877 vzeroupper
48B878563412F0DEBC9A mov rax, 0x9ABCDEF012345678
4889442430 mov qword ptr [rsp+30H], rax
G_M45940_IG02:
- 33C0 xor eax, eax
C5F857C0 vxorps xmm0, xmm0
C5FA7F02 vmovdqu xmmword ptr [rdx], xmm0
C5FA7F420E vmovdqu xmmword ptr [rdx+0EH], xmm0
488BC2 mov rax, rdx
48B978563412F0DEBC9A mov rcx, 0x9ABCDEF012345678
48394C2430 cmp qword ptr [rsp+30H], rcx
7405 je SHORT G_M45940_IG03
E8C2CB4B5F call CORINFO_HELP_FAIL_FAST
G_M45940_IG03:
90 nop
G_M45940_IG04:
4883C438 add rsp, 56
C3 ret
-; Total bytes of code: 68
+; Total bytes of code: 66
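The two `vmovdqu` stores above cover 30 bytes with two 16-byte writes at offsets 0 and 0EH, so the second store overlaps the first by two bytes. A minimal sketch of how such offsets could be derived is shown here; `OverlappedZeroing` and `StoreOffsets` are hypothetical names used for illustration and are not the JIT's actual implementation.

```csharp
using System;
using System.Collections.Generic;

static class OverlappedZeroing
{
    // Hypothetical helper (not part of the JIT): returns the byte offset of
    // each store needed to zero `size` bytes using stores `width` bytes wide.
    // A trailing remainder is handled by one extra store that is shifted
    // back so that it ends exactly at `size`, overlapping the previous store.
    public static List<int> StoreOffsets(int size, int width)
    {
        if (width <= 0)
            throw new ArgumentOutOfRangeException(nameof(width));
        if (size < width)
            throw new ArgumentException("size must be at least one store wide", nameof(size));

        var offsets = new List<int>();
        int offset = 0;

        // Full, non-overlapping stores.
        for (; offset + width <= size; offset += width)
            offsets.Add(offset);

        // Remainder: one more store that ends at the last byte.
        if (offset < size)
            offsets.Add(size - width);

        return offsets;
    }

    public static void Main()
    {
        // 30 bytes with 16-byte (xmm) stores, as in the listing above.
        Console.WriteLine(string.Join(", ", StoreOffsets(30, 16))); // prints "0, 14"
    }
}
```

Offset 14 is 0EH, which matches the `[rdx+0EH]` store in the listing. The overlap means no scalar register is needed for the tail, which is presumably why the `xor eax, eax` could be dropped.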
@TIHan @dotnet/jit-contrib PTAL
Size regressions are due to unrolled memset, e.g.
- xor edx, edx
- lea rcx, bword ptr [rsp+38H]
- ; byrRegs +[rcx]
- mov r8d, 152
- call CORINFO_HELP_MEMSET
- ; byrRegs -[rcx]
- ; gcr arg pop 0
xor r9d, r9d
+ vxorps ymm0, ymm0
+ vmovdqu ymmword ptr[rsp+38H], ymm0
+ vmovdqu ymmword ptr[rsp+58H], ymm0
+ vmovdqu ymmword ptr[rsp+78H], ymm0
+ vmovdqu ymmword ptr[rsp+98H], ymm0
+ vmovdqu ymmword ptr[rsp+B0H], ymm0
mov dword ptr [rsp+20H], r9d
mov dword ptr [rsp+28H], r9d
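The diff above replaces a `CORINFO_HELP_MEMSET` call for a 152-byte block with five 32-byte stores. The sketch below illustrates why a higher threshold would produce that change. It assumes that the "128b" and "256b" figures in the PR description are byte counts and that the check is a simple `size <= threshold` comparison; `UnrollDecision`, `ShouldUnroll` and `StoreCount` are hypothetical names, not the JIT's API.

```csharp
using System;

static class UnrollDecision
{
    // Assumed values, taken from the PR description ("128b instead of 256b"
    // for x64) and interpreted here as bytes.
    public const int ThresholdWithoutSimd = 128;
    public const int ThresholdWithSimd = 256;

    // Hypothetical check: unroll when the block fits under the threshold,
    // otherwise fall back to the memset helper call.
    public static bool ShouldUnroll(int size, bool canUseSimd)
    {
        int threshold = canUseSimd ? ThresholdWithSimd : ThresholdWithoutSimd;
        return size <= threshold;
    }

    // Number of stores needed when the final store may overlap the previous
    // one: the size divided by the store width, rounded up.
    public static int StoreCount(int size, int width) => (size + width - 1) / width;

    public static void Main()
    {
        Console.WriteLine(ShouldUnroll(152, canUseSimd: false)); // prints "False"
        Console.WriteLine(ShouldUnroll(152, canUseSimd: true));  // prints "True"
        Console.WriteLine(StoreCount(152, 32));                  // prints "5"
    }
}
```

Under these assumptions a 152-byte block switches from the helper call to an unrolled sequence once the SIMD threshold applies. The five stores take more code bytes than the call sequence they replace, which is consistent with the reported size regression.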
TIHan
approved these changes
Mar 21, 2023
Contributor
TIHan
left a comment
The regressions make sense. LGTM.
This PR does two things:
getUnrollThreshold was always invoked with /*canUseSimd*/ false, so the threshold was 128b instead of 256b for x64
An example for 2nd:
; Method Test():Program+MyStruct:this
G_M45940_IG01:
4883EC38 sub rsp, 56
C5F877 vzeroupper
48B878563412F0DEBC9A mov rax, 0x9ABCDEF012345678
4889442430 mov qword ptr [rsp+30H], rax
G_M45940_IG02:
- 33C0 xor eax, eax ;; we don't use RAX to zero the struct
C5F857C0 vxorps xmm0, xmm0
C5FA7F02 vmovdqu xmmword ptr [rdx], xmm0
C5FA7F420E vmovdqu xmmword ptr [rdx+0EH], xmm0 ;; overlapped with previous mov
488BC2 mov rax, rdx
48B978563412F0DEBC9A mov rcx, 0x9ABCDEF012345678
48394C2430 cmp qword ptr [rsp+30H], rcx
7405 je SHORT G_M45940_IG03
E8C2CB4B5F call CORINFO_HELP_FAIL_FAST
G_M45940_IG03:
90 nop
G_M45940_IG04:
4883C438 add rsp, 56
C3 ret
-; Total bytes of code: 68
+; Total bytes of code: 66