Convert packed floating point to signed integer #2320
| // Get the low 16 bits | ||
| ctx.emit(Inst::xmm_rmi_reg(SseOpcode::Pslld, RegMemImm::imm(16), tmp)); | ||
| ctx.emit(Inst::xmm_rmi_reg(SseOpcode::Psrld, RegMemImm::imm(16), tmp)); |
I used PBLENDW in the old backend and so does V8... what do you think?
Ahh, yes. V8 uses pblendw as well. This adds an extra instruction; both pblendw and pslld/psrld have the same instruction latency. I chose not to use pblendw, though, because it requires SSE4.1 or later, while the shifts only need SSE2, which I thought was the base target for SIMD. In general I am not sure how the new backend guards lowering based on compatibility level, so for now I am lowering to the lowest common denominator. What do you think?
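To make the equivalence being discussed concrete, here is a minimal scalar sketch of what both sequences do to each 32-bit lane. The function names are hypothetical and this models only the lane arithmetic, not the backend code: the shift pair is modeled as a left shift followed by a logical right shift, and the blend is modeled as a mask, assuming the blend is done against a zeroed register.

```rust
/// Scalar model of `pslld 16` followed by `psrld 16` on four 32-bit lanes.
/// Shifting left discards the high 16 bits; the logical right shift moves
/// the low 16 bits back into place with zeros above them.
pub fn low16_by_shifts(lanes: [u32; 4]) -> [u32; 4] {
    lanes.map(|lane| (lane << 16) >> 16)
}

/// Scalar model of a word blend that keeps the low 16-bit word of each lane
/// from the source and takes the high word from a zeroed register
/// (an assumption about how the blend would be set up).
pub fn low16_by_blend(lanes: [u32; 4]) -> [u32; 4] {
    lanes.map(|lane| lane & 0x0000_FFFF)
}
```

Both functions should agree on every input, which is why the choice between the two sequences comes down to ISA level rather than semantics.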
Ah, cool--having both would actually be great: if ...has_sse41() { [emit pblendw] } else { [emit the double shift] }. That information is available in EmitInfo.isa_flags but unfortunately that struct is not present until the emit phase. If we created a new Inst::XmmLowBits { dst, bits_to_retain } (or something like that) and lowered to that then in the emit phase we could pick which version we want based on the EmitInfo. The other option would be to try to make those ISA flags available during lowering but that seems harder to do (@cfallin?).
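A rough sketch of the "pick at emit time" idea, under stated assumptions: `IsaFlags`, `XmmLowBits`, `Emitted`, and `emit_low_bits` are all hypothetical stand-ins written for illustration, not the backend's actual types. The only thing taken from the comment above is the shape: lower to one pseudo-instruction, then choose the sequence once the ISA flags are known.

```rust
/// Hypothetical stand-in for the ISA flags said to be available at emit time.
#[derive(Debug, Clone, Copy)]
pub struct IsaFlags {
    pub has_sse41: bool,
}

/// Hypothetical pseudo-instruction, loosely following the suggested
/// `Inst::XmmLowBits { dst, bits_to_retain }`.
#[derive(Debug, Clone, Copy)]
pub struct XmmLowBits {
    pub dst: u8,
    pub bits_to_retain: u8,
}

/// Which concrete sequence the emit phase chose (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub enum Emitted {
    /// Word blend against a zeroed register; needs SSE4.1.
    Blend { dst: u8 },
    /// Shift left then logical shift right by `shift`; needs only SSE2.
    ShiftPair { dst: u8, shift: u8 },
}

/// Chooses the sequence at emit time, when the ISA flags are known.
pub fn emit_low_bits(inst: XmmLowBits, flags: IsaFlags) -> Emitted {
    assert!((1..=32).contains(&inst.bits_to_retain));
    // A word blend can only keep whole 16-bit words, so this sketch uses it
    // only for the 16-bit case and falls back to shifts otherwise.
    if flags.has_sse41 && inst.bits_to_retain == 16 {
        Emitted::Blend { dst: inst.dst }
    } else {
        Emitted::ShiftPair { dst: inst.dst, shift: 32 - inst.bits_to_retain }
    }
}
```

The design trade-off is the one raised above: lowering stays independent of the ISA flags, at the cost of one more pseudo-instruction that the emit phase has to expand.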
| tmp, | ||
| )); | ||
| // Convert the float to double. |
| // Convert the float to double. | |
| // Convert the float to double quadword. |
By "double" I meant doubleword. This converts to a doubleword, though, not a double quadword. I'll change it to say packed doubleword.
I think the logic looks right but can we add the CLIF tests that verify these individual instructions? I'm thinking that simd-conversion-run.clif and simd-conversion-legalize.clif (once converted to being a test compile emitting vcode) would be very useful to see that each instruction works correctly and compiles to the sequence we expect.
It definitely passes the SIMD Spectest so I am confident it is correct but let me look to add a file test as well. 👍
These instructions are tested in simd_conversions.wast but this file has not been enabled in experimental_x64_should_panic in the build.rs so I don't think any spec tests are running for these instructions. Unfortunately, that spec test also checks narrow and widen which I found a bit annoying; a lot has to be implemented for the spec test to be enabled. So I guess the CLIF file tests will be the only things checking this until all conversions are implemented.
Yes, you're right, they aren't enabled by default. I ran them manually, though: I removed all tests except the ones related to the conversion from packed float to packed signed int, and I confirmed while testing that they were indeed running as expected. Separately, I've also included the file tests that cover this conversion, with the tests for packed float to packed unsigned int commented out since that isn't supported yet.
| unimplemented!("f32x4.convert_i32x4_u"); | ||
| } else { | ||
| unreachable!(); | ||
| } |
unreachable! is not correct here because there are two other opcodes that could reach this: Opcode::FcvtToUint | Opcode::FcvtToSint; perhaps we should do a match on all four opcodes (filling the others with unimplemented!()) and then _ => unreachable!() at the end.
I think I disagree, though I may be wrong. This code is guarded by a check for vector instructions, and as far as I know neither Opcode::FcvtToUint nor Opcode::FcvtToSint is supported for vector input. In the context of a vector instruction it is currently impossible to reach this branch with those opcodes, right? This question has come up before, where I reach for unreachable! instead of unimplemented!. I can change it to unimplemented!, but I am not really sure what the rules are for choosing between unimplemented! and unreachable! when context is considered. Certainly support for vector input for Opcode::FcvtToUint or Opcode::FcvtToSint could be added, but the same is true for most places in the backend where unreachable! is used instead of unimplemented!. For example, there are places where we match on a type (the pshufd use in extractlane, for example) and make the default _ => arm unreachable! simply because a type is not supported; if that support were needed and added, the arm would suddenly be unimplemented rather than unreachable.
If Opcode::FcvtToUint and Opcode::FcvtToSint are not supported then this should remain unreachable!; maybe add a note because a straightforward reading of the code would expect these to be implemented.
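The distinction settled on here can be sketched as follows. This is an illustration only: the `Opcode` enum below is a reduced, hypothetical copy of the four opcodes named in the thread, `lower_vector_conversion` is an invented name, and the returned string is a placeholder for whatever the real lowering emits. The assumption, taken from the discussion, is that the caller has already checked for a vector type.

```rust
/// Reduced, hypothetical copy of the four conversion opcodes named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode {
    FcvtFromSint,
    FcvtFromUint,
    FcvtToSint,
    FcvtToUint,
}

/// Sketch of the vector-only branch. Assumes the caller has already
/// checked that the input type is a vector.
pub fn lower_vector_conversion(op: Opcode) -> &'static str {
    match op {
        // Supported: placeholder for the real lowering.
        Opcode::FcvtFromSint => "lowered",
        // Valid for vectors, but the lowering has not been written yet.
        Opcode::FcvtFromUint => unimplemented!("f32x4.convert_i32x4_u"),
        // Per the discussion, these have no vector variants, so reaching
        // this arm would indicate a bug elsewhere rather than missing work.
        Opcode::FcvtToSint | Opcode::FcvtToUint => {
            unreachable!("no vector variant of {:?}", op)
        }
    }
}
```

Listing every opcode explicitly, rather than using a catch-all `_` arm, means the compiler flags this match if a new opcode is added to the enum, and the comment on the last arm records why it is `unreachable!` rather than `unimplemented!`.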
abrown
left a comment
I think the logic looks right but can we add the CLIF tests that verify these individual instructions? I'm thinking that simd-conversion-run.clif and simd-conversion-legalize.clif (once converted to being a test compile emitting vcode) would be very useful to see that each instruction works correctly and compiles to the sequence we expect.
b5e9a14 to 9b43733
Hopefully all issues have been addressed, but let me know if there is anything else.
| @@ -0,0 +1,34 @@ | |||
| test legalizer | |||
| set enable_simd | |||
| target x86_64 skylake | |||
This is currently running the old backend; I think it should be modified to test compile and add feature "experimental_x64" (see simd-bitwise-compile.clif, e.g.).
Ok, yeah, thanks. Will make this change too. The test was somehow being picked up, since CI was failing when I had the file tests for conversion to unsigned included, but I was having trouble running it on my machine. Will update.
@abrown @bnjbvr Actually, I am going to just remove this compile file test. It checks for a very specific sequence of instructions, which should not be set in stone: the sequence will depend on optimizations or on which SSE feature flags are set. Is there anything else that can change register allocation even if the same instructions are used?
| // Since this branch is also guarded by a check for vector types | ||
| // neither Opcode::FcvtToUint nor Opcode::FcvtToSint can reach here | ||
| // as the first to branches will cover all reachable cases. |
| // Since this branch is also guarded by a check for vector types | |
| // neither Opcode::FcvtToUint nor Opcode::FcvtToSint can reach here | |
| // as the first to branches will cover all reachable cases. | |
| // Since this branch is also guarded by a check for vector types, | |
| // neither Opcode::FcvtToUint nor Opcode::FcvtToSint can reach here | |
| // (the vector variants do not exist). |
Implements i32x4.trunc_sat_f32x4_s
Add portions of filetests simd-conversion-legalize.clif and simd-conversion-run.clif that test fcvt_from_sint.f32x4