7d7b25d95c
Integer splats (especially for sizes < 32 bits) do not translate directly to a single instruction on x64. We can do better for special values, like 0, which can be lowered to `xor dst dst`. We do this check in the instruction selector, and emit a special opcode, kX64S128Zero.

Also change the xor operation for kX64S128Zero from xorps to pxor. This can help reduce any potential data bypass delay (search for this in Agner's microarchitecture manual for more details). Since integer splats are likely to be followed by integer ops, we should remain in the integer domain, and thus use pxor.

For i64x2.splat the codegen goes from:

  xorl rdi,rdi
  vmovq xmm0,rdi
  vmovddup xmm0,xmm0

to:

  vpxor xmm0,xmm0,xmm0

Also add a unittest to verify this optimization, and the raw-assembler methods necessary for the test.

Bug: v8:11093
Change-Id: I26b092032b6e672f1d5d26e35d79578ebe591cfe
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2516299
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70977}
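
The selection check described in this commit message can be sketched roughly as follows. This is a simplified, standalone illustration and not V8's actual instruction selector: the `Opcode` enum, the `Instr` struct, the `SplatInput` alias, and `SelectI64x2Splat` are all hypothetical names, and the "move immediate, then splat" fallback only approximates the generic lowering shown in the message.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-ins for machine opcodes; not V8's real definitions.
enum class Opcode { kS128Zero, kMoveImm, kI64x2Splat };

struct Instr {
  Opcode opcode;
};

// A splat input is either a constant known at compile time, or an
// unknown runtime value (modelled here as std::nullopt).
using SplatInput = std::optional<int64_t>;

// Choose the instruction sequence for an i64x2 splat of `input`.
std::vector<Instr> SelectI64x2Splat(const SplatInput& input) {
  // Special case: a splat of constant 0 is an all-zero vector, so a
  // single zeroing instruction (pxor dst, dst) is enough.
  if (input.has_value() && *input == 0) {
    return {Instr{Opcode::kS128Zero}};
  }
  std::vector<Instr> out;
  // A non-zero constant must first be materialized in a register.
  if (input.has_value()) {
    out.push_back(Instr{Opcode::kMoveImm});
  }
  // Generic path: broadcast the scalar to both lanes.
  out.push_back(Instr{Opcode::kI64x2Splat});
  return out;
}
```

In this sketch, `SelectI64x2Splat(0)` yields a single zeroing instruction, while any other input falls through to the generic splat sequence.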
instruction-selector-x64-unittest.cc