Commit Graph

226 Commits

Author SHA1 Message Date
Yolanda Chen
c48ec6f7bb [x64] Implement 256-bit assembly for SSE2_UNOP instructions
The SSE2_UNOP instructions have various src and dst register types for
256-bit AVX. One of them, the ucomisd instruction does not support YMM.
Other two: vcvtpd2ps and vcvttpd2dq use XMM as dst register. We extend
the Operand type to Operand256 to represent m256 to distiguish with the
128-bit AVX instruction.

Since this is a small suite, we explicitly specify the operand type for
each instruction.

Bug: v8:12228
Change-Id: I07c8168bd49f75eb8e4df8d6adfcfb37c1d34fff
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3518423
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Yolanda Chen <yolanda.chen@intel.com>
Cr-Commit-Position: refs/heads/main@{#80020}
2022-04-19 13:01:50 +00:00
Clemens Backes
08e514a894 [codegen][x64] Improve code for float to int64
This improves the code generated for float to int64 conversions on x64.
Instead of explicitly checking the input for specific values and
executing conditional jumps, just convert the integer back to a float
and check if this results in the rounded input. The "success value" is
then materialized via vmov + and instead of via branches.

old:
   7  c4e1fb2cd9           vcvttsd2siq rbx,xmm1
   c  ba01000000           movl rdx,0x1
  11  49ba000000000000e0c3 REX.W movq r10,0xc3e0000000000000
  1b  c441f96efa           vmovq xmm15,r10
  20  c5792ef9             vucomisd xmm15,xmm1
  24  7a08                 jpe 0x3599421714ee  <+0x2e>
  26  7408                 jz 0x3599421714f0  <+0x30>
  28  4883fb01             REX.W cmpq rbx,0x1
  2c  7102                 jno 0x3599421714f0  <+0x30>
  2e  33d2                 xorl rdx,rdx

new:
   7  c463010bf90b         vroundsd xmm15,xmm15,xmm1,0xb
   d  c4e1fb2cd9           vcvttsd2siq rbx,xmm1
  12  c4e1832ac3           vcvtqsi2sd xmm0,xmm15,rbx
  17  c4c17bc2c700         vcmpss xmm0,xmm0,xmm15, (eq)
  1d  c4e1f97ec2           vmovq rdx,xmm0
  22  83e201               andl rdx,0x1

A follow-up step would be to replace the explicitly materialized success
value by a direct jump to the code handling the error case, but that
requires more rewrite in TurboFan.

R=tebbi@chromium.org

Bug: v8:10005
Change-Id: Iaedc3f395fb3a8c11c936faa8c6e55c2dfe86cd9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3560434
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#79854}
2022-04-07 12:38:44 +00:00
jiepan
9ba6aff285 [x64] Implement 256-bit assembler for cmp ops
Bug: v8:12228
Change-Id: Iab09881d9c8bcd851fd89bf5d6bbd3f2cfb0f3d0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3303808
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Jie Pan <jie.pan@intel.com>
Cr-Commit-Position: refs/heads/main@{#79838}
2022-04-07 04:05:23 +00:00
jiepan
a54f38e1fd [x64] Implement 256-bit assembler for vshufps
Bug: v8:12228
Change-Id: I233efc9fc4636c25baba6a689f7038331fd1f32b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3303806
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Jie Pan <jie.pan@intel.com>
Cr-Commit-Position: refs/heads/main@{#78598}
2022-01-13 08:34:45 +00:00
Igor Sheludko
a7db5fcbad [ext-code-space][compiler] Support calling CodeT targets
... in order to avoid Code <-> CodeT conversions in builtins.
This CL changes the meaning of RelocInfo::CODE_TARGET which now expects
CodeT objects as a code target.

In order to reduce code churn this CL makes BUILTIN_CODE and friends
return CodeT instead of Code. In the follow-up CLs BUILTIN_CODET and
friends will be removed.

Bug: v8:11880
Change-Id: Ib8f60973e55c60fc62ba84707471da388f8201b4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3338483
Reviewed-by: Patrick Thier <pthier@chromium.org>
Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78393}
2021-12-16 13:45:12 +00:00
Yolanda Chen
7095cf14fa [x64] Implement 256-bit assembly for SSSE3 UNOP instructions
Bug: v8:12228
Change-Id: I9312716f78e79fd0759c2f7adfef065b5df5cfda
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3275566
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Yolanda Chen <yolanda.chen@intel.com>
Cr-Commit-Position: refs/heads/main@{#77861}
2021-11-12 02:03:42 +00:00
Yolanda Chen
0233cb6c82 [x64] Implement 256-bit assembly for vmovddup/vmovshdup
Bug: v8:12228
Change-Id: I49b2e1a1c837b96ea2e7cb58f42314109845b7fc
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3263766
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Yolanda Chen <yolanda.chen@intel.com>
Cr-Commit-Position: refs/heads/main@{#77746}
2021-11-06 02:03:52 +00:00
Ng Zhi An
91765804e3 [cleanup][disasm][x64] Fix some -Wshadow warnings
Bug: v8:12244,v8:12245
Change-Id: Iee80a34255a9c8ee5000719340a475331ab82942
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3254004
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77663}
2021-11-02 17:19:18 +00:00
Ng Zhi An
881a486ef6 [x64] Verify disassembly of more AVX instructions
This covers all the AVX instructions.

Bug: v8:12207
Change-Id: Idee66a55e1da5a2e88797002d25c6affb2d0c564
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3238149
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77559}
2021-10-27 00:05:51 +00:00
Ng Zhi An
76efd418e4 [x64] Verify disassembly of some AVX instructions
Extract instructions, and pextrq.

Bug: v8:12207
Change-Id: I919ce53a6bb1357cb70d78b3c7f12fc3d2128deb
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3223969
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77504}
2021-10-21 18:32:21 +00:00
Yolanda Chen
83a58b70e6 [x64] Implement 256-bit assembly for v(p)broadcast*
Bug: v8:12228
Change-Id: I434b07e3d7a2e270dc7dd26950b9dd047eb46a56
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3219944
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Yolanda Chen <yolanda.chen@intel.com>
Cr-Commit-Position: refs/heads/main@{#77446}
2021-10-19 02:21:19 +00:00
Ng Zhi An
11f5914716 [x64] Verify disassembly of some AVX instructions
Mostly the macro lists, the rest will be moved in a follow-up.

Bug: v8:12207
Change-Id: Iedf48e80f94ac99869c8aa31516cf93f9fc23667
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3209665
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77387}
2021-10-13 21:25:53 +00:00
Ng Zhi An
166dde5d2f [x64] Verify disassembly of SSE4_2 instructions
R=gdeepti@chromium.org

Bug: v8:12207
Change-Id: I3eafe4b2cf2d37fd4f8a9792fb96bf7b92a4c61b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3208456
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77292}
2021-10-08 00:13:23 +00:00
Ng Zhi An
7d1c50d1cd [x64] Verify disassembly of SSE4_1 instructions
R=gdeepti@chromium.org

Bug: v8:12207
Change-Id: Ic0d408b3c7ecf69e45a794c6c96159df2bee80e4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3180376
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77273}
2021-10-06 21:47:09 +00:00
Ng Zhi An
1dfb8cd74a [x64][diasasm] Add more padding to disassembly
A mov can be up to 10 bytes, 6 for displacement, 4 for instr. Other
instructions (like pshufb) with a complex addressing mode can take 10
bytes too. So adjust the padding for disassembly of hex accordingly.
This requires fixing up all the test cases too.

Bug: v8:12207
Change-Id: I372d67a818a5dbfe6f49f67047493d7f67b59bcd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3180375
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77241}
2021-10-06 00:08:45 +00:00
Ng Zhi An
f80eed4729 [x64] Verify disassembly of SSE3 and SSSE3 instructions
Bug: v8:12207
Change-Id: I6d8a62bb69c6011e6e7f6da2663f9db297b76f7f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3180374
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77226}
2021-10-04 17:38:52 +00:00
Ng Zhi An
eb5656ef23 [x64] Verify disassembly of cmov instructions
Bug: v8:12207
Change-Id: Ic59dbbce330221c917f20c7d20ac7ddb421932ee
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3180373
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77222}
2021-10-04 16:27:52 +00:00
Yolanda Chen
ed7e3de95a [x64] Implement 256-bit assembly for vhaddps
Bug: v8:12228
Change-Id: Ie1f569c450f84a862c754b844e36349b1533872d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3194633
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Yolanda Chen <yolanda.chen@intel.com>
Cr-Commit-Position: refs/heads/main@{#77202}
2021-10-02 04:24:22 +00:00
Ng Zhi An
7537e36efa [x64] Verify disassembly of SSE2 instructions
Bug: v8:12207
Change-Id: Ia553891986f0ef3fe6fb1c4350c3accc0e7bfc84
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3180243
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77027}
2021-09-24 01:37:03 +00:00
Ng Zhi An
d90c9c1f65 [x64] Verify disassembly of SSE instructions
- create a helper class to set up Disassembler for testing
- add a helper macro to only compare disassembled instruction (ignore
the hex bytes), this is useful for comparing SSE instructions, whose
opcodes are defined in sse-instr.h, and use uppercase letters, but the
disassembly always uses lowercase
- emit and compare SSE instructions using macro list

Bug: v8:12207
Change-Id: I3580f5d756736cada4f7260efc4d90e2c894f43c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3173906
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77021}
2021-09-23 18:51:03 +00:00
Ng Zhi An
565e83ab2f [x64] Check expected disassembly output fpu instructions
We move some instructions from the test that just disassembles them, to
the test that checks for expected output.

Bug: v8:12207
Change-Id: Ide8954e36c6ad016150bfe45abc1717bed55eb19
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3171972
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76970}
2021-09-21 17:18:18 +00:00
Ng Zhi An
dd06c11ee0 [x64] Check expected disassembly output for some instructions
We move some instructions from the test that just disassembles them, to
the test that checks for expected output.

Bug: v8:12207
Change-Id: I913237427d795ed44539c7294ebbe69330c41dfa
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3163278
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76944}
2021-09-20 18:03:57 +00:00
Ng Zhi An
f67ee467aa [disasm][x64] Remove unnecessary initialization code
These tests don't depend on initializing VM (for Context) or even an
isolate, so we can remove the setup code, and use UNINITIALIZED_TEST
(will not even set up an isolate).

Bug: v8:12207
Change-Id: I4b509b95cc8272db22892c32b53464678403dc7d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3160748
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76854}
2021-09-15 17:38:00 +00:00
Ng Zhi An
ca817b0bb6 [x64] Add new disassembly tests that verifies output
Currently the main test for disassembly just checks that there is
disassembly support for a assembler function, it doesn't verify the
output is as expected.

Add a new test case that checks the disassembly output against an
expected string.

Right now we only check a single instruction, subsequent patches will
move more instructions into this test case.

Bug: v8:12207
Change-Id: Id183bb2fd625713d82239363ebce3f4c77155acd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3150145
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76828}
2021-09-14 20:41:29 +00:00
Andrew Brown
cea787e280 [x64] Add disassembly tests for 256-bit instructions
A previous change (see ref) added a subset of 256-bit instructions to
the x64 assembler--this change adds a disassembly test for the added
instructions.

ref: https://chromium-review.googlesource.com/c/v8/v8/+/3123648
Change-Id: Ia56be7a7df636b8bf6c04f044912e914d949d19f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3133956
Auto-Submit: Andrew Brown <andrew.brown@intel.com>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76711}
2021-09-08 00:26:44 +00:00
Ng Zhi An
684f3cee1f [wasm-simd] Optimize i32x4.trunc_sat_f32x4_s
Bug: v8:12094
Change-Id: Ibefce881cbfcd4445485197a4a2615bdf0599ada
Fixed: v8:12094
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3123638
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76706}
2021-09-07 20:11:26 +00:00
Ng Zhi An
bb12c48ac3 [wasm-simd] Share i8x16.splat implementation
The optimal implementation is in TurboFan x64 codegen, move it into
shared-macro-assembler, and have TurboFan ia32 and Liftoff use it. The
optimal implementation accounts for AVX2 support.

We add a couple of AVX2 instruction to ia32 in sse-instr.h, not all of
them are used, but follow-up patches will use them, so we add support
(including diassembly and test) in this change.

Drive-by clean up to test-disasm-x64.cc to merge 2 AVX2 test sections.

Bug: v8:11589
Change-Id: I1c8d7deb0f8bb70b29e7a680e5dbcfb09ca5505b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3092555
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76352}
2021-08-17 21:05:00 +00:00
Ng Zhi An
add293e80e [x64][ia32] Move more AVX_OP into SharedTurboAssembler
Bug: v8:11589
Change-Id: I30dbdbc6266d703ce697352780da1d543afbb457
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2826711
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#73965}
2021-04-14 23:46:56 +00:00
Ng Zhi An
9c120b753d [wasm-simd][x64] Fix encoding of vcvtdq2pd
vcvtdq2pd was incorrectly declared to take 3 operands, the use of the
macro Cvtdq2pd meant that the call was vcvtdq2pd(dst, dst, src). This
is an incorrect encoding. Our tests happen to pass because dst was xmm0,
which made it accidentally correct.

This fixes it by moving cvtdq2pd out of the macro list.

Bug: v8:11265
Change-Id: I8b1baf4dd2c670021eafa76dc1a10b442f812805
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2654003
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72382}
2021-01-27 22:48:59 +00:00
Ng Zhi An
173d660849 [wasm-simd][x64] Optimize i8x16.popcnt with aligned moves
movups is slower on older hardware (core2) than movaps, even if the
operand is aligned. (Not an issue on modern hardware).

Also move i8x16.splat(0x0F) to an external reference so we can load the
mask directly.

Bug: v8:11002
Change-Id: I0b01c27a142024d50b9faaa9e7bd6a1fe169e141
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2643242
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72336}
2021-01-26 19:03:10 +00:00
Zhi An Ng
ffc832becf [wasm-simd][x64][avx2] Optimize f32x4.splat
When AVX2 is available, we can use vbroadcastss. On AVX, use vshufps,
since it is non-destructive. On SSE, shufps is 1 byte shorter.

FIXED=b/175364402

Change-Id: I5bd10914579d8db012192a9c04f7b0038ec1c812
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2599849
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71964}
2021-01-08 03:03:45 +00:00
Zhi An Ng
506c09797c [x64] Sort out move instructions in codegen
In AVX, it is better to use the appropriate integer or floating point
moves depending on which instructions produce/consume these moves, since
there can be a delay moving from integer to floating point domain. On
SSE systems, it is less important, and we can move movaps/movups which
is 1 byte shorter than movdqa/movdqu.

This patch cleans up a couple of places, and defines macro-assembler
functions Movdqa, Movdqu, Movapd, to call into movaps/movups when AVX is
not supported.

Change-Id: Iba6c54e218875f1a70f61792978d7b3f69edfb4b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2599843
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71884}
2020-12-29 01:27:23 +00:00
Zhi An Ng
c9560d1dbf [wasm-simd][x64][avx2] Improve codegen for load{8,16}_splat
Detect AVX2 support and use vpbroadcastb or vpbroadcastw.

No new assembler helpers required because we are only emitting the
VEX-128 versions of these instructions.

Bug: v8:11258
Change-Id: Ic50178daa6fc8fe767dfc788e61e67538066bdea
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2596582
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71866}
2020-12-23 01:56:42 +00:00
Zhi An Ng
741e5a66de [wasm-simd][ia32][x64] More optimization for f32x4.extract_lane
We can have more optimizations for this instruction, they leave some
junk in the top lanes of dst, but that doesn't matter:

- when lane is 1: we use movshdup, this is 4 bytes long
- when lane is 2: use movhlps, this is 3 bytes long
- otherwise use shufps (4 bytes) or pshufd (5 bytes)

All of which are better than insertps (6 bytes).

Change-Id: I0e524431d1832e297e8c8bb418d42382d93fa691
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2591850
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71813}
2020-12-17 01:58:52 +00:00
Zhi An Ng
6cb61e63bb [wasm-simd][x64] Optimize f64x2.extract_lane
pextrq + movq crosses register files twice, which is not efficient.

Optimize this by:
- checking if lane 0, do nothing if dst == src (macro-assembler helper)
- use vmovhlps on AVX, with src as the operands to avoid false
dependency on dst
- use movhlps otherwise, this is shorter than shufpd, and faster on
older system

Change-Id: I3486d87224c048b3229c2f92359b8b8e6d5fd025
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2589056
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71751}
2020-12-14 23:53:19 +00:00
Zhi An Ng
b0d7912042 [wasm-simd][x64] Prototype sign select
Prototype i8x16, i16x8, i32x4, i64x2 sign select on x64 and interpreter.

Bug: v8:10983
Change-Id: I7d6f39a2cb4c2aefe31daac782978fe8b363dd1a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2486235
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70818}
2020-10-28 03:32:57 +00:00
Jakob Gruber
c7cb9beca1 Reland "Reland "[deoptimizer] Change deopt entries into builtins""
This is a reland of fbfa9bf4ec

The arm64 was missing proper codegen for CFI, thus sizes were off.

Original change's description:
> Reland "[deoptimizer] Change deopt entries into builtins"
>
> This is a reland of 7f58ced72e
>
> It fixes the different exit size emitted on x64/Atom CPUs due to
> performance tuning in TurboAssembler::Call. Additionally, add
> cctests to verify the fixed size exits.
>
> Original change's description:
> > [deoptimizer] Change deopt entries into builtins
> >
> > While the overall goal of this commit is to change deoptimization
> > entries into builtins, there are multiple related things happening:
> >
> > - Deoptimization entries, formerly stubs (i.e. Code objects generated
> >   at runtime, guaranteed to be immovable), have been converted into
> >   builtins. The major restriction is that we now need to preserve the
> >   kRootRegister, which was formerly used on most architectures to pass
> >   the deoptimization id. The solution differs based on platform.
> > - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> > - Removed heap/ support for immovable Code generation.
> > - Removed the DeserializerData class (no longer needed).
> > - arm64: to preserve 4-byte deopt exits, introduced a new optimization
> >   in which the final jump to the deoptimization entry is generated
> >   once per Code object, and deopt exits can continue to emit a
> >   near-call.
> > - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
> >   sizes by 4/8, 5, and 5 bytes, respectively.
> >
> > On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> > by using the same strategy as on arm64 (recalc deopt id from return
> > address). Before:
> >
> >  e300a002       movw r10, <id>
> >  e59fc024       ldr ip, [pc, <entry offset>]
> >  e12fff3c       blx ip
> >
> > After:
> >
> >  e59acb35       ldr ip, [r10, <entry offset>]
> >  e12fff3c       blx ip
> >
> > On arm64 the deopt exit size remains 4 bytes (or 8 bytes in same cases
> > with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> > object (max 32 bytes added overhead per Code object). Before:
> >
> >  9401cdae       bl <entry offset>
> >
> > After:
> >
> >  # eager deoptimization entry jump.
> >  f95b1f50       ldr x16, [x26, <eager entry offset>]
> >  d61f0200       br x16
> >  # lazy deoptimization entry jump.
> >  f95b2b50       ldr x16, [x26, <lazy entry offset>]
> >  d61f0200       br x16
> >  # the deopt exit.
> >  97fffffc       bl <eager deoptimization entry jump offset>
> >
> > On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
> >
> >  bb00000000     mov ebx,<id>
> >  e825f5372b     call <entry>
> >
> > After:
> >
> >  e8ea2256ba     call <entry>
> >
> > On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
> >
> >  49c7c511000000 REX.W movq r13,<id>
> >  e8ea2f0700     call <entry>
> >
> > After:
> >
> >  41ff9560360000 call [r13+<entry offset>]
> >
> > Bug: v8:8661,v8:8768
> > Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> > Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> > Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> > Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#70597}
>
> Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
> Bug: v8:8661,v8:8768,chromium:1140165
> Change-Id: Ibcd5c39c58a70bf2b2ac221aa375fc68d495e144
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485506
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70655}

Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
Bug: v8:8661
Bug: v8:8768
Bug: chromium:1140165
Change-Id: I471cc94fc085e527dc9bfb5a84b96bd907c2333f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2488682
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70672}
2020-10-21 06:01:38 +00:00
Maya Lekova
7c7aa4fa94 Revert "Reland "[deoptimizer] Change deopt entries into builtins""
This reverts commit fbfa9bf4ec.

Reason for revert: Seems to break arm64 sim CFI build (please see DeoptExitSizeIfFixed) - https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20CFI/2808

Original change's description:
> Reland "[deoptimizer] Change deopt entries into builtins"
>
> This is a reland of 7f58ced72e
>
> It fixes the different exit size emitted on x64/Atom CPUs due to
> performance tuning in TurboAssembler::Call. Additionally, add
> cctests to verify the fixed size exits.
>
> Original change's description:
> > [deoptimizer] Change deopt entries into builtins
> >
> > While the overall goal of this commit is to change deoptimization
> > entries into builtins, there are multiple related things happening:
> >
> > - Deoptimization entries, formerly stubs (i.e. Code objects generated
> >   at runtime, guaranteed to be immovable), have been converted into
> >   builtins. The major restriction is that we now need to preserve the
> >   kRootRegister, which was formerly used on most architectures to pass
> >   the deoptimization id. The solution differs based on platform.
> > - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> > - Removed heap/ support for immovable Code generation.
> > - Removed the DeserializerData class (no longer needed).
> > - arm64: to preserve 4-byte deopt exits, introduced a new optimization
> >   in which the final jump to the deoptimization entry is generated
> >   once per Code object, and deopt exits can continue to emit a
> >   near-call.
> > - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
> >   sizes by 4/8, 5, and 5 bytes, respectively.
> >
> > On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> > by using the same strategy as on arm64 (recalc deopt id from return
> > address). Before:
> >
> >  e300a002       movw r10, <id>
> >  e59fc024       ldr ip, [pc, <entry offset>]
> >  e12fff3c       blx ip
> >
> > After:
> >
> >  e59acb35       ldr ip, [r10, <entry offset>]
> >  e12fff3c       blx ip
> >
> > On arm64 the deopt exit size remains 4 bytes (or 8 bytes in same cases
> > with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> > object (max 32 bytes added overhead per Code object). Before:
> >
> >  9401cdae       bl <entry offset>
> >
> > After:
> >
> >  # eager deoptimization entry jump.
> >  f95b1f50       ldr x16, [x26, <eager entry offset>]
> >  d61f0200       br x16
> >  # lazy deoptimization entry jump.
> >  f95b2b50       ldr x16, [x26, <lazy entry offset>]
> >  d61f0200       br x16
> >  # the deopt exit.
> >  97fffffc       bl <eager deoptimization entry jump offset>
> >
> > On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
> >
> >  bb00000000     mov ebx,<id>
> >  e825f5372b     call <entry>
> >
> > After:
> >
> >  e8ea2256ba     call <entry>
> >
> > On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
> >
> >  49c7c511000000 REX.W movq r13,<id>
> >  e8ea2f0700     call <entry>
> >
> > After:
> >
> >  41ff9560360000 call [r13+<entry offset>]
> >
> > Bug: v8:8661,v8:8768
> > Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> > Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> > Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> > Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#70597}
>
> Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
> Bug: v8:8661,v8:8768,chromium:1140165
> Change-Id: Ibcd5c39c58a70bf2b2ac221aa375fc68d495e144
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485506
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70655}

TBR=ulan@chromium.org,rmcilroy@chromium.org,jgruber@chromium.org,tebbi@chromium.org

Change-Id: I4739a3475bfd8ee0cfbe4b9a20382f91a6ef1bf0
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:8661
Bug: v8:8768
Bug: chromium:1140165
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485223
Reviewed-by: Maya Lekova <mslekova@chromium.org>
Commit-Queue: Maya Lekova <mslekova@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70658}
2020-10-20 14:14:12 +00:00
Jakob Gruber
fbfa9bf4ec Reland "[deoptimizer] Change deopt entries into builtins"
This is a reland of 7f58ced72e

It fixes the different exit size emitted on x64/Atom CPUs due to
performance tuning in TurboAssembler::Call. Additionally, add
cctests to verify the fixed size exits.

Original change's description:
> [deoptimizer] Change deopt entries into builtins
>
> While the overall goal of this commit is to change deoptimization
> entries into builtins, there are multiple related things happening:
>
> - Deoptimization entries, formerly stubs (i.e. Code objects generated
>   at runtime, guaranteed to be immovable), have been converted into
>   builtins. The major restriction is that we now need to preserve the
>   kRootRegister, which was formerly used on most architectures to pass
>   the deoptimization id. The solution differs based on platform.
> - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> - Removed heap/ support for immovable Code generation.
> - Removed the DeserializerData class (no longer needed).
> - arm64: to preserve 4-byte deopt exits, introduced a new optimization
>   in which the final jump to the deoptimization entry is generated
>   once per Code object, and deopt exits can continue to emit a
>   near-call.
> - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
>   sizes by 4/8, 5, and 5 bytes, respectively.
>
> On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> by using the same strategy as on arm64 (recalc deopt id from return
> address). Before:
>
>  e300a002       movw r10, <id>
>  e59fc024       ldr ip, [pc, <entry offset>]
>  e12fff3c       blx ip
>
> After:
>
>  e59acb35       ldr ip, [r10, <entry offset>]
>  e12fff3c       blx ip
>
> On arm64 the deopt exit size remains 4 bytes (or 8 bytes in same cases
> with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> object (max 32 bytes added overhead per Code object). Before:
>
>  9401cdae       bl <entry offset>
>
> After:
>
>  # eager deoptimization entry jump.
>  f95b1f50       ldr x16, [x26, <eager entry offset>]
>  d61f0200       br x16
>  # lazy deoptimization entry jump.
>  f95b2b50       ldr x16, [x26, <lazy entry offset>]
>  d61f0200       br x16
>  # the deopt exit.
>  97fffffc       bl <eager deoptimization entry jump offset>
>
> On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
>
>  bb00000000     mov ebx,<id>
>  e825f5372b     call <entry>
>
> After:
>
>  e8ea2256ba     call <entry>
>
> On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
>
>  49c7c511000000 REX.W movq r13,<id>
>  e8ea2f0700     call <entry>
>
> After:
>
>  41ff9560360000 call [r13+<entry offset>]
>
> Bug: v8:8661,v8:8768
> Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70597}

Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
Bug: v8:8661,v8:8768,chromium:1140165
Change-Id: Ibcd5c39c58a70bf2b2ac221aa375fc68d495e144
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485506
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70655}
2020-10-20 12:30:23 +00:00
Jakob Gruber
8bc9a7941c Revert "[deoptimizer] Change deopt entries into builtins"
This reverts commit 7f58ced72e.

Reason for revert: Segfaults on Atom_x64 https://ci.chromium.org/p/v8-internal/builders/ci/v8_linux64_atom_perf/5686?

Original change's description:
> [deoptimizer] Change deopt entries into builtins
>
> While the overall goal of this commit is to change deoptimization
> entries into builtins, there are multiple related things happening:
>
> - Deoptimization entries, formerly stubs (i.e. Code objects generated
>   at runtime, guaranteed to be immovable), have been converted into
>   builtins. The major restriction is that we now need to preserve the
>   kRootRegister, which was formerly used on most architectures to pass
>   the deoptimization id. The solution differs based on platform.
> - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> - Removed heap/ support for immovable Code generation.
> - Removed the DeserializerData class (no longer needed).
> - arm64: to preserve 4-byte deopt exits, introduced a new optimization
>   in which the final jump to the deoptimization entry is generated
>   once per Code object, and deopt exits can continue to emit a
>   near-call.
> - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
>   sizes by 4/8, 5, and 5 bytes, respectively.
>
> On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> by using the same strategy as on arm64 (recalc deopt id from return
> address). Before:
>
>  e300a002       movw r10, <id>
>  e59fc024       ldr ip, [pc, <entry offset>]
>  e12fff3c       blx ip
>
> After:
>
>  e59acb35       ldr ip, [r10, <entry offset>]
>  e12fff3c       blx ip
>
> On arm64 the deopt exit size remains 4 bytes (or 8 bytes in same cases
> with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> object (max 32 bytes added overhead per Code object). Before:
>
>  9401cdae       bl <entry offset>
>
> After:
>
>  # eager deoptimization entry jump.
>  f95b1f50       ldr x16, [x26, <eager entry offset>]
>  d61f0200       br x16
>  # lazy deoptimization entry jump.
>  f95b2b50       ldr x16, [x26, <lazy entry offset>]
>  d61f0200       br x16
>  # the deopt exit.
>  97fffffc       bl <eager deoptimization entry jump offset>
>
> On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
>
>  bb00000000     mov ebx,<id>
>  e825f5372b     call <entry>
>
> After:
>
>  e8ea2256ba     call <entry>
>
> On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
>
>  49c7c511000000 REX.W movq r13,<id>
>  e8ea2f0700     call <entry>
>
> After:
>
>  41ff9560360000 call [r13+<entry offset>]
>
> Bug: v8:8661,v8:8768
> Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70597}

TBR=ulan@chromium.org,rmcilroy@chromium.org,jgruber@chromium.org,tebbi@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: v8:8661,v8:8768,chromium:1140165
Change-Id: I3df02ab42f6e02233d9f6fb80e8bb18f76870d91
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485504
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70649}
2020-10-20 09:43:19 +00:00
Jakob Gruber
7f58ced72e [deoptimizer] Change deopt entries into builtins
While the overall goal of this commit is to change deoptimization
entries into builtins, there are multiple related things happening:

- Deoptimization entries, formerly stubs (i.e. Code objects generated
  at runtime, guaranteed to be immovable), have been converted into
  builtins. The major restriction is that we now need to preserve the
  kRootRegister, which was formerly used on most architectures to pass
  the deoptimization id. The solution differs based on platform.
- Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
- Removed heap/ support for immovable Code generation.
- Removed the DeserializerData class (no longer needed).
- arm64: to preserve 4-byte deopt exits, introduced a new optimization
  in which the final jump to the deoptimization entry is generated
  once per Code object, and deopt exits can continue to emit a
  near-call.
- arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
  sizes by 4/8, 5, and 5 bytes, respectively.

On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
by using the same strategy as on arm64 (recalc deopt id from return
address). Before:

 e300a002       movw r10, <id>
 e59fc024       ldr ip, [pc, <entry offset>]
 e12fff3c       blx ip

After:

 e59acb35       ldr ip, [r10, <entry offset>]
 e12fff3c       blx ip

On arm64 the deopt exit size remains 4 bytes (or 8 bytes in same cases
with CFI). Additionally, up to 4 builtin jumps are emitted per Code
object (max 32 bytes added overhead per Code object). Before:

 9401cdae       bl <entry offset>

After:

 # eager deoptimization entry jump.
 f95b1f50       ldr x16, [x26, <eager entry offset>]
 d61f0200       br x16
 # lazy deoptimization entry jump.
 f95b2b50       ldr x16, [x26, <lazy entry offset>]
 d61f0200       br x16
 # the deopt exit.
 97fffffc       bl <eager deoptimization entry jump offset>

On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:

 bb00000000     mov ebx,<id>
 e825f5372b     call <entry>

After:

 e8ea2256ba     call <entry>

On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:

 49c7c511000000 REX.W movq r13,<id>
 e8ea2f0700     call <entry>

After:

 41ff9560360000 call [r13+<entry offset>]

Bug: v8:8661,v8:8768
Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70597}
2020-10-19 07:32:48 +00:00
Ng Zhi An
944dad59c8 [x64] Add movlps and movhps to assembler
These instructions will be used for prototyping Wasm SIMD's store lane
later on, separated the implementation for assembler and disassembler
into this patch to make things smaller.

Curiously, movhps and movlhps seems to have the same encoding, 0f 16, so
I'm not sure not sure how to differentiate them in the disassembler
besides using the mod field, since movlhps only takes xmm registers,
whereas movhps always take 1 operand.

Bug: v8:10975
Change-Id: I8be9a31b1c9a5515038f9c8c55ef30d1ba063ea7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2471977
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70520}
2020-10-15 00:37:32 +00:00
Ng Zhi An
e30c50f3bf [x64] Refactor pinsrb family of instructions
The existing macro assembler define Pinsrb, which expects 3 arguments:

- XMMRegister dst
- Register/Operand src
- uint8_t imm

which overwrites dst with src at lane specified by imm.

That means we cannot use the AVX version, which has 4 arguments, and
does not overwrite dst.

This refactoring defines the 4 argument AVX version instead, and if AVX
is not supported, fall back to the SSE version, and ensure that the
value is copied over into dst first.

For convenience, we define an overload with 3 arguments that duplicates
dst, this replicates the SSE behavior, so that not all callers have to
be updated.

Bug: v8:10975, v8:10933
Change-Id: I6f9b9d37fa08d3f5cff4f040ae7d5e1f0cf36455
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2444096
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70392}
2020-10-07 23:25:30 +00:00
Jakob Gruber
29bcdaad1d Rename legacy code kinds
CodeKind::OPTIMIZED_CODE -> TURBOFAN

Kinds are now more fine-grained and distinguish between TF, TP, NCI.

CodeKind::STUB -> DEOPT_ENTRIES_OR_FOR_TESTING

Code stubs (like builtins, but generated at runtime) were removed from
the codebase years ago, this is the last remnant. This kind is used
only for deopt entries (which should be converted into builtins) and
for tests.

Change-Id: I67beb15377cb60f395e9b051b25f3e5764982e93
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2440335
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70234}
2020-09-30 15:39:23 +00:00
Ng Zhi An
ddf30bea13 [wasm-simd][x64] Check for register when emitting shuffles
Some shuffles take have either register or memory operand for second
input, but the codegen incorrectly assumes that it is always a register.

Bug: v8:10824
Change-Id: Ia2df233dad4ed451e52e57e35cce5c80db0905db
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2373586
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69562}
2020-08-25 17:52:16 +00:00
Jakob Gruber
c51041f454 [nci] Replace CompilationTarget with a new Code::Kind value
With the new Turbofan variants (NCI and Turboprop), we need a way to
distinguish between them both during and after compilation. We
initially introduced CompilationTarget to track the variant during
compilation, but decided to reuse the code kind as the canonical spot to
store this information instead.

Why? Because it is an established mechanism, already available in most
of the necessary spots (inside the pipeline, on Code objects, in
profiling traces).

This CL removes CompilationTarget and adds a new
NATIVE_CONTEXT_INDEPENDENT kind, plus helper functions to determine
various things about a given code kind (e.g.: does this code kind
deopt?).

As a (very large) drive-by, refactor both Code::Kind and
AbstractCode::Kind into a new CodeKind enum class.

Bug: v8:8888
Change-Id: Ie858b9a53311b0731630be35cf5cd108dee95b39
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2336793
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69244}
2020-08-05 12:27:22 +00:00
Ng Zhi An
667fafcec4 Reland "[wasm-simd] Prototype f64x2 rounding instructions"
This is a reland of f7f72b7b3a

This was reverted because of a test timing out on slow_path
variant (https://crrev.com/c/2237131 for details). Turns out
the test is just really slow, and was skipped on this variant
in https://crrev.com/c/2237628. Relanding without changes.


Original change's description:
> [wasm-simd] Prototype f64x2 rounding instructions
>
> Implements f64x2 ceil, floor, trunc, nearestint, for interpreter and
> x64.
>
> Bug: v8:10553
> Change-Id: I12a260a3b1d728368e5525d317d30fc9581cae04
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2213082
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68241}

Tbr: tebbi@chromium.org
Bug: v8:10553
Change-Id: I4cdc23d0556f11310d32fa066f40b057fd49d2d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2237350
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68304}
2020-06-10 20:51:21 +00:00
Leszek Swirski
926ce88782 Revert "[wasm-simd] Prototype f64x2 rounding instructions"
This reverts commit f7f72b7b3a.

Reason for revert: Flaky timeouts of slow-path tests -- specifically, mjsunit/regress/wasm/regress-9017, which appears to have regressed from ~2 min to ~3-4 min 

https://logs.chromium.org/logs/v8/buildbucket/cr-buildbucket.appspot.com/8878016799136124416/+/steps/Check_-_slow_path__flakes_/0/logs/regress-9017/0

Original change's description:
> [wasm-simd] Prototype f64x2 rounding instructions
> 
> Implements f64x2 ceil, floor, trunc, nearestint, for interpreter and
> x64.
> 
> Bug: v8:10553
> Change-Id: I12a260a3b1d728368e5525d317d30fc9581cae04
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2213082
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68241}

TBR=gdeepti@chromium.org,tebbi@chromium.org,zhin@chromium.org

Change-Id: I9915dd375c7f0e08b5414189efb29ed1c90cb96d
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:10553
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2237131
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68248}
2020-06-09 08:38:52 +00:00
Ng Zhi An
f7f72b7b3a [wasm-simd] Prototype f64x2 rounding instructions
Implements f64x2 ceil, floor, trunc, nearestint, for interpreter and
x64.

Bug: v8:10553
Change-Id: I12a260a3b1d728368e5525d317d30fc9581cae04
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2213082
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68241}
2020-06-08 23:43:09 +00:00
Ng Zhi An
02ee6904f4 [x64] Fix vroundps assembly, add disassembly
vroundps assembly is incorrect:
- the signature was wrong, vroundps takes 2 operands and 1 immediate
- when calling vinstr, should always pass xmm0, this wasn't causing
issues because our test cases were restricted enough that it was always
xmm0 anyway
- the macro assembler should use AVX_OP_SSE4_1, since roundps requires
SSE4_1
- drive-by fix for roundss and roundsd to be AVX_OP_SSE4_1
- add disasm for roundps and vroundps, and test them

Bug: v8:10553
Change-Id: I4046eb81a9f18d5af7137bbd46bfa0478e5a9ab2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2227252
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68157}
2020-06-03 19:16:10 +00:00