Prototype f64x2.ceil on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintp, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintp, which is the same instruction used for
Float64RoundUp (scalar), wasm-compiler reuses the Float64RoundUp check.
Bug: v8:10553
Change-Id: I5841c6a06f260debe8ae90d331bdcc2a0fa3278c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2258813
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68553}
Prototype f32x4.nearest on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintn, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintn, which is the same instruction used for
F32RoundTiesEven (scalar), wasm-compiler reuses the Float32RoundTiesEven
check.
Bug: v8:10553
Change-Id: I066b8c5f10fd86294afe1c530c516493deeb7b53
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2258037
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68526}
Prototype f32x4.trunc on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintz, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintz, which is the same instruction used for F32
trunc (scalar), wasm-compiler reuses the Float32RoundTruncate check.
Bug: v8:10553
Change-Id: I65ddc36ccff21f8f0ff21a6e768184c084ffcfea
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2256770
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68498}
Scalar lowering for i8x16, i16x8, i32x4 bitmask.
Depending on which lane we are lowering, we can either shift the MSB
into the correct final bit position, then do a big OR of all the nodes.
Bug: v8:10308
Change-Id: Iddf6c077b5a8658a487cef59f2e3bbae3c8bd98d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219327
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68491}
Prototype f32x4.floor on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintm, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintm, which is the same instruction used for F32
Floor (scalar), wasm-compiler reuses the Float32RoundDown check.
Bug: v8:10553
Change-Id: I540e82a156131821f732cd427df2e5c68f4094d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2252541
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68470}
This implements I32x4DotI16x8S for ia32.
Also fixes instruction-selector for SIMD ops, they should all set operand1 to be a register, since we do not have memory alignment yet.
Bug: v8:10583
Change-Id: Id273816efd5eea128580f3f7bde533a8e1b2435d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2231031
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68444}
Prototype f32x4.ceil on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintp, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintp, which is the same instruction used for F32
Ceil (scalar), wasm-compiler reuses the Float32Round check, rather than
creating new F32x4Round optional operators.
Implementation for vrintp (Advanced SIMD version that takes Q
registers), assembler, disassembler support. Incomplete for now, but
more will be added as we add other rounding modes.
Bug: v8:10553
Change-Id: I4563608b9501f6f57c3a8325b17de89da7058a43
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2248779
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68419}
This is a reland of 3692bef9f9
Integer overflow in the test code is fixed by using
MulWithWraparound.
Original change's description:
> [wasm-simd][x64] Prototype i32x4.dot_i16x8_s
>
> This implements I32x4DotI16x8S for x64 and interpreter.
>
> Bug: v8:10583
> Change-Id: I404ac68c19c1686a93f29c3f4fc2d661c9558c67
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2229056
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68244}
Bug: v8:10583
Change-Id: Ie7d0032f5398b6f725c02b572764258adacc8578
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2236962
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68343}
This is a reland of f7f72b7b3a
This was reverted because of a test timing out on slow_path
variant (https://crrev.com/c/2237131 for details). Turns out
the test is just really slow, and was skipped on this variant
in https://crrev.com/c/2237628. Relanding without changes.
Original change's description:
> [wasm-simd] Prototype f64x2 rounding instructions
>
> Implements f64x2 ceil, floor, trunc, nearestint, for interpreter and
> x64.
>
> Bug: v8:10553
> Change-Id: I12a260a3b1d728368e5525d317d30fc9581cae04
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2213082
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68241}
Tbr: tebbi@chromium.org
Bug: v8:10553
Change-Id: I4cdc23d0556f11310d32fa066f40b057fd49d2d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2237350
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68304}
This reverts commit dfbbb4a531.
Reason for revert: Bitmask added post 84 cut, so it is not part of origin trial. Therefore it is still a post-mvp.
Original change's description:
> [wasm-simd] Add bitmask to SIMD MVP
>
> This removes the post-mvp flag for bitmask, since it was accepted into
> the proposal, see https://github.com/WebAssembly/simd/pull/201.
>
> Bug: v8:10308
> Change-Id: I4ced43a6484660125d773bc9de46bdea9f72b13b
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216532
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67993}
TBR=gdeepti@chromium.org,zhin@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: v8:10308
Change-Id: I53294be4ea816f37c7cc5f545afb572538dd4770
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2233183
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68216}
This is a reland of dfdef88547
Original change's description:
> [wasm-simd] Fix extract lane unsigned extend
>
> The interpreter is missing a static cast when extracting lanes smaller
> than int32_t and doing an unsigned extend. The array in Simd128 is
> signed, so a direct cast to uint32_t will be a signed extension. The fix
> is to, in the unsigned case, cast to unsigned (of the appropriate size)
> first, then cast to uint32_t.
>
> Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68029}
Change-Id: Ica7974a2f1f2a4f07b54cc68f9abcf5e121a9262
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219414
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68063}
This reverts commit dfdef88547.
Reason for revert: https://ci.chromium.org/p/v8/builders/ci/V8%20Blink%20Mac/2718?
Original change's description:
> [wasm-simd] Fix extract lane unsigned extend
>
> The interpreter is missing a static cast when extracting lanes smaller
> than int32_t and doing an unsigned extend. The array in Simd128 is
> signed, so a direct cast to uint32_t will be a signed extension. The fix
> is to, in the unsigned case, cast to unsigned (of the appropriate size)
> first, then cast to uint32_t.
>
> Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68029}
TBR=gdeepti@chromium.org,zhin@chromium.org
Change-Id: Icdd0e705f4c7252aef2cadaa39ec52204b5c6093
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219412
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68030}
The interpreter is missing a static cast when extracting lanes smaller
than int32_t and doing an unsigned extend. The array in Simd128 is
signed, so a direct cast to uint32_t will be a signed extension. The fix
is to, in the unsigned case, cast to unsigned (of the appropriate size)
first, then cast to uint32_t.
Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68029}
This removes the post-mvp flag for bitmask, since it was accepted into
the proposal, see https://github.com/WebAssembly/simd/pull/201.
Bug: v8:10308
Change-Id: I4ced43a6484660125d773bc9de46bdea9f72b13b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216532
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67993}
The proposal uses the lane shape, e.g. i64x2.anytrue, and we were using
s1x2.anytrue in our opcodes. This was a legacy naming, because we were
trying to bitpack the booleans. Now that we aren't doing that, rename
these to be more consistent with the proposal.
This was done with a straightforward sed script, changing both cpp code
and also some comments in mjsunit test files.
Bug: v8:10506
Change-Id: If077ed805de23520d8580d6b3b1906c80f67b94f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2207915
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67945}
The codegen uses a bunch of vpmax to try and keep set bits around. The
datatype for vpmax does not need to change for each instruction, since
vpmax U32 will persist set bits just as well. This simplifies the
instruction sequences for S1x8 and S1x16 anytrue.
I added a test to check a special case when a f64x2 contains -0.0 (top
bit set). A previous attempt to optimize codegen used floating point
compare, which does not distinguish between 0.0 and -0.0. So -0.0 will
compare equals to 0.0, and incorrect return 0 for anytrue.
Change-Id: I66013796af08a666009e6b2d774ea7ee7bdfe1ad
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2203113
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67875}
Load extends always load 64-bits. Previously, we were setting the max
alignment to be the size_log_2 of the load_type. For LoadExtends the
load_type indicates what the lane size to be extended is, *NOT* the size
to be loaded.
Bug: chromium:1082848
Change-Id: I0c4115ea6ec916211b03afdb83376ccc05c0c244
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2202721
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67815}
Same implementation as the one for x64 in https://crrev.com/c/2186630.
Bug: v8:10501
Change-Id: If2b6c0fdc649afba3449d9579452cf7047a55a54
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2188556
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67721}
This patch implements f32x4.pmin, f32x4.pmax, f64x2.pmin, and f64x2.pmax
for x64 and interpreter.
Pseudo-min and Pseudo-max instructions were proposed in
https://github.com/WebAssembly/simd/pull/122. These instructions
exactly match std::min and std::max in C++ STL, and thus have different
semantics from the existing min and max.
The instruction-selector for x64 switches the operands around, because
it allows for defining the dst to be same as first (really the second
input node), allowing better codegen.
For example, b = f32x4.pmin(a, b) directly maps to vminps(b, b, a) or
minps(b, a), as long as we can define dst == b, and switching the
instruction operands around allows us to do that.
Bug: v8:10501
Change-Id: I06f983fc1764caf673e600ac91d9c0ac5166e17e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2186630
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67688}
Move them all into wasm-macro-gen.h, other opcodes have their macros
there as well. This will make reusing these macros easier when we have
other test files for SIMD. (An upcoming one is for scalar lowering
tests.)
Change-Id: I6c21100ce490abbc26f80a0d204815687fd62f00
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2185471
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67658}
This reverts commit 3c40082543.
Reason for revert: Re-enable interpreter tests
Original change's description:
> [wasm-simd] Remove interpreter tier of SIMD tests
>
> As per the all-hands a couple of weeks ago, the interpreter will
> be removed soon. Remove running tests on this tier, so we no longer
> put effort into maintaining tests for this tier.
>
> Change-Id: I9fce0f3a7cd869d6ccecf1c1f820b794e89858e1
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2175021
> Reviewed-by: Zhi An Ng <zhin@chromium.org>
> Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67520}
TBR=gdeepti@chromium.org,zhin@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Change-Id: Iac0f21311769157c5ae303e8078c25d96fbc7c93
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2180343
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67546}
As per the all-hands a couple of weeks ago, the interpreter will
be removed soon. Remove running tests on this tier, so we no longer
put effort into maintaining tests for this tier.
Change-Id: I9fce0f3a7cd869d6ccecf1c1f820b794e89858e1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2175021
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67520}
- Update opcode numbers, tests
- As the wasm-module-builder currently assumes opcode bytes, skip
the test that needs a multi-byte leb128 opcode
- Renumber post-MVP opcodes
Change-Id: I6531e954e63986dc6f7a3144ec054d16e6dc1b05
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2173952
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67517}
Due to lack of testing environment before, there are some bugs in the
implementations of wasm-simd on mips64 platform, this CL fix them
according to the test on Loongson 3A4000.
Change-Id: I59ab6315987fc94a06cf0bf23754f5c593879532
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2162416
Commit-Queue: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67413}
Load splat opcodes are currently multi-byte, but were not passing the
right lengths for decoding of immediates.
Bug: v8:10258
Change-Id: I2c93c3f915eaa43a74722cf0285f161d16ef0ff6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2154769
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67326}
SIMD opcodes consist of the prefix byte, then an LEB128 encoded int. We
were decoding this incorrectly as a fixed uint8. This fixes the decoder
to properly handle multi bytes.
In some cases, the multi byte logic is applied to all prefixed opcodes.
This is not a problem, since for values < 0x80, the LEB encoding is a
single byte, and decodes to the same int. If the prefix opcode has
instructions with index >= 0x80, it would be required to be LEB128
encoded anyway.
There are a bunch of trivial changes to test-run-wasm-simd, to change
the macro from BUILD to BUILD_V, the former only works for single byte
opcodes, the latter is a new template-based macro that correct handles
multi-byte opcodes. The only unchanged test is the shuffle fuzzer test,
which builds its own sequence of bytes without using the BUILD macro.
Bug: v8:10258
Change-Id: Ie7377e899a7eab97ecf28176fd908babc08d0f19
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2118476
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67186}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on ia32.
Drive by additions of disasm and disasm tests to some instructions.
Bug: v8:10308
Change-Id: I3725ed6959ae55f96ee7950130776a4f08e177c9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2127314
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66989}
MSVC 19.25 complains about signbit being ambiguous between
signbit(float) and signbit(double) overloads when called with an int8_t.
To remove the ambiguity, cast to a double.
Change-Id: I698f05eed9248eef493bbe46b75fcd07e37e2a05
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2118510
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Richard Townsend <richard.townsend@arm.com>
Cr-Commit-Position: refs/heads/master@{#66856}
Introduces a new macro BUILD_V (v is for vector) that pushes bytes into
a vector (instead of directly in an array initializer, see BUILD). This
has the positive effect of being able to handle opcodes of multiple
bytes (e.g. SIMD opcodes bigger that 0xfd80). Because of this "API"
change, our helper macros in test-run-wasm-simd.cc and wasm-run-utils.h
need to change too. So, we introduce new macros (suffixed by _V), that
will call the appropriate lambdas defined in BUILD_V, that knows how to
push bytes into the vector, and also can handle multi-byte opcodes.
This design has a bit of duplication and ugliness, but was chosen to
reduce the impact of existing tests. No restructuring of test code is
required, we only need to add suffix _V.
Note that we do not have multi-byte opcodes yet (in wasm-opcodes.h),
this change will be breaking, and requires all the tests to be updated
to use _V macros first.
Bug: v8:10258
Change-Id: I86638a548fe2f9714c1cfb3bd691fb7b49bfd652
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107650
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66812}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on interpreter and
arm64.
These operations are behind wasm_simd_post_mvp flag, as we are only
prototyping to evaluate performance. The codegen is based on guidance at
https://github.com/WebAssembly/simd/pull/201.
Bug: v8:10308
Change-Id: I835aa8a23e677a00ee7897c1c31a028850e238a9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2099451
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66793}
"I64x2Eq", "S1x2AnyTrue" and "S1x2AllTrue" do not yet have lowering
implemented hence some of the test case may fail on s390x
hardware without AVX support.
Change-Id: Ice01bcaed78950fbad36e2ba37c8f7ae5d10b59b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107763
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#66780}
This Cl enables simd on machines which support
VECTOR_ENHANCE_FACILITY_1. It also enables related tests to
match execution on x64.
LoadTransform tests must be skipped on the simulator until a future CL
matches behaviour between native BE and its simulator on LE.
Change-Id: Iaadc32e0388bf15d3d7c550062a373fb403b65c4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107053
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#66754}
Some opcodes are introduced in V8 for prototyping, and performance
measurements that are not officially a part of the current SIMD proposal
but may be included in future, gate these by a separate flag.
Change-Id: Icc6a9e89c6196c8ff144d2e0193d707e1f60c38b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2079539
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Ben Smith <binji@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66542}
FMA tests that are running on Liftoff can use fused results, since the
tests will fall back to TurboFan.
Bug: v8:9415
Change-Id: I02edea5ce1447263f7bc7574573418b0055aef8f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2063202
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66438}
Implements lowering for:
- i16x8.load8x8_s
- i16x8.load8x8_u
- i32x4.load16x4_s
- i32x4.load16x4_u
As before, i64x2 is not implemented since 64-bit lowering and scalar
lowering don't work together yet.
Bug: v8:9886
Change-Id: I3728d009e053acf82baacbcf1c6c08ea636ef241
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2044546
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66380}
We lower each op into num_lanes loads, and connecting up the effects in
a chain.
s64x2 is not implemented since we lowering for 64x2 generally doesn't
work anyway.
Load extends are a bit more complicated, so we'll do that in a separate
change.
Bug: v8:9886
Change-Id: I80096827bf8e8e0db1ef0ad1b76759ed1797ca5e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2031893
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66183}
These were not added in https://crrev.com/c/2026067 when we added
similar tests for other lane sizes, since x64 had a completely different
path for i8x16. But this tests are useful anyway for other archs, so add
them in.
Bug: v8:10115
Change-Id: I77ecca0cd9f4021c94f1538aa5635b5d54983207
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2041708
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66178}
Define a macro in code-generator-x64 to help identify cases when the
shift value is an immediate/constant. In those cases we can directly
emit the shifts without any masking, since the instruction selector
would have modulo-ed the shift value. We also don't need any temporaries
in this case.
This is only x64 codegen, optimizations for other archs will come in
future patches (and will probably look very similar to this).
The current test case passes the shifts as an immediate, so we add a new
path that loads the shift value from memory, thereby exercising the
slower path of non-immediate shift value.
Bug: v8:10115
Change-Id: Iaf13d81595714882a8f5418734e031b8bc654af3
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2026067
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66074}
On backends that do not have s128 support in Liftoff, tests will bail
out to TurboFan, so tests will continue running and passing.
Bug: v8:9909
Change-Id: I3b596a73b6cb2e8645a99c65a935026f9e1a8d55
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2029332
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66056}
These conversion instructions were removed from the proposal in
https://github.com/WebAssembly/simd/pull/178.
Change-Id: I212ca2f923362bf08e178f6d28cc2338cf6f5927
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2016006
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66015}
Remove the execution tier check for simd tests. On archs without
Liftoff, those tests that are configured to run on Liftoff will fail
with this check, since they bail out to TF.
We remove this check for now, but will think of a way to enforce this in
a more platform specific way.
Bug: v8:9909
Change-Id: Id56f841fe6e342434af3dbcdaef0a8a284614994
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2019924
Reviewed-by: Milad Farazmand <miladfar@ca.ibm.com>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65983}
This relands commit 009993adb4.
The fix is in liftoff-assembler-ia32.h, the codegen was incorrect.
Original change's description:
> Implement f32x4.splat and enable handling this in Liftoff.
>
> We add a new macro for defining test cases to run on TurboFan, Liftoff,
> interpreter, and scalar lowering.
>
> Also add an assertion that the execution tier used is what we expected
> it to be. This is useful for Liftoff, because by default it falls back
> to TurboFan when it encounters an unimplemented opcode.
>
> Bug: v8:9909
Bug: v8:9909
Change-Id: I7daacbe8b195d9212367190c515b0babbc457a88
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2018043
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65963}
This reverts commit 009993adb4.
Reason for revert: New test fails, see https://ci.chromium.org/p/v8/builders/ci/V8%20Linux/35534 and https://ci.chromium.org/p/v8/builders/ci/V8%20Win32%20-%20debug/23778
Original change's description:
> [liftoff][wasm-simd] Implement f32x4.splat
>
> Implement f32x4.splat and enable handling this in Liftoff.
>
> We add a new macro for defining test cases to run on TurboFan, Liftoff,
> interpreter, and scalar lowering.
>
> Also add an assertion that the execution tier used is what we expected
> it to be. This is useful for Liftoff, because by default it falls back
> to TurboFan when it encounters an unimplemented opcode.
>
> Bug: v8:9909
> Change-Id: I594955fce778173191fc44c38c4f956a05e77839
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2014753
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Clemens Backes <clemensb@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#65954}
TBR=clemensb@chromium.org,zhin@chromium.org
Change-Id: Ie6970a8c29baab149150dd734a95f89be5fd89ff
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9909
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2017722
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65958}
Implement f32x4.splat and enable handling this in Liftoff.
We add a new macro for defining test cases to run on TurboFan, Liftoff,
interpreter, and scalar lowering.
Also add an assertion that the execution tier used is what we expected
it to be. This is useful for Liftoff, because by default it falls back
to TurboFan when it encounters an unimplemented opcode.
Bug: v8:9909
Change-Id: I594955fce778173191fc44c38c4f956a05e77839
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2014753
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65954}
This CL implements load_extend with 2 lanes and all load_splat
operations on IA32. The necessary assemblers together with their
corresponding disassemblers and tests are also added in this CL.
The newly added opcodes include: S8x16LoadSplat, S16x8LoadSplat,
S32x4LoadSplat, S64x2LoadSplat, I64x2Load32x2S, I64x2Load32x2U.
Bug: v8:9886
Change-Id: I0a5dae0a683985c14c433ba9d85acbd1cee6705f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1982989
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhiguo Zhou <zhiguo.zhou@intel.com>
Cr-Commit-Position: refs/heads/master@{#65937}
Note the tricky part in instruction-selector-x64, where we flip the
inputs given to the code generator. This is because the semantics we
want is: v128.andnot a b = a & !b, but the x64 instruction performs
andnps a b = !a & b. Therefore we flip the inputs, and combined with
g.DefineSameAsFirst, the output register will be the same as b, and we
can use andnps without any modifications in both SSE and AVX cases.
Bug: v8:10082
Change-Id: Iff98dc1dd944fbc642875f6306c6633d5d646615
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1980894
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65738}
For I16x8Splat and I8x16Splat, the arguments takes I32, which can hold a
value larger than what should be splatted. We add tests to check that
the splatted values is the truncated I32 value (top bits masked off).
See https://github.com/WebAssembly/simd/pull/151 for the updated to
proposal text.
Change-Id: Ib32770872e70c7cde2028130d2b44b416594610e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1986200
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65642}
This change includes templatization of the test helper to allow the
same function to be reused for both signed and unsigned data types.
We implement a new function RoundingAverageUnsigned in overflowing-math,
rather than in base/utils, since the addition could overflow.
SIMD scalar lowering and implementation for other backends will follow
in future patches.
Bug: v8:10039
Change-Id: I70735f7b6536f197869ef1afbccaf5649e7e8448
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1958007
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65531}
Also some cleanup reordering of instruction codes.
Bug: v8:9813
Change-Id: I35caad0b84dd5824090046cba964454eac45d5d8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1925613
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65088}
These instructions should always treat inputs as signed, and saturate to
unsigned min/max values.
E.g. given -1, it should saturate to 0.
The spec text,
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing,
has been updated to describe this.
The changes here include codegen changes to ia32, x64, arm, and arm64,
changes to arm simulator, assembler, and disassembler to handle the case
of treating input as signed and narrowing to unsigned. The vqmovn
instruction can handle this case, our assembler wasn't allowing callers
to specify this.
The interpreter and scalar lowering are also fixed with this change.
Bug: v8:9729
Change-Id: I6f72baa825f59037f7754485df6a2964af59fe31
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1879423
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65051}
This implements the rest of the load extend instructions:
- i32x4.load16x4_s
- i32x4.load16x4_u
- i64x2.load32x2_s
- i64x2.load32x2_u
Bug: v8:9886
Change-Id: I4649f77bae5224042a1628d9f0498c050b1e599d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1903812
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65017}
This makes sure that the {WasmGraphBuilder} properly detects the
presence of Simd128 global.get and global.set opcodes and triggers
scalar lowering on architectures without Simd128 support.
R=clemensb@chromium.org
TEST=cctest/test-run-wasm-simd/RunWasm_S128Globals
BUG=v8:9973
Change-Id: I1538bd1d3fea40cc78e82b125d4f113842faf68a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1917148
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65002}
Introduce new operator LoadTransform that holds a LoadTransformInfo param,
which describes the kind of load (normal, unaligned, protected), and a
transformation (splat or extend, signed or unsigned).
We have a new method that a full decoder needs to implement, LoadTransform,
which resuses the existing LoadType we have, but also takes a LoadTransform,
to distinguish between splats and extends at the decoder level.
This implements 4 out of the 10 suggested load splat/extend operations
(to keep the cl smaller), and is also missing interpreter support (will
be added in the future).
Change-Id: I1e65c693bfbe30e2a511c81b5a32e06aacbddc19
Bug: v8:9886
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1863863
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64819}