The load splat, load extend, load zero macros are essentially the same,
consolidate them into a single macro.
Change-Id: Ic812043b37524deb3a9e6ddc223bb95ae77e1d4d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2304715
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68978}
Prototype in TurboFan x64 and interpreter, bailout in Liftoff.
Suggested in https://github.com/WebAssembly/simd/pull/237.
Bug: v8:10713
Change-Id: I5346c351fb2ec5240b74013e62aef07c46d5d9b6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2300924
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68973}
There is a sign-extension bug happening when packing 2 32-bit ints into
a 64-bit int. We are OR-ing int32_t with a uint64_t, so an integral
conversion converts int32_t to uint64_t, which is a sign extension, and
this gives unexpected results for a negative value:
0x80000000 | uint64_t{0} -> 0xffffffff80000000
What we want is 0x0000000080000000.
Created a helper function to do this work of combining two uint32_t
into one uint64_t. The use of this function will also ensure that
if callers passed a int32_t, it would first be converted to a
uint32_t, and will not have this sign extension bug.
Sneaked a small regression test into the existing v128.const cctest,
and also cleanup the loop to reset `expected` array to 0.
Bug: chromium:1104033
Change-Id: Icaca4c5ba42077dd4463697b9220cdbca9974b5e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2293044
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68850}
This implements v128.const for ia32, x64, arm, and arm64.
Moves one of the test case under the correct header.
Bug: v8:9909
Change-Id: I93eb179ac5fd0bc22e3dd5277f7d73699ac8b452
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2290623
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68806}
And removed the ifdef guards around instruction-selector and
tests since v128.const is now implemented for x86, x64, arm, arm64.
Bug: v8:8460
Change-Id: I0ed8aede0a07db2fd286bf0c3385eba1079558f8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2285149
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68745}
Port 871183ea12
Original Commit Message:
- Add wasm opcode, decode and compiler code for v128.const
- Add codegen implementations for v128.const on x64/Arm64
- Reuse/Rename some shuffle specific methods to handle generic
128-bit immediates
- Tests
R=gdeepti@chromium.org, joransiu@ca.ibm.com, jyan@ca.ibm.com, michael_dawson@ca.ibm.com
BUG=
LOG=N
Change-Id: Ia4990f768b6fac0ac72cf79129a53b531c9c2fa9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2280541
Reviewed-by: Junliang Yan <jyan@ca.ibm.com>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#68691}
Prototype f64x2.nearest on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintn, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintn, which is the same instruction used for
F64RoundTiesEven (scalar), wasm-compiler reuses the Float64RoundTiesEven
check.
Bug: v8:10553
Change-Id: Ia4c4245cac87c132331f54e81dad323fc3fb9f6d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2268358
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68619}
Prototype f64x2.trunc on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintz, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintz, which is the same instruction used for F64
trunc (scalar), wasm-compiler reuses the Float64RoundTruncate check.
Bug: v8:10553
Change-Id: I074d5b4172809915d4b37c59bd3b0dcbf9a45e1d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2268357
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68592}
Prototype f64x2.floor on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintm, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintm, which is the same instruction used for
Float64RoundDown (scalar), wasm-compiler reuses the Float64RoundDown check.
Bug: v8:10553
Change-Id: I6f3d5c378a811ed94859535667aed1fa2d1ee552
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2265234
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68589}
Prototype f64x2.ceil on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintp, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintp, which is the same instruction used for
Float64RoundUp (scalar), wasm-compiler reuses the Float64RoundUp check.
Bug: v8:10553
Change-Id: I5841c6a06f260debe8ae90d331bdcc2a0fa3278c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2258813
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68553}
Prototype f32x4.nearest on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintn, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintn, which is the same instruction used for
F32RoundTiesEven (scalar), wasm-compiler reuses the Float32RoundTiesEven
check.
Bug: v8:10553
Change-Id: I066b8c5f10fd86294afe1c530c516493deeb7b53
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2258037
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68526}
Prototype f32x4.trunc on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintz, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintz, which is the same instruction used for F32
trunc (scalar), wasm-compiler reuses the Float32RoundTruncate check.
Bug: v8:10553
Change-Id: I65ddc36ccff21f8f0ff21a6e768184c084ffcfea
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2256770
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68498}
Scalar lowering for i8x16, i16x8, i32x4 bitmask.
Depending on which lane we are lowering, we can either shift the MSB
into the correct final bit position, then do a big OR of all the nodes.
Bug: v8:10308
Change-Id: Iddf6c077b5a8658a487cef59f2e3bbae3c8bd98d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219327
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68491}
Prototype f32x4.floor on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintm, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintm, which is the same instruction used for F32
Floor (scalar), wasm-compiler reuses the Float32RoundDown check.
Bug: v8:10553
Change-Id: I540e82a156131821f732cd427df2e5c68f4094d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2252541
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68470}
This implements I32x4DotI16x8S for ia32.
Also fixes instruction-selector for SIMD ops, they should all set operand1 to be a register, since we do not have memory alignment yet.
Bug: v8:10583
Change-Id: Id273816efd5eea128580f3f7bde533a8e1b2435d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2231031
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68444}
Prototype f32x4.ceil on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintp, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintp, which is the same instruction used for F32
Ceil (scalar), wasm-compiler reuses the Float32Round check, rather than
creating new F32x4Round optional operators.
Implementation for vrintp (Advanced SIMD version that takes Q
registers), assembler, disassembler support. Incomplete for now, but
more will be added as we add other rounding modes.
Bug: v8:10553
Change-Id: I4563608b9501f6f57c3a8325b17de89da7058a43
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2248779
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68419}
This is a reland of 3692bef9f9
Integer overflow in the test code is fixed by using
MulWithWraparound.
Original change's description:
> [wasm-simd][x64] Prototype i32x4.dot_i16x8_s
>
> This implements I32x4DotI16x8S for x64 and interpreter.
>
> Bug: v8:10583
> Change-Id: I404ac68c19c1686a93f29c3f4fc2d661c9558c67
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2229056
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68244}
Bug: v8:10583
Change-Id: Ie7d0032f5398b6f725c02b572764258adacc8578
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2236962
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68343}
This is a reland of f7f72b7b3a
This was reverted because of a test timing out on slow_path
variant (https://crrev.com/c/2237131 for details). Turns out
the test is just really slow, and was skipped on this variant
in https://crrev.com/c/2237628. Relanding without changes.
Original change's description:
> [wasm-simd] Prototype f64x2 rounding instructions
>
> Implements f64x2 ceil, floor, trunc, nearestint, for interpreter and
> x64.
>
> Bug: v8:10553
> Change-Id: I12a260a3b1d728368e5525d317d30fc9581cae04
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2213082
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68241}
Tbr: tebbi@chromium.org
Bug: v8:10553
Change-Id: I4cdc23d0556f11310d32fa066f40b057fd49d2d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2237350
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68304}
This reverts commit dfbbb4a531.
Reason for revert: Bitmask added post 84 cut, so it is not part of origin trial. Therefore it is still a post-mvp.
Original change's description:
> [wasm-simd] Add bitmask to SIMD MVP
>
> This removes the post-mvp flag for bitmask, since it was accepted into
> the proposal, see https://github.com/WebAssembly/simd/pull/201.
>
> Bug: v8:10308
> Change-Id: I4ced43a6484660125d773bc9de46bdea9f72b13b
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216532
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67993}
TBR=gdeepti@chromium.org,zhin@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: v8:10308
Change-Id: I53294be4ea816f37c7cc5f545afb572538dd4770
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2233183
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68216}
This is a reland of dfdef88547
Original change's description:
> [wasm-simd] Fix extract lane unsigned extend
>
> The interpreter is missing a static cast when extracting lanes smaller
> than int32_t and doing an unsigned extend. The array in Simd128 is
> signed, so a direct cast to uint32_t will be a signed extension. The fix
> is to, in the unsigned case, cast to unsigned (of the appropriate size)
> first, then cast to uint32_t.
>
> Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68029}
Change-Id: Ica7974a2f1f2a4f07b54cc68f9abcf5e121a9262
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219414
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68063}
This reverts commit dfdef88547.
Reason for revert: https://ci.chromium.org/p/v8/builders/ci/V8%20Blink%20Mac/2718?
Original change's description:
> [wasm-simd] Fix extract lane unsigned extend
>
> The interpreter is missing a static cast when extracting lanes smaller
> than int32_t and doing an unsigned extend. The array in Simd128 is
> signed, so a direct cast to uint32_t will be a signed extension. The fix
> is to, in the unsigned case, cast to unsigned (of the appropriate size)
> first, then cast to uint32_t.
>
> Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68029}
TBR=gdeepti@chromium.org,zhin@chromium.org
Change-Id: Icdd0e705f4c7252aef2cadaa39ec52204b5c6093
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219412
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68030}
The interpreter is missing a static cast when extracting lanes smaller
than int32_t and doing an unsigned extend. The array in Simd128 is
signed, so a direct cast to uint32_t will be a signed extension. The fix
is to, in the unsigned case, cast to unsigned (of the appropriate size)
first, then cast to uint32_t.
Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68029}
This removes the post-mvp flag for bitmask, since it was accepted into
the proposal, see https://github.com/WebAssembly/simd/pull/201.
Bug: v8:10308
Change-Id: I4ced43a6484660125d773bc9de46bdea9f72b13b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216532
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67993}
The proposal uses the lane shape, e.g. i64x2.anytrue, and we were using
s1x2.anytrue in our opcodes. This was a legacy naming, because we were
trying to bitpack the booleans. Now that we aren't doing that, rename
these to be more consistent with the proposal.
This was done with a straightforward sed script, changing both cpp code
and also some comments in mjsunit test files.
Bug: v8:10506
Change-Id: If077ed805de23520d8580d6b3b1906c80f67b94f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2207915
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67945}
The codegen uses a bunch of vpmax to try and keep set bits around. The
datatype for vpmax does not need to change for each instruction, since
vpmax U32 will persist set bits just as well. This simplifies the
instruction sequences for S1x8 and S1x16 anytrue.
I added a test to check a special case when a f64x2 contains -0.0 (top
bit set). A previous attempt to optimize codegen used floating point
compare, which does not distinguish between 0.0 and -0.0. So -0.0 will
compare equals to 0.0, and incorrect return 0 for anytrue.
Change-Id: I66013796af08a666009e6b2d774ea7ee7bdfe1ad
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2203113
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67875}
Load extends always load 64-bits. Previously, we were setting the max
alignment to be the size_log_2 of the load_type. For LoadExtends the
load_type indicates what the lane size to be extended is, *NOT* the size
to be loaded.
Bug: chromium:1082848
Change-Id: I0c4115ea6ec916211b03afdb83376ccc05c0c244
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2202721
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67815}
Same implementation as the one for x64 in https://crrev.com/c/2186630.
Bug: v8:10501
Change-Id: If2b6c0fdc649afba3449d9579452cf7047a55a54
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2188556
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67721}
This patch implements f32x4.pmin, f32x4.pmax, f64x2.pmin, and f64x2.pmax
for x64 and interpreter.
Pseudo-min and Pseudo-max instructions were proposed in
https://github.com/WebAssembly/simd/pull/122. These instructions
exactly match std::min and std::max in C++ STL, and thus have different
semantics from the existing min and max.
The instruction-selector for x64 switches the operands around, because
it allows for defining the dst to be same as first (really the second
input node), allowing better codegen.
For example, b = f32x4.pmin(a, b) directly maps to vminps(b, b, a) or
minps(b, a), as long as we can define dst == b, and switching the
instruction operands around allows us to do that.
Bug: v8:10501
Change-Id: I06f983fc1764caf673e600ac91d9c0ac5166e17e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2186630
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67688}
Move them all into wasm-macro-gen.h, other opcodes have their macros
there as well. This will make reusing these macros easier when we have
other test files for SIMD. (An upcoming one is for scalar lowering
tests.)
Change-Id: I6c21100ce490abbc26f80a0d204815687fd62f00
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2185471
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67658}
This reverts commit 3c40082543.
Reason for revert: Re-enable interpreter tests
Original change's description:
> [wasm-simd] Remove interpreter tier of SIMD tests
>
> As per the all-hands a couple of weeks ago, the interpreter will
> be removed soon. Remove running tests on this tier, so we no longer
> put effort into maintaining tests for this tier.
>
> Change-Id: I9fce0f3a7cd869d6ccecf1c1f820b794e89858e1
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2175021
> Reviewed-by: Zhi An Ng <zhin@chromium.org>
> Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67520}
TBR=gdeepti@chromium.org,zhin@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Change-Id: Iac0f21311769157c5ae303e8078c25d96fbc7c93
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2180343
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67546}
As per the all-hands a couple of weeks ago, the interpreter will
be removed soon. Remove running tests on this tier, so we no longer
put effort into maintaining tests for this tier.
Change-Id: I9fce0f3a7cd869d6ccecf1c1f820b794e89858e1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2175021
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67520}
- Update opcode numbers, tests
- As the wasm-module-builder currently assumes opcode bytes, skip
the test that needs a multi-byte leb128 opcode
- Renumber post-MVP opcodes
Change-Id: I6531e954e63986dc6f7a3144ec054d16e6dc1b05
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2173952
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67517}
Due to lack of testing environment before, there are some bugs in the
implementations of wasm-simd on mips64 platform, this CL fix them
according to the test on Loongson 3A4000.
Change-Id: I59ab6315987fc94a06cf0bf23754f5c593879532
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2162416
Commit-Queue: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67413}
Load splat opcodes are currently multi-byte, but were not passing the
right lengths for decoding of immediates.
Bug: v8:10258
Change-Id: I2c93c3f915eaa43a74722cf0285f161d16ef0ff6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2154769
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67326}
SIMD opcodes consist of the prefix byte, then an LEB128 encoded int. We
were decoding this incorrectly as a fixed uint8. This fixes the decoder
to properly handle multi bytes.
In some cases, the multi byte logic is applied to all prefixed opcodes.
This is not a problem, since for values < 0x80, the LEB encoding is a
single byte, and decodes to the same int. If the prefix opcode has
instructions with index >= 0x80, it would be required to be LEB128
encoded anyway.
There are a bunch of trivial changes to test-run-wasm-simd, to change
the macro from BUILD to BUILD_V, the former only works for single byte
opcodes, the latter is a new template-based macro that correct handles
multi-byte opcodes. The only unchanged test is the shuffle fuzzer test,
which builds its own sequence of bytes without using the BUILD macro.
Bug: v8:10258
Change-Id: Ie7377e899a7eab97ecf28176fd908babc08d0f19
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2118476
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67186}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on ia32.
Drive by additions of disasm and disasm tests to some instructions.
Bug: v8:10308
Change-Id: I3725ed6959ae55f96ee7950130776a4f08e177c9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2127314
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66989}
MSVC 19.25 complains about signbit being ambiguous between
signbit(float) and signbit(double) overloads when called with an int8_t.
To remove the ambiguity, cast to a double.
Change-Id: I698f05eed9248eef493bbe46b75fcd07e37e2a05
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2118510
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Richard Townsend <richard.townsend@arm.com>
Cr-Commit-Position: refs/heads/master@{#66856}
Introduces a new macro BUILD_V (v is for vector) that pushes bytes into
a vector (instead of directly in an array initializer, see BUILD). This
has the positive effect of being able to handle opcodes of multiple
bytes (e.g. SIMD opcodes bigger that 0xfd80). Because of this "API"
change, our helper macros in test-run-wasm-simd.cc and wasm-run-utils.h
need to change too. So, we introduce new macros (suffixed by _V), that
will call the appropriate lambdas defined in BUILD_V, that knows how to
push bytes into the vector, and also can handle multi-byte opcodes.
This design has a bit of duplication and ugliness, but was chosen to
reduce the impact of existing tests. No restructuring of test code is
required, we only need to add suffix _V.
Note that we do not have multi-byte opcodes yet (in wasm-opcodes.h),
this change will be breaking, and requires all the tests to be updated
to use _V macros first.
Bug: v8:10258
Change-Id: I86638a548fe2f9714c1cfb3bd691fb7b49bfd652
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107650
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66812}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on interpreter and
arm64.
These operations are behind wasm_simd_post_mvp flag, as we are only
prototyping to evaluate performance. The codegen is based on guidance at
https://github.com/WebAssembly/simd/pull/201.
Bug: v8:10308
Change-Id: I835aa8a23e677a00ee7897c1c31a028850e238a9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2099451
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66793}
"I64x2Eq", "S1x2AnyTrue" and "S1x2AllTrue" do not yet have lowering
implemented hence some of the test case may fail on s390x
hardware without AVX support.
Change-Id: Ice01bcaed78950fbad36e2ba37c8f7ae5d10b59b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107763
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#66780}
This Cl enables simd on machines which support
VECTOR_ENHANCE_FACILITY_1. It also enables related tests to
match execution on x64.
LoadTransform tests must be skipped on the simulator until a future CL
matches behaviour between native BE and its simulator on LE.
Change-Id: Iaadc32e0388bf15d3d7c550062a373fb403b65c4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107053
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#66754}
Some opcodes are introduced in V8 for prototyping, and performance
measurements that are not officially a part of the current SIMD proposal
but may be included in future, gate these by a separate flag.
Change-Id: Icc6a9e89c6196c8ff144d2e0193d707e1f60c38b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2079539
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Ben Smith <binji@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66542}
FMA tests that are running on Liftoff can use fused results, since the
tests will fall back to TurboFan.
Bug: v8:9415
Change-Id: I02edea5ce1447263f7bc7574573418b0055aef8f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2063202
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66438}
Implements lowering for:
- i16x8.load8x8_s
- i16x8.load8x8_u
- i32x4.load16x4_s
- i32x4.load16x4_u
As before, i64x2 is not implemented since 64-bit lowering and scalar
lowering don't work together yet.
Bug: v8:9886
Change-Id: I3728d009e053acf82baacbcf1c6c08ea636ef241
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2044546
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66380}
We lower each op into num_lanes loads, and connecting up the effects in
a chain.
s64x2 is not implemented since we lowering for 64x2 generally doesn't
work anyway.
Load extends are a bit more complicated, so we'll do that in a separate
change.
Bug: v8:9886
Change-Id: I80096827bf8e8e0db1ef0ad1b76759ed1797ca5e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2031893
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66183}
These were not added in https://crrev.com/c/2026067 when we added
similar tests for other lane sizes, since x64 had a completely different
path for i8x16. But this tests are useful anyway for other archs, so add
them in.
Bug: v8:10115
Change-Id: I77ecca0cd9f4021c94f1538aa5635b5d54983207
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2041708
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66178}
Define a macro in code-generator-x64 to help identify cases when the
shift value is an immediate/constant. In those cases we can directly
emit the shifts without any masking, since the instruction selector
would have modulo-ed the shift value. We also don't need any temporaries
in this case.
This is only x64 codegen, optimizations for other archs will come in
future patches (and will probably look very similar to this).
The current test case passes the shifts as an immediate, so we add a new
path that loads the shift value from memory, thereby exercising the
slower path of non-immediate shift value.
Bug: v8:10115
Change-Id: Iaf13d81595714882a8f5418734e031b8bc654af3
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2026067
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66074}
On backends that do not have s128 support in Liftoff, tests will bail
out to TurboFan, so tests will continue running and passing.
Bug: v8:9909
Change-Id: I3b596a73b6cb2e8645a99c65a935026f9e1a8d55
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2029332
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66056}
These conversion instructions were removed from the proposal in
https://github.com/WebAssembly/simd/pull/178.
Change-Id: I212ca2f923362bf08e178f6d28cc2338cf6f5927
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2016006
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66015}
Remove the execution tier check for simd tests. On archs without
Liftoff, those tests that are configured to run on Liftoff will fail
with this check, since they bail out to TF.
We remove this check for now, but will think of a way to enforce this in
a more platform specific way.
Bug: v8:9909
Change-Id: Id56f841fe6e342434af3dbcdaef0a8a284614994
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2019924
Reviewed-by: Milad Farazmand <miladfar@ca.ibm.com>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65983}
This relands commit 009993adb4.
The fix is in liftoff-assembler-ia32.h, the codegen was incorrect.
Original change's description:
> Implement f32x4.splat and enable handling this in Liftoff.
>
> We add a new macro for defining test cases to run on TurboFan, Liftoff,
> interpreter, and scalar lowering.
>
> Also add an assertion that the execution tier used is what we expected
> it to be. This is useful for Liftoff, because by default it falls back
> to TurboFan when it encounters an unimplemented opcode.
>
> Bug: v8:9909
Bug: v8:9909
Change-Id: I7daacbe8b195d9212367190c515b0babbc457a88
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2018043
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65963}
This reverts commit 009993adb4.
Reason for revert: New test fails, see https://ci.chromium.org/p/v8/builders/ci/V8%20Linux/35534 and https://ci.chromium.org/p/v8/builders/ci/V8%20Win32%20-%20debug/23778
Original change's description:
> [liftoff][wasm-simd] Implement f32x4.splat
>
> Implement f32x4.splat and enable handling this in Liftoff.
>
> We add a new macro for defining test cases to run on TurboFan, Liftoff,
> interpreter, and scalar lowering.
>
> Also add an assertion that the execution tier used is what we expected
> it to be. This is useful for Liftoff, because by default it falls back
> to TurboFan when it encounters an unimplemented opcode.
>
> Bug: v8:9909
> Change-Id: I594955fce778173191fc44c38c4f956a05e77839
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2014753
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Clemens Backes <clemensb@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#65954}
TBR=clemensb@chromium.org,zhin@chromium.org
Change-Id: Ie6970a8c29baab149150dd734a95f89be5fd89ff
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9909
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2017722
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65958}
Implement f32x4.splat and enable handling this in Liftoff.
We add a new macro for defining test cases to run on TurboFan, Liftoff,
interpreter, and scalar lowering.
Also add an assertion that the execution tier used is what we expected
it to be. This is useful for Liftoff, because by default it falls back
to TurboFan when it encounters an unimplemented opcode.
Bug: v8:9909
Change-Id: I594955fce778173191fc44c38c4f956a05e77839
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2014753
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65954}
This CL implements load_extend with 2 lanes and all load_splat
operations on IA32. The necessary assemblers together with their
corresponding disassemblers and tests are also added in this CL.
The newly added opcodes include: S8x16LoadSplat, S16x8LoadSplat,
S32x4LoadSplat, S64x2LoadSplat, I64x2Load32x2S, I64x2Load32x2U.
Bug: v8:9886
Change-Id: I0a5dae0a683985c14c433ba9d85acbd1cee6705f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1982989
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhiguo Zhou <zhiguo.zhou@intel.com>
Cr-Commit-Position: refs/heads/master@{#65937}
Note the tricky part in instruction-selector-x64, where we flip the
inputs given to the code generator. This is because the semantics we
want is: v128.andnot a b = a & !b, but the x64 instruction performs
andnps a b = !a & b. Therefore we flip the inputs, and combined with
g.DefineSameAsFirst, the output register will be the same as b, and we
can use andnps without any modifications in both SSE and AVX cases.
Bug: v8:10082
Change-Id: Iff98dc1dd944fbc642875f6306c6633d5d646615
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1980894
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65738}
For I16x8Splat and I8x16Splat, the arguments takes I32, which can hold a
value larger than what should be splatted. We add tests to check that
the splatted values is the truncated I32 value (top bits masked off).
See https://github.com/WebAssembly/simd/pull/151 for the updated to
proposal text.
Change-Id: Ib32770872e70c7cde2028130d2b44b416594610e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1986200
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65642}
This change includes templatization of the test helper to allow the
same function to be reused for both signed and unsigned data types.
We implement a new function RoundingAverageUnsigned in overflowing-math,
rather than in base/utils, since the addition could overflow.
SIMD scalar lowering and implementation for other backends will follow
in future patches.
Bug: v8:10039
Change-Id: I70735f7b6536f197869ef1afbccaf5649e7e8448
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1958007
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65531}
Also some cleanup reordering of instruction codes.
Bug: v8:9813
Change-Id: I35caad0b84dd5824090046cba964454eac45d5d8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1925613
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65088}
These instructions should always treat inputs as signed, and saturate to
unsigned min/max values.
E.g. given -1, it should saturate to 0.
The spec text,
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing,
has been updated to describe this.
The changes here include codegen changes to ia32, x64, arm, and arm64,
changes to arm simulator, assembler, and disassembler to handle the case
of treating input as signed and narrowing to unsigned. The vqmovn
instruction can handle this case, our assembler wasn't allowing callers
to specify this.
The interpreter and scalar lowering are also fixed with this change.
Bug: v8:9729
Change-Id: I6f72baa825f59037f7754485df6a2964af59fe31
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1879423
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65051}
This implements the rest of the load extend instructions:
- i32x4.load16x4_s
- i32x4.load16x4_u
- i64x2.load32x2_s
- i64x2.load32x2_u
Bug: v8:9886
Change-Id: I4649f77bae5224042a1628d9f0498c050b1e599d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1903812
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65017}
This makes sure that the {WasmGraphBuilder} properly detects the
presence of Simd128 global.get and global.set opcodes and triggers
scalar lowering on architectures without Simd128 support.
R=clemensb@chromium.org
TEST=cctest/test-run-wasm-simd/RunWasm_S128Globals
BUG=v8:9973
Change-Id: I1538bd1d3fea40cc78e82b125d4f113842faf68a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1917148
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65002}
Introduce new operator LoadTransform that holds a LoadTransformInfo param,
which describes the kind of load (normal, unaligned, protected), and a
transformation (splat or extend, signed or unsigned).
We have a new method that a full decoder needs to implement, LoadTransform,
which resuses the existing LoadType we have, but also takes a LoadTransform,
to distinguish between splats and extends at the decoder level.
This implements 4 out of the 10 suggested load splat/extend operations
(to keep the cl smaller), and is also missing interpreter support (will
be added in the future).
Change-Id: I1e65c693bfbe30e2a511c81b5a32e06aacbddc19
Bug: v8:9886
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1863863
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64819}
There are a couple of bugs here:
1. The immediate used for vinsertps is wrong when lane == 1, the first
two bits specify which element of the source is copied, and it should
always be 00, 01 to copy the first 2 lanes of source.
2. For both cases, the second insertps call should be using dst as the
src, since dst was already updated by the first insertps call, it was
incorrectly using the old value of src. This was probably working
correctly because in many cases dst and src happened to be the same
register.
3. rep cannot be same as dst, because dst is overwritten, and rep should
stay the same
I also modified the F64x2ReplaceLane to test separately for replacing
lane 0 and lane 1.
Fixed bug 3. for arm and arm64.
Bug: v8:9728
Change-Id: Iec6e48bcfbc7d27908dd86d5f113a8b5dedd499b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1877055
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64620}
This introduces 2 new machine operators that are variants of I64x2Splat
and I64x2ReplaceLane that takes two int32 operands instead of one i64
operand.
Bug: v8:9728
Change-Id: I6675f991e6c56821c84d183dacfda96961c1a708
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1841242
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64337}
The vst1 and vld1 instruction does a post-increment access. What we
intend is the usual access at (base+offset). This change adds a helper
function that is called for load and stores of s128, which emits the add
instruction to do base+offset, and then change the addressing mode of
the load/store to Operand2_R, which generates the variant of vld1/vst1
without the offset register. This is similar to how kSimd128 values are
loaded/stored in VisitUnalignedLoad and VisitUnalignedStore.
We also remove kSimd128 cases from UnalignedLoad and UnalignedStore,
since it is supported (see A3.2.1 Unaligned Data Access, ARM DDI
0406C.d)
Bug: v8:9746
Bug: v8:9748
Change-Id: I60b987ac58a5eaacd498a940625163484a3dc2db
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1834771
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64229}
This CL implements i8x16.extract_lane_u, i16x8.extract_lane_u operations by
changing the default narrow extract operations to be unsigned. The
sign-extended extracts are implemented on top of the unsigned extracts
with an additional extend compiler node.
For IA32/X64, the codegen effectively remains the same -
0x389332bc32a3 63 660f3a14c900 pextrb rcx,xmm1,0
0x389332bc32a9 69 0fbec9 movsxbl rcx,rcx
0x389332bc32a3 63 660f3a14c900 pextrb rcx,xmm1,0
0x389332bc32a9 69 0fbec9 movsxbl rcx,rcx
On ARM, this adds an additional sxt instruction for the signed extracts.
Bug: v8:8460
Change-Id: I67f14b2b860ff8cc86ffbb2f65c7ef7de32da83f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1846711
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64172}
This brings our constants back in line with the changed spec text. We
already use kExprTableGet and kExprTableSet, but for locals and globals
we still use the old wording.
This renaming is mostly mechanical.
PS1 was created using:
ag -l 'kExpr(Get|Set)Global' src test | \
xargs -L1 sed -E 's/kExpr(Get|Set)Global\b/kExprGlobal\1/g' -i
PS2 contains manual fixes.
R=mstarzinger@chromium.org
Bug: v8:9810
Change-Id: I064a6448cd95bc24d31a5931b5b4ef2464ea88b1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1847355
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64163}
FMA operations is always supported on arm64, so in the test, we expect
fused results on arm64 whenever we run on TurboFan.
Bug: v8:9415
Change-Id: Ia2016533b9b76ee14b8c8da1c0d4ff7753276714
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1819723
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63973}
Add a new test SimdLoadStoreLoadMemargOffset to test this, without this fix
this test would have failed.
Bug: v8:9753
Change-Id: I119adda8e3c6c7adb0ad4023298bbce9c0c64a01
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1811457
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63914}
Quasi Fused Multiply-Add and Quasi Fused Multiply-Subtract performs, on floats, a + b * c and a - b * c respectively.
When there is only a single rounding, it is a fused operation. Quasi in this case means that the result can either be fused or not fused (two roundings), depending on hardware support.
It is tricky to write the test because we need to calculate the expected value, and there is no easy way to express fused or unfused operation in C++, i.e.
we cannot confirm that float expected = a + b * c will perform a fused or unfused operation (unless we use intrinsics).
Thus in the test we have a list of simple checks, plus interesting values that we know will produce different results depending on whether it was fused or not.
The difference between 32x4 and 64x2 qfma/qfms is the type, and also the values of b and c that will cause an overflow, and thus the intermediate rounding will affect the final result.
The same array can be copy pasted for both types, but with a bit of templating we can avoid that duplication.
Change-Id: I0973a3d28468d25f310b593c72f21bff54d809a7
Bug: v8:9415
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1779325
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63878}
Drive by fix of type of expected value in a test
Bug: v8:9626
Change-Id: I1bb44082b873383ea75e7089828bc68c9d4e0df0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1757503
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63727}
Implementations for other architectures will follow in subsequent
changes.
Bug: v8:8460
Change-Id: I279388ab76b1d88d65cbe179088be5573c17fc58
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1796317
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63693}
Implement I64x2 multiply using 32-bit multiplies.
This approach uses two fewer cycles (0.88x) on Cortex-A53 and three fewer cycles (0.86x)
on Cortex-A72, compared to moving to general purpose registers and doing two 64-bit multiplies.
Based on a patch by Zhi An Ng.
Bug: v8:8460
Change-Id: I9c8d3bb77f0d751eec2d85823522558b7f173628
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1781696
Commit-Queue: Martyn Capewell <martyn.capewell@arm.com>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63558}
This is only for turbofan and interpreter, and simd lowering for 64x2 is
not implemented yet.
Bug: v8:8460
Change-Id: I0d046cb39ff64936da772e0db9a86b88b1509ac2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1769194
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63466}
This off-by-1 error surfaces when the load/store opcodes take up 2
bytes, which is the case for v128.load and v128.store SIMD operations.
Bug: v8:9015
Change-Id: Ife17375ed3450a95399b326bc6415dbc3ed3773b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1769480
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63405}
- Move undef closer to end of usage
- Move I64x2ExtractWithF64x2 closer to Extract tests, and into ifdef
scope so it runs on arm64 builds
Change-Id: I7138c44097975d02e97f4b2b9bfcddd8eb9735c8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1754544
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63227}
This is a reland of
https://chromium-review.googlesource.com/c/v8/v8/+/1749712 with a fix in
test-run-wasm-simd.cc to use base::Divide to work around C++ undefined
behavior when the denominator is 0.
Bug: v8:8460
Change-Id: Ia0a4ff621cccc6d9b7528717bf3fa7c79e42ba1a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1745819
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63198}
Also add a IsExtreme(double) overload.
This wasn't causing issues because there was no codepath
which exercised it (only approx operations did).
Change-Id: If7583fb567137c428d16c0d2cdfc37e086f7f3fd
Bug: v8:8460
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1726675
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63053}
The mask should cover the sign (1 bit), exponent (11 bits) and quiet bit (1 bit) of significand, total of 13 bits. The old mask only covered 9 bits.
Change-Id: I6ec402b4cec34978eac8fa3e5452ad22540a93ce
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1726984
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63015}
-0.0 and 0.0 compare equals, so a < b ? a : b for min would pick b
incorrectly. We need to use JS semantics here, which returns -0.0.
Bug: v8:8425
Change-Id: I8ab094b566ece9c586de86aad4594dfdf8da930b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1724802
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62969}
Change-Id: I99fe89a679e6a628bd6fa7600f756d9a35450243
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1695203
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62751}
Cpplint usually checks for non-const reference arguments. They are
forbidden in the style guide, and v8 does not explicitly make an
exception here.
This CL re-enables that warning, and fixes all current violations by
adding an explicit "NOLINT(runtime/references)" comment. In follow-up
CLs, we should aim to remove as many of them as possible.
TBR=mlippautz@chromium.org
Bug: v8:9429
Change-Id: If7054d0b366138b731972ed5d4e304b5ac8423bb
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1687891
Reviewed-by: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Igor Sheludko <ishell@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62551}
These instructions should return 0 or 1, previously it would return the
min/max of the elements.
Change-Id: I81913c07f11e4a98ce3b9f5d79b5d975e5bf953f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1681130
Reviewed-by: Ben Titzer <titzer@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Martyn Capewell <martyn.capewell@arm.com>
Cr-Commit-Position: refs/heads/master@{#62498}
The test case SimdF32x4ExtractWithI32x4 was still passing when the codegen for
F32x4Extract was entirely commented out. This change adds a new test
cases that specifically exercises F32x4ExtractLane.
It copies what is done in SimdI32x4SplatFromExtract,
which involves moving the splatted and
extracted values around locals, to ensure we move the values around
registers and not unintentionally reuse registers that we splatted to,
without actually extracting anything.
Note that the existing SimdF32x4ExtractWithI32x4 is kept because it is
used to test scalar lowering passes.
Bug: v8:9420
Change-Id: Ieb883175b0e0139e8452c18f09d50b7dfb05a994
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1684699
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Auto-Submit: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62496}
Lowering does not work correctly for I64x2 and F64x2. Those tests are
guarded with X64, so it is fine, but if we remove the guard next
time, the failing tests will be confusing.
Bug: v8:8460
Change-Id: I98da0a2de1fefa8f46bdc5c0a1407973e3ed2b81
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1683928
Auto-Submit: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62494}
Bug: v8:8460
Change-Id: Id159c81cd2d25924be96e49c64073e154ef32e6a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1667867
Reviewed-by: Bill Budge <bbudge@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Auto-Submit: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62475}