The load splat, load extend, load zero macros are essentially the same,
consolidate them into a single macro.
Change-Id: Ic812043b37524deb3a9e6ddc223bb95ae77e1d4d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2304715
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68978}
Prototype in TurboFan x64 and interpreter, bailout in Liftoff.
Suggested in https://github.com/WebAssembly/simd/pull/237.
Bug: v8:10713
Change-Id: I5346c351fb2ec5240b74013e62aef07c46d5d9b6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2300924
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68973}
There is a sign-extension bug happening when packing 2 32-bit ints into
a 64-bit int. We are OR-ing int32_t with a uint64_t, so an integral
conversion converts int32_t to uint64_t, which is a sign extension, and
this gives unexpected results for a negative value:
0x80000000 | uint64_t{0} -> 0xffffffff80000000
What we want is 0x0000000080000000.
Created a helper function to do this work of combining two uint32_t
into one uint64_t. The use of this function will also ensure that
if callers passed a int32_t, it would first be converted to a
uint32_t, and will not have this sign extension bug.
Sneaked a small regression test into the existing v128.const cctest,
and also cleanup the loop to reset `expected` array to 0.
Bug: chromium:1104033
Change-Id: Icaca4c5ba42077dd4463697b9220cdbca9974b5e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2293044
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68850}
This implements v128.const for ia32, x64, arm, and arm64.
Moves one of the test case under the correct header.
Bug: v8:9909
Change-Id: I93eb179ac5fd0bc22e3dd5277f7d73699ac8b452
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2290623
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68806}
And removed the ifdef guards around instruction-selector and
tests since v128.const is now implemented for x86, x64, arm, arm64.
Bug: v8:8460
Change-Id: I0ed8aede0a07db2fd286bf0c3385eba1079558f8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2285149
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68745}
Port 871183ea12
Original Commit Message:
- Add wasm opcode, decode and compiler code for v128.const
- Add codegen implementations for v128.const on x64/Arm64
- Reuse/Rename some shuffle specific methods to handle generic
128-bit immediates
- Tests
R=gdeepti@chromium.org, joransiu@ca.ibm.com, jyan@ca.ibm.com, michael_dawson@ca.ibm.com
BUG=
LOG=N
Change-Id: Ia4990f768b6fac0ac72cf79129a53b531c9c2fa9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2280541
Reviewed-by: Junliang Yan <jyan@ca.ibm.com>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#68691}
Prototype f64x2.nearest on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintn, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintn, which is the same instruction used for
F64RoundTiesEven (scalar), wasm-compiler reuses the Float64RoundTiesEven
check.
Bug: v8:10553
Change-Id: Ia4c4245cac87c132331f54e81dad323fc3fb9f6d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2268358
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68619}
Prototype f64x2.trunc on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintz, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintz, which is the same instruction used for F64
trunc (scalar), wasm-compiler reuses the Float64RoundTruncate check.
Bug: v8:10553
Change-Id: I074d5b4172809915d4b37c59bd3b0dcbf9a45e1d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2268357
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68592}
Prototype f64x2.floor on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintm, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintm, which is the same instruction used for
Float64RoundDown (scalar), wasm-compiler reuses the Float64RoundDown check.
Bug: v8:10553
Change-Id: I6f3d5c378a811ed94859535667aed1fa2d1ee552
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2265234
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68589}
Prototype f64x2.ceil on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintp, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintp, which is the same instruction used for
Float64RoundUp (scalar), wasm-compiler reuses the Float64RoundUp check.
Bug: v8:10553
Change-Id: I5841c6a06f260debe8ae90d331bdcc2a0fa3278c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2258813
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68553}
Prototype f32x4.nearest on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintn, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintn, which is the same instruction used for
F32RoundTiesEven (scalar), wasm-compiler reuses the Float32RoundTiesEven
check.
Bug: v8:10553
Change-Id: I066b8c5f10fd86294afe1c530c516493deeb7b53
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2258037
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68526}
Prototype f32x4.trunc on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintz, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintz, which is the same instruction used for F32
trunc (scalar), wasm-compiler reuses the Float32RoundTruncate check.
Bug: v8:10553
Change-Id: I65ddc36ccff21f8f0ff21a6e768184c084ffcfea
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2256770
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68498}
Scalar lowering for i8x16, i16x8, i32x4 bitmask.
Depending on which lane we are lowering, we can either shift the MSB
into the correct final bit position, then do a big OR of all the nodes.
Bug: v8:10308
Change-Id: Iddf6c077b5a8658a487cef59f2e3bbae3c8bd98d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219327
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68491}
Prototype f32x4.floor on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintm, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintm, which is the same instruction used for F32
Floor (scalar), wasm-compiler reuses the Float32RoundDown check.
Bug: v8:10553
Change-Id: I540e82a156131821f732cd427df2e5c68f4094d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2252541
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68470}
This implements I32x4DotI16x8S for ia32.
Also fixes instruction-selector for SIMD ops, they should all set operand1 to be a register, since we do not have memory alignment yet.
Bug: v8:10583
Change-Id: Id273816efd5eea128580f3f7bde533a8e1b2435d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2231031
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68444}
Prototype f32x4.ceil on ARM for both ARM v7 and ARM v8. ARM v8 has
support for vrintp, and for ARM v7 we fallback to runtime.
Since ARM v8 uses vrintp, which is the same instruction used for F32
Ceil (scalar), wasm-compiler reuses the Float32Round check, rather than
creating new F32x4Round optional operators.
Implementation for vrintp (Advanced SIMD version that takes Q
registers), assembler, disassembler support. Incomplete for now, but
more will be added as we add other rounding modes.
Bug: v8:10553
Change-Id: I4563608b9501f6f57c3a8325b17de89da7058a43
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2248779
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68419}
This is a reland of 3692bef9f9
Integer overflow in the test code is fixed by using
MulWithWraparound.
Original change's description:
> [wasm-simd][x64] Prototype i32x4.dot_i16x8_s
>
> This implements I32x4DotI16x8S for x64 and interpreter.
>
> Bug: v8:10583
> Change-Id: I404ac68c19c1686a93f29c3f4fc2d661c9558c67
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2229056
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68244}
Bug: v8:10583
Change-Id: Ie7d0032f5398b6f725c02b572764258adacc8578
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2236962
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68343}
This is a reland of f7f72b7b3a
This was reverted because of a test timing out on slow_path
variant (https://crrev.com/c/2237131 for details). Turns out
the test is just really slow, and was skipped on this variant
in https://crrev.com/c/2237628. Relanding without changes.
Original change's description:
> [wasm-simd] Prototype f64x2 rounding instructions
>
> Implements f64x2 ceil, floor, trunc, nearestint, for interpreter and
> x64.
>
> Bug: v8:10553
> Change-Id: I12a260a3b1d728368e5525d317d30fc9581cae04
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2213082
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68241}
Tbr: tebbi@chromium.org
Bug: v8:10553
Change-Id: I4cdc23d0556f11310d32fa066f40b057fd49d2d7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2237350
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68304}
This reverts commit dfbbb4a531.
Reason for revert: Bitmask added post 84 cut, so it is not part of origin trial. Therefore it is still a post-mvp.
Original change's description:
> [wasm-simd] Add bitmask to SIMD MVP
>
> This removes the post-mvp flag for bitmask, since it was accepted into
> the proposal, see https://github.com/WebAssembly/simd/pull/201.
>
> Bug: v8:10308
> Change-Id: I4ced43a6484660125d773bc9de46bdea9f72b13b
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216532
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67993}
TBR=gdeepti@chromium.org,zhin@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: v8:10308
Change-Id: I53294be4ea816f37c7cc5f545afb572538dd4770
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2233183
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68216}
This is a reland of dfdef88547
Original change's description:
> [wasm-simd] Fix extract lane unsigned extend
>
> The interpreter is missing a static cast when extracting lanes smaller
> than int32_t and doing an unsigned extend. The array in Simd128 is
> signed, so a direct cast to uint32_t will be a signed extension. The fix
> is to, in the unsigned case, cast to unsigned (of the appropriate size)
> first, then cast to uint32_t.
>
> Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68029}
Change-Id: Ica7974a2f1f2a4f07b54cc68f9abcf5e121a9262
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219414
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68063}
This reverts commit dfdef88547.
Reason for revert: https://ci.chromium.org/p/v8/builders/ci/V8%20Blink%20Mac/2718?
Original change's description:
> [wasm-simd] Fix extract lane unsigned extend
>
> The interpreter is missing a static cast when extracting lanes smaller
> than int32_t and doing an unsigned extend. The array in Simd128 is
> signed, so a direct cast to uint32_t will be a signed extension. The fix
> is to, in the unsigned case, cast to unsigned (of the appropriate size)
> first, then cast to uint32_t.
>
> Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#68029}
TBR=gdeepti@chromium.org,zhin@chromium.org
Change-Id: Icdd0e705f4c7252aef2cadaa39ec52204b5c6093
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219412
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68030}
The interpreter is missing a static cast when extracting lanes smaller
than int32_t and doing an unsigned extend. The array in Simd128 is
signed, so a direct cast to uint32_t will be a signed extension. The fix
is to, in the unsigned case, cast to unsigned (of the appropriate size)
first, then cast to uint32_t.
Change-Id: Ifabb5b9690f08ad505ac94b84908db0970581818
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216721
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68029}
This removes the post-mvp flag for bitmask, since it was accepted into
the proposal, see https://github.com/WebAssembly/simd/pull/201.
Bug: v8:10308
Change-Id: I4ced43a6484660125d773bc9de46bdea9f72b13b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2216532
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67993}
The proposal uses the lane shape, e.g. i64x2.anytrue, and we were using
s1x2.anytrue in our opcodes. This was a legacy naming, because we were
trying to bitpack the booleans. Now that we aren't doing that, rename
these to be more consistent with the proposal.
This was done with a straightforward sed script, changing both cpp code
and also some comments in mjsunit test files.
Bug: v8:10506
Change-Id: If077ed805de23520d8580d6b3b1906c80f67b94f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2207915
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67945}
The codegen uses a bunch of vpmax to try and keep set bits around. The
datatype for vpmax does not need to change for each instruction, since
vpmax U32 will persist set bits just as well. This simplifies the
instruction sequences for S1x8 and S1x16 anytrue.
I added a test to check a special case when a f64x2 contains -0.0 (top
bit set). A previous attempt to optimize codegen used floating point
compare, which does not distinguish between 0.0 and -0.0. So -0.0 will
compare equals to 0.0, and incorrect return 0 for anytrue.
Change-Id: I66013796af08a666009e6b2d774ea7ee7bdfe1ad
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2203113
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67875}
Load extends always load 64-bits. Previously, we were setting the max
alignment to be the size_log_2 of the load_type. For LoadExtends the
load_type indicates what the lane size to be extended is, *NOT* the size
to be loaded.
Bug: chromium:1082848
Change-Id: I0c4115ea6ec916211b03afdb83376ccc05c0c244
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2202721
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67815}
Same implementation as the one for x64 in https://crrev.com/c/2186630.
Bug: v8:10501
Change-Id: If2b6c0fdc649afba3449d9579452cf7047a55a54
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2188556
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67721}
This patch implements f32x4.pmin, f32x4.pmax, f64x2.pmin, and f64x2.pmax
for x64 and interpreter.
Pseudo-min and Pseudo-max instructions were proposed in
https://github.com/WebAssembly/simd/pull/122. These instructions
exactly match std::min and std::max in C++ STL, and thus have different
semantics from the existing min and max.
The instruction-selector for x64 switches the operands around, because
it allows for defining the dst to be same as first (really the second
input node), allowing better codegen.
For example, b = f32x4.pmin(a, b) directly maps to vminps(b, b, a) or
minps(b, a), as long as we can define dst == b, and switching the
instruction operands around allows us to do that.
Bug: v8:10501
Change-Id: I06f983fc1764caf673e600ac91d9c0ac5166e17e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2186630
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67688}
Move them all into wasm-macro-gen.h, other opcodes have their macros
there as well. This will make reusing these macros easier when we have
other test files for SIMD. (An upcoming one is for scalar lowering
tests.)
Change-Id: I6c21100ce490abbc26f80a0d204815687fd62f00
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2185471
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67658}