This reverts commit 3c40082543.
Reason for revert: Re-enable interpreter tests
Original change's description:
> [wasm-simd] Remove interpreter tier of SIMD tests
>
> As per the all-hands a couple of weeks ago, the interpreter will
> be removed soon. Remove running tests on this tier, so we no longer
> put effort into maintaining tests for this tier.
>
> Change-Id: I9fce0f3a7cd869d6ccecf1c1f820b794e89858e1
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2175021
> Reviewed-by: Zhi An Ng <zhin@chromium.org>
> Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67520}
TBR=gdeepti@chromium.org,zhin@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Change-Id: Iac0f21311769157c5ae303e8078c25d96fbc7c93
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2180343
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67546}
As per the all-hands a couple of weeks ago, the interpreter will
be removed soon. Remove running tests on this tier, so we no longer
put effort into maintaining tests for this tier.
Change-Id: I9fce0f3a7cd869d6ccecf1c1f820b794e89858e1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2175021
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67520}
- Update opcode numbers, tests
- As the wasm-module-builder currently assumes opcode bytes, skip
the test that needs a multi-byte leb128 opcode
- Renumber post-MVP opcodes
Change-Id: I6531e954e63986dc6f7a3144ec054d16e6dc1b05
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2173952
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67517}
Due to lack of testing environment before, there are some bugs in the
implementations of wasm-simd on mips64 platform, this CL fix them
according to the test on Loongson 3A4000.
Change-Id: I59ab6315987fc94a06cf0bf23754f5c593879532
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2162416
Commit-Queue: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67413}
Load splat opcodes are currently multi-byte, but were not passing the
right lengths for decoding of immediates.
Bug: v8:10258
Change-Id: I2c93c3f915eaa43a74722cf0285f161d16ef0ff6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2154769
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67326}
SIMD opcodes consist of the prefix byte, then an LEB128 encoded int. We
were decoding this incorrectly as a fixed uint8. This fixes the decoder
to properly handle multi bytes.
In some cases, the multi byte logic is applied to all prefixed opcodes.
This is not a problem, since for values < 0x80, the LEB encoding is a
single byte, and decodes to the same int. If the prefix opcode has
instructions with index >= 0x80, it would be required to be LEB128
encoded anyway.
There are a bunch of trivial changes to test-run-wasm-simd, to change
the macro from BUILD to BUILD_V, the former only works for single byte
opcodes, the latter is a new template-based macro that correct handles
multi-byte opcodes. The only unchanged test is the shuffle fuzzer test,
which builds its own sequence of bytes without using the BUILD macro.
Bug: v8:10258
Change-Id: Ie7377e899a7eab97ecf28176fd908babc08d0f19
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2118476
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67186}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on ia32.
Drive by additions of disasm and disasm tests to some instructions.
Bug: v8:10308
Change-Id: I3725ed6959ae55f96ee7950130776a4f08e177c9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2127314
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66989}
MSVC 19.25 complains about signbit being ambiguous between
signbit(float) and signbit(double) overloads when called with an int8_t.
To remove the ambiguity, cast to a double.
Change-Id: I698f05eed9248eef493bbe46b75fcd07e37e2a05
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2118510
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Richard Townsend <richard.townsend@arm.com>
Cr-Commit-Position: refs/heads/master@{#66856}
Introduces a new macro BUILD_V (v is for vector) that pushes bytes into
a vector (instead of directly in an array initializer, see BUILD). This
has the positive effect of being able to handle opcodes of multiple
bytes (e.g. SIMD opcodes bigger that 0xfd80). Because of this "API"
change, our helper macros in test-run-wasm-simd.cc and wasm-run-utils.h
need to change too. So, we introduce new macros (suffixed by _V), that
will call the appropriate lambdas defined in BUILD_V, that knows how to
push bytes into the vector, and also can handle multi-byte opcodes.
This design has a bit of duplication and ugliness, but was chosen to
reduce the impact of existing tests. No restructuring of test code is
required, we only need to add suffix _V.
Note that we do not have multi-byte opcodes yet (in wasm-opcodes.h),
this change will be breaking, and requires all the tests to be updated
to use _V macros first.
Bug: v8:10258
Change-Id: I86638a548fe2f9714c1cfb3bd691fb7b49bfd652
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107650
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66812}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on interpreter and
arm64.
These operations are behind wasm_simd_post_mvp flag, as we are only
prototyping to evaluate performance. The codegen is based on guidance at
https://github.com/WebAssembly/simd/pull/201.
Bug: v8:10308
Change-Id: I835aa8a23e677a00ee7897c1c31a028850e238a9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2099451
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66793}
"I64x2Eq", "S1x2AnyTrue" and "S1x2AllTrue" do not yet have lowering
implemented hence some of the test case may fail on s390x
hardware without AVX support.
Change-Id: Ice01bcaed78950fbad36e2ba37c8f7ae5d10b59b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107763
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#66780}
This Cl enables simd on machines which support
VECTOR_ENHANCE_FACILITY_1. It also enables related tests to
match execution on x64.
LoadTransform tests must be skipped on the simulator until a future CL
matches behaviour between native BE and its simulator on LE.
Change-Id: Iaadc32e0388bf15d3d7c550062a373fb403b65c4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2107053
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Milad Farazmand <miladfar@ca.ibm.com>
Cr-Commit-Position: refs/heads/master@{#66754}
Some opcodes are introduced in V8 for prototyping, and performance
measurements that are not officially a part of the current SIMD proposal
but may be included in future, gate these by a separate flag.
Change-Id: Icc6a9e89c6196c8ff144d2e0193d707e1f60c38b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2079539
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Ben Smith <binji@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66542}
FMA tests that are running on Liftoff can use fused results, since the
tests will fall back to TurboFan.
Bug: v8:9415
Change-Id: I02edea5ce1447263f7bc7574573418b0055aef8f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2063202
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66438}
Implements lowering for:
- i16x8.load8x8_s
- i16x8.load8x8_u
- i32x4.load16x4_s
- i32x4.load16x4_u
As before, i64x2 is not implemented since 64-bit lowering and scalar
lowering don't work together yet.
Bug: v8:9886
Change-Id: I3728d009e053acf82baacbcf1c6c08ea636ef241
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2044546
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66380}
We lower each op into num_lanes loads, and connecting up the effects in
a chain.
s64x2 is not implemented since we lowering for 64x2 generally doesn't
work anyway.
Load extends are a bit more complicated, so we'll do that in a separate
change.
Bug: v8:9886
Change-Id: I80096827bf8e8e0db1ef0ad1b76759ed1797ca5e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2031893
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66183}
These were not added in https://crrev.com/c/2026067 when we added
similar tests for other lane sizes, since x64 had a completely different
path for i8x16. But this tests are useful anyway for other archs, so add
them in.
Bug: v8:10115
Change-Id: I77ecca0cd9f4021c94f1538aa5635b5d54983207
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2041708
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66178}
Define a macro in code-generator-x64 to help identify cases when the
shift value is an immediate/constant. In those cases we can directly
emit the shifts without any masking, since the instruction selector
would have modulo-ed the shift value. We also don't need any temporaries
in this case.
This is only x64 codegen, optimizations for other archs will come in
future patches (and will probably look very similar to this).
The current test case passes the shifts as an immediate, so we add a new
path that loads the shift value from memory, thereby exercising the
slower path of non-immediate shift value.
Bug: v8:10115
Change-Id: Iaf13d81595714882a8f5418734e031b8bc654af3
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2026067
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66074}
On backends that do not have s128 support in Liftoff, tests will bail
out to TurboFan, so tests will continue running and passing.
Bug: v8:9909
Change-Id: I3b596a73b6cb2e8645a99c65a935026f9e1a8d55
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2029332
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66056}
These conversion instructions were removed from the proposal in
https://github.com/WebAssembly/simd/pull/178.
Change-Id: I212ca2f923362bf08e178f6d28cc2338cf6f5927
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2016006
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66015}
Remove the execution tier check for simd tests. On archs without
Liftoff, those tests that are configured to run on Liftoff will fail
with this check, since they bail out to TF.
We remove this check for now, but will think of a way to enforce this in
a more platform specific way.
Bug: v8:9909
Change-Id: Id56f841fe6e342434af3dbcdaef0a8a284614994
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2019924
Reviewed-by: Milad Farazmand <miladfar@ca.ibm.com>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65983}
This relands commit 009993adb4.
The fix is in liftoff-assembler-ia32.h, the codegen was incorrect.
Original change's description:
> Implement f32x4.splat and enable handling this in Liftoff.
>
> We add a new macro for defining test cases to run on TurboFan, Liftoff,
> interpreter, and scalar lowering.
>
> Also add an assertion that the execution tier used is what we expected
> it to be. This is useful for Liftoff, because by default it falls back
> to TurboFan when it encounters an unimplemented opcode.
>
> Bug: v8:9909
Bug: v8:9909
Change-Id: I7daacbe8b195d9212367190c515b0babbc457a88
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2018043
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65963}
This reverts commit 009993adb4.
Reason for revert: New test fails, see https://ci.chromium.org/p/v8/builders/ci/V8%20Linux/35534 and https://ci.chromium.org/p/v8/builders/ci/V8%20Win32%20-%20debug/23778
Original change's description:
> [liftoff][wasm-simd] Implement f32x4.splat
>
> Implement f32x4.splat and enable handling this in Liftoff.
>
> We add a new macro for defining test cases to run on TurboFan, Liftoff,
> interpreter, and scalar lowering.
>
> Also add an assertion that the execution tier used is what we expected
> it to be. This is useful for Liftoff, because by default it falls back
> to TurboFan when it encounters an unimplemented opcode.
>
> Bug: v8:9909
> Change-Id: I594955fce778173191fc44c38c4f956a05e77839
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2014753
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Clemens Backes <clemensb@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#65954}
TBR=clemensb@chromium.org,zhin@chromium.org
Change-Id: Ie6970a8c29baab149150dd734a95f89be5fd89ff
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9909
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2017722
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65958}
Implement f32x4.splat and enable handling this in Liftoff.
We add a new macro for defining test cases to run on TurboFan, Liftoff,
interpreter, and scalar lowering.
Also add an assertion that the execution tier used is what we expected
it to be. This is useful for Liftoff, because by default it falls back
to TurboFan when it encounters an unimplemented opcode.
Bug: v8:9909
Change-Id: I594955fce778173191fc44c38c4f956a05e77839
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2014753
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65954}
This CL implements load_extend with 2 lanes and all load_splat
operations on IA32. The necessary assemblers together with their
corresponding disassemblers and tests are also added in this CL.
The newly added opcodes include: S8x16LoadSplat, S16x8LoadSplat,
S32x4LoadSplat, S64x2LoadSplat, I64x2Load32x2S, I64x2Load32x2U.
Bug: v8:9886
Change-Id: I0a5dae0a683985c14c433ba9d85acbd1cee6705f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1982989
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhiguo Zhou <zhiguo.zhou@intel.com>
Cr-Commit-Position: refs/heads/master@{#65937}
Note the tricky part in instruction-selector-x64, where we flip the
inputs given to the code generator. This is because the semantics we
want is: v128.andnot a b = a & !b, but the x64 instruction performs
andnps a b = !a & b. Therefore we flip the inputs, and combined with
g.DefineSameAsFirst, the output register will be the same as b, and we
can use andnps without any modifications in both SSE and AVX cases.
Bug: v8:10082
Change-Id: Iff98dc1dd944fbc642875f6306c6633d5d646615
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1980894
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65738}
For I16x8Splat and I8x16Splat, the arguments takes I32, which can hold a
value larger than what should be splatted. We add tests to check that
the splatted values is the truncated I32 value (top bits masked off).
See https://github.com/WebAssembly/simd/pull/151 for the updated to
proposal text.
Change-Id: Ib32770872e70c7cde2028130d2b44b416594610e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1986200
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65642}
This change includes templatization of the test helper to allow the
same function to be reused for both signed and unsigned data types.
We implement a new function RoundingAverageUnsigned in overflowing-math,
rather than in base/utils, since the addition could overflow.
SIMD scalar lowering and implementation for other backends will follow
in future patches.
Bug: v8:10039
Change-Id: I70735f7b6536f197869ef1afbccaf5649e7e8448
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1958007
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65531}
Also some cleanup reordering of instruction codes.
Bug: v8:9813
Change-Id: I35caad0b84dd5824090046cba964454eac45d5d8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1925613
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65088}
These instructions should always treat inputs as signed, and saturate to
unsigned min/max values.
E.g. given -1, it should saturate to 0.
The spec text,
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing,
has been updated to describe this.
The changes here include codegen changes to ia32, x64, arm, and arm64,
changes to arm simulator, assembler, and disassembler to handle the case
of treating input as signed and narrowing to unsigned. The vqmovn
instruction can handle this case, our assembler wasn't allowing callers
to specify this.
The interpreter and scalar lowering are also fixed with this change.
Bug: v8:9729
Change-Id: I6f72baa825f59037f7754485df6a2964af59fe31
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1879423
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65051}
This implements the rest of the load extend instructions:
- i32x4.load16x4_s
- i32x4.load16x4_u
- i64x2.load32x2_s
- i64x2.load32x2_u
Bug: v8:9886
Change-Id: I4649f77bae5224042a1628d9f0498c050b1e599d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1903812
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65017}
This makes sure that the {WasmGraphBuilder} properly detects the
presence of Simd128 global.get and global.set opcodes and triggers
scalar lowering on architectures without Simd128 support.
R=clemensb@chromium.org
TEST=cctest/test-run-wasm-simd/RunWasm_S128Globals
BUG=v8:9973
Change-Id: I1538bd1d3fea40cc78e82b125d4f113842faf68a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1917148
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65002}
Introduce new operator LoadTransform that holds a LoadTransformInfo param,
which describes the kind of load (normal, unaligned, protected), and a
transformation (splat or extend, signed or unsigned).
We have a new method that a full decoder needs to implement, LoadTransform,
which resuses the existing LoadType we have, but also takes a LoadTransform,
to distinguish between splats and extends at the decoder level.
This implements 4 out of the 10 suggested load splat/extend operations
(to keep the cl smaller), and is also missing interpreter support (will
be added in the future).
Change-Id: I1e65c693bfbe30e2a511c81b5a32e06aacbddc19
Bug: v8:9886
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1863863
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64819}
There are a couple of bugs here:
1. The immediate used for vinsertps is wrong when lane == 1, the first
two bits specify which element of the source is copied, and it should
always be 00, 01 to copy the first 2 lanes of source.
2. For both cases, the second insertps call should be using dst as the
src, since dst was already updated by the first insertps call, it was
incorrectly using the old value of src. This was probably working
correctly because in many cases dst and src happened to be the same
register.
3. rep cannot be same as dst, because dst is overwritten, and rep should
stay the same
I also modified the F64x2ReplaceLane to test separately for replacing
lane 0 and lane 1.
Fixed bug 3. for arm and arm64.
Bug: v8:9728
Change-Id: Iec6e48bcfbc7d27908dd86d5f113a8b5dedd499b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1877055
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64620}
This introduces 2 new machine operators that are variants of I64x2Splat
and I64x2ReplaceLane that takes two int32 operands instead of one i64
operand.
Bug: v8:9728
Change-Id: I6675f991e6c56821c84d183dacfda96961c1a708
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1841242
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64337}
The vst1 and vld1 instruction does a post-increment access. What we
intend is the usual access at (base+offset). This change adds a helper
function that is called for load and stores of s128, which emits the add
instruction to do base+offset, and then change the addressing mode of
the load/store to Operand2_R, which generates the variant of vld1/vst1
without the offset register. This is similar to how kSimd128 values are
loaded/stored in VisitUnalignedLoad and VisitUnalignedStore.
We also remove kSimd128 cases from UnalignedLoad and UnalignedStore,
since it is supported (see A3.2.1 Unaligned Data Access, ARM DDI
0406C.d)
Bug: v8:9746
Bug: v8:9748
Change-Id: I60b987ac58a5eaacd498a940625163484a3dc2db
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1834771
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64229}
This CL implements i8x16.extract_lane_u, i16x8.extract_lane_u operations by
changing the default narrow extract operations to be unsigned. The
sign-extended extracts are implemented on top of the unsigned extracts
with an additional extend compiler node.
For IA32/X64, the codegen effectively remains the same -
0x389332bc32a3 63 660f3a14c900 pextrb rcx,xmm1,0
0x389332bc32a9 69 0fbec9 movsxbl rcx,rcx
0x389332bc32a3 63 660f3a14c900 pextrb rcx,xmm1,0
0x389332bc32a9 69 0fbec9 movsxbl rcx,rcx
On ARM, this adds an additional sxt instruction for the signed extracts.
Bug: v8:8460
Change-Id: I67f14b2b860ff8cc86ffbb2f65c7ef7de32da83f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1846711
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64172}
This brings our constants back in line with the changed spec text. We
already use kExprTableGet and kExprTableSet, but for locals and globals
we still use the old wording.
This renaming is mostly mechanical.
PS1 was created using:
ag -l 'kExpr(Get|Set)Global' src test | \
xargs -L1 sed -E 's/kExpr(Get|Set)Global\b/kExprGlobal\1/g' -i
PS2 contains manual fixes.
R=mstarzinger@chromium.org
Bug: v8:9810
Change-Id: I064a6448cd95bc24d31a5931b5b4ef2464ea88b1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1847355
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64163}
FMA operations is always supported on arm64, so in the test, we expect
fused results on arm64 whenever we run on TurboFan.
Bug: v8:9415
Change-Id: Ia2016533b9b76ee14b8c8da1c0d4ff7753276714
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1819723
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63973}
Add a new test SimdLoadStoreLoadMemargOffset to test this, without this fix
this test would have failed.
Bug: v8:9753
Change-Id: I119adda8e3c6c7adb0ad4023298bbce9c0c64a01
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1811457
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63914}
Quasi Fused Multiply-Add and Quasi Fused Multiply-Subtract performs, on floats, a + b * c and a - b * c respectively.
When there is only a single rounding, it is a fused operation. Quasi in this case means that the result can either be fused or not fused (two roundings), depending on hardware support.
It is tricky to write the test because we need to calculate the expected value, and there is no easy way to express fused or unfused operation in C++, i.e.
we cannot confirm that float expected = a + b * c will perform a fused or unfused operation (unless we use intrinsics).
Thus in the test we have a list of simple checks, plus interesting values that we know will produce different results depending on whether it was fused or not.
The difference between 32x4 and 64x2 qfma/qfms is the type, and also the values of b and c that will cause an overflow, and thus the intermediate rounding will affect the final result.
The same array can be copy pasted for both types, but with a bit of templating we can avoid that duplication.
Change-Id: I0973a3d28468d25f310b593c72f21bff54d809a7
Bug: v8:9415
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1779325
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63878}
Drive by fix of type of expected value in a test
Bug: v8:9626
Change-Id: I1bb44082b873383ea75e7089828bc68c9d4e0df0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1757503
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63727}
Implementations for other architectures will follow in subsequent
changes.
Bug: v8:8460
Change-Id: I279388ab76b1d88d65cbe179088be5573c17fc58
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1796317
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63693}
Implement I64x2 multiply using 32-bit multiplies.
This approach uses two fewer cycles (0.88x) on Cortex-A53 and three fewer cycles (0.86x)
on Cortex-A72, compared to moving to general purpose registers and doing two 64-bit multiplies.
Based on a patch by Zhi An Ng.
Bug: v8:8460
Change-Id: I9c8d3bb77f0d751eec2d85823522558b7f173628
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1781696
Commit-Queue: Martyn Capewell <martyn.capewell@arm.com>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63558}