Some wasm simd unit tests are not guarded by V8_ENABLE_WEBASSEMBLY,
it will cause test failure on no-wasm build.
Change-Id: Ib08e133f979e492ca620191d799f641bdb0f60bd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3866706
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Jie Pan <jie.pan@intel.com>
Cr-Commit-Position: refs/heads/main@{#82887}
When a 8x16 shuffle matches a packed byte to dword zero extension,
1. input1 is S128Zero after canonicalization,
2. the indices {0,4,8,16} are consecutive value in the range [0-15] and
other indices are in the range [16-31],
the shuffle can be matched to packed byte to dword zero extend. These
shuffles are commonly used in image processing.
Change-Id: I14d1e35401dbc5ecd91f67c46ea9762628835d01
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3547667
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Fanchen Kong <fanchen.kong@intel.com>
Cr-Commit-Position: refs/heads/main@{#80953}
- For y = x & 0xFF, we could use movzxbq y, x.
- For y = x & 0xFFFF, we could use movzxwq y, x.
- For y = x & 0xFFFFFFFF, we could use movl y, x.
- For y = x & immediate and immediate fits into uint32,
we could use andl x, immediate.
Bug: v8:12337
Change-Id: I31f04fa9058c6acabb210f0fce61ac713ed1a382
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3518913
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#79668}
When the input to F64x2PromoteLowF32x4 is a S128Load64Zero, we can skip
the load + promote, and promote directly with a memory operand. The
tricky bit here is that on systems that rely on OOB trap handling, the
load is not eliminatable, so we always visit the S128Load64Zero, even
though after instruction-selector pattern-matching, it is unused. We
mark it as defined to skip visiting it, only if we matched it.
Bug: v8:12189
Change-Id: I0a805a3fce65c56ec52082b3625e1712ea1ee7cf
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3154347
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76917}
1. Move Abspd, Negpd from MacroAssembler into TurboAssembler so that we
can use it in code-generator
2. Add Absps and Negps (float32 versions of the instructions in 1)
3. Refactor SSE/AVX float32/float64 abs/neg to use these macro-assembler
helpers.
4. Use these helpers in Liftoff too
This has the benefit of not requiring to set up the masks in a temporary
register, and loading the constants via an ExternalReference instead.
It does require (in ins-sel) to have the input be in a Register, since
the ExternalReference is an operand (and the instruction can only have 1
operand input).
Bug: v8:11589
Change-Id: I68fafaf31b19ab05ee391aa3d54c45d547a85b34
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3123635
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76520}
This reverts commit d16eefe0f2.
It is not correct to check for node equality during the graph
construction phase, because we can have optimizations that will combine
same nodes. So it can happen that in wasm-compiler, the inputs to
shuffle are not the same, so we canonicalize using that knowledge that
it will not be the same, and allow indices > 15. But later we can have
optimizations that combine the 2 inputs (e.g. splat of the same
constants), and the instruction selector will see that the input nodes
are the same.
Bug: v8:11542,chromium:1199662
Change-Id: I21c175f4707708038710147f64d687d1b14c6ecc
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2829986
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#74017}
When swizzle is called with a v128.const node, we can check that the
indices are either all in bounds, or if they are out of bounds the top
bit of each byte is set. This will match exactly pshufb behavior, and so
we can omit the paddusb (and getting external reference).
Bug: v8:10992
Change-Id: I5479a9eb92ebcfc12bedff5efd3e72bb4a43ff40
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2766222
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#73583}
The test was extended in this CL: https://crrev.com/c/2762420
It now uses wasm::SimdShuffle, which is only available if webassembly is
enabled.
Thus, #if out the test if webassembly is disabled.
Drive-by: Add a missing include.
R=jkummerow@chromium.orgCC=zhin@chromium.org
Bug: v8:11238
Change-Id: I1b53d0145467b58616a161944fb88d2ca256fd58
Cq-Include-Trybots: luci.v8.try:v8_linux64_no_wasm_compile_rel
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2772978
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#73517}
We currently canonicalize shuffles in the architecture specific
instruction selector. This has the drawback that if we want to pattern
match on nodes that have a shuffle as input, they need to individually
canonicalize the shuffle. There can also be a subtle bug if we
canonicalize the same shuffle node twice (see bug for details).
This moves the canonicalization to "construction time", in
wasm-compiler, when building the graph. As such, any pattern matches in
instruction-selector will only need to deal with canonicalized shuffles.
We introduce a new kind of parameter for shuffle nodes,
ShuffleParameter, to store the 16 bytes plus a bool indicating if this
is a swizzle. A swizzle essentially: inputs to the shuffle are the same
or all indices only touch 1 input. We calculate this when
canonicalizing, so store this bit of information inside of the node's
parameter.
We update the tests in x64 to handle special cases where, even though
the node's inputs are not swapped (due to canonicalization), they need
to be swapped for the specific instruction selected (e.g. palignr). The
test data also contains canonicalized shuffles, so we have to manually
canonicalize them.
Bug: v8:11542
Change-Id: I4e78082267bd03d6caedf43d68d81ef3f5f364a8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2762420
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#73495}
I want to extract the Canonicalize shuffle out of the arch-specific
instruction selector, since all archs have to do that anyway. Adding
these tests to make sure the matching still works.
Bug: v8:11542
Change-Id: Ic7ce0e0a027ce858a30f79a0f9ef2495bcaab4c7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2750289
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#73414}
Integer splats (especially for sizes < 32-bits) does not directly
translate to a single instruction on x64. We can do better for special
values, like 0, which can be lowered to `xor dst dst`. We do this check
in the instruction selector, and emit a special opcode kX64S128Zero.
Also change the xor operation for kX64S128Zero from xorps to pxor. This
can help reduce any potential data bypass delay (search for this on
agner's microarchitecture manual for more details.). Since integer
splats are likely to be followed by integer ops, we should remain in the
integer domain, thus use pxor.
For i64x2.splat the codegen goes from:
xorl rdi,rdi
vmovq xmm0,rdi
vmovddup xmm0,xmm0
to:
vpxor xmm0,xmm0,xmm0
Also add a unittest to verify this optimization, and necessary
raw-assembler methods for the test.
Bug: v8:11093
Change-Id: I26b092032b6e672f1d5d26e35d79578ebe591cfe
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2516299
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70977}
Currently there are a number of -Wsubobject-linkage warnings when
compiling with gcc (formatted to fit 72 character lines):
In file included from
...
from ../../testing/gtest/include/gtest/gtest.h:10,
from ../../testing/gtest-support.h:8,
from ../../test/unittests/test-utils.h:20,
from ../../test/unittests/compiler/backend/
instruction-selector-unittest.h:15,
from ../../test/unittests/compiler/x64/
instruction-selector-x64-unittest.cc:9:
../../third_party/googletest/src/googletest/include/gtest/internal/
gtest-param-util.h:
In instantiation of ‘class
testing::internal::ParameterizedTestFactory<v8::internal::compiler::
InstructionSelectorChangeInt32ToInt64Test_ \
ChangeInt32ToInt64WithLoad_Test>’:
../../third_party/googletest/src/googletest/include/gtest/internal/
gtest-param-util.h:439:12: required from
‘testing::internal::TestFactoryBase*
testing::internal::TestMetaFactory<TestSuite>::CreateTestFactory(
testing::internal::TestMetaFactory<TestSuite>::ParamType)
[with
TestSuite = v8::internal::compiler::
InstructionSelectorChangeInt32ToInt64Test_ \
ChangeInt32ToInt64WithLoad_Test;
testing::internal::TestMetaFactory<TestSuite>::ParamType =
v8::internal::compiler::{anonymous}::LoadWithToInt64Extension]’
../../third_party/googletest/src/googletest/include/gtest/internal/
gtest-param-util.h:438:20: required from here
../../third_party/googletest/src/googletest/include/gtest/internal/
gtest-param-util.h:394:7: warning:
‘testing::internal::ParameterizedTestFactory<
v8::internal::compiler::
InstructionSelectorChangeInt32ToInt64Test_ \
ChangeInt32ToInt64WithLoad_Test >’ has a field
‘testing::internal::ParameterizedTestFactory<
v8::internal::compiler::
InstructionSelectorChangeInt32ToInt64Test_ \
ChangeInt32ToInt64WithLoad_Test>::parameter_’ whose type uses the
anonymous namespace [-Wsubobject-linkage]
394 | class ParameterizedTestFactory : public TestFactoryBase {
| ^~~~~~~~~~~~~~~~~~~~~~~~
This commit moves the parameterized tests in question into the
anonymous namespace to avoid the warnings.
Change-Id: I9c4a8bd9f4e225ed14ab64f5433d5f5c102e01a1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2418723
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70482}
With this change, a load from memory into a register can be replaced by a memory operand for floating point binops if possible.
This eliminates one instruction for following pattern:
vmovss xmm0, m32
vmulss xmm1, xmm1, xmm0
===>
vmulss xmm1, xmm1, m32
Change-Id: I6944287fae3b7756621fb6b3d0b3db9e0beaf080
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2411696
Commit-Queue: Fanchen Kong <fanchen.kong@intel.com>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70255}
With a displacement of int32_t min (-2^31), and a displacement mode of
kNegativeDisplacement, we will try to negate this constant, but the
result will not fit in an int32_t, leading to a runtime crash.
Check for this special case in CanBeImmediate, and return false.
Bug: chromium:1091892
Change-Id: I7f18153d13805f2836dd5c8e1bc098f1e9600566
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2341095
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69311}
This removes LoadStackPointer and its last remaining use in the
interpreter assembler.
Bug: v8:9534
Change-Id: I19aafb12c5fd50248841a3d92448e64243c723ad
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1748729
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63164}
Especially for function types, this increases readability significantly.
Also the style guide recommends for 'using' over 'typedef'.
R=mstarzinger@chromium.org
Bug: v8:9183
Change-Id: If2d17863de39383f5a35e089298d37408791ce4b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1631415
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61872}
This replaces all typedefs that define types and not functions by the
equivalent "using" declaration.
This was done mostly automatically using this command:
ag -l '\btypedef\b' src test | xargs -L1 \
perl -i -p0e 's/typedef ([^*;{}]+) (\w+);/using \2 = \1;/sg'
Patchset 2 then adds some manual changes for typedefs for pointer types,
where the regular expression did not match.
R=mstarzinger@chromium.orgTBR=yangguo@chromium.org, jarin@chromium.org
Bug: v8:9183
Change-Id: I6f6ee28d1793b7ac34a58f980b94babc21874b78
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1631409
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61849}
Googletest is (at last) converging with industry-standard terminology
[1]. We previously called test suites "test cases", which was rather
confusing for folks coming from any other testing framework.
Chrome now has a googletest version that supports _TEST_SUITE_ macros
instead of _TEST_CASE_, so this CL cleans up some of the outdated usage.
[1] https://github.com/google/googletest/blob/master/googletest/docs/primer.md#beware-of-the-nomenclature
Bug: chromium:925652
Change-Id: I3cd02b9fa6dbece1594bbfd50a21ad7503c2aab9
Reviewed-on: https://chromium-review.googlesource.com/c/1475654
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Commit-Queue: Victor Costan <pwnall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#59666}
This CL splits the backend of TurboFan off into its own directory,
without changing namespaces. This makes ownership management a bit
more fine-grained with a logical separation.
R=mstarzinger@chromium.org,jarin@chromium.org,adamk@chromium.org
Change-Id: I2ac40d6ca2c4f04b8474b630aae0286ecf79ef42
Reviewed-on: https://chromium-review.googlesource.com/c/1308333
Commit-Queue: Ben Titzer <titzer@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#57437}
The InstructionSelector on x64 was missing the ability to properly match
comparisons of memory operands with zero, i.e. it used to turn something
like
Word32Equal(Load[Uint8](o, i), Int32Constant(0))
into
movzbl reg, [o,i]
cmp 0, reg
even requiring a temporary register. Now with this change it generates
the proper
cmpb [o,i], 0
sequence.
R=sigurds@chromium.org
Bug: v8:8238
Change-Id: I52a71bbf95c85e11cb275f0f4a5726a6873cde95
Reviewed-on: https://chromium-review.googlesource.com/c/1281342
Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56677}
The stack check instruction sequence is pattern-matched in
instruction-selector-{ia32,x64}.cc and replaced with its own specialized
opcode, for which we later generate an efficient stack check in a single
instruction.
But this pattern matching has never worked for CSA-generated code. The
matcher expected LoadStackPointer in the right operand and the external
reference load in the left operand. CSA generated exactly vice-versa.
This CL does a few things; it
1. reverts the recent change to load the
limit from smi roots:
Revert "[csa] Load the stack limit from smi roots"
This reverts commit 507c29c940.
2. tweaks the CSA instruction sequence to output what the matcher
expects.
3. refactors stack check matching into a new StackCheckMatcher class.
4. typifies CSA::PerformStackCheck as a drive-by.
Bug: v8:6666,v8:7844
Change-Id: I9bb879ac10bfe7187750c5f9e7834dc4accf28b5
Reviewed-on: https://chromium-review.googlesource.com/1099068
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#53737}
This patch normalizes the casing of hexadecimal digits in escape
sequences of the form `\xNN` and integer literals of the form
`0xNNNN`.
Previously, the V8 code base used an inconsistent mixture of uppercase
and lowercase.
Google’s C++ style guide uses uppercase in its examples:
https://google.github.io/styleguide/cppguide.html#Non-ASCII_Characters
Moreover, uppercase letters more clearly stand out from the lowercase
`x` (or `u`) characters at the start, as well as lowercase letters
elsewhere in strings.
BUG=v8:7109
TBR=marja@chromium.org,titzer@chromium.org,mtrofin@chromium.org,mstarzinger@chromium.org,rossberg@chromium.org,yangguo@chromium.org,mlippautz@chromium.org
NOPRESUBMIT=true
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_chromium_rel_ng
Change-Id: I790e21c25d96ad5d95c8229724eb45d2aa9e22d6
Reviewed-on: https://chromium-review.googlesource.com/804294
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#49810}
With this change, on ia32 and x64, a load from memory into a register can be replaced by a memory operand for integer binops if it makes sense.
BUG=
Review-Url: https://codereview.chromium.org/2728533003
Cr-Commit-Position: refs/heads/master@{#43739}
We were missing this optimization in a few cases because TruncateInt64ToInt32 was also interfering.
Also removed the equivalent from simplified-lowering.cc, as the arm64 instruction selector has a similar optimization.
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/2252333002
Cr-Commit-Position: refs/heads/master@{#38711}
This CL changes the semantics of FloatXXSub to match the semantics of
the semantics of FloatXXSubPreserveNan. Therefore there is no need
anymore for the FloatXXSubPreserveNan operators.
The optimizations in VisitFloatXXSub which are removed in this CL have
already been moved to machine-operator-reducer.cc in
https://codereview.chromium.org/2226663002R=bmeurer@chromium.org
Review-Url: https://codereview.chromium.org/2220973002
Cr-Commit-Position: refs/heads/master@{#38437}
Before this change we would first load an 8/16/32-bit value from memory into a 32-bit register, then zero/sign-extend from that register to a 64-bit one. Now we replace that pattern with a single movsx/movzx.
Ported from http://crrev.com/2183923003R=bmeurer@chromium.org
Review-Url: https://codereview.chromium.org/2220483003
Cr-Commit-Position: refs/heads/master@{#38388}
MachineType is now a class with two enum fields:
- MachineRepresentation
- MachineSemantic
Both enums are usable on their own, and this change switches some places from using MachineType to use just MachineRepresentation. Most notably:
- register allocator now uses just the representation.
- Phi and Select nodes only refer to representations.
Review URL: https://codereview.chromium.org/1513543003
Cr-Commit-Position: refs/heads/master@{#32738}
Adds support for loading from and storing to outer context
variables. Also adds support for declaring functions on contexts and
locals. Finally, fixes a couple of issues with StaContextSlot where
we weren't emitting the write barrier and therefore would crash in the
GC.
Also added code so that --print-bytecode will output the
function name before the bytecodes, and replaces MachineType with StoreRepresentation in RawMachineAssembler::Store and updates tests.
BUG=v8:4280
LOG=N
Review URL: https://codereview.chromium.org/1425633002
Cr-Commit-Position: refs/heads/master@{#31584}
These operators compute the absolute floating point value of some
arbitrary input, and are implemented without any branches (i.e. using
vabs on arm, and andps/andpd on x86).
R=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/1066393002
Cr-Commit-Position: refs/heads/master@{#27662}
We can use xorps/xorpd on Intel CPUs to flip the sign bit. Ideally we'd
use a RIP-relative 128-bit constant in the code object, as OCaml/GCC
does, however that requires 128-bit alignment for code objects, which is
not yet implemented. So for now we materialize the mask inline.
R=dcarney@chromium.org
Review URL: https://codereview.chromium.org/1046893002
Cr-Commit-Position: refs/heads/master@{#27611}
This adds the basics necessary to support float32 operations in TurboFan.
The actual functionality required to detect safe float32 operations will
be added based on this later. Therefore this does not affect production
code except for some cleanup/refactoring.
In detail, this patchset contains the following features:
- Add support for float32 operations to arm, arm64, ia32 and x64
backends.
- Add float32 machine operators.
- Add support for float32 constants to simplified lowering.
- Handle float32 representation for phis in simplified lowering.
In addition, contains the following (related) cleanups:
- Fix/unify naming of backend instructions.
- Use AVX comparisons when available.
- Extend ArchOpcodeField to 9 bits (required for arm64).
- Refactor some code duplication in instruction selectors.
BUG=v8:3589
LOG=n
R=dcarney@chromium.org
Review URL: https://codereview.chromium.org/1044793002
Cr-Commit-Position: refs/heads/master@{#27509}
Achieve more than parity with modes currently handled on ia32 in preparation for
porting the entire mechanism to ia32, including supporting mul of constants 3,
5 and 9 with "leal" instructions.
TEST=unittests
Review URL: https://codereview.chromium.org/774193003
Cr-Commit-Position: refs/heads/master@{#25677}