This is a follow-up on https://crrev.com/c/3131374 to support more
instructions: float32 sqrt, cmp, round, and float64 cmp.
Rename the opcodes since they are no longer SSE specific.
Bug: v8:12148
Change-Id: Ie5f74bc1b4510092cbfbcb7e420ef82cb1c39a14
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3154983
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76777}
In https://crrev.com/c/3131374 we switched some instructions to use
macro-assembler functions which can handle AVX and SSE. However for
Cvtsi2ss and Cvtsi2sd, the behavior subtly changed. The old behavior
directly called cvtsi2ss/cvtsi2sd in the code-generator. The new
behavior used the macro-assembler functions, which xor the dst operand.
This led to more instructions and larger code size in some benchmarks.
The xor is supposed to help reduce dependence chain length (see comments
on Cvtsi2ss), but doesn't seem to have helped in this benchmark.
So, partially revert the changes, and rename all affected IA32 opcodes
back to SSE.
Bug: chromium:1248509
Change-Id: Ie700e2980fe9ed083c1160bda3a28f64e1e43041
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3154349
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76775}
Move some AVX_OP into shared macro-assembler, for reuse by ia32 in
future patches.
Movlhps is also unused in x64, so remove it.
Drive-by cleanup to use macro assembler helper Move to move 128-bit
const into a XMMRegister.
The change in liftoff-assembler-x64 is required because now the
macro-assembler functions are defined in the base class, so even though
we can use &TurboAssembler::Pcmpeqd to refer to that member function,
it actually resolves to &SharedTurboAssembler::Pcmpeqd.
Bug: v8:11589
Change-Id: Ie8f6a4dfd95b41192936f6e6be48c683042acec4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3150138
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76772}
The {CountClearHalfWords} method is called whenever loading a constant
into a register. It showed up with >0.5% in Liftoff compilation
profiles. This CL refactors the method to return the number of *set*
halfwords instead of *cleared* halfwords and avoids the loop in the
implementation. This makes the method roughly twice as fast, and makes
the code more readable.
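
A minimal sketch of the loop-free idea, not the code landed in this CL.
The function name follows the description above; the signature, the
{HalfWordSet} helper and the plain assert are assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: 1 if the 16-bit halfword at position idx is non-zero.
constexpr int HalfWordSet(uint64_t imm, int idx) {
  return ((imm >> (idx * 16)) & 0xFFFF) != 0 ? 1 : 0;
}

// Counts the non-zero halfwords in the low reg_size bits of imm, using a
// fixed sum per register size instead of a loop.
inline int CountSetHalfWords(uint64_t imm, unsigned reg_size) {
  assert(reg_size == 16 || reg_size == 32 || reg_size == 64);
  switch (reg_size / 16) {
    case 1:
      return HalfWordSet(imm, 0);
    case 2:
      return HalfWordSet(imm, 0) + HalfWordSet(imm, 1);
    default:
      return HalfWordSet(imm, 0) + HalfWordSet(imm, 1) +
             HalfWordSet(imm, 2) + HalfWordSet(imm, 3);
  }
}
```

If the number of cleared halfwords is needed, it is reg_size / 16 minus
this result.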
R=zhin@chromium.org
Bug: v8:11879
Change-Id: I7da8160b3c045e5fc1e97fc0e575083b3920cb5b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3151962
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76771}
If background threads are tiering up, they could temporarily make code
writable (if using the mprotect based approach). This would make our
death tests fail (i.e. not crash).
This CL fixes that by repeatedly writing in that case. Eventually, the
code should be protected again, and then we would crash. Failure to
crash would manifest as a timeout of the tests.
R=jkummerow@chromium.org
CC=mpdenton@chromium.org
Bug: v8:11974
Change-Id: Ibe34af499da9b964ad260d58e9b4e390007898e9
Cq-Include-Trybots: luci.v8.try:v8_mac_arm64_rel_ng
Cq-Include-Trybots: luci.v8.try:v8_mac_arm64_dbg_ng
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3151959
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76770}
We already have some logic to try to get a reasonable name for the
function when logging code. It looks up the name custom section, and
falls back to the function index. Extract this into a helper, and call
it when disassembling the code.
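
A rough sketch of the lookup-with-fallback shape described above. The
{FunctionNames} map, the {GetDebugName} helper and the exact fallback
string are hypothetical and only stand in for the real helper.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for names decoded from the name custom section.
using FunctionNames = std::map<uint32_t, std::string>;

// Returns the name from the name section if one exists, otherwise a name
// derived from the function index (the fallback format is an assumption).
inline std::string GetDebugName(const FunctionNames& names, uint32_t index) {
  auto it = names.find(index);
  if (it != names.end() && !it->second.empty()) return it->second;
  return "wasm-function[" + std::to_string(index) + "]";
}
```

Sharing one helper means logging and disassembly would print the same
name for a given function.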
Bug: v8:12098
Change-Id: Ieebe6594bc3184fa655f878faa0cb67c248d7f56
Fixed: v8:12098
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3125355
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76769}
StackCheck needs to be implemented on liftoff.
Change-Id: I29624d65b82cbba3ef640ab7ea0cc78c2d5f2c4f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3152745
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Milad Fa <mfarazma@redhat.com>
Cr-Commit-Position: refs/heads/main@{#76766}
In the case that {dst}, {lhs} and {rhs} all point to the same register,
we would emit wrong code (negating the register and adding it to
itself). This CL fixes this by checking if {lhs == rhs}, and just
clearing the {dst} register in that case.
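
To illustrate the aliasing cases, here is a small emulation on a toy
register file. The {SubRegs} type, the {EmitI32Sub} name and the exact
branch structure are assumptions based on the description above, not the
code in this CL.

```cpp
#include <array>
#include <cstdint>

// Toy register file standing in for machine registers (illustrative only).
using SubRegs = std::array<uint32_t, 4>;

// Emulates regs[dst] = regs[lhs] - regs[rhs] using two-operand
// instructions, where the destination is also the first source.
inline void EmitI32Sub(SubRegs& regs, int dst, int lhs, int rhs) {
  if (dst != rhs) {
    // Common case: mov dst, lhs (if needed); sub dst, rhs.
    if (dst != lhs) regs[dst] = regs[lhs];
    regs[dst] -= regs[rhs];
  } else if (lhs == rhs) {
    // dst == lhs == rhs: x - x is always 0, so just clear dst.
    regs[dst] = 0;
  } else {
    // dst == rhs but dst != lhs: neg dst; add dst, lhs.
    regs[dst] = 0u - regs[dst];
    regs[dst] += regs[lhs];
  }
}
```

Without the middle branch, the all-aliased case would take the neg/add
path and compute (-x) + (-x) instead of 0.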
R=thibaudm@chromium.org
Bug: chromium:1247659
Change-Id: I7913617850adb34a5ad812369f16a7422358454d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3151955
Reviewed-by: Thibaud Michaud <thibaudm@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76765}
With statically in-bounds memory accesses (implemented in
https://crrev.com/c/2919827) we would only have an offset but no index
register for {TraceMemoryOperation}. This CL fixes that situation.
R=thibaudm@chromium.org
Bug: chromium:1248024
Change-Id: I856b263a560cb71791c61e446e78dd99c9664190
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3149464
Reviewed-by: Thibaud Michaud <thibaudm@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76763}
We have a macro list defined and already use it in other places; use it
to disassemble the AVX instructions too.
Bug: v8:11879
Change-Id: Id1a5bdc167d3f17d603aa2e43e1ac80ef4b1fdb6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3150139
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76759}
Values must be written to memory in LE order on BE machines,
as they will be loaded in reverse when emitting S128Const.
Change-Id: Ia1d6c784505abe499fb71a6d86daea2721615da4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3151956
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Milad Fa <mfarazma@redhat.com>
Cr-Commit-Position: refs/heads/main@{#76758}
With these accessors we can remove Assembler as a friend class.
Drive-by cleanup to change DCHECK(!x || y) to DCHECK_IMPLIES(x, y).
Change-Id: I74b7a23e85b50db93bbfe84fdfcc8563527f14d2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3144374
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76757}
Similar to what is already done in x64, define a macro list for
all the *sd instructions (prefix f2 0f), and use this macro list to
define assembler functions and disassembly.
Bug: v8:11879
Change-Id: Ia7fbd9fe7f07b72c04d82c81726b9673c40eb0de
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3125774
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76756}
By delegating to the macro-assembler, emit AVX instructions for some
float opcodes (float sqrt, round, conversions to and from int,
extract/insert/load word).
Since they now support AVX, we rename the instruction ops to remove the
SSE prefix, changing it to be IA32.
Bug: v8:12148
Change-Id: Ib488f03928756e7d85ab78e6cb28eb869e0641f9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3131374
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76755}
Change-Id: I51dee467f5b843e96ffccbe6e99ba203e8c3bf10
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3111266
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76754}
It could happen that the information about the feedback vector cached in
a JSFunctionData disagreed with the current value of the function's
feedback cell. The inlining code wasn't prepared for that and a CHECK
could fail.
The CL fixes this by removing the caching of
has_feedback_vector and feedback_vector and by getting hold of the
bytecode array before fetching the feedback vector in inlining.
Bug: v8:12172, v8:7790
Change-Id: Ife3ab8872085d9496e6d1f34514114a086f653ad
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3148010
Commit-Queue: Georg Neis <neis@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76751}
Use an immediate zero operand for floating point comparison nodes when
possible. This results in up to 20-25% runtime improvement in some
microbenchmarks, as well as 1-1.5% runtime improvement in some
real-use benchmarks on Cortex-A55 and Neoverse N1.
Change-Id: I39d10871a08a037dbe8c0877d789d110476e1a58
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3133143
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Martyn Capewell <martyn.capewell@arm.com>
Cr-Commit-Position: refs/heads/main@{#76749}
We add call_ref and return_call_ref to the fuzzed module.
We alter the call generation function so that it can also generate call_ref.
Bug: v8:11954
Change-Id: I972b8e053d7eab758ac343d48f0c4631ef24b22b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3148011
Reviewed-by: Manos Koukoutos <manoskouk@chromium.org>
Reviewed-by: Thibaud Michaud <thibaudm@chromium.org>
Commit-Queue: Rakhim Khismet <khismet@google.com>
Cr-Commit-Position: refs/heads/main@{#76748}
Test that signal handlers also cannot write to code, even if a
{CodeSpaceWriteScope} is open when the signal is triggered.
R=jkummerow@chromium.org
CC=mpdenton@chromium.org
Bug: v8:11974
Change-Id: I1e49e4b31ba196948f7f7adfdf88675816e0a58a
Cq-Include-Trybots: luci.v8.try:v8_mac_arm64_rel_ng
Cq-Include-Trybots: luci.v8.try:v8_mac_arm64_dbg_ng
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3140607
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76747}
The icu object cache consists of 5 keys at most -> change it from
an unordered_set to a plain array.
Possible return values of CompareStrings are {-1,0,1}. Return those
directly instead of going through Factory::NewNumberFromInt.
Bug: v8:12196
Change-Id: Ia42bb6b1a0ebdc99550f604aa79cb438b150ee88
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3149454
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76746}
They need to agree about when to delegate to CloneFastJSArray, since it
produces arrays which are potentially COW. If they don't agree, TF
generates code which produces a COW array and then expects it to be
non-COW -> immediate deopt.
This CL gets rid of the discrepancy in the case when there's exactly
one argument and it's the number 0.
Some corner cases remain, e.g., 1st argument not a number but ToInteger
returns 0. These should be extremely rare in the real world.
Bug: v8:12194
Change-Id: I10230245c97f8997da4d79702f29ebff11297229
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3147910
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76745}
AbstractCode doesn't fully support Sparkplug code yet (SourcePosition
and SourcePositionStatement are not supported).
Fall back to using BytecodeArray as AbstractCode at call-sites where
we use these functions.
Bug: chromium:1246259
Change-Id: I839cbff65c96eaaa0057c1e5a8bdd12e2bd721ee
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3147594
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Auto-Submit: Patrick Thier <pthier@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76744}
... by skipping the optimization instead of CHECK-failing.
Bug: v8:12188
Change-Id: I6709bf1c55506f3d12886efbfbb9934788cd02ce
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3148132
Auto-Submit: Georg Neis <neis@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76741}
t6-t8 are scratch registers and should not be allocatable.
Additionally, add s0, s1, s2, s5 and s8 as allocatable registers.
Change-Id: I0805cc5273d0e0ec5040a0376bcbfba276202077
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3147315
Commit-Queue: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76739}
No functionality change is expected.
Bug: v8:11217
Change-Id: I131d52794e4de24ec838cc23f15828edbfc656ff
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3131372
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76738}
Merge the SSE and AVX opcodes for I16x8Eq and I16x8GtS. We delegate to
the macro-assembler to check for AVX.
No functionality change is expected.
Bug: v8:11217
Change-Id: I873b261d6f949bfc6755fe4c0e09b964a02c3684
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3131371
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76737}
Change-Id: I8afa821412ae248ddea990755404a9bf5f33184e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3125434
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76736}
Rolling v8/base/trace_event/common: 3da1e2f..715537d
Rolling v8/build: fbef918..1e4482b
Rolling v8/third_party/aemu-linux-x64: aSVGWUgGw-Nuh-08X80jtqA2bVKylBoNa1h7D-6Kzf0C..ExffPYjGXL4Gz5i52elIFTU-ZZZ3Rgom_ZGpSi12LBoC
Rolling v8/third_party/depot_tools: d69b31c..7285666
Rolling v8/tools/clang: 195c102..c678081
Rolling v8/tools/luci-go: git_revision:3e1f1f7a109ed8aefc7feba94fa737f0b5b4847e..git_revision:7b62727dc713b47d7a7ce9bca27500cb8e82ebd7
TBR=v8-waterfall-sheriff@grotations.appspotmail.com,mtv-sf-v8-sheriff@grotations.appspotmail.com
Change-Id: Id805d5bb7032f8208273f5e2aaa0532c7b03fc67
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3149517
Reviewed-by: v8-ci-autoroll-builder <v8-ci-autoroll-builder@chops-service-accounts.iam.gserviceaccount.com>
Commit-Queue: v8-ci-autoroll-builder <v8-ci-autoroll-builder@chops-service-accounts.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/main@{#76735}
This CL takes advantage of the z15 `store byte reverse element`
instructions to optimize Simd StoreLane opcodes.
On the simulator we only run `store element` as reversing is
not required.
Change-Id: I723f6db535799470c46a1e298a9c1af7574ad5b6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3144373
Reviewed-by: Junliang Yan <junyan@redhat.com>
Commit-Queue: Milad Fa <mfarazma@redhat.com>
Cr-Commit-Position: refs/heads/main@{#76734}
Combine the SSE and AVX versions, delegate to the macro-assembler
functions to check for AVX support.
Change Pand, Por, Pxor to generate the *ps version of the instruction
when AVX is not supported. The *ps versions are 1 byte shorter, and have
no performance difference on SSE-only processors.
Bug: v8:11589
Bug: v8:11217
Change-Id: I9d51054359dcc909efcbb2c3d3bb63d399cd6721
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3124101
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76733}
Do not require that dst == src1, this leaves more flexibility for the
operands. We check in the macro-assembler if dst aliases any of the input
operands, then use vfma231/vfma132/vfma213 appropriately.
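
A sketch of how the variant could be picked, emulated on scalar values.
The {FpRegs} type, the helper names and the assumed semantics
dst = src1 + src2 * src3 are illustrative assumptions; only the three
FMA3 operand orders are standard x86 behavior.

```cpp
#include <array>

// Toy scalar "registers"; real code operates on XMM registers.
using FpRegs = std::array<double, 4>;

// x86 FMA3 operand orders, emulated on scalars:
//   vfmadd231 a, b, c: a = b * c + a
//   vfmadd132 a, b, c: a = a * c + b
//   vfmadd213 a, b, c: a = b * a + c
inline void Fma231(FpRegs& r, int a, int b, int c) { r[a] = r[b] * r[c] + r[a]; }
inline void Fma132(FpRegs& r, int a, int b, int c) { r[a] = r[a] * r[c] + r[b]; }
inline void Fma213(FpRegs& r, int a, int b, int c) { r[a] = r[b] * r[a] + r[c]; }

// Computes r[dst] = r[src1] + r[src2] * r[src3] (assumed Qfma semantics),
// choosing the variant based on which input dst aliases.
inline void Qfma(FpRegs& r, int dst, int src1, int src2, int src3) {
  if (dst == src1) {
    Fma231(r, dst, src2, src3);
  } else if (dst == src2) {
    Fma132(r, dst, src1, src3);
  } else if (dst == src3) {
    Fma213(r, dst, src2, src1);
  } else {
    r[dst] = r[src1];  // No aliasing: copy the addend, then accumulate.
    Fma231(r, dst, src2, src3);
  }
}
```

Only the non-aliasing case needs an extra move; each aliasing case maps
to a single instruction.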
Bug: v8:11659
Change-Id: I3644f5e0e75bd047d4e5f5b52d4234e54d329d15
Fixed: v8:11659
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3131370
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76732}
Globals are no longer LE enforced after https://crrev.com/c/2944437.
LANE is used instead to pick the correct lane on BE machines.
Change-Id: I106bebda2633a4673ad4b5165c0440cc445d9475
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3148036
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Milad Fa <mfarazma@redhat.com>
Cr-Commit-Position: refs/heads/main@{#76730}
In addition to inputs consisting entirely of random bits, the
bigint test shell now also generates inputs that are powers of
two (i.e. have many 0-bits) and inputs with many 1-bits.
Empirically, these kinds of inputs are more likely to flush out
corner case bugs.
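
As a rough illustration of the three input kinds, here is a standalone
sketch. The {Digits} type and the generator names are hypothetical, and
the way "many 1-bits" is produced here (all ones with one bit cleared)
is an assumption, not necessarily what the test shell does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Digits = std::vector<uint64_t>;  // Little-endian 64-bit digits.

// Uniformly random bits in every digit.
inline Digits RandomBits(std::mt19937_64& rng, size_t len) {
  Digits d(len);
  for (auto& digit : d) digit = rng();
  return d;
}

// A power of two: exactly one bit set, at a random position.
inline Digits PowerOfTwo(std::mt19937_64& rng, size_t len) {
  assert(len > 0);
  Digits d(len, 0);
  size_t bit = rng() % (len * 64);
  d[bit / 64] = uint64_t{1} << (bit % 64);
  return d;
}

// Mostly 1-bits: all ones with a single random bit cleared.
inline Digits ManyOnes(std::mt19937_64& rng, size_t len) {
  assert(len > 0);
  Digits d(len, ~uint64_t{0});
  size_t bit = rng() % (len * 64);
  d[bit / 64] &= ~(uint64_t{1} << (bit % 64));
  return d;
}
```

Inputs like these exercise long carry and borrow chains, which purely
random digits rarely produce.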
Bug: v8:11515
Change-Id: Ib69f12bf215055991b028196dc54ebbc00780bae
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3055292
Commit-Queue: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Maya Lekova <mslekova@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76729}