This is a follow-up to https://crrev.com/c/3131374 to support more
instructions: float32 sqrt, cmp, round, and float64 cmp.
Rename the opcodes since they are no longer SSE specific.
Bug: v8:12148
Change-Id: Ie5f74bc1b4510092cbfbcb7e420ef82cb1c39a14
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3154983
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76777}
This is similar to what is already done on x64: define a macro list for
all the *sd instructions (prefix f2 0f), and use this macro list to
define assembler functions and disassembly.
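As a rough sketch of the macro-list pattern described above (not the actual
V8 code): one list drives both the assembler functions and the disassembler
table. The names TOY_SD_INSTRUCTION_LIST, ToyAssembler and
DisassembleSdMnemonic are hypothetical, and only the register-to-register
form is modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Each entry is V(mnemonic, opcode byte that follows the F2 0F prefix).
#define TOY_SD_INSTRUCTION_LIST(V) \
  V(sqrtsd, 0x51)                  \
  V(addsd, 0x58)                   \
  V(mulsd, 0x59)                   \
  V(subsd, 0x5C)                   \
  V(divsd, 0x5E)

// Hypothetical assembler that appends encoded bytes to a buffer.
struct ToyAssembler {
  std::vector<uint8_t> buffer;

  // Register-to-register form only; dst and src are XMM codes 0..7.
  void EmitSd(uint8_t opcode, int dst, int src) {
    assert(dst >= 0 && dst < 8 && src >= 0 && src < 8);
    buffer.push_back(0xF2);
    buffer.push_back(0x0F);
    buffer.push_back(opcode);
    // ModRM byte: mod=11 (register), reg=dst, rm=src.
    buffer.push_back(static_cast<uint8_t>(0xC0 | (dst << 3) | src));
  }

  // The list expands into one member function per instruction.
#define DECLARE_SD(name, opcode) \
  void name(int dst, int src) { EmitSd(opcode, dst, src); }
  TOY_SD_INSTRUCTION_LIST(DECLARE_SD)
#undef DECLARE_SD
};

// The same list expands into the disassembler's opcode-to-mnemonic switch.
inline std::string DisassembleSdMnemonic(uint8_t opcode) {
  switch (opcode) {
#define SD_CASE(name, op) \
  case op:                \
    return #name;
    TOY_SD_INSTRUCTION_LIST(SD_CASE)
#undef SD_CASE
    default:
      return "unknown";
  }
}
```

Adding an instruction then means adding a single line to the list, and the
assembler and disassembler cannot drift apart.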
Bug: v8:11879
Change-Id: Ia7fbd9fe7f07b72c04d82c81726b9673c40eb0de
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3125774
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76756}
By delegating to the macro-assembler, emit AVX instructions for some
float opcodes (float sqrt, round, conversions to and from int,
extract/insert/load word).
Since they now support AVX, we rename the instruction ops to remove the
SSE prefix, changing it to be IA32.
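A toy sketch of what delegating to the macro-assembler means here: codegen
calls one entry point, and the AVX or SSE form is chosen inside it.
ToyMacroAssembler records mnemonics instead of encoding bytes; its name, its
fields and the choice to repeat dst as the first AVX source are assumptions
for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a macro-assembler. All names here are hypothetical.
struct ToyMacroAssembler {
  bool avx_supported;
  std::vector<std::string> emitted;

  static std::string Xmm(int code) {
    assert(code >= 0 && code < 8);  // ia32 has xmm0..xmm7
    return "xmm" + std::to_string(code);
  }

  // Single entry point for codegen; the AVX/SSE choice is made here.
  void Sqrtss(int dst, int src) {
    if (avx_supported) {
      // The AVX form takes three operands; dst is repeated as first source.
      emitted.push_back("vsqrtss " + Xmm(dst) + "," + Xmm(dst) + "," +
                        Xmm(src));
    } else {
      emitted.push_back("sqrtss " + Xmm(dst) + "," + Xmm(src));
    }
  }
};
```

Because the opcode no longer implies an SSE encoding, a name without the SSE
prefix describes it better.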
Bug: v8:12148
Change-Id: Ib488f03928756e7d85ab78e6cb28eb869e0641f9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3131374
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76755}
Bug: v8:11589
Change-Id: I7b55efa76f60eacf31700a544f54042eec963f57
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3115545
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76495}
The optimal implementation is in the TurboFan x64 codegen. Move it into
shared-macro-assembler, and have TurboFan ia32 and Liftoff use it. The
optimal implementation accounts for AVX2 support.
We add a couple of AVX2 instructions to ia32 in sse-instr.h. Not all of
them are used yet, but follow-up patches will use them, so we add support
(including disassembly and tests) in this change.
Drive-by cleanup of test-disasm-x64.cc to merge 2 AVX2 test sections.
Bug: v8:11589
Change-Id: I1c8d7deb0f8bb70b29e7a680e5dbcfb09ca5505b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3092555
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76352}
This is a reland of a16add806d.
The fix adds disassembly for pcmpgtq and vpcmpgtq.
While fixing this, I also noticed a mistake in the assembler for pcmpgtq,
which flipped dst and src.
I also realized that we don't detect SSE4.2, so this adds that detection.
PS2 contains these changes.
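To see why flipped operands matter for pcmpgtq, here is a scalar sketch of a
signed 64-bit greater-than compare per lane (the operation behind i64x2
signed compares). I64x2GtS is a hypothetical helper, not V8 code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using I64x2 = std::array<int64_t, 2>;

// Lane-wise signed compare: all ones where a > b, all zeros otherwise.
inline I64x2 I64x2GtS(const I64x2& a, const I64x2& b) {
  I64x2 result = {0, 0};
  for (int i = 0; i < 2; ++i) {
    result[i] = (a[i] > b[i]) ? int64_t{-1} : int64_t{0};
  }
  return result;
}
```

The compare is not symmetric: with a = {5, -3} and b = {2, 7}, comparing
(a, b) gives {-1, 0} while comparing (b, a) gives {0, -1}, so an assembler
that swaps dst and src produces wrong results.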
Original change's description:
> [wasm-simd][ia32] Implement i64x2 signed compares
>
> The code sequence is exactly the same as x64.
>
> Bug: v8:11415
> Change-Id: I53ed2723eda29c0a250cff514372a3d45b203476
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2683495
> Reviewed-by: Bill Budge <bbudge@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#72637}
Bug: v8:11415
Change-Id: If6a18af2d7de20ac8ad38f94b6d0220769397194
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2688119
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72721}
This reverts commit a16add806d.
Reason for revert: Broke Win32 debug https://ci.chromium.org/ui/p/v8/builders/ci/V8%20Win32%20-%20debug/29653/overview
Original change's description:
> [wasm-simd][ia32] Implement i64x2 signed compares
>
> The code sequence is exactly the same as x64.
>
> Bug: v8:11415
> Change-Id: I53ed2723eda29c0a250cff514372a3d45b203476
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2683495
> Reviewed-by: Bill Budge <bbudge@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#72637}
TBR=bbudge@chromium.org,zhin@chromium.org
Change-Id: Idbfc8cd0fbbff607cff76953c53d0c149b87b573
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:11415
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2688074
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72640}
The code sequence is exactly the same as x64.
Bug: v8:11415
Change-Id: I53ed2723eda29c0a250cff514372a3d45b203476
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2683495
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72637}
Implement these 6 instructions:
- f64x2.convert_low_i32x4_s
- f64x2.convert_low_i32x4_u
- i32x4.trunc_sat_f64x2_s_zero
- i32x4.trunc_sat_f64x2_u_zero
- f32x4.demote_f64x2_zero
- f64x2.promote_low_f32x4
The code sequences are exactly the same as on x64.
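For reference, a scalar sketch of what two of these instructions compute per
lane. This models the semantics only, not the emitted code sequence, and the
function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// i32x4.trunc_sat_f64x2_s_zero: truncate toward zero, saturate to the int32
// range, map NaN to 0, and zero the two upper result lanes.
inline std::array<int32_t, 4> TruncSatF64x2SZero(
    const std::array<double, 2>& in) {
  std::array<int32_t, 4> out = {0, 0, 0, 0};
  for (int i = 0; i < 2; ++i) {
    const double v = in[i];
    if (std::isnan(v)) {
      out[i] = 0;
    } else if (v <= -2147483648.0) {
      out[i] = std::numeric_limits<int32_t>::min();
    } else if (v >= 2147483647.0) {
      out[i] = std::numeric_limits<int32_t>::max();
    } else {
      out[i] = static_cast<int32_t>(v);  // in range, truncates toward zero
    }
  }
  return out;
}

// f64x2.convert_low_i32x4_u: convert the two low unsigned lanes to double.
inline std::array<double, 2> ConvertLowI32x4U(
    const std::array<uint32_t, 4>& in) {
  std::array<double, 2> out = {static_cast<double>(in[0]),
                               static_cast<double>(in[1])};
  return out;
}
```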
This required adding some more instructions. We don't have macro lists for
them yet, so they are defined individually for now. We can
factor them into lists in a future change.
Bug: v8:11265
Change-Id: I606e1226201e3c5ecdc7e3f611315437e917d77c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2668913
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72535}
The code sequence is from https://github.com/WebAssembly/simd/pull/379 and
is exactly the same as on x64, with minor tweaks for
ExternalReferenceAsOperand.
Bug: v8:11002
Change-Id: Icbfdac62b21c2734ad4886b3d48f34e29f7a8222
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2664860
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72495}
We can optimize this instruction further. The sequences below leave some
junk in the top lanes of dst, but that doesn't matter:
- when lane is 1: use movshdup, which is 4 bytes long
- when lane is 2: use movhlps, which is 3 bytes long
- otherwise use shufps (4 bytes) or pshufd (5 bytes)
All of these are shorter than insertps (6 bytes).
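A scalar simulation of the lane selection above, showing that each shuffle
puts the requested lane into lane 0 and leaves junk in the rest. The helper
names are hypothetical, lane 0 is treated as a no-op, and the shufps case
assumes dst and src are the same register.

```cpp
#include <array>
#include <cassert>

using F32x4 = std::array<float, 4>;

// movshdup dst, src: duplicates the odd lanes, giving {s1, s1, s3, s3}.
inline F32x4 Movshdup(const F32x4& s) {
  F32x4 r = {s[1], s[1], s[3], s[3]};
  return r;
}

// movhlps dst, src: high half of src goes to the low half of dst; the high
// half of dst is unchanged.
inline F32x4 Movhlps(const F32x4& d, const F32x4& s) {
  F32x4 r = {s[2], s[3], d[2], d[3]};
  return r;
}

// shufps dst, src, imm8: the low two lanes select from dst, the high two
// lanes select from src, two immediate bits per lane.
inline F32x4 Shufps(const F32x4& d, const F32x4& s, int imm) {
  F32x4 r = {d[imm & 3], d[(imm >> 2) & 3], s[(imm >> 4) & 3],
             s[(imm >> 6) & 3]};
  return r;
}

// Returns a vector whose lane 0 holds src[lane]; other lanes are junk.
inline F32x4 ExtractLaneToLow(const F32x4& src, int lane) {
  assert(lane >= 0 && lane < 4);
  switch (lane) {
    case 0:
      return src;  // already in place
    case 1:
      return Movshdup(src);
    case 2:
      return Movhlps(src, src);
    default:
      return Shufps(src, src, lane);
  }
}
```

Only lane 0 is read by the consumer of a float extract, which is why the junk
in the upper lanes is harmless.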
Change-Id: I0e524431d1832e297e8c8bb418d42382d93fa691
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2591850
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71813}
Implementation is almost identical to x64, except that in the
instruction-selector, for AVX, we allow the second operand to
be a slot, and so we use InputOperand in the codegen.
Bug: v8:11008
Change-Id: I5b5ea4b5058dc0bf5ff1c24a67f9b787c5312106
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2576887
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71705}
This is a reland of 716dae3ae0
Original change's description:
> [wasm-simd][ia32] Prototype sign select
>
> The implementation is the same as on x64.
>
> Bug: v8:10983
> Change-Id: I2654ce4a627ca5cc6c759051ab9034c528d9f25a
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2567194
> Reviewed-by: Bill Budge <bbudge@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#71606}
Bug: v8:10983
Change-Id: I05af92ec2d3531dd2e0d27353cc665967fb5c387
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2574001
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71627}
This reverts commit 716dae3ae0.
Reason for revert: broke noavx build https://ci.chromium.org/ui/p/v8/builders/ci/V8%20Linux%20-%20debug/33124/overview
Original change's description:
> [wasm-simd][ia32] Prototype sign select
>
> The implementation is the same as on x64.
>
> Bug: v8:10983
> Change-Id: I2654ce4a627ca5cc6c759051ab9034c528d9f25a
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2567194
> Reviewed-by: Bill Budge <bbudge@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#71606}
TBR=bbudge@chromium.org,zhin@chromium.org
Change-Id: I6408268945e41ef7acf5938ac989bab9824df185
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:10983
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2573996
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71607}
The implementation is the same as on x64.
Bug: v8:10983
Change-Id: I2654ce4a627ca5cc6c759051ab9034c528d9f25a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2567194
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71606}
This is a reland of a69b7ef2ff
Original change's description:
> [wasm-simd][ia32] Prototype store lane
>
> Prototype v128.store{8,16,32,64}_lane on IA32.
>
> Drive by fix for wrong disassembly of movlps.
>
> Also added more test cases for StoreLane, test for more alignment and offset.
>
> Bug: v8:10975
> Change-Id: I0e16f1b5be824b6fc818d02d0fd84ebc0dff4174
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2557068
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Bill Budge <bbudge@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#71511}
Bug: v8:10975
Change-Id: I2c9b219b9ab9d78a83d1bf32ad1271d717471c19
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2567317
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71547}
This reverts commit a69b7ef2ff.
Reason for revert: Broke msvc https://ci.chromium.org/p/v8/builders/ci/V8%20Win64%20-%20msvc/15975?
Original change's description:
> [wasm-simd][ia32] Prototype store lane
>
> Prototype v128.store{8,16,32,64}_lane on IA32.
>
> Drive by fix for wrong disassembly of movlps.
>
> Also added more test cases for StoreLane, test for more alignment and offset.
>
> Bug: v8:10975
> Change-Id: I0e16f1b5be824b6fc818d02d0fd84ebc0dff4174
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2557068
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Reviewed-by: Bill Budge <bbudge@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#71511}
TBR=bbudge@chromium.org,zhin@chromium.org
Change-Id: Ic9386ea1254c1e0d9b42e92723b1a951fafe3a8b
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:10975
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2567315
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71512}
Prototype v128.store{8,16,32,64}_lane on IA32.
Drive by fix for wrong disassembly of movlps.
Also added more test cases for StoreLane, covering more alignments and offsets.
Bug: v8:10975
Change-Id: I0e16f1b5be824b6fc818d02d0fd84ebc0dff4174
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2557068
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71511}
Prototype v128.load{8,16,32,64}_lane on IA32 (stores will come later).
This is pretty similar to the x64 version, except that there is no signal
handler for OOB access, so kProtected is not a valid access mode.
Left some TODOs for myself to merge the new instruction codes
(kIA32Pinsrb) with the replace lane Wasm instructions.
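A scalar sketch of the load-lane semantics with an explicit bounds check,
which is the kind of check needed when no signal handler catches
out-of-bounds accesses. Load32Lane and its bool result (false meaning the
access would trap) are assumptions for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Models v128.load32_lane: read 4 bytes at `offset` and replace one lane,
// leaving the other lanes untouched. Returns false if the access is out of
// bounds, in which case the vector is not modified.
inline bool Load32Lane(const std::vector<uint8_t>& memory, std::size_t offset,
                       int lane, std::array<uint32_t, 4>* vec) {
  assert(lane >= 0 && lane < 4);
  if (offset > memory.size() || memory.size() - offset < sizeof(uint32_t)) {
    return false;  // explicit bounds check instead of a fault handler
  }
  uint32_t value;
  std::memcpy(&value, memory.data() + offset, sizeof(value));
  (*vec)[lane] = value;
  return true;
}
```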
Bug: v8:10975
Change-Id: I5c9f9a45e2e7f06e8fab4a28cdfe1857ccc35880
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2557063
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71394}
This is a reland of fbfa9bf4ec
The arm64 port was missing proper codegen for CFI, so the exit sizes were off.
Original change's description:
> Reland "[deoptimizer] Change deopt entries into builtins"
>
> This is a reland of 7f58ced72e
>
> It fixes the different exit size emitted on x64/Atom CPUs due to
> performance tuning in TurboAssembler::Call. Additionally, add
> cctests to verify the fixed size exits.
>
> Original change's description:
> > [deoptimizer] Change deopt entries into builtins
> >
> > While the overall goal of this commit is to change deoptimization
> > entries into builtins, there are multiple related things happening:
> >
> > - Deoptimization entries, formerly stubs (i.e. Code objects generated
> > at runtime, guaranteed to be immovable), have been converted into
> > builtins. The major restriction is that we now need to preserve the
> > kRootRegister, which was formerly used on most architectures to pass
> > the deoptimization id. The solution differs based on platform.
> > - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> > - Removed heap/ support for immovable Code generation.
> > - Removed the DeserializerData class (no longer needed).
> > - arm64: to preserve 4-byte deopt exits, introduced a new optimization
> > in which the final jump to the deoptimization entry is generated
> > once per Code object, and deopt exits can continue to emit a
> > near-call.
> > - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
> > sizes by 4/8, 5, and 5 bytes, respectively.
> >
> > On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> > by using the same strategy as on arm64 (recalc deopt id from return
> > address). Before:
> >
> > e300a002 movw r10, <id>
> > e59fc024 ldr ip, [pc, <entry offset>]
> > e12fff3c blx ip
> >
> > After:
> >
> > e59acb35 ldr ip, [r10, <entry offset>]
> > e12fff3c blx ip
> >
> > On arm64 the deopt exit size remains 4 bytes (or 8 bytes in some cases
> > with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> > object (max 32 bytes added overhead per Code object). Before:
> >
> > 9401cdae bl <entry offset>
> >
> > After:
> >
> > # eager deoptimization entry jump.
> > f95b1f50 ldr x16, [x26, <eager entry offset>]
> > d61f0200 br x16
> > # lazy deoptimization entry jump.
> > f95b2b50 ldr x16, [x26, <lazy entry offset>]
> > d61f0200 br x16
> > # the deopt exit.
> > 97fffffc bl <eager deoptimization entry jump offset>
> >
> > On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
> >
> > bb00000000 mov ebx,<id>
> > e825f5372b call <entry>
> >
> > After:
> >
> > e8ea2256ba call <entry>
> >
> > On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
> >
> > 49c7c511000000 REX.W movq r13,<id>
> > e8ea2f0700 call <entry>
> >
> > After:
> >
> > 41ff9560360000 call [r13+<entry offset>]
> >
> > Bug: v8:8661,v8:8768
> > Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> > Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> > Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> > Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#70597}
>
> Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
> Bug: v8:8661,v8:8768,chromium:1140165
> Change-Id: Ibcd5c39c58a70bf2b2ac221aa375fc68d495e144
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485506
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70655}
Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
Bug: v8:8661
Bug: v8:8768
Bug: chromium:1140165
Change-Id: I471cc94fc085e527dc9bfb5a84b96bd907c2333f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2488682
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70672}
This is a reland of 7f58ced72e
It fixes the different exit size emitted on x64/Atom CPUs due to
performance tuning in TurboAssembler::Call. Additionally, add
cctests to verify the fixed size exits.
Original change's description:
> [deoptimizer] Change deopt entries into builtins
>
> While the overall goal of this commit is to change deoptimization
> entries into builtins, there are multiple related things happening:
>
> - Deoptimization entries, formerly stubs (i.e. Code objects generated
> at runtime, guaranteed to be immovable), have been converted into
> builtins. The major restriction is that we now need to preserve the
> kRootRegister, which was formerly used on most architectures to pass
> the deoptimization id. The solution differs based on platform.
> - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> - Removed heap/ support for immovable Code generation.
> - Removed the DeserializerData class (no longer needed).
> - arm64: to preserve 4-byte deopt exits, introduced a new optimization
> in which the final jump to the deoptimization entry is generated
> once per Code object, and deopt exits can continue to emit a
> near-call.
> - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
> sizes by 4/8, 5, and 5 bytes, respectively.
>
> On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> by using the same strategy as on arm64 (recalc deopt id from return
> address). Before:
>
> e300a002 movw r10, <id>
> e59fc024 ldr ip, [pc, <entry offset>]
> e12fff3c blx ip
>
> After:
>
> e59acb35 ldr ip, [r10, <entry offset>]
> e12fff3c blx ip
>
> On arm64 the deopt exit size remains 4 bytes (or 8 bytes in some cases
> with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> object (max 32 bytes added overhead per Code object). Before:
>
> 9401cdae bl <entry offset>
>
> After:
>
> # eager deoptimization entry jump.
> f95b1f50 ldr x16, [x26, <eager entry offset>]
> d61f0200 br x16
> # lazy deoptimization entry jump.
> f95b2b50 ldr x16, [x26, <lazy entry offset>]
> d61f0200 br x16
> # the deopt exit.
> 97fffffc bl <eager deoptimization entry jump offset>
>
> On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
>
> bb00000000 mov ebx,<id>
> e825f5372b call <entry>
>
> After:
>
> e8ea2256ba call <entry>
>
> On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
>
> 49c7c511000000 REX.W movq r13,<id>
> e8ea2f0700 call <entry>
>
> After:
>
> 41ff9560360000 call [r13+<entry offset>]
>
> Bug: v8:8661,v8:8768
> Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70597}
Tbr: ulan@chromium.org, tebbi@chromium.org, rmcilroy@chromium.org
Bug: v8:8661,v8:8768,chromium:1140165
Change-Id: Ibcd5c39c58a70bf2b2ac221aa375fc68d495e144
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485506
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70655}
This reverts commit 7f58ced72e.
Reason for revert: Segfaults on Atom_x64 https://ci.chromium.org/p/v8-internal/builders/ci/v8_linux64_atom_perf/5686?
Original change's description:
> [deoptimizer] Change deopt entries into builtins
>
> While the overall goal of this commit is to change deoptimization
> entries into builtins, there are multiple related things happening:
>
> - Deoptimization entries, formerly stubs (i.e. Code objects generated
> at runtime, guaranteed to be immovable), have been converted into
> builtins. The major restriction is that we now need to preserve the
> kRootRegister, which was formerly used on most architectures to pass
> the deoptimization id. The solution differs based on platform.
> - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
> - Removed heap/ support for immovable Code generation.
> - Removed the DeserializerData class (no longer needed).
> - arm64: to preserve 4-byte deopt exits, introduced a new optimization
> in which the final jump to the deoptimization entry is generated
> once per Code object, and deopt exits can continue to emit a
> near-call.
> - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
> sizes by 4/8, 5, and 5 bytes, respectively.
>
> On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
> by using the same strategy as on arm64 (recalc deopt id from return
> address). Before:
>
> e300a002 movw r10, <id>
> e59fc024 ldr ip, [pc, <entry offset>]
> e12fff3c blx ip
>
> After:
>
> e59acb35 ldr ip, [r10, <entry offset>]
> e12fff3c blx ip
>
> On arm64 the deopt exit size remains 4 bytes (or 8 bytes in some cases
> with CFI). Additionally, up to 4 builtin jumps are emitted per Code
> object (max 32 bytes added overhead per Code object). Before:
>
> 9401cdae bl <entry offset>
>
> After:
>
> # eager deoptimization entry jump.
> f95b1f50 ldr x16, [x26, <eager entry offset>]
> d61f0200 br x16
> # lazy deoptimization entry jump.
> f95b2b50 ldr x16, [x26, <lazy entry offset>]
> d61f0200 br x16
> # the deopt exit.
> 97fffffc bl <eager deoptimization entry jump offset>
>
> On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
>
> bb00000000 mov ebx,<id>
> e825f5372b call <entry>
>
> After:
>
> e8ea2256ba call <entry>
>
> On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
>
> 49c7c511000000 REX.W movq r13,<id>
> e8ea2f0700 call <entry>
>
> After:
>
> 41ff9560360000 call [r13+<entry offset>]
>
> Bug: v8:8661,v8:8768
> Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#70597}
TBR=ulan@chromium.org,rmcilroy@chromium.org,jgruber@chromium.org,tebbi@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: v8:8661,v8:8768,chromium:1140165
Change-Id: I3df02ab42f6e02233d9f6fb80e8bb18f76870d91
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2485504
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70649}
While the overall goal of this commit is to change deoptimization
entries into builtins, there are multiple related things happening:
- Deoptimization entries, formerly stubs (i.e. Code objects generated
at runtime, guaranteed to be immovable), have been converted into
builtins. The major restriction is that we now need to preserve the
kRootRegister, which was formerly used on most architectures to pass
the deoptimization id. The solution differs based on platform.
- Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
- Removed heap/ support for immovable Code generation.
- Removed the DeserializerData class (no longer needed).
- arm64: to preserve 4-byte deopt exits, introduced a new optimization
in which the final jump to the deoptimization entry is generated
once per Code object, and deopt exits can continue to emit a
near-call.
- arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
sizes by 4/8, 5, and 5 bytes, respectively.
On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
by using the same strategy as on arm64 (recalc deopt id from return
address). Before:
e300a002 movw r10, <id>
e59fc024 ldr ip, [pc, <entry offset>]
e12fff3c blx ip
After:
e59acb35 ldr ip, [r10, <entry offset>]
e12fff3c blx ip
On arm64 the deopt exit size remains 4 bytes (or 8 bytes in some cases
with CFI). Additionally, up to 4 builtin jumps are emitted per Code
object (max 32 bytes added overhead per Code object). Before:
9401cdae bl <entry offset>
After:
# eager deoptimization entry jump.
f95b1f50 ldr x16, [x26, <eager entry offset>]
d61f0200 br x16
# lazy deoptimization entry jump.
f95b2b50 ldr x16, [x26, <lazy entry offset>]
d61f0200 br x16
# the deopt exit.
97fffffc bl <eager deoptimization entry jump offset>
On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
bb00000000 mov ebx,<id>
e825f5372b call <entry>
After:
e8ea2256ba call <entry>
On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
49c7c511000000 REX.W movq r13,<id>
e8ea2f0700 call <entry>
After:
41ff9560360000 call [r13+<entry offset>]
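A minimal sketch of why fixed-size exits make the explicit id load
unnecessary: the id can be recomputed from the return address. It assumes the
exits of a Code object are laid out back to back starting at exits_start;
DeoptExitIndex and both constants are hypothetical names, with the sizes taken
from the numbers above.

```cpp
#include <cassert>
#include <cstdint>

// Exit sizes from the description: ia32 `call <entry>` is 5 bytes, x64
// `call [r13+<entry offset>]` is 7 bytes.
constexpr uint32_t kIa32DeoptExitSize = 5;
constexpr uint32_t kX64DeoptExitSize = 7;

// The return address points just past the call that forms the exit, so
// subtracting one exit size yields the start of that exit.
inline uint32_t DeoptExitIndex(uintptr_t return_address, uintptr_t exits_start,
                               uint32_t exit_size) {
  assert(exit_size > 0);
  assert(return_address >= exits_start + exit_size);
  const uintptr_t offset = return_address - exit_size - exits_start;
  assert(offset % exit_size == 0);  // only holds if all exits share one size
  return static_cast<uint32_t>(offset / exit_size);
}
```

The division is only valid if every exit has the same size, which is why
variable-size exits were a problem.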
Bug: v8:8661,v8:8768
Change-Id: I13e30aedc360474dc818fecc528ce87c3bfeed42
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2465834
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70597}
CodeKind::OPTIMIZED_CODE -> TURBOFAN
Kinds are now more fine-grained and distinguish between TF, TP, NCI.
CodeKind::STUB -> DEOPT_ENTRIES_OR_FOR_TESTING
Code stubs (like builtins, but generated at runtime) were removed from
the codebase years ago; this is the last remnant. This kind is used
only for deopt entries (which should be converted into builtins) and
for tests.
Change-Id: I67beb15377cb60f395e9b051b25f3e5764982e93
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2440335
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70234}
dst might not be the same as src0 (since we don't define them to be
equal in the instruction-selector if AVX is enabled), so the minps
and maxps comparisons were incorrect.
I found this while trying to run some spec tests, so I am not adding a
unittest. Once the spec tests are enabled, this will be covered.
Bug: v8:10835
Change-Id: I4fbc1dfe949e4137e057e73c0d5dfb8534a00b8f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2411484
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69953}
For SIMD instructions that use aligned moves (like movaps or movapd), we
don't have correct memory alignment for SIMD moves yet. Switch to
movupd.
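A small model of the constraint: the aligned forms require a 16-byte aligned
memory operand, while the unaligned forms accept any address. SimdMove,
IsValidSimdAccess and ChooseMove are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

enum class SimdMove { kAligned, kUnaligned };

// movaps/movapd fault on a memory operand that is not 16-byte aligned;
// movups/movupd work for any address.
inline bool IsValidSimdAccess(uintptr_t address, SimdMove kind) {
  return kind == SimdMove::kUnaligned || (address % 16 == 0);
}

// Pick the aligned form only when alignment is guaranteed by the caller.
inline SimdMove ChooseMove(bool alignment_guaranteed) {
  return alignment_guaranteed ? SimdMove::kAligned : SimdMove::kUnaligned;
}
```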
Bug: v8:9198
Bug: v8:10831
Change-Id: Ic60fba5d08dda9676f6091ce505ac7be54957d00
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2380240
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69613}
With the new Turbofan variants (NCI and Turboprop), we need a way to
distinguish between them both during and after compilation. We
initially introduced CompilationTarget to track the variant during
compilation, but decided to reuse the code kind as the canonical spot to
store this information instead.
Why? Because it is an established mechanism, already available in most
of the necessary spots (inside the pipeline, on Code objects, in
profiling traces).
This CL removes CompilationTarget and adds a new
NATIVE_CONTEXT_INDEPENDENT kind, plus helper functions to determine
various things about a given code kind (e.g.: does this code kind
deopt?).
As a (very large) drive-by, refactor both Code::Kind and
AbstractCode::Kind into a new CodeKind enum class.
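A sketch of the shape this takes: one enum class plus free helper predicates.
The enumerators below are an illustrative subset, and which kinds
CodeKindCanDeoptimize returns true for is an assumption, not a statement about
the real helper.

```cpp
#include <cassert>

// Illustrative subset of kinds; the real enum has more entries.
enum class CodeKind {
  OPTIMIZED_CODE,
  NATIVE_CONTEXT_INDEPENDENT,
  STUB,
  BUILTIN,
};

// Helper predicate of the "does this code kind deopt?" variety.
inline bool CodeKindCanDeoptimize(CodeKind kind) {
  switch (kind) {
    case CodeKind::OPTIMIZED_CODE:
    case CodeKind::NATIVE_CONTEXT_INDEPENDENT:
      return true;
    case CodeKind::STUB:
    case CodeKind::BUILTIN:
      return false;
  }
  assert(false && "unhandled CodeKind");
  return false;
}
```

An enum class keeps the kinds out of the enclosing namespace and prevents
implicit conversion to int, so callers go through the helpers.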
Bug: v8:8888
Change-Id: Ie858b9a53311b0731630be35cf5cd108dee95b39
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2336793
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69244}
This CL:
* Adds the xadd instruction to the ia32 assembler and disassembler.
* Implements the AtomicAdd instructions, except AtomicAddU64, on ia32.
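For context, `lock xadd [mem], reg` adds reg to the memory operand and leaves
the old memory value in reg, which matches fetch-and-add. A portable sketch
using std::atomic (AtomicAddU32 is a hypothetical name):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Adds `value` to the cell and returns the value the cell held before the
// addition. Unsigned arithmetic wraps around on overflow.
inline uint32_t AtomicAddU32(std::atomic<uint32_t>* cell, uint32_t value) {
  return cell->fetch_add(value, std::memory_order_seq_cst);
}
```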
R=clemensb@chromium.org
Bug: v8:10108
Change-Id: Ic8653a9f96148282951104fefb4185c4c0db89a3
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2232719
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68247}
Implement i8x16.bitmask, i16x8.bitmask, i32x4.bitmask on ia32.
Drive by additions of disasm and disasm tests to some instructions.
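A scalar sketch of the bitmask semantics: bit i of the result is the sign bit
of lane i. BitMask is a hypothetical helper that models the result only, not
the instructions emitted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Collects the sign bit of every lane into the low N bits of the result.
template <typename T, std::size_t N>
uint32_t BitMask(const std::array<T, N>& lanes) {
  static_assert(N <= 32, "mask must fit in 32 bits");
  uint32_t mask = 0;
  for (std::size_t i = 0; i < N; ++i) {
    if (lanes[i] < 0) mask |= (uint32_t{1} << i);
  }
  return mask;
}
```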
Bug: v8:10308
Change-Id: I3725ed6959ae55f96ee7950130776a4f08e177c9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2127314
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66989}
This CL implements load_extend with 2 lanes and all load_splat
operations on IA32. The necessary assemblers together with their
corresponding disassemblers and tests are also added in this CL.
The newly added opcodes include: S8x16LoadSplat, S16x8LoadSplat,
S32x4LoadSplat, S64x2LoadSplat, I64x2Load32x2S, I64x2Load32x2U.
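A scalar sketch of what three of these opcodes compute once the value has been
read from memory. The function names mirror the opcode names but are
hypothetical helpers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// S32x4LoadSplat: replicate one loaded 32-bit value into all four lanes.
inline std::array<uint32_t, 4> S32x4LoadSplat(uint32_t loaded) {
  std::array<uint32_t, 4> r = {loaded, loaded, loaded, loaded};
  return r;
}

// I64x2Load32x2S: sign-extend two loaded 32-bit values to 64-bit lanes.
inline std::array<int64_t, 2> I64x2Load32x2S(int32_t lo, int32_t hi) {
  std::array<int64_t, 2> r = {static_cast<int64_t>(lo),
                              static_cast<int64_t>(hi)};
  return r;
}

// I64x2Load32x2U: zero-extend two loaded 32-bit values to 64-bit lanes.
inline std::array<uint64_t, 2> I64x2Load32x2U(uint32_t lo, uint32_t hi) {
  std::array<uint64_t, 2> r = {static_cast<uint64_t>(lo),
                               static_cast<uint64_t>(hi)};
  return r;
}
```

The signed and unsigned variants differ only for inputs with the top bit set:
0xFFFFFFFF becomes -1 when sign-extended and 4294967295 when zero-extended.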
Bug: v8:9886
Change-Id: I0a5dae0a683985c14c433ba9d85acbd1cee6705f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1982989
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhiguo Zhou <zhiguo.zhou@intel.com>
Cr-Commit-Position: refs/heads/master@{#65937}
This is a reland of 855591a54d
Fixes break in builds that verify ReadOnlyHeap by relaxing the requirement for
Code objects to be in CODE_SPACE in PagedSpaceObjectIterator::FromCurrentPage.
Original change's description:
> Reland: [builtins] Move non-JS linkage builtins code objects into RO_SPACE
>
> Reland of https://chromium-review.googlesource.com/c/v8/v8/+/1795358.
>
> [builtins] Move non-JS linkage builtins code objects into RO_SPACE
>
> Creates an allow-list of builtins that can still go in code_space
> including all TFJ builtins and a small manual list that should be pared
> down in the future.
>
> For builtins that go in RO_SPACE a Code object is created that contains an
> immediate trap instruction. Generally these Code objects are still no
> smaller than CODE_SPACE Code objects because of the Code object alignment
> requirements. This will hopefully be addressed in a follow-up CL either by
> relaxing them or removing the instruction stream completely.
>
> In the snapshot, this reduces code_space from ~152k to ~40k (-112k) and
> increases RO_SPACE by the same amount.
>
> Change-Id: I76661c35c7ea5866c1fb16e87e87122b3e3ca0ce
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1893336
> Commit-Queue: Dan Elphick <delphick@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#64700}
Change-Id: I4eeb7dab3027b42fa58c5dfb2bad9873e9fff250
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1893192
Commit-Queue: Dan Elphick <delphick@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64728}
This reverts commit 855591a54d.
Reason for revert: Breaks arm64 sim tests
https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20debug/17957
https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20gc%20stress/16585
Original change's description:
> Reland: [builtins] Move non-JS linkage builtins code objects into RO_SPACE
>
> Reland of https://chromium-review.googlesource.com/c/v8/v8/+/1795358.
>
> [builtins] Move non-JS linkage builtins code objects into RO_SPACE
>
> Creates an allow-list of builtins that can still go in code_space
> including all TFJ builtins and a small manual list that should be pared
> down in the future.
>
> For builtins that go in RO_SPACE a Code object is created that contains an
> immediate trap instruction. Generally these Code objects are still no
> smaller than CODE_SPACE Code objects because of the Code object alignment
> requirements. This will hopefully be addressed in a follow-up CL either by
> relaxing them or removing the instruction stream completely.
>
> In the snapshot, this reduces code_space from ~152k to ~40k (-112k) and
> increases read_only_space by the same amount.
>
> Change-Id: I76661c35c7ea5866c1fb16e87e87122b3e3ca0ce
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1893336
> Commit-Queue: Dan Elphick <delphick@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#64700}
TBR=ulan@chromium.org,jgruber@chromium.org,delphick@chromium.org
Change-Id: I4211c3bb7fe4741e0ba3898f92ce382dfc93c4f3
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1893636
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Bill Budge <bbudge@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64701}
Reland of https://chromium-review.googlesource.com/c/v8/v8/+/1795358.
[builtins] Move non-JS linkage builtins code objects into RO_SPACE
Creates an allow-list of builtins that can still go in code_space
including all TFJ builtins and a small manual list that should be pared
down in the future.
For builtins that go in RO_SPACE a Code object is created that contains an
immediate trap instruction. Generally these Code objects are still no
smaller than CODE_SPACE Code objects because of the Code object alignment
requirements. This will hopefully be addressed in a follow-up CL either by
relaxing them or removing the instruction stream completely.
In the snapshot, this reduces code_space from ~152k to ~40k (-112k) and
increases read_only_space by the same amount.
Change-Id: I76661c35c7ea5866c1fb16e87e87122b3e3ca0ce
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1893336
Commit-Queue: Dan Elphick <delphick@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64700}
This class used to describe unoptimized but compiled frames. All such
frames are by now covered via the architecture-independent description
in the {StandardFrameConstants} class (or one of its subclasses).
R=clemensb@chromium.org
BUG=v8:9810
Change-Id: I294cc6eec7d4a05e88e7aa336f1ebedfa0eb6e98
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1878708
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Commit-Queue: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64556}
This is a reland of 08b26f53c6
Fixed the original crash by removing the disasm cases for psllq and
psrlq, which are now handled by the macro list.
Original change's description:
> Clean up macros
>
> Move some instruction definitions into sse-instr, which is used to
> generate some disasm tests, so we can remove some cases there.
>
> Bug: v8:9810
> Change-Id: I0615ec823396da08bc5d234cf1dabca6afd3f052
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1866965
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#64441}
Bug: v8:9810
Change-Id: I69335a889f5f72b76a79e4e9860835232e6e38a8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1872298
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64487}
This reverts commit 08b26f53c6.
Reason for revert: Breaks tree https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20noi18n%20-%20debug/29046
Original change's description:
> Clean up macros
>
> Move some instruction definitions into sse-instr, which is used to
> generate some disasm tests, so we can remove some cases there.
>
> Bug: v8:9810
> Change-Id: I0615ec823396da08bc5d234cf1dabca6afd3f052
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1866965
> Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
> Commit-Queue: Zhi An Ng <zhin@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#64441}
TBR=gdeepti@chromium.org,zhin@chromium.org
Change-Id: I067c1fdbaa6eb2a08c0fcb7c8885d72f073a8818
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9810
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1873195
Reviewed-by: Zhi An Ng <zhin@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64443}
Move some instruction definitions into sse-instr, which is used to
generate some disasm tests, so we can remove some cases there.
Bug: v8:9810
Change-Id: I0615ec823396da08bc5d234cf1dabca6afd3f052
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1866965
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64441}
This reverts commit 83f8464ffc.
Reason for revert: speculative revert for blink linux failure
https://ci.chromium.org/p/v8/builders/ci/V8%20Blink%20Linux/1272
Original change's description:
> [builtins] Move non-JS linkage builtins code objects into RO_SPACE
>
> Creates an allow-list of builtins that can still go in code_space
> including all TFJ builtins and a small manual list that should be pared
> down in the future.
>
> For builtins that go in RO_SPACE a Code object is created that contains
> no code at all (shrinking its size from 96 bytes to 64 bytes on x64),
> but is there to allow the runtime to continue to work since it expects
> a Code object.
>
> This reduces code_space from ~152k to ~40k (-112k) and increases
> read_only_space from 33k to 108k (+75k) in the snapshot.
>
> Bug: v8:7464, v8:9821, v8:9338, v8:8127
> Change-Id: Icc8bfc722bb267a2bcc17e2f1e27bef7f02f2376
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1795358
> Commit-Queue: Dan Elphick <delphick@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#64377}
TBR=mstarzinger@chromium.org,jgruber@chromium.org,delphick@chromium.org
Change-Id: I4cf38e9370280acdd2de718ca527776ebc509003
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:7464, v8:9821, v8:9338, v8:8127
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1868621
Reviewed-by: Sathya Gunasekaran <gsathya@chromium.org>
Commit-Queue: Sathya Gunasekaran <gsathya@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64383}
Creates an allow-list of builtins that can still go in code_space
including all TFJ builtins and a small manual list that should be pared
down in the future.
For builtins that go in RO_SPACE a Code object is created that contains
no code at all (shrinking its size from 96 bytes to 64 bytes on x64),
but is there to allow the runtime to continue to work since it expects
a Code object.
This reduces code_space from ~152k to ~40k (-112k) and increases
read_only_space from 33k to 108k (+75k) in the snapshot.
Bug: v8:7464, v8:9821, v8:9338, v8:8127
Change-Id: Icc8bfc722bb267a2bcc17e2f1e27bef7f02f2376
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1795358
Commit-Queue: Dan Elphick <delphick@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64377}
We reuse PACKED_OP_LIST to generate *pd instructions. Introduce a new pd
base method, similar to ps and vps.
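To illustrate the idea of reusing one op list for both the *ps and *pd
families, here is a minimal sketch. It is not V8's actual code: the
ToyAssembler class, the modrm-only operand, and the shortened list are
assumptions made for illustration. Only the opcode bytes and the 0x66
prefix for the packed-double forms come from the x86 encoding itself.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical, shortened op list: V(name, opcode).
#define PACKED_OP_LIST(V) \
  V(add, 0x58)            \
  V(mul, 0x59)            \
  V(sub, 0x5C)            \
  V(div, 0x5E)

// Toy assembler (an assumption, not V8's Assembler). Operands are reduced
// to a single pre-computed ModR/M byte to keep the sketch short.
class ToyAssembler {
 public:
  // Base emitter for packed-single ops: 0F <opcode> <modrm>.
  void ps(uint8_t opcode, uint8_t modrm) { Emit({0x0F, opcode, modrm}); }
  // Base emitter for packed-double ops: same opcode behind a 0x66 prefix.
  void pd(uint8_t opcode, uint8_t modrm) {
    Emit({0x66, 0x0F, opcode, modrm});
  }

  // One list entry expands to both the *ps and the *pd function.
#define DECLARE_PACKED_OPS(name, opcode)              \
  void name##ps(uint8_t modrm) { ps(opcode, modrm); } \
  void name##pd(uint8_t modrm) { pd(opcode, modrm); }
  PACKED_OP_LIST(DECLARE_PACKED_OPS)
#undef DECLARE_PACKED_OPS

  const std::vector<uint8_t>& bytes() const { return bytes_; }

 private:
  void Emit(std::initializer_list<uint8_t> bs) {
    bytes_.insert(bytes_.end(), bs);
  }
  std::vector<uint8_t> bytes_;
};

// Small usage check: both families are generated from the same list.
inline void PackedOpsDemo() {
  ToyAssembler a;
  a.addps(0xC1);  // ModR/M 0xC1: xmm0, xmm1
  a.addpd(0xC1);
  assert(a.bytes().size() == 7);
}
```

Adding an entry to the list yields both functions at once, which is the
appeal of the approach.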
Bug: v8:9396
Change-Id: Id9d81c22c9110935484fd929ef7bf5cc20e9ae7e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1834767
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64117}
This adds decoding and compilation of the "atomic.fence" operator, which
is intended to preserve the synchronization guarantees of higher-level
languages.
Unlike other atomic operators, it does not target a particular linear
memory. It may occur in modules which declare no memory, or a non-shared
memory, without causing a validation error.
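The validation rule above can be sketched as follows. This is a toy model
and not V8's decoder: the Op enum, the Module struct, and Validate are
hypothetical names, and the sketch assumes (as the paragraph implies) that
the other atomic operators require a shared memory.

```cpp
#include <cassert>

// Hypothetical operator set for the sketch.
enum class Op { kI32AtomicLoad, kI32AtomicStore, kAtomicFence };

// Hypothetical module description: whether a memory is declared at all,
// and whether that memory is shared.
struct Module {
  bool has_memory;
  bool is_shared;
};

// Returns true if `op` validates in `module`.
bool Validate(const Module& module, Op op) {
  // The fence targets no particular linear memory, so it always validates.
  if (op == Op::kAtomicFence) return true;
  // Assumption: every other atomic operator needs a shared memory.
  return module.has_memory && module.is_shared;
}

// Small usage check covering the cases named in the text.
inline void FenceValidationDemo() {
  assert(Validate(Module{false, false}, Op::kAtomicFence));  // no memory
  assert(Validate(Module{true, false}, Op::kAtomicFence));   // non-shared
  assert(!Validate(Module{false, false}, Op::kI32AtomicLoad));
}
```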
See proposal: https://github.com/WebAssembly/threads/pull/141
See discussion: https://github.com/WebAssembly/threads/issues/140
R=clemensh@chromium.org
TEST=cctest/test-run-wasm-atomics/RunWasmXXX_AtomicFence
BUG=v8:9452
Change-Id: Ibf7e46227f7edfe5c81c097cfc15924c59614067
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1701856
Commit-Queue: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62821}