This patch is a follow up to https://codereview.chromium.org/1972103002/
adding support for the `Operand_R_LSL_I` addressing mode to loads and
stores for ARM.
Just as the ARM64 implementation, the shift + load/store pattern is only
really relevant to the interpreter. For this reason, this patch does not
add support for the other addressing modes (`R_LSR_I`, `R_ASR_I` and
`R_ROR_I`) as I haven't seen those pattern being generated. Additionally,
the optimization is restricted 32 bit loads and stores.
kind = BYTECODE_HANDLER
name = Star
compiler = turbofan
Instructions (size = 40)
0x22a5f860 0 e2851001 add r1, r5, #1
0x22a5f864 4 e19610d1 ldrsb r1, [r6, +r1]
0x22a5f868 8 e1a0200b mov r2, fp
0x22a5f86c 12 e7820101 str r0, [r2, +r1, lsl #2]
^^^^^^^^^^^^^^^^^^^^^^^^^
0x22a5f870 16 e2855002 add r5, r5, #2
0x22a5f874 20 e7d61005 ldrb r1, [r6, +r5]
0x22a5f878 24 e7981101 ldr r1, [r8, +r1, lsl #2]
^^^^^^^^^^^^^^^^^^^^^^^^^
0x22a5f87c 28 e12fff11 bx r1
BUG=
Review-Url: https://codereview.chromium.org/1974263002
Cr-Commit-Position: refs/heads/master@{#36381}
The MLS instruction is available in all ARMv7 devices, and in no ARMv6
devices, aside from the usual ARMv6T2 caveat. We don't need a separate
feature flag for it.
BUG=
Review-Url: https://codereview.chromium.org/1988133004
Cr-Commit-Position: refs/heads/master@{#36378}
Remove dead code to optimize Int64Constants as branch/select conditions,
because we either have tagged booleans or bits represented as word32.
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/1994533002
Cr-Commit-Position: refs/heads/master@{#36308}
The type guard should never be used after the effect/control
linearization pass, so making it a simplified operator better
expresses the intended use. Also this way none of the common
operators actually has any dependency on the type system.
Drive-by-fix: Properly print the type parameter to a TypeGuard operator.
BUG=chromium:612142
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/1994503002
Cr-Commit-Position: refs/heads/master@{#36304}
This adds back the instanceof operator support in the backends and
introduces a @@hasInstance protector cell on the isolate that guards the
fast path for the InstanceOfStub. This way we recover the ~10%
regression on Octane EarleyBoyer in Crankshaft and greatly improve
TurboFan and Ignition performance of instanceof.
R=ishell@chromium.orgTBR=hpayer@chromium.org,rossberg@chromium.org
BUG=chromium:597249, v8:4447
LOG=n
Review-Url: https://codereview.chromium.org/1980483003
Cr-Commit-Position: refs/heads/master@{#36275}
This patch adds support for the `Operand2_R_LSL_I` addressing mode to
loads and stores. This allows merging a shift instruction into a
MemoryOperand. Since the shift immediate is restricted to the log2 of
the operation width, the opportunities to hit this are slim. However,
Ignition's bytecode handlers hit this case all the time:
kind = BYTECODE_HANDLER
name = Star
compiler = turbofan
Instructions (size = 44)
0x23e67280 0 add x1, x19, #0x1 (1)
0x23e67284 4 ldrsb x1, [x20, x1]
0x23e67288 8 sxtw x1, w1
0x23e6728c 12 mov x2, fp
0x23e67290 16 str x0, [x2, x1, lsl #3]
^^^^^^^^^^^^^^^^^^^^^
0x23e67294 20 add x19, x19, #0x2 (2)
0x23e67298 24 ldrb w1, [x20, x19]
0x23e6729c 28 ldr x1, [x21, x1, lsl #3]
^^^^^^^^^^^^^^^^^^^^^
0x23e672a0 32 br x1
Additionally, I noticed the optimisation occurs once in both the
`StringPrototypeCharAt` and `StringPrototypeCharCodeAt` turbofan stubs.
BUG=
Review-Url: https://codereview.chromium.org/1972103002
Cr-Commit-Position: refs/heads/master@{#36227}
We eagerly inserted Int32Mul for Math.imul during builtin lowering and
messed up with the types, which confused the representation selection.
This adds a proper NumberImul operator, and fixes the builtin reducer to
do the right thing according to the spec.
R=mstarzinger@chromium.org
BUG=v8:5006
LOG=n
Review-Url: https://codereview.chromium.org/1971163002
Cr-Commit-Position: refs/heads/master@{#36219}
Up until now we had two places where we did the function prototype
folding, once in the Typer and once in JSTypedLowering. Put this logic
into JSNativeContextSpecialization instead.
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/1965293002
Cr-Commit-Position: refs/heads/master@{#36157}
Make JSCreateArguments eliminatable, and remove the need for frame
states on JSCreateArguments nodes being lowered to (optimized) stub
calls. Only the runtime fallback needs a frame state, because in that
case we need to ask the deoptimizer for arguments to inlined functions.
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/1965013005
Cr-Commit-Position: refs/heads/master@{#36154}
This adds a new pass MemoryOptimizer that walks over the effect chain
from Start and lowers all Allocate, LoadField, StoreField, LoadElement,
and StoreElement nodes, trying to fold allocations into allocation
groups and eliminate write barriers on StoreField and StoreElement if
possible (i.e. if the object belongs to the current allocation group and
that group allocates in new space).
R=hpayer@chromium.org, jarin@chromium.org
BUG=v8:4931, chromium:580959
LOG=n
Review-Url: https://codereview.chromium.org/1963583004
Cr-Commit-Position: refs/heads/master@{#36128}
This operator was initially designed to handle arbitrary effect merging
for effect relaxation, but we don't do that (at least currently). So no
need to keep the dead operator around.
R=jarin@chromium.org
Review-Url: https://codereview.chromium.org/1954983002
Cr-Commit-Position: refs/heads/master@{#36063}
A load instruction will implicitely clear the top 32 bits when writing to a W
register. This patch avoids generating a `mov` instruction to zero-extend the
result in this case.
For example, this occurs in the generated code for dispatching to the next
bytecode in the interpreter:
kind = BYTECODE_HANDLER
name = LdaZero
compiler = turbofan
Instructions (size = 36)
0x32e64c60 0 add x19, x19, #0x1 (1)
0x32e64c64 4 ldrb w0, [x20, x19]
0x32e64c68 8 mov w0, w0
^^^^^^^^^^
0x32e64c6c 12 lsl x0, x0, #3
0x32e64c70 16 ldr x1, [x21, x0]
0x32e64c74 20 movz x0, #0x0
0x32e64c78 24 br x1
BUG=
Review-Url: https://codereview.chromium.org/1950013003
Cr-Commit-Position: refs/heads/master@{#36038}
Now that everything is properly wired to the effect chain when we get to
ChangeLowering, we can safely inline the allocation fast path and only
need to consule the slow path stub fallback when bump pointer allocation
fails.
R=jarin@chromium.org
BUG=v8:4931
LOG=n
Review-Url: https://codereview.chromium.org/1951853002
Cr-Commit-Position: refs/heads/master@{#36022}
When storing an immediate integer or floating point zero, use the zero register
as the source value. This avoids the need to sometimes allocate a new register.
BUG=
Review-Url: https://codereview.chromium.org/1945783002
Cr-Commit-Position: refs/heads/master@{#36013}
Now ChangeLowering is only concerned with lowering memory access and
allocation operations, and all changes are consistently lowered during
the effect/control linearization pass. The next step is to move the
left over lowerings to a pass dedicated to eliminate redundant loads and
stores, eliminate write barriers, fold and inline allocations.
Drive-by-fix: Rename ChangeBitToBool to ChangeBitToTagged,
ChangeBoolToBit to ChangeTaggedToBit, and ChangeInt31ToTagged to
ChangeInt31ToTaggedSigned for consistency.
CQ_INCLUDE_TRYBOTS=tryserver.v8:v8_linux64_tsan_rel
Committed: https://crrev.com/ceca5ae308bddda166651c654f96d71d74f617d0
Cr-Commit-Position: refs/heads/master@{#35924}
Review-Url: https://codereview.chromium.org/1941673002
Cr-Commit-Position: refs/heads/master@{#35929}
Reason for revert:
[Sheriff] Breaks mac gc stress:
https://build.chromium.org/p/client.v8/builders/V8%20Mac%20GC%20Stress/builds/5821
Original issue's description:
> [turbofan] Remove left-over change bits from ChangeLowering.
>
> Now ChangeLowering is only concerned with lowering memory access and
> allocation operations, and all changes are consistently lowered during
> the effect/control linearization pass. The next step is to move the
> left over lowerings to a pass dedicated to eliminate redundant loads and
> stores, eliminate write barriers, fold and inline allocations.
>
> Also remove the atomic regions now that we wire everything into the
> effect chain properly. This is an important step towards allocation
> inlining.
>
> Drive-by-fix: Rename ChangeBitToBool to ChangeBitToTagged,
> ChangeBoolToBit to ChangeTaggedToBit, and ChangeInt31ToTagged to
> ChangeInt31ToTaggedSigned for consistency.
>
> CQ_INCLUDE_TRYBOTS=tryserver.v8:v8_linux64_tsan_rel
>
> Committed: https://crrev.com/ceca5ae308bddda166651c654f96d71d74f617d0
> Cr-Commit-Position: refs/heads/master@{#35924}
TBR=ishell@chromium.org,bmeurer@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/1942733002
Cr-Commit-Position: refs/heads/master@{#35927}
Now ChangeLowering is only concerned with lowering memory access and
allocation operations, and all changes are consistently lowered during
the effect/control linearization pass. The next step is to move the
left over lowerings to a pass dedicated to eliminate redundant loads and
stores, eliminate write barriers, fold and inline allocations.
Also remove the atomic regions now that we wire everything into the
effect chain properly. This is an important step towards allocation
inlining.
Drive-by-fix: Rename ChangeBitToBool to ChangeBitToTagged,
ChangeBoolToBit to ChangeTaggedToBit, and ChangeInt31ToTagged to
ChangeInt31ToTaggedSigned for consistency.
CQ_INCLUDE_TRYBOTS=tryserver.v8:v8_linux64_tsan_rel
Review-Url: https://codereview.chromium.org/1941673002
Cr-Commit-Position: refs/heads/master@{#35924}
These also lower to subgraphs that have to be connected to the effect
and control chains, otherwise removing the atomic regions around heap
allocations would still be unsound.
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1916763003
Cr-Commit-Position: refs/heads/master@{#35762}
This allows us to get rid of the "push TruncateFloat64ToInt32 into Phi"
trick that was used in the MachineOperatorReducer to combine the
ChangeTaggedToFloat64 and TruncateFloat64ToInt32 operations. Instead of
doing that later, we can just introduce the proper operator during the
representation selection directly.
Also separate the TruncateFloat64ToInt32 machine operator, which had two
different meanings depending on a flag (either JavaScript truncation or
C++ style round to zero). Now there's a TruncateFloat64ToWord32 which
represents the JavaScript truncation (implemented via TruncateDoubleToI
macro + code stub) and the RoundFloat64ToInt32, which implements the C++
round towards zero operation (in the same style as the other WebAssembly
driven Round* machine operators).
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1919513002
Cr-Commit-Position: refs/heads/master@{#35743}
Get rid of further typing checks from ChangeLowering and put them into
the representation selection pass instead (encoding the information in
the operator instead).
Drive-by-change: Rename ChangeSmiToInt32 to ChangeTaggedSignedToInt32
for consistency about naming Tagged, TaggedSigned and TaggedPointer.
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1909343002
Cr-Commit-Position: refs/heads/master@{#35723}
If we have to convert a float64 value to tagged representation and we
already know that the value is either in Signed31/Signed32 or
Unsigned32 range, then we can just convert the float64 to word32 and
use the fast word32 to tagged conversion. Doing this in
ChangeLowering (or the effect linearization pass) would be unsound, as
the types on the nodes are no longer usable.
This removes all Type uses from effect linearization. There's still some
work to be done for ChangeLowering tho.
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1908093002
Cr-Commit-Position: refs/heads/master@{#35713}
Removes the register file machine register from the interpreter and
replaces it will loads from the parent frame pointer. As part of this
change the raw operand values for register values changes to enable the
interpreter to keep using the operand value as the offset from the
parent frame pointer.
BUG=v8:4280
LOG=N
Review URL: https://codereview.chromium.org/1894063002
Cr-Commit-Position: refs/heads/master@{#35618}
This introduces a compiler pass that schedules the graph and re-wires effect chain according to the schedule. It also connects allocating representation changes to the effect chain, and removes the BeginRegion and EndRegion nodes - they should not be needed anymore because all effectful nodes should be already wired-in.
This is an intermediate CL - the next step is to move lowering of the Change*ToTaggedEffect nodes to StateEffectIntroduction so that we do not have to introduce the effectful versions of nodes.
Review URL: https://codereview.chromium.org/1849603002
Cr-Commit-Position: refs/heads/master@{#35565}
These operators are really pure on the JavaScript level, and were only
part of the effect chain to make sure we don't accidentially schedule
them right after raw allocations, which is no longer an issue since we
now have the concept of atomic regions.
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1893543004
Cr-Commit-Position: refs/heads/master@{#35552}
This changes closure creation to lower to inline allocations when
possible instead of going through the FastNewClosureStub. It allows us
to leverage all advantages of inline allocations on closures. Note that
it is only safe to embed the raw entry point of the compile lazy stub
into the code, because that stub is immortal and immovable.
R=mvstanton@chromium.org
Review URL: https://codereview.chromium.org/1573153002
Cr-Commit-Position: refs/heads/master@{#35499}
This allows us to remove the turbofan bailout that we introduced
as a response to crbug.com/589792.
BUG=chromium:589792
LOG=n
Review URL: https://codereview.chromium.org/1884713003
Cr-Commit-Position: refs/heads/master@{#35493}
At some point we thought about using this instead of JSToNumber, but now
there doesn't seem to be any reason for this anymore.
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1890763002
Cr-Commit-Position: refs/heads/master@{#35469}
In simplified numbering, we make sanity checks based on types (e.g.,
NumberSubtract should take numbers as inputs), but this can be
violated if optimization passes make types less precise.
In this CL, we fix load elimination to make sure that types are
smaller in the store -> load elimination by taking an intersection
of the load's type with the store value's type and inserting a guard
with that type. Note that the load type comes from type feedback, so
it can be disjoint from the stored value type (in that case, this
must be dead code because the map chack for the load should prevent
us from using the stored value).
BUG=chromium:599412
LOG=n
Review URL: https://codereview.chromium.org/1857133003
Cr-Commit-Position: refs/heads/master@{#35259}
The background here is that graphs generated from WASM are not trimmed.
That means there can be some floating control diamonds that are not
reachable from end. An assertion in the scheduler for phis from floating
diamonds checks that the use edge in this situation is the control edge,
but in general, any edge could cause this.
Scheduling still works without this assertion. The longer term fix
is to either trim the graphs (more compile time overhead for WASM)
or improve the scheduler's handling of dead code in the graph. Currently
it does not schedule dead code but the potential use positions of
dead code are used in the computation of the common dominator of uses. We could
recognize dead nodes in PrepareUses() and check in GetBlockForUse()
as per TODO.
R=bradnelson@chromium.org, mstarzinger@chromium.org
BUG=
Review URL: https://codereview.chromium.org/1846933002
Cr-Commit-Position: refs/heads/master@{#35245}
This allows us to remove the troublesome %_MathClz32 intrinsic and also
allows us to utilize the functionality that is already available in
TurboFan. Also introduce a proper NumberClz32 operator so we don't need
to introduce a machine operator at the JS level.
R=epertoso@chromium.org
Review URL: https://codereview.chromium.org/1852553003
Cr-Commit-Position: refs/heads/master@{#35208}
We expect that the majority of malloc'd memory held by V8 is allocated
in Zone objects. Introduce an Allocator class that is used by Zones to
manage memory, and allows for querying the current usage.
BUG=none
R=titzer@chromium.org,bmeurer@chromium.org,jarin@chromium.org
LOG=n
TBR=rossberg@chromium.org
Review URL: https://codereview.chromium.org/1847543002
Cr-Commit-Position: refs/heads/master@{#35196}
Int64Mul is lowered to a new turbofan operator, Int32MulPair. The new
operator takes 4 inputs an generates 2 outputs. The inputs are the low
word of the left input, high word of the left input, the low word of the
right input, and high word of the right input. The ouputs are the low
and high word of the result of the multiplication.
R=titzer@chromium.org, v8-arm-ports@googlegroups.com
Review URL: https://codereview.chromium.org/1807273002
Cr-Commit-Position: refs/heads/master@{#35131}
The new implementation deals with cycles in the TF graph in two steps:
1) The lowering of phis is delayed to avoid cyclic dependencies.
2) The replacement nodes of phis are created already when the phi is
pushed onto the stack so that other nodes can use these replacements
for their lowering.
R=titzer@chromium.org
Review URL: https://codereview.chromium.org/1844553002
Cr-Commit-Position: refs/heads/master@{#35126}
This way we avoid the second deoptimization for the Math.floor and
Math.ceil builtins when -0 is involved. We still deoptimize the inlined
Crankshaft version in various cases, that's a separate issue.
The algorithm used for implement CodeStubAssembler::Float64Floor is
vaguely based on the fast math version used in the libm of various BSDs,
but had to be reengineered to match the EcmaScript specification.
R=epertoso@chromium.org
BUG=v8:2890, v8:4059
LOG=n
Review URL: https://codereview.chromium.org/1828253002
Cr-Commit-Position: refs/heads/master@{#35083}
This CL adds support for builtins with JavaScript linkage written using
the TurboFan CodeStubAssembler, but with a JSCall descriptor (which was
already supported thanks to a previous patch by Ben Smith). As a first
example, we convert the Math.sqrt builtin and thereby get rid of the
%_MathSqrt intrinsic, which causes trouble for the representation
selection pass in the JavaScript pipeline.
R=mstarzinger@chromium.org
Review URL: https://codereview.chromium.org/1824993002
Cr-Commit-Position: refs/heads/master@{#34989}
The CL also add guard nodes to places where we assume that certain
values are numbers.
Review URL: https://codereview.chromium.org/1821133002
Cr-Commit-Position: refs/heads/master@{#34977}
This new intrinsic is used by the desugared ES6 instanceof implementation for
the cases when the F[@@hasInstance] property is null or undefined.
R=mstarzinger@chromium.org
Review URL: https://codereview.chromium.org/1809993002
Cr-Commit-Position: refs/heads/master@{#34866}
Int64Sub is lowered to a new turbofan operator, Int32SubPair. The new
operator takes 4 inputs an generates 2 outputs. The inputs are the low
word of the left input, high word of the left input, the low word of the
right input, and high word of the right input. The ouputs are the low
and high word of the result of the subtraction.
The implementation is very similar to the implementation of Int64Add.
@v8-arm-ports: please take a careful look at the implementation of sbc
in the simulator.
R=titzer@chromium.org, v8-arm-ports@googlegroups.com
Review URL: https://codereview.chromium.org/1778893005
Cr-Commit-Position: refs/heads/master@{#34808}