This change significantly improves the performance of string
concatenation in optimized code for the case where the resulting string
is represented as a ConsString. On the relevant test cases we go from
serializeNaive: 10762 ms.
serializeClever: 7813 ms.
serializeConcat: 10271 ms.
to
serializeNaive: 10278 ms.
serializeClever: 5533 ms.
serializeConcat: 10310 ms.
which represents a 30% improvement on the "clever" benchmark, which
tests specifically the ConsString creation performance.
This was accomplished via a couple of different steps, which are briefly
outlined here:
1. The empty_string gets its own map, so that we can easily recognize
and handle it appropriately in the TurboFan type system. This
allows us to express (and assert) that the inputs to NewConsString
are non-empty strings, making sure that TurboFan no longer creates
"crippled ConsStrings" with empty left or right hand sides.
2. Further split the existing String types in TurboFan to be able to
distinguish between OneByte and TwoByte strings on the type system
level. This allows us to avoid having to dynamically lookup the
resulting ConsString map in case of ConsString creation (i.e. when
we know that both input strings are OneByte strings or at least
one of the input strings is TwoByte).
3. We also introduced more finegrained feedback for the Add bytecode
in the interpreter, having it collect feedback about ConsStrings,
specifically ConsOneByteString and ConsTwoByteString. This feedback
can be used by TurboFan to only inline the relevant code for what
was seen so far. This allows us to remove the Octane/Splay specific
magic in JSTypedLowering to detect ConsString creation, and instead
purely rely on the feedback of what was seen so far (also making it
possible to change the semantics of NewConsString to be a low-level
operator, which is only introduced in SimplifiedLowering by looking
at the input types of StringConcat).
4. On top of the before mentioned type and interpreter changes we added
new operators CheckNonEmptyString, CheckNonEmptyOneByteString, and
CheckNonEmptyTwoByteString, which perform the appropriate (dynamic)
checks.
There are several more improvements that are possible based on this, but
since the change was already quite big, we decided not to put everything
into the first change, but do some follow up tweaks to the type system,
and builtin optimizations later.
Tbr: mstarzinger@chromium.org
Bug: v8:8834, v8:8931, v8:8939, v8:8951
Change-Id: Ia24e17c6048bf2b04df966d3cd441f0edda05c93
Cq-Include-Trybots: luci.chromium.try:linux-blink-rel
Doc: https://bit.ly/fast-string-concatenation-in-javascript
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1499497
Commit-Queue: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#60318}
We want to allocate feedback vectors lazily in lite mode. To do that,
we should create closures with the correct feedback cell. This cl
allocates feedback cell arrays to hold these feedback cells in lite mode.
This cl also modifies the compile lazy to builtin to expect these arrays
in the feedback cell.
Drive-by fix: InterpreterEntryTrampoline no longer has argument count in
a register. So updated comments and removed unnecessary push/pop of this
register.
Bug: v8:8394
Change-Id: I10d8ca67cebce61a284f0c80b200e1f0c24577a2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1511274
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#60189}
Instead of eliminating bounds checks based on types, we introduce
an aborting bounds check that crashes rather than deopts.
Bug: v8:8806
Change-Id: Icbd9c4554b6ad20fe4135b8622590093679dac3f
Reviewed-on: https://chromium-review.googlesource.com/c/1460461
Commit-Queue: Jaroslav Sevcik <jarin@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#59467}
This introduces Word64 support for the CheckBounds operator, which now
lowers to either CheckedUint32Bounds or CheckedUint64Bounds after the
representation selection. The right hand side of CheckBounds can now
be any positive safe integer on 64-bit architectures, whereas it remains
Unsigned31 for 32-bit architectures. We only use the extended Word64
support when the right hand side is outside the Unsigned31 range, so
for everything except DataViews this means that the performance should
remain the same. The typing rule for the CheckBounds operator was
updated to reflect this new behavior.
The CheckBounds with a right hand side outside the Unsigned31 range will
pass a new Signed64 feedback kind, which is handled with newly introduced
CheckedFloat64ToInt64 and CheckedTaggedToInt64 operators in representation
selection.
The JSCallReducer lowering for DataView getType()/setType() methods was
updated to not smi-check the [[ByteLength]] and [[ByteOffset]] anymore,
but instead just use the raw uintptr_t values and operate on any value
(for 64-bit architectures these fields can hold any positive safe
integer, for 32-bit architectures it's limited to Unsigned31 range as
before). This means that V8 can now handle huge DataViews fully, without
falling off a performance cliff.
This refactoring even gave us some performance improvements, on a simple
micro-benchmark just exercising different DataView accesses we go from
testDataViewGetUint8: 796 ms.
testDataViewGetUint16: 997 ms.
testDataViewGetInt32: 994 ms.
testDataViewGetFloat64: 997 ms.
to
testDataViewGetUint8: 895 ms.
testDataViewGetUint16: 889 ms.
testDataViewGetInt32: 888 ms.
testDataViewGetFloat64: 890 ms.
meaning we lost around 10% on the single byte case, but gained 10% across
the board for all the other element sizes.
Design-Document: http://bit.ly/turbofan-word64
Bug: chromium:225811, v8:4153, v8:7881, v8:8171, v8:8383
Change-Id: Ic9d1bf152e47802c04dcfd679372e5c85e4abc83
Reviewed-on: https://chromium-review.googlesource.com/c/1303732
Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#57095}
This changes the ReceiverOrOddball feedback on JSStrictEqual to
ReceiverOrNullOrUndefined feedback, which can also safely be
consumed by JSEqual (we cannot generally accept any oddball here
since booleans trigger implicit conversions, unfortunately).
Thus we replace the previously introduced CheckReceiverOrOddball
with CheckReceiverOrNullOrUndefined, and drop CheckOddball, since
we will no longer collect Oddball feedback separately.
TurboFan will then turn a JSEqual[ReceiverOrNullOrUndefined] into
a sequence like this:
```
left = CheckReceiverOrNullOrUndefined(left);
right = CheckReceiverOrNullOrUndefined(right);
result = if ObjectIsUndetectable(left) then
ObjectIsUndetectable(right)
else
ReferenceEqual(left, right);
```
This significantly improves the peak performance of abstract equality
with Receiver, Null or Undefined inputs. On the test case outlined in
http://crbug.com/v8/8356 we go from
naive: 2946 ms.
tenary: 2134 ms.
to
naive: 2230 ms.
tenary: 2250 ms.
which corresponds to a 25% improvement on the abstract equality case.
For regular code this will probably yield more performance, since we
get rid of the JSEqual operator, which might have arbitrary side
effects and thus blocks all kinds of TurboFan optimizations. The
JSStrictEqual case is slightly slower now, since it has to rule out
booleans as well (even though that's not strictly necessary, but
consistency is key here).
This way developers can safely use `a == b` instead of doing a dance
like `a == null ? b == null : a === b` (which is what dart2js does
right now) when both `a` and `b` are known to be Receiver, Null or
Undefined. The abstract equality is not only faster to parse than
the tenary, but also generates a shorter bytecode sequence. In the
test case referenced in http://crbug.com/v8/8356 the bytecode for
`naive` is
```
StackCheck
Ldar a1
TestEqual a0, [0]
JumpIfFalse [5]
LdaSmi [1]
Return
LdaSmi [2]
Return
```
which is 14 bytes, whereas the `tenary` function generates
```
StackCheck
Ldar a0
TestUndetectable
JumpIfFalse [7]
Ldar a1
TestUndetectable
Jump [7]
Ldar a1
TestEqualStrict a0, [0]
JumpIfToBooleanFalse [5]
LdaSmi [1]
Return
LdaSmi [2]
Return
```
which is 24 bytes. So the `naive` version is 40% smaller and requires
fewer bytecode dispatches.
Bug: chromium:898455, v8:8356
Change-Id: If3961b2518b4438700706b3bd6071d546305e233
Reviewed-on: https://chromium-review.googlesource.com/c/1297315
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56948}
This CL introduces proper Oddball and ReceiverOrOddball states for the
CompareOperationFeedback, and updates the StrictEqual IC to collect this
feedback as well. Previously it would not collect Oddball feedback, not
even in the sense of NumberOrOddball, since that's not usable for the
SpeculativeNumberEqual.
The new feedback is handled via newly introduced CheckReceiverOrOddball
and CheckOddball operators in TurboFan, introduced by JSTypedLowering.
Just like with the Receiver feedback, it's enough to check one side and
do a ReferenceEqual afterwards, since strict equal can only yield true
if both sides refer to the same instance.
This improves the benchmark mentioned in http://crbug.com/v8/8356 from
naive: 2950 ms.
tenary: 2456 ms.
to around
naive: 2996 ms.
tenary: 2192 ms.
which corresponds to a roughly 10% improvement in the case for the
tenary pattern, which is currently used by dart2js. In real world
scenarios this will probably help even more, since TurboFan is able
to optimize across the strict equality, i.e. there's no longer a stub
call forcibly spilling all registers that are live across the call.
This new feedback will be used as a basis for the JSEqual support for
ReceiverOrOddball, which will allow dart2js switching to the shorter
a==b form, at the same peak performance.
Bug: v8:8356
Change-Id: Iafbf5d64fcc9312f9e575b54c32c631ce9b572b2
Reviewed-on: https://chromium-review.googlesource.com/c/1297309
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56925}
There's no ambiguity and the shorter name makes things easier to read.
Bug: v8:7790
Change-Id: Ibcf3fd7f38a91e26a83cd335fad0ec80a5fe9be1
Reviewed-on: https://chromium-review.googlesource.com/c/1278392
Commit-Queue: Georg Neis <neis@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Reviewed-by: Maya Lekova <mslekova@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56623}
As identified in the web-tooling-benchmark, there are specific code
patterns involving array indexed property accesses and subsequent
comparisons of those indices that lead to repeated Smi checks in the
optimized code, which in turn leads to high register pressure and
generally bad register allocation. An example of this pattern is
code like this:
```js
function f(a, n) {
const i = a[n];
if (n >= 1) return i;
}
```
The `a[n]` property access introduces a CheckBounds on `n`, which
later lowers to a `CheckedTaggedToInt32[dont-check-minus-zero]`,
however the `n >= 1` comparison has collected `SignedSmall` feedback
and so it introduces a `CheckedTaggedToTaggedSigned` operation. This
second Smi check is redundant and cannot easily be combined with the
earlier tagged->int32 conversion, since that also deals with heap
numbers and even truncates -0 to 0.
So we teach the RedundancyElimination to look at the inputs of these
speculative number comparisons and if there's a leading bounds check
on either of these inputs, we change the input to the result of the
bounds check. This avoids the redundant Smi checks later and generally
allows the SimplifiedLowering to do a significantly better job on the
number comparisons. We only do this in case of SignedSmall feedback
and only for inputs that are not already known to be in UnsignedSmall
range, to avoid doing too many (unnecessary) expensive lookups during
RedundancyElimination.
All of this is safe despite the fact that CheckBounds truncates -0
to 0, since the regular number comparisons in JavaScript identify
0 and -0 (unlike Object.is()). This also adds appropriate tests,
especially for the interesting cases where -0 is used only after
the code was optimized.
Bug: v8:6936, v8:7094
Change-Id: Ie37114fb6192e941ae1a4f0bfe00e9c0a8305c07
Reviewed-on: https://chromium-review.googlesource.com/c/1246181
Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56428}
This reverts commit 4fd92b252b.
Reason for revert: Significant tankage on the no-mitigations bots (bad timing on the regular bots)
Original change's description:
> [turbofan] Do not consume SignedSmall feedback in TurboFan anymore.
>
> This changes TurboFan to treat SignedSmall feedback similar to Signed32
> feedback for binary and compare operations, in order to simplify and
> unify the machinery.
>
> This is an experiment. If this turns out to tank performance, we will
> need to revisit and ideally revert this change.
>
> Bug: v8:7094
> Change-Id: I885769c2fe93d8413e59838fbe844650c848c3f1
> Reviewed-on: https://chromium-review.googlesource.com/c/1261442
> Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
> Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#56411}
TBR=jarin@chromium.org,bmeurer@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: v8:7094
Change-Id: I9fff3b40e6dc0ceb7611b55e1ca9940089470404
Reviewed-on: https://chromium-review.googlesource.com/c/1267175
Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56427}
This changes TurboFan to treat SignedSmall feedback similar to Signed32
feedback for binary and compare operations, in order to simplify and
unify the machinery.
This is an experiment. If this turns out to tank performance, we will
need to revisit and ideally revert this change.
Bug: v8:7094
Change-Id: I885769c2fe93d8413e59838fbe844650c848c3f1
Reviewed-on: https://chromium-review.googlesource.com/c/1261442
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56411}
This is just a cleanup.
Bug: v8:7790
Change-Id: Ic0114451159b8c504f527f3cf3bdaed6a8cc8741
Reviewed-on: https://chromium-review.googlesource.com/1243103
Commit-Queue: Georg Neis <neis@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56206}
Remove the NumberConstant right hand side limitation for the speculative
number operation optimization, and extend the logic to also deal with
SpeculativeToNumber, which is common when dealing with postfix increment
and array operations.
Also add appropriate tests for all the relevant cases, specifically we
mjsunit tests to increase the general coverage for the various cases
here (in addition to dedicated unittests).
Bug: v8:8015
Change-Id: I8c92f98490c63b07eb19686efd404322979e57c4
Reviewed-on: https://chromium-review.googlesource.com/1235919
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56072}
Make the RedundancyElimination handle all simplified operators that are
listed in the SIMPLIFIED_CHECKED_OP_LIST, and fix a couple of bugs and
oversights in the code. This also adds a lot of test coverage for all
the cases that we care about in RedundancyElimination (with respect to
Check/Checked simplified operators).
Bug: v8:8015
Change-Id: I57d29113389841b09abcd013313bf5dd1c67735f
Reviewed-on: https://chromium-review.googlesource.com/1233655
Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#56032}