... that controls whether the TF graph zones should support compression.
Bug: v8:9923
Change-Id: Ifbe237b75e9c92e62eb32b69d6b3b1a818269b83
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2308347
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69036}
Design doc:
https://docs.google.com/document/d/1szInbXZfaErWW70d30hJsOLL0Es-l5_g8d2rXm1ZBqI/edit?usp=sharing
V8 can already collect data about how many times each basic block in the
builtins is run. This change enables using that data for profile-guided
optimization. New comments in BUILD.gn describe how to use this feature.
A few implementation details worth mentioning, which aren't covered in
the design doc:
- BasicBlockProfilerData currently contains an array of RPO numbers.
However, this array is always just [0, 1, 2, 3, ...], so this change
removes that array. A new DCHECK in BasicBlockInstrumentor::Instrument
ensures that the removal is valid.
- RPO numbers, while useful for printing data that matches with the
stringified schedule, are not useful for matching profiling data with
blocks that haven't been scheduled yet. This change adds a new array
of block IDs in BasicBlockProfilerData, so that block counters can be
used for PGO.
- Basic block counters need to be written to a file so that they can be
provided to a subsequent run of mksnapshot, but the design doc doesn't
specify the transfer format or what file is used. In this change, I
propose using the existing v8.log file for that purpose. Block count
records look like this:
block,TestLessThanHandler,37,29405
This line indicates that block ID 37 in TestLessThanHandler was run
29405 times. If multiple lines refer to the same block, the reader
adds them all together. I like this format because it's easy to use:
- V8 already has robust logic for creating the log file, naming it to
avoid conflicts in multi-process situations, etc.
- Line order doesn't matter, and interleaved writes from various
logging sources are fine, given that V8 writes each line atomically.
- Combining multiple sources of profiling data is as simple as
concatenating their v8.log files together.
- It is a good idea to avoid making any changes based on profiling data
if the function being compiled doesn't match the one that was
profiled, since it is common to use profiling data downloaded from a
central lab which is updated only periodically. To check whether a
function matches, I propose using a hash of the Graph state right
before scheduling. This might be stricter than necessary, as some
changes to the function might be small enough that the profile data is
still relevant, but I'd rather err on the side of not making incorrect
changes. This hash is also written to the v8.log file, in a line that
looks like this:
builtin_hash,LdaZeroHandler,3387822046
Bug: v8:10470
Change-Id: I429e5ce5efa94e01e7489deb3996012cf860cf13
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2220765
Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69008}
This adds a simple counter to Turbofan that's incremented throughout the compilation, hopefully
frequently enough so we can use it to detect divergence and performance bugs.
In addition, we assert that this counter never gets too high. That's the equivalent of a simple
timeout, just more deterministic. The limitations on Turbofan input size should guarantee that
we never exceed this limit. Since we probably do exceed it rarely, this check is only a DCHECK and
intended to detect performance and divergence issues, but not supposed to be performed in release
builds.
In addition, this CL adds UMA stats to observe the real world distribution of the tick measurement.
Bug: v8:9444
Change-Id: I182dac6ecac64715e3f5885ff5c7c17549351cd0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1695475
Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62754}
This is a reland of 0909dbe3d6.
Added missing V8_EXPORT_PRIVATE to AndroidLogStream.
TBR=mstarzinger@chromium.org
Original change's description:
> Introduce StdoutStream which prints to Android log or stdout
>
> The often used construct {OFStream(stdout)} does not work on Android.
> This CL introduces an {StdoutStream} which behaves exactly like
> {OFStream(stdout)} on non-android platforms, and redirects to the
> Android log on appropriate systems and configurations.
>
> R=mstarzinger@chromium.org
>
> Bug: v8:7820
> Change-Id: Ia682fdf6d064e37c605c19b032f5a10b96ac825b
> Reviewed-on: https://chromium-review.googlesource.com/1088911
> Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
> Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#53692}
Bug: v8:7820
Change-Id: I8164bad78a401dbe4246c9ffcacd050fe511ed58
Reviewed-on: https://chromium-review.googlesource.com/1100636
Reviewed-by: Clemens Hammacher <clemensh@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#53733}
This reverts commit 0909dbe3d6.
Reason for revert: Blocks roll:
https://chromium-review.googlesource.com/c/chromium/src/+/1099143
Original change's description:
> Introduce StdoutStream which prints to Android log or stdout
>
> The often used construct {OFStream(stdout)} does not work on Android.
> This CL introduces an {StdoutStream} which behaves exactly like
> {OFStream(stdout)} on non-android platforms, and redirects to the
> Android log on appropriate systems and configurations.
>
> R=mstarzinger@chromium.org
>
> Bug: v8:7820
> Change-Id: Ia682fdf6d064e37c605c19b032f5a10b96ac825b
> Reviewed-on: https://chromium-review.googlesource.com/1088911
> Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
> Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#53692}
TBR=mstarzinger@chromium.org,jarin@chromium.org,jgruber@chromium.org,clemensh@chromium.org,bmeurer@chromium.org
Change-Id: Iadadd9a0df10dca0fad647138a83db50148e864d
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:7820
Reviewed-on: https://chromium-review.googlesource.com/1100635
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Commit-Queue: Michael Achenbach <machenbach@chromium.org>
Cr-Commit-Position: refs/heads/master@{#53725}
The often used construct {OFStream(stdout)} does not work on Android.
This CL introduces an {StdoutStream} which behaves exactly like
{OFStream(stdout)} on non-android platforms, and redirects to the
Android log on appropriate systems and configurations.
R=mstarzinger@chromium.org
Bug: v8:7820
Change-Id: Ia682fdf6d064e37c605c19b032f5a10b96ac825b
Reviewed-on: https://chromium-review.googlesource.com/1088911
Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#53692}
This also includes the precise reducer name. Currently the information
is available in the node tooltip in turbolizer. The new shortcut 's' in
the graph view selects the nodes the currently selected nodes were created
from.
Bug: v8:7327
Change-Id: I7ca7327d0cfa112972e3567df6e4a223c8eff3c0
Reviewed-on: https://chromium-review.googlesource.com/1064059
Commit-Queue: Sigurd Schneider <sigurds@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#53258}
SourcePosition::InliningId() refers to a the new table DeoptimizationInputData::InliningPositions(), which provides the following data for every inlining id:
- The inlined SharedFunctionInfo as an offset into DeoptimizationInfo::LiteralArray
- The SourcePosition of the inlining. Recursively, this yields the full inlining stack.
Before the Code object is created, the same information can be found in CompilationInfo::inlined_functions().
If SourcePosition::InliningId() is SourcePosition::kNotInlined, it refers to the outer (non-inlined) function.
So every SourcePosition has full information about its inlining stack, as long as the corresponding Code object is known. The internal represenation of a source position is a positive 64bit integer.
All compilers create now appropriate source positions for inlined functions. In the case of Turbofan, this required using AstGraphBuilderWithPositions for inlined functions too. So this class is now moved to a header file.
At the moment, the additional information in source positions is only used in --trace-deopt and --code-comments. The profiler needs to be updated, at the moment it gets the correct script offsets from the deopt info, but the wrong script id from the reconstructed deopt stack, which can lead to wrong outputs. This should be resolved by making the profiler use the new inlining information for deopts.
I activated the inlined deoptimization tests in test-cpu-profiler.cc for Turbofan, changing them to a case where the deopt stack and the inlining position agree. It is currently still broken for other cases.
The following additional changes were necessary:
- The source position table (internal::SourcePositionTableBuilder etc.) supports now 64bit source positions. Encoding source positions in a single 64bit int together with the difference encoding in the source position table results in very little overhead for the inlining id, since only 12% of the source positions in Octane have a changed inlining id.
- The class HPositionInfo was effectively dead code and is now removed.
- SourcePosition has new printing and information facilities, including computing a full inlining stack.
- I had to rename compiler/source-position.{h,cc} to compiler/compiler-source-position-table.{h,cc} to avoid clashes with the new src/source-position.cc file.
- I wrote the new wrapper PodArray for ByteArray. It is a template working with any POD-type. This is used in DeoptimizationInputData::InliningPositions().
- I removed HInlinedFunctionInfo and HGraph::inlined_function_infos, because they were only used for the now obsolete Crankshaft inlining ids.
- Crankshaft managed a list of inlined functions in Lithium: LChunk::inlined_functions. This is an analog structure to CompilationInfo::inlined_functions. So I removed LChunk::inlined_functions and made Crankshaft use CompilationInfo::inlined_functions instead, because this was necessary to register the offsets into the literal array in a uniform way. This is a safe change because LChunk::inlined_functions has no other uses and the functions in CompilationInfo::inlined_functions have a strictly longer lifespan, being created earlier (in Hydrogen already).
BUG=v8:5432
Review-Url: https://codereview.chromium.org/2451853002
Cr-Commit-Position: refs/heads/master@{#40975}
This is preparation for using TF to create builtins that handle variable number of
arguments and have to remove these arguments dynamically from the stack upon
return.
The gist of the changes:
- Added a second argument to the Return node which specifies the number of stack
slots to pop upon return in addition to those specified by the Linkage of the
compiled function.
- Removed Tail -> Non-Tail fallback in the instruction selector. Since TF now should
handles all tail-call cases except where the return value type differs, this fallback
was not really useful and in fact caused unexpected behavior with variable
sized argument popping, since it wasn't possible to materialize a Return node
with the right pop count from the TailCall without additional context.
- Modified existing Return generation to pass a constant zero as the additional
pop argument since the variable pop functionality
LOG=N
Review-Url: https://codereview.chromium.org/2446543002
Cr-Commit-Position: refs/heads/master@{#40699}
Reason for revert:
Seems to break arm64 sim debug and blocks roll:
https://build.chromium.org/p/client.v8.ports/builders/V8%20Linux%20-%20arm64%20-%20sim%20-%20debug/builds/3294
Original issue's description:
> [turbofan] Support variable size argument removal in TF-generated functions
>
> This is preparation for using TF to create builtins that handle variable number of
> arguments and have to remove these arguments dynamically from the stack upon
> return.
>
> The gist of the changes:
> - Added a second argument to the Return node which specifies the number of stack
> slots to pop upon return in addition to those specified by the Linkage of the
> compiled function.
> - Removed Tail -> Non-Tail fallback in the instruction selector. Since TF now should
> handles all tail-call cases except where the return value type differs, this fallback
> was not really useful and in fact caused unexpected behavior with variable
> sized argument popping, since it wasn't possible to materialize a Return node
> with the right pop count from the TailCall without additional context.
> - Modified existing Return generation to pass a constant zero as the additional
> pop argument since the variable pop functionality
>
> LOG=N
TBR=bmeurer@chromium.org,mstarzinger@chromium.org,epertoso@chromium.org,danno@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
NOPRESUBMIT=true
Review-Url: https://codereview.chromium.org/2473643002
Cr-Commit-Position: refs/heads/master@{#40691}
This is preparation for using TF to create builtins that handle variable number of
arguments and have to remove these arguments dynamically from the stack upon
return.
The gist of the changes:
- Added a second argument to the Return node which specifies the number of stack
slots to pop upon return in addition to those specified by the Linkage of the
compiled function.
- Removed Tail -> Non-Tail fallback in the instruction selector. Since TF now should
handles all tail-call cases except where the return value type differs, this fallback
was not really useful and in fact caused unexpected behavior with variable
sized argument popping, since it wasn't possible to materialize a Return node
with the right pop count from the TailCall without additional context.
- Modified existing Return generation to pass a constant zero as the additional
pop argument since the variable pop functionality
LOG=N
Review-Url: https://codereview.chromium.org/2446543002
Cr-Commit-Position: refs/heads/master@{#40678}
This completely removes translation of exception handler predictions
from the graph IR. We now rely on the runtime using deoptimization
infomation via {FrameSummary} for predictions in optimized code.
R=bmeurer@chromium.org
Review-Url: https://codereview.chromium.org/2207533002
Cr-Commit-Position: refs/heads/master@{#38250}
The background here is that graphs generated from WASM are not trimmed.
That means there can be some floating control diamonds that are not
reachable from end. An assertion in the scheduler for phis from floating
diamonds checks that the use edge in this situation is the control edge,
but in general, any edge could cause this.
Scheduling still works without this assertion. The longer term fix
is to either trim the graphs (more compile time overhead for WASM)
or improve the scheduler's handling of dead code in the graph. Currently
it does not schedule dead code but the potential use positions of
dead code are used in the computation of the common dominator of uses. We could
recognize dead nodes in PrepareUses() and check in GetBlockForUse()
as per TODO.
R=bradnelson@chromium.org, mstarzinger@chromium.org
BUG=
Review URL: https://codereview.chromium.org/1846933002
Cr-Commit-Position: refs/heads/master@{#35245}
MachineType is now a class with two enum fields:
- MachineRepresentation
- MachineSemantic
Both enums are usable on their own, and this change switches some places from using MachineType to use just MachineRepresentation. Most notably:
- register allocator now uses just the representation.
- Phi and Select nodes only refer to representations.
Review URL: https://codereview.chromium.org/1513543003
Cr-Commit-Position: refs/heads/master@{#32738}
The test had an effect phi with one effect input connected to a loop with two control inputs. Also, the Terminate node was used by the effect phi.
Review URL: https://codereview.chromium.org/1398763002
Cr-Commit-Position: refs/heads/master@{#31193}
Verifies consistency of node inputs and uses:
- node inputs should agree with the input count computed from the node's operator.
- effect inputs should have effect outputs (or be a sentinel).
- control inputs should have control outputs (or be a sentinel).
- frame state inputs should be frame states (or be a sentinel).
- if the node has control uses, it should produce control.
- if the node has effect uses, it should produce effect.
- if the node has frame state uses, it must be a frame state.
I also removed some tests, either because they did not seem to be useful (scheduler) or they tested dead functionality (diamond effect phi).
Review URL: https://codereview.chromium.org/1368913002
Cr-Commit-Position: refs/heads/master@{#30927}
This is needed in order to allow expansion of a throwing node into a
set of nodes that produce different effects for the successful and the
exceptional continuation.
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/1179543002
Cr-Commit-Position: refs/heads/master@{#28918}
This introduces a conservative prediction for each exception handler
whether it will locally catch an exception or re-throw it to outside
the code bondaries. It will allow for a more intuitive prediction of
whether an exception is considered "caught" or "uncaught".
R=bmeurer@chromium.org,yangguo@chromium.org
BUG=chromium:492522
LOG=N
Review URL: https://codereview.chromium.org/1158563008
Cr-Commit-Position: refs/heads/master@{#28681}
This simplifies the handling of the End node. Based on this CL we will
finally fix terminating every loop from the beginning (via Terminate
nodes) and fix inlining of Throw, Deoptimize and Terminate.
R=mstarzinger@chromium.org
Review URL: https://codereview.chromium.org/1157023002
Cr-Commit-Position: refs/heads/master@{#28620}
Use these check points to optimize comparisons where we already know
that one side cannot be a String (or turn into a string via
ToPrimitive).
Also remove bunch of useless DoNotCrash tests for the scheduler that are
painful to maintain and add almost no value.
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/1140583004
Cr-Commit-Position: refs/heads/master@{#28383}
This revives the Terminate operator and removes the weird Always
operator. As a first step we let the ControlReducer connect non
terminating loops via Terminate. The next step will be to change the
graph builder to insert Terminate nodes into every loop.
Review URL: https://codereview.chromium.org/1123213002
Cr-Commit-Position: refs/heads/master@{#28259}
Tail calls are matched on the graph, with a dedicated tail call
optimization that is actually testable. The instruction selection can
still fall back to a regular if the platform constraints don't allow to
emit a tail call (i.e. the return locations of caller and callee differ
or the callee takes non-register parameters, which is a restriction that
will be removed in the future).
Also explicitly limit tail call optimization to stubs for now and drop
the global flag.
BUG=v8:4076
LOG=n
Review URL: https://codereview.chromium.org/1114163005
Cr-Commit-Position: refs/heads/master@{#28219}
Implements the strong mode proposal's restrictions on
implicit conversions for binary arithmetic operations, not
including the + special case. Adds some infrastructure
for future implementation of the restrictions for other
operators.
BUG=v8:3956
LOG=N
Review URL: https://codereview.chromium.org/1092353002
Cr-Commit-Position: refs/heads/master@{#28045}
Now all nodes that care about deoptimization always take frame state
inputs no matter whether deoptimization is enabled for a particular
function. In case that deoptimization is off, the AstGraphBuilder just
inserts the empty frame state. This greatly simplifies the logic in
various places and makes testing easier as well, and is probably the
first step towards enabling --turbo-deoptimization by default.
There seems to be no noticable performance impact on asm.js programs.
Also fix the graph replay in order to regenerate the scheduler unittests.
Review URL: https://codereview.chromium.org/1106613003
Cr-Commit-Position: refs/heads/master@{#28026}
This adds a new ControlFlowOptimizer that - for now - recognizes chains
of Branches generated by the SwitchBuilder for a subset of javascript
switches into Switch nodes. Those Switch nodes are then lowered to
either table or lookup switches.
Also rename Case to IfValue (and introduce IfDefault) for consistency.
BUG=v8:3872
LOG=n
Review URL: https://codereview.chromium.org/931623002
Cr-Commit-Position: refs/heads/master@{#26691}
Adds Switch and Case operators to TurboFan and handles them
appropriately in instruction selection and code generation.
BUG=v8:3872
LOG=n
Review URL: https://codereview.chromium.org/892513003
Cr-Commit-Position: refs/heads/master@{#26515}
If a (pure) node has two or more uses, but there exists a path from the
common dominator of these uses to end, which does not contain a use,
then we split the node such that no unnecessary computation takes place.
Note however, that this only applies if the node cannot be hoisted out
of a loop.
BUG=v8:3864
LOG=n
R=jarin@chromium.org
Review URL: https://codereview.chromium.org/899433005
Cr-Commit-Position: refs/heads/master@{#26404}
Up until now we used a special Terminate node to artifically connect non
terminating loops to the End node, but this was kind of adhoc and didn't
work for the CFG. So without all kinds of weird hacks, the end block in
the CFG will not be connected to NTLs, which makes it impossible to
compute post dominance / control dependence in the current setting.
So instead of Terminate, we add a special Branch to NTLs, whose
condition is the special Always node, which corresponds to True, except
that it cannot be folded away. This way we don't need any special
machinery in the scheduler, since it's just a regular Branch.
R=titzer@chromium.org
Review URL: https://codereview.chromium.org/875263004
Cr-Commit-Position: refs/heads/master@{#26294}