922983dfd3
Design doc: https://docs.google.com/document/d/1szInbXZfaErWW70d30hJsOLL0Es-l5_g8d2rXm1ZBqI/edit?usp=sharing

V8 can already collect data about how many times each basic block in the builtins is run. This change enables using that data for profile-guided optimization. New comments in BUILD.gn describe how to use this feature.

A few implementation details worth mentioning, which aren't covered in the design doc:

- BasicBlockProfilerData currently contains an array of RPO numbers. However, this array is always just [0, 1, 2, 3, ...], so this change removes that array. A new DCHECK in BasicBlockInstrumentor::Instrument ensures that the removal is valid.

- RPO numbers, while useful for printing data that matches with the stringified schedule, are not useful for matching profiling data with blocks that haven't been scheduled yet. This change adds a new array of block IDs in BasicBlockProfilerData, so that block counters can be used for PGO.

- Basic block counters need to be written to a file so that they can be provided to a subsequent run of mksnapshot, but the design doc doesn't specify the transfer format or what file is used. In this change, I propose using the existing v8.log file for that purpose. Block count records look like this:

    block,TestLessThanHandler,37,29405

  This line indicates that block ID 37 in TestLessThanHandler was run 29405 times. If multiple lines refer to the same block, the reader adds them all together. I like this format because it's easy to use:

  - V8 already has robust logic for creating the log file, naming it to avoid conflicts in multi-process situations, etc.
  - Line order doesn't matter, and interleaved writes from various logging sources are fine, given that V8 writes each line atomically.
  - Combining multiple sources of profiling data is as simple as concatenating their v8.log files together.

- It is a good idea to avoid making any changes based on profiling data if the function being compiled doesn't match the one that was profiled, since it is common to use profiling data downloaded from a central lab which is updated only periodically. To check whether a function matches, I propose using a hash of the Graph state right before scheduling. This might be stricter than necessary, as some changes to the function might be small enough that the profile data is still relevant, but I'd rather err on the side of not making incorrect changes. This hash is also written to the v8.log file, in a line that looks like this:

    builtin_hash,LdaZeroHandler,3387822046

Bug: v8:10470
Change-Id: I429e5ce5efa94e01e7489deb3996012cf860cf13
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2220765
Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69008}

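
The `block` and `builtin_hash` record formats described in the commit message can be illustrated with a small standalone reader. This is only a sketch under stated assumptions, not V8's actual implementation: `BuiltinProfile`, `SplitFields`, `ReadProfile`, and `ProfileMatches` are hypothetical names, the hash is assumed to fit in 32 bits, and malformed numeric fields are not handled (`std::stoi` and friends would throw). It shows the two properties the message relies on: duplicate block lines are summed, and unrelated log lines are ignored, so concatenated v8.log files can be read as one.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for the profiling records found in a v8.log file.
struct BuiltinProfile {
  // (builtin name, block ID) -> total execution count.
  std::map<std::pair<std::string, int>, uint64_t> block_counts;
  // Builtin name -> graph hash recorded when the profile was taken.
  // Assumption: the hash fits in 32 bits.
  std::map<std::string, uint32_t> hashes;
};

// Splits one log line on commas.
std::vector<std::string> SplitFields(const std::string& line) {
  std::vector<std::string> fields;
  std::stringstream stream(line);
  std::string field;
  while (std::getline(stream, field, ',')) fields.push_back(field);
  return fields;
}

// Reads every line of `log`, keeping only "block" and "builtin_hash" records.
// Counts for repeated (builtin, block ID) pairs are added together, so line
// order and concatenation of several logs do not matter.
BuiltinProfile ReadProfile(std::istream& log) {
  BuiltinProfile profile;
  std::string line;
  while (std::getline(log, line)) {
    std::vector<std::string> f = SplitFields(line);
    if (f.size() == 4 && f[0] == "block") {
      profile.block_counts[std::make_pair(f[1], std::stoi(f[2]))] +=
          std::stoull(f[3]);
    } else if (f.size() == 3 && f[0] == "builtin_hash") {
      profile.hashes[f[1]] = static_cast<uint32_t>(std::stoul(f[2]));
    }
  }
  return profile;
}

// Returns true only if a hash was recorded for `builtin` and it equals the
// hash of the function currently being compiled. A mismatch or a missing
// record means the profile should not be used for that builtin.
bool ProfileMatches(const BuiltinProfile& profile, const std::string& builtin,
                    uint32_t current_hash) {
  auto it = profile.hashes.find(builtin);
  return it != profile.hashes.end() && it->second == current_hash;
}
```

A caller would pass an open `std::ifstream` (or a `std::istringstream` holding several concatenated logs) to `ReadProfile`, then consult `ProfileMatches` before using any block counts for a given builtin.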
||
---|---|---|
.. | ||
arm | ||
arm64 | ||
backend | ||
ia32 | ||
mips | ||
mips64 | ||
ppc | ||
regalloc | ||
s390 | ||
x64 | ||
branch-elimination-unittest.cc | ||
bytecode-analysis-unittest.cc | ||
checkpoint-elimination-unittest.cc | ||
common-operator-reducer-unittest.cc | ||
common-operator-unittest.cc | ||
compiler-test-utils.h | ||
constant-folding-reducer-unittest.cc | ||
control-equivalence-unittest.cc | ||
control-flow-optimizer-unittest.cc | ||
dead-code-elimination-unittest.cc | ||
decompression-optimizer-unittest.cc | ||
diamond-unittest.cc | ||
effect-control-linearizer-unittest.cc | ||
graph-reducer-unittest.cc | ||
graph-reducer-unittest.h | ||
graph-trimmer-unittest.cc | ||
graph-unittest.cc | ||
graph-unittest.h | ||
int64-lowering-unittest.cc | ||
js-call-reducer-unittest.cc | ||
js-create-lowering-unittest.cc | ||
js-intrinsic-lowering-unittest.cc | ||
js-native-context-specialization-unittest.cc | ||
js-operator-unittest.cc | ||
js-typed-lowering-unittest.cc | ||
linkage-tail-call-unittest.cc | ||
load-elimination-unittest.cc | ||
loop-peeling-unittest.cc | ||
machine-operator-reducer-unittest.cc | ||
machine-operator-unittest.cc | ||
node-cache-unittest.cc | ||
node-matchers-unittest.cc | ||
node-properties-unittest.cc | ||
node-test-utils.cc | ||
node-test-utils.h | ||
node-unittest.cc | ||
opcodes-unittest.cc | ||
persistent-unittest.cc | ||
redundancy-elimination-unittest.cc | ||
schedule-unittest.cc | ||
scheduler-rpo-unittest.cc | ||
scheduler-unittest.cc | ||
simplified-lowering-unittest.cc | ||
simplified-operator-reducer-unittest.cc | ||
simplified-operator-unittest.cc | ||
state-values-utils-unittest.cc | ||
typed-optimization-unittest.cc | ||
typer-unittest.cc | ||
value-numbering-reducer-unittest.cc | ||
zone-stats-unittest.cc |