
CSuite: Local benchmarking help for V8 performance analysis

CSuite helps you make N averaged runs of a benchmark, then compare against a different binary and/or different flags. It knows about the "classic" benchmarks SunSpider, Kraken and Octane, which are still useful for investigating peak-performance scenarios. Each benchmark has a default number of runs, which you can override with the -r flag, as shown below:

  • SunSpider - 100 runs
  • Kraken - 80 runs
  • Octane - 10 runs
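
For example, to make 25 SunSpider runs instead of 100:

./csuite.py -r 25 sunspider baseline ./d8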

Usage

Say you want to see how much optimization buys you:

./csuite.py kraken baseline ~/src/v8/out/d8 -x="--noopt"
./csuite.py kraken compare ~/src/v8/out/d8

Suppose you are comparing two binaries and want a quick look at the results. Octane normally needs about 10 runs, but 3 will only take a few minutes:

./csuite.py -r 3 octane baseline ~/src/v8/out-master/d8
./csuite.py -r 3 octane compare ~/src/v8/out-mine/d8

You can run csuite from any directory:

../../somewhere-strange/csuite.py sunspider baseline ./d8
../../somewhere-strange/csuite.py sunspider compare ./d8-better

Note that all output files are created in the directory you run from: a _benchmark_runner_data directory stores the raw run output, and a _results directory stores the scores.
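
After a run, the layout looks roughly like this (a sketch, not an exhaustive listing):

_benchmark_runner_data/   # captured output from each benchmark run
_results/                 # averaged scores, consumed by compare mode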

For more detailed documentation, see:

./csuite.py --help

Output from the runners is captured into files and cached, so you can cancel and resume multi-hour benchmark runs with minimal loss of data and time. The -f flag forces re-running even when these cached files still exist.
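
As a rough sketch of this cache-and-resume pattern (not csuite's actual code; the function and parameter names here are hypothetical):

  import os
  import subprocess

  def run_cached(cmd, cache_path, force=False):
      # Hypothetical helper: reuse a previous run's captured output
      # unless forced to re-run (the moral equivalent of the -f flag).
      if os.path.exists(cache_path) and not force:
          with open(cache_path) as f:
              return f.read()
      # Run the benchmark command and capture its stdout into the cache
      # file, so an interrupted session only loses the run in progress.
      output = subprocess.run(cmd, capture_output=True, text=True,
                              check=True).stdout
      with open(cache_path, "w") as f:
          f.write(output)
      return output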