C++ algorithms have largely standardized on a [begin, end) half-open
range, as seen in standard library containers. SkTQSort now adheres to
this model, and takes vec.begin() and vec.end() as its inputs.
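For illustration, a call under the new convention looks roughly like
this (a sketch only; the header path and the availability of a
comparator-less overload are assumptions, not stated in this CL):

    #include "src/core/SkTSort.h"  // declares SkTQSort (path assumed)
    #include <vector>

    void sortAscending(std::vector<int>& vec) {
        // Half-open [begin, end) range, matching std::sort's convention.
        SkTQSort(vec.begin(), vec.end());
    }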
To avoid confusion between inclusive and half-open ranges inside the
implementation, internal helper functions now take "left" and "count"
arguments instead of "left"/"right" or "begin"/"end".
(Although performance was not the main goal, this CL appears to
slightly improve our sorting benchmark on my machine.)
Change-Id: I5e96b6730be96cf23d001ee0915c69764b2c024a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/302579
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
This reverts commit 70474c1cb0.
Reason for revert: bot build failure
Original change's description:
> Remove custom SkSort algorithms.
>
> SortBench shows that SkTQSort and SkTHeapSort are inferior to std::sort.
> The difference is small on randomized inputs, but quite significant for
> semi-ordered inputs (forward/backward/repeated). There doesn't seem to
> be any compelling advantage to SkTQSort.
>
> Nanobench results: https://screenshot.googleplex.com/9JOLV1d6Z0u
>
> (These performance numbers are from an optimized build on my local
> machine; it's possible that we might see different results on the test bots.)
>
> Change-Id: Iaf19563041547eae7de2953be249129108f093b1
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/302295
> Commit-Queue: John Stiles <johnstiles@google.com>
> Reviewed-by: Mike Klein <mtklein@google.com>
TBR=mtklein@google.com,brianosman@google.com,johnstiles@google.com
Change-Id: I1126dd4cda95716dac225ad32d5b0e5cf3f09421
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/302447
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
SortBench shows that SkTQSort and SkTHeapSort are inferior to std::sort.
The difference is small on randomized inputs, but quite significant for
semi-ordered inputs (forward/backward/repeated). There doesn't seem to
be any compelling advantage to SkTQSort.
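For callers the migration is mechanical; a sketch of the before/after,
assuming the comparator-less overload:

    // Before:
    SkTQSort(vec.begin(), vec.end());
    // After:
    std::sort(vec.begin(), vec.end());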
Nanobench results: https://screenshot.googleplex.com/9JOLV1d6Z0u
(These performance numbers are from an optimized build on my local
machine; it's possible that we might see different results on the test bots.)
Change-Id: Iaf19563041547eae7de2953be249129108f093b1
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/302295
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Current strategy: write every #include path from the top of the
repository (see the before/after sketch after the list below).
Things to look at first are the manual changes:
- added tools/rewrite_includes.py
- removed -I directives from BUILD.gn
- various compile.sh simplifications
- tweak tools/embed_resources.py
- update gn/find_headers.py to write paths from the top
- update gn/gn_to_bp.py SkUserConfig.h layout
so that #include "include/config/SkUserConfig.h" always
gets the header we want.
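Roughly what the rewritten includes look like; the SkUserConfig.h path
is the one named above, while the bare-name "before" form is an
assumption about the old -I-based style:

    // Before: resolved via -I search paths listed in BUILD.gn
    #include "SkUserConfig.h"

    // After: spelled out from the top of the repository, no -I needed
    #include "include/config/SkUserConfig.h"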
No-Presubmit: true
Change-Id: I73a4b181654e0e38d229bc456c0d0854bae3363e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/209706
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Hal Canary <halcanary@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Change-Id: I9ba1caa4862bdf9ffc9c0e637bd69cce91fd8468
Reviewed-on: https://skia-review.googlesource.com/c/168740
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
I keep seeing it show up in the profile (usually under memmove) of tight
benchmarks, and it's kind of distracting. We don't even print it when we
pass --quiet, so that seems like a nice way to stifle it.
Change-Id: I3a67a7ca1758fd35e3b63cfeeddeac4ff1ffe38d
Reviewed-on: https://skia-review.googlesource.com/157520
Auto-Submit: Mike Klein <mtklein@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Adds a nanobench mode that takes samples for a fixed amount of time,
rather than taking a fixed number of samples.
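Not nanobench's actual code, but a minimal sketch of the idea (the
function name and signature are made up): keep sampling until a
wall-clock budget is spent instead of looping a fixed number of times.

    #include <chrono>
    #include <vector>

    template <typename Bench>
    std::vector<double> sampleForMs(Bench&& bench, double budgetMs) {
        using clock = std::chrono::steady_clock;
        std::vector<double> sampleMs;
        const auto start = clock::now();
        // Keep taking samples until the time budget runs out.
        while (std::chrono::duration<double, std::milli>(
                   clock::now() - start).count() < budgetMs) {
            const auto t0 = clock::now();
            bench();  // one timed run of the benchmark under test
            sampleMs.push_back(std::chrono::duration<double, std::milli>(
                                   clock::now() - t0).count());
        }
        return sampleMs;
    }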
BUG=skia:
Review URL: https://codereview.chromium.org/1204153002
This seems to be ~100x higher resolution than QueryPerformanceCounter. AFAIK, all our Windows perf bots have constant_tsc, so we can use rdtsc directly: it will always tick at the max CPU frequency.
That leaves one question: what max CPU frequency do we divide by? It turns out QueryPerformanceFrequency gives the CPU frequency in kHz on these machines, suspiciously exactly what we need to divide by to get elapsed milliseconds. That was a freebie.
I did some before/after comparison on slow benchmarks. Timings look the same. Going to land this without review tonight to see what happens on the bots; happy to review carefully tomorrow.
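A minimal sketch of that arithmetic (MSVC/Windows, assuming
constant_tsc as noted above; not the exact Skia code):

    #include <windows.h>
    #include <intrin.h>

    // Ticks per millisecond, relying on the observation above that
    // QueryPerformanceFrequency reports the TSC rate in kHz on these bots.
    static double ticks_per_ms() {
        LARGE_INTEGER freq;
        QueryPerformanceFrequency(&freq);
        return static_cast<double>(freq.QuadPart);
    }

    // Milliseconds elapsed since a TSC value captured with __rdtsc().
    double elapsed_ms(unsigned long long startTicks) {
        return static_cast<double>(__rdtsc() - startTicks) / ticks_per_ms();
    }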
R=mtklein@google.com
TBR=bungeman
BUG=skia:
Review URL: https://codereview.chromium.org/394363003