Choosing pixel formats is quite slow (depending on driver?). We were
doing this once per context, and it added up. On my desktop windows
machine, this saves another 7 seconds in `dm --config gl --src gm`.
Actual times:
37s -> 30s (not writing PNGs)
47s -> 39.5s (writing PNGs)
We always called this with MSAA sample count set to zero, so I cleaned
up the code to make that clearer. Also included a comment about the
theoretical risk, although I think that outside of a multi-GPU system,
we're fine.
Bug: skia:
Change-Id: I50927ebfaf6fe8d88a63674427fbf9e06e4ab059
Reviewed-on: https://skia-review.googlesource.com/35763
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit 3fd295550f.
Reason for revert: breaking things
Original change's description:
> Add GrTextureOp and use to implement SkGpuDevice::drawImage[Rect]() when possible
>
> This op draws a texture rectangle in src over blending with no edge antialiasing. It less powerful than NonAAFillRectOp/GrPaint but has less CPU overhead.
>
> Change-Id: Ia6107bb67c1c2a83de14c665aff64b0de2750fba
> Reviewed-on: https://skia-review.googlesource.com/33802
> Commit-Queue: Brian Salomon <bsalomon@google.com>
> Reviewed-by: Robert Phillips <robertphillips@google.com>
TBR=djsollen@google.com,bsalomon@google.com,robertphillips@google.com,brianosman@google.com
Change-Id: I9cdbeeac15b17d2d6b3385560ed826397c0373c6
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/36220
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
This op draws a texture rectangle in src over blending with no edge antialiasing. It less powerful than NonAAFillRectOp/GrPaint but has less CPU overhead.
Change-Id: Ia6107bb67c1c2a83de14c665aff64b0de2750fba
Reviewed-on: https://skia-review.googlesource.com/33802
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Change-Id: I97a844b2f289d2518f60a64f94d60551c4530dd4
Reviewed-on: https://skia-review.googlesource.com/35742
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Catapult (Chrome tracing) has a hard upper limit of 256 MB of JSON data.
This is independent of the number of events, because V8 can't store a
single string longer than that. Before these changes, longer traces
(eg all GL GMs, which was my test case) would be much larger (306 MB).
This CL includes four changes that help to reduce the text size:
1) Offset timestamps (saved 7.3 MB)
2) Limit timestamps and durations to 3 digits (saved 10.7 MB)
3) Shorten thread IDs (saved 7.2 MB)
4) Omit categories from JSON (saved 25.7 MB)
Note that category filtering still works, this just prevents us from
writing the categories to the JSON, which was of limited value.
At this point, my 306 MB file is now 255.3 MB, and loads.
Bug: skia:
Change-Id: Iaafc84025ddd52904f1ce9c1c2e9cbca65113079
Reviewed-on: https://skia-review.googlesource.com/35523
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Brian Osman <brianosman@google.com>
Context is in the below bug
Bug: skia:6918
Change-Id: Ic9048311092bd7e73dd6ee182e79abea79baa07a
Reviewed-on: https://skia-review.googlesource.com/30586
Commit-Queue: Ravi Mistry <rmistry@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
SkFAIL is a legacy macro which is just SK_ABORT. This CL mechanically
changes uses of SkFAIL to SK_ABORT in preparation for its removal. The
related sk_throw macro will be changed independently, due to needing to
actually clean up its users.
Change-Id: Id70b5c111a02d2458dc60c8933f444df27d9cebb
Reviewed-on: https://skia-review.googlesource.com/35284
Reviewed-by: Derek Sollenberger <djsollen@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
Removes the need for strdup with copied strings, paves the way for
more (and richer) payload, and shrinks the average event way down.
Bug: skia:
Change-Id: I9604fe713c34cfc877dce84563af89c579abd65b
Reviewed-on: https://skia-review.googlesource.com/35166
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
After turning on exceptions in test tools,
iOS builds of ok started saying atoi() was not declared.
I don't even want to know.
Change-Id: I52f354a1f25ec042bf2161a4c5dd9276aa25e46a
Reviewed-on: https://skia-review.googlesource.com/34961
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
The only interesting difference here is that I've just skipped
cd_Documents() on Google3 iOS builds rather than adding a new target to
BUILD. We don't run the binary so it's kind of moot what directory it'd
run in.
Change-Id: I1994e0283d24bcc505fa9b2b7b58307eafa5be92
Reviewed-on: https://skia-review.googlesource.com/34742
Reviewed-by: Ben Wagner <benjaminwagner@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
I'm betting big on ok bench. This is a forcing function.
Change-Id: I8c359b7d712e16f8f0cbb90591801e0014073288
Reviewed-on: https://skia-review.googlesource.com/33660
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This new source acts like other sources (GMs, SKPs) for benchmarks. It
times multiple samples (controlled by samples=N, default 20), and each
of those samples uses the same strategy as monobench, growing loops
exponentially until it runs for at least 10ms.
When done it prints the fastest and the two slowest samples. In
practice the 100th percentile sample is very different from the
next slowest due to caching, and the fastest is always interesting.
Because these benchmarks run in whatever execution engine ok has
selected, on non-Windows platforms you have some real control over the
interaction between benchmarks. In its default "fork" mode each
benchmark runs independently in its own process, so the 100th
percentiles really stand out. The other modes "thread" and "serial"
work as you'd expect too.
Here's an example where you can see how the different interactions work:
out/ok bench:samples=100 8888 filter:search=text_16_AA fork
[text_16_AA_WT] 2.32µs @0 6.23µs @99 24.3ms @100
[text_16_AA_FF] 2.41µs @0 5.7µs @99 23.3ms @100
[text_16_AA_88] 2.55µs @0 5.6µs @99 24.8ms @100
[text_16_AA_BK] 1.97µs @0 5.44µs @99 23.2ms @100
out/ok bench:samples=100 8888 filter:search=text_16_AA thread
[text_16_AA_FF] 2.45µs @0 23.5µs @99 24.8ms @100
[text_16_AA_WT] 2.52µs @0 17.8µs @99 24.7ms @100
[text_16_AA_88] 2.55µs @0 19.7µs @99 25.1ms @100
[text_16_AA_BK] 1.8µs @0 14.7µs @99 25.1ms @100
out/ok bench:samples=100 8888 filter:search=text_16_AA serial
[text_16_AA_88] 2.35µs @0 3.53µs @99 16.7ms @100
[text_16_AA_FF] 2.09µs @0 2.73µs @99 2.91µs @100
[text_16_AA_BK] 1.75µs @0 2.46µs @99 2.65µs @100
[text_16_AA_WT] 2.1µs @0 3.16µs @99 3.17µs @100
In the first "fork" case all runs are independent and have roughly
the same profile. "thread" looks similar except you can see them
contending at the 99th percentile. In "serial", the first bench
warms up the rest, so their 100th percentiles are all much faster.
Change-Id: I01a9f8c54b540221a9f232b271bb8ef3fda2569c
Reviewed-on: https://skia-review.googlesource.com/33585
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
I tried to follow exactly the same strategy as a start.
(Though I did fix the off-by-one dimensions.)
It does rather look like we only need 3D and 4D now
that I've looked at the call sites.
Looks like about a 20% speedup.
Change-Id: I8b1af64750ad1750716ee1ab0767e64591c7206a
Reviewed-on: https://skia-review.googlesource.com/32842
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This is a stand-alone helper class for writing properly
structured JSON to an SkWStream. It currently solves two
problems (although this CL only uses it in one context):
1) Performance. Writing out JSON this way is about 10x
faster than using JSONCPP. For the large amounts of data
generated by the tracing system, that's a big win.
2) Makes it easy to emit structured JSON from code that's
not fully centralized. We'd like to spit out JSON that
describes a GrContext, GrGpu, GrCaps, etc... Doing that
with simple string manipulation is complex, and spreads
this logic over all those functions. Using JSONCPP adds
yet another (large) third party library dependency (that
we only build into our own tools right now).
This went through several revisions. I originally planned
it as a stateful SkString wrapper, so the user could just
build their JSON as a string. That's O(N^2), though,
because SkString grows by a (small) constant amount. Even
using a better growth strategy still means needing RAM
for all the resulting text, which is usually pointless.
This version has a constant memory cost, so writing huge
amounts of JSON to disk (tracing a long DM run can emit
100's of MBs) doesn't stress resources.
Bug: skia:
Change-Id: Ia716524b246db0f97d332da60d2ce9903069e748
Reviewed-on: https://skia-review.googlesource.com/31204
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
rm gm that appears to have been there solely for pdf, but we don't use
it for that now.
Bug: skia:
Change-Id: I3cf88db923c2445b7c95dda14da679a594117643
Reviewed-on: https://skia-review.googlesource.com/31760
Reviewed-by: Derek Sollenberger <djsollen@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Perf showed that DAA is slow with MSVC. Disable it until I find
out why.
Bug: skia:
Change-Id: If30c24e97fa42e3a7ce143a1b1d06e4a3f278d13
TBR: mtklein@google.com
Reviewed-on: https://skia-review.googlesource.com/30584
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
- Bring back some previously deleted macros and helper types.
- Automatically inject base_type information into snapshot events,
to allow simpler tracking of polymorphic object types.
- Fix JSON formatting of pointer values (they were serializing as bool).
Bug: skia:
Change-Id: Iac7803f72ce5396ffd2fbcb5a36d76745c5e3f3e
Reviewed-on: https://skia-review.googlesource.com/28220
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Brian Osman <brianosman@google.com>
Bug: skia:6916
Change-Id: I16badf80c3b34e517b8baab161150c9434f325aa
Reviewed-on: https://skia-review.googlesource.com/30100
Commit-Queue: Ravi Mistry <rmistry@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
1) Run python bin/fetch-clang-win
2) Set clang_win = "../bin/clang_win"
3) ???
4) Profit
Most changes here are to pass the right -mfoo flags to Clang
to enable advanced instruction sets, or fixed warning-as-errors.
BUG=skia:2679
Change-Id: Ieed145d35c209131c7c16fdd3ee11a3de4a1a921
Reviewed-on: https://skia-review.googlesource.com/28740
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
try removing self references in method definitions.
If this creates awkward wording, it can always be allowed
in another CL. Also tighten rules for identifying function
references in include comments.
R=briansoman@google.com, caryclark@google.comTBR=reed@google.com
Bug: skia:6898
Change-Id: I1a0e6b2a76dacfe71d134deb4589fb74e6611a03
Reviewed-on: https://skia-review.googlesource.com/28624
Commit-Queue: Cary Clark <caryclark@skia.org>
Reviewed-by: Cary Clark <caryclark@skia.org>
Fix 'arcs' at sentence start to Arcs.
This fix corrected other capitalizations as well,
and exposed some mis-capitalizations in the bmh
doc.
R=brianosman@google.comTBR=reed@google.com
Bug: skia:
Change-Id: I4d51388556f7e8ff868a9236ce76745915560327
Reviewed-on: https://skia-review.googlesource.com/28241
Commit-Queue: Cary Clark <caryclark@skia.org>
Reviewed-by: Cary Clark <caryclark@skia.org>
The earlier CL doesn't change the flag definition so it's not
turned on yet.
Bug: skia:
Change-Id: Id278ae5fc27d703ab7f6628bed95093d32cd7d0b
TBR: caryclark@google.com, fmalita@chromium.org
Reviewed-on: https://skia-review.googlesource.com/28161
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Yuqian Li <liyuqian@google.com>
change underscore to space if needed
remove bmh_ prefix
TBR=caryclark@google.com
Bug: skia:
Change-Id: I9d4d29c7ff91d9d29bf8740d163724f371e5e211
Reviewed-on: https://skia-review.googlesource.com/28044
Reviewed-by: Cary Clark <caryclark@skia.org>
Commit-Queue: Cary Clark <caryclark@skia.org>
bookmaker is a tool that generates documentation
backends from a canonical markup. Documentation for
bookmaker itself is evolving at docs/usingBookmaker.bmh,
which is visible online at skia.org/user/api/bmh_usingBookmaker
Change-Id: Ic76ddf29134895b5c2ebfbc84603e40ff08caf09
Reviewed-on: https://skia-review.googlesource.com/28000
Commit-Queue: Cary Clark <caryclark@google.com>
Reviewed-by: Cary Clark <caryclark@google.com>
Change-Id: I71cf04b12be95a54b7fb47d048ba1f8672ed9a8f
Reviewed-on: https://skia-review.googlesource.com/27760
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This makes Engines (task execution strategies: serial, thread, fork)
pluggable just like most of the rest of ok. It removes the thread and
process limits, as I find myself rarely caring about what they are
exactly. Instead of limiting to num-cores, we just allow any number of
concurrent threads, and any number of concurrent child processes subject
to OS limitations.
Change-Id: Icef49d86818fe9a4b7380efb60e73e40bc2e6b73
Reviewed-on: https://skia-review.googlesource.com/27140
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:
Change-Id: I1449f0f4d7d9ab6225d98c601eafa7461a2a7dde
Reviewed-on: https://skia-review.googlesource.com/27120
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
This reverts commit 9ac801c12c.
Reason for revert: breaks threaded mode.
Original change's description:
> ok, exit() child processes instead of _exit()
>
> We want the atexit() hook to print deferred logs (crash dumps, etc.) only when the main, parent process exits. Ending the child processes with _exit() instead of exit() avoids calling atexit() hooks, but also global destructors. This is more complex than we need: we can just print before main() exits. Only the main, parent process gets there.
>
> This now runs each child process' global destructors, which makes trace.json "work": we get a trace of whichever child process finishes last. It's a start.
>
> Change-Id: I0cc2b12592f0f79e73a43a160b9fd06dba1fee25
> Reviewed-on: https://skia-review.googlesource.com/26800
> Commit-Queue: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Brian Osman <brianosman@google.com>
TBR=mtklein@chromium.org,brianosman@google.com
Change-Id: I695dddaab3b5a51e4698bb7222fc723b544a1d64
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/26943
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
If the value or option is empty (as was for kInstancedRenderingStateName),
our Android Java application will throw exception and skip the remaining
state objects. That would result in missing "Softkeys" and "FPS" for Raster
backend in Android Viewer app.
Bug: skia:
Change-Id: I6f600bbb94509ca5389eac2d681304a00427ecdb
Reviewed-on: https://skia-review.googlesource.com/26527
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
Ensures that all Skia events are disabled by default in Chrome, and
eliminates redundant typing.
Bug: skia:
Change-Id: I289c5e5a01084fcf4cccf512da65a4727f4aeca2
Reviewed-on: https://skia-review.googlesource.com/26880
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
We want the atexit() hook to print deferred logs (crash dumps, etc.) only when the main, parent process exits. Ending the child processes with _exit() instead of exit() avoids calling atexit() hooks, but also global destructors. This is more complex than we need: we can just print before main() exits. Only the main, parent process gets there.
This now runs each child process' global destructors, which makes trace.json "work": we get a trace of whichever child process finishes last. It's a start.
Change-Id: I0cc2b12592f0f79e73a43a160b9fd06dba1fee25
Reviewed-on: https://skia-review.googlesource.com/26800
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Osman <brianosman@google.com>
DAA is:
1. Much simpler than AAA.
SkScan_AAAPath.cpp is about 1700 lines.
SkScan_DAAPath.cpp is about 300 lines.
The whole DAA CL is only about 800 lines.
2. Much faster than AAA for complicated paths.
The speedup applies to GL backend (including ccpr)!
Here's the frame time of 'SampleApp --slide Chart' on macbook pro:
AAA-raster: 33ms
DAA-raster: 21ms
AAA-gl: 30ms
DAA-gl: 20ms
AAA-ccpr: 18ms
DAA-ccpr: 12ms
My linux desktop doesn't have SSE3 so the speedup is smaller
(~25% for Chart). I believe that DAA is so fast that I can enable
it for any paths (AAA is not enabled by default for complicated
paths because it is slow; hence our older supersampling scan
converter is used for stroking on Chart for AAA-xxx config.)
3. The SkCoverageDelta is suitable for threaded backend with
out-of-order concurrent scan conversion as commented in the source
code. Maybe we can also just send deltas to GPU.
4. Similar to most analytic path renderers, the quality is on the best
ground-truth level, unless there are intersections within a pixel.
The intersections look good to my eyes although theoretically that
could be arbitrary far from the ground truth (see my AAA slides).
5. For simple paths, such as circle, triangle, rrect, etc., DAA is
slower than AAA. But DAA is faster than our older supersampling
scan converter in most cases. As those simple paths usually don't
constitute the bottleneck of a picture (skp or svg), I strongly
recommend use DAA.
6. DAA also heavily favors blitMask so it may work quite well with
SkRasterPipeline and SkRasterPipelineBlitter.
Finally, please check https://skia-review.googlesource.com/c/22420/
which accelerate DAA by specializing blitCoverageDeltas for
SkARGB32_Blitter and SkARGB32_Black_Blitter. It brings a little(<5%)
speedup. But I couldn't figure out how to reduce the duplicate code
so I don't intend to land it.
Bug: skia:
Change-Id: I3b7ed6a727447922e645b1acb737a506e7c09a4c
Reviewed-on: https://skia-review.googlesource.com/19666
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
Will need guards for android (at least)
Bug: skia:
Change-Id: I2bb8e656997984489ef1f2e41cd3d301c4e7b947
Reviewed-on: https://skia-review.googlesource.com/26040
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
Change-Id: I562d438bd65e9fd900cfc6831f971b4af25c8ae6
Reviewed-on: https://skia-review.googlesource.com/26361
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This doesn't do anything in the default process-per-task mode, because
those child tasks exit using _exit(), which doesn't trigger the event
tracer destructor to flush.
I don't remember exactly why I exit with _exit(), so I'm going to have
to follow up on that. But written this way as I think I'm at least
initializing the tracing in the right place for each process for the
future.
In threaded (-j -1) and serial (-j 0) modes, everything seems to work
great.
I'm also thinking I might add a tracer like the SkDebugf tracer but
using ok_log(), which handles interlaced logging from concurrent tasks
better than vanilla SkDebugf.
Example:
ninja -C out ok; and out/ok gm 8888 filter:search=fontmgr_bounds trace -j -1
Change-Id: Ia3cdad930ce65e6fd12fa74f3fb00894e35138d3
Reviewed-on: https://skia-review.googlesource.com/26350
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Change-Id: I6feb4924ab67e1da08f923315ef6d3a580881b23
Reviewed-on: https://skia-review.googlesource.com/26340
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:
Change-Id: I401c5a9885c348aa424ab07b094acecddb209490
Reviewed-on: https://skia-review.googlesource.com/25860
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
TraceID was unused, remove it.
Simplify casting logic by using the same union helper as the macros.
This fixes a bug that was present in the bool handling - we were
treating the union value as a pointer, so we were dereferencing
random stack memory. Luckily it never crashes, we did get the wrong
values for bools.
Bug: skia:
Change-Id: I15d44756214f34c1f6479980d9a487ac7f3d8f6c
Reviewed-on: https://skia-review.googlesource.com/25801
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This also updates create_test_font so that it can be built, compiles,
and uses SkFontStyle instead of SkTypeface::Style.
BUG=b/63669723
Change-Id: I6eb0f851853f4721cf8e5052255b5b6750c3257f
Reviewed-on: https://skia-review.googlesource.com/24740
Reviewed-by: Cary Clark <caryclark@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Ben Wagner <bungeman@google.com>