That breaks the assumption that the work is proportional to loops.
For example, loops = 5 and loops = 7 would result in the same count
if count = loops / 4.
Bug: skia:
Change-Id: Idae86d658cbfba8a7f49b983ed61a8b7fbea007a
Reviewed-on: https://skia-review.googlesource.com/46600
Commit-Queue: Yuqian Li <liyuqian@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Bug: skia:
Change-Id: Ic4073aa17a04e8b400cc4a9db9a0669ee4a5c894
Reviewed-on: https://skia-review.googlesource.com/44203
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
These are transient, so need to be copied.
Bug: skia:
Change-Id: Id24db0b96f343ecd034dd015da6e19ea61579b56
Reviewed-on: https://skia-review.googlesource.com/41741
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Also add new paragraph about using systrace correctly
Docs-Preview: https://skia.org/?cl=41502
Change-Id: I114c14cc2e87a8b72aec46d8c354d3ea877a41ab
Reviewed-on: https://skia-review.googlesource.com/41502
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Re-land of: https://skia-review.googlesource.com/36560
All information needed by the thread is captured by the prepare
callback object, the lambda captures a pointer to that, and does the
mask render. Once it's done, it signals the semaphore (also owned by the
callback). The callback defers the semaphore wait even longer (into the
ASAP upload), so the odds of waiting for the thread are REALLY low.
Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.
Traces of a GM that includes GPU and SW path rendering (path-reverse):
Original:
https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
https://screenshot.googleplex.com/GHjD0U3C34q.png
Bug: skia:
Change-Id: Idb92f385590749f41328a9aec65b2a93f4775079
Reviewed-on: https://skia-review.googlesource.com/40775
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Motivated by wanting to speed-up A8 blits in general (and at the moment, aarect blits). More to come in these areas.
Bug: skia:
Change-Id: I45e8ef951b8e89a825af72b1918049be10920137
Reviewed-on: https://skia-review.googlesource.com/39401
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This reverts commit 76323bc061.
Reason for revert: Breaking NUC bots in threaded gm comparison:
https://chromium-swarm.appspot.com/task?id=382e589753187f10&refresh=10
Original change's description:
> Threaded generation of software paths
>
> All information needed by the thread is captured by the prepare
> callback object, the lambda captures a pointer to that, and does the
> mask render. Once it's done, it signals the semaphore (also owned by the
> callback). The callback defers the semaphore wait even longer (into the
> ASAP upload), so the odds of waiting for the thread are REALLY low.
>
> Also did a bunch of cleanup along the way, and put in some trace markers
> so we can monitor how well this is working.
>
> Traces of a GM that includes GPU and SW path rendering (path-reverse):
>
> Original:
> https://screenshot.googleplex.com/f5BG3901tQg.png
> Threaded, with wait in the callback (notice pre flush callback blocking):
> https://screenshot.googleplex.com/htOSZFE2s04.png
> Current version, with wait deferred to ASAP upload function:
> https://screenshot.googleplex.com/GHjD0U3C34q.png
>
> Bug: skia:
> Change-Id: I3d5a230bbd68eb35e1f0574b308485c691435790
> Reviewed-on: https://skia-review.googlesource.com/36560
> Commit-Queue: Brian Osman <brianosman@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
TBR=egdaniel@google.com,mtklein@google.com,bsalomon@google.com,robertphillips@google.com,brianosman@google.com
Change-Id: Icac0918a3771859f671b69ae07ae0fedd3ebb3db
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/38560
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
All information needed by the thread is captured by the prepare
callback object, the lambda captures a pointer to that, and does the
mask render. Once it's done, it signals the semaphore (also owned by the
callback). The callback defers the semaphore wait even longer (into the
ASAP upload), so the odds of waiting for the thread are REALLY low.
Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.
Traces of a GM that includes GPU and SW path rendering (path-reverse):
Original:
https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
https://screenshot.googleplex.com/GHjD0U3C34q.png
Bug: skia:
Change-Id: I3d5a230bbd68eb35e1f0574b308485c691435790
Reviewed-on: https://skia-review.googlesource.com/36560
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
The ultimate goal is to end up with "float" and "half", but this
intermediate step uses "highfloat" so that it is clear if I missed a
"float" somewhere. Once this lands, a subsequent CL will switch all
"highfloats" back to "floats".
Bug: skia:
Change-Id: Ia13225c7a0a0a2901e07665891c473d2500ddcca
Reviewed-on: https://skia-review.googlesource.com/31000
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
SkFAIL is a legacy macro which is just SK_ABORT. This CL mechanically
changes uses of SkFAIL to SK_ABORT in preparation for its removal. The
related sk_throw macro will be changed independently, due to needing to
actually clean up its users.
Change-Id: Id70b5c111a02d2458dc60c8933f444df27d9cebb
Reviewed-on: https://skia-review.googlesource.com/35284
Reviewed-by: Derek Sollenberger <djsollen@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
On desktop, this saves just over 5% of the time in the SkSL compiler.
As written, the code will now build either way, so it's much easier to
switch back (or even have some platforms use SkString, if that's ever
required).
Bug: skia:
Change-Id: I634f26a4f6fcb404e59bda6a5c6a21a9c6d73c0b
Reviewed-on: https://skia-review.googlesource.com/34381
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Adjust the configs specified by recipes to avoid the new error.
Change-Id: I23e31355e2faaab919d92abdb37a6f70cd2da1ff
Reviewed-on: https://skia-review.googlesource.com/32862
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Ben Wagner <benjaminwagner@google.com>
Until now we've been using 3 separate parametric stages to apply
gamma to r,g,b. That works fine, but is kind of unnecessarily
slow, and again less clear in a stack trace than seeing "gamma".
The new bench runs in about 60% of the time the old one does
on my Trashcan.
BUG=skia:6939
Change-Id: I079698d3009b081f1c23a2e27fc26e373b439610
Reviewed-on: https://skia-review.googlesource.com/32721
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:
Change-Id: I487930955f75048ea27a1bcc61f7e0849c63759b
Reviewed-on: https://skia-review.googlesource.com/32681
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
rects are already auto-vectorized, so no need to explicitly write a 4f version of SkRect::round()
Bug: skia:
Change-Id: I098945767bfcaa7093d770c376bd17ff3bdc9983
Reviewed-on: https://skia-review.googlesource.com/32060
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
My next step is to change the uniform_color context to
struct {
float r,g,b,a;
uint32_t rgba;
};
so that it's trivial to load in both float and 8-bit pipelines.
Change-Id: If9bdde353ced3bf9eb0c63204b4770ed614ad16b
Reviewed-on: https://skia-review.googlesource.com/30481
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:6880
Change-Id: Ia8b94e52eec3feb5104d2351bf7a7e6f99101deb
Reviewed-on: https://skia-review.googlesource.com/26370
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Change-Id: I71cf04b12be95a54b7fb47d048ba1f8672ed9a8f
Reviewed-on: https://skia-review.googlesource.com/27760
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
DAA is:
1. Much simpler than AAA.
SkScan_AAAPath.cpp is about 1700 lines.
SkScan_DAAPath.cpp is about 300 lines.
The whole DAA CL is only about 800 lines.
2. Much faster than AAA for complicated paths.
The speedup applies to GL backend (including ccpr)!
Here's the frame time of 'SampleApp --slide Chart' on macbook pro:
AAA-raster: 33ms
DAA-raster: 21ms
AAA-gl: 30ms
DAA-gl: 20ms
AAA-ccpr: 18ms
DAA-ccpr: 12ms
My linux desktop doesn't have SSE3 so the speedup is smaller
(~25% for Chart). I believe that DAA is so fast that I can enable
it for any paths (AAA is not enabled by default for complicated
paths because it is slow; hence our older supersampling scan
converter is used for stroking on Chart for AAA-xxx config.)
3. The SkCoverageDelta is suitable for threaded backend with
out-of-order concurrent scan conversion as commented in the source
code. Maybe we can also just send deltas to GPU.
4. Similar to most analytic path renderers, the quality is on the best
ground-truth level, unless there are intersections within a pixel.
The intersections look good to my eyes although theoretically that
could be arbitrary far from the ground truth (see my AAA slides).
5. For simple paths, such as circle, triangle, rrect, etc., DAA is
slower than AAA. But DAA is faster than our older supersampling
scan converter in most cases. As those simple paths usually don't
constitute the bottleneck of a picture (skp or svg), I strongly
recommend use DAA.
6. DAA also heavily favors blitMask so it may work quite well with
SkRasterPipeline and SkRasterPipelineBlitter.
Finally, please check https://skia-review.googlesource.com/c/22420/
which accelerate DAA by specializing blitCoverageDeltas for
SkARGB32_Blitter and SkARGB32_Black_Blitter. It brings a little(<5%)
speedup. But I couldn't figure out how to reduce the duplicate code
so I don't intend to land it.
Bug: skia:
Change-Id: I3b7ed6a727447922e645b1acb737a506e7c09a4c
Reviewed-on: https://skia-review.googlesource.com/19666
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
Will need guards for android (at least)
Bug: skia:
Change-Id: I2bb8e656997984489ef1f2e41cd3d301c4e7b947
Reviewed-on: https://skia-review.googlesource.com/26040
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
Bug: skia:
Change-Id: I401c5a9885c348aa424ab07b094acecddb209490
Reviewed-on: https://skia-review.googlesource.com/25860
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Not yet thread safe (so it forces threading off).
Builds JSON on the fly, so overhead is certainly bad.
Plan to fix all of that, but this at least "works".
There is now one tracing flag: 'trace'.
- 'debugf' installs the SkDebugf tracer.
- 'atrace' installs the Android ATrace tracer.
- Any other value is interpreted as a filename, and
produces a JSON file for chrome://tracing.
All three modes work in DM, nanobench, and Viewer.
Bug: skia:
Change-Id: I3fbc22382b99418a508c670be2770195c0a1c364
Reviewed-on: https://skia-review.googlesource.com/24781
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
[√] convert all stages to use SkJumper_MemoryCtx / be 2d-compatible
[√] convert compile to 2d also, remove 1d run/compile
[√] convert all call sites
[√] no diffs
Change-Id: I3b806eb8fe0c3ec043359616409f7cd1211a1e43
Reviewed-on: https://skia-review.googlesource.com/24263
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
This is mostly dead code.
In order to make it truly dead, we need to opt drawing unpremul images
into SkRasterPipelineBlitter. They had been handled by
SkLinearBitmapPipeline, but can't be draw by SkBitmapProcLegacyShader.
Drawing unpremul images is tested by the GM all_variants_8888, which
gave us trouble last time around (serialize-8888 drew right, 8888 wrong)
but now draws fine. I think this was probably also the root of the
revert, drawing some unpremul image in Chrome's tests somewhere.
Change-Id: I453f9df44ade807316935921cbae82961e2f08aa
Reviewed-on: https://skia-review.googlesource.com/24862
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: If6f0d0a57463bf99a66d674e65a62ce3931d0116
Reviewed-on: https://skia-review.googlesource.com/24644
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This allows alpha blending and also alpha shaders with color blended in.
fixes GMs: composeshader_alpha, composeshader_bitmap
Change-Id: I3ab9cbef216f7733798d2e29541b4211c627dab2
Reviewed-on: https://skia-review.googlesource.com/24760
Commit-Queue: Hal Canary <halcanary@google.com>
Reviewed-by: Ben Wagner <bungeman@google.com>
Instead of query and maxSampleCount and using that to cap, we now have
each config store its supported values and when requested returns either
the next highest or equal supported value, or if non the max config supported.
Bug: skia:
Change-Id: I8802d44c13b3b1703ee54a7e69b82102d4b8dc2d
Reviewed-on: https://skia-review.googlesource.com/24302
Commit-Queue: Greg Daniel <egdaniel@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Also adds more trace events to GPU backend.
Change-Id: Ifa5f0cd4b1fd582f0cc30d37d9e6414dc498c75d
Reviewed-on: https://skia-review.googlesource.com/24622
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
This reverts commit 742a3e298f.
Reason for revert: Breaking Android roll:
frameworks/base/core/jni/android/graphics/BitmapFactory.cpp:453:18: error: no member named 'fColorPtr' in 'SkAndroidCodec::AndroidOptions'
codecOptions.fColorPtr = colorPtr;
~~~~~~~~~~~~ ^
frameworks/base/core/jni/android/graphics/BitmapFactory.cpp:454:18: error: no member named 'fColorCount' in 'SkAndroidCodec::AndroidOptions'
codecOptions.fColorCount = colorCount;
~~~~~~~~~~~~ ^
Original change's description:
> Remove support for decoding to kIndex_8
>
> Fix up callsites, and remove tests that no longer make sense.
>
> Bug: skia:6828
> Change-Id: I2548c4b7528b7b1be7412563156f27b52c9d4295
> Reviewed-on: https://skia-review.googlesource.com/21664
> Reviewed-by: Derek Sollenberger <djsollen@google.com>
> Commit-Queue: Leon Scroggins <scroggo@google.com>
TBR=djsollen@google.com,scroggo@google.com
Change-Id: I1bc669441f250690884e75a9a61427fdf75c6907
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:6828
Reviewed-on: https://skia-review.googlesource.com/22120
Reviewed-by: Leon Scroggins <scroggo@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
Fix up callsites, and remove tests that no longer make sense.
Bug: skia:6828
Change-Id: I2548c4b7528b7b1be7412563156f27b52c9d4295
Reviewed-on: https://skia-review.googlesource.com/21664
Reviewed-by: Derek Sollenberger <djsollen@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
This makes it trivial to copy/paste into spreadsheets for sorting/diffing
Bug: skia:
Change-Id: I02c920e2b8be8f59270da9fb9bb3e6763987e0bc
Reviewed-on: https://skia-review.googlesource.com/21378
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>