This reverts commit 5e18cdea0a.
Reason for revert: ASAN
Original change's description:
> Direct evaluation of gaussian
>
> The SVG(CSS) standard allows the 3 pass algorithm for sigma >= 2. But
> sigma < 2, the code must evaluate to the convolution. The old code used
> an interpolation scheme between windowed filters. This code directly
> evaluates the gaussian kernel for sigma < 2.
>
> This code produces cleaner results, is 25% faster, and does not use a
> temporary memory buffer.
>
> Change-Id: Ibd0caa73cadd06b637f55ba7bd4fefcfe7ac73db
> Reviewed-on: https://skia-review.googlesource.com/62540
> Commit-Queue: Herb Derby <herb@google.com>
> Reviewed-by: Mike Klein <mtklein@google.com>
TBR=mtklein@google.com,herb@google.com
Change-Id: I936077dfa659d71bc361339d98340c55545a1eb8
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/72481
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
The SVG(CSS) standard allows the 3 pass algorithm for sigma >= 2. But
sigma < 2, the code must evaluate to the convolution. The old code used
an interpolation scheme between windowed filters. This code directly
evaluates the gaussian kernel for sigma < 2.
This code produces cleaner results, is 25% faster, and does not use a
temporary memory buffer.
Change-Id: Ibd0caa73cadd06b637f55ba7bd4fefcfe7ac73db
Reviewed-on: https://skia-review.googlesource.com/62540
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I871dd5eea4496e87c206b46d9eae81cb521b11ce
Reviewed-on: https://skia-review.googlesource.com/65103
Reviewed-by: Hal Canary <halcanary@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
- reduce code size by using a draw instead of custom blits
Bug: skia:
Change-Id: I90f9fb2abf40496e771f1f725556c178d730b590
Reviewed-on: https://skia-review.googlesource.com/62860
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
Change-Id: Ifd87e752fe1712ab4adef6c5f5de8798ab6c4991
Reviewed-on: https://skia-review.googlesource.com/50680
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Enables for volatile paths and when path mask caching is disabled.
Bug: skia:
Change-Id: I644b17f2a4f77a4ddf85265f520599499c0800cf
Reviewed-on: https://skia-review.googlesource.com/60481
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Adds the flag and a disables caching on the CCPR bots.
Bug: skia:
Change-Id: Icb85e77f89634dda1d419dacac5b8a93340723f0
Reviewed-on: https://skia-review.googlesource.com/59740
Reviewed-by: Eric Boren <borenet@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Allows benchmarks to override GrContextOptions.
Removes the ability to use the same GrContext for all benchmarks in a config.
Change-Id: I5ab9f6e81055451ac912a66537843d1a49f3b479
Reviewed-on: https://skia-review.googlesource.com/34080
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
On ANGLE, at least, this frame gives us much more consistent results.
Bug: skia:
Change-Id: Ifdecc8451ef51490c08057645214738180b1a366
Reviewed-on: https://skia-review.googlesource.com/57884
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Also adds a presubmit to prevent adding trailing whitespace to source
code in the future.
Change-Id: I41a4df81487f6f00aa19b188f0cac6a3377efde6
Reviewed-on: https://skia-review.googlesource.com/57380
Reviewed-by: Ravi Mistry <rmistry@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
Move the check for kFailedLoops above code that times the benchmark.
This matches the comment ("Can't be timed") and prevents an infinite
loop.
Bug: skia:6774
Change-Id: Iacdc1ca1d11afcf05afac60e4eb0d8d9a12f800e
Reviewed-on: https://skia-review.googlesource.com/53803
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
This reverts commit 88757dacd4.
Reason for revert: Still seems to be failing Chromium "telemetry_perf_unittests (with patch) on Android" on android_n5x_swarming_rel.
Original change's description:
> guard old apis for querying byte-size of a bitmap/imageinfo/pixmap
>
> Now with legacy behavior for allocpixels
>
> This was reverted, so the current CL is a "fix" on top of ...
> https://skia-review.googlesource.com/c/skia/+/50980
>
> Related update to Chrome (in preparation for this change)
> https://chromium-review.googlesource.com/c/chromium/src/+/685719
>
> Bug: skia:
> Change-Id: I4b370ee7e95083ab27421f008132219c9c7b86e9
> Reviewed-on: https://skia-review.googlesource.com/51341
> Reviewed-by: Florin Malita <fmalita@chromium.org>
> Commit-Queue: Mike Reed <reed@google.com>
TBR=fmalita@chromium.org,reed@google.com
Change-Id: I827a0ca1d1e3909e648fde3342cdb8601d34da8d
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/52381
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Jim Van Verth <jvanverth@google.com>
This reverts commit 98a6216b18.
Reason for revert: breaking the chrome roll. Looks like they may be writing data to create an image across all the row bytes and thus writing to unalloced data on the last row. Link to example failing bot:
https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/539960
Original change's description:
> guard old apis for querying byte-size of a bitmap/imageinfo/pixmap
>
> Previously we had size_t and uint64_t variations.
>
> The new (simpler) API always..
> - returns size_t, or 0 if the calculation overflowed
> - returns the trimmed size (does not include rowBytes padding for the last row)
>
> Bug: skia:
> Change-Id: I05173e877918327c7b207d2f7f1ab0db36892e2e
> Reviewed-on: https://skia-review.googlesource.com/50980
> Commit-Queue: Mike Reed <reed@google.com>
> Reviewed-by: Florin Malita <fmalita@chromium.org>
> Reviewed-by: Leon Scroggins <scroggo@google.com>
TBR=mtklein@google.com,herb@google.com,scroggo@google.com,fmalita@chromium.org,reed@google.com
Change-Id: I726f6ab1b36b14979ba6f37105e0a469b3f0dbc0
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/51262
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
Previously we had size_t and uint64_t variations.
The new (simpler) API always..
- returns size_t, or 0 if the calculation overflowed
- returns the trimmed size (does not include rowBytes padding for the last row)
Bug: skia:
Change-Id: I05173e877918327c7b207d2f7f1ab0db36892e2e
Reviewed-on: https://skia-review.googlesource.com/50980
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Otherwise, the first few benches' measurements will be inaccurate.
For example, without this CL, the first few measurements are:
337ns, 566µs, 1000µs, ... without "--ms 1000" arg
211ns, 285µs, 874µs, ... with "--ms 1000" arg
With this CL, the first few measurements are:
195ns, 296µs, 1.03ms, ... without "--ms 1000" arg
204ns, 280µs, 859µs, ... with "--ms 1000" arg
In the example above, the first two measurements are vastly (>50%)
different without this CL. I think that's the reason why I keep
using "--ms 1000" arg locally. But it's really only necessary for
the first bench to warm up nanobench. It's a waste to apply
"--ms 1000" to all the following benches.
Bug: skia:
Change-Id: I1924ba3ff9185ed89aeda72794fafd1fe6625eef
Reviewed-on: https://skia-review.googlesource.com/49742
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
This reverts commit 05d5a13fea.
Reason for revert: looks like it broke filterfastbounds
Original change's description:
> Revert "Revert "Switched highp float to highfloat and mediump float to half.""
>
> This reverts commit 1d816b92bb.
>
> Bug: skia:
> Change-Id: I388b5e5e9bf619db48297a80c9a80c039f26c9f1
> Reviewed-on: https://skia-review.googlesource.com/46464
> Reviewed-by: Brian Salomon <bsalomon@google.com>
> Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
TBR=bsalomon@google.com,ethannicholas@google.com
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: skia:
Change-Id: Iddf6aef2ab084aa73da7ceebdfc303a1d2b80cde
Reviewed-on: https://skia-review.googlesource.com/47441
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
created new file src/core/SkColorData.h for
internal consumption. Note that many of the
functions there are unused as well.
Bug: skia: 6898
R: reed@google.com
Change-Id: I25bfd5a9c21f53558c4ca65a77eb5d322d897c6d
Reviewed-on: https://skia-review.googlesource.com/46848
Commit-Queue: Cary Clark <caryclark@google.com>
Reviewed-by: Mike Reed <reed@google.com>
That breaks the assumption that the work is proportional to loops.
For example, loops = 5 and loops = 7 would result in the same count
if count = loops / 4.
Bug: skia:
Change-Id: Idae86d658cbfba8a7f49b983ed61a8b7fbea007a
Reviewed-on: https://skia-review.googlesource.com/46600
Commit-Queue: Yuqian Li <liyuqian@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Bug: skia:
Change-Id: Ic4073aa17a04e8b400cc4a9db9a0669ee4a5c894
Reviewed-on: https://skia-review.googlesource.com/44203
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
These are transient, so need to be copied.
Bug: skia:
Change-Id: Id24db0b96f343ecd034dd015da6e19ea61579b56
Reviewed-on: https://skia-review.googlesource.com/41741
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Also add new paragraph about using systrace correctly
Docs-Preview: https://skia.org/?cl=41502
Change-Id: I114c14cc2e87a8b72aec46d8c354d3ea877a41ab
Reviewed-on: https://skia-review.googlesource.com/41502
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Re-land of: https://skia-review.googlesource.com/36560
All information needed by the thread is captured by the prepare
callback object, the lambda captures a pointer to that, and does the
mask render. Once it's done, it signals the semaphore (also owned by the
callback). The callback defers the semaphore wait even longer (into the
ASAP upload), so the odds of waiting for the thread are REALLY low.
Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.
Traces of a GM that includes GPU and SW path rendering (path-reverse):
Original:
https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
https://screenshot.googleplex.com/GHjD0U3C34q.png
Bug: skia:
Change-Id: Idb92f385590749f41328a9aec65b2a93f4775079
Reviewed-on: https://skia-review.googlesource.com/40775
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Motivated by wanting to speed-up A8 blits in general (and at the moment, aarect blits). More to come in these areas.
Bug: skia:
Change-Id: I45e8ef951b8e89a825af72b1918049be10920137
Reviewed-on: https://skia-review.googlesource.com/39401
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This reverts commit 76323bc061.
Reason for revert: Breaking NUC bots in threaded gm comparison:
https://chromium-swarm.appspot.com/task?id=382e589753187f10&refresh=10
Original change's description:
> Threaded generation of software paths
>
> All information needed by the thread is captured by the prepare
> callback object, the lambda captures a pointer to that, and does the
> mask render. Once it's done, it signals the semaphore (also owned by the
> callback). The callback defers the semaphore wait even longer (into the
> ASAP upload), so the odds of waiting for the thread are REALLY low.
>
> Also did a bunch of cleanup along the way, and put in some trace markers
> so we can monitor how well this is working.
>
> Traces of a GM that includes GPU and SW path rendering (path-reverse):
>
> Original:
> https://screenshot.googleplex.com/f5BG3901tQg.png
> Threaded, with wait in the callback (notice pre flush callback blocking):
> https://screenshot.googleplex.com/htOSZFE2s04.png
> Current version, with wait deferred to ASAP upload function:
> https://screenshot.googleplex.com/GHjD0U3C34q.png
>
> Bug: skia:
> Change-Id: I3d5a230bbd68eb35e1f0574b308485c691435790
> Reviewed-on: https://skia-review.googlesource.com/36560
> Commit-Queue: Brian Osman <brianosman@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
TBR=egdaniel@google.com,mtklein@google.com,bsalomon@google.com,robertphillips@google.com,brianosman@google.com
Change-Id: Icac0918a3771859f671b69ae07ae0fedd3ebb3db
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/38560
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
All information needed by the thread is captured by the prepare
callback object, the lambda captures a pointer to that, and does the
mask render. Once it's done, it signals the semaphore (also owned by the
callback). The callback defers the semaphore wait even longer (into the
ASAP upload), so the odds of waiting for the thread are REALLY low.
Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.
Traces of a GM that includes GPU and SW path rendering (path-reverse):
Original:
https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
https://screenshot.googleplex.com/GHjD0U3C34q.png
Bug: skia:
Change-Id: I3d5a230bbd68eb35e1f0574b308485c691435790
Reviewed-on: https://skia-review.googlesource.com/36560
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
The ultimate goal is to end up with "float" and "half", but this
intermediate step uses "highfloat" so that it is clear if I missed a
"float" somewhere. Once this lands, a subsequent CL will switch all
"highfloats" back to "floats".
Bug: skia:
Change-Id: Ia13225c7a0a0a2901e07665891c473d2500ddcca
Reviewed-on: https://skia-review.googlesource.com/31000
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
SkFAIL is a legacy macro which is just SK_ABORT. This CL mechanically
changes uses of SkFAIL to SK_ABORT in preparation for its removal. The
related sk_throw macro will be changed independently, due to needing to
actually clean up its users.
Change-Id: Id70b5c111a02d2458dc60c8933f444df27d9cebb
Reviewed-on: https://skia-review.googlesource.com/35284
Reviewed-by: Derek Sollenberger <djsollen@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
On desktop, this saves just over 5% of the time in the SkSL compiler.
As written, the code will now build either way, so it's much easier to
switch back (or even have some platforms use SkString, if that's ever
required).
Bug: skia:
Change-Id: I634f26a4f6fcb404e59bda6a5c6a21a9c6d73c0b
Reviewed-on: https://skia-review.googlesource.com/34381
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Adjust the configs specified by recipes to avoid the new error.
Change-Id: I23e31355e2faaab919d92abdb37a6f70cd2da1ff
Reviewed-on: https://skia-review.googlesource.com/32862
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Ben Wagner <benjaminwagner@google.com>
Until now we've been using 3 separate parametric stages to apply
gamma to r,g,b. That works fine, but is kind of unnecessarily
slow, and again less clear in a stack trace than seeing "gamma".
The new bench runs in about 60% of the time the old one does
on my Trashcan.
BUG=skia:6939
Change-Id: I079698d3009b081f1c23a2e27fc26e373b439610
Reviewed-on: https://skia-review.googlesource.com/32721
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:
Change-Id: I487930955f75048ea27a1bcc61f7e0849c63759b
Reviewed-on: https://skia-review.googlesource.com/32681
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
rects are already auto-vectorized, so no need to explicitly write a 4f version of SkRect::round()
Bug: skia:
Change-Id: I098945767bfcaa7093d770c376bd17ff3bdc9983
Reviewed-on: https://skia-review.googlesource.com/32060
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
My next step is to change the uniform_color context to
struct {
float r,g,b,a;
uint32_t rgba;
};
so that it's trivial to load in both float and 8-bit pipelines.
Change-Id: If9bdde353ced3bf9eb0c63204b4770ed614ad16b
Reviewed-on: https://skia-review.googlesource.com/30481
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Bug: skia:6880
Change-Id: Ia8b94e52eec3feb5104d2351bf7a7e6f99101deb
Reviewed-on: https://skia-review.googlesource.com/26370
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Change-Id: I71cf04b12be95a54b7fb47d048ba1f8672ed9a8f
Reviewed-on: https://skia-review.googlesource.com/27760
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
DAA is:
1. Much simpler than AAA.
SkScan_AAAPath.cpp is about 1700 lines.
SkScan_DAAPath.cpp is about 300 lines.
The whole DAA CL is only about 800 lines.
2. Much faster than AAA for complicated paths.
The speedup applies to GL backend (including ccpr)!
Here's the frame time of 'SampleApp --slide Chart' on macbook pro:
AAA-raster: 33ms
DAA-raster: 21ms
AAA-gl: 30ms
DAA-gl: 20ms
AAA-ccpr: 18ms
DAA-ccpr: 12ms
My linux desktop doesn't have SSE3 so the speedup is smaller
(~25% for Chart). I believe that DAA is so fast that I can enable
it for any paths (AAA is not enabled by default for complicated
paths because it is slow; hence our older supersampling scan
converter is used for stroking on Chart for AAA-xxx config.)
3. The SkCoverageDelta is suitable for threaded backend with
out-of-order concurrent scan conversion as commented in the source
code. Maybe we can also just send deltas to GPU.
4. Similar to most analytic path renderers, the quality is on the best
ground-truth level, unless there are intersections within a pixel.
The intersections look good to my eyes although theoretically that
could be arbitrary far from the ground truth (see my AAA slides).
5. For simple paths, such as circle, triangle, rrect, etc., DAA is
slower than AAA. But DAA is faster than our older supersampling
scan converter in most cases. As those simple paths usually don't
constitute the bottleneck of a picture (skp or svg), I strongly
recommend use DAA.
6. DAA also heavily favors blitMask so it may work quite well with
SkRasterPipeline and SkRasterPipelineBlitter.
Finally, please check https://skia-review.googlesource.com/c/22420/
which accelerate DAA by specializing blitCoverageDeltas for
SkARGB32_Blitter and SkARGB32_Black_Blitter. It brings a little(<5%)
speedup. But I couldn't figure out how to reduce the duplicate code
so I don't intend to land it.
Bug: skia:
Change-Id: I3b7ed6a727447922e645b1acb737a506e7c09a4c
Reviewed-on: https://skia-review.googlesource.com/19666
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>