This highlights overheads and instruction set switch costs.
At time of writing on my HSW laptop,
N = 16: 76ns
N = 15: 291ns
BUG=skia:6289
Change-Id: I01751e8f5ea6cf946e7710822d9bc742712553e9
Reviewed-on: https://skia-review.googlesource.com/8984
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Adds a bitfield to GrContextOptions that masks out path renderers.
Adds commandline flags support to set this bitfield in tools apps.
Removes GrGLInterfaceRemoveNVPR since we can now accomplish the same
thing in the context options.
BUG=skia:
Change-Id: Icf2a4df36374b3ba2f69ebf0db56e8aedd6cf65f
Reviewed-on: https://skia-review.googlesource.com/8786
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Also changes the behavior of these flags to only override their
corresponding context options when set, and to leave them unchanged
when not set.
BUG=skia:
Change-Id: I09f6be09997594fa888d9045dd4901354ef3f880
Reviewed-on: https://skia-review.googlesource.com/8780
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
It's easier to work on SkJumper if everything funnels through run().
I don't anticipate huge benefit from compile() without JITing,
but it's something we can always put back if we find a need.
Change-Id: Id5256fd21495e8195cad1924dbad81856416d913
Reviewed-on: https://skia-review.googlesource.com/8468
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
BUG=skia:
Change-Id: I4f3c6370b3ef4247aa446716c7c154899925d089
Reviewed-on: https://skia-review.googlesource.com/8442
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Since the SkArenaAlloc handles calling the dtor, it is not longer needed
in the test.
Change-Id: I70a09be7bd0e71bf1e3d55ef08b5e87742e0bd18
Reviewed-on: https://skia-review.googlesource.com/8191
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
TBR=mtklein@google.com
Change-Id: Id7621548995b368164d74c817e288c34ef656bfb
Reviewed-on: https://skia-review.googlesource.com/8180
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Herb Derby <herb@google.com>
- Added default implementation of onMakeContext to support use in android.
Searches for uses:
"public SkShader" package:^chromium$ -file:^src/third_party/skia
package:^aosp.* "public SkShader" -file:external/skia -file:.*third_party/skia
package:^android$ "public SkShader" -file:external/skia -file:.*third_party/skia
... shows that no subclass overrides onCreateContext.
TBR=reed@google.comTBR=mtklein@google.com
Change-Id: I8bd5f57a79534574e344b165d31dccee41c31767
Reviewed-on: https://skia-review.googlesource.com/8140
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This reverts commit 2b57b7f7a7.
Reason for revert: Android compile failing
Original change's description:
> Use SkArenaAlloc instead of SkSmallAllocator in the SkAutoBlitterChoose code.
>
>
> TBR=reed@google.com
> Change-Id: Iefb044bf7657fbf982f23aa91a3f4d013ce2c626
> Reviewed-on: https://skia-review.googlesource.com/7786
> Reviewed-by: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Herb Derby <herb@google.com>
> Commit-Queue: Herb Derby <herb@google.com>
>
TBR=mtklein@chromium.org,mtklein@google.com,herb@google.com,reed@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: Id09c35377dddae0811d998b7d0c34c422325a5bc
Reviewed-on: https://skia-review.googlesource.com/8129
Commit-Queue: Robert Phillips <robertphillips@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
BUG=skia:
Change-Id: I01c5e1874c9a034febc64e25b3aaafb5050393a6
Reviewed-on: https://skia-review.googlesource.com/8021
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
TBR=reed@google.com
Change-Id: Iefb044bf7657fbf982f23aa91a3f4d013ce2c626
Reviewed-on: https://skia-review.googlesource.com/7786
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Herb Derby <herb@google.com>
The weird foo_mains are no longer needed when we build with GN.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Mac-Clang-arm-Debug-iOS
Change-Id: Iae50696741e0dc277d96dda4968a1ae41cb17c8a
Reviewed-on: https://skia-review.googlesource.com/8064
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Stephan Altmueller <stephana@google.com>
CQ_INCLUDE_TRYBOTS=skia.primary:Test-iOS-Clang-iPadMini4-GPU-GX6450-arm-Release
Change-Id: I7beadad742bc9444491c7a315a827297a636d70d
Reviewed-on: https://skia-review.googlesource.com/8049
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
not sure about api -- perhaps it could just return the bounds, and make them 0,0,0,0 if the path
is empty -- the caller can trivially know if the path is empty themselves.
BUG=skia:
Change-Id: I2dbb861e8d981b27c5a6833643977f5bd6802217
Reviewed-on: https://skia-review.googlesource.com/7989
Reviewed-by: Cary Clark <caryclark@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
Just like DM.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-iOS-Clang-iPadMini4-GPU-GX6450-Arm7-Debug,Build-Mac-Clang-arm64-Debug-GN_iOS
Change-Id: I4af3fa1813e3b7ee48407096e91373b5fee569c7
Reviewed-on: https://skia-review.googlesource.com/7824
Reviewed-by: Hal Canary <halcanary@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Generate targets for dm and nanobench from ninja and add them to the
generated Android.bp file.
Remove nanobenchAndroid and SkAndroidSDKCanvas. These rely on HWUI
internals and are currently unused.
Update gyp file references to removed files, just in case.
Change-Id: Ic6ae18a70bfd0c33804e7996d077f2081dfdfe07
Reviewed-on: https://skia-review.googlesource.com/7635
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Leon Scroggins <scroggo@google.com>
We've been seeding the initial values of our registers to x+0.5,y+0.5,
1,0, 0,0,0,0 (useful values for shaders to start with) in all pipelines.
This CL changes that to do so only when blitting, and only when we have
a shader.
The nicest part of this change is that SkRasterPipeline itself no longer
needs to have a concept of y, or what x means. It just marches x
through [x,x+n), and the blitter handles y and layers the meaning of
"dst x coordinate" onto x.
This ought to make SkSplicer a little easier to work with too.
dm --src gm --config f16 srgb 565 all draws the same.
Change-Id: I69d8c1cc14a06e5dfdd6a7493364f43a18f8dec5
Reviewed-on: https://skia-review.googlesource.com/7353
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
(actually fixes undefined result in getClipBounds)
future CLs
- update all callers to new apis
- move/rename virtuals
BUG=skia:
DOCS_PREVIEW= https://skia.org/?cl=7400
Change-Id: I45b93014e915c0d1c36d97d948c9ac8931f23258
Reviewed-on: https://skia-review.googlesource.com/7400
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Change-Id: Iec5fc759e331de24caea1347f9510917260d379b
Reviewed-on: https://skia-review.googlesource.com/7363
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
These may be better at -fsanitize=object-size.
No need to loop more than once in nanobench for these bots.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Ubuntu-Clang-x86_64-Release-ASAN,Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-ASAN,Perf-Ubuntu-Clang-Golo-GPU-GT610-x86_64-Release-ASAN,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-ASAN,Test-Ubuntu-Clang-Golo-GPU-GT610-x86_64-Release-ASAN
Change-Id: If89e94390d473434717cfe28de6be9055b68d8d4
Reviewed-on: https://skia-review.googlesource.com/7278
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
No fancy f16 or sRGB here... just good old legacy 8888.
Change-Id: I21eb7c0d8e2c7a7d92e9d8a8bae9d318c4daa7e5
Reviewed-on: https://skia-review.googlesource.com/7109
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
- Implementation.
- Use in SkLinearPipeline.
TBR=mtklein@google.com
Change-Id: Ie014184469b217132b0307b5a9ae40c0c60e5fc9
Reviewed-on: https://skia-review.googlesource.com/6921
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Herb Derby <herb@google.com>
The only difference is that we now also put the guard flag
SK_SUPPORT_LEGACY_AAA in SkUserConfig.h. Previously, SkAnalyticEdge.cpp doesn't
get that flag from SkScan.h and that caused many problems.
BUG=skia:
TBR=reed@google.com,caryclark@google.com
Change-Id: I134bb76cebd6fffa712f438076668765321bba3b
Reviewed-on: https://skia-review.googlesource.com/6992
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
Now that SkOpts_hsw.cpp no longer hooks in SkRasterPipeline_opts,
it should be safe to try this again.
This reverts commit 86d55b312a.
Change-Id: I2d495600ca9d3a0f49c2e02fbaaae349cefac3a1
Reviewed-on: https://skia-review.googlesource.com/6985
Reviewed-by: Mike Klein <mtklein@chromium.org>
This reverts commit b46fff60bc.
Reason for revert: possible chromium cc unit tests failure
Change-Id: Ie174c55e4d0fc3ae45854b5897ba26b7ad5a9c13
Reviewed-on: https://skia-review.googlesource.com/6981
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Yuqian Li <liyuqian@google.com>
The only difference is that we now put the guard flag SK_SUPPORT_LEGACY_AAA in
SkUserConfig.h instead of SkScan.h. Previously, SkAnalyticEdge.cpp doesn't get
that flag from SkScan.h and that caused many problems.
BUG=skia:
TBR=reed@google.com,caryclark@google.com
Change-Id: I7b89d3cb64ad71715101d2a5e8e77be3a8a6fa16
Reviewed-on: https://skia-review.googlesource.com/6972
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
This reverts commit 89a0e72287.
Reason for revert: <INSERT REASONING HERE>
Original change's description:
> Implement Analytic AA for General Paths (with Guard against Chrome)
>
> I've set up a SK_SUPPORT_LEGACY_AAA flag to guard against Chromium layout tests. I also set that flag in this CL so theoretically this CL won't trigger any GM changes. I'll use this to verify my guard, and remove that flag and actually enables concave AAA in a future CL.
>
> When enabled, for most simple concave paths (e.g., rectangle stroke, rrect stroke, sawtooth, stars...), the Analytic AA achieves 1.3x-2x speedup, and they look much prettier. And they probably are the majority in our use cases by number. But they probably are not the majority by time cost; a single complicated path may cost 10x-100x more time to render than a rectangle stroke... For those complicated paths, we fall back to supersampling by default as we're likely to be 1.1-1.2x slower and the quality improvement is not visually significant. However, one can use gSkForceAnalyticAA to disable that fallback.
>
> BUG=skia:
>
> Change-Id: If9549a3acc4a187cfaf7eb51890c148da3083d31
> Reviewed-on: https://skia-review.googlesource.com/6091
> Reviewed-by: Mike Reed <reed@google.com>
> Commit-Queue: Yuqian Li <liyuqian@google.com>
>
TBR=caryclark@google.com,liyuqian@google.com,reed@google.com
BUG=skia:
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I13c05aaa1bcb14956bd0fe01bb404e41be75af22
Reviewed-on: https://skia-review.googlesource.com/6961
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Yuqian Li <liyuqian@google.com>
I've set up a SK_SUPPORT_LEGACY_AAA flag to guard against Chromium layout tests. I also set that flag in this CL so theoretically this CL won't trigger any GM changes. I'll use this to verify my guard, and remove that flag and actually enables concave AAA in a future CL.
When enabled, for most simple concave paths (e.g., rectangle stroke, rrect stroke, sawtooth, stars...), the Analytic AA achieves 1.3x-2x speedup, and they look much prettier. And they probably are the majority in our use cases by number. But they probably are not the majority by time cost; a single complicated path may cost 10x-100x more time to render than a rectangle stroke... For those complicated paths, we fall back to supersampling by default as we're likely to be 1.1-1.2x slower and the quality improvement is not visually significant. However, one can use gSkForceAnalyticAA to disable that fallback.
BUG=skia:
Change-Id: If9549a3acc4a187cfaf7eb51890c148da3083d31
Reviewed-on: https://skia-review.googlesource.com/6091
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
This reverts commit 6ff51aedda.
Reason for revert: breaks win2k8 and PDFium
Change-Id: Ib1e2db8e523d5d321836ce00e3773def3db8be2f
Reviewed-on: https://skia-review.googlesource.com/6898
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Herb Derby <herb@google.com>
- Implementation.
- Use in SkLinearPipeline.
TBR=mtklein@google.com
Change-Id: Ia8efd09b2f3139a57182889ba84d1610eae92749
Reviewed-on: https://skia-review.googlesource.com/6352
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Herb Derby <herb@google.com>
BUG=skia:
Change-Id: Ie7d4fac3024b361a281f456fec2b3a837e2bfe43
Reviewed-on: https://skia-review.googlesource.com/6881
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
* SkAutoFree moved to SkTemplates.h (now implmented with unique_ptr).
* SkAutoMalloc and SkAutoSMalloc moved to SkAutoMalloc.h
* "SkAutoFree X(sk_malloc_throw(N));" --> "SkAutoMalloc X(N);"
Revert "Revert 'SkTypes.h : move SkAutoMalloc into SkAutoMalloc.h'"
This reverts commit c456b73fef.
Change-Id: Ie2c1a17c20134b8ceab85a68b3ae3e61c24fbaab
Reviewed-on: https://skia-review.googlesource.com/6886
Reviewed-by: Hal Canary <halcanary@google.com>
Commit-Queue: Hal Canary <halcanary@google.com>
* SkAutoFree moved to SkTemplates.h (now implmented with unique_ptr).
* SkAutoMalloc and SkAutoSMalloc moved to SkAutoMalloc.h
* "SkAutoFree X(sk_malloc_throw(N));" --> "SkAutoMalloc X(N);"
Change-Id: Idacd86ca09e22bf092422228599ae0d9bedded88
Reviewed-on: https://skia-review.googlesource.com/4543
Reviewed-by: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Hal Canary <halcanary@google.com>
This reverts commit 1e74cad9b4.
Reason for revert: the guard flag has now been confirmed to be landed on chromium/src. We should now be able to pass the layout tests.
Original change's description:
> Revert "Improve quad edges' smoothness in non-AA cases"
>
> This reverts commit d4ed326d6f.
>
> Reason for revert: reverting temporarily to get us rolling into Chrome again.
>
> Must be this CL right?
> https://storage.googleapis.com/chromium-layout-test-archives/linux_trusty_blink_rel/3364/layout-test-results/results.html
>
> Original change's description:
> > Improve quad edges' smoothness in non-AA cases
> >
> > Previously, non-AA quad edges only have an accuracy about 1/2 pixel
> > (while the AA quad edges have an accuracy about 1/8 pixel). Now, we
> > increase non-AA quad edges' accuracy to 1/8 pixel as well.
> >
> > The difference is very significant for rotating non-AA filled circles.
> > For example, run `./out/Debug/SampleApp --slide GM:fillcircle` with AA
> > turned off (by pressing b).
> >
> > The benchmark added reveals that increasing quad accuracy from 1/2 to
> > 1/8 doesn't affect the performance significantly. The following is the
> > 1/2-accuracy performance versus 1/8-accuracy performance:
> >
> > curr/maxrss loops min median mean max stddev samples
> > config bench
> > 7/17 MB 19 2.43µs 2.57µs 2.81µs 10.5µs 22% 16119
> > 8888 path_fill_big_nonaacircle
> > 7/17 MB 17 1.38µs 1.42µs 1.52µs 13µs 20% 21409
> > 8888 path_fill_small_nonaacircle
> >
> > curr/maxrss loops min median mean max stddev samples
> > config bench
> > 7/17 MB 71 2.52µs 2.59µs 2.79µs 7.67µs 19% 7557
> > 8888 path_fill_big_nonaacircle
> > 7/17 MB 64 1.45µs 1.49µs 1.51µs 2.39µs 5% 12704
> > 8888 path_fill_small_nonaacircle
> >
> >
> > BUG=skia:
> >
> > Change-Id: I3482098aeafcc6f2ec9aa3382977c0dc1b650964
> > Reviewed-on: https://skia-review.googlesource.com/6699
> > Reviewed-by: Mike Reed <reed@google.com>
> > Commit-Queue: Yuqian Li <liyuqian@google.com>
> >
>
> TBR=caryclark@google.com,liyuqian@google.com,reed@google.com,djsollen@google.com,bungeman@google.com,reviews@skia.org,fmalita@chromium.org
> BUG=skia:
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
>
> Change-Id: I5bc4596ab506f6f61ac2da91a07cf51d61114f31
> Reviewed-on: https://skia-review.googlesource.com/6829
> Commit-Queue: Mike Klein <mtklein@google.com>
> Reviewed-by: Mike Klein <mtklein@google.com>
>
TBR=djsollen@google.com,mtklein@google.com,bungeman@google.com,reviews@skia.org,caryclark@google.com,fmalita@chromium.org,liyuqian@google.com,reed@google.com
BUG=skia:
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I0ec2ea8dc0c6ad0ebdcb48878fb301c32443a09e
Reviewed-on: https://skia-review.googlesource.com/6858
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Yuqian Li <liyuqian@google.com>
SkSplicer is better.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I014ec0e9fb00a8a4694d442e672c65402621dc67
Reviewed-on: https://skia-review.googlesource.com/6830
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit d4ed326d6f.
Reason for revert: reverting temporarily to get us rolling into Chrome again.
Must be this CL right?
https://storage.googleapis.com/chromium-layout-test-archives/linux_trusty_blink_rel/3364/layout-test-results/results.html
Original change's description:
> Improve quad edges' smoothness in non-AA cases
>
> Previously, non-AA quad edges only have an accuracy about 1/2 pixel
> (while the AA quad edges have an accuracy about 1/8 pixel). Now, we
> increase non-AA quad edges' accuracy to 1/8 pixel as well.
>
> The difference is very significant for rotating non-AA filled circles.
> For example, run `./out/Debug/SampleApp --slide GM:fillcircle` with AA
> turned off (by pressing b).
>
> The benchmark added reveals that increasing quad accuracy from 1/2 to
> 1/8 doesn't affect the performance significantly. The following is the
> 1/2-accuracy performance versus 1/8-accuracy performance:
>
> curr/maxrss loops min median mean max stddev samples
> config bench
> 7/17 MB 19 2.43µs 2.57µs 2.81µs 10.5µs 22% 16119
> 8888 path_fill_big_nonaacircle
> 7/17 MB 17 1.38µs 1.42µs 1.52µs 13µs 20% 21409
> 8888 path_fill_small_nonaacircle
>
> curr/maxrss loops min median mean max stddev samples
> config bench
> 7/17 MB 71 2.52µs 2.59µs 2.79µs 7.67µs 19% 7557
> 8888 path_fill_big_nonaacircle
> 7/17 MB 64 1.45µs 1.49µs 1.51µs 2.39µs 5% 12704
> 8888 path_fill_small_nonaacircle
>
>
> BUG=skia:
>
> Change-Id: I3482098aeafcc6f2ec9aa3382977c0dc1b650964
> Reviewed-on: https://skia-review.googlesource.com/6699
> Reviewed-by: Mike Reed <reed@google.com>
> Commit-Queue: Yuqian Li <liyuqian@google.com>
>
TBR=caryclark@google.com,liyuqian@google.com,reed@google.com,djsollen@google.com,bungeman@google.com,reviews@skia.org,fmalita@chromium.org
BUG=skia:
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I5bc4596ab506f6f61ac2da91a07cf51d61114f31
Reviewed-on: https://skia-review.googlesource.com/6829
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Previously, non-AA quad edges only have an accuracy about 1/2 pixel
(while the AA quad edges have an accuracy about 1/8 pixel). Now, we
increase non-AA quad edges' accuracy to 1/8 pixel as well.
The difference is very significant for rotating non-AA filled circles.
For example, run `./out/Debug/SampleApp --slide GM:fillcircle` with AA
turned off (by pressing b).
The benchmark added reveals that increasing quad accuracy from 1/2 to
1/8 doesn't affect the performance significantly. The following is the
1/2-accuracy performance versus 1/8-accuracy performance:
curr/maxrss loops min median mean max stddev samples
config bench
7/17 MB 19 2.43µs 2.57µs 2.81µs 10.5µs 22% 16119
8888 path_fill_big_nonaacircle
7/17 MB 17 1.38µs 1.42µs 1.52µs 13µs 20% 21409
8888 path_fill_small_nonaacircle
curr/maxrss loops min median mean max stddev samples
config bench
7/17 MB 71 2.52µs 2.59µs 2.79µs 7.67µs 19% 7557
8888 path_fill_big_nonaacircle
7/17 MB 64 1.45µs 1.49µs 1.51µs 2.39µs 5% 12704
8888 path_fill_small_nonaacircle
BUG=skia:
Change-Id: I3482098aeafcc6f2ec9aa3382977c0dc1b650964
Reviewed-on: https://skia-review.googlesource.com/6699
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
This reverts commit f55ea6a1de.
Reason for revert: crbug.com/679147
Original change's description:
> Retry "SkRasterPipelineBlitter: support A8"...
>
> ...preferring SkA8_Coverage_Blitter over SkRasterPipelineBlitter.
>
> I think we could make this work with SkRasterPipelineBlitter (tell it, draw white in Src mode with this mask), but the existing blitter is pretty hard to beat in efficiency and correctness.
>
> CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-MSAN,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
>
>
> Change-Id: I72df9995c63f3334d8111c59711818cb5ed1e63c
> Reviewed-on: https://skia-review.googlesource.com/6627
> Reviewed-by: Mike Klein <mtklein@chromium.org>
> Commit-Queue: Mike Klein <mtklein@chromium.org>
>
TBR=mtklein@chromium.org,brianosman@google.com,reed@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I6a36b4c087a52e54f4d591ded40e6a202fb77068
Reviewed-on: https://skia-review.googlesource.com/6760
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
MSAN's keeping me honest...
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-MSAN
Change-Id: If61f75ec919a2b5ae0e973db05760f88588fe18f
Reviewed-on: https://skia-review.googlesource.com/6702
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
...preferring SkA8_Coverage_Blitter over SkRasterPipelineBlitter.
I think we could make this work with SkRasterPipelineBlitter (tell it, draw white in Src mode with this mask), but the existing blitter is pretty hard to beat in efficiency and correctness.
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-MSAN,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I72df9995c63f3334d8111c59711818cb5ed1e63c
Reviewed-on: https://skia-review.googlesource.com/6627
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I75e232faee6ad48f65bac5b119a461280b27bbc8
Reviewed-on: https://skia-review.googlesource.com/6661
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Also split bench into run/compile variants to measure the effect:
Before …f16_compile 1x …f16_run 1.02x …srgb_compile 1.56x …srgb_run 1.61x
After …f16_run 1x …f16_compile 1.01x …srgb_compile 1.58x …srgb_run 1.59x
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I8e65fb2acdbb05ccc0b3894f16d7646603c3e74d
Reviewed-on: https://skia-review.googlesource.com/6621
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
BUG=skia:
Change-Id: Icf616bec73e81aad97815b519566ff5b9db611e3
Reviewed-on: https://skia-review.googlesource.com/6495
Commit-Queue: Hal Canary <halcanary@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Change-Id: I80f951976558a284e55386e0a368f08bd835d8ca
Reviewed-on: https://skia-review.googlesource.com/6359
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
- move bytesWritten calculation to query the tail, allowing write() to be faster since it doesn't have to update anything extra per-write.
- enforce that all blocks are multiple-of-4 bytes big
- update the minimum block size to 4K
Before: 30ms
After: 23ms for non-4-bytes writes
13ms for 4-bytes writes
BUG=skia:
Change-Id: Id06ecad3b9fe426747e02accf1393595e3356ce3
Reviewed-on: https://skia-review.googlesource.com/6087
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
This reverts commit 2e018f548d.
Reason for revert: doesn't appear to have been the roll problem.
Original change's description:
> Revert "clamp to premul when reading premul sRGB"
>
> This reverts commit 04e10da836.
>
> Reason for revert: roll?
>
> Change-Id: Id0a8dcd62763bd6eddde120c513ca97e098a4268
> Reviewed-on: https://skia-review.googlesource.com/6022
> Commit-Queue: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Mike Klein <mtklein@chromium.org>
>
TBR=mtklein@chromium.org,reviews@skia.org,brianosman@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I399ca5e728ce6766c6707682c4c6b685681ffdeb
Reviewed-on: https://skia-review.googlesource.com/6025
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This reverts commit 04e10da836.
Reason for revert: roll?
Change-Id: Id0a8dcd62763bd6eddde120c513ca97e098a4268
Reviewed-on: https://skia-review.googlesource.com/6022
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
It's pretty easy to start with sound premultiplied linear floats, pack those to sRGB encoded bytes, then read them back to linear floats and find them not quite premultiplied, with a color channel just a smidge greater than the alpha channel. This can happen basically any time we have different transfer functions for alpha and colors... sRGB being the only one we draw into.
This is an annoying problem with no known good solution. So apply the clamp hammer.
These new calls on SkRasterPipeline should make it impossible to get wrong.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I4c974f4a7b151f3f684946f1e83d06b1b288fd01
Reviewed-on: https://skia-review.googlesource.com/5945
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit ce33f10677.
Reason for revert: Breaking many gpu bots
Change-Id: I94c813ed6a9311458c872f74bb1b0792f46ff414
Reviewed-on: https://skia-review.googlesource.com/5737
Commit-Queue: Greg Daniel <egdaniel@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
This is less to type in most cases, and gives us more information
(for things like picture-backed images, where we need to know all
about the destination surface).
Additionally, strip out the plumbing entirely for bitmap sources,
where we don't need to know anything.
BUG=skia:
Change-Id: I4deff6c7c345fcf62eb08b2aff0560adae4313da
Reviewed-on: https://skia-review.googlesource.com/5748
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit 8e7432b7f9.
Reason for revert: <INSERT REASONING HERE>
external/skia/bench/../tools/android/SkAndroidSDKCanvas.h:103:36: error: C++ requires a type specifier for all declarations
void onClipRect(const SkRect&, ClipOp, ClipEdgeStyle) override;
Original change's description:
> remove SK_SUPPORT_LEGACY_CLIP_REGIONOPS
>
>
> switch over to SkClipOps now that SK_SUPPORT_LEGACY_CLIP_REGIONOPS is gone
>
> BUG=skia:
>
> Change-Id: Ifdc8b3746d508348a40cc009a4e529a1cb3c405d
> Reviewed-on: https://skia-review.googlesource.com/5714
> Commit-Queue: Mike Reed <reed@google.com>
> Reviewed-by: Mike Reed <reed@google.com>
>
TBR=reed@google.com,reviews@skia.org
BUG=skia:
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: If26ea91d7464615e43c1d3d2f726e337ff56b55c
Reviewed-on: https://skia-review.googlesource.com/5721
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Reed <reed@google.com>
switch over to SkClipOps now that SK_SUPPORT_LEGACY_CLIP_REGIONOPS is gone
BUG=skia:
Change-Id: Ifdc8b3746d508348a40cc009a4e529a1cb3c405d
Reviewed-on: https://skia-review.googlesource.com/5714
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Reed <reed@google.com>
This patch adds per-benchmark-iteration times to our JSON output. Given that we
already collect these statistics, giving them to the user would be nice.
No unit-test provided, since `rgrep -i json tests` yielded nothing. Happy to add
one if someone wants.
BUG=None.
TEST=nanobench now writes per-run timinings with the output JSON.
Change-Id: I910f1d97fd3e0ee69fc8e78e011e67b9c866f18d
Reviewed-on: https://skia-review.googlesource.com/5617
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Ravi Mistry <rmistry@google.com>
BUG=skia:
Change-Id: Ib1920fffd5735ad54a5b785bbc2676ea240bdbfa
Reviewed-on: https://skia-review.googlesource.com/5611
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Brian Osman <brianosman@google.com>
Also, no more SkImageEncoder class.
SK_SUPPORT_LEGACY_IMAGE_ENCODER_CLASS now only guards some
old API shims.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5006
Change-Id: I3797f584f3e8e12ade10d31e8733163453725f40
Reviewed-on: https://skia-review.googlesource.com/5006
Commit-Queue: Hal Canary <halcanary@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Reviewed-by: Mike Reed <reed@google.com>
For stages that have {r,g,b,a} and {dr,dg,db,da} versions, name the {r,g,b,a} one "foo" and the {dr,dg,db,da} on "foo_d". The {r,g,b,a} registers are the ones most commonly used and fastest, so they get short ordinary names, and the d-registers are less commonly used and sometimes slower, so they get a suffix.
Some stages naturally opearate on all 8 registers (the xfermodes, accumulate). These names for those look fine and aren't ambiguous.
Also, a bit more re-arrangement in _opts.h.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: Ia20029247642798a60a2566e8a26b84ed101dbd0
Reviewed-on: https://skia-review.googlesource.com/5291
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Every sRGB GM changes, none noticeably.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I632845aea0f40751639cccbcfde8fa270cae0301
Reviewed-on: https://skia-review.googlesource.com/5275
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
(re-land 248ff02 & 2cb6cb7, with changes)
- Hide SkImageEncoder class in private header.
- SkImageEncoder::Type becomes SkEncodedImageFormat
- SkEncodedFormat becomes SkEncodedImageFormat
- SkImageEncoder static functions replaced with
single function EncodeImage()
- utility wrappers for EncodeImage() are in
sk_tool_utils.h
TODO: remove link-time registration mechanism.
TODO: clean up clients use of API and flip the flag.
TODO: implement EncodeImage() in chromeium/skia/ext
Change-Id: I47d451e50be4d5c6c130869c7fa7c2857243d9f0
Reviewed-on: https://skia-review.googlesource.com/4909
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Reviewed-on: https://skia-review.googlesource.com/5186
Commit-Queue: Hal Canary <halcanary@google.com>
Reviewed-by: Hal Canary <halcanary@google.com>
- Hide SkImageEncoder class in private header.
- SkImageEncoder::Type becomes SkEncodedImageFormat
- SkEncodedFormat becomes SkEncodedImageFormat
- SkImageEncoder static functions replaced with
single function EncodeImage()
- utility wrappers for EncodeImage() are in
sk_tool_utils.h
TODO: remove link-time registration mechanism.
TODO: clean up clients use of API and flip the flag.
TODO: implement EncodeImage() in chromeium/skia/ext
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4909
Change-Id: Ib48b31fdc05cf23cda7f56ebfd67c841c149ce70
Reviewed-on: https://skia-review.googlesource.com/4909
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Commit-Queue: Hal Canary <halcanary@google.com>
Our codec generator will now preserve any asked-for color space, and
convert the encoded data to that representation. Cacherator now
allows decoding an image to both legacy (nullptr color space), and
color-correct formats. In color-correct mode, we choose the best
decoded format, based on the original properties, and our backend's
capabilities. Preference is given to the native format, when it's
already texturable (sRGB 8888 or F16 linear). Otherwise, we prefer
linear F16, and fall back to sRGB when that's not an option.
Re-land (and fix) of:
https://skia-review.googlesource.com/c/4438/https://skia-review.googlesource.com/c/4796/
BUG=skia:5907
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4838
Change-Id: I20ff972ffe1c7e6535ddc501e2a8ab8c246e4061
Reviewed-on: https://skia-review.googlesource.com/4838
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
the entire pipeline.
Adjust SkFixedAlloc to allow nesting of allocation.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4987
Change-Id: Ieeb1b5deaae004b67cee933af9bc19bbfd5a7687
Reviewed-on: https://skia-review.googlesource.com/4987
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This is step one:
- make SkXfermode useless to public clients
- everything they should need is in SkBlendMode.h
Step two:
- remove SkXfermode.h entirely (since skia core will already be using SkXfermodePriv.h)
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4534
Change-Id: If2cea9f71df92430ed6644edb98dd306c5572cbc
Reviewed-on: https://skia-review.googlesource.com/4534
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
This reverts commit c73a1ecbed.
Reason for revert: ANGLE and CommandBuffer failures
Original change's description:
> Support decoding images to multiple formats, depending on usage
>
> Our codec generator will now preserve any asked-for color space, and
> convert the encoded data to that representation. Cacherator now
> allows decoding an image to both legacy (nullptr color space), and
> color-correct formats. In color-correct mode, we choose the best
> decoded format, based on the original properties, and our backend's
> capabilities. Preference is given to the native format, when it's
> already texturable (sRGB 8888 or F16 linear). Otherwise, we prefer
> linear F16, and fall back to sRGB when that's not an option.
>
> BUG=skia:5907
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4438
>
> Change-Id: I847c243dcfb72d8c0f1f6fc73c09547adea933f0
> Reviewed-on: https://skia-review.googlesource.com/4438
> Reviewed-by: Matt Sarett <msarett@google.com>
> Commit-Queue: Brian Osman <brianosman@google.com>
>
TBR=mtklein@google.com,bsalomon@google.com,msarett@google.com,brianosman@google.com,reed@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: I1818f937464573d601f64e5a1f1eb43f5a778f4e
Reviewed-on: https://skia-review.googlesource.com/4832
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Our codec generator will now preserve any asked-for color space, and
convert the encoded data to that representation. Cacherator now
allows decoding an image to both legacy (nullptr color space), and
color-correct formats. In color-correct mode, we choose the best
decoded format, based on the original properties, and our backend's
capabilities. Preference is given to the native format, when it's
already texturable (sRGB 8888 or F16 linear). Otherwise, we prefer
linear F16, and fall back to sRGB when that's not an option.
BUG=skia:5907
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4438
Change-Id: I847c243dcfb72d8c0f1f6fc73c09547adea933f0
Reviewed-on: https://skia-review.googlesource.com/4438
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Image shaders need to do some geometry work before sampling the image colors:
1) determine dst coordinates
2) map back to src coordinates
3) tiling
Feeding (x,y) through as (dr,dg) registers makes step 1) easy, perhaps trivial, while leaving (r,g,b,a) with their usual meanings, "the color", starting with the paint color.
This is easy to tweak into something like (x+0.5, y+0.5, 1) in (dr,dg,db) once this lands. Mostly I just want to get all the uninteresting boilerplate out of the way first.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4791
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Change-Id: Ia07815d942ded6672dc1df785caf80a508fc8f37
Reviewed-on: https://skia-review.googlesource.com/4791
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This reverts commit 8af38a9647.
Reason for revert: Breaking compile bots, e.g. from https://uberchromegw.corp.google.com/i/client.skia.compile/builders/Build-Ubuntu-GCC-x86_64-Debug-NoGPU/builds/10029/steps/compile_skia%20on%20Ubuntu/logs/stdio :
In function `sk_make_sp<GrGLSLCaps, GrContextOptions>'
undefined reference to `GrGLSLCaps::GrGLSLCaps(GrContextOptions const&)
In function `_Z10sk_make_spI10GrGLSLCapsI16GrContextOptionsEE5sk_spIT_EDpOT0_':
undefined reference to `GrGLSLCaps::GrGLSLCaps(GrContextOptions const&)
Original change's description:
> skslc now uses standard Skia caps
>
> BUG=skia:
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4660
>
> Change-Id: Idaedae3f81426b97f5052bb872cdf0610e47a84f
> Reviewed-on: https://skia-review.googlesource.com/4660
> Reviewed-by: Ben Wagner <benjaminwagner@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
> Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
>
TBR=bsalomon@google.com,benjaminwagner@google.com,ethannicholas@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Change-Id: Ic7f987f5c050ac2e333f5a0f16c8de85c1047a74
Reviewed-on: https://skia-review.googlesource.com/4697
Commit-Queue: Leon Scroggins <scroggo@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
This is much more explicit about what that type represents (are we in
legacy mode or not), which also makes it suitable for other (upcoming)
usage.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4529
Change-Id: Iacb397c34e7765f1ca86c0195bc622b2be4d9acf
Reviewed-on: https://skia-review.googlesource.com/4529
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
It is moved to src/utils. It is almost a tool, but has two uses in
src/ports.
The existing SkOSFile.cpp is left empty for the time being since it is
mentioned in Chromium's BUILD.gn for Skia.
Change-Id: I3bb7f7c4214359eb6ab906bfe76737d20bf1d6c7
Reviewed-on: https://skia-review.googlesource.com/4536
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Ben Wagner <bungeman@google.com>
Replace with std::unique_ptr.
Change-Id: I5806cfbb30515fcb20e5e66ce13fb5f3b8728176
Reviewed-on: https://skia-review.googlesource.com/4381
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This class is already just an alias for std::unique_ptr<T[]>, so replace
all uses with that and delete the class.
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-ASAN-Trybot,Test-Ubuntu-Clang-Golo-GPU-GT610-x86_64-Debug-ASAN-Trybot
Change-Id: I40668d398356a22da071ee791666c7f728b59266
Reviewed-on: https://skia-review.googlesource.com/4362
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This was always intended to be a temporary dependency to use for
testing. It has served its purpose.
Also, this has already been dropped (accidentally, I think) by
the new GN build.
TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4220
Change-Id: Ic72ee08bbfaf86ed86a4122fd38be2921eb1327e
Reviewed-on: https://skia-review.googlesource.com/4220
Reviewed-by: Matt Sarett <msarett@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
This allows us to change the underlying pointer without rebuilding the pipeline, e.g. when moving the blitter from scanline to scanline.
The extra overhead when not needed is measurable but small, <2%. We can always add back direct stages later for cases where we know the context pointer will not change.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3943
Change-Id: I827d7e6e4e67d02dd2802610f898f98c5f36f8cb
Reviewed-on: https://skia-review.googlesource.com/3943
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
I'm not yet caching these in the blitter, and speed is essentially unchanged in the bench where I am now building and compiling the pipeline only once. This may not be able to stay a simple std::function after I figure out caching, but for now it's a nice fit.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3911
Change-Id: I9545af589f73baf9f17cb4e6ace9a814c2478fe9
Reviewed-on: https://skia-review.googlesource.com/3911
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Matches our naming convention for all other types - factories that
return sk_sp (or any type that intelligently manages its own
lifetime) are named Make.
Previous factories are still around, assuming
SK_SUPPORT_LEGACY_COLOR_SPACE_FACTORIES is defined. Enable that
define for Android, etc.
See also: https://codereview.chromium.org/2442053002/
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3822
Change-Id: Iaea9376490736b494e8ffc820831f052bbe1478d
Reviewed-on: https://skia-review.googlesource.com/3822
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Reed <reed@google.com>
The refactoring breaks off A2B0 tag support into a separate
subclass of SkColorSpace_Base, while keeping the current
(besides CLUT) functionality in a XYZTRC subclass.
ICC profile loading is now aware of this and creates the A2B0
subclass when SkColorSpace::NewICC() is called on a profile
in need of the A2B0 functionality.
The LabPCSDemo GM loads a .icc profile containing a LAB PCS and
then runs a Lab->XYZ conversion on an image using it so we can
display it and test out the A2B0 SkColorSpace functionality,
sans a/b/m-curves, as well as the Lab->XYZ conversion code.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2389983002
Review-Url: https://codereview.chromium.org/2389983002
This is only useful in the rare case that the dst does not
fall into one of our main paths.
But it's a good optimization, since this does happen,
and typically, the dst won't change.
ColorCodecBench z620 --nonstd --xform_only
Without Patch 511us
With Patch 348us
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3400
Change-Id: Ibf68d9ce7072680465662922f4aa15630545e3d6
Reviewed-on: https://skia-review.googlesource.com/3400
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
It will now reside in SkColorSpace_Base. Future work for SkColorSpace
will cause this function to not be desirable or sensible to call on
all SkColorSpaces. Call sites were changed to make a kSRGBLinear_Named
instead of kSRGB_Named -> makeLinearGamma() (the majority of cases),
and if that was not possible, SkColorSpace_Base::makeLinearGamma()
was called instead.
TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2412613005
Review-Url: https://codereview.chromium.org/2412613005
- perform version check in CreateProc for XfermodeImageFilter and ArithmeticImageFilter
This reverts commit 3ed485f424.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2992
Change-Id: Ib4a154cdd5f5d1dcac921ef50d53b79a2d6a1be8
Reviewed-on: https://skia-review.googlesource.com/2992
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
I also had to cut it down to just a global atomic bool... as a field in a global singleton accessed through instance(), it's very hard to make threadsafe.
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-TSAN-Trybot
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2937
Change-Id: If80be987906dd521fbe644d1d0d577009f06d0e3
Reviewed-on: https://skia-review.googlesource.com/2937
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This is handy now, and becomes necessary with fancier backends:
- most code can't speak the type of AVX pipeline stages,
so indirection's definitely needed there;
- if the pipleine is entirely composed of stock stages,
these enum values become an abstract recipe that can be JITted.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2782
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Change-Id: Iedd62e99ce39e94cf3e6ffc78c428f0ccc182342
Reviewed-on: https://skia-review.googlesource.com/2782
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
This lets them pick up runtime CPU specializations. Here I've plugged in SSE4.1. This is still one of the N prelude CLs to full 8-at-a-time AVX.
I've moved the union of the stages used by SkRasterPipelineBench and SkRasterPipelineBlitter to SkOpts... they'll all be used by the blitter eventually. Picking up SSE4.1 specialization here (even still just 4 pixels at a time) is a significant speedup, especially to store_srgb(), so much that it's no longer really interesting to compare against the fused-but-default-instruction-set version in the bench. So that's gone now.
That left the SkRasterPipeline unit test as the only other user of the EasyFn simplified interface to SkRasterPipeline. So I converted that back down to the bare-metal interface, and EasyFn and its friends became SkRasterPipeline_opts.h exclusive abbreviations (now called Kernel_Sk4f). This isn't really unexpected: SkXfermode also wanted to build up its own little abstractions, and once you build your own abstraction, the value of an additional EasyFn-like layer plummets to negative.
For simplicity I've left the SkXfermode stages alone, except srcover() which was always part of the blitter. No particular reason except keeping the churn down while I hack. These _can_ be in SkOpts, but don't have to be until we go 8-at-a-time.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2752
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Change-Id: I3b476b18232a1598d8977e425be2150059ab71dc
Reviewed-on: https://skia-review.googlesource.com/2752
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
In some build configurations (I think, GN, GCC 6, Debug) I get a warning that i is used unintialized. This likely has something to do with GCC correctly seeing that the SkTCast construction there is illegal aliasing, and perhaps thus "doesn't happen". Might be that if the SkTCast gets inlined, it decides its implementation is secretly kosher, and so Release builds don't see this. None of this happens with the GCCs we have on the bots... too old?
Instead use memcpy() here, which is well defined to do what we intended.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2758
Change-Id: Iaf5c75fbd852193b0b861bf5e71450502511d102
Reviewed-on: https://skia-review.googlesource.com/2758
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Ben Wagner <bungeman@google.com>
We used to step at a 4-pixel stride as long as possible, then run up to 3 times, one pixel at a time. Now replace those 1-at-a-time runs with a single tail stamp if there are 1-3 remaining pixels.
This style is simply more efficient: e.g. we'll blend and lerp once for 3 pixels instead of 3 times. This should make short blits significantly more efficient. It's also more future-oriented... AVX+ on Intel and SVE on ARM support masked loads and stores, so we can do the entire tail in one direct step.
This also makes it possible to re-arrange the code a bit to encapsulate each stage better. I think generally this code reads more clearly than the old code, but YMMV. I've arranged things so you write one function, but it's compiled into two specializations, one for tail=0 (Body) and one for tail>0 (Tail). It's pretty tidy.
For now I've just burned a register to pass around tail. It's 2 bits now, maybe soon 3 with AVX, and capped at 4 for even the craziest new toys, so there are plenty of places we can pack it if we want to get clever.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2717
Change-Id: I45852a3e5d4c5b5e9315302c46601aee0d32265f
Reviewed-on: https://skia-review.googlesource.com/2717
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Today if you use the simple SK_RASTER_STAGE interface to build a pipeline, each stage you add calls into a next stage. The last stage you add calls into a special backstop stage JustReturn that, well, just returns, ending the pipeline.
This adds last(), which cuts that last stage off the pipeline. Instead, the stage you add using last() returns directly, ending the pipeline itself without jumping into JustReturn.
This reduces the overhead of using the pipelined version of SkRasterPipelineBench from ~25% to ~20% on my desktop.
Also, add docs.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2713
Change-Id: I11469378e2765c6e34db52eb3eef648d6612da3f
Reviewed-on: https://skia-review.googlesource.com/2713
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
It was not a fan of this (blatant) aliasing.
I suspect this best_non_simd_srcover_srgb_srgb() function has several
other aliasing issues that use undefined behavior, but this is all it's
complaining about for now.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2606
Change-Id: I25a8800e810bccf5068c8a10e9c8c8f565e57304
Reviewed-on: https://skia-review.googlesource.com/2606
Commit-Queue: Mike Klein <mtklein@chromium.org>
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Reason for revert:
Hitting an assert
Original issue's description:
> Support Float32 output from SkColorSpaceXform
>
> * Adds Float32 support to SkColorSpaceXform
> * Changes API to allows clients to ask for F32, updates clients to
> new API
> * Adds Sk4f_load4 and Sk4f_store4 to SkNx
> * Make use of new xform in SkGr.cpp
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003
> CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/43d6651111374b5d1e4ddd9030dcf079b448ec47TBR=brianosman@google.com,mtklein@google.com,scroggo@google.com,mtklein@chromium.org,bsalomon@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2347473007
* Adds Float32 support to SkColorSpaceXform
* Changes API to allows clients to ask for F32, updates clients to
new API
* Adds Sk4f_load4 and Sk4f_store4 to SkNx
* Make use of new xform in SkGr.cpp
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2339233003
The SkFontData type is not exposed externally, so any method which uses
it can be updated to use smart pointers without affecting external
users. Updating this first will make updating the public API much
easier.
This also updates SkStreamAsset* SkStream::NewFromFile(const char*) to
std::unique_ptr<SkStreamAsset> SkStream::MakeFromFile(const char*). It
appears that no one outside Skia is currently using SkStream::NewfromFile
so this is a good time to update it as well.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339273002
Committed: https://skia.googlesource.com/skia/+/d8c2476a8b1e1e1a1771b17e8dd4db8645914f8c
Review-Url: https://codereview.chromium.org/2339273002
Reason for revert:
Killing Mac
Original issue's description:
> SkFontData to use smart pointers.
>
> The SkFontData type is not exposed externally, so any method which uses
> it can be updated to use smart pointers without affecting external
> users. Updating this first will make updating the public API much
> easier.
>
> This also updates SkStreamAsset* SkStream::NewFromFile(const char*) to
> std::unique_ptr<SkStreamAsset> SkStream::MakeFromFile(const char*). It
> appears that no one outside Skia is currently using SkStream::NewfromFile
> so this is a good time to update it as well.
>
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339273002
>
> Committed: https://skia.googlesource.com/skia/+/d8c2476a8b1e1e1a1771b17e8dd4db8645914f8cTBR=mtklein@chromium.org,halcanary@google.com,mtklein@google.com,reed@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review-Url: https://codereview.chromium.org/2343933002
The SkFontData type is not exposed externally, so any method which uses
it can be updated to use smart pointers without affecting external
users. Updating this first will make updating the public API much
easier.
This also updates SkStreamAsset* SkStream::NewFromFile(const char*) to
std::unique_ptr<SkStreamAsset> SkStream::MakeFromFile(const char*). It
appears that no one outside Skia is currently using SkStream::NewfromFile
so this is a good time to update it as well.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339273002
Review-Url: https://codereview.chromium.org/2339273002
We used to build src and dst transfer fn tables
every time a new xform was created with linear
src and dst. Now we don't compute them because
we don't need them.
This will make SkColorSpaceXform a far better
option for any xforms with float or half-float
inputs or outputs, particularly on a small number
of pixels.
This CL also moves SkColorSpaceXform closer to
what I anticipate will be the eventual 'API design'.
I think apply() will want to take a SrcColorType enum
(not created yet because it's not necessary yet) and
a DstColorType enum (still using SkColorType because
there's not yet a reason not to).
Performance changes:
toSRGB 341us -> 366us
to2Dot2 404us -> 403us
toF16 318us -> 304us
There's no reason for toSRGB or to2Dot2 to change.
The refactor seems to have caused the compiler to
order the instructions a little differently...
This is something to come back to if we need to
squeeze more performance out of sRGB. For now,
let's not be held up by something we don't control.
F16 likely improves because we are no longer
(unnecessarily) building the linear tables.
Code size gets a little bigger. Measuring
SkColorSpaceXform size as a percentage of src/ size,
we go from 0.8% to 1.4%.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2335723002
Review-Url: https://codereview.chromium.org/2335723002
HWUI skips transparent rects when drawing.
When skia draws using bilerp, we will blend
transparent rects with neighboring rects and might
draw a bit of a smudge.
This CL adds the option to skip rects, allowing us
to have compatible behavior with the framework.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2305433002
Review-Url: https://codereview.chromium.org/2305433002
Verify the rules that we're converging on for surfaces:
- For 8888, we only support sRGB-like gamma, or no color space at all.
- For F16, we require a color space, with linear gamma.
- For all other formats, we do not support color spaces.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2270823002
Review-Url: https://codereview.chromium.org/2270823002
Motivation: gross code simplification, also no bitset lookups at draw time.
SkPDFFont owns its glyph useage bitset.
SkPDFSubstituteMap goes away.
SkPDFObject interface is simplified.
SkPDFDocument tracks font usage (as hash set), not glyph usage.
SkPDFFont gets a simpler constructor.
SkPDFFont has first and last glyph set in constructor, not adjusted later.
SkPDFFont implementations are simplified.
SkPDFGlyphSet is replaced with simple SkBitSet.
SkPDFFont sizes its SkBitSets based on glyph count.
SkPDFGlyphSetMap goes away.
SkBitSet is now non-copyable.
SkBitSet now how utility methods to match old SkPDFGlyphSet.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2253283004
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Win-MSVC-GCE-CPU-AVX2-x86_64-Release-GDI-Trybot,Test-Win-MSVC-GCE-CPU-AVX2-x86_64-Debug-GDI-Trybot
Review-Url: https://codereview.chromium.org/2253283004
Useful when:
(1) Client does not realize src and dst match (calls color
xform anyway).
(2) Client wants half floats, src and dst have matching
gamuts
(3) Client wants premul (done correctly in linear space),
src and dst have matching gamuts.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2206403003
Review-Url: https://codereview.chromium.org/2206403003
Impl Overview
(1) Keep the device clip bounds up to date. This
requires minimal additional work in a few places
throughout canvas.
(2) Keep track of if the ctm isScaleTranslate. Yes,
there's a function that does this, but it's slow
to call.
(3) Perform the src->device transform in quick reject,
then check intersection/nan.
Other Notes:
(1) NaN and intersection checks are performed
simultaneously.
(2) We no longer quick reject infinity.
(3) Affine and perspective are both handled in the slow
case.
(4) SkRasterClip::isEmpty() is handled by the intersection
check.
Performance on Nexus 6P:
93.2ms -> 59.8ms
Overall Android Jank Tests Performance Impact:
Should gain us a ms or two on some tests.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2225393002
Committed: https://skia.googlesource.com/skia/+/d22a817ff57986407facd16af36320fc86ce02da
Review-Url: https://codereview.chromium.org/2225393002
Impl Overview
(1) Keep the device clip bounds up to date. This
requires minimal additional work in a few places
throughout canvas.
(2) Keep track of if the ctm isScaleTranslate. Yes,
there's a function that does this, but it's slow
to call.
(3) Perform the src->device transform in quick reject,
then check intersection/nan.
Other Notes:
(1) NaN and intersection checks are performed
simultaneously.
(2) We no longer quick reject infinity.
(3) Affine and perspective are both handled in the slow
case.
(4) SkRasterClip::isEmpty() is handled by the intersection
check.
Performance on Nexus 6P:
93.2ms -> 59.8ms
Overall Android Jank Tests Performance Impact:
Should gain us a ms or two on some tests.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2225393002
Review-Url: https://codereview.chromium.org/2225393002
The recording bench must record some source material into some sort of
display list, and fundamentally cannot separate the timing of the two.
This CL makes it so the source material and display list are of the same type.
So instead of previous:
--nolite: SkRecord-based picture -> SkRecord-based picture
--lite: SkRecord-based picture -> threadsafe SkLiteDL
Now this times
--nolite: SkRecord-based picture -> SkRecord-based picture
--lite: SkLiteDL -> threadsafe SkLiteDL
This makes it easier to profile SkLiteDL and explore both recording and playback overhead hot spots.
The threadsafety is incidental for the source (and doesn't affect playback speed),
but I think it's handy to keep around on the destination to make a more fair comparison.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2230323002
Review-Url: https://codereview.chromium.org/2230323002
- This code is entirely private and is not being used by anything.
- In a future CL we will write a class that uses CurveMeasure to compute dash points. In order to determine whether CurveMeasure or PathMeasure should be faster, we need the dash info (the sum of the on/off intervals and how many there are)
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2187083002
Review-Url: https://codereview.chromium.org/2187083002
About 9x faster than Murmur3 for long inputs.
Most of this is a mechanical change from SkChecksum::Murmur3(...) to SkOpts::hash(...).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2208903002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot
Review-Url: https://codereview.chromium.org/2208903002
SkLiteRecorder, a new SkCanvas, fills out SkLiteDL, a new SkDrawable.
This SkDrawable is a display list similar to SkRecord and SkBigPicture / SkRecordedDrawable, but with a few new design points inspired by Android and slimming paint:
1) SkLiteDL is structured as one big contiguous array rather than the two layer structure of SkRecord. This trades away flexibility and large-op-count performance for better data locality for small to medium size pictures.
2) We keep a global freelist of SkLiteDLs, both reusing the SkLiteDL struct itself and its contiguous byte array. This keeps the expected number of mallocs per display list allocation <1 (really, ~0) for cyclical use cases.
These two together mean recording is faster. Measuring against the code we use at head, SkLiteRecorder trends about ~3x faster across various size pictures, matching speed at 0 draws and beating the special-case 1-draw pictures we have today. (I.e. we won't need those special case implementations anymore, because they're slower than this new generic code.) This new strategy records 10 drawRects() in about the same time the old strategy took for 2.
This strategy stays the winner until at least 500 drawRect()s on my laptop, where I stopped checking.
A simpler alternative to freelisting is also possible (but not implemented here), where we allow the client to manually reset() an SkLiteDL for reuse when its refcnt is 1. That's essentially what we're doing with the freelist, except tracking what's available for reuse globally instead of making the client do it.
This code is not fully capable yet, but most of the key design points are there. The internal structure of SkLiteDL is the area I expect to be most volatile (anything involving Op), but its interface and the whole of SkLiteRecorder ought to be just about done.
You can run nanobench --match picture_overhead as a demo. Everything it exercises is fully fleshed out, so what it tests is an apples-to-apples comparison as far as recording costs go. I have not yet compared playback performance.
It should be simple to wrap this into an SkPicture subclass if we want.
I won't start proposing we replace anything old with anything new quite yet until I have more ducks in a row, but this does look pretty promising (similar to the SkRecord over old SkPicture change a couple years ago) and I'd like to land, experiment, iterate, especially with an eye toward Android.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2213333002
Review-Url: https://codereview.chromium.org/2213333002
With the move from SkData::NewXXX to SkData::MakeXXX most
SkAutoTUnref<SkData> were changed to sk_sp<SkData>. However,
there are still a few SkAutoTUnref<SkData> around, so clean
them up.
Review-Url: https://codereview.chromium.org/2212493002
Most visibly this adds a macro SK_RASTER_STAGE that cuts down on the boilerplate of defining a raster pipeline stage function.
Most interestingly, SK_RASTER_STAGE doesn't define a SkRasterPipeline::Fn, but rather a new type EasyFn. This function is always static and inlined, and the details of interacting with the SkRasterPipeline::Stage are taken care of for you: ctx is just passed as a void*, and st->next() is always called. All EasyFns have to do is take care of the meat of the work: update r,g,b, etc. and read and write from their context.
The really neat new feature here is that you can either add EasyFns to a pipeline with the new append() functions, _or_ call them directly yourself. This lets you use the same set of pieces to build either a pipelined version of the function or a custom, fused version. The bench shows this off.
On my desktop, the pipeline version of the bench takes about 25% more time to run than the fused one.
The old approach to creating stages still works fine. I haven't updated SkXfermode.cpp or SkArithmeticMode.cpp because they seemed just as clear using Fn directly as they would have using EasyFn.
If this looks okay to you I will rework the comments in SkRasterPipeline to explain SK_RASTER_STAGE and EasyFn a bit as I've done here in the CL description.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2195853002
Review-Url: https://codereview.chromium.org/2195853002
Motivation:
SkPDFStream and SkPDFSharedStream now work the same.
Also:
- move SkPDFStream into SkPDFTypes (it's a fundamental PDF type).
- minor refactor of SkPDFSharedStream
- SkPDFSharedStream takes unique_ptr to represent ownership
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2190883003
Review-Url: https://codereview.chromium.org/2190883003
AtomicTest was the only use of sk_atomic_add().
AtomicInc64 bench was the only use of sk_atomic_inc(int64_t*).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2183473005
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-TSAN-Trybot,Test-Ubuntu-GCC-Golo-GPU-GT610-x86_64-Release-TSAN-Trybot
Review-Url: https://codereview.chromium.org/2183473005
This trims the SkPM4fPriv methods down to just foolproof methods.
(Anything trying to build these itself is probably wrong.)
Things like Sk4f srgb_to_linear(Sk4f) can't really exist anymore,
at least not efficiently, so this refactor is somewhat more invasive
than you might think. Generally this means things using to_4f() are
also making a misstep... that's gone too.
It also does not make sense to try to play games with linear floats
with 255 bias any more. That hack can't work with real sRGB coding.
Rather than update them, I've removed a couple of L32 xfermode fast
paths. I'd even rather drop it entirely...
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163683002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2163683002
I basically just ran a big 5-deep for-loop over the five constants here.
This is the first set of coefficients I found that round trips all bytes.
I suspect there are many such sets.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2162063003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2162063003
This should give us a good baseline to explore using SkRasterPipeline.
A particular colorxform to half float drops from 425us to 282us on my desktop.
Color Xform to Half Float (HP z620)
Original 425us
Trans16 (not 32) 355us
Vector Trans16 378us
Trans16 + Keep Halfs in Vector 335us
Vector Trans16 + Keep Halfs in Vector 282us
Final 282us
Color Xform to Half Float (Nexus 5X)
Original 556us
Final 472us
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159993003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2159993003
SkPDFUtils now has a special function (SkPDFUtils::AppendColorComponent)
just for writing out (color/255) as a decimal with three digits of
precision.
SkPDFUnion now has a type to represent a color component. It holds a
utint_8, but calls into AppendColorComponent to serialize.
Added a unit test that tests all possible input values.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2151863003
Review-Url: https://codereview.chromium.org/2151863003
I measured relative runtimes on my laptop:
pack_int_uint16_t_ss…
1036 …e41 1x …se3 1.01x …e2_b 3.01x …e2_a 3.02x
I've run into Clang problems with the actual _mm_packus_epi32 instruction, I think,
so I'm going to exercise a little cowardice and leave that option disabled for now.
The ssse3 version probably looks a little faster than it will be in practice.
We'll usually need to load its mask, which here is hoisted out of the bench loop.
The two sse2 variants are close enough in speed that I'm tie breaking them on other
concerns: the <<16, >>16 version doesn't need any scratch registers or to load any
constants, so it wins.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2150343002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
Review-Url: https://codereview.chromium.org/2150343002
If we make sure all SkOpts functions are static, we can give the namespaces any
name we like. This lets us drop the sk_ prefix and give a real indication of
the default SIMD instruction set rather than just saying sk_default.
Both of these changes help debugger, profiler, and crash report readability.
Perhaps more importantly, keeping these functions static helps prevent
accidentally linking in unused versions of functions, as you see here with
sk_avx::srcover_srgb_srgb().
This requires we update SkBlend_opts tests and benches to call SkOpts functions
through SkOpts rather than declaring the methods externally. In practice this
drops testing of the SSE2 version on machines with SSE4. If we still really
need to test/bench the compile time best SIMD level version of this method
against the runtime detected best, we can include SkBlend_opts.h into the tests
or benches directly, similar to what we do for the trivial, brute-force, or best
non-SIMD versions.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145833002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2145833002
SkMatrix::scale and ::rotate take a point around which to scale or rotate.
Canvas lacks these helpers, so the code to rotate a canvas around a
point has been duplicated many times. Factor all of these
implementations into SkCanvas::rotate.
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2142033002
Review-Url: https://codereview.chromium.org/2142033002
Adds a module that performs instanced rendering and starts using it
for a select subset of draws on Mac GL platforms. The instance
processor can currently handle rects, ovals, round rects, and double
round rects. It can generalize shapes as round rects in order to
improve batching. The instance processor also employs new drawing
algorithms, irrespective of instanced rendering, that improve GPU-side
performance (e.g. sample mask, different triangle layouts, etc.).
This change only scratches the surface of instanced rendering. The
majority of draws still only have one instance. Future work may
include:
* Passing coord transforms through the texel buffer.
* Sending FP uniforms through instanced vertex attribs.
* Using instanced rendering for more draws (stencil writes,
drawAtlas, etc.).
* Adding more shapes to the instance processor’s repertoire.
* Batching draws that have mismatched scissors (analyzing draw
bounds, inserting clip planes, etc.).
* Bindless textures.
* Uber shaders.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2066993003
Committed: https://skia.googlesource.com/skia/+/42eafa4bc00354b132ad114d22ed6b95d8849891
Review-Url: https://codereview.chromium.org/2066993003
201295.jpg on HP z620
(300x280, most common form of sRGB profile)
QCMS Xform 0.495 ms
Skia Old Xform 0.235 ms
Skia NEW Xform 0.423 ms
Vs Old Code 0.56x
Vs QCMS 1.17x
So to summarize, we are now much slower than before,
but still a bit faster than QCMS. And now we are also
far more accurate than QCMS :).
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2060823003
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2060823003