SK_SUPPORT_LEGACY_GET_PIXELS_ENUM just set a \#define to convert
onGetPixelsEnum
to
onGetPixels
Now that Chrome has been updated to override onGetPixels, there is no
need for the define.
BUG=skia:3257
Review URL: https://codereview.chromium.org/933853004
Plumb SkDOM as needed to make it suitable for an SkXMLWriter backend.
Also fix a potential null typeface issue in
SkSVGDevice::AutoElement::addTextAttributes().
R=reed@google.com,mtklein@google.com
Review URL: https://codereview.chromium.org/940283002
I've written some new hashtable interfaces that should be easier to use,
and I've been trying to roll them out bit by bit, hopefully replacing
SkTDynamicHash, SkTMultiMap, SkTHashCache, etc.
This turns the cache in GrGLCaps::readPixelsSupported() into an SkTHashMap,
mapping the format key to a bool. Functionally, it's the same.
BUG=skia:
Review URL: https://codereview.chromium.org/948473002
The approximation uses a similar error term as the fill scan converter to determine the number of quads to use.
This also updates SampleApp QuadStroker test with conics, ovals, and stroked text.
Review URL: https://codereview.chromium.org/932113002
Replace the old signature of onGetPixels (return bool) to return an
enum (Result). Remove onGetPixelsEnum.
Add a define for onGetPixelsEnum to onGetPixels. This is for staging
in Chromium, where some implementations override onGetPixelsEnum.
Add the define in skia_for_chromium_defines. Remove
SK_SUPPORT_LEGACY_IMAGE_GENERATOR_RETURN, which is no longer needed by
Chromium.
BUG=skia:3257
Review URL: https://codereview.chromium.org/939113002
Also: Add SkDeflateWStream and associated unit tests.
SkPDFBitmap is a replacement for SkPDFImage. As of now, it only
supports 8888 bitmaps (the most common case).
SkPDFBitmap takes very little extra memory (aside from refing the
bitmap's pixels), and its emitObject() does not cache any data.
The SkPDFBitmap::Create function will check the canon for duplicates.
This can reduce the size of the output PDF.
Motivation: this gives another ~40% decrease in PDF memory overhead
TODO: Support other ColorTypes and scrap SkPDFImage.
BUG=skia:3030
Review URL: https://codereview.chromium.org/918813002
Add RotateCircles3 back as better-named QuadStroker.
Switch pathfill test to call skia before draw instead of in
initializer to avoid triggering debugging breakpoints.
Review URL: https://codereview.chromium.org/912273003
Motivation: test that these works on all possible bots and for all possible clients (clients do run these unit tests, right?)
Dear clients: if this unit test fails, let us know!
BUG=skia:3427
Review URL: https://codereview.chromium.org/922043004
The new enum describes the nature of the failure. This is in
preparation for writing a replacement for SkImageDecoder, which will
use this interface.
Update the comments for getPixels() to specify what it means to pass
an SkImageInfo with a different size.
Make SkImageGenerator Noncopyable.
Leave onGetYUV8Planes alone, since we have separate discussions
regarding modifying that API.
Make callers of SkImageDecoder consistently handle kPartialSuccess.
Previously, some callers considered it a failure, and others considered
it a success.
BUG=skia:3257
Review URL: https://codereview.chromium.org/919693002
SkTHashTable is very similar to SkTDynamicHash, except it's generalized to support non-pointer value types.
It doesn't support remove(), just to keep things simple (it's not hard to add).
Instead of an iterator, it has foreach(), again, to keep things simple.
SkTHashMap<K,V> and SkTHashSet<T> build a friendlier experience on top of SkTHashTable.
BUG=skia:
Review URL: https://codereview.chromium.org/925613002
When checking the skia_arch_type for "x86", instead of doing an
== compare, check if "x86" in skia_arch_type, so it will cover
both x86 and x86_64.
Except when we specifically want x86.
Set skia_arch_width based on "64" in skia_arch_type. No need to specify
in scripts.
In gyp_to_android.py, create a separate var_dict for x86_64.
BUG=skia:3419
Review URL: https://codereview.chromium.org/916113002
This has the side effect of requiring SkNullGLContext to use the null GL interface.
It exposes SkNullGLContext and also removes null context support from SampleApp.
Review URL: https://codereview.chromium.org/916733002
Reason for revert:
Going to punt on 16-bit float support for now. Can't figure out ARM 64.
Original issue's description:
> GYP groudwork for half-float opts support.
>
> This sets us up two new opts targets with the immediate goal of adding half-float (SkHalf.h) opts:
> - opts_neon_fp16: uses hardware support on most ARM chips with NEON to do 4 conversions at a time;
> - opts_avx: uses hardware support on Intel chips with AVX to do 8 conversions at a time.
>
> opts_avx will be a handy thing to have around later too, especially if we want to work with floats.
>
> This doesn't actually add any new source files to these libraries yet, so they're no-ops for now.
> I'll need to write a parallel change to Chrome's GN and GYPs before we can start adding sources.
>
> This also rolls GYP up to head, to get suppport for EnableEnhancedInstructionSet: '3' on Windows,
> which is how we turn on AVX there. There's no Mac-specific flag, so we use OTHER_CPLUSPLUSFLAGS.
>
> BUG=skia:
>
> TBR=reed@google.com
>
> Committed: https://skia.googlesource.com/skia/+/46b80833394d7919cadf2abf2b93802141dd21c5TBR=reed@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/912223002
This sets us up two new opts targets with the immediate goal of adding half-float (SkHalf.h) opts:
- opts_neon_fp16: uses hardware support on most ARM chips with NEON to do 4 conversions at a time;
- opts_avx: uses hardware support on Intel chips with AVX to do 8 conversions at a time.
opts_avx will be a handy thing to have around later too, especially if we want to work with floats.
This doesn't actually add any new source files to these libraries yet, so they're no-ops for now.
I'll need to write a parallel change to Chrome's GN and GYPs before we can start adding sources.
This also rolls GYP up to head, to get suppport for EnableEnhancedInstructionSet: '3' on Windows,
which is how we turn on AVX there. There's no Mac-specific flag, so we use OTHER_CPLUSPLUSFLAGS.
BUG=skia:
TBR=reed@google.com
Review URL: https://codereview.chromium.org/915693002
The macro is only used in CrashHandler.*
Removes SK_CRASH_HANDLER from Android's SkUserConfig, where it is not
needed.
Review URL: https://codereview.chromium.org/915663002
Mac and Windows bots are already building in C++11 mode.
This turns on the rest, mostly to see what work remains.
This will probably break a few bots. It'd be nice if we could let those
all come in as red before reverting this so I can see the full list to fix.
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/779e49602a9c8f4d2799504822e01bcafbcaa534
CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu13.10-GCC4.8-NaCl-Release-Trybot,Build-Mac10.7-Clang-Arm7-Debug-iOS-Trybot
Review URL: https://codereview.chromium.org/868233008
Reason for revert:
Perf-Ubuntu12, and Test-Ubuntu12, Build-Nacl all too old.
Android and Chrome OS builders look ok.
Android testers look ok. Chrome OS testers haven't run yet.
Original issue's description:
> Build in C++11 mode on Unix-like bots.
>
> Mac and Windows bots are already building in C++11 mode.
> This turns on the rest, mostly to see what work remains.
>
> This will probably break a few bots. It'd be nice if we could let those
> all come in as red before reverting this so I can see the full list to fix.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/779e49602a9c8f4d2799504822e01bcafbcaa534TBR=stephana@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/879803003
Mac and Windows bots are already building in C++11 mode.
This turns on the rest, mostly to see what work remains.
This will probably break a few bots. It'd be nice if we could let those
all come in as red before reverting this so I can see the full list to fix.
BUG=skia:
Review URL: https://codereview.chromium.org/868233008
This is working towards making a simple example part of the buildbot compile step and removing SkExamples from the experimental directory.
This works on Mac, Windows, and Linux but isn't complete for Android, ChromeOS and iOS.
Review URL: https://codereview.chromium.org/886413004
This merges and refactors SkAtomics.h and SkBarriers.h into SkAtomics.h and
some ports/ implementations. The major new feature is that we can express
memory orders explicitly rather than only through comments.
The porting layer is reduced to four template functions:
- sk_atomic_load
- sk_atomic_store
- sk_atomic_fetch_add
- sk_atomic_compare_exchange
From those four we can reconstruct all our previous sk_atomic_foo.
There are three ports:
- SkAtomics_std: uses C++11 <atomic>, used with MSVC
- SkAtomics_atomic: uses newer GCC/Clang intrinsics, used on not-MSVC where possible
- SkAtomics_sync: uses older GCC/Clang intrinsics, used where SkAtomics_atomic not supported
No public API changes.
TBR=reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/896553002
Reason for revert:
Reverted the wrong CL.
Original issue's description:
> Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #16 id:300001 of https://codereview.chromium.org/874863002/)
>
> Reason for revert:
> This causes a bug on the 'hittestpath' GM on MacMini 4,1
>
> See:
>
> https://gold.skia.org/#/triage/hittestpath?head=0
>
> for details.
>
> Original issue's description:
> > SSE4 opaque blend using intrinsics instead of assembly.
> >
> > Since we had such a hard time with the assembly versions of this blit (to the
> > point that we have them completely disabled everywhere), I thought I'd take
> > a shot at writing a version of the blit using intrinsics.
> >
> > The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> > to skip the blend when the 16 src pixels we consider each loop are all opaque
> > or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> > all those alphas.
> >
> > It's worth looking to see if we can backport this type of logic to SSE2 using
> > _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
> >
> > My local performance testing doesn't show this to be an unambiguous win
> > (there are probably microbenchmarks and SKPs where we'd be better off just
> > powering through the blend rather than looking at alphas), but the potential
> > does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
> >
> > DM says it draws pixel perfect compare to the old code.
> >
> > Microbenchmarks:
> > bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
> > bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
> > bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
> > bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
> > bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
> > bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
> > bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
> > bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
> > bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
> > bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
> > bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
> > bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
> > bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
> > bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
> > bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
> > bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
> > bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
> >
> > Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> > the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> > be a decent predictor of real-world impact.
> >
> > BUG=chromium:399842
> >
> > Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
> >
> > CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
> >
> > Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493
>
> TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/4988891a1173cd405bf1c1dd3a3668c451f45e4cTBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/894083002
Reason for revert:
This causes a bug on the 'hittestpath' GM on MacMini 4,1
See:
https://gold.skia.org/#/triage/hittestpath?head=0
for details.
Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
>
> DM says it draws pixel perfect compare to the old code.
>
> Microbenchmarks:
> bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
> bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
> bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
> bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
> bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
> bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
> bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
> bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
> bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
> bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
> bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
> bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
> bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
>
> CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/873553003
Not enabled by default, but this should get you SKPs, GMs etc for free to play with.
$ out/Debug/dm -w svgs --src gm skp --config svg
BUG=skia:
Review URL: https://codereview.chromium.org/892693002
Eventually, this will be moved to be a peer of SampleApp so it is compiled by the bots to avoid future bit rot.
Also ignore XCode auto-generated flag in CommandLineFlags, and remove the unused multiple-example part.
Review URL: https://codereview.chromium.org/890873003
SkProxyCanvas is redundant with SkNWayCanvas, and means another class
we have to keep in sync with the SkCanvas interface.
Remove tests which use an SkProxyCanvas.
Requires a change to chromium.
BUG=skia:3279
BUG=skia:500
Review URL: https://codereview.chromium.org/886813002
The obsolete gyp files are also included, along with a pixman test that never functioned but accidentally referenced some of these deleted files.
Review URL: https://codereview.chromium.org/867213004
Since we had such a hard time with the assembly versions of this blit (to the
point that we have them completely disabled everywhere), I thought I'd take
a shot at writing a version of the blit using intrinsics.
The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
to skip the blend when the 16 src pixels we consider each loop are all opaque
or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
all those alphas.
It's worth looking to see if we can backport this type of logic to SSE2 using
_mm_movemask_epi8, or up to 32 pixels at a time using AVX.
My local performance testing doesn't show this to be an unambiguous win
(there are probably microbenchmarks and SKPs where we'd be better off just
powering through the blend rather than looking at alphas), but the potential
does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
DM says it draws pixel perfect compare to the old code.
Microbenchmarks:
bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
Running over our ~70 SKP web page captures, this looks like we spend 0.7x
the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
be a decent predictor of real-world impact.
BUG=chromium:399842
Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
Review URL: https://codereview.chromium.org/874863002
Reason for revert:
This kills Mac 10.6 bots.
FAILED: c++ -MMD -MF obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o.d -DSK_INTERNAL -DSK_GAMMA_SRGB -DSK_GAMMA_APPLY_TO_A8 -DSK_SCALAR_TO_FLOAT_EXCLUDED -DSK_ALLOW_STATIC_GLOBAL_INITIALIZERS=1 -DSK_SUPPORT_GPU=1 -DSK_SUPPORT_OPENCL=0 -DSK_FORCE_DISTANCE_FIELD_TEXT=0 -DSK_BUILD_FOR_MAC -DSK_CRASH_HANDLER -DSK_DEVELOPER=1 -I../../src/core -I../../src/utils -I../../include/c -I../../include/config -I../../include/core -I../../include/pathops -I../../include/pipe -I../../include/utils/mac -I../../include/effects -O0 -gdwarf-2 -mmacosx-version-min=10.6 -arch x86_64 -mssse3 -Wall -Wextra -Winit-self -Wpointer-arith -Wsign-compare -Wno-unused-parameter -Wno-invalid-offsetof -msse4.1 -c ../../src/opts/SkBlitRow_opts_SSE4.cpp -o obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o
../../src/opts/SkBlitRow_opts_SSE4.cpp:15:27: warning: x86intrin.h: No such file or directory
../../src/opts/SkBlitRow_opts_SSE4.cpp: In function 'void S32A_Opaque_BlitRow32_SSE4(SkPMColor*, const SkPMColor*, int, U8CPU)':
../../src/opts/SkBlitRow_opts_SSE4.cpp:40: error: '_mm_testz_si128' was not declared in this scope
../../src/opts/SkBlitRow_opts_SSE4.cpp:45: error: '_mm_testc_si128' was not declared in this scope
Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
>
> DM says it draws pixel perfect compare to the old code.
>
> Microbenchmarks:
> bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
> bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
> bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
> bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
> bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
> bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
> bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
> bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
> bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
> bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
> bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
> bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
> bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/874033004
Since we had such a hard time with the assembly versions of this blit (to the
point that we have them completely disabled everywhere), I thought I'd take
a shot at writing a version of the blit using intrinsics.
The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
to skip the blend when the 16 src pixels we consider each loop are all opaque
or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
all those alphas.
It's worth looking to see if we can backport this type of logic to SSE2 using
_mm_movemask_epi8, or up to 32 pixels at a time using AVX.
My local performance testing doesn't show this to be an unambiguous win
(there are probably microbenchmarks and SKPs where we'd be better off just
powering through the blend rather than looking at alphas), but the potential
does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
DM says it draws pixel perfect compare to the old code.
Microbenchmarks:
bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
Running over our ~70 SKP web page captures, this looks like we spend 0.7x
the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
be a decent predictor of real-world impact.
BUG=chromium:399842
Review URL: https://codereview.chromium.org/874863002
BUG=skia:1366
For the added bench, the collapsing makes the bench take:
- 70% of the time for CPU rendering of 3 consecutive matrix filters
- almost no change in the GPU rendering of the matrix filters
- 50% of the time for CPU and GPU rendering of 3 consecutive table filters
Review URL: https://codereview.chromium.org/776673002
Fold alpha to the inner savelayer in savelayer-savelayer-restore
patterns such as this:
SaveLayer (non-opaque)
Save
ClipRect
SaveLayer
Restore
Restore
Restore
Current blink generates these for example for SVG content such as this:
<path style="opacity:0.5 filter:url(#blur_filter)"/>
The outer save layer is due to the opacity and the inner one is due to
blur filter being implemented with picture image filter.
Reduces layers in desk_carsvg.skp testcase from 115 to 78.
BUG=skia:3119
Review URL: https://codereview.chromium.org/835973005
- add -Wsign-compare, which has been catching useful issues for Kimmo;
- add -Winit-self and -Wpointer-arith to Mac builds so everyone's using
the same flags;
- try try removing -Wno-uninitialized. This was only for the old 10.6
compiler that we have warnings set as non-errors now.
BUG=skia:
Review URL: https://codereview.chromium.org/872793002
To compile SkCondVar, we already require either pthreads or Windows. This
simplifies that code to not need SK_USE_POSIX_THREADS to be explicitly defined.
We'll just look to see if we're targeting Windows, and if not, assume pthreads.
Both before and after this CL, that code will fail to compile if we're not on
Windows and don't have pthreads.
BUG=skia:
Review URL: https://codereview.chromium.org/869443003
This basically takes out the Windows-only hacks and promotes them to
cross-platform behavior driven by --gpu_threading.
- When --gpu_threading is false (the default), this puts GPU tasks and tests
together in the same GPU enclave. They all run serially.
- When --gpu_threading is true, both the tests and the tasks run totally
independently, just like the thread-safe CPU-bound work.
BUG=skia:3255
Review URL: https://codereview.chromium.org/847273005
This fixes two problems:
1) #include SK_SOME_DEFINE doesn't work well for all our clients.
2) Things in include/ are #including things in src/, which we don't like.
TBR=reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/862983002
SkPDFCanon's fields and methods will eventually become part of
SkPDFDocument/SkDocument_PDF. For now, it exists as a singleton to
preflight that transition. This replaces three global arrays in
SkPDFFont, SkPDFShader, and SkPDFGraphicsContext. This code is still
thread-unsafe (http://skbug.com/2683), but moving this functionality
into SkPDFDocument will fix that.
This CL does not change pdf output from either GMs or SKPs.
This change also simplifies some code around the SkPDFCanon methods.
BUG=skia:2683
Review URL: https://codereview.chromium.org/842253003
SkFontMgr and SkFontStyle implementations are currently burried in the
old SkFontHost.cpp file. Move these implementations to their own file
so that the implementations are easier to find, and to make clearer that
SkFontHost.cpp needs to be removed.
Review URL: https://codereview.chromium.org/799533004
- Added new classes to contain YUV planes of memory, along with the associated data.
- Used these classes in load_yuv_texture() to enable YUV planes caching
- Added a unit test for the new cache
BUG=450021
Review URL: https://codereview.chromium.org/851273003
BUG=skia:3255
I think this supports everything DM used to, but has completely refactored how
it works to fit the design in the bug.
Configs like "tiles-gpu" are automatically wired up.
I wouldn't suggest looking at this as a diff. There's just a bunch of deleted
files, a few new files, and one new file that shares a name with a deleted file
(DM.cpp).
NOTREECHECKS=true
Committed: https://skia.googlesource.com/skia/+/709d2c3e5062c5b57f91273bfc11a751f5b2bb88
Review URL: https://codereview.chromium.org/788243008
Reason for revert:
plenty of data
Original issue's description:
> Sketch DM refactor.
>
> BUG=skia:3255
>
>
> I think this supports everything DM used to, but has completely refactored how
> it works to fit the design in the bug.
>
> Configs like "tiles-gpu" are automatically wired up.
>
> I wouldn't suggest looking at this as a diff. There's just a bunch of deleted
> files, a few new files, and one new file that shares a name with a deleted file
> (DM.cpp).
>
> NOTREECHECKS=true
>
> Committed: https://skia.googlesource.com/skia/+/709d2c3e5062c5b57f91273bfc11a751f5b2bb88TBR=bsalomon@google.com,mtklein@chromium.org
NOTREECHECKS=true
NOTRY=true
BUG=skia:3255
Review URL: https://codereview.chromium.org/853883004
BUG=skia:3255
I think this supports everything DM used to, but has completely refactored how
it works to fit the design in the bug.
Configs like "tiles-gpu" are automatically wired up.
I wouldn't suggest looking at this as a diff. There's just a bunch of deleted
files, a few new files, and one new file that shares a name with a deleted file
(DM.cpp).
NOTREECHECKS=true
Review URL: https://codereview.chromium.org/788243008
Reason for revert:
This CL introduces rendering conflicts with hairlines (i.e., the hairlines get overwritten). These conflicts are particularly visible on the following GMs (for the Ubuntu and Android gpu configs):
coloremoji & complexclip2_rrect_bw
Original issue's description:
> Fix GPU clipped-AA vs. non-AA drawRect discrepancy
>
> In the clip stack we were manually rounding out non-AA clip rects but leaving the hardening of non-AA drawRects up to the GPU. In some border cases the GPU can truncate rather than round out resulting in visual discrepancies.
>
> BUG=423834
>
> Committed: https://skia.googlesource.com/skia/+/933a03fecb65c83f81cf65d5cf9870c69aa379ffTBR=bsalomon@google.com,jvanverth@google.com
NOTREECHECKS=true
NOTRY=true
BUG=423834
Review URL: https://codereview.chromium.org/847033002
In the clip stack we were manually rounding out non-AA clip rects but leaving the hardening of non-AA drawRects up to the GPU. In some border cases the GPU can truncate rather than round out resulting in visual discrepancies.
BUG=423834
Review URL: https://codereview.chromium.org/839883003
This program takes a list of Skia Picture (SKP) files and
renders each as a multipage PDF, then prints out the MD5
checksum of the PDF file. This can be used to verify that
changes to the PDF backend will not change PDF output.
Review URL: https://codereview.chromium.org/832403002
Make draw command image widget resize. The widget was not resizing,
effectively preventing the window from being resized smaller.
Make the rasterized draw command image be proportional to the widget
size. The draw rasterization canvas is still an equilateral rectangle
with dimensions of the smaller side of the widget.
Makes the widget re-rasterize the image only when the draw command
changes, not for each widget paint.
Renames the widget from "image widget" to "draw command geometry
widget".
Makes the background of the image black, similar to the raster widget
background.
Adds a tooltip saying "Command geometry" for the widget, so that user might
understand what the contents should be.
This commit is part of work that tries to make the debugger window to be
a bit more resizeable, so that it would fit 1900x1200 screen.
Review URL: https://codereview.chromium.org/787143004
This GM is used to test the combined clipping of a complex clip (packman shape)
and a simple one (circle). We loop over all combinations of clip ops, aa/bw clip,
and inverse/non-inverse clips.
This GM triggers a current bug in the gpu clipping code which fires an assert. Thus
the skipGPU flag is set until that bug is fixed.
BUG=skia:
Review URL: https://codereview.chromium.org/798793003
Defining SK_DYNAMIC_ANNOTATIONS_ENABLED as 1 whenever DYNAMIC_ANNOTATIONS_ENABLED was 1
seems to be working fine for Chrome. Should be we can just use DYNAMIC_ANNOTATIONS_ENABLED.
BUG=skia:
Review URL: https://codereview.chromium.org/810513002
Reason for revert:
uses __builtins not available on all our compilers
Original issue's description:
> Roll libwebp to v0.4.2 (latest stable) to fix annoying build warning.
>
> This warning should now go away:
>
> ../../third_party/externals/libwebp/src/enc/quant.c:105:23: warning: unused variable 'kCoeffThresh' [-Wunused-const-variable]
> static const uint16_t kCoeffThresh[16] = {
>
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/e655557a551c545f4153b0764269298cd01cd0c0TBR=caryclark@google.com,mtklein@chromium.org
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/795823005
Gives more flexibility to the caller to decide whether to use the
encoded data returned by refEncodedData().
Provides an implementation that supports the old version of
SkPicture::serialize().
TODO: Update Chrome, so we can remove SK_LEGACY_ENCODE_BITMAP entirely
BUG=skia:3190
Review URL: https://codereview.chromium.org/784643002
Exposing SkSurface_Gpu makes me sad and I would welcome alternatives.
This change is desireable since it greatly decreases the render target swaps.
Review URL: https://codereview.chromium.org/792923002
Previously, a function was called using dlsym in skia_launcher.
Add a static initializer that changes the setting, and include that for
the tools we automate for testing.
Also only do va_copy if we actually use it.
BUG=skia:2454
Review URL: https://codereview.chromium.org/753543003