Unprotected reads -> relaxed reads.
Unprotected write -> relaxed write.
The only unprotected write we had was in SkTraceEvent, which it looks like we nabbed from Chrome at some point and changed only to silence TSAN. Chrome's version uses AtomicWord / NoBarrier_Load / NoBarrier_Store, which boils down to the same as here, intptr_t / relaxed load / relaxed store.
This leaves one place where we're lying a bit to TSAN, in include/core/SkLazyPtr.h where we're doing a data-dependent consume load. We're telling TSAN it's consume, but telling any other compiler to compile it as relaxed, given how they all upgrade consume to acquire. This eliminates a barrier for us on ARM. How do you guys deal with this? Just use a consume memory order, take the hit, and hope compilers get smarter one day?
BUG=chromium:465721
No public API changes.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/996763002
Reason for revert:
Reverting in case this is the cause of the non-Windows failures.
Original issue's description:
> For consistency, use our homebrew zlib everywhere possible.
>
> This switches when we build our own zlib from "just Windows" to "everyone, but
> not Android framework of course".
>
> I tested this by building DM for my Mac and for an Android bot config.
> It took minor tweaks to the GYP to get ARM builds working.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/5a8f2257b0b0f954fb74f65e7ea3ada772ed9240TBR=scroggo@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/989873002
This switches when we build our own zlib from "just Windows" to "everyone, but
not Android framework of course".
I tested this by building DM for my Mac and for an Android bot config.
It took minor tweaks to the GYP to get ARM builds working.
BUG=skia:
Review URL: https://codereview.chromium.org/971673005
Fixes Android build failures.
Android builds a custom version of libpng (instead of the one
determined by skia_libpng_static) which supplies extra needed
functions.
TBR=mtklein@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
Review URL: https://codereview.chromium.org/971243003
- SkDocument::CreateXPS() function added, returns NULL on non-Windows OS.
- DM: (Windows only) an XPSSink is added, fails on non-Windows OS
- DM: Common code for PDFSink::draw and XPSSink::draw are factored into
draw_skdocument static function.
- SkDocument_XPS (Windows only) implementation of SkDocument via
SkXPSDevice.
- SkDocument_XPS_None (non-Windows) returns NULL for
SkDocument::CreateXPS().
- gyp/xps.gyp refactored.
- SkXPSDevice::drawTextOnPath removed (see http://crrev.com/925343003 )
- SkXPSDevice::drawPath supports conics via SkAutoConicToQuads.
Review URL: https://codereview.chromium.org/963953002
DM:
Add a flag to use SkCodec instead of SkImageDecoder.
SkCodec:
Base class for codecs, allowing creation from an SkStream or an SkData.
An SkCodec, on creation, knows properties of the data like its width and height. Further calls can be used to generate the image.
TODO: Add scanline iterator
SkPngCodec:
New decoder for png. Wraps libpng. The code has been repurposed from SkImageDecoder_libpng.
TODO: Handle other destination colortypes
TODO: Substitute the transpose color
TODO: Allow silencing warnings
TODO: Use RGB instead of filler?
TODO: sRGB
SkSwizzler:
Simplified version of SkScaledSampler. Unlike the sampler, this object does no sampling.
TODO: Implement other swizzles.
Requires a gclient sync to pull down libpng.
BUG=skia:3257
Committed: https://skia.googlesource.com/skia/+/ca358852b4fed656d11107b2aaf28318a4518b49
(and then reverted)
Review URL: https://codereview.chromium.org/930283002
- SkDocument::CreateXPS() function added, returns NULL on non-Windows OS.
- DM: (Windows only) an XPSSink is added, fails on non-Windows OS
- DM: Common code for PDFSink::draw and XPSSink::draw are factored into
draw_skdocument static function.
- SkDocument_XPS (Windows only) implementation of SkDocument via
SkXPSDevice.
- SkDocument_XPS_None (non-Windows) returns NULL for
SkDocument::CreateXPS().
- gyp/xps.gyp refactored.
- SkXPSDevice::drawTextOnPath removed (see http://crrev.com/925343003 )
- SkXPSDevice::drawPath supports conics via SkAutoConicToQuads.
NOPRESUBMIT=true
Review URL: https://codereview.chromium.org/963953002
Reason for revert:
Breaking windows bots all over the place :(
Original issue's description:
> Add SkCodec, including PNG implementation.
>
> DM:
> Add a flag to use SkCodec instead of SkImageDecoder.
>
> SkCodec:
> Base class for codecs, allowing creation from an SkStream or an SkData.
> An SkCodec, on creation, knows properties of the data like its width and height. Further calls can be used to generate the image.
> TODO: Add scanline iterator
>
> SkPngCodec:
> New decoder for png. Wraps libpng. The code has been repurposed from SkImageDecoder_libpng.
> TODO: Handle other destination colortypes
> TODO: Substitute the transpose color
> TODO: Allow silencing warnings
> TODO: Use RGB instead of filler?
> TODO: sRGB
>
> SkSwizzler:
> Simplified version of SkScaledSampler. Unlike the sampler, this object does no sampling.
> TODO: Implement other swizzles.
>
> BUG=skia:3257
>
> Committed: https://skia.googlesource.com/skia/+/ca358852b4fed656d11107b2aaf28318a4518b49TBR=reed@google.com,djsollen@google.com,msarett@google.com,mtklein@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:3257
Review URL: https://codereview.chromium.org/972743003
DM:
Add a flag to use SkCodec instead of SkImageDecoder.
SkCodec:
Base class for codecs, allowing creation from an SkStream or an SkData.
An SkCodec, on creation, knows properties of the data like its width and height. Further calls can be used to generate the image.
TODO: Add scanline iterator
SkPngCodec:
New decoder for png. Wraps libpng. The code has been repurposed from SkImageDecoder_libpng.
TODO: Handle other destination colortypes
TODO: Substitute the transpose color
TODO: Allow silencing warnings
TODO: Use RGB instead of filler?
TODO: sRGB
SkSwizzler:
Simplified version of SkScaledSampler. Unlike the sampler, this object does no sampling.
TODO: Implement other swizzles.
BUG=skia:3257
Review URL: https://codereview.chromium.org/930283002
Partially in preparation for building libpng on Windows.
Also, this makes us consistent across platforms for PDF.
Uses the version of zlib checked into the Chromium tree.
Remove miniz, which is replaced by zlib.
Review URL: https://codereview.chromium.org/966963002
- Adds miniz.c v115_r4 (latest release) to third_party.
- Merges SkDeflateWStream into SkFlate so including "miniz.c" links
without duplicating symbols.
The only interesting code change I've made is to remove the line
fImpl->fZStream.data_type = Z_BINARY;
from SkDeflateWStream::SkDeflateWStream(). miniz doesn't have Z_BINARY
defined, and as far as I can tell, both zlib and miniz ignore data_type.
We should be able to swap skflate.gyp's dependency between zlib.gyp:zlib and
zlib.gyp:miniz at will (except of course on Windows) if we're interested in
zlib itself. I've left android framework on its own zlib. I think this all
means we can stop defining SK_NO_FLATE on Windows.
I'll leave the possible cleanup of SK_NO_FLATE itself for another time. Might
be we always want to keep this dependency optional.
CQ_EXTRA_TRYBOTS=client.skia:Test-Win8-ShuttleA-HD7770-x86-Debug-Trybot
BUG=skia:
Review URL: https://codereview.chromium.org/957323003
This path renderer converts paths to linear contours, resolves intersections via Bentley-Ottman, implements a trapezoidal decomposition a la Fournier and Montuno to produce triangles, and renders those with a single draw call. It does not currently do antialiasing, so it must be used in conjunction with multisampling.
A fair amount of the code is to handle floating point edge cases in intersections. Rather than perform exact computations (which would require arbitrary precision arithmetic), we reconnect the mesh to reflect the intersection points. For example, intersections can occur above the current vertex, and force edges to be merged into the current vertex, requiring a restart of the intersections. Splitting edges for intersections can also force them to merge with formerly-distinct edges in the same polygon, or to violate the ordering of the active edge list, or the active edge state of split edges.
BUG=skia:
Review URL: https://codereview.chromium.org/855513004
Fix path bugs exposed by the path fuzzer.
Changes to existing gm and samplecode files defer their calls to construct
SkPath objects until the first draw instead of at test initialization.
Add an experimental call to SkPath to validate the internal SkPathRef.
Fix SkPath::addPoly to set the last moveto after adding a close verb.
Fix stroke to handle failures when computing the unit normal.
Add a unit test for the unit normal failure.
R=reed@google.com
Review URL: https://codereview.chromium.org/953383002
Removes the disabled SSE2 optimization of ColorRect32 and deletes
the two files containing the code.
Measured on both Core Haswell and Atom Silvermont, and only got
some miniscule improvement compared to the default implementation.
Also tried to write a new, ultimate, version of this optimization,
but only got ~5% improvement on ColorRect32-heavy tests.
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Review URL: https://codereview.chromium.org/957433002
SK_SUPPORT_LEGACY_GET_PIXELS_ENUM just set a \#define to convert
onGetPixelsEnum
to
onGetPixels
Now that Chrome has been updated to override onGetPixels, there is no
need for the define.
BUG=skia:3257
Review URL: https://codereview.chromium.org/933853004
Plumb SkDOM as needed to make it suitable for an SkXMLWriter backend.
Also fix a potential null typeface issue in
SkSVGDevice::AutoElement::addTextAttributes().
R=reed@google.com,mtklein@google.com
Review URL: https://codereview.chromium.org/940283002
I've written some new hashtable interfaces that should be easier to use,
and I've been trying to roll them out bit by bit, hopefully replacing
SkTDynamicHash, SkTMultiMap, SkTHashCache, etc.
This turns the cache in GrGLCaps::readPixelsSupported() into an SkTHashMap,
mapping the format key to a bool. Functionally, it's the same.
BUG=skia:
Review URL: https://codereview.chromium.org/948473002
The approximation uses a similar error term as the fill scan converter to determine the number of quads to use.
This also updates SampleApp QuadStroker test with conics, ovals, and stroked text.
Review URL: https://codereview.chromium.org/932113002
Replace the old signature of onGetPixels (return bool) to return an
enum (Result). Remove onGetPixelsEnum.
Add a define for onGetPixelsEnum to onGetPixels. This is for staging
in Chromium, where some implementations override onGetPixelsEnum.
Add the define in skia_for_chromium_defines. Remove
SK_SUPPORT_LEGACY_IMAGE_GENERATOR_RETURN, which is no longer needed by
Chromium.
BUG=skia:3257
Review URL: https://codereview.chromium.org/939113002
Also: Add SkDeflateWStream and associated unit tests.
SkPDFBitmap is a replacement for SkPDFImage. As of now, it only
supports 8888 bitmaps (the most common case).
SkPDFBitmap takes very little extra memory (aside from refing the
bitmap's pixels), and its emitObject() does not cache any data.
The SkPDFBitmap::Create function will check the canon for duplicates.
This can reduce the size of the output PDF.
Motivation: this gives another ~40% decrease in PDF memory overhead
TODO: Support other ColorTypes and scrap SkPDFImage.
BUG=skia:3030
Review URL: https://codereview.chromium.org/918813002
Add RotateCircles3 back as better-named QuadStroker.
Switch pathfill test to call skia before draw instead of in
initializer to avoid triggering debugging breakpoints.
Review URL: https://codereview.chromium.org/912273003
Motivation: test that these works on all possible bots and for all possible clients (clients do run these unit tests, right?)
Dear clients: if this unit test fails, let us know!
BUG=skia:3427
Review URL: https://codereview.chromium.org/922043004
The new enum describes the nature of the failure. This is in
preparation for writing a replacement for SkImageDecoder, which will
use this interface.
Update the comments for getPixels() to specify what it means to pass
an SkImageInfo with a different size.
Make SkImageGenerator Noncopyable.
Leave onGetYUV8Planes alone, since we have separate discussions
regarding modifying that API.
Make callers of SkImageDecoder consistently handle kPartialSuccess.
Previously, some callers considered it a failure, and others considered
it a success.
BUG=skia:3257
Review URL: https://codereview.chromium.org/919693002
SkTHashTable is very similar to SkTDynamicHash, except it's generalized to support non-pointer value types.
It doesn't support remove(), just to keep things simple (it's not hard to add).
Instead of an iterator, it has foreach(), again, to keep things simple.
SkTHashMap<K,V> and SkTHashSet<T> build a friendlier experience on top of SkTHashTable.
BUG=skia:
Review URL: https://codereview.chromium.org/925613002
When checking the skia_arch_type for "x86", instead of doing an
== compare, check if "x86" in skia_arch_type, so it will cover
both x86 and x86_64.
Except when we specifically want x86.
Set skia_arch_width based on "64" in skia_arch_type. No need to specify
in scripts.
In gyp_to_android.py, create a separate var_dict for x86_64.
BUG=skia:3419
Review URL: https://codereview.chromium.org/916113002
This has the side effect of requiring SkNullGLContext to use the null GL interface.
It exposes SkNullGLContext and also removes null context support from SampleApp.
Review URL: https://codereview.chromium.org/916733002
Reason for revert:
Going to punt on 16-bit float support for now. Can't figure out ARM 64.
Original issue's description:
> GYP groudwork for half-float opts support.
>
> This sets us up two new opts targets with the immediate goal of adding half-float (SkHalf.h) opts:
> - opts_neon_fp16: uses hardware support on most ARM chips with NEON to do 4 conversions at a time;
> - opts_avx: uses hardware support on Intel chips with AVX to do 8 conversions at a time.
>
> opts_avx will be a handy thing to have around later too, especially if we want to work with floats.
>
> This doesn't actually add any new source files to these libraries yet, so they're no-ops for now.
> I'll need to write a parallel change to Chrome's GN and GYPs before we can start adding sources.
>
> This also rolls GYP up to head, to get suppport for EnableEnhancedInstructionSet: '3' on Windows,
> which is how we turn on AVX there. There's no Mac-specific flag, so we use OTHER_CPLUSPLUSFLAGS.
>
> BUG=skia:
>
> TBR=reed@google.com
>
> Committed: https://skia.googlesource.com/skia/+/46b80833394d7919cadf2abf2b93802141dd21c5TBR=reed@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/912223002
This sets us up two new opts targets with the immediate goal of adding half-float (SkHalf.h) opts:
- opts_neon_fp16: uses hardware support on most ARM chips with NEON to do 4 conversions at a time;
- opts_avx: uses hardware support on Intel chips with AVX to do 8 conversions at a time.
opts_avx will be a handy thing to have around later too, especially if we want to work with floats.
This doesn't actually add any new source files to these libraries yet, so they're no-ops for now.
I'll need to write a parallel change to Chrome's GN and GYPs before we can start adding sources.
This also rolls GYP up to head, to get suppport for EnableEnhancedInstructionSet: '3' on Windows,
which is how we turn on AVX there. There's no Mac-specific flag, so we use OTHER_CPLUSPLUSFLAGS.
BUG=skia:
TBR=reed@google.com
Review URL: https://codereview.chromium.org/915693002
The macro is only used in CrashHandler.*
Removes SK_CRASH_HANDLER from Android's SkUserConfig, where it is not
needed.
Review URL: https://codereview.chromium.org/915663002
Mac and Windows bots are already building in C++11 mode.
This turns on the rest, mostly to see what work remains.
This will probably break a few bots. It'd be nice if we could let those
all come in as red before reverting this so I can see the full list to fix.
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/779e49602a9c8f4d2799504822e01bcafbcaa534
CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu13.10-GCC4.8-NaCl-Release-Trybot,Build-Mac10.7-Clang-Arm7-Debug-iOS-Trybot
Review URL: https://codereview.chromium.org/868233008
Reason for revert:
Perf-Ubuntu12, and Test-Ubuntu12, Build-Nacl all too old.
Android and Chrome OS builders look ok.
Android testers look ok. Chrome OS testers haven't run yet.
Original issue's description:
> Build in C++11 mode on Unix-like bots.
>
> Mac and Windows bots are already building in C++11 mode.
> This turns on the rest, mostly to see what work remains.
>
> This will probably break a few bots. It'd be nice if we could let those
> all come in as red before reverting this so I can see the full list to fix.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/779e49602a9c8f4d2799504822e01bcafbcaa534TBR=stephana@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/879803003
Mac and Windows bots are already building in C++11 mode.
This turns on the rest, mostly to see what work remains.
This will probably break a few bots. It'd be nice if we could let those
all come in as red before reverting this so I can see the full list to fix.
BUG=skia:
Review URL: https://codereview.chromium.org/868233008
This is working towards making a simple example part of the buildbot compile step and removing SkExamples from the experimental directory.
This works on Mac, Windows, and Linux but isn't complete for Android, ChromeOS and iOS.
Review URL: https://codereview.chromium.org/886413004
This merges and refactors SkAtomics.h and SkBarriers.h into SkAtomics.h and
some ports/ implementations. The major new feature is that we can express
memory orders explicitly rather than only through comments.
The porting layer is reduced to four template functions:
- sk_atomic_load
- sk_atomic_store
- sk_atomic_fetch_add
- sk_atomic_compare_exchange
From those four we can reconstruct all our previous sk_atomic_foo.
There are three ports:
- SkAtomics_std: uses C++11 <atomic>, used with MSVC
- SkAtomics_atomic: uses newer GCC/Clang intrinsics, used on not-MSVC where possible
- SkAtomics_sync: uses older GCC/Clang intrinsics, used where SkAtomics_atomic not supported
No public API changes.
TBR=reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/896553002
Reason for revert:
Reverted the wrong CL.
Original issue's description:
> Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #16 id:300001 of https://codereview.chromium.org/874863002/)
>
> Reason for revert:
> This causes a bug on the 'hittestpath' GM on MacMini 4,1
>
> See:
>
> https://gold.skia.org/#/triage/hittestpath?head=0
>
> for details.
>
> Original issue's description:
> > SSE4 opaque blend using intrinsics instead of assembly.
> >
> > Since we had such a hard time with the assembly versions of this blit (to the
> > point that we have them completely disabled everywhere), I thought I'd take
> > a shot at writing a version of the blit using intrinsics.
> >
> > The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> > to skip the blend when the 16 src pixels we consider each loop are all opaque
> > or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> > all those alphas.
> >
> > It's worth looking to see if we can backport this type of logic to SSE2 using
> > _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
> >
> > My local performance testing doesn't show this to be an unambiguous win
> > (there are probably microbenchmarks and SKPs where we'd be better off just
> > powering through the blend rather than looking at alphas), but the potential
> > does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
> >
> > DM says it draws pixel perfect compare to the old code.
> >
> > Microbenchmarks:
> > bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
> > bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
> > bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
> > bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
> > bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
> > bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
> > bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
> > bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
> > bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
> > bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
> > bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
> > bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
> > bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
> > bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
> > bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
> > bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
> > bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
> >
> > Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> > the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> > be a decent predictor of real-world impact.
> >
> > BUG=chromium:399842
> >
> > Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
> >
> > CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
> >
> > Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493
>
> TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/4988891a1173cd405bf1c1dd3a3668c451f45e4cTBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/894083002
Reason for revert:
This causes a bug on the 'hittestpath' GM on MacMini 4,1
See:
https://gold.skia.org/#/triage/hittestpath?head=0
for details.
Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
>
> DM says it draws pixel perfect compare to the old code.
>
> Microbenchmarks:
> bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
> bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
> bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
> bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
> bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
> bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
> bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
> bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
> bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
> bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
> bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
> bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
> bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
>
> CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/873553003
Not enabled by default, but this should get you SKPs, GMs etc for free to play with.
$ out/Debug/dm -w svgs --src gm skp --config svg
BUG=skia:
Review URL: https://codereview.chromium.org/892693002
Eventually, this will be moved to be a peer of SampleApp so it is compiled by the bots to avoid future bit rot.
Also ignore XCode auto-generated flag in CommandLineFlags, and remove the unused multiple-example part.
Review URL: https://codereview.chromium.org/890873003
SkProxyCanvas is redundant with SkNWayCanvas, and means another class
we have to keep in sync with the SkCanvas interface.
Remove tests which use an SkProxyCanvas.
Requires a change to chromium.
BUG=skia:3279
BUG=skia:500
Review URL: https://codereview.chromium.org/886813002
The obsolete gyp files are also included, along with a pixman test that never functioned but accidentally referenced some of these deleted files.
Review URL: https://codereview.chromium.org/867213004
Since we had such a hard time with the assembly versions of this blit (to the
point that we have them completely disabled everywhere), I thought I'd take
a shot at writing a version of the blit using intrinsics.
The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
to skip the blend when the 16 src pixels we consider each loop are all opaque
or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
all those alphas.
It's worth looking to see if we can backport this type of logic to SSE2 using
_mm_movemask_epi8, or up to 32 pixels at a time using AVX.
My local performance testing doesn't show this to be an unambiguous win
(there are probably microbenchmarks and SKPs where we'd be better off just
powering through the blend rather than looking at alphas), but the potential
does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
DM says it draws pixel perfect compare to the old code.
Microbenchmarks:
bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
Running over our ~70 SKP web page captures, this looks like we spend 0.7x
the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
be a decent predictor of real-world impact.
BUG=chromium:399842
Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
Review URL: https://codereview.chromium.org/874863002
Reason for revert:
This kills Mac 10.6 bots.
FAILED: c++ -MMD -MF obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o.d -DSK_INTERNAL -DSK_GAMMA_SRGB -DSK_GAMMA_APPLY_TO_A8 -DSK_SCALAR_TO_FLOAT_EXCLUDED -DSK_ALLOW_STATIC_GLOBAL_INITIALIZERS=1 -DSK_SUPPORT_GPU=1 -DSK_SUPPORT_OPENCL=0 -DSK_FORCE_DISTANCE_FIELD_TEXT=0 -DSK_BUILD_FOR_MAC -DSK_CRASH_HANDLER -DSK_DEVELOPER=1 -I../../src/core -I../../src/utils -I../../include/c -I../../include/config -I../../include/core -I../../include/pathops -I../../include/pipe -I../../include/utils/mac -I../../include/effects -O0 -gdwarf-2 -mmacosx-version-min=10.6 -arch x86_64 -mssse3 -Wall -Wextra -Winit-self -Wpointer-arith -Wsign-compare -Wno-unused-parameter -Wno-invalid-offsetof -msse4.1 -c ../../src/opts/SkBlitRow_opts_SSE4.cpp -o obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o
../../src/opts/SkBlitRow_opts_SSE4.cpp:15:27: warning: x86intrin.h: No such file or directory
../../src/opts/SkBlitRow_opts_SSE4.cpp: In function 'void S32A_Opaque_BlitRow32_SSE4(SkPMColor*, const SkPMColor*, int, U8CPU)':
../../src/opts/SkBlitRow_opts_SSE4.cpp:40: error: '_mm_testz_si128' was not declared in this scope
../../src/opts/SkBlitRow_opts_SSE4.cpp:45: error: '_mm_testc_si128' was not declared in this scope
Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
>
> DM says it draws pixel perfect compare to the old code.
>
> Microbenchmarks:
> bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
> bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
> bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
> bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
> bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
> bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
> bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
> bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
> bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
> bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
> bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
> bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
> bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
> bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/874033004
Since we had such a hard time with the assembly versions of this blit (to the
point that we have them completely disabled everywhere), I thought I'd take
a shot at writing a version of the blit using intrinsics.
The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
to skip the blend when the 16 src pixels we consider each loop are all opaque
or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
all those alphas.
It's worth looking to see if we can backport this type of logic to SSE2 using
_mm_movemask_epi8, or up to 32 pixels at a time using AVX.
My local performance testing doesn't show this to be an unambiguous win
(there are probably microbenchmarks and SKPs where we'd be better off just
powering through the blend rather than looking at alphas), but the potential
does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
DM says it draws pixel perfect compare to the old code.
Microbenchmarks:
bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
Running over our ~70 SKP web page captures, this looks like we spend 0.7x
the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
be a decent predictor of real-world impact.
BUG=chromium:399842
Review URL: https://codereview.chromium.org/874863002
BUG=skia:1366
For the added bench, the collapsing makes the bench take:
- 70% of the time for CPU rendering of 3 consecutive matrix filters
- almost no change in the GPU rendering of the matrix filters
- 50% of the time for CPU and GPU rendering of 3 consecutive table filters
Review URL: https://codereview.chromium.org/776673002
Fold alpha to the inner savelayer in savelayer-savelayer-restore
patterns such as this:
SaveLayer (non-opaque)
Save
ClipRect
SaveLayer
Restore
Restore
Restore
Current blink generates these for example for SVG content such as this:
<path style="opacity:0.5 filter:url(#blur_filter)"/>
The outer save layer is due to the opacity and the inner one is due to
blur filter being implemented with picture image filter.
Reduces layers in desk_carsvg.skp testcase from 115 to 78.
BUG=skia:3119
Review URL: https://codereview.chromium.org/835973005
- add -Wsign-compare, which has been catching useful issues for Kimmo;
- add -Winit-self and -Wpointer-arith to Mac builds so everyone's using
the same flags;
- try try removing -Wno-uninitialized. This was only for the old 10.6
compiler that we have warnings set as non-errors now.
BUG=skia:
Review URL: https://codereview.chromium.org/872793002
To compile SkCondVar, we already require either pthreads or Windows. This
simplifies that code to not need SK_USE_POSIX_THREADS to be explicitly defined.
We'll just look to see if we're targeting Windows, and if not, assume pthreads.
Both before and after this CL, that code will fail to compile if we're not on
Windows and don't have pthreads.
BUG=skia:
Review URL: https://codereview.chromium.org/869443003
This basically takes out the Windows-only hacks and promotes them to
cross-platform behavior driven by --gpu_threading.
- When --gpu_threading is false (the default), this puts GPU tasks and tests
together in the same GPU enclave. They all run serially.
- When --gpu_threading is true, both the tests and the tasks run totally
independently, just like the thread-safe CPU-bound work.
BUG=skia:3255
Review URL: https://codereview.chromium.org/847273005
This fixes two problems:
1) #include SK_SOME_DEFINE doesn't work well for all our clients.
2) Things in include/ are #including things in src/, which we don't like.
TBR=reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/862983002
SkPDFCanon's fields and methods will eventually become part of
SkPDFDocument/SkDocument_PDF. For now, it exists as a singleton to
preflight that transition. This replaces three global arrays in
SkPDFFont, SkPDFShader, and SkPDFGraphicsContext. This code is still
thread-unsafe (http://skbug.com/2683), but moving this functionality
into SkPDFDocument will fix that.
This CL does not change pdf output from either GMs or SKPs.
This change also simplifies some code around the SkPDFCanon methods.
BUG=skia:2683
Review URL: https://codereview.chromium.org/842253003
SkFontMgr and SkFontStyle implementations are currently burried in the
old SkFontHost.cpp file. Move these implementations to their own file
so that the implementations are easier to find, and to make clearer that
SkFontHost.cpp needs to be removed.
Review URL: https://codereview.chromium.org/799533004
- Added new classes to contain YUV planes of memory, along with the associated data.
- Used these classes in load_yuv_texture() to enable YUV planes caching
- Added a unit test for the new cache
BUG=450021
Review URL: https://codereview.chromium.org/851273003
BUG=skia:3255
I think this supports everything DM used to, but has completely refactored how
it works to fit the design in the bug.
Configs like "tiles-gpu" are automatically wired up.
I wouldn't suggest looking at this as a diff. There's just a bunch of deleted
files, a few new files, and one new file that shares a name with a deleted file
(DM.cpp).
NOTREECHECKS=true
Committed: https://skia.googlesource.com/skia/+/709d2c3e5062c5b57f91273bfc11a751f5b2bb88
Review URL: https://codereview.chromium.org/788243008
Reason for revert:
plenty of data
Original issue's description:
> Sketch DM refactor.
>
> BUG=skia:3255
>
>
> I think this supports everything DM used to, but has completely refactored how
> it works to fit the design in the bug.
>
> Configs like "tiles-gpu" are automatically wired up.
>
> I wouldn't suggest looking at this as a diff. There's just a bunch of deleted
> files, a few new files, and one new file that shares a name with a deleted file
> (DM.cpp).
>
> NOTREECHECKS=true
>
> Committed: https://skia.googlesource.com/skia/+/709d2c3e5062c5b57f91273bfc11a751f5b2bb88TBR=bsalomon@google.com,mtklein@chromium.org
NOTREECHECKS=true
NOTRY=true
BUG=skia:3255
Review URL: https://codereview.chromium.org/853883004
BUG=skia:3255
I think this supports everything DM used to, but has completely refactored how
it works to fit the design in the bug.
Configs like "tiles-gpu" are automatically wired up.
I wouldn't suggest looking at this as a diff. There's just a bunch of deleted
files, a few new files, and one new file that shares a name with a deleted file
(DM.cpp).
NOTREECHECKS=true
Review URL: https://codereview.chromium.org/788243008
Reason for revert:
This CL introduces rendering conflicts with hairlines (i.e., the hairlines get overwritten). These conflicts are particularly visible on the following GMs (for the Ubuntu and Android gpu configs):
coloremoji & complexclip2_rrect_bw
Original issue's description:
> Fix GPU clipped-AA vs. non-AA drawRect discrepancy
>
> In the clip stack we were manually rounding out non-AA clip rects but leaving the hardening of non-AA drawRects up to the GPU. In some border cases the GPU can truncate rather than round out resulting in visual discrepancies.
>
> BUG=423834
>
> Committed: https://skia.googlesource.com/skia/+/933a03fecb65c83f81cf65d5cf9870c69aa379ffTBR=bsalomon@google.com,jvanverth@google.com
NOTREECHECKS=true
NOTRY=true
BUG=423834
Review URL: https://codereview.chromium.org/847033002
In the clip stack we were manually rounding out non-AA clip rects but leaving the hardening of non-AA drawRects up to the GPU. In some border cases the GPU can truncate rather than round out resulting in visual discrepancies.
BUG=423834
Review URL: https://codereview.chromium.org/839883003
This program takes a list of Skia Picture (SKP) files and
renders each as a multipage PDF, then prints out the MD5
checksum of the PDF file. This can be used to verify that
changes to the PDF backend will not change PDF output.
Review URL: https://codereview.chromium.org/832403002
Make draw command image widget resize. The widget was not resizing,
effectively preventing the window from being resized smaller.
Make the rasterized draw command image be proportional to the widget
size. The draw rasterization canvas is still an equilateral rectangle
with dimensions of the smaller side of the widget.
Makes the widget re-rasterize the image only when the draw command
changes, not for each widget paint.
Renames the widget from "image widget" to "draw command geometry
widget".
Makes the background of the image black, similar to the raster widget
background.
Adds a tooltip saying "Command geometry" for the widget, so that user might
understand what the contents should be.
This commit is part of work that tries to make the debugger window to be
a bit more resizeable, so that it would fit 1900x1200 screen.
Review URL: https://codereview.chromium.org/787143004