Use SkPoint* for all vertex data. (It was void* to support two different
vertex data formats, which we no longer need.)
Assert that vertex stride == sizeof(SkPoint).
Fix build when LOGGING_ENABLED=1.
BUG=skia:
Review URL: https://codereview.chromium.org/1128083004
Reason for revert:
This CL seems to be triggering the assert:
src/gpu/SkGpuDevice.cpp:319: failed assertion "!fNeedClear"
on, at least, Mac & Ubuntu.
Original issue's description:
> Defer glClear to just before draw call
>
> Remove some DO_DEFERRED_CLEAR call to avoid call glClear separately, like this:
> glBindFramebuffer(1)
> glClear
> glBindFramebuffer(2)
> glClear
> glBindFramebuffer(1)
> glDrawXXX
> glBindFramebuffer(2)
> glDrawXXX
>
> These call sequences may need read and write memory back and forth.
>
> If we call DO_DEFERRED_CLEAR just before draw call, we can bind, clear and draw in one go.
> e.g.
> glBindFramebuffer(1)
> glClear
> glDrawXXX
> glBindFramebuffer(2)
> glClear
> glDrawXXX
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/c3c06a13e69b90d4cc1d543853504072d363ae8bTBR=bsalomon@google.com,joel.liang@arm.com,wasim.abbas@arm.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1136393005
Remove some DO_DEFERRED_CLEAR call to avoid call glClear separately, like this:
glBindFramebuffer(1)
glClear
glBindFramebuffer(2)
glClear
glBindFramebuffer(1)
glDrawXXX
glBindFramebuffer(2)
glDrawXXX
These call sequences may need read and write memory back and forth.
If we call DO_DEFERRED_CLEAR just before draw call, we can bind, clear and draw in one go.
e.g.
glBindFramebuffer(1)
glClear
glDrawXXX
glBindFramebuffer(2)
glClear
glDrawXXX
BUG=skia:
Review URL: https://codereview.chromium.org/1113003005
I realized when writing the comment on https://crrev.com/1135363002/
that I'd really just sketched out the entire thing, so I couldn't help
but actually write up a working CL. How does this do for your benchmark?
BUG=chromium:487075
Review URL: https://codereview.chromium.org/1130123006
SSE runs 2-3x faster (than 4f), NEON runs 1.2-1.4x faster (than existing NEON).
Small diffs on {aarectmodes, imagefilters_xfermodes, hairmodes, mixed_xfermodes} only on AA edges due to precision drop.
BUG=skia:
Review URL: https://codereview.chromium.org/1132853005
Implemented by extracting out the non-scale/translate components
and applying that post-filter as an SkMatrixImageFilter.
BUG=skia:
Review URL: https://codereview.chromium.org/1120043002
alphas() extracts the 4 alphas from an existing Sk4px as another Sk4px.
LoadNAlphas() constructs an Sk4px from N packed alphas.
In both cases, we end up with 4x repeated alphas aligned with their pixels.
alphas()
A0 R0 G0 B0 A1 R1 G1 B1 A2 R2 G2 B2 A3 R3 G3 B3
->
A0 A0 A0 A0 A1 A1 A1 A1 A2 A2 A2 A2 A3 A3 A3 A3
Load4Alphas()
A0 A1 A2 A3
->
A0 A0 A0 A0 A1 A1 A1 A1 A2 A2 A2 A2 A3 A3 A3 A3
Load2Alphas()
A0 A1
->
A0 A0 A0 A0 A1 A1 A1 A1 0 0 0 0 0 0 0 0
This is a 5-10% speedup for AA on Intel, and wash on ARM.
AA is still mostly dominated by the final lerp.
alphas() isn't used yet, but it's similar enough to Load[24]Alphas()
that it was easier to write all at once.
BUG=skia:
Review URL: https://codereview.chromium.org/1138333003
Multiple Master and TrueType fonts support variation axes.
This implements back-end support for axes on platforms which
support it.
Review URL: https://codereview.chromium.org/1027373002
Confirm that no path ops tests are flaky, and clean up errors around
that. The test framework was incorrectly checking for >= MAX_ERRORS for
failure and <= MAX_ERRORS for success.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/1140563003
The xml parsing of the Android font configuration is quite lax in what
it allows. This has lead to issues with forward compatibility and
extending the existing format. This has lead to some confusion about
what the actual format is and how a given proposed change will affect
existing files and readers.
The main issue this fixes is containment. Tags are now only recognized
at the correct levels in the correct containing tags. Tags which are
not recognized are properly skipped. Tags which accumulate character
data now only accumulate the character data in their own element as
opposed to all child elements.
Review URL: https://codereview.chromium.org/1138073002
Improve line/curve coincident detection and resolution. This fixed the remaining simple failures.
When an edge is unsortable, use the ray intersection to determine the angles' winding.
Deal with degenerate segments.
TBR=reed@google.com
BUG=skia:3588,skia:3762
Review URL: https://codereview.chromium.org/1140813002
For SSE, Sk4px is better than Sk4f is better than SkXfermodes_opts_SSE2 (where implemented).
For NEON, Sk4px is better than SkXfermodes_opts_arm_neon is better than Sk4f (where implemented).
This is a 1.6-1.9x speedup for Plus,Modulate, and Screen for NEON.
BUG=skia:
Review URL: https://codereview.chromium.org/1128053004
Xfermode_Plus runs 4-5x faster.
We expect mixed_xfermodes to have a small diff. This is because kFoldCoverageIntoSrcAlpha was incorrectly set to true.
This implementation handily beats the Sk4f impl, the portable impl, and the existing SSE2 impl. Reading the SkXfermodes_opts_SSE2.cpp file, I'm pretty confident that we'll be able to beat all SSE2 impls.
I believe this impl will beat or match the existing NEON impl too, but that may not be true for more complicated xfermodes. They can take advantage of transposing ARGBARGB... to AAAARRRR.... cheaply and I haven't figured out an abstraction for that yet that doesn't screw SSE.
Adds:
- MapDstSrc() to Sk4px
- saturatedAdd() to SkNi (only implemented as far as it's used).
- div255Narrow()
BUG=skia:
Review URL: https://codereview.chromium.org/1138893002