Reason for revert:
Breaks Chrome roll.
obj/skia/ext/skia_chrome.skia_memory_dump_provider.o
does not have -I include/private on its include path, but transitively includes SkMessageBus.h.
Original issue's description:
> Port uses of SkLazyPtr to SkOncePtr.
>
> This gives SkOncePtr a non-trivial destructor that uses std::default_delete
> by default. This is overrideable, as seen in SkColorTable.
>
> SK_DECLARE_STATIC_ONCE_PTR still just leaves its pointers hanging at EOP.
>
> BUG=skia:
>
> No public API changes.
> TBR=reed@google.com
>
> Committed: https://skia.googlesource.com/skia/+/a1254acdb344174e761f5061c820559dab64a74c
TBR=herb@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1334523002
This gives SkOncePtr a non-trivial destructor that uses std::default_delete
by default. This is overrideable, as seen in SkColorTable.
SK_DECLARE_STATIC_ONCE_PTR still just leaves its pointers hanging at EOP.
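As a rough illustration of the destructor behavior described above, here is a minimal sketch of a once-initialized pointer whose deleter defaults to std::default_delete and can be overridden. OncePtr, its get() method, and CountingDelete are hypothetical names made up for this sketch; this is not the actual SkOncePtr code, and it assumes a simple compare-and-swap scheme for the one-time initialization.

```cpp
#include <atomic>
#include <memory>

// Hypothetical stand-in for a once-initialized pointer; NOT Skia's SkOncePtr.
template <typename T, typename Delete = std::default_delete<T>>
class OncePtr {
public:
    OncePtr() = default;
    OncePtr(const OncePtr&) = delete;
    OncePtr& operator=(const OncePtr&) = delete;

    // Non-trivial destructor: hands whatever was created to the deleter.
    ~OncePtr() {
        if (T* ptr = fPtr.load(std::memory_order_acquire)) {
            Delete()(ptr);
        }
    }

    // Returns the shared object, calling create() on first use.
    // If two threads race, both may call create(); the loser's object is deleted.
    template <typename Create>
    T* get(const Create& create) {
        T* ptr = fPtr.load(std::memory_order_acquire);
        if (!ptr) {
            T* fresh = create();
            if (fPtr.compare_exchange_strong(ptr, fresh,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire)) {
                ptr = fresh;      // we won the race
            } else {
                Delete()(fresh);  // someone else won; ptr now holds their object
            }
        }
        return ptr;
    }

private:
    std::atomic<T*> fPtr{nullptr};
};

// Hypothetical overriding deleter: counts deletions so the effect is observable.
struct CountingDelete {
    static int& count() { static int n = 0; return n; }
    void operator()(int* p) const { ++count(); delete p; }
};
```

With this sketch, OncePtr<int> would use plain delete, while OncePtr<int, CountingDelete> routes destruction through the custom deleter.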
BUG=skia:
No public API changes.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/1322933005
This switches over SkXfermodes_opts.h and SkColorMatrixFilter to use Sk4f,
and converts the SkPMFloat benches to Sk4f benches.
No pixels should change here, and no code beyond the Sk4f_ benches should change speed.
The benches are faster than the old versions.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1324743002
Renames Sk4pxXfermode.h to SkXfermode_opts.h,
and refactors it a tiny bit internally.
This moves xfermode optimization from being "compile-time everywhere but NEON"
to simply "runtime everywhere". I don't anticipate any effect on perf or
correctness.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1264543006
These handwritten xfermodes for Clear, Src, DstIn, and DstOut are actually dead
code: they're all covered by Sk4pxXfermode, which we'd already have returned.
Tidies up the xfermode creation logic to make this clearer.
This cuts 20-40K off SkXfermode.o, depending on the platform.
BUG=skia:
Review URL: https://codereview.chromium.org/1249773004
Now that SK_SUPPORT_LEGACY_XFERMODES is unused, tons of code becomes dead.
Nothing is needed in opts/ anymore for x86.
We still do runtime NEON detection, which just duplicates Sk4pxXfermode.
TBR=reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/1230023011
- Once in SkXfermode as usual to pick up compile-time SSE and NEON
- Once in SkXfermode_arm_neon to pick up run-time NEON
This allows us to start cleaning up SkXfermode_arm_neon as we've done
for SkXfermode_SSE2. I'm saving this catharsis for a day when I need it.
The Sk4px xfermodes are generally faster than the existing NEON procs,
so this should also have the side effect of a perf win there.
This means our new Plus-AA code works for runtime NEON too.
BUG=skia:3852
Review URL: https://codereview.chromium.org/1150313003
Adds and uses fastMulDiv255Round() where possible,
which approximates x*y/255 as (x*y+x)/256. Seems like a sizeable
speedup, as seen below on Exclusion, Screen, and Modulate. The
existing NEON code uses this approximation for
{Src,Dst}x{In,Out,Over}, and without it we'd regress speed there.
This will require rebaselines whether or not we use this
approximation: the x86 bots change if we do, the ARM bots change
if we don't. None of the diffs are significant.
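For reference, a scalar sketch of the approximation on single 8-bit values. The function names below are hypothetical and chosen for this sketch only; the real fastMulDiv255Round() works on vectors of pixels, which this does not attempt to reproduce.

```cpp
#include <cstdint>

// Exact x*y/255 rounded to nearest, for 8-bit inputs.
inline uint8_t mulDiv255RoundExact(uint8_t x, uint8_t y) {
    return static_cast<uint8_t>((x * y + 127) / 255);
}

// The approximation from the text: (x*y + x) / 256, i.e. x*(y+1) >> 8.
inline uint8_t mulDiv255RoundFast(uint8_t x, uint8_t y) {
    return static_cast<uint8_t>((x * y + x) >> 8);
}

// Largest absolute difference between the two over all 65536 input pairs.
inline int maxAbsError() {
    int worst = 0;
    for (int x = 0; x < 256; ++x) {
        for (int y = 0; y < 256; ++y) {
            int exact = mulDiv255RoundExact(static_cast<uint8_t>(x), static_cast<uint8_t>(y));
            int fast  = mulDiv255RoundFast(static_cast<uint8_t>(x), static_cast<uint8_t>(y));
            int diff  = exact > fast ? exact - fast : fast - exact;
            if (diff > worst) worst = diff;
        }
    }
    return worst;
}
```

In this scalar form the approximation keeps y == 255 as an exact identity and is never off by more than one from the rounded result, which fits with the diffs being small.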
Desktop:
Xfermode_Screen_aa      5.82ms -> 5.54ms  0.95x
Xfermode_Modulate_aa    5.67ms -> 5.36ms  0.95x
Xfermode_Exclusion_aa   6.18ms -> 5.81ms  0.94x
Xfermode_Exclusion      5.03ms -> 4.24ms  0.84x
Xfermode_Screen         4.51ms -> 3.59ms  0.8x
Xfermode_Modulate        4.2ms -> 3.19ms  0.76x
Xfermode_DstOver        6.73ms -> 3.88ms  0.58x
Xfermode_SrcOut         6.47ms -> 3.48ms  0.54x
Xfermode_SrcIn          6.46ms -> 3.46ms  0.54x
Xfermode_DstOut         6.49ms -> 3.41ms  0.52x
Xfermode_DstIn           6.5ms -> 3.32ms  0.51x
Xfermode_Src_aa         9.53ms -> 4.75ms  0.5x
Xfermode_Clear_aa       9.65ms ->  4.8ms  0.5x
Xfermode_DstIn_aa       11.5ms -> 5.57ms  0.49x
Xfermode_DstOver_aa     11.6ms -> 5.63ms  0.49x
Xfermode_SrcOut_aa      11.6ms ->  5.5ms  0.47x
Xfermode_SrcIn_aa       11.7ms -> 5.51ms  0.47x
Xfermode_DstOut_aa      11.7ms ->  5.4ms  0.46x
N7 performance is close enough to 1x that I'm not sure whether
this is a net win, net loss, or truly neutral. I figure the bots will
show that.
I experimented with another approximation,
(x*(255-y))/255 ≈ (x*(256-y))/256. This was inconclusive, so I'm
leaving it out for now.
The remaining modes are the complicated conditional ones.
BUG=skia:
Review URL: https://codereview.chromium.org/1141953004
SSE runs 2-3x faster (than 4f), NEON runs 1.2-1.4x faster (than existing NEON).
Small diffs on {aarectmodes, imagefilters_xfermodes, hairmodes, mixed_xfermodes} only on AA edges due to precision drop.
BUG=skia:
Review URL: https://codereview.chromium.org/1132853005
For SSE, Sk4px is better than Sk4f is better than SkXfermodes_opts_SSE2 (where implemented).
For NEON, Sk4px is better than SkXfermodes_opts_arm_neon is better than Sk4f (where implemented).
This is a 1.6-1.9x speedup for Plus, Modulate, and Screen for NEON.
BUG=skia:
Review URL: https://codereview.chromium.org/1128053004
Xfermode_Plus runs 4-5x faster.
We expect mixed_xfermodes to have a small diff. This is because kFoldCoverageIntoSrcAlpha was incorrectly set to true.
This implementation handily beats the Sk4f impl, the portable impl, and the existing SSE2 impl. Reading the SkXfermodes_opts_SSE2.cpp file, I'm pretty confident that we'll be able to beat all SSE2 impls.
I believe this impl will beat or match the existing NEON impl too, but that may not be true for more complicated xfermodes. The NEON impls can take advantage of cheaply transposing ARGBARGB... to AAAARRRR..., and I haven't yet figured out an abstraction for that which doesn't screw SSE.
Adds:
- MapDstSrc() to Sk4px
- saturatedAdd() to SkNi (only implemented as far as it's used).
- div255Narrow()
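To make the saturating step concrete, here is a scalar sketch of Plus on one 32-bit pixel, assuming Plus means per-channel saturating addition of premultiplied 8-bit values. The free functions saturatedAddU8() and plusPixel() are hypothetical helpers for this sketch; the real code does this on SkNi vectors via Sk4px, and AA coverage is not modeled here.

```cpp
#include <algorithm>
#include <cstdint>

// Adds two 8-bit channels, clamping at 255 instead of wrapping around.
inline uint8_t saturatedAddU8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(std::min(a + b, 255));
}

// Plus on one 8888 pixel: each of the four channels is a saturating add.
inline uint32_t plusPixel(uint32_t src, uint32_t dst) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t s = static_cast<uint8_t>((src >> shift) & 0xFF);
        uint8_t d = static_cast<uint8_t>((dst >> shift) & 0xFF);
        out |= static_cast<uint32_t>(saturatedAddU8(s, d)) << shift;
    }
    return out;
}
```

A vector version would do all sixteen channels of four pixels in a single saturating-add instruction, which is presumably where most of the speedup comes from.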
BUG=skia:
Review URL: https://codereview.chromium.org/1138893002
This fixes every case where virtual and SK_OVERRIDE were on the same line,
which should be the bulk of cases. We'll have to manually clean up the rest
over time unless I level up in regexes.
for f in (find . -type f); perl -p -i -e 's/virtual (.*)SK_OVERRIDE/\1SK_OVERRIDE/g' $f; end
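To show what the substitution does to a single line, here is the same pattern applied with std::regex. dropRedundantVirtual() is a hypothetical helper written for this illustration only; the actual change was made with the perl one-liner above.

```cpp
#include <regex>
#include <string>

// Applies the same substitution as the perl command to one line of source:
// drops "virtual " when SK_OVERRIDE appears later on the same line.
inline std::string dropRedundantVirtual(const std::string& line) {
    static const std::regex kPattern("virtual (.*)SK_OVERRIDE");
    return std::regex_replace(line, kPattern, "$1SK_OVERRIDE");
}
```

A declaration such as `virtual void onDraw(SkCanvas*) SK_OVERRIDE;` loses its `virtual`, while a declaration whose SK_OVERRIDE sits on a following line does not match and is left alone, which is why some cases still need manual cleanup.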
BUG=skia:
Review URL: https://codereview.chromium.org/806653007
Reason for revert:
Breaks many GMs.
Original issue's description:
> Make all blending up to GrOptDrawState be handled by the xp/xp factory.
>
> In this cl the blending information is extracted for the xp and stored in the ODS
> which is then used as it currently is. In the follow up cl, an XP backend will be added
> and at that point all blending work will take place inside XP's.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/7c66342a399b529634bed0fabfaa562db2c0dbd4
TBR=bsalomon@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/766653008
In this cl the blending information is extracted for the xp and stored in the ODS
which is then used as it currently is. In the follow up cl, an XP backend will be added
and at that point all blending work will take place inside XP's.
BUG=skia:
Review URL: https://codereview.chromium.org/759713002