This switches when we build our own zlib from "just Windows" to "everyone, but
not Android framework of course".
I tested this by building DM for my Mac and for an Android bot config.
It took minor tweaks to the GYP to get ARM builds working.
BUG=skia:
Review URL: https://codereview.chromium.org/971673005
The tests assert on Sk_ColorGREEN, which is in BRGA non-premul. Read the
pixels in same color format.
Currently the tests pass on all platforms because GREEN is fully opaque
and the component stays in the same place both on BGRA and
RGBA. However, hypothetically somebody could copy-paste the assertion
for further tests but use, say, RED. This creates latent error that is
only visible on some platforms like mac.
Review URL: https://codereview.chromium.org/989463002
Clamping 4 at a time is now about 15% faster than 1 at a time with SSSE3.
Clamping 4 at a time is now about 20% faster with SSE2,
and this applies to non-clamping too (we still just clamp there).
In all cases, 4 at a time is never worse than 1 at a time,
and not clamping is never slower than clamping.
Here's all the bench results, with the numbers for portable code as a fun point
of reference:
SSSE3:
maxrss loops min median mean max stddev samples config bench
10M 2291 4.66ns 4.66ns 4.66ns 4.68ns 0% ▆█▁▁▁▇▁▇▁▃ nonrendering SkPMFloat_get_1x
10M 2040 5.29ns 5.3ns 5.3ns 5.32ns 0% ▃▆▃▃▁▁▆▃▃█ nonrendering SkPMFloat_clamp_1x
10M 7175 4.62ns 4.62ns 4.62ns 4.63ns 0% ▁▄▃████▃▄▇ nonrendering SkPMFloat_get_4x
10M 5801 4.89ns 4.89ns 4.89ns 4.91ns 0% █▂▄▃▁▃▄█▁▁ nonrendering SkPMFloat_clamp_4x
SSE2:
maxrss loops min median mean max stddev samples config bench
10M 1601 6.02ns 6.05ns 6.04ns 6.08ns 0% █▅▄▅▄▂▁▂▂▂ nonrendering SkPMFloat_get_1x
10M 2918 6.05ns 6.06ns 6.05ns 6.06ns 0% ▂▇▁▇▇▁▇█▇▂ nonrendering SkPMFloat_clamp_1x
10M 3569 5.43ns 5.45ns 5.44ns 5.45ns 0% ▄█▂██▇▁▁▇▇ nonrendering SkPMFloat_get_4x
10M 4168 5.43ns 5.43ns 5.43ns 5.44ns 0% █▄▇▁▇▄▁▁▁▁ nonrendering SkPMFloat_clamp_4x
Portable:
maxrss loops min median mean max stddev samples config bench
10M 500 27.8ns 28.1ns 28ns 28.2ns 0% ▃█▆▃▇▃▆▁▇▂ nonrendering SkPMFloat_get_1x
10M 770 40.1ns 40.2ns 40.2ns 40.3ns 0% ▅▁▃▂▆▄█▂▅▂ nonrendering SkPMFloat_clamp_1x
10M 1269 28.4ns 28.8ns 29.1ns 32.7ns 4% ▂▂▂█▂▁▁▂▁▁ nonrendering SkPMFloat_get_4x
10M 1439 40.2ns 40.4ns 40.4ns 40.5ns 0% ▆▆▆█▁▆▅█▅▆ nonrendering SkPMFloat_clamp_4x
SkPMFloat_neon.h is still one big TODO as far as 4-at-a-time APIs go.
BUG=skia:
Review URL: https://codereview.chromium.org/982123002
Reason for revert:
Need to suppress these for rebaselining, so reverting for now.
Regressions: Unexpected image-only failures (5)
css3/filters/effect-brightness-clamping-hw.html [ ImageOnlyFailure ]
css3/filters/effect-combined-hw.html [ ImageOnlyFailure ]
virtual/slimmingpaint/css3/filters/effect-brightness-clamping-hw.html [ ImageOnlyFailure ]
virtual/slimmingpaint/css3/filters/effect-combined-hw.html [ ImageOnlyFailure ]
Original issue's description:
> Use ComposeColorFilter in factory to collapse consecutive filters (when possible).
> Change asColorFilter to reflect its reliance on the new factory behavior.
>
> patch from issue 967143002 at patchset 80001 (http://crrev.com/967143002#ps80001)
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/dac843bf046c2cd79fd955cb177aee241d7a4b0cTBR=senorblanco@chromium.org,robertphillips@google.com,bsalomon@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/978923005
This blacklist entry bans any test with 'pdf' config, any source type, whose
name has '.webp' in it. In practice, that's 'image' or 'subset' source type
decoding some WEBP file.
BUG=skia:3505
Review URL: https://codereview.chromium.org/982163002
Reason for revert:
Might be breaking deps roll
Original issue's description:
> Update SkPicture cull rects with RTree information
>
> When computed, the RTree for an SkPicture will have a root
> bounds that reflects the best bounding information available,
> rather than the best estimate at the time the picture recorder
> is created. Given that creators frequently don't know ahead of
> time what will be drawn, the RTree bound is often tighter.
>
> Perf testing on Chrome indicates a small raster performance
> advantage. For upcoming painting changes in Chrome the
> performance advantage is much larger.
>
> BUG=
>
> Committed: https://skia.googlesource.com/skia/+/2dd3b6647dc726f36fd8774b3d0d2e83b493aeacTBR=mtklein@google.com,schenney@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=
Review URL: https://codereview.chromium.org/977413003
The core of the problem is that the system is asked to lookup the metrics for a character with id == 0. This causes a hit in the fCharToGlyphHash matching the sentinel glyph. This happens because fCharToGlpyhHash is initialized with all zeros, therefore, the fID is zero matching the char with id == 0. The fAdvanceX field of the sentinel glyph is in fact not initialized.
The bigger question is now did a zero character get passed to getUnicharMetrics?
The breaking code is basically as follows:
wchar_t glyph = L'S';
paint.measureText(&glyph, 2);
This get mischaracterized as a utf8 string instead of a utf16(?) string. Because of the little endian ordering, this is the character string 'L' '\0'. Since the size of the original string is two bytes (but a single character) the '\0' is treated as its own character and past to getUnicharMetrics.
TEST:
On windows failed using DrMemory. With this change does not fail.
BUG=463204
Review URL: https://codereview.chromium.org/977063002
Also allow incomplete to be considered successful.
Do not attempt to draw transparent images on 565.
BUG=skia:3475
Review URL: https://codereview.chromium.org/978413002
Please see if this looks usable. It may even give a perf boost if you use it, even without custom implementations for each instruction set.
I've been trying this morning to beat this naive loop implementation, but so far no luck with either _SSE2.h or _SSSE3.h. It's possible this is an artifact of the microbenchmark, because we're not doing anything between the conversions. I'd like to see how this fits into real code, what assembly's generated, what the hot spots are, etc.
I've updated the tests to test these new APIs, and splintered off a pair of new benchmarks that use the new APIs. This required some minor rejiggering in the benches.
BUG=skia:
Review URL: https://codereview.chromium.org/978213003
The main thrust of this CL is to remove the currCmdMarker variable from GrTargetCommands::flush. In a MultiDrawBuffer world the Cmds need not be execute in the order of their issuance.
Review URL: https://codereview.chromium.org/978363002
Tasks that produce a non-fatal error will bail out before writing their output to
disk and hash to dm.json, but not count as failures.
This also makes true failures bail out before writing their results. If the DM
program failed, we probably don't want to triage that image result.
We use this new feature first to skip image subset decoding when we detect it's
not supported. Here's a snippet of an example run, where in this case only
.webp are subset decodable:
...
( 15MB 12) 172µs 8888 subset color_wheel.jpg (skipped: Subset decoding not supported.)
( 15MB 11) 9.05ms 8888 subset randPixels.webp
( 16MB 10) 863µs 8888 subset baby_tux.png (skipped: Subset decoding not supported.)
...
Only outputs corresponding to the .webp show up, both on disk and in the .json.
BUG=skia:
Review URL: https://codereview.chromium.org/980333002
Make a Via for DM which transforms a set of draws to be more like what
we'd see through the Android Framework's HWUI API. Only built inside
Android's framework because we depend on HWUI classes for half of
those transformations.
Tested with --config androidsdk-8888 and --config androidsdk-hwui.
R=djsollen@google.com,mtklein@google.com,reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/974913002
When computed, the RTree for an SkPicture will have a root
bounds that reflects the best bounding information available,
rather than the best estimate at the time the picture recorder
is created. Given that creators frequently don't know ahead of
time what will be drawn, the RTree bound is often tighter.
Perf testing on Chrome indicates a small raster performance
advantage. For upcoming painting changes in Chrome the
performance advantage is much larger.
BUG=
Review URL: https://codereview.chromium.org/971803002
Reason for revert:
Fails on mac for some reason.
Also is a bit wrong, but this should not be reason for the failure..
Original issue's description:
> Add image as a draw type that can be filtered
>
> Add image as a draw type that can be filtered.
>
> This is needed when SkImage is added as an object to be drawn so that
> the draw is forwarded to SkBaseDevice. This would be used in making
> filters use SkImages.
>
> BUG=skia:3388
>
> Committed: https://skia.googlesource.com/skia/+/fa77eb1e51b9317ff993d1be504ada173b561e5fTBR=reed@google.com,bsalomon@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:3388
Review URL: https://codereview.chromium.org/980273002
Add image as a draw type that can be filtered.
This is needed when SkImage is added as an object to be drawn so that
the draw is forwarded to SkBaseDevice. This would be used in making
filters use SkImages.
BUG=skia:3388
Review URL: https://codereview.chromium.org/960783003
Instead of set(SkPMColor), add a constructor SkPMFloat(SkPMColor).
Replace setA(), setR(), etc. with a 4 float constructor.
And, promise to stick to SkPMColor order.
BUG=skia:
Review URL: https://codereview.chromium.org/977773002
We've run into several places where GPU rasterization slows down a lot,
and in some cases, it's due to use reaching skia's budget. This shows a
graph of skia's used and free budgeted memory.
Review URL: https://codereview.chromium.org/977143002
With SSSE3, we can use the Swiss Army Knife byte shuffler pshufb,
a.k.a. _mm_shuffle_epi8(), to jump directly between 32 and 128 bits.
In microbench isolation, this looks like an additional 10-15% speedup:
SkPMFloat_get: 2.35ns -> 1.98ns
SkPMFloat_clamp: 2.35ns -> 2.18ns
Before this CL, get() and clamp() were identical code. The _get benchmark improves because both set() and get() become faster; the _clamp benchmark shows the improvement from set() getting faster with clamp() staying the same.
BUG=skia:
Review URL: https://codereview.chromium.org/976493002
Reason for revert:
Speculative revert to see if this unblocks the deps roll
Original issue's description:
> Adding linear interpolation to rgb->yuv conversion
>
> When the UV planes are smaller than the Y plane, doing the upscaling in nearest mode was creating artefacts, so I changed it to use linear interpolation to fix the issue.
>
> BUG=460380
>
> Committed: https://skia.googlesource.com/skia/+/cd9d42c5167a50f1bf20e969343556d61354171bTBR=bsalomon@google.com,scroggo@google.com,reed@google.com,sugoi@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=460380
Review URL: https://codereview.chromium.org/977133002