patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001)
This looks quite launchable. radial_gradient3, min of 100 samples:
N5: 985µs -> 946µs
MBP: 395µs -> 279µs
On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time. Is that something we could do in float math rather than with a lookup table?
BUG=skia:
CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Mac10.8-Clang-Arm7-Debug-Android-Trybot,Build-Ubuntu-GCC-Arm7-Release-Android_NoNeon-Trybot
Committed: https://skia.googlesource.com/skia/+/abf6c5cf95e921fae59efb487480e5b5081cf0ec
Review URL: https://codereview.chromium.org/1109643002
Reason for revert:
compile failures.
Original issue's description:
> Mike's radial gradient CL with better float -> int.
>
> patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001)
>
> This looks quite launchable. radial_gradient3, min of 100 samples:
> N5: 985µs -> 946µs
> MBP: 395µs -> 279µs
>
> On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time. Is that something we could do in float math rather than with a lookup table?
>
> BUG=skia:
>
> CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Debug-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Debug-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/abf6c5cf95e921fae59efb487480e5b5081cf0ecTBR=reed@google.com,robertphillips@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1109883003
patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001)
This looks quite launchable. radial_gradient3, min of 100 samples:
N5: 985µs -> 946µs
MBP: 395µs -> 279µs
On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time. Is that something we could do in float math rather than with a lookup table?
BUG=skia:
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Debug-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Debug-Trybot
Review URL: https://codereview.chromium.org/1109643002
Move line dashing logic from GrContext::drawPath to
GrDashLinePathRenderer. This makes it possible to let path renderers render arbitrary dashed paths.
End goal is to implement dashing in GrStencilAndCoverPathRenderer.
Review URL: https://codereview.chromium.org/1100073003
Chrome wants to call this more often, and it's quite slow today.
Seems like this could be clearer if SkPictureUtils::ApproxBytesUsed() were SkPicture::approxBytesUsed().
BUG=chromium:471873
Review URL: https://codereview.chromium.org/1090943004
More nullptr checks for factories I have added.
Other checks more Yoda-like I have made. (Skia style this is.)
BUG=skia:
Review URL: https://codereview.chromium.org/1086393004
Extended tests (150M+) run to completion in release in about 6 minutes; the standard test suite exceeds 100K and finishes in a few seconds on desktops.
TBR=reed
BUG=skia:3588
Review URL: https://codereview.chromium.org/1037953004
We only embed images with YUV planes. That should only grab the
subset of color JPEGs supported by PDF.
BUG=skia:3180
Review URL: https://codereview.chromium.org/1025773002
Enables basic decoding for jpegs
Includes rewinding
565, YUV, and Jpeg encoding are not yet implemented
BUG=skia:3257
Review URL: https://codereview.chromium.org/1076923002
These will underly the SkPMFloat-like class for uint16_t components.
Sk4h will back a single-pixel version, and Sk8h any larger number than that.
BUG=skia:
Review URL: https://codereview.chromium.org/1088883005
The bmp codec currently returns kIncompleteInput
when the stream is truncated, which we treat as a
partial success. However, we neglect the fill the
remaining pixels in the image, leaving these
uninitialized.
This CL addresses this problem by initializing the
remaining pixels in the image to default values.
BUG=skia:3257
Review URL: https://codereview.chromium.org/1075243003
We may want to enable swizzles to 565
for images that are encoded in a format
similar to 565, however, we do not want
to take images that decode naturally to
kN32 and then convert them to 565.
***Enable swizzles to kIndex_8. For images
encoded in a color table format, we suggest
that they be decoded to kIndex_8. When we
decode, we only allow conversion to kIndex_8
if it matches the suggested color type (except
wbmp which seems good as is).
***Modify dm to test images that decode to
kIndex_8.
BUG=skia:3257
BUG=skia:3440
Review URL: https://codereview.chromium.org/1055743003
This rearranges the record pointers and types so they can go in a single array, then preallocates some space for them and for the SkVarAlloc.
picture_overhead_draw bench drops from ~1000ns to 500-600ns, with no effect on picture_overhead_nodraw.
I don't see any significant effect on large picture recording times from our .skps.
BUG=chromium:470553
Committed: https://skia.googlesource.com/skia/+/e2dd9408cd711777afaa9410427fb0d761ab004a
Review URL: https://codereview.chromium.org/1061783002
This rearranges the record pointers and types so they can go in a single array, then preallocates some space for them and for the SkVarAlloc.
picture_overhead_draw bench drops from ~1000ns to 500-600ns, with no effect on picture_overhead_nodraw.
I don't see any significant effect on large picture recording times from our .skps.
BUG=chromium:470553
Review URL: https://codereview.chromium.org/1061783002
SkCodec::NewFromStream claims to delete the passed in SkStream on
failure. This allows the caller to pass an SkStream to the function
and not worry about deleting it depending on the return value.
Most of our SkCodecs did not honor this contract though. Update them
to delete the stream on failure. Further, update SkCodec::NewFromStream
to delete the stream if it did not match any subclass, and delete the
SkCodec if we decided to return NULL because it was too big.
Add a test which tests streams which represent the beginnings of
supported format types but do not contain enough data to create an
SkCodec. The interesting part of the test is when we run it on ASAN,
which will report that we leaked something without the other changes.
BUG=skia:3257
Review URL: https://codereview.chromium.org/1058873006
Add a virtual method on SkStream which will do a "peek" some bytes, so
that those bytes are read, but the next call to read will be
unaffected.
Implement peek for SkMemoryStream, where the implementation is simple
and obvious.
Implement peek on SkFrontBufferedStream.
Add tests.
Motivated by decoding streams which cannot be rewound.
TBR=reed@google.com
BUG=skia:3257
Review URL: https://codereview.chromium.org/1044953002
This mirrors the behavior in onGetPixels, and allows the implementation
to share code for handling calls to rewindIfNeeded.
This also fixes a bug where getScanlineDecoder was calling
rewindIfNeeded and treating the result as a bool.
In SkPngCodec, factor out the code to call rewindIfNeeded, and call it
in both onGetPixels and onGetScanlineDecoder.
Update the test to include testing the scanline decoder. Rename "gen"
to "codec" now that it must be an SkCodec.
BUG=skia:3257
Depends on https://codereview.chromium.org/1048423003/ (DIFFERENT ISSUE).
Review URL: https://codereview.chromium.org/1050893002
Separate out the code for reading the header, and use it to reinitialize
fPng_ptr and fInfo_ptr after a rewind.
Use common code to clean up fPng_ptr and fInfo_ptr, and set them to
NULL and treat them as NULL as appropriate.
Update the test to expect SkPngCodec to succeed.
BUG=skia:3257
Review URL: https://codereview.chromium.org/1048423003
Let's start with baby steps in case some bot can't handle this.
I have left many TODOs, most of which I know how to do if this
looks feasible and useful.
BUG=skia:
Review URL: https://codereview.chromium.org/1049223003
Motivation: Keep separate features separate. Also, future
linearization work will need to have several objNumMap
objects share a substituteMap. Also "catalog" has a
specific meaning in PDF. This catalog did not map to that
catalog.
- Modify SkPDFObject::emitObject and SkPDFObject::addResources
interface to requiore SkPDFObjNumMap and SkPDFSubstituteMap.
- SkPDFObjNumMap const in SkPDFObject::emitObject.
- Remove SkPDFCatalog.cpp/.h
- Modify SkDocument_PDF.cpp to use new functions
- Fold in SkPDFStream::populate
- Fold in SkPDFBitmap::emitDict
- Move SkPDFObjNumMap and SkPDFSubstituteMap to SkPDFTypes.h
- Note (via assert) that SkPDFArray & SkPDFDict don't need to
check substitutes.
- Remove extra space from SkPDFDict serialization.
- SkPDFBitmap SkPDFType0Font SkPDFGraphicState SkPDFStream
updated to new interface.
- PDFPrimitivesTest updated for new interface.
BUG=skia:3585
Review URL: https://codereview.chromium.org/1049753002
The primary feature this delivers is SkNf and SkNd for arbitrary power-of-two N. Non-specialized types or types larger than 128 bits should now Just Work (and we can drop in a specialization to make them faster). Sk4s is now just a typedef for SkNf<4, SkScalar>; Sk4d is SkNf<4, double>, Sk2f SkNf<2, float>, etc.
This also makes implementing new specializations easier and more encapsulated. We're now using template specialization, which means the specialized versions don't have to leak out so much from SkNx_sse.h and SkNx_neon.h.
This design leaves us room to grow up, e.g to SkNf<8, SkScalar> == Sk8s, and to grown down too, to things like SkNi<8, uint16_t> == Sk8h.
To simplify things, I've stripped away most APIs (swizzles, casts, reinterpret_casts) that no one's using yet. I will happily add them back if they seem useful.
You shouldn't feel bad about using any of the typedef Sk4s, Sk4f, Sk4d, Sk2s, Sk2f, Sk2d, Sk4i, etc. Here's how you should feel:
- Sk4f, Sk4s, Sk2d: feel awesome
- Sk2f, Sk2s, Sk4d: feel pretty good
No public API changes.
TBR=reed@google.com
BUG=skia:3592
Review URL: https://codereview.chromium.org/1048593002
Need to land SK_SUPPORT_LEGACY_SCALAR_MAPPOINTS in chrome to suppress Affine
version which causes slight differences (which will need to be rebaselined)
BUG=skia:
Review URL: https://codereview.chromium.org/1045493002
Add and test trunc(), which is what get() used to be before rounding.
Using trunc() is a ~40% speedup on our linear gradient bench.
#neon #floats
BUG=skia:3592
#n5
#n9
CQ_INCLUDE_TRYBOTS=client.skia.android:Test-Android-Nexus5-Adreno330-Arm7-Debug-Trybot;client.skia.android:Test-Android-Nexus9-TegraK1-Arm64-Release-Trybot
Review URL: https://codereview.chromium.org/1032243002
Replace the implicit curve intersection with a geometric curve intersection. The implicit intersection proved mathematically unstable and took a long time to zero in on an answer.
Use pointers instead of indices to refer to parts of curves. Indices required awkward renumbering.
Unify t and point values so that small intervals can be eliminated in one pass.
Break cubics up front to eliminate loops and cusps.
Make the Simplify and Op code more regular and eliminate arbitrary differences.
Add a builder that takes an array of paths and operators.
Delete unused code.
BUG=skia:3588
R=reed@google.com
Review URL: https://codereview.chromium.org/1037573004
There is no reason to require the 4 SkPMFloats (registers) to be adjacent.
The only potential win in loads and stores comes from the SkPMColors being adjacent.
Makes no difference to existing bench.
BUG=skia:
Review URL: https://codereview.chromium.org/1035583002
SkPDFCatalog:
- remove first-page-specific code (no longer needed, never
used) (e.g. addObject()).
- Make use of SkHashMap for lookups (simplifies code).
- inline all small methods
- emitXrefTable moved to SkPDFDocument.cpp
- no longer store offsets in this data structure (moved to
SkPDFDocument.cpp)
- setFileOffset gone.
- own substitute refs directly.
SkPDFDocument::EmitPDF()
- All sites that call into SkPDFCatalog modified.
- catalog.addObject only called in a single place, after the
resouceSet is built
- offsets moved to local array.
SkPDFPage:
- finalizePage no longer deals with SkPDFCatalog or resource sets.
- GeneratePageTree no longer deals with SkPDFCatalog
SkPDFObjRef
- emitObject respects the substitution map
Unit Tests:
- respect SkPDFCatalog::addObject signature change.
SkTHash:
- #include SkChecksum for SkGoodHash
- Copyright notice added
Review URL: https://codereview.chromium.org/1033543002
This seems to fix the miscompilation bug on ARM64 / Release / GCC 4.9.
We switched this over originally for perf issues with NEON, but I can't see any now. Will keep an eye out.
BUG=skia:3570
Review URL: https://codereview.chromium.org/1026403002
SkMatrix::mapPts() using aacc/bbdd was always worse than using badc():
- On Intel, it was faster than exisiting swizzle, but badc() is 10% faster still (one pshufd instead of two).
- On ARM, existing swizzle < badc() < aacc()+bbdd(), even though aacc() then bbdd() is really a single vtrn instruction.
I will revert SkMatrix.cpp before submitting. Just thought you might like to look.
Will think more and try to gear up Instruments on ARM.
BUG=skia:
Review URL: https://codereview.chromium.org/1012573003
This removes all the existing Sk4x swizzles and adds badc(), which is
both fast on all implementations and currently useful.
BUG=skia:
Review URL: https://codereview.chromium.org/997353005
We don't have control over which way _mm_cvtps_epi32 rounds.
- This makes the SSE SkPMFloat rounding consistent with _neon and _none.
- Sk4f::cast<Sk4i>() is closer to (int)float's behavior. (Correct when >=0).
Add tests that would fail at head.
BUG=skia:
Review URL: https://codereview.chromium.org/1029163002
Do not playback pending commands for full deferred canvas writePixels.
Changes the test to catch cases where discard is done without
a snapshot.
Review URL: https://codereview.chromium.org/939103002
- By default, use new SkGoodHash to hash keys, which is:
* for 4 byte values, use SkChecksum::Mix,
* for SkStrings, use SkChecksum::Murmur3 on the data,
* for other structs, shallow hash the struct with Murmur3.
- Expand SkChecksum::Murmur3 to support non-4-byte-aligned data.
- Add const foreach() methods.
- Have foreach() take a functor, which allows lambdas.
BUG=skia:
Review URL: https://codereview.chromium.org/1021033002
The bench improves from 39 to 30, about half from porting to Sk2f, half from
x.add(x) instead of x.multiply(two).
Remove Sk4f Load2/store2 now that we have Sk2f.
BUG=skia:
Review URL: https://codereview.chromium.org/1019773004
(This is essentially a revert of https://codereview.chromium.org/503833002/.)
This was necessary back when SkPaint was flattened even for in-process use. Now that we only flatten SkPaint for cross-process use, there's no need to serialize UniqueIDs.
Note: SkDropShadowImageFilter is being constructed with a croprect and UniqueID (of 0) in Blink. I've made the uniqueID param default to 0 temporarily, until this rolls in and Blink can be changed. (Blink can't be changed first, since unlike the other filters, there's no constructor that takes a cropRect but not a uniqueID.)
BUG=skia:
Review URL: https://codereview.chromium.org/1019493002
malloc_usable_size does not exist in UCLIBC, so fall back to just
returning 0 for SkVarAlloc::heap_size().
BUG=skia:
Review URL: https://codereview.chromium.org/1006073003
SK_BUILD_FOR_WIN is no longer a valid way to check for building on
Windows (go figure). Build everywhere.
Remove the REPORTER_ASSERT, which was the failing part. It also isn't
necessary for the test, which is just that we are not leaking an
SkColorTable.
Also fix indentation.
TBR=bungeman@google.com,mtklein@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:3457
Review URL: https://codereview.chromium.org/1002583002
Before the fix, we could use a stale cache of the clipbounds in quickReject. Often this could return false negatives, meaning we would try to draw more than we should (it would eventually be really clipped). Occasionally this could also report false positives (if the layer were outside of the normal canvas bounds, e.g. a layer with an offset imagefilter).
BUG=skia:
NOTREECHECKS=True
Review URL: https://codereview.chromium.org/983243003
Clarify asColorFilter ...
1. Rename to isColorFilterNode for DAG reduction
2. Add asAColorFilter for removing the imagefilter entirely (future use-case)
Need layouttest rebaseline suppression before this can land in chrome...
https://codereview.chromium.org/984023004/
BUG=skia:
Review URL: https://codereview.chromium.org/982933002
When computed, the RTree for an SkPicture will have a root
bounds that reflects the best bounding information available,
rather than the best estimate at the time the picture recorder
is created. Given that creators frequently don't know ahead of
time what will be drawn, the RTree bound is often tighter.
Perf testing on Chrome indicates a small raster performance
advantage. For upcoming painting changes in Chrome the
performance advantage is much larger.
BUG=
Committed: https://skia.googlesource.com/skia/+/2dd3b6647dc726f36fd8774b3d0d2e83b493aeac
Review URL: https://codereview.chromium.org/971803002
The tests assert on Sk_ColorGREEN, which is in BRGA non-premul. Read the
pixels in same color format.
Currently the tests pass on all platforms because GREEN is fully opaque
and the component stays in the same place both on BGRA and
RGBA. However, hypothetically somebody could copy-paste the assertion
for further tests but use, say, RED. This creates latent error that is
only visible on some platforms like mac.
Review URL: https://codereview.chromium.org/989463002
Reason for revert:
Need to suppress these for rebaselining, so reverting for now.
Regressions: Unexpected image-only failures (5)
css3/filters/effect-brightness-clamping-hw.html [ ImageOnlyFailure ]
css3/filters/effect-combined-hw.html [ ImageOnlyFailure ]
virtual/slimmingpaint/css3/filters/effect-brightness-clamping-hw.html [ ImageOnlyFailure ]
virtual/slimmingpaint/css3/filters/effect-combined-hw.html [ ImageOnlyFailure ]
Original issue's description:
> Use ComposeColorFilter in factory to collapse consecutive filters (when possible).
> Change asColorFilter to reflect its reliance on the new factory behavior.
>
> patch from issue 967143002 at patchset 80001 (http://crrev.com/967143002#ps80001)
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/dac843bf046c2cd79fd955cb177aee241d7a4b0cTBR=senorblanco@chromium.org,robertphillips@google.com,bsalomon@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/978923005
Reason for revert:
Might be breaking deps roll
Original issue's description:
> Update SkPicture cull rects with RTree information
>
> When computed, the RTree for an SkPicture will have a root
> bounds that reflects the best bounding information available,
> rather than the best estimate at the time the picture recorder
> is created. Given that creators frequently don't know ahead of
> time what will be drawn, the RTree bound is often tighter.
>
> Perf testing on Chrome indicates a small raster performance
> advantage. For upcoming painting changes in Chrome the
> performance advantage is much larger.
>
> BUG=
>
> Committed: https://skia.googlesource.com/skia/+/2dd3b6647dc726f36fd8774b3d0d2e83b493aeacTBR=mtklein@google.com,schenney@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=
Review URL: https://codereview.chromium.org/977413003
Please see if this looks usable. It may even give a perf boost if you use it, even without custom implementations for each instruction set.
I've been trying this morning to beat this naive loop implementation, but so far no luck with either _SSE2.h or _SSSE3.h. It's possible this is an artifact of the microbenchmark, because we're not doing anything between the conversions. I'd like to see how this fits into real code, what assembly's generated, what the hot spots are, etc.
I've updated the tests to test these new APIs, and splintered off a pair of new benchmarks that use the new APIs. This required some minor rejiggering in the benches.
BUG=skia:
Review URL: https://codereview.chromium.org/978213003
When computed, the RTree for an SkPicture will have a root
bounds that reflects the best bounding information available,
rather than the best estimate at the time the picture recorder
is created. Given that creators frequently don't know ahead of
time what will be drawn, the RTree bound is often tighter.
Perf testing on Chrome indicates a small raster performance
advantage. For upcoming painting changes in Chrome the
performance advantage is much larger.
BUG=
Review URL: https://codereview.chromium.org/971803002
Reason for revert:
Fails on mac for some reason.
Also is a bit wrong, but this should not be reason for the failure..
Original issue's description:
> Add image as a draw type that can be filtered
>
> Add image as a draw type that can be filtered.
>
> This is needed when SkImage is added as an object to be drawn so that
> the draw is forwarded to SkBaseDevice. This would be used in making
> filters use SkImages.
>
> BUG=skia:3388
>
> Committed: https://skia.googlesource.com/skia/+/fa77eb1e51b9317ff993d1be504ada173b561e5fTBR=reed@google.com,bsalomon@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:3388
Review URL: https://codereview.chromium.org/980273002
Add image as a draw type that can be filtered.
This is needed when SkImage is added as an object to be drawn so that
the draw is forwarded to SkBaseDevice. This would be used in making
filters use SkImages.
BUG=skia:3388
Review URL: https://codereview.chromium.org/960783003
Instead of set(SkPMColor), add a constructor SkPMFloat(SkPMColor).
Replace setA(), setR(), etc. with a 4 float constructor.
And, promise to stick to SkPMColor order.
BUG=skia:
Review URL: https://codereview.chromium.org/977773002
SSE rounds for free (that was a happy accident: they also have a truncating version).
NEON does not, nor obviously the portable code, so they add 0.5 before truncating.
NOPRESUBMIT=true
BUG=skia:
Review URL: https://codereview.chromium.org/974643002
This pushes the cost of the *255 and *1/255 conversions onto only those code
paths that need it. We're not doing it any more efficiently than can be done
with Sk4f.
In microbenchmark isolation, this is about a 15% speedup.
BUG=skia:
NOPRESUBMIT=true
Review URL: https://codereview.chromium.org/973603002
If a quad, cubic, or conic goes back on itself, assume it's not convex.
In a future CL, we could check to see if the curve is linear so that
linear curves are treated the same as lines.
BUG=skia:3469
Review URL: https://codereview.chromium.org/971773002