Chrome would like to perform cpu-side preprocessing for gpu draws in parallel.
They do not want to go through a picture (since they have their own display list format).
The general idea is that we add a new SkDeferredDisplayListRecorder class to
perform all of Ganesh's cpu-side preprocessing ahead of time and in parallel.
The SkDDLRecorder operates like SkPictureRecorder. The user can get an SkCanvas
from the SkDDLRecorder and feed it draw operations. Once finished, the user
calls 'detach' to get an SkDeferredDisplayList. All the work up to and
including the 'detach' call can be done in parallel and will not touch
the GPU. To actually get pixels, the client must call SkSurface::draw(SkDDL)
on an SkSurface that is "compatible" with the surface characterization
initially given to the SkDDLRecorder.
The surface characterization contains the minimum amount of information Ganesh needs
to know about the ultimate destination in order to perform its cpu-side work
(i.e., caps, width, height, config).
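A minimal sketch of the record-in-parallel, replay-on-one-thread flow described above. It deliberately does not use Skia: `Characterization`, `Recorder`, `DisplayList`, and `Surface` are hypothetical stand-ins for the surface characterization, SkDDLRecorder, SkDeferredDisplayList, and SkSurface, and the recorded "ops" are just strings.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the surface characterization: the minimum the
// recorder needs to know about the eventual destination.
struct Characterization {
    int width;
    int height;
    bool operator==(const Characterization& o) const {
        return width == o.width && height == o.height;
    }
};

// Hypothetical stand-in for SkDeferredDisplayList: preprocessed, GPU-free work.
struct DisplayList {
    Characterization target;
    std::vector<std::string> ops;
};

// Hypothetical stand-in for SkDDLRecorder.
class Recorder {
public:
    explicit Recorder(Characterization c) : fTarget(c) {}
    void drawRect(int x, int y, int w, int h) {
        fOps.push_back("rect " + std::to_string(x) + "," + std::to_string(y) +
                       " " + std::to_string(w) + "x" + std::to_string(h));
    }
    // Hands the recorded work off; the recorder is left empty.
    std::unique_ptr<DisplayList> detach() {
        auto ddl = std::make_unique<DisplayList>(DisplayList{fTarget, std::move(fOps)});
        fOps.clear();
        return ddl;
    }
private:
    Characterization fTarget;
    std::vector<std::string> fOps;
};

// Hypothetical stand-in for SkSurface: the only place "GPU" work happens.
class Surface {
public:
    explicit Surface(Characterization c) : fTarget(c) {}
    // Refuses display lists recorded for an incompatible destination.
    bool draw(const DisplayList& ddl) {
        if (!(ddl.target == fTarget)) return false;
        fExecuted += ddl.ops.size();
        return true;
    }
    std::size_t executed() const { return fExecuted; }
private:
    Characterization fTarget;
    std::size_t fExecuted = 0;
};

// Records `tiles` display lists on worker threads, then replays them serially.
std::size_t recordInParallelThenDraw(int tiles) {
    const Characterization c{256, 256};
    std::vector<std::unique_ptr<DisplayList>> ddls(tiles);
    std::vector<std::thread> workers;
    for (int i = 0; i < tiles; ++i) {
        workers.emplace_back([&ddls, c, i] {
            Recorder recorder(c);
            recorder.drawRect(i * 10, 0, 10, 10);
            recorder.drawRect(i * 10, 10, 10, 10);
            ddls[i] = recorder.detach();   // each thread writes only its own slot
        });
    }
    for (auto& t : workers) t.join();

    Surface surface(c);
    for (const auto& ddl : ddls) surface.draw(*ddl);
    return surface.executed();
}
```

Calling `recordInParallelThenDraw(4)` records two ops on each of four threads and replays all eight on the single surface; a surface built with a different characterization rejects the list.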
Change-Id: I75faa483ab5a6b779c8de56ea56b9d90b990f43a
Reviewed-on: https://skia-review.googlesource.com/30140
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Bug: skia:
Change-Id: Icd905ea7e60a05bc3903eb85d111dcf73ce2c4dd
Reviewed-on: https://skia-review.googlesource.com/40690
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Re-land of: https://skia-review.googlesource.com/36560
All information needed by the thread is captured by the prepare
callback object; the lambda captures a pointer to that and does the
mask render. Once it's done, it signals the semaphore (also owned by the
callback). The callback defers the semaphore wait even longer (into the
ASAP upload), so the odds of waiting for the thread are REALLY low.
Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.
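The ownership and deferred-wait pattern described above can be sketched roughly as follows. This is not the Ganesh code: `PrepareCallback`, `renderMask`, and the hand-rolled `Semaphore` are hypothetical, and a plain `std::thread` stands in for whatever task system actually runs the work.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Small semaphore built on a mutex and condition variable.
class Semaphore {
public:
    void signal() {
        { std::lock_guard<std::mutex> lock(fMutex); ++fCount; }
        fCond.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(fMutex);
        fCond.wait(lock, [this] { return fCount > 0; });
        --fCount;
    }
private:
    std::mutex fMutex;
    std::condition_variable fCond;
    int fCount = 0;
};

// Hypothetical analogue of the prepare callback object: it owns everything the
// worker needs (inputs, output mask, semaphore), so the lambda only needs a
// pointer to it.
struct PrepareCallback {
    int width = 0;
    int height = 0;
    std::vector<uint8_t> mask;   // written by the worker thread
    Semaphore done;              // signaled when the mask is complete

    // Called late (the "ASAP upload" step): the only place that may block.
    uint32_t upload() {
        done.wait();
        uint32_t sum = 0;
        for (uint8_t v : mask) sum += v;   // stand-in for copying to a texture
        return sum;
    }
};

// Stand-in for the software rasterizer: fills the mask with full coverage.
void renderMask(PrepareCallback* cb) {
    cb->mask.assign(static_cast<std::size_t>(cb->width) * cb->height, 0xFF);
    cb->done.signal();
}

uint32_t renderOnThreadThenUpload(int width, int height) {
    auto cb = std::make_unique<PrepareCallback>();
    cb->width = width;
    cb->height = height;

    PrepareCallback* raw = cb.get();
    std::thread worker([raw] { renderMask(raw); });

    // ... other flush work would happen here while the worker renders ...

    uint32_t result = cb->upload();   // deferred wait
    worker.join();                    // the callback must outlive the worker
    return result;
}
```

Because the wait sits in `upload()` rather than right after the thread is started, the caller only blocks if the mask is still being rendered when the pixels are finally needed.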
Traces of a GM that includes GPU and SW path rendering (path-reverse):
Original:
https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
https://screenshot.googleplex.com/GHjD0U3C34q.png
Bug: skia:
Change-Id: Idb92f385590749f41328a9aec65b2a93f4775079
Reviewed-on: https://skia-review.googlesource.com/40775
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit 76323bc061.
Reason for revert: Breaking NUC bots in threaded gm comparison:
https://chromium-swarm.appspot.com/task?id=382e589753187f10&refresh=10
Original change's description:
> Threaded generation of software paths
>
> All information needed by the thread is captured by the prepare
> callback object; the lambda captures a pointer to that and does the
> mask render. Once it's done, it signals the semaphore (also owned by the
> callback). The callback defers the semaphore wait even longer (into the
> ASAP upload), so the odds of waiting for the thread are REALLY low.
>
> Also did a bunch of cleanup along the way, and put in some trace markers
> so we can monitor how well this is working.
>
> Traces of a GM that includes GPU and SW path rendering (path-reverse):
>
> Original:
> https://screenshot.googleplex.com/f5BG3901tQg.png
> Threaded, with wait in the callback (notice pre flush callback blocking):
> https://screenshot.googleplex.com/htOSZFE2s04.png
> Current version, with wait deferred to ASAP upload function:
> https://screenshot.googleplex.com/GHjD0U3C34q.png
>
> Bug: skia:
> Change-Id: I3d5a230bbd68eb35e1f0574b308485c691435790
> Reviewed-on: https://skia-review.googlesource.com/36560
> Commit-Queue: Brian Osman <brianosman@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
TBR=egdaniel@google.com,mtklein@google.com,bsalomon@google.com,robertphillips@google.com,brianosman@google.com
Change-Id: Icac0918a3771859f671b69ae07ae0fedd3ebb3db
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/38560
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
All information needed by the thread is captured by the prepare
callback object; the lambda captures a pointer to that and does the
mask render. Once it's done, it signals the semaphore (also owned by the
callback). The callback defers the semaphore wait even longer (into the
ASAP upload), so the odds of waiting for the thread are REALLY low.
Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.
Traces of a GM that includes GPU and SW path rendering (path-reverse):
Original:
https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
https://screenshot.googleplex.com/GHjD0U3C34q.png
Bug: skia:
Change-Id: I3d5a230bbd68eb35e1f0574b308485c691435790
Reviewed-on: https://skia-review.googlesource.com/36560
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
We were ignoring the path renderer flag when drawing GMs or SKPs.
Bug: skia:
Change-Id: Iee443fb70f1faec65e46925fa0e3cea3716d448d
Reviewed-on: https://skia-review.googlesource.com/36861
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Change-Id: Ia50661a8391da526d509adbe2d7203866c140b1c
Reviewed-on: https://skia-review.googlesource.com/25321
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Hal Canary <halcanary@google.com>
Change-Id: I395e3387df44cf5370fef6ab73db73228225622f
Reviewed-on: https://skia-review.googlesource.com/23946
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
We have removed support for drawing Index8, so stop testing it in DM.
Bug: skia:6828
Change-Id: Ib2c4d3ebd371be704151a9f956c0ca2aaf2926a6
Reviewed-on: https://skia-review.googlesource.com/21525
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
Motivation: We may want to make SkMultiPictureDocument.h public in the
future.
Change-Id: Ie97b88d51a179c2283155d65bcadee32178115ca
Reviewed-on: https://skia-review.googlesource.com/11402
Commit-Queue: Hal Canary <halcanary@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This reverts commit 73e21af213.
Reason for revert: I will fix the broken bot next week.
Original change's description:
> Revert "Add color spin test for SkColorSpaceXformCanvas"
>
> This reverts commit cb01aec63b.
>
> Reason for revert: Breaks Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Debug-SK_USE_DISCARDABLE_SCALEDIMAGECACHE
>
> Original change's description:
> > Add color spin test for SkColorSpaceXformCanvas
> >
> > Also changes behavior to treat nullptr srcs as sRGB.
> >
> > Testing locally, it looks like 353 gms have no diffs from 8888.
> > There are 269 diffs - some are fine (gms that do color space stuff)
> > and some are bugs.
> >
> > BUG=skia:
> >
> > Change-Id: I55c2825f4f4b857e0b0a0ec050c6db82ac881492
> > Reviewed-on: https://skia-review.googlesource.com/9738
> > Reviewed-by: Brian Osman <brianosman@google.com>
> > Commit-Queue: Matt Sarett <msarett@google.com>
> >
>
> TBR=mtklein@google.com,msarett@google.com,brianosman@google.com,reviews@skia.org
> # Not skipping CQ checks because original CL landed > 1 day ago.
> BUG=skia:
>
> Change-Id: I70bb69f747b863d267494e37a60888a51ab0184c
> Reviewed-on: https://skia-review.googlesource.com/9823
> Reviewed-by: Eric Boren <borenet@google.com>
> Commit-Queue: Eric Boren <borenet@google.com>
>
TBR=borenet@google.com,mtklein@google.com,msarett@google.com,reviews@skia.org,brianosman@google.com
# Not skipping CQ checks because original CL landed > 1 day ago.
BUG=skia:
Change-Id: I766382e6655f614042cded84f547f9fd5b109fca
Reviewed-on: https://skia-review.googlesource.com/9879
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This reverts commit cb01aec63b.
Reason for revert: Breaks Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Debug-SK_USE_DISCARDABLE_SCALEDIMAGECACHE
Original change's description:
> Add color spin test for SkColorSpaceXformCanvas
>
> Also changes behavior to treat nullptr srcs as sRGB.
>
> Testing locally, it looks like 353 gms have no diffs from 8888.
> There are 269 diffs - some are fine (gms that do color space stuff)
> and some are bugs.
>
> BUG=skia:
>
> Change-Id: I55c2825f4f4b857e0b0a0ec050c6db82ac881492
> Reviewed-on: https://skia-review.googlesource.com/9738
> Reviewed-by: Brian Osman <brianosman@google.com>
> Commit-Queue: Matt Sarett <msarett@google.com>
>
TBR=mtklein@google.com,msarett@google.com,brianosman@google.com,reviews@skia.org
# Not skipping CQ checks because original CL landed > 1 day ago.
BUG=skia:
Change-Id: I70bb69f747b863d267494e37a60888a51ab0184c
Reviewed-on: https://skia-review.googlesource.com/9823
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Eric Boren <borenet@google.com>
Also changes behavior to treat nullptr srcs as sRGB.
Testing locally, it looks like 353 gms have no diffs from 8888.
There are 269 diffs - some are fine (gms that do color space stuff)
and some are bugs.
BUG=skia:
Change-Id: I55c2825f4f4b857e0b0a0ec050c6db82ac881492
Reviewed-on: https://skia-review.googlesource.com/9738
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
TODO:
images
shaders
color filters
image filters
a couple stray color arrays
Change-Id: Ib91639bb0a6a00af737dd5186180011fe5120860
Reviewed-on: https://skia-review.googlesource.com/9529
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Also changes the behavior of these flags to only override their
corresponding context options when set, and to leave them unchanged
when not set.
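One way to get "override only when set" behavior is to parse each flag into an optional value. The sketch below assumes C++17 and uses made-up names (`ContextOptions`, `Flags`, `enableFeature`, `cacheLimit`); it does not reflect the actual flag or option names in the tools.

```cpp
#include <cassert>
#include <optional>

// Hypothetical context options with their library defaults.
struct ContextOptions {
    bool enableFeature = true;
    int  cacheLimit = 10;
};

// Hypothetical parsed command-line flags: an empty optional means the flag
// was not given on the command line.
struct Flags {
    std::optional<bool> enableFeature;
    std::optional<int>  cacheLimit;
};

// Overrides only the options whose flag was actually given; everything else
// keeps whatever value the options already had.
void applyFlags(const Flags& flags, ContextOptions* options) {
    if (flags.enableFeature.has_value()) {
        options->enableFeature = *flags.enableFeature;
    }
    if (flags.cacheLimit.has_value()) {
        options->cacheLimit = *flags.cacheLimit;
    }
}
```

With only `cacheLimit` set in `Flags`, `enableFeature` keeps its default of `true` instead of being reset to a flag default.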
BUG=skia:
Change-Id: I09f6be09997594fa888d9045dd4901354ef3f880
Reviewed-on: https://skia-review.googlesource.com/8780
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Replace with std::unique_ptr.
Change-Id: I5806cfbb30515fcb20e5e66ce13fb5f3b8728176
Reviewed-on: https://skia-review.googlesource.com/4381
Commit-Queue: Ben Wagner <bungeman@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
This was always intended to be a temporary dependency to use for
testing. It has served its purpose.
Also, this has already been dropped (accidentally, I think) by
the new GN build.
TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4220
Change-Id: Ic72ee08bbfaf86ed86a4122fd38be2921eb1327e
Reviewed-on: https://skia-review.googlesource.com/4220
Reviewed-by: Matt Sarett <msarett@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
Commit-Queue: Matt Sarett <msarett@google.com>
Add an interface to decode frames beyond the first in SkCodec, and
add an implementation for SkGifCodec.
Add getFrameData to SkCodec. This method reads ahead in the stream
to return a vector containing metadata about each frame in the image.
This is not required in order to decode frames beyond the first, but
it allows a client to learn extra information:
- how long the frame should be displayed
- whether a frame should be blended with a prior frame, allowing the
client to provide the prior frame to speed up decoding
Add new fields to SkCodec::Options:
- fFrameIndex
- fHasPriorFrame
The API is designed so that SkCodec never caches frames. If a
client wants a frame beyond the first, they specify the frame in
Options.fFrameIndex. If the client does not have the
frame's required frame (the frame that this frame must be blended on
top of) cached, they pass false for
Options.fHasPriorFrame. Unless the frame is
independent, the codec will then recursively decode all frames
necessary to decode fFrameIndex. If the client has the required frame
cached, they can put it in the dst they pass to the codec, and the
codec will only draw fFrameIndex onto it.
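The contract above (the codec caches nothing; the client either supplies the required frame or lets the codec rebuild it) can be illustrated with a toy model. `ToyCodec`, `FrameInfo`, and this `Options` struct are hypothetical and only mirror the roles of getFrameData, fFrameIndex, and fHasPriorFrame; "decoding" a frame just appends its index to `dst`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr int kNone = -1;   // frame does not depend on any earlier frame

// Hypothetical per-frame metadata, in the spirit of getFrameData().
struct FrameInfo {
    int requiredFrame;   // index this frame must be blended onto, or kNone
    int durationMs;      // how long the frame should be displayed
};

// Hypothetical analogue of SkCodec::Options.
struct Options {
    int  frameIndex = 0;
    bool hasPriorFrame = false;
};

// Toy codec: decoding frame i appends i to dst, so dst lists the frames that
// were composited, in order. The codec itself keeps no decoded frames.
class ToyCodec {
public:
    explicit ToyCodec(std::vector<FrameInfo> frames) : fFrames(std::move(frames)) {}

    const std::vector<FrameInfo>& frameData() const { return fFrames; }

    bool decode(std::vector<int>* dst, const Options& opts) {
        if (opts.frameIndex < 0 ||
            static_cast<std::size_t>(opts.frameIndex) >= fFrames.size()) {
            return false;
        }
        const int required = fFrames[opts.frameIndex].requiredFrame;
        if (required >= opts.frameIndex) {
            return false;                      // malformed dependency data
        }
        if (required == kNone) {
            dst->clear();                      // independent: start from blank
        } else if (!opts.hasPriorFrame) {
            Options prior;
            prior.frameIndex = required;       // recursively rebuild the base
            prior.hasPriorFrame = false;
            if (!decode(dst, prior)) return false;
        }
        // Otherwise the caller already placed the required frame in dst.
        dst->push_back(opts.frameIndex);
        ++fFramesDecoded;
        return true;
    }

    int framesDecoded() const { return fFramesDecoded; }

private:
    std::vector<FrameInfo> fFrames;
    int fFramesDecoded = 0;
};
```

In this model, asking for frame 2 with `hasPriorFrame = false` decodes frames 0, 1, and 2; passing a `dst` that already holds frame 1 with `hasPriorFrame = true` decodes only frame 2.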
Replace SkGifCodec's scanline decoding support with progressive
decoding, and update the tests accordingly.
Implement new APIs in SkGifCodec. Instead of using gif_lib, use
GIFImageReader, imported from Chromium (along with its copyright
headers) with the following changes:
- SkGifCodec is now the client
- Replace blink types
- Combine GIFColorMap::buildTable and ::getTable into a method that
creates and returns an SkColorTable
- Input comes from an SkStream, instead of a SegmentReader. Add
SkStreamBuffer, which buffers the (potentially partial) stream in
order to decode progressively.
(FIXME: This requires copying data that previously was read directly
from the SegmentReader. Does this hurt performance? If so, can we
fix it?)
- Remove UMA code
- Instead of reporting screen width and height to the client, allow the
client to query for it
- Fail earlier if the first frame AND screen have size of zero
- Compute required previous frame when adding a new one
- Move GIFParseQuery from GIFImageDecoder to GIFImageReader
- Allow parsing up to a specific frame (to skip parsing the rest of the
stream if a client only wants the first frame)
- Compute whether the first frame has alpha and supports index 8, to
create the SkImageInfo. This happens before reporting that the size
has been decoded.
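To illustrate the buffering of a potentially partial stream mentioned in the list above, here is a rough sketch. `GrowingStream` and `StreamBuffer` are hypothetical and are not the SkStream or SkStreamBuffer interfaces; the point is only that bytes read so far are kept, so the parser can retry the same block once more data arrives.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical stream whose data arrives over time (e.g. from the network).
class GrowingStream {
public:
    void arrive(std::initializer_list<uint8_t> bytes) {
        fData.insert(fData.end(), bytes);
    }
    // Reads up to n bytes; returns fewer if they have not arrived yet.
    std::size_t read(uint8_t* dst, std::size_t n) {
        const std::size_t avail = std::min(n, fData.size() - fPos);
        std::copy_n(fData.begin() + fPos, avail, dst);
        fPos += avail;
        return avail;
    }
private:
    std::vector<uint8_t> fData;
    std::size_t fPos = 0;
};

// Hypothetical buffer: it keeps whatever part of a block has arrived, so a
// parser can ask for the same block again later without losing data.
class StreamBuffer {
public:
    explicit StreamBuffer(GrowingStream* stream) : fStream(stream) {}

    // Tries to make `total` bytes available; false means "try again later".
    bool buffer(std::size_t total) {
        if (fBytes.size() < total) {
            const std::size_t have = fBytes.size();
            fBytes.resize(total);
            const std::size_t got = fStream->read(fBytes.data() + have, total - have);
            fBytes.resize(have + got);
        }
        return fBytes.size() >= total;
    }
    const uint8_t* data() const { return fBytes.data(); }

    // Called once the parser has consumed the current block.
    void flush() { fBytes.clear(); }

private:
    GrowingStream* fStream;
    std::vector<uint8_t> fBytes;
};
```

The copy into `fBytes` is the cost the FIXME above asks about: data that could previously be read in place is now duplicated so that parsing can resume later.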
Add GIFImageDecoder::haveDecodedRow to SkGifCodec, imported from
Chromium (along with its copyright header), with the following changes:
- Add support for sampling
- Use the swizzler
- Keep track of the rows decoded
- Do *not* keep track of whether we've seen alpha
Remove SkCodec::kOutOfOrder_SkScanlineOrder, which was only used by GIF
scanline decoding.
Call onRewind even if there is no stream (SkGifCodec needs to clear its
decoded state so it will decode from the beginning).
Add a method to SkSwizzler to access the offset into the dst, taking
subsetting into account.
Add a GM that animates a GIF.
Add tests for the new APIs.
*** Behavior changes:
* Previously, we reported that an image with a subset frame and no transparent
index was opaque and used the background index (if present) to fill the
background. This is necessary in order to support index 8, but it does not
match viewers/browsers I have seen. Examples:
- Chromium and Gimp render the background transparent
- Firefox, Safari, Linux Image Viewer, Safari Preview clip to the frame (for
a single frame image)
This CL matches Chromium's behavior and renders the background transparent.
This allows us to have consistent behavior across products and simplifies
the code (relative to what we would have to do to continue the old behavior
on Android). It also means that we will no longer support index 8 for some
GIFs.
* Stop checking for GIFSTAMP - all GIFs should be either 89a or 87a.
This matches Chromium. I suspect that bugs would have been reported if valid
GIFs started with "GIFVER" instead of "GIF89a" or "GIF87a" (but did not decode
in Chromium).
*** Future work not included in this CL:
* Move some checks out of haveDecodedRow, since they are the same for the
entire frame, e.g.
- intersecting the frameRect with the full image size
- whether there is a color table
* Change when we write transparent pixels
- In some cases, Chromium deemed this unnecessary, but I suspect it is slower
than the fallback case. There will continue to be cases where we should
*not* write them, but for e.g. the first pass where we have already
cleared to transparent (which we may also be able to skip) writing the
transparent pixels will not make anything incorrect.
* Report color type and alpha type per frame
- Depending on alpha values, disposal methods, frame rects, etc, subsequent
frames may have different properties than the first.
* Skip copies of the encoded data
- We copy the encoded data in case the stream is one that cannot be rewound,
so we can parse and then decode (possibly not immediately). For some input
streams, this is unnecessary.
- I was concerned this would cause a performance regression, but on average the
new code is faster than the old for the images I tested [1].
- It may cause a performance regression for Chromium, though, where we can
always move back in the stream, so this should be addressed.
Design doc:
https://docs.google.com/a/google.com/document/d/12Qhf9T92MWfdWujQwCIjhCO3sw6pTJB5pJBwDM1T7Kc/
[1] https://docs.google.com/a/google.com/spreadsheets/d/19V-t9BfbFw5eiwBTKA1qOBkZbchjlTC5EIz6HFy-6RI/
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=2045293002
Review-Url: https://codereview.chromium.org/2045293002
SkLiteRecorder, a new SkCanvas, fills out SkLiteDL, a new SkDrawable.
This SkDrawable is a display list similar to SkRecord and SkBigPicture / SkRecordedDrawable, but with a few new design points inspired by Android and slimming paint:
1) SkLiteDL is structured as one big contiguous array rather than the two layer structure of SkRecord. This trades away flexibility and large-op-count performance for better data locality for small to medium size pictures.
2) We keep a global freelist of SkLiteDLs, both reusing the SkLiteDL struct itself and its contiguous byte array. This keeps the expected number of mallocs per display list allocation <1 (really, ~0) for cyclical use cases.
These two together mean recording is faster. Measuring against the code we use at head, SkLiteRecorder trends about 3x faster across various size pictures, matching speed at 0 draws and beating the special-case 1-draw pictures we have today. (I.e. we won't need those special case implementations anymore, because they're slower than this new generic code.) This new strategy records 10 drawRects() in about the same time the old strategy took for 2.
This strategy stays the winner until at least 500 drawRect()s on my laptop, where I stopped checking.
A simpler alternative to freelisting is also possible (but not implemented here), where we allow the client to manually reset() an SkLiteDL for reuse when its refcnt is 1. That's essentially what we're doing with the freelist, except tracking what's available for reuse globally instead of making the client do it.
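The two design points above (one contiguous op array, plus a freelist that recycles both the list object and its storage) can be sketched as follows. This is an illustration only: `LiteList`, `FreeList`, the op encoding, and the use of `std::vector` for storage are assumptions, not SkLiteDL's actual internals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical op encoding: every op is a fixed header followed by its
// payload, packed back to back in one byte array.
enum class OpType : uint32_t { kDrawRect };

struct OpHeader {
    OpType   type;
    uint32_t payloadBytes;
};

struct RectPayload { float l, t, r, b; };

class LiteList {
public:
    void drawRect(float l, float t, float r, float b) {
        const OpHeader header{OpType::kDrawRect, sizeof(RectPayload)};
        const RectPayload rect{l, t, r, b};
        append(&header, sizeof(header));
        append(&rect, sizeof(rect));
    }

    // Walks the array op by op; returns how many ops were visited.
    int playback() const {
        int count = 0;
        std::size_t offset = 0;
        while (offset < fBytes.size()) {
            OpHeader header;
            std::memcpy(&header, fBytes.data() + offset, sizeof(header));
            offset += sizeof(header) + header.payloadBytes;
            ++count;
        }
        return count;
    }

    // Forgets the ops but keeps the allocation for the next recording.
    void reset() { fBytes.clear(); }
    std::size_t capacity() const { return fBytes.capacity(); }

private:
    void append(const void* src, std::size_t n) {
        const uint8_t* p = static_cast<const uint8_t*>(src);
        fBytes.insert(fBytes.end(), p, p + n);
    }
    std::vector<uint8_t> fBytes;
};

// Freelist: released lists are kept, together with their storage, for reuse.
class FreeList {
public:
    std::unique_ptr<LiteList> acquire() {
        std::lock_guard<std::mutex> lock(fMutex);
        if (fFree.empty()) return std::make_unique<LiteList>();
        auto list = std::move(fFree.back());
        fFree.pop_back();
        return list;
    }
    void release(std::unique_ptr<LiteList> list) {
        list->reset();
        std::lock_guard<std::mutex> lock(fMutex);
        fFree.push_back(std::move(list));
    }
private:
    std::mutex fMutex;
    std::vector<std::unique_ptr<LiteList>> fFree;
};
```

In a record, release, record cycle, the second `acquire()` hands back the same object with its byte array already sized, which is how the expected number of allocations per display list can stay near zero.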
This code is not fully capable yet, but most of the key design points are there. The internal structure of SkLiteDL is the area I expect to be most volatile (anything involving Op), but its interface and the whole of SkLiteRecorder ought to be just about done.
You can run nanobench --match picture_overhead as a demo. Everything it exercises is fully fleshed out, so what it tests is an apples-to-apples comparison as far as recording costs go. I have not yet compared playback performance.
It should be simple to wrap this into an SkPicture subclass if we want.
I won't propose replacing anything old with anything new until I have more ducks in a row, but this does look pretty promising (similar to the SkRecord over old SkPicture change a couple years ago), and I'd like to land, experiment, and iterate, especially with an eye toward Android.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2213333002
Review-Url: https://codereview.chromium.org/2213333002
This will allow me to test and visualize some assumptions
on parsing and applying color profiles. Also, it should
help me find and fix bugs.
This is certainly not an optimized implementation, and, as
far as I know, it doesn't take any shortcuts to improve
performance. We'll probably want to do both of these
once we know where it fits in the pipeline.
Right now this test is only run on an arbitrary set of ~100
images from the top 10k skps. I'll continue to add more
"interesting" images and probably tweak the code as
necessary.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1995233003
Review-Url: https://codereview.chromium.org/1995233003