blink skips all pending commands during picture recording if it is drawing an opaque full-frame
geometry or image. This may improve performance for some edge cases. To recognize an opaque
full-frame drawing should be cheap enough. Otherwise, the overhead will offset the improvement.
Unfortunately, data from perf for content_shell on Nexus7 shows that SkDeferredCanvas::isFullFrame
is far from cheap. Table below shows that how much isFullFrame() costs in the whole render process.
benchmark percentage
my local benchmark(draw 1000 sprites) 4.1%
speedReading 2.8%
FishIETank(1000 fishes) 1.5%
GUIMark3 Bitmap 2.0%
By contrast, real recording (SkGPipeCanvas::drawBitmapRectToRect) and real rasterization
(GrDrawTarget::drawRect) cost ~4% and ~6% in the whole render process respectively. Apparently,
SkDeferredCanvas::isFullFrame() is nontrivial.
getDeviceSize() is the main contributor to this hotspot. The change simply save the canvasSize and
reuse it among drawings if it is not a fresh frame. This change cut off ~65% (or improved ~2 times)
of isFullFrame().
telemetry smoothness canvas_tough_test didn't show obvious improvement or regression.
BUG=411166
Committed: https://skia.googlesource.com/skia/+/8e45c3777d886ba3fe239bb549d06b0693692152R=junov@chromium.org, tomhudson@google.com, reed@google.com
Author: yunchao.he@intel.com
Review URL: https://codereview.chromium.org/545813002
Reason for revert:
This is leaking memory:
http://108.170.220.120:10117/builders/Test-Ubuntu13.10-GCE-NoGPU-x86_64-Debug-ASAN/builds/2516/steps/RunDM/logs/stdio
Original issue's description:
> Picture Recording: fix the performance bottleneck in SkDeferredCanvas::isFullFrame
>
> blink skips all pending commands during picture recording if it is drawing an opaque full-frame
> geometry or image. This may improve performance for some edge cases. To recognize an opaque
> full-frame drawing should be cheap enough. Otherwise, the overhead will offset the improvement.
> Unfortunately, data from perf for content_shell on Nexus7 shows that SkDeferredCanvas::isFullFrame
> is far from cheap. Table below shows that how much isFullFrame() costs in the whole render process.
>
> benchmark percentage
> my local benchmark(draw 1000 sprites) 4.1%
> speedReading 2.8%
> FishIETank(1000 fishes) 1.5%
> GUIMark3 Bitmap 2.0%
>
> By contrast, real recording (SkGPipeCanvas::drawBitmapRectToRect) and real rasterization
> (GrDrawTarget::drawRect) cost ~4% and ~6% in the whole render process respectively. Apparently,
> SkDeferredCanvas::isFullFrame() is nontrivial.
>
> getDeviceSize() is the main contributor to this hotspot. The change simply save the canvasSize and
> reuse it among drawings if it is not a fresh frame. This change cut off ~65% (or improved ~2 times)
> of isFullFrame().
>
> telemetry smoothness canvas_tough_test didn't show obvious improvement or regression.
>
> BUG=411166
>
> Committed: https://skia.googlesource.com/skia/+/8e45c3777d886ba3fe239bb549d06b0693692152R=junov@chromium.org, tomhudson@google.com, reed@google.com, yunchao.he@intel.comTBR=junov@chromium.org, reed@google.com, tomhudson@google.com, yunchao.he@intel.com
NOTREECHECKS=true
NOTRY=true
BUG=411166
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/571053002
blink skips all pending commands during picture recording if it is drawing an opaque full-frame
geometry or image. This may improve performance for some edge cases. To recognize an opaque
full-frame drawing should be cheap enough. Otherwise, the overhead will offset the improvement.
Unfortunately, data from perf for content_shell on Nexus7 shows that SkDeferredCanvas::isFullFrame
is far from cheap. Table below shows that how much isFullFrame() costs in the whole render process.
benchmark percentage
my local benchmark(draw 1000 sprites) 4.1%
speedReading 2.8%
FishIETank(1000 fishes) 1.5%
GUIMark3 Bitmap 2.0%
By contrast, real recording (SkGPipeCanvas::drawBitmapRectToRect) and real rasterization
(GrDrawTarget::drawRect) cost ~4% and ~6% in the whole render process respectively. Apparently,
SkDeferredCanvas::isFullFrame() is nontrivial.
getDeviceSize() is the main contributor to this hotspot. The change simply save the canvasSize and
reuse it among drawings if it is not a fresh frame. This change cut off ~65% (or improved ~2 times)
of isFullFrame().
telemetry smoothness canvas_tough_test didn't show obvious improvement or regression.
BUG=411166
R=junov@chromium.org, tomhudson@google.com, reed@google.com
Author: yunchao.he@intel.com
Review URL: https://codereview.chromium.org/545813002
https://codereview.chromium.org/283093002 fixed some bugs in pipe memory
allocation, but also introduced one of its own: nearly every block requested
from needOpBytes() got its own 16K allocation.
The correct logic is to take the requested size, add four more bytes for a
DrawOp, make sure that's 4-byte aligned, then check to see if there's enough
space for that in the current block. If there's not, allocate at least
MIN_BLOCK_SIZE bytes to fit the request.
The bug is that I moved that round-up-to-MIN_BLOCK_SIZE step before checking
for space in the current block. This means most (all?) blocks will be 16K but
never seem to have room to fit another allocation. You need 8 bytes? You get
16K. You need 8 more bytes? Nope, can't fit that. Here's a new 16K...
This reverts the change to the test I made then, which really should have
tipped me off. It was testing exactly this bug.
BUG=372671
R=tomhudson@chromium.org, tomhudson@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/433463003
When verylargebitmap GM runs in cross-process pipe mode, we're
requestBlock()ing ~200M to carry the bitmaps. The current
implementation ends up allocating ~800M, which is a bit wasteful.
SkGPipeWrite already rounds up to 16K, so just rely on that.
This change exposed several bugs in pipe:
- we don't reserve enough space in drawVertices
- we don't reserve enough space for factory names in cross-process mode
- we don't quite have the right check in needOpBytes to see if we needed to send off the current block and allocate a new one
SETUP_NOTIFY and generally calling doNotify() more often than necessary made things hard to debug and understand. Now the pipe always waits to send off its current block until it needs more space than that block can provide, or it's the final block. We can put these back if we need the proactive flushing, but it seems not necessary?
Removed an assert in DeferredCanvasTest, which is somtimes 2 (Debug), sometimes 3 (Release). It seemed like the other asserts were more essential, and this one was more of a white-box assertion. Still sound if we remove it?
BUG=skia:2478
R=scroggo@google.com, mtklein@google.com, reed@google.com, junov@chromium.org
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/267863002
git-svn-id: http://skia.googlecode.com/svn/trunk@14613 2bbb7eff-a529-9590-31e7-b0007b416f81