This tries to cut things down to the very minimum you might want to know:
$ out/Release/nanobench --match nytimes --config 8888 gpu -q
Timer overhead: 29.6ns
! -> high variance, ? -> moderate variance
micros bench
2479.05 ! desk_nytimes.skp_1_mpd 8888
1313.92 desk_nytimes.skp_1_mpd gpu
3617.65 desk_nytimes.skp_1 8888
1158.34 desk_nytimes.skp_1 gpu
1368.99 ! keymobi_nytimes_com_.skp_1_mpd 8888
393.40 keymobi_nytimes_com_.skp_1_mpd gpu
1179.68 ! keymobi_nytimes_com_.skp_1 8888
342.74 keymobi_nytimes_com_.skp_1 gpu
All times are printed in microseconds, and high variance runs are marked.
BUG=skia:
Review URL: https://codereview.chromium.org/1493313003
Reason for revert:
BUG=skia:4609
skbug.com/4609
Seems like GrContextFactory needs to fail when the NVPR option is requested but the driver version isn't sufficiently high.
Original issue's description:
> Make NVPR a GL context option instead of a GL context
>
> Make NVPR a GL context option instead of a GL context.
> This may enable NVPR to be run with command buffer
> interface.
>
> No functionality change in DM or nanobench. NVPR can
> only be run with normal GL APIs.
>
> BUG=skia:2992
>
> Committed: https://skia.googlesource.com/skia/+/eeebdb538d476c1bfc8b63a946094ca1b505ecd1TBR=mtklein@google.com,jvanverth@google.com,kkinnunen@nvidia.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:2992
Review URL: https://codereview.chromium.org/1486153002
eeebdb538d did not update a Config that
only builds if SK_BUILD_FOR_ANDROID_FRAMEWORK. This appears to be the
necessary parameter that is missing.
BUG=skia:2992
Review URL: https://codereview.chromium.org/1491563003
Make NVPR a GL context option instead of a GL context.
This may enable NVPR to be run with command buffer
interface.
No functionality change in DM or nanobench. NVPR can
only be run with normal GL APIs.
BUG=skia:2992
Review URL: https://codereview.chromium.org/1448883002
Remove GrContextFactory::getGLContext, it is problematic:
It has the bug of not checking for the context type.
It also is error-prone, since the GL context is not made
current, but it is callers may assume it is current.
It is also not used very much.
Clients can use GrContextFactory::getContextInfo.
BUG=skia:
Review URL: https://codereview.chromium.org/1455093003
We no longer need to worry about namespace
conflicts SkBitmapRegionDecoder in Android (which
we are replacing).
Additionally, the static Create() function does not
need to repeat the name BitmapRegionDecoder.
BUG=skia:
Review URL: https://codereview.chromium.org/1415243007
In SkSampledCodec, allow the native codec to do its scaling first, then
sample on top of that. Since the only codec which can do native scaling
is JPEG, and we know what it can do, hard-code for JPEG. Check to see
if the sampleSize is something JPEG supports, or a multiple of
something it supports. If so, use JPEG directly or combine them.
BUG=skia:4320
Review URL: https://codereview.chromium.org/1417583009
Recent changes to WallTimer broke --samplingTime. In particular, this idiom became nonsensical:
WallTimer timer;
timer.start();
do {
...
timer.end();
} while(timer.fWall < ...);
WallTimer started making private use of fWall between when start() and end() were called, so the second time around the loop we end up with nonsense.
If that makes no sense, don't worry. The code here using now_ms() is just as fast, just as precise, and clearer.
I took the opportunity to simplify --samplingTime <complicated string parsing> to --ms <int>, and to simplify the code that depends on it.
BUG=skia:
Review URL: https://codereview.chromium.org/1419103004
The result SkBitmap, the pixel allocator, and the alpha
preference need to be communicated from the client to
the region decoder.
BUG=skia:
Review URL: https://codereview.chromium.org/1418093006
I don't really expect this to fix the errors, but I think
it's worth it to try shaking up the valgrind bot overnight.
There's some strange behavior with regard to color type on
the valgrind bot that I can't reproduce and that we aren't
seeing on any of the other bots.
TBR=mtklein,scroggo
BUG=skia:
Review URL: https://codereview.chromium.org/1418723002
- Remove --images '' to renable image benchmarking
- Add a flag to disable testing JPEG's buildTileIndex, since it also leaks memory
- Do not run images on GPU
- Do not run large interlaced images on 32 bit bots
- When buildTileIndex is not being used in the subset benches, do not use it for BRD
BUG=skia:3418
BUG=skia:4469
BUG=skia:4471
BUG=skia:4360
Review URL: https://codereview.chromium.org/1396113002
Instead of decoding one line at a time, if the ScanlineOrder is kNone,
decode all of the lines in one pass, and then copy the subset into the
output. This will allow us to more realistically test subset decodes
for interlaced png. It also makes running them not take forever.
Do *not* support other modes (besides kTopDown), since they are not
used by the big three we need to replace BitmapRegionDecoder
implementation (skbug.com/4428).
Fix a bug in SubsetTranslateBench and SubsetZoomBench:
When we decode another subset, we need to reset the scanline decode
first. This bug appears to have been present since the introduction of
these tests in crrev.com/1160953002
BUG=skia:4205
BUG=skia:3418
Review URL: https://codereview.chromium.org/1387233002
GLBenches do not expect gl state to change between onPerCanvasPreDraw and *PostDraw, but we do a clear and sometimes we clear as draw. This causes us to bind vertex objects / programs / etc.
This change creates two new virtual methods which are called right before and immediately after timing.
BUG=skia:
Review URL: https://codereview.chromium.org/1379853003
Benefits:
- This mimics other decoding APIs (including the ones SkCodec relies
on, e.g. a png_struct, which can be used to decode an entire image or
one line at a time).
- It allows a client to ask us to do what we can do efficiently - i.e.
start from encoded data and either decode the whole thing or scanlines.
- It removes the duplicate methods which appeared in both SkCodec and
SkScanlineDecoder (some of which, e.g. in SkJpegScanlineDecoder, just
call fCodec->sameMethod()).
- It simplifies moving more checks into the base class (e.g. the
examples in skbug.com/4284).
BUG=skia:4175
BUG=skia:4284
=====================================================================
SkScanlineDecoder.h/.cpp:
Removed.
SkCodec.h/.cpp:
Add methods, enums, and variables which were previously in
SkScanlineDecoder.
Default fCurrScanline to -1, as a sentinel that start has not been
called.
General changes:
Convert SkScanlineDecoders to SkCodecs.
General changes in SkCodec subclasses:
Merge SkScanlineDecoder implementation into SkCodec. Most (all?) owned
an SkCodec, so they now call this-> instead of fCodec->.
SkBmpCodec.h/.cpp:
Replace the unused rowOrder method with an override for
onGetScanlineOrder.
Make getDstRow const, since it is called by onGetY, which is const.
SkCodec_libpng.h/.cpp:
Make SkPngCodec an abstract class, with two subclasses which handle
scanline decoding separately (they share code for decoding the entire
image). Reimplement onReallyHasAlpha so that it can return the most
recent result (e.g. after a scanline decode which only decoded part
of the image) or a better answer (e.g. if the whole image is known to
be opaque).
Compute fNumberPasses early, so we know which subclass to instantiate.
Make SkPngInterlaceScanlineDecoder use the base class' fCurrScanline
rather than a separate variable.
CodexTest.cpp:
Add tests for the state changes in SkCodec (need to call start before
decoding scanlines; calling getPixels means that start will need to
be called again before decoding more scanlines).
Add a test which decodes in stripes, currently only used for an
interlaced PNG.
TODO: Add tests for onReallyHasAlpha.
Review URL: https://codereview.chromium.org/1365313002
SkBitmapRegionDecoderInterface provides an interface
for multiple implementations of Android's
BitmapRegionDecoder.
We already have correctness tests in DM that will enable us
to compare the quality of our various BRD implementations.
We also need these performance tests to compare the speed
of our various implementations.
BUG=skia:4357
Review URL: https://codereview.chromium.org/1344993003
Unfortunately, immintrin.h (which is also included by SkTypes)
includes xmmintrin.h which includes mm_malloc.h which includes
stdlib.h for malloc even though, from the implementation, it is
difficult to see why.
Fortunately, arm_neon.h does not seem to be involved in such
shenanigans, so building for Android will keep things sane.
TBR=reed@google.com
Doesn't change Skia API, just moves an include.
Review URL: https://codereview.chromium.org/1313203003
Prior to this CL, if a client wanted to decode scanlines, they had to
create an SkCodec in order to get an SkScanlineDecoder. This introduces
complications if input data is not easily shared between the two
objects.
Instead, add methods to SkScanlineDecoder for creating a new one from
input data, and remove the creation functions from SkCodec.
Update DM and tests.
Review URL: https://codereview.chromium.org/1267583002
Make getScanlineDecoder return a new object each time, which is
owned by the caller, and independent from any existing scanline
decoders and the SkCodec itself.
Since the SkCodec already contains the entire state machine, and it
is used by the scanline decoders, simply create a new SkCodec which
is now owned by the scanline decoder.
Move code that cleans up after using a scanline decoder into its
destructor
One side effect is that creating the first scanline decoder requires
a duplication of the stream and re-reading the header. (With some
more complexity/changes, we could pass the state machine to the
scanline decoder and make the SkCodec recreate its own state machine
instead.) The typical client of the scanline decoder (region decoder)
uses an SkMemoryStream, so the duplication is cheap, although we
should consider the extra time to reread the header/recreate the state
machine. (If/when we use the scanline decoder for other purposes,
where the stream may not be cheaply duplicated, we should consider
passing the state machine.)
One (intended) result of this change is that a client can create a
new scanline decoder in a new thread, and decode different pieces of
the image simultaneously.
In SkPngCodec::decodePalette, use fBitDepth rather than a parameter.
Review URL: https://codereview.chromium.org/1230033004
SkImageGenerator makes some assumptions that are not necessarily valid
for SkCodec. For example, SkCodec does not assume that it can always be
rewound.
We also have an ongoing question of what an SkCodec should report as
its default settings (i.e. the return from getInfo). It makes sense for
an SkCodec to report that its pixels are unpremultiplied, if that is
the case for the underlying data, but if a client of SkImageGenerator
uses the default settings (as many do), they will receive
unpremultiplied pixels which cannot (currently) be drawn with Skia. We
may ultimately decide to revisit SkCodec reporting an SkImageInfo, but
I have left it unchanged for now.
Import features of SkImageGenerator used by SkCodec into SkCodec.
I have left SkImageGenerator unchanged for now, but it no longer needs
Result or Options. This will require changes to Chromium.
Manually handle the lifetime of fScanlineDecoder, so SkScanlineDecoder.h
can include SkCodec.h (where Result is), and SkCodec.h does not need
to include it (to delete fScanlineDecoder).
In many places, make the following simple changes:
- Now include SkScanlineDecoder.h, which is no longer included by
SkCodec.h
- Use the enums in SkCodec, rather than SkImageGenerator
- Stop including SkImageGenerator.h where no longer needed
Review URL: https://codereview.chromium.org/1220733013
Changes verbose mode to print both the table and the individual sample
values. No need to hold back information in verbose mode.
BUG=skia:
Review URL: https://codereview.chromium.org/1208763003
Adds a nanobench mode that takes samples for a fixed amount of time,
rather than taking a fixed amount of samples.
BUG=skia:
Review URL: https://codereview.chromium.org/1204153002
Improves the GPU measuring accuracy of nanobench by using fence syncs.
Fence syncs are very widely supported and available on almost every
platform.
NO_MERGE_BUILDS
BUG=skia:
Review URL: https://codereview.chromium.org/1194783003
I think these changes to the subset benchmarks cover what we discussed yesterday.
I removed the divisor benchmarks (2x2, 3x3) and changed the single subset benchmarks.
Also, we will no longer benchmark subset decodes on small images.
BUG=skia:
Review URL: https://codereview.chromium.org/1188223002
This makes it easier to benchmark _mpd variants in a profiler.
E.g.,
<profiler> out/Release/nanobench --images --config 8888 --loops -1 --match sp_desk_nytimes
BUG=skia:
Review URL: https://codereview.chromium.org/1184673006