This requires we "first" add a has-picture bool to SkPictureShader serialized format.
BUG=chromium:486947, billions and billions of others.
Review URL: https://codereview.chromium.org/1151663002
This re-enables adoption tracking for SkPictures in Blink,
which should be green now that crrev.com/1136123011 has landed.
BUG=skia:3847
Review URL: https://codereview.chromium.org/1145153002
In my confusion yesterday I accidentally left this as a non-singleton.
The issue in Blink was not related to this being a singleton,
and it should be safe to make it one.
This means recording an empty picture properly costs zero mallocs.
BUG=skia:
Review URL: https://codereview.chromium.org/1147053002
Prior to the introduction of find.py, GMs were liked in the order they
were listed in the gypi file, which was generally alphabetically. This
made it fairly easy to predict where slides would show up in SampleApp
and the order was consistent. This simply sorts the list of files in
find.py to restore the expectation that files should be listed in the
build in alphabetical order.
Review URL: https://codereview.chromium.org/1144973003
If one of the inputs to a SkMergeImageFilter was clipped away or
otherwise caused the filterImage(...) invocation for it to return
false, the entire effect would be "failed" and return false --
regardless of if it had produced a result or not.
Instead of returning false directly if filterImage(...) for a source
returned false, consider all the inputs, and then only return false if
all of them do.
BUG=chromium:489046
Review URL: https://codereview.chromium.org/1133523006
Add a newline to the font load debug message. Helps reading nanobench
results. Otherwise the message "Resource /fonts/Funkster.ttf not a valid
font." causes first result be hard to read or missing.
Review URL: https://codereview.chromium.org/1142183002
A stroked conic computes the outset quad's control point by
computing the intersection of the quad's endpoints. If the
the denominator used to compute the scale factor for the
control point is small, check to see if the numerator is also
small so that the division stays bounded.
Also clean up error returns and internal function calls to
simplify the code.
Additionally, remove comic max curvature (unimplemented) and call
extrema functions instead to handle cases where the conic is degenerate
or is a line.
R=reed@google.com, fmalita@chromium.org
BUG=skia:3843
Review URL: https://codereview.chromium.org/1144883003
What is going on here is that, after the mapPoints in fillAANestedRects, devInside was upside down so the isEmpty check was always firing. I don't see why we need to avoid having devInside sorted.
BUG=488103
Review URL: https://codereview.chromium.org/1135753004
Improve caching of dashed paths in GrStencilAndCoverPathRenderer.
Look up the (NVPR specific) GrGLPath based on GrStrokeInfo and
the original path.
Use unique keys for all GrPaths.
Dash the path with Skia dash stroker and use that path geometry for
NVPR path.
NVPR internal dashing stroke is not used, because the dashing
implementation of NVPR does not match Skia implementation.
Review URL: https://codereview.chromium.org/1116123003
Make GrResourceCache performance less sensitive to key length change.
The memcmp in GrResourceKey is called when SkTDynamicHash jumps the
slots to find the hash by a index. Avoid most of the memcmps by
comparing the hash first.
This is important because small changes in key data length can cause
big performance regressions. The theory is that key length change causes
different hash values. These hash values might trigger memcmps that
originally weren't there, causing the regression.
Adds few specialized benches to grresourcecache_add to test different
key lengths. The tests are run only on release, because on debug the
SkTDynamicHash validation takes too long, and adding many such delays
to development test runs would be unproductive. On release the tests
are quite fast.
Effect of this patch to the added tests on amd64:
grresourcecache_find_10 738us -> 768us 1.04x
grresourcecache_find_2 472us -> 476us 1.01x
grresourcecache_find_25 841us -> 845us 1x
grresourcecache_find_4 565us -> 531us 0.94x
grresourcecache_find_54 1.18ms -> 1.1ms 0.93x
grresourcecache_find_5 834us -> 749us 0.9x
grresourcecache_find_3 620us -> 542us 0.87x
grresourcecache_add_25 2.74ms -> 2.24ms 0.82x
grresourcecache_add_56 3.23ms -> 2.56ms 0.79x
grresourcecache_add_54 3.34ms -> 2.62ms 0.78x
grresourcecache_add_5 2.68ms -> 2.1ms 0.78x
grresourcecache_add_10 2.7ms -> 2.11ms 0.78x
grresourcecache_add_2 1.85ms -> 1.41ms 0.76x
grresourcecache_add 1.84ms -> 1.4ms 0.76x
grresourcecache_add_4 1.99ms -> 1.49ms 0.75x
grresourcecache_add_3 2.11ms -> 1.55ms 0.73x
grresourcecache_add_55 39ms -> 13.9ms 0.36x
grresourcecache_find_55 23.2ms -> 6.21ms 0.27x
On arm64 the results are similar.
On arm_v7_neon, the results lack the discontinuity at 55:
grresourcecache_add 4.06ms -> 4.26ms 1.05x
grresourcecache_add_2 4.05ms -> 4.23ms 1.05x
grresourcecache_find 1.28ms -> 1.3ms 1.02x
grresourcecache_find_56 3.35ms -> 3.32ms 0.99x
grresourcecache_find_2 1.31ms -> 1.29ms 0.99x
grresourcecache_find_54 3.28ms -> 3.24ms 0.99x
grresourcecache_add_5 6.38ms -> 6.26ms 0.98x
grresourcecache_add_55 8.44ms -> 8.24ms 0.98x
grresourcecache_add_25 7.03ms -> 6.86ms 0.98x
grresourcecache_find_25 2.7ms -> 2.59ms 0.96x
grresourcecache_find_4 1.45ms -> 1.38ms 0.95x
grresourcecache_find_10 2.52ms -> 2.39ms 0.95x
grresourcecache_find_55 3.54ms -> 3.33ms 0.94x
grresourcecache_find_5 2.5ms -> 2.32ms 0.93x
grresourcecache_find_3 1.57ms -> 1.43ms 0.91x
The extremely slow case, 55, is postulated to be due to the index jump
collisions running the memcmp. This is not visible on arm_v7_neon probably due
to hash function producing different results for 32 bit architectures.
This change is needed for extending path cache key in Gr
NV_path_rendering codepath. Extending is needed in order to add dashed
paths to the path cache.
Review URL: https://codereview.chromium.org/1132723003
Set the "path stroke error bound" path parameter to 0.02 for all paths.
This means that the stroked path area will be within 98% of the stroke
width in path space.
This should fix many cases where NVPR stroked paths were visibly different to
Skia stroked paths. One such path is in dashcubics gm.
This increases the amount of subdivisions the path object creation will
make for paths that need it. This in turn will increase gpu object space
requirements sligthly. Both of these effects should be unnoticeable.
GL_NV_path_rendering.txt:
"""
Every path object has a stroke approximation bound parameter
(PATH_STROKE_BOUND_NV) that is a floating-point value /sab/ clamped
between 0.0 and 1.0 and set and queried with the PATH_STROKE_BOUND_NV
path parameter. Exact determination of samples swept an orthogonal
centered line segment along cubic Bezier segments and rational
quadratic Bezier curves (so non-circular partial elliptical arcs) is
intractable for real-time rendering so an approximation is required;
/sab/ intuitively bounds the approximation error as a percentage of
the path object's stroke width. Specifically, this path parameter
requests the implementation to stencil any samples within /sweep/
object space units of the exact sweep of the path's cubic Bezier
segments or partial elliptical arcs to be sampled by the stroke where
sweep = ((1-sab)*sw)/2
where /sw/ is the path object's stroke width. The initial value
of /sab/ when a path is created is 0.2. In practical terms, this
initial value means the stencil sample positions coverage within 80%
(100%-20%) of the stroke width of cubic and rational quadratic stroke
segments should be sampled.
"""
BUG=skia:2049
Review URL: https://codereview.chromium.org/1124423007
Make the code more readable by inheriting GrStrokeInfo from SkStrokeRec.
This should avoid the long .getStrokeRec() and .getStrokeRecPtr(). These
were a bit cumbersome especially in cases where an alias variable was
created for these, and then the reader had to keep track to which
StrokeInfo member the StrokeRec alias was pointing.
Removes SkStrokeRec::SkStrokeRec(const SkStrokeRec&). It was memcpying.
Try to play it safe wrt compiler using the possible padding of
superclass for subclass members. Instead, let the compiler generate
the copy constructor. Assignment operator was already
compiler-generated, so at least in that way this is consistent.
Renames GrStrokeInfo::applyDash to applyDashToPath for consistency
with superclass applyToPath.
Review URL: https://codereview.chromium.org/1128113008
Reason for revert:
win_chromium_compile_dbg_ng
FAILED: ninja -t msvc -e environment.x86 -- E:\b\build\goma/gomacc "E:\b\depot_tools\win_toolchain\vs2013_files\VC\bin\amd64_x86\cl.exe" /nologo /showIncludes /FC @obj\third_party\skia\src\core\skia.SkBitmapHeap.obj.rsp /c ..\..\third_party\skia\src\core\SkBitmapHeap.cpp /Foobj\third_party\skia\src\core\skia.SkBitmapHeap.obj /Fdobj\skia\skia.cc.pdb
e:\b\build\slave\win\build\src\third_party\skia\include\core\skpicture.h(176) : error C2487: 'CURRENT_PICTURE_VERSION' : member of dll interface class may not be declared with dll interface
Original issue's description:
> Sketch splitting SkPicture into an interface and SkBigPicture.
>
> Adds small pictures for drawRect(), drawTextBlob(), and drawPath().
> These cover about 89% of draw calls from Blink SKPs,
> and about 25% of draw calls from our GMs.
>
> SkPicture handles:
> - serialization and deserialization
> - unique IDs
>
> Everything else is left to the subclasses:
> - playback(), cullRect()
> - hasBitmap(), hasText(), suitableForGPU(), etc.
> - LayerInfo / AccelData if applicable.
>
> The time to record a 1-op picture improves a good chunk
> (2 mallocs to 1), and the time to record a 0-op picture
> greatly improves (2 mallocs to none):
>
> picture_overhead_draw: 450ns -> 350ns
> picture_overhead_nodraw: 300ns -> 90ns
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/c92c129ff85b05a714bd1bf921c02d5e14651f8b
>
> Latest blink_linux_rel:
>
> http://build.chromium.org/p/tryserver.blink/builders/linux_blink_rel/builds/61248
>
> Committed: https://skia.googlesource.com/skia/+/15877b6eae33a9282458bdb904a6d00440eca0ecTBR=reed@google.com,robertphillips@google.com,fmalita@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1130283004
Adds small pictures for drawRect(), drawTextBlob(), and drawPath().
These cover about 89% of draw calls from Blink SKPs,
and about 25% of draw calls from our GMs.
SkPicture handles:
- serialization and deserialization
- unique IDs
Everything else is left to the subclasses:
- playback(), cullRect()
- hasBitmap(), hasText(), suitableForGPU(), etc.
- LayerInfo / AccelData if applicable.
The time to record a 1-op picture improves a good chunk
(2 mallocs to 1), and the time to record a 0-op picture
greatly improves (2 mallocs to none):
picture_overhead_draw: 450ns -> 350ns
picture_overhead_nodraw: 300ns -> 90ns
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/c92c129ff85b05a714bd1bf921c02d5e14651f8b
Latest blink_linux_rel:
http://build.chromium.org/p/tryserver.blink/builders/linux_blink_rel/builds/61248
Review URL: https://codereview.chromium.org/1112523006
Reason for revert:
break cros build
Original issue's description:
> SkPDF: Add Sfntly to DEPS, gyp
>
> Note: this can be disabled via:
> GYP_DEFINES='skia_pdf_use_sfntly=0
>
> Warning: dm is 34% slower and uses 9% more memory. This is
> okay.
>
> Motivation: We want to test this code path in DM, since it is
> always used by Chromium and Android.
>
> BUG=skia:3563
>
> Committed: https://skia.googlesource.com/skia/+/6a53b04e26749ea61f690ece408f2a1c0a5ad5bbTBR=reed@google.com,mtklein@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:3563
Review URL: https://codereview.chromium.org/1128353004
Note: this can be disabled via:
GYP_DEFINES='skia_pdf_use_sfntly=0
Warning: dm is 34% slower and uses 9% more memory. This is
okay.
Motivation: We want to test this code path in DM, since it is
always used by Chromium and Android.
BUG=skia:3563
Review URL: https://codereview.chromium.org/1134683006
Adds and uses fastMulDiv255Round() where possible,
which approximates x*y/255 as (x*y+x)/256. Seems like a sizeable
speedup, as seen below on Exclusion, Screen, and Modulate. The
existing NEON code uses this approximation for
{Src,Dst}x{In,Out,Over}, and without it we'd regress speed there.
This will require rebaselines whether or not we use this
approximation: the x86 bots change if we do, the ARM bots change
if we don't. None of the diffs are significant.
Desktop:
Xfermode_Screen_aa 5.82ms -> 5.54ms 0.95x
Xfermode_Modulate_aa 5.67ms -> 5.36ms 0.95x
Xfermode_Exclusion_aa 6.18ms -> 5.81ms 0.94x
Xfermode_Exclusion 5.03ms -> 4.24ms 0.84x
Xfermode_Screen 4.51ms -> 3.59ms 0.8x
Xfermode_Modulate 4.2ms -> 3.19ms 0.76x
Xfermode_DstOver 6.73ms -> 3.88ms 0.58x
Xfermode_SrcOut 6.47ms -> 3.48ms 0.54x
Xfermode_SrcIn 6.46ms -> 3.46ms 0.54x
Xfermode_DstOut 6.49ms -> 3.41ms 0.52x
Xfermode_DstIn 6.5ms -> 3.32ms 0.51x
Xfermode_Src_aa 9.53ms -> 4.75ms 0.5x
Xfermode_Clear_aa 9.65ms -> 4.8ms 0.5x
Xfermode_DstIn_aa 11.5ms -> 5.57ms 0.49x
Xfermode_DstOver_aa 11.6ms -> 5.63ms 0.49x
Xfermode_SrcOut_aa 11.6ms -> 5.5ms 0.47x
Xfermode_SrcIn_aa 11.7ms -> 5.51ms 0.47x
Xfermode_DstOut_aa 11.7ms -> 5.4ms 0.46x
N7 performance is close enough to 1x that I'm not sure whether
this is a net win, net loss, or truly neutral. I figure the bots will
show that.
I experimented with another approximation,
(x*(255-y))/255 ≈ (x*(256-y))/256. This was inconclusive, so I'm
leaving it out for now.
The remaining modes are the complicated conditional ones.
BUG=skia:
Review URL: https://codereview.chromium.org/1141953004
Cheap (one contour) paths can be evaluated and reversed as needed with a minimum of checking, but multi-contour paths invoke the regular path ops machinery to determine who is contained by whom.
More tests need to be added to verify that all corner cases are considered, but this fixes the cases in the bug thus far.
R=fmalita@chromium.orgTBR=reed@google.com
BUG=skia:3838
Review URL: https://codereview.chromium.org/1129193006