This reverts commit 045d7513d7.
Reason for revert: experiment complete
Original change's description:
> Performance experiment: Disable SkSL inliner in nanobench/skpbench.
>
> This is a reland of 3f35ac10b4
>
> Time to run this experiment again.
>
> Original change's description:
> > Performance experiment: Disable SkSL inliner in nanobench/skpbench.
> >
> > This will allow us to measure the impact (positive or negative) of
> > inlining functions before submitting a shader to the driver.
> >
> > Change-Id: Icbd64096445a353187b30feea68573d89ca18664
> > Reviewed-on: https://skia-review.googlesource.com/c/skia/+/384317
> > Reviewed-by: Brian Osman <brianosman@google.com>
> > Commit-Queue: John Stiles <johnstiles@google.com>
>
> Change-Id: I278a770d4129f4ad0bf867c33a01b49a88cea588
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387256
> Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
> Commit-Queue: John Stiles <johnstiles@google.com>
TBR=brianosman@google.com,ethannicholas@google.com,johnstiles@google.com
Change-Id: I54d417e65bd134dee72ff46e9331f8fabfc724df
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387536
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Skia does not call set or get filter-quality any more
(except for legacy picture deserialization)
Change-Id: I504caf407ca68392481b771040e5d3280bf7da7f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387439
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This is a reland of 3f35ac10b4
Time to run this experiment again.
Original change's description:
> Performance experiment: Disable SkSL inliner in nanobench/skpbench.
>
> This will allow us to measure the impact (positive or negative) of
> inlining functions before submitting a shader to the driver.
>
> Change-Id: Icbd64096445a353187b30feea68573d89ca18664
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/384317
> Reviewed-by: Brian Osman <brianosman@google.com>
> Commit-Queue: John Stiles <johnstiles@google.com>
Change-Id: I278a770d4129f4ad0bf867c33a01b49a88cea588
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387256
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
(In the future, we may determine that we need more than NN.)
Change-Id: Idf4318e67bf8201792cb0f6b307509fb81d90a23
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386799
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This means the draw is entirely clipped out, so we just don't even
create the FP to begin with.
Change-Id: I6d8a2a2e18be07c8a1408437c4bcc3d9349b77a2
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387057
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Adlai Holler <adlai@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
littleRound is used to round widths before a comparison with
maxWidth. The rounding only matters on a very small interval around
maxWidth. Take advantage of this fact to only do the expensive
rounding when close to maxWidth.
This is a 2-3% improvement.
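A minimal sketch of the idea, not the actual paragraph code: the helper
names, the two-decimal rounding, and the margin value are all assumptions
made for illustration. Rounding to hundredths moves a value by at most
0.005, so any width further than that from maxWidth compares the same way
with or without rounding.

```cpp
#include <cmath>

// Hypothetical stand-in for the expensive rounding step (assumed here to
// round to two decimal places; the real littleRound may differ).
inline float roundToHundredths(float v) {
    return std::round(v * 100.0f) / 100.0f;
}

// Returns true when the rounded width exceeds maxWidth, but only pays for
// the rounding when width is within kMargin of maxWidth.
inline bool exceedsMaxWidth(float width, float maxWidth) {
    // Assumed margin: comfortably larger than the 0.005 maximum shift that
    // rounding to hundredths can introduce.
    constexpr float kMargin = 0.01f;
    if (width > maxWidth + kMargin) {
        return true;   // far above the limit: rounding cannot change the answer
    }
    if (width < maxWidth - kMargin) {
        return false;  // far below the limit: rounding cannot change the answer
    }
    return roundToHundredths(width) > maxWidth;  // close call: round, then compare
}
```

In this sketch the fast path is two float comparisons, and the slow path
runs only on the narrow interval around maxWidth.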
Change-Id: If5b18ed4be56c1c8fa80b97d49930145d0f09b20
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386844
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Julia Lavrova <jlavrova@google.com>
This exposes the implementation of calculateWidth to the
cluster building function. It improves performance by 2-3%.
Change-Id: I6be71ef2c9bdd4fb59531fc53cc3868434cba79d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387216
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Julia Lavrova <jlavrova@google.com>
This allows iterateThroughClustersInTextOrder to be inlined and, in
turn, to inline the visitor. This results in a 2-3% speed up.
Change-Id: Iacc137145547dc44dfbbddf2fa340d2945089169
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386818
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Julia Lavrova <jlavrova@google.com>
This lets us plan out the allocation of resources without
actually committing to the resulting plan. In the future,
the user will be able to do the register allocation, then
query the estimated memory cost, and either commit to
that allocation or try a different order of operations.
The differences between this and the original (386097) are that we sort
fFinishedIntvls by increasing start instead of increasing end, and we
use the GrUniqueKey.hash instead of the default crc hash.
Bug: skia:10877
Change-Id: Idc405e2b4532c4cd0ae4127210ba3b42de27bd46
Cq-Include-Trybots: luci.skia.skia.primary:Canary-Chromium,Test-Debian10-Clang-GCE-GPU-SwiftShader-x86_64-Debug-All-SwiftShader_MSAN
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386888
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Adlai Holler <adlai@google.com>
No behavior/pixel differences expected.
Change-Id: I9916a74de5063fd81f78bc3744ed32460e12c656
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387236
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Aiming to describe the interesting ways that we diverge from GLSL,
focused on semantics, not syntax. Left a placeholder for coordinates;
putting together good examples takes time, so I want to land this in
pieces.
Preview: https://skia.org/user/sksl?cl=386797
Bug: skia:11763
Change-Id: I4608774ad2896b4f2bd386e7d03065e380945861
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386797
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
Move the ctor Cluster::Cluster into ParagraphImpl.cpp to allow
inlining. This results in about a 5% speed up using means.
          min      median   mean     max
before:   57.2µs   61.3µs   60.1µs   64.1µs
after:    55.8µs   58.4µs   57.3µs   59.4µs
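For reference, the quoted speed up can be reproduced from the mean column.
This is only an illustrative calculation; the function name is made up for
the example.

```cpp
// Relative improvement of `after` over `before`, in percent.
// Hypothetical helper, used only to check the figures above.
constexpr double percentFaster(double before, double after) {
    return (before - after) / before * 100.0;
}

// Means from the table above: 60.1µs before, 57.3µs after.
constexpr double kMeanSpeedUp = percentFaster(60.1, 57.3);  // roughly 4.7
```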
Change-Id: Ie4cfcae9fde601ccf4a42aec69a853cc0bddb377
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386817
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Julia Lavrova <jlavrova@google.com>
This saves a significant amount of CPU time and, now that the inliner
can handle nested expressions, still inlines almost everything.
Change-Id: I8f198630fa9627bc433ef8fb72f6bcf94595cdaa
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386917
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
A follow-up CL can remove the filter-quality from onProgram.
Change-Id: I770e3b1fd0907bf3824ed402502fa67325a433d5
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/381799
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
The purpose of this method was to keep FP values from getting too large
for the now-deleted coverage counting shaders.
Change-Id: I9f86c2adf64cc5e66ed9585d18945e8a2be35c34
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387056
Reviewed-by: Adlai Holler <adlai@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
We don't currently do FunctionCall optimization, but implementing
something along the lines of skia:10835 would probably involve doing
rewrites for optimization in FunctionCall::Make. This CL is the first
step down that road.
Change-Id: I249b02412e7ebac21bb98d6c5d61af3dcd6f1e69
Bug: skia:11342
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/387156
Commit-Queue: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This will allow the inliner to successfully do more work in a single
pass.
Change-Id: I26e8831737c10bdf9a35eebd94ea8b74f6487077
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386916
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Call onDraw in onDelayedSetup to warm up the glyph cache; otherwise
nanobench will miscalculate the loop count.
This change also reduces the variability of the benchmark.
Change-Id: I5f3f2167cc78b996fcb589644b70622d18af240b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386816
Commit-Queue: Mike Reed <reed@google.com>
Auto-Submit: Herb Derby <herb@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Arguments without side-effects that aren't read from more than once can
be moved directly into the inlined function, and don't need a scratch
variable. This can allow functions like `guarded_divide` to inline
completely in more cases.
Change-Id: I0bfce35635cf9779f4af1bc0790da966ccfe4230
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386678
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Update Wuffs to match the version currently used by Chromium
(updated in http://crrev.com/c/2731168)
Change-Id: If3f1ccad6a9ff6202391ee79a9b4d3b413a4cc25
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386756
Auto-Submit: Leon Scroggins <scroggo@google.com>
Reviewed-by: Nigel Tao <nigeltao@google.com>
Commit-Queue: Leon Scroggins <scroggo@google.com>
For now, just turns on fastmath.
Change-Id: Ica821dbfd5b30f9e0cceb1eed9443905987f292a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/385882
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
For immutable samplers, we claimed that the GrVkPipelineState was
taking over the ref from GrVkUniformHandler, but this never actually
happened, and the sampler was still unref'ed when the uniform handler
went away. We never unref'ed it in the GrVkPipeline, so all our
refs/unrefs still lined up. We were only saved because the
GrVkResourceProvider happens to hold the ref on its samplers longer than
the GrVkPipeline that uses them.
This CL cleans up all those issues.
Change-Id: I113f011979cf7ba3d734f9c518513598581a3efd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386677
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
39296f396f..ba0bd78574
2021-03-19 jiajia.qin@intel.com Fix the assert error and inbalence parens for SSBO
2021-03-18 jmadill@chromium.org Vulkan: Use packed enum map for descriptor set index.
2021-03-18 jmadill@chromium.org Vulkan: Clean up shader buffer DS allocation.
2021-03-18 cclao@google.com Vulkan: Move CommandBufferHelper::reset() closer to constructor
2021-03-18 cclao@google.com Vulkan: Test render and sample the same texture but with different LOD
2021-03-18 gert.wollny@collabora.com scripts: Ignore "rapidsjon/..." when checking includes
2021-03-18 angle-autoroll@skia-public.iam.gserviceaccount.com Roll Chromium from 60fea25f23e6 to e7ef5f7d0368 (472 revisions)
If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/angle-skia-autoroll
Please CC michaelludwig@google.com on the revert to ensure that a human
is aware of the problem.
To report a problem with the AutoRoller itself, please file a bug:
https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/master/autoroll/README.md
Cq-Include-Trybots: skia/skia.primary:Build-Debian10-Clang-x86_64-Release-ANGLE;skia/skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-Golo-GPU-QuadroP400-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUC8i5BEK-GPU-IntelIris655-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE
Tbr: michaelludwig@google.com
Test: Test: FramebufferTest_ES3.SampleFromAttachedTextureWithDifferentLOD
Change-Id: Ie4f7f24acffe694ad14c41958fb4a65cada012a4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386753
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Previously, we would copy all the plane data into an array of half4s,
then copy and swizzle them back out. This change writes the results
directly into the result color, and doesn't require any temporary
variables.
(Motivation: I noticed this shader in some of the Mali-400 regression
cases when testing with inliner-off, so I'm looking for cases where this
code could be simplified.)
Change-Id: I7d79ad519fb53f7d8e33c4d545e8a197023cec5b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386836
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This reverts commit c6f78ff55d.
Reason for revert: Broke Chrome roll and MSAN
Original change's description:
> Do register allocation in GrResourceAllocator
>
> This lets us plan out the allocation of resources without
> actually committing to the resulting plan. In the future,
> the user will be able to do the register allocation, then
> query the estimated memory cost, and either commit to
> that allocation or try a different order of operations.
>
> Bug: skia:10877
> Change-Id: I34f92b01986dc2a0dd72e85d42283fc438c5fc82
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386097
> Commit-Queue: Adlai Holler <adlai@google.com>
> Reviewed-by: Robert Phillips <robertphillips@google.com>
TBR=robertphillips@google.com,adlai@google.com
Change-Id: I7492c12b8188ed22c3cd80fd4068da402d8d3543
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:10877
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386856
Reviewed-by: Adlai Holler <adlai@google.com>
Commit-Queue: Adlai Holler <adlai@google.com>
SkDevice::getGlobalBounds() transforms the device's local clip bounds
by its device-to-global matrix. The SkDevice constructor expects bounds
to already be in its own coordinate system (SkNoPixelsDevice ought to
just take dimensions really).
For now, all device-to-global transforms are integer translates, so
this didn't dramatically change the bounds. Moving forward, however,
image filters may cause these transforms to include rotation, skew,
and perspective (once https://skia-review.googlesource.com/c/skia/+/334040
lands).
This change to TrackingDevice is lifted from the above CL since it
may play a role in addressing fixes for chromium:1187246, and it fixes
strike cache misses from the canary runs of the above image filter CL.
Bug: 1187246, skia:11240
Change-Id: I67c8446ddbf5aaed144d439ab8d1e7998e9bfa01
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386696
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
This is a reland of 579728eb19
Original change's description:
> Add SVG to default modules list
>
> This enables SVG to build in official builds.
>
> Change-Id: I4f64109983216baf9663061e23cc3757292ff448
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386096
> Reviewed-by: Florin Malita <fmalita@google.com>
> Commit-Queue: Tyler Denniston <tdenniston@google.com>
Change-Id: I8bb93f3881e69f7b4461981a4f0f95a87fed0976
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386557
Reviewed-by: Florin Malita <fmalita@google.com>
Commit-Queue: Tyler Denniston <tdenniston@google.com>
Every caller of SymbolTable::takeOwnershipOfString was allocating a
unique_ptr<String> instead of just moving the String object directly.
This was important, as the SymbolTable needed to return a stable String
pointer to the caller, and vector<String> is allowed to move its
elements around when it is resized.
On the other hand, deque<String> promises pointer stability even after a
resize. Replacing the vector with a deque lets us avoid allocating an
extra String object every time we call takeOwnershipOfString.
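A simplified sketch of the pattern, using std::string and a made-up
StringOwner class rather than the real SkSL SymbolTable and String types.
It relies on the standard guarantee that std::deque::push_back does not
invalidate pointers or references to existing elements.

```cpp
#include <deque>
#include <string>
#include <utility>

// Hypothetical, simplified owner of strings; not the actual SymbolTable.
class StringOwner {
public:
    // Moves the string into the deque and returns a pointer to the stored
    // copy. The pointer stays valid as more strings are appended, because
    // push_back on a deque never relocates existing elements.
    const std::string* takeOwnershipOfString(std::string str) {
        fOwnedStrings.push_back(std::move(str));
        return &fOwnedStrings.back();
    }

private:
    std::deque<std::string> fOwnedStrings;
};
```

With a std::vector in place of the deque, the same code could return
dangling pointers after a reallocation, which is why the previous version
needed a separate heap allocation per string.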
Change-Id: I8947c0900fd355c940b046a52a4c1762465b55d3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386596
Auto-Submit: John Stiles <johnstiles@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Using the dedicated function is slightly faster than calling snprintf
and is documented to behave the same.
Change-Id: I9bc64066b55cf74d2369c531283c68e05bcc402c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386676
Auto-Submit: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This lets us plan out the allocation of resources without
actually committing to the resulting plan. In the future,
the user will be able to do the register allocation, then
query the estimated memory cost, and either commit to
that allocation or try a different order of operations.
Bug: skia:10877
Change-Id: I34f92b01986dc2a0dd72e85d42283fc438c5fc82
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386097
Commit-Queue: Adlai Holler <adlai@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Bug: skia:11760
Change-Id: I1baaec529b47954018d000856912f121b4f1454a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386597
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Dump information about the Difference, Serialization,
Deserialization, and Drawing phases of Renderer/GPU text
drawing.
Add the following to your args.gn to turn on telemetry.
extra_cflags = ["-D", "SK_TRACE_GLYPH_RUN_PROCESS"]
Change-Id: If435257574b74910822dbb90cc9dbca311578fe8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/385696
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
The render_passes GPU stat is backend-agnostic and gets
more to the point of what we're trying to measure. It's verified
to be stable across configs (e.g. non-msaa vs msaa) and
correctly shows effects of reducesOpsTaskSplitting.
Rob and I have an alert on this stat from perf.skia.org
Bug: skia:10877
Change-Id: I71520cf8fd311545faf05ee5c55db185ed48c6a8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/386561
Auto-Submit: Adlai Holler <adlai@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
This reverts commit a9c187e5cc.
Change-Id: Icbfb8abdfc67fc2e6428d97a6cdede2726fb56e4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/385596
Auto-Submit: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
The record time allocator's initial block size is 12,800
(100 * sizeof(GrPipeline)) bytes.
This means that the Fibonacci progression grows rapidly to large
block sizes. This is causing out of memory errors in Chrome and
Android as they try to allocate very large blocks.
Reduce the initial allocation to 1024 bytes to reduce the growth
rate.
 F     *1,024      *12,800
==========================
 0      1,024       12,800
 1      1,024       12,800
 2      2,048       25,600
 3      3,072       38,400
 4      5,120       64,000
 5      8,192      102,400
 6     13,312      166,400
 7     21,504      268,800
 8     34,816      435,200
 9     56,320      704,000
10     91,136    1,139,200
11    147,456    1,843,200
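The table can be reproduced with a short sketch, assuming each block is
simply the initial size times the next Fibonacci number. The function name
is made up for this example, and the real allocator may also apply
alignment or size caps that are not modeled here.

```cpp
#include <cstddef>
#include <vector>

// Returns the first `count` block sizes for a Fibonacci growth policy that
// starts at `initialSize` bytes. Hypothetical helper for illustration only.
std::vector<std::size_t> fibonacciBlockSizes(std::size_t initialSize,
                                             std::size_t count) {
    std::vector<std::size_t> sizes;
    sizes.reserve(count);
    std::size_t prev = 0, curr = 1;  // Fibonacci multipliers: 1, 1, 2, 3, 5, ...
    for (std::size_t i = 0; i < count; ++i) {
        sizes.push_back(initialSize * curr);
        std::size_t next = prev + curr;
        prev = curr;
        curr = next;
    }
    return sizes;
}
```

Calling fibonacciBlockSizes(1024, 12) and fibonacciBlockSizes(12800, 12)
yields the two columns above, ending at 147,456 and 1,843,200 bytes.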
Bug: b/182959903
Bug: chromium:1188071
Change-Id: I5ef1c736efb42b2bccd78549d129154c0857bbca
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/385938
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>