Change-Id: I93ff7e5f1062c6a85152c587fcedc34e9257dd27
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319345
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
I've been thinking and rethinking and rethinking how best to use 16-bit
values like Q14 fixed-point in SkVM. Here's some ways:
A) don't... just use 32-bit values instead
B) use 16x2-bit pairs to match the narrower 32-bit lane count
C) double-pump 32-bit values to match the wider 16-bit lane count
D) use native 16- and 32-bit values and let the backends sort it out
A) is how things work today, and C) is how SkRasterPipeline's lowp mode
works. Having tried out B) and C) both for a good fair shake, they were
both already awkward to work with after writing just a few functions. I
would not give up on them entirely, but they're no longer my favorites.
D) is subtle and my new favorite. It's easiest to program with SkVM
when the values we're holding represent single values and the backend
handles any parallelism for us. That suggests we add a simple 16-bit
Q14 to the existing 32-bit I32 and F32 types, where they can be actively
converted between as normal, but not freely no-op bit punned. D) says
we people shouldn't have to choose between A-C) up front... each backend
can handle it themselves.
Under strategy D), it's entirely the backend's job to decide how to
represent each value, and how to to vectorize them. We don't need to
know as a user, and the backends can use the program itself to inform
how they vectorize. 16-bit values could live in xmm registers and
32-bit values in ymm, or the 16-bit values could go in the low half of a
ymm, or the even lanes of a ymm, or a full ymm and use two for 32-bit
values, etc. etc. This all is a backend choice, not something we should
have to know about when writing a program using Q14/I32/F32.
My next steps are to get Q14 operations tested and plumbed through the
JIT again, and to build out a blitter and a few effects using Q14 color
channels. Then, independently, we can look at each backend and how to
vectorize them. Some ideas:
1) keep running at current vectorization, with half rate 16-bit ops
2) pump up to 2x wider vectorization unconditionally to favor 16-bit
3) pump up to 2x wider vectorization only when any 16-bit op is used
These choices can be made independently for each backend (JIT, LLVM,
interp), and I wouldn't be surprised to find that we'll want to do them
differently. For instance, the interpreter is already running at 32x
vectorization... might be pumping it higher won't help anything.
Change-Id: Ib8ad2b1bf790e8c4e3acfb4818d4032f7628e8f8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319321
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Also adds GalaxyS20 to the mix, which wasn't running skpbench
previously.
Also removes the skpbench logic to fail if we don't recognize the
hardware or have specific scripts for it. We don't have time to reverse
engineer every new piece of hardware we want to run on and the general
android script is quite helpful already.
Bug: skia:10419
Change-Id: I0e139cdd4bc2e7ca0e2e14c715d319664fa8c949
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319143
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Reviewed-by: Kevin Lubick <kjlubick@google.com>
Records programInfos for the stroke ops and for the stencil portions
of the path ops.
We can't prePrepare programInfos for the fill portions yet because it
would require multiple GrPipelines that all reference the same
GrProcessorSet. And GrProcessorSet is currently designed to be
std::moved into one single GrPipeline.
Bug: skia:10419
Change-Id: I3b8c061da181e20d3ff68746cf4b9c61f6d73a88
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319256
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Change-Id: I5b4fe40847112a11d6057ee7acd208879a71722f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319190
Commit-Queue: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This reverts commit b61c3a9a01.
Change-Id: I42d93bdc6455c8ef941a6cbe1339df2ba916bb3c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318697
Auto-Submit: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
We don't want to be polluting the global namespace with external values,
especially when the typical/recommended way to use the Compiler is with
a single long-lived instance. Force client code to manage ownership (the
only non-unit-test case was already doing this), and pass external
values to convertProgram, so they can be added to the Program's symbol
table.
Change-Id: If4c1db5e48a62e2cf4333b8d80420f2dfede27ab
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319125
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
This reverts commit 2bded27a96.
Reason for revert: Seems to be blocking the Chrome roll
Original change's description:
> Allow rect and circle blur fast cases to be used with rotation matrices.
>
> For circles this is trivial. The existing shader works as is.
>
> For rects this requires back projecting from device space.
>
> Adds a GM for rotated rect blurs and modifies a circle blur GM to add
> rotation.
>
> Bug: chromium:1087705
>
> Change-Id: I6b969552fbcc9f9997cfa061b3a312a5a71e8841
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318757
> Reviewed-by: Robert Phillips <robertphillips@google.com>
> Commit-Queue: Brian Salomon <bsalomon@google.com>
TBR=bsalomon@google.com,robertphillips@google.com
Change-Id: Iafb479f3b3561e226678a3020254c6e76d4ce284
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: chromium:1087705
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319186
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Adds base class GrD3DAlloc and GrD3DMemoryAllocator, and a reference
to a GrD3DMemoryAllocator in GrBackendContext and a reference to a
GrD3DAlloc in GrD3DTextureResourceInfo. Internally, we override this
base class to define the AMD memory allocator.
Change-Id: I033924b0247ea330969b1398f25985e7a84aec11
Bug: skia:9935
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/317243
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
If the blob is empty, then try to regenerate it. Using
this method caused a slowdown in Skia perf, so we
added an extra check to allow some empty blobs through
for perf performance. The perf problem was caused by
SKPs generate empty blobs because of font mismatches.
Flutter has shown that scaling from very small to
normal size is not correctly handled by the existing
check. This CL favors correctness over optimizing empty
text blob and always regenerates empty blobs.
https://github.com/flutter/flutter/issues/64936
Change-Id: Ib18ecb684b0af5cf6dce274b6dc09a9c61b17c77
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319031
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Many calls to `setRefKind` failed to check the return value; if it's
false, an error has occurred and the program is in a bad state.
Specifically, there is an assignment to a variable that's not marked as
"written-to." If we continue processing the program, we're likely to
assert.
Change-Id: I2dd5d1f41aa5ca0d30f8d638f05fe2e838216d78
Bug: skia:10753
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319116
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Adds various optimizations to GrWangsFormula as well as "pow4" variants
of the formula that are quicker than the standard and/or log2 versions.
Uses the pow4 variants in GrStrokePatchBuilder.
Bug: skia:10419
Change-Id: I8478582df5296b088d25808bcaeb93107ff20797
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318954
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
These cover:
- Properly configured out-params
- Invalid/non-lvalue out-params, which currently cause an SkSL crash
- Interactions between the inliner and variable swizzles
Change-Id: I4874101236084f273e704d8717149b431d813883
Bug: skia:10753
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319036
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
For circles this is trivial. The existing shader works as is.
For rects this requires back projecting from device space.
Adds a GM for rotated rect blurs and modifies a circle blur GM to add
rotation.
Bug: chromium:1087705
Change-Id: I6b969552fbcc9f9997cfa061b3a312a5a71e8841
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318757
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
- Remove a spurious symbol table inserted by convertProgram. start()
already pushes a symbol table, and this was pushing a second one,
which didn't seem necessary. (The Parser can inject symbols for types
it discovers, but I can't justify those needing to be in a different
table than the rest of the program elements?)
- The convertProgram one had a comment indicating that it was popped by
the Compiler. That wasn't true, so this gets us one step closer to
balance.
- The one in start() is meant to be balanced by a pop in finish(), but
no one ever called finish(). Add that call in, and also rearrange
things so that the base symbol table is a parameter to start(), rather
than just setting it on the IR generator. (There's more of this
pattern around, but I wanted to limit the scope of this CL).
- When dehydrating the include files, we had logic to work around the
extra symbol table (absorbing the symbols) - that's not needed now.
- Simplify some other logic in processIncludeFile (no need to make so
many string copies). Always just put the incoming include file strings
into the root table, also. It's largely irrelevant where they go.
Change-Id: I18d897af3d5fa6506e11024beb9bb70e6cc5b538
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319038
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Implements the previous join as a sub-section of the tessellation
patch. This cuts the number of vertex shader invocations in half since
we are no longer treating joins as their own patch. (This therefore
cuts the amount of inflection/midtangent chopping work in half also.)
This required a lot of modifications to GrStrokePatchBuilder.cpp, so
this CL also finishes up the chopping logic in that file for when there
aren't enough tessellation segments to render a curve.
Bug: skia:10419
Change-Id: I3da081fe756c97aeeb65e27f1319a29763b4ad34
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318876
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
The code here is resuscitated from the GLSL testing harness at
http://review.skia.org/317771 . Testing and debugging in dm is simpler
than debugging skslc when we encounter compiler issues.
Change-Id: I492dd0bbd2f0ee0b3b6a1b5253da54d72dc1eb3b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319032
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Two Compiler methods that use the root symbol table are still dangerous,
they're polluting the global namespace. For long-lived compilers, we
don't want to do that. takeOwnership is only used in tests, but
registerExternalValue is used by particles. Thinking I'll move that to
an optional argument to convertProgram, or a field on Settings.
Change-Id: Ic88d29d053510001931dcc2388aba2dc83a953ea
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319030
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
(Removed one test, SkSLFPSwitchWithMultipleReturnsInside, because it was
redundant with existing tests.)
Change-Id: I1bfc069babdb5eb0cc515f195c3a2e307bb5871a
Bug: skia:10694
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319029
Commit-Queue: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This reverts commit a38945abe3.
Reason for revert: Pinpoint says it costs RAM on Mac (?), and doesn't show any visible perf benefit
Original change's description:
> Calculate texture clamping X/Y coordinates in parallel.
>
> The X and Y values of `clampedCoord`, `extraRepeatCoord`, and `snapped`
> were being calculated and stored separately, even in cases where work
> could easily be done in tandem. Updated the code so that we use .xy when
> it makes sense to do so.
>
> Change-Id: I10d85670acb4fec960444b3f3c30f2929c6dcaf2
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318436
> Auto-Submit: John Stiles <johnstiles@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
> Commit-Queue: Brian Salomon <bsalomon@google.com>
> Commit-Queue: John Stiles <johnstiles@google.com>
TBR=bsalomon@google.com,johnstiles@google.com
Change-Id: I10aaba4caeacaa0b081d10cc900044c37f690782
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/319021
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
This appears to fix the performance regressions seen on a number of webpage
SKPs for vkmsaa8 and glmsaa targets on devices that can't disable multisampling.
Bug: skia:10730
Change-Id: I2c10a78cfe864f987b7f17eb69eb2e1712ca0676
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/317772
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
f39e0f01aa..2a09e89113
2020-09-23 jmadill@chromium.org Isolated script args in restricted trace tests.
2020-09-23 jmadill@chromium.org Suppress two more compute shader tests.
2020-09-23 jmadill@chromium.org Skip another slow test on Win/Intel/D3D11.
2020-09-22 courtneygo@google.com Vulkan: Fix racy access to VkPipelineCache
2020-09-22 cclao@google.com Vulkan: Add a alpha to coverage test
2020-09-22 lehoangq@gmail.com Metal: Support array of matrices varying.
2020-09-22 geofflang@google.com Vulkan: Add features to modify sampling parameters
2020-09-22 timvp@google.com Vulkan: Allocate descriptor pools with layouts
2020-09-22 jmadill@chromium.org Use c++ thread_local for current thread.
2020-09-22 geofflang@google.com Initialize InternalFormat::blendSupport.
2020-09-22 jmadill@chromium.org Pass #pragma optimize setting down to compilation.
2020-09-22 jmadill@chromium.org Give restricted gold tests executable bit.
2020-09-22 jmadill@chromium.org Test Runner: Improve sharding validation.
2020-09-22 angle-autoroll@skia-public.iam.gserviceaccount.com Roll SwiftShader from fe878dedd5ad to 6aadd31a5f98 (1 revision)
2020-09-22 angle-autoroll@skia-public.iam.gserviceaccount.com Roll Vulkan-Loader from a148cdd49041 to 8fdc21e4078d (1 revision)
2020-09-22 angle-autoroll@skia-public.iam.gserviceaccount.com Roll SPIRV-Tools from 60ce96e2ff10 to 125b64241909 (1 revision)
2020-09-22 jmadill@chromium.org Suppress 3 more Intel/GL/Compute shader fails.
2020-09-22 lehoangq@gmail.com Metal: Disable unused attribute slots.
If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/angle-skia-autoroll
Please CC csmartdalton@google.com on the revert to ensure that a human
is aware of the problem.
To report a problem with the AutoRoller itself, please file a bug:
https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/master/autoroll/README.md
Cq-Include-Trybots: skia/skia.primary:Build-Debian10-Clang-x86_64-Release-ANGLE;skia/skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-Golo-GPU-QuadroP400-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUC8i5BEK-GPU-IntelIris655-x86_64-Debug-All-ANGLE;skia/skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE
Bug: None
Tbr: csmartdalton@google.com
Test: Test: CQTest: Test: DrawCallPerfBenchmark.Run/vulkan_nullTest: Test: EGLContextASANTest.DestroyContextInUse/ES3_Vulkan
Change-Id: I6c6c95919bb1f17aceb0e75954d258942d8128ed
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318937
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Also does the same for the quadratic variants.
Bug: skia:10419
Change-Id: I4a0e46a0d76d16dcf452f39c7e2552975ec46ed6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318783
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
This makes IRIntrinsicMap an actual type, and supports chaining (so an
intrinsic map can have a parent, just like a symbol table). That lets us
put Enums and defined functions at multiple levels of the pre-include
hierarchy.
With that done, we add an intrinsic map for sksl_fp.sksl, containing the
enum declarations from that file. This lets .fp processing using the FP
intrinsic map (which is parented to the GPU one) to resolve those enums
(PMConversion, GrClipEdgeType), as well as the enums in sksl_gpu
(SkBlendMode).
Because sksl_fp was being used to generate an inherited element list
(containing several builtin variables), I have relaxed the restriction
around grab_intrinsics - unsupported element types are simply left in
the original vector, unchanged. for the GPU and interpreter intrinsic
maps (where the element lists are discarded), we still assert that we
didn't end up with any unsupported elements.
Doing all of this lets us remove the redundant enum resolution code in
IR generator (where we previously supported looking up enums in both the
inherited element list, and in the intrinsic map).
Subsequent changes will add support for variables/declarations to the
intrinsic map, so we won't need both the inherited list and the
intrinsic map, if all goes well.
Change-Id: Ic6174511e5f8d68f65e4919f2ec0b923717d6cd9
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318212
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
FP gencode reads back matrix values in its dumpInfo method using rc().
This worked for SkM44; now it will work for SkMatrix as well.
Change-Id: I4efc7018529bd3c84aacc073fad2bfca7e12c517
Bug: skia:10748
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318756
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Cleaned up some nearby code while implementing this fix as well.
Change-Id: Ic084451f0d9fc12169e1720a8889a290249eb5e9
Bug: skia:10750
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318796
Commit-Queue: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
`isFixedPitch` is a bool*. The surrounding lines were updated to
maintain consistency.
Change-Id: Ic604d159ff77a1f4b7e2942c0bdbd4d68f0318b5
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/307496
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Previously, we copied intrinsic functions in a totally arbitrary order;
it used an unordered_set of pointers, so it could be affected by
switching standard libraries OR by malloc nondeterminism. (Surprisingly,
it was fairly consistent in practice on OS X/Linux.) This CL sorts the
intrinsic functions into a consistent order before copying them.
Change-Id: If90342bb77a9ae237a3ce91be3a9847311a722c4
Bug: skia:10749
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318700
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
In this CL, the result expression has been updated to use
VariableExpression; followup CLs will update the VariableRewriteMap to
leverage it as well.
This is intended to allow the inliner to inline simple expressions (such
as literals or swizzles) directly if they are not written to, instead of
making an extraneous copy.
Change-Id: I050057d8c3e940e5e44c22fde2f4bc37bb4c6754
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/318576
Auto-Submit: John Stiles <johnstiles@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>