Create a simple linked list to sequence the subruns preparing
the way for a more focused allocator for managing subruns
at the end of the GrTextBlob.
Change-Id: I595e2ce2810d161332a23405e4615724d2953471
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/366957
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Also add a unit test that the vectorized version equals the reference
implementation.
Bug: skia:10419
Change-Id: I4d165fd45532e9ec468565d0637fb769b51f5fcd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/345122
Commit-Queue: Tyler Denniston <tdenniston@google.com>
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Now there is only one op to tessellate a stroke, and it creates its
own GrStrokeIndirectTessellator or GrStrokeHardwareTessellator
internally. This will allow us to dynamically switch into hardware
tessellation when we need to batch strokes that have different
parameters or colors.
Bug: chromium:1172543
Bug: skia:10419
Change-Id: I3cddb855fdbb9ab018785584497c843e3e31b75e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/366056
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Simplifies the op to always triangulate the inner fan. We ensure this
will always work by using breadcrumb triangles. Also removes the
fail-on-non-simple mode from GrInnerFanTriangulator since it isn't
being used anymore.
Bug: skia:10419
Change-Id: Idb849712bf2bc4e5ef785bc3f9b8db03119d230e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/359565
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Marks its methods const and lifts the breadcrumb list out into
function arguments. This is one more step toward the final vision
where GrTriangulator just has an allocator and control knobs, and
everything else is functional.
Bug: skia:10419
Change-Id: I77341c045d481da49ebfee06de5dfc7a2a8a07be
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/360956
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
For the tessellation op this allows it to use either the record-time
or flush-time allocator instead of making its own.
Bug: skia:10419
Change-Id: Ida13283feffda1976bd8fd221d5be0013a5de990
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/356516
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Change-Id: Ib150e6d6d3de34a85ce8051eea843ab3b2d7ab75
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/356921
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Also cleanup some of the duplicate code in SkRecords
Bug: skia:7650
Change-Id: I4d3167a892c126c19a54002beab25c9a6c96fa5d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/357000
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Rather than copying these triangles twice, we can run "pathToPolys"
during onPrePrepare and "polysToTriangles" during onPrepare.
Also adds a benchmark for normal and inner-fan triangulation.
Bug: skia:10419
Change-Id: Id301afde5de11d93ae026e75e42ac03a50867687
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/355177
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
This reduces complexity and will also allow us to add new ops that
take advantage of this same core logic.
Bug: skia:10419
Change-Id: I4ec8717a6d9510dea967d11467eeea0b5b7c7f4c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/354296
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
SkPaintPriv methods are just an internal stopgap
Change-Id: Ibe6e37c5871068d8cd67dc0948961444dfd2b62a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/347041
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Made it private and accessible internally via SkCanvasPriv.
Update SkGpuDevice methods/variables after rename of GrDrawSurfaceContext.
Cq-Include-Trybots: luci.skia.skia.primary:Canary-G3,Canary-Flutter,Canary-Android
Change-Id: I3da64cee1de03c201243ee6c7ccd4b4c44cad8c9
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/346498
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
... and lots and lots of IWYU
Change-Id: Ie5157dcdd2e6d29b95c71b39153278ab48ef4eb3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/346778
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This maps to usage better, and makes some code simpler to understand.
Note that there is still a PipelineStage *back-end*, which is specific
to the runtime-effect FP. A kRuntimeEffect_Kind program can be used to
generate a PipelineStage (for the GPU backend), or an skvm program (for
the CPU backend).
Change-Id: Id3f535db93a239726c595225aafe9467f0d19817
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/344969
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
None of these earliest testing tools are useful anymore
now that we can do useful work with SkVM and SkVMBlitter.
Change-Id: I8b25ef6ddd101c4ff8617c6742343dedb4764922
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/345456
Commit-Queue: Mike Klein <mtklein@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This is a reland of ad8b9fa2d7
Original change's description:
> Disable HW tessellation shaders by default
>
> The GLSL backdoor for tessellation shaders is only experimental at
> this point. In order to enable the tessellation path renderer, we need
> to turn off HW tessellation and let it use the indirect draw modes.
>
> Bug: skia:10419
> Change-Id: Ic979978a331c7ad016907cf42b2562b35caab8ab
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/344336
> Commit-Queue: Chris Dalton <csmartdalton@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
TBR=bsalomon@google.com
Bug: skia:10419
Change-Id: I0bdf34da134a3a34a5c78b35cc0eb86d6e989bb4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/344796
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
No more
friend class ::SkArenaAlloc; // for access to ctor
Change-Id: I76fa3319498a965623e6865b75d1fb507ab845a6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/344236
Auto-Submit: Mike Klein <mtklein@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Just the class/files. variable names and additional comments to follow.
Change-Id: Ic03d07fd5009eaf3d706c2536486a117328963fc
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/342617
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Adds a vertex shader that maps a variable-length triangle strip to a
stroke and its preceding join. Adds a new op that generates stroke
instances from a path, bins them by log2 triangle strip length (using
SIMD for the calculations), and renders them with indirect draws.
Bug: skia:10419
Change-Id: I6d52df02cffe97d14827c6d66136957f1859f53b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/339716
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
This was just wrong. It should be determine perspective
using the matrix that we are drawing with, and not the initial
matrix.
Change-Id: I8410ced714d2c766305656bdbd797f9dea59b71e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/339796
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Herb Derby <herb@google.com>
I think this is vestigial from some time in the past where RTC was
public.
Also just expose the methods that add ops rather than have so many
friends + testingOnly versions.
Change-Id: I60d9fdff23b2d67039a7b37815da7ff9e73d8999
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/339158
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Instead of using sub classing to generate the atlas API, give
the atlas portion its own interface. Then the atlas subrun
variants can subclass both interfaces.
Change-Id: I8a0ca3d19bd362877224fa64f6c49a5f50d0ceb5
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/336958
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
do not land
Change-Id: I5fa7b2a0d1eb7e893d9b333f850a2f515d7ce065
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/336956
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
afaict the perf surprises associated with:
https://skia-review.googlesource.com/c/skia/+/334417 (Remove SkBaseDevice::flush)
were bc Ganesh resolves MSAA buffers for SkCanvas::flush but doesn't do so for GrDirectContext::flush.
Where possible this CL switches SkCanvas::flush to SkSurface::flush (which will also resolve MSAA buffers) so that when https://skia-review.googlesource.com/c/skia/+/334417 relands there should not be any performance surprises.
Change-Id: I705ad6219f0f625a88cf3f9e8b2418a3182d298c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/335866
Commit-Queue: Robert Phillips <robertphillips@google.com>
Reviewed-by: Adlai Holler <adlai@google.com>
We were always pre-loading the fragment and vertex modules, but deferred
loading all others. Those two take up about 300 KB of heap. Now, all
modules are deferred, so compiler instances that don't need them (like
the one used for runtime effects) are much smaller.
Now that we can get better fine-grained numbers, added two more
benchmarks, to track actual baseline usage, plus the usage in the two
most likely configurations.
Change-Id: Idfbcd52c8afee566ac42ab827c80c940f91c4ad7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/337176
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
In follow on CLs we need to know what the load op is when we try to use
discardable msaa attachments. For vulkan the load op affects how we
copy the resolve attachment into the msaa attachment, which changes the
render pass we use (adds extra subpass). We need to be able to make a
compatible render pass to compile programs.
Bug: skia:10979
Change-Id: I40c23a18b251af6a2ad3b78a1f6382bdba0b90c4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/336598
Commit-Queue: Greg Daniel <egdaniel@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Previously, any builtin functions would be optimized as a side-effect of
optimizing programs that used them. Now that shared elements aren't
being optimized in that way, we explicitly optimize any shared modules
when they are first created. We don't remove dead elements, but we
we do substitute settings, simplify, and inline.
Bug: skia:10905
Change-Id: I701b5e9f52fb880ef3e6f4c67694d08602f47e95
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/336440
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: John Stiles <johnstiles@google.com>
The important part is in GrOpFlushState.h were previously
we were taking a mutable pointer to the view, which should
at least be a const pointer and was making us do funky things
in some of the calling code. But I decided to go all the way
and do a const ref instead which is The Way It Should Be (tm).
Change-Id: I399d102e8b5e0a5059168cc450ae66f12ad47e13
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/336451
Reviewed-by: Greg Daniel <egdaniel@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Adlai Holler <adlai@google.com>
This change will facilitate experimenting with direct-to-op
text draws. There should be no performance change because
almost all blobs are a single run.
Change-Id: I07fb3487e0601e00507403d94bc611c5022c1408
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/336447
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
This is a reland of 26ad8ccdec
... now with MSAN support.
Original change's description:
> add ERMS (enhanced rep mov/sto) SkOpts slice
>
> Intel's got two CPUID bits indicating the speed of rep mov/sto
> (memcpy/memset),
>
> - ERMS, Enhanced Rep Mov/Sto, older, large copies are fast?
> - FSRM, Fast Short Rep Mov, newer, small copies are fast?
>
> ERMS has been around a long time on Intel, but is relatively recent on
> Ryzen, and FSRM is new across the board. The startup cost for
> ERMS-but-not-FSRM copies really is noticeable, so we cut over to the
> previous SSE/AVX routines when N is small.
>
> I've left the memset benchmarks as I found them most useful when
> tuning the small/large cutoff in this CL.
>
> Change-Id: I3ac4e3f34796aba0ea86aabbe9dda7526919456a
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/332580
> Reviewed-by: Herb Derby <herb@google.com>
> Commit-Queue: Mike Klein <mtklein@google.com>
Cq-Include-Trybots: luci.skia.skia.primary:Test-Debian10-Clang-GCE-CPU-AVX2-x86_64-Release-All-MSAN
Change-Id: Ia293bba90022c48c884599331ef35aa67644729b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/334343
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
So we don't incidentally measure the cost of the copy
Change-Id: I6f7e2119792423b0857764b3b557a9f838bb43fb
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/334423
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This reverts commit 26ad8ccdec.
Reason for revert: gonna need to teach MSAN about this to reland.
Original change's description:
> add ERMS (enhanced rep mov/sto) SkOpts slice
>
> Intel's got two CPUID bits indicating the speed of rep mov/sto
> (memcpy/memset),
>
> - ERMS, Enhanced Rep Mov/Sto, older, large copies are fast?
> - FSRM, Fast Short Rep Mov, newer, small copies are fast?
>
> ERMS has been around a long time on Intel, but is relatively recent on
> Ryzen, and FSRM is new across the board. The startup cost for
> ERMS-but-not-FSRM copies really is noticeable, so we cut over to the
> previous SSE/AVX routines when N is small.
>
> I've left the memset benchmarks as I found them most useful when
> tuning the small/large cutoff in this CL.
>
> Change-Id: I3ac4e3f34796aba0ea86aabbe9dda7526919456a
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/332580
> Reviewed-by: Herb Derby <herb@google.com>
> Commit-Queue: Mike Klein <mtklein@google.com>
TBR=mtklein@google.com,herb@google.com
Change-Id: I3264af132272dbbaac8fc8b62e139a6a112bbadb
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/334342
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Intel's got two CPUID bits indicating the speed of rep mov/sto
(memcpy/memset),
- ERMS, Enhanced Rep Mov/Sto, older, large copies are fast?
- FSRM, Fast Short Rep Mov, newer, small copies are fast?
ERMS has been around a long time on Intel, but is relatively recent on
Ryzen, and FSRM is new across the board. The startup cost for
ERMS-but-not-FSRM copies really is noticeable, so we cut over to the
previous SSE/AVX routines when N is small.
I've left the memset benchmarks as I found them most useful when
tuning the small/large cutoff in this CL.
Change-Id: I3ac4e3f34796aba0ea86aabbe9dda7526919456a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/332580
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Moves the "BenchmarkTarget" class from tessellation benchmarks into a
mock header where it can be reused by other tests and benchmarks.
Change-Id: I344d9ba3d391ff99e10c4ab238684b0b6ada87d0
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/333220
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>