Change-Id: I5cd133f9de490340a958403c06ab1c8c44017001
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223186
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Ben Wagner aka dogben <benjaminwagner@google.com>
I was just reading the ARM docs and realized that their BIC ("BIt
Clear") is the same as SSE's ANDN ("AND Not") instruction. It's kind of
a neat little tool to have lying around... comes up more than you'd
think, and it's sometimes the clearest way to express what you're doing,
as in the changed program here where the comment is "mask away the low
bits". That's a bit_clear with a mask for what you want to clear away!
And the real reason to write this up is that I want to have a CL to
point to that shows how to add an instruction top to bottom.
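For readers who haven't met the operation, here is a minimal sketch of what bit_clear computes. The helper names are illustrative only, not the actual SkVM API, and the "round down to 8" use is just one assumed example of "mask away the low bits".

```cpp
#include <cstdint>

// Hypothetical helper (not Skia's API): clear in x every bit that is set in mask.
// ARM's BIC computes x & ~mask directly. SSE's ANDN computes ~a & b, so there
// the mask is the first operand.
constexpr uint32_t bit_clear(uint32_t x, uint32_t mask) {
    return x & ~mask;
}

// "Mask away the low bits": round an offset down to a multiple of 8
// by clearing its low three bits.
constexpr uint32_t round_down_to_8(uint32_t offset) {
    return bit_clear(offset, 7u);
}
```

Writing this as bit_clear(x, 7) rather than x & ~7 or x & 0xfffffff8 keeps the intent (which bits go away) in the source instead of its complement.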
Change-Id: I99690ed9c1009427b3986955e7ae6264de4d215c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223120
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Run the Skottie tests checked under resources/skottie/ on Lottie bots.
Bug: skia:8925
Change-Id: I240608f1cbc70440cd1a35af52f98a7ef250ec31
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223182
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Florin Malita <fmalita@chromium.org>
This CL lets the user indicate, when creating a GrVkBackendContext, that
they have protected content, which results in protected CommandPool and
Queue usage.
Bug: skia:9016
Change-Id: I6a478d688b6988c2c5e5e98f18f58fb21f9d26ae
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/210067
Commit-Queue: Greg Daniel <egdaniel@google.com>
Auto-Submit: Emircan Uysaler <emircan@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Instruction is the fundamental data, and Analysis is derived from it.
The fields in Analysis are only* needed in Builder::done(), and this
split seems to help clarify what done() can tweak (Analysis) and what
it cannot (fProgram, Instructions). done() is now const.
No speed change as far as I can tell.
* As you may notice looking at the test expectations, making analysis
ephemeral means that dump() can no longer print the skull for dead code
or the arrow for hoisted. The register program that's also in the
expectation file still reflects both of these optimizations, so we're
not really losing any information. Just maybe less demo-friendly.
Change-Id: I79feb57558525591baf3faadeb59c418c12793f3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223119
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Each one of these Instructions has its own register assignment,
so instead of allocating them in a little temporary side vector,
allocate them along with the main Program entries, just like
the other metadata, hoist and life.
No noticeable change in perf.
Change-Id: I3db8c1520d52f5787111b227e6becfef49e5a892
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223118
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This cuts the overhead bench from about 19µs to about 15µs.
The key insight here is that the only registers that might become
available after any given instruction are the ones that hold that
instruction's inputs. We can check when they become available
directly from the original Builder::Program, without needing a
side death schedule data structure.
Marking hoisted instructions as having life == program size
helps make this logic a little simpler to reason through.
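A rough sketch of the idea, not the actual Builder code: the Inst struct and assign_registers() below are hypothetical and much simpler than the real thing. The sketch assumes each instruction records the index of its last user in life, and that a result's register is only released after the destination has been chosen (the real allocator may well reuse a dying input's register for the destination).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: x and y index earlier instructions
// (or are -1 when unused), and life is the index of the last instruction
// that reads this one's result. Hoisted instructions use life == program.size().
struct Inst {
    int x, y;
    int life;
};

// Returns the register assigned to each instruction's result.
std::vector<int> assign_registers(const std::vector<Inst>& program) {
    std::vector<int> reg(program.size(), -1);
    std::vector<int> free_regs;   // registers available for reuse
    int next_reg = 0;             // next never-used register

    for (size_t i = 0; i < program.size(); i++) {
        const Inst& inst = program[i];

        // Pick a destination register, preferring a recycled one.
        if (free_regs.empty()) {
            reg[i] = next_reg++;
        } else {
            reg[i] = free_regs.back();
            free_regs.pop_back();
        }

        // Only this instruction's inputs can die here, so those are the
        // only registers we need to check. No separate death schedule.
        auto maybe_release = [&](int v) {
            if (v >= 0 && program[v].life == static_cast<int>(i)) {
                free_regs.push_back(reg[v]);
            }
        };
        maybe_release(inst.x);
        if (inst.y != inst.x) {   // don't release the same register twice
            maybe_release(inst.y);
        }
    }
    return reg;
}
```

Because a hoisted instruction's life equals the program size, the life == i test can never fire for it, so its register stays reserved for the whole program without any special case.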
Change-Id: Ifb9957f2d0e323e0e5d07996a2cc988f7c8b4c3f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223117
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This splits the ID namespace into Reg and Val types, hopefully making it
a little easier to follow what's going on, and if we want, allowing us
to size them differently (e.g. val at i32 or i16, reg at i16 or u8). I
didn't notice any speed change when shrinking either, so I've left them
both at i32 for maximum flexibility.
I played with making these strong typedefs with both structs and enum
classes, but both felt a little awkward. I'm still open to the idea.
Change-Id: Ie0adf6944ed6254eb21dfdfb59894c4e30476443
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223077
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This cuts a field from Builder::Instruction, and also makes the code
easier to follow, I think. Now d, x, and y are always registers, and
only the final field may be a register z or an immediate.
Change-Id: I33bbe0c6fb8cb96b85f0b0e8c30df3fa4d233c1b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223076
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
We have 14 Shields available, so move some of the "oddball" jobs there.
We have only two Nexus5x's left and Moto G4 is way over-capacity after
half of them overheated.
Also sort jobs.json.
This CL just adds the new jobs; old jobs are removed in
https://skia-review.googlesource.com/c/skia/+/223057
This provides some overlap for easier diagnosis of Gold and Perf diffs.
Change-Id: Ie2d0151a1c3f2097ae69a3f173178b239592e8fc
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222792
Commit-Queue: Ben Wagner aka dogben <benjaminwagner@google.com>
Auto-Submit: Ben Wagner aka dogben <benjaminwagner@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
Pulling this cleanup out of a larger CL
Change-Id: Ib3ecff5d242eba72a7f2bc3ce07e09760a9ba7b7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223181
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
bf4cfa77c4..dfd7600551
git log bf4cfa77c4bf..dfd760055152 --date=short --no-merges --format='%ad %ae %s'
2019-06-21 jmadill@chromium.org Vulkan: Don't update pipeline when only textures change.
2019-06-21 timvp@google.com Increase demangled array size
2019-06-21 dongja@google.com GL/Vulkan: handle depth texture discrepancy
2019-06-21 geofflang@chromium.org Limit max texture size and max MSAA samples on Android.
2019-06-21 geofflang@chromium.org Always scalarize mat and vec constructor arguments.
2019-06-21 tobine@google.com Print perf results to stdout on Android
2019-06-21 syoussefi@chromium.org Vulkan: Handle 0-sized viewports
2019-06-21 geofflang@chromium.org Removal global locks from GL entry points. Always lock in EGL.
2019-06-21 syoussefi@chromium.org Vulkan: Add vkCmdFillBuffer support
2019-06-21 angle-autoroll@skia-public.iam.gserviceaccount.com Roll ./third_party/spirv-tools/src 2090d7a2d26c..7c294608ca19 (8 commits)
Created with:
gclient setdep -r third_party/externals/angle2@dfd760055152
The AutoRoll server is located here: https://autoroll.skia.org/r/angle-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Debian9-Clang-x86_64-Release-ANGLE;skia.primary:Perf-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE
TBR=djsollen@google.com
Change-Id: I15f50404641dd2af336e03957a515b036c8fb0db
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223111
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
31223069ea..91f69e5c58
Created with:
gclient setdep -r ../src@91f69e5c58
The AutoRoll server is located here: https://autoroll.skia.org/r/chromium-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Release-All-CommandBuffer;skia.primary:Test-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Debug-All-CommandBuffer
TBR=djsollen@google.com
Change-Id: I1ff7b24e9ab659a112e735931825e48401fa9f84
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223112
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
From now on, sample counts always refer to the number of actual color
samples, and render targets don't have separate color and stencil
sample counts.
If mixed samples support is available when making a
"GrAAType::kCoverage" draw, then an op may attach and use a mixed
sampled stencil buffer internally. But this will all be invisible to
the client.
After this CL, we temporarily won't have a mode to use nvpr with mixed
samples. That will soon be fixed by a follow-on CL that enables nvpr
with mixed samples in the normal "gl" and "gles" configs.
Bug: skia:
Change-Id: I1cb8277f0d2d0d371f24bb9f39cd473ed5c5c83b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221878
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Switching some SkVM code from std::unordered_map to SkTHashMap caused
the -MSRTC bot to barf unexpectedly, but in a way that makes sense in
retrospect. The code to hash skvm::Builder::Instructions returns size_t
to fit the unordered_map convention, and I forgot to change that to
SkTHashMap's preferred uint32_t. So we began to implicitly truncate
that size_t to uint32_t on 64-bit machines, one of the potential issues
the -MSRTC bot exists to catch.
This change simply masks any user-provided hash to 32 bits explicitly.
We could alternatively update the Instruction hash code, but I think the
mask here is so cheap (usually notional, zero-cost) that compatibility
with std::unordered_map makes this approach more desirable.
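As an illustration of the approach, not the actual SkTHashMap change: Hash32 and IdentityHash below are hypothetical names. The sketch wraps any functor that follows the std::unordered_map convention of returning size_t and narrows the result with an explicit mask, so no bits are dropped implicitly in the conversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical adapter: takes a std::hash-style functor returning size_t
// and produces a 32-bit hash. Masking first means the narrowing conversion
// never discards set bits, which is the kind of silent truncation a
// runtime check for data loss is meant to flag.
template <typename K, typename Hash = std::hash<K>>
struct Hash32 {
    uint32_t operator()(const K& key) const {
        size_t h = Hash{}(key);
        return static_cast<uint32_t>(h & 0xffffffffu);
    }
};

// Toy hash used only to make the behavior easy to see.
struct IdentityHash {
    size_t operator()(uint64_t v) const { return static_cast<size_t>(v); }
};
```

With this, Hash32<uint64_t, IdentityHash>{}(0x100000005) yields 5, while the same IdentityHash stays usable unchanged as the hasher for a std::unordered_map.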
Cq-Include-Trybots: skia.primary:Test-Win2016-MSVC-GCE-CPU-AVX2-x86_64-Debug-All-MSRTC
Change-Id: I0551e7590d5039962e213c6672927bd84e1a0856
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/223136
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This new bench lets us measure the overhead of program building,
optimization, and JITting. Surprisingly, at head the optimization in
Builder::done() takes longer than the JIT.
The new bench clocks in around 40µs on my laptop at head,
then 32µs after switching val_to_reg to be an std::vector,
then 27µs after switching deaths to be an std::vector too,
then 22µs after switching fIndex to be an SkTHashMap,
then 20µs after calling program.reserve(fProgram.size()),
then 19µs after switching JIT data maps to SkTHashMap too.
I tried swapping some std::vector for SkTDArray to no benefit, actually
a little detriment. So I think this is roughly all the low-hanging
fruit, with time split now roughly equally between Builder::done(),
JITting in Program::eval(), and the original calls to Builder
themselves.
Also disable perf dumps on Mac. No real value there until I can dump a
dylib, and it's just one more thing I have to remember to disable before
running this sort of benchmark.
Change-Id: I1c6e58ed00ac94ad622c7d740712634f60787102
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222984
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
For now, disable the vpmovusdb AVX-512 instruction, using the compound
AVX2 fallback instead. I need to learn how to encode EVEX prefixes
before we can use that, and it's not very important.
That's everything! We're fully in control now, and should be able to
run this on any x86-64 Linux or Mac. And we can relax some of the
defined(SKVM_JIT) guards so that, e.g., we can unit test Assembler on
all platforms.
Stifle some warnings about ~bool by ~(int)bool.
Would like to enable when is_mac too but can't seem to get past
(bogus?) thread annotation on the bots. My local Mac is fine. :/
Change-Id: If00bdd97ebd9684ed109933e2fa70c5e6f6ea339
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222631
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Most image filters were fixed with just the changes to SkImage::makeWithFilter
and the changes to SkSpecialImage's subset handling (particularly the
raster backend that could read from a bitmap view, or ganesh impls that relied
on SkSpecialImage::draw).
The GPU implementations for alpha threshold, blurs, matrix convolutions,
and displacement maps have been updated to account for the special image's
offset when they read from the backing texture.
Change-Id: I8778aa373e60e9268961305057b2bf6da2bdb3af
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221121
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Move the invariants for glyph image data into SkGlyph.
Change-Id: I1958612bb73cfffe42df19a11c8899048559013b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222876
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Bug: chromium:977315
Change-Id: Ia5b734f5c0f0806af0f096de5add880a777c5c25
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222793
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
This shows off a little how easy backwards-only labels are.
The rip == rbp + Mod::Indirect convention isn't something
you'd be able to guess without just looking at the docs.
I'm not actually sure if you can only use rbp or also r13,
but LLVM seems to always do the equivalent of rbp... might
just be that high bit in VEX is ignored: they're registers
5 and 13, 8 apart, only distinguished by that bit.
Conveniently RIP addressing is always 32-bit, so there's
no benefit to spending time checking whether the offset
fits in a byte, though most of our offsets would.
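A small sketch of the encoding described above. emit_rip_relative() is a hypothetical helper, not the real Assembler method. It assumes the label is a byte offset already emitted earlier in the same buffer, and that no immediate follows the displacement, so the instruction ends right after the 4 displacement bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append the ModRM byte and 32-bit displacement for a RIP-relative operand.
// In 64-bit mode, mod == 0 with rm == 5 (the slot that would otherwise name
// rbp) means "RIP + disp32", where RIP is the address of the next instruction.
void emit_rip_relative(std::vector<uint8_t>& buf, int reg, size_t label) {
    const int mod = 0;   // Mod::Indirect
    const int rm  = 5;   // rbp's register number
    buf.push_back(static_cast<uint8_t>((mod << 6) | ((reg & 7) << 3) | rm));

    // Assumption: the instruction ends immediately after the displacement.
    const int64_t end_of_instruction = static_cast<int64_t>(buf.size()) + 4;
    const int32_t disp =
        static_cast<int32_t>(static_cast<int64_t>(label) - end_of_instruction);

    // Always a full little-endian disp32; there is no 8-bit form to try.
    const uint32_t bits = static_cast<uint32_t>(disp);
    for (int i = 0; i < 4; i++) {
        buf.push_back(static_cast<uint8_t>(bits >> (8 * i)));
    }
}
```

Since the label is always behind us, the displacement is known at the moment we emit it and comes out negative; nothing needs to be patched later.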
Change-Id: I01b7fb1500667e1bf98490d5144459f92e1b375d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222857
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
I think this makes the relationship between mask and entry clearer?
Can't have JIT code handle >0 elements unless that JIT code itself
exists.
Change-Id: I238d54a5084c7f90bd32c83db5423840cf415b17
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222856
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This moves the responsibility for allocating executable code out of
Assembler. The pages Xbyak uses are obviously executable, so this is
redundant right now, but it'll let us switch to something simple like
std::vector<uint8_t> as we continue to cut out Xbyak.
Make how Program holds its cached JIT program slightly less of a mess.
Change-Id: I38d6f01006da1da60f4aed675e9ddf97de9aec52
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222575
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This makes NV12 and NV21 toggle on/off as a pair.
Bug: skia:9155
Change-Id: Ie0d3f2b3c0aba9a1777a722190bcf6aa5b5e85c3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222460
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
If you launch the Mac viewer from the command line, it will sit there
until you click on the thumbnail in the dock, and only then will bring
up the window. This fixes that so it will open the window immediately.
Change-Id: I5628dc6c59833f808a61dedde457774114dd0e94
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222783
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: Jim Van Verth <jvanverth@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Change-Id: I4d1a102264d8c97bf9120c3891d569ef96a92922
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222782
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
By putting data first in descending alignment then code, we never need
any alignment padding.
This also makes all jumps and ip-relative data loads backward, so
they're really easy to assemble. No need for any sort of deferred
where-does-this-label-point logic; the label can just be a simple byte
offset established before you need to use it.
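To make the no-padding claim concrete, here is a minimal sketch. Constant and emit_data() are hypothetical and not the layout code in this CL. It assumes the buffer starts empty (or suitably aligned) and that every constant's size is a power of two equal to its alignment.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constant: size is 8, 4, 2, or 1 bytes and doubles as alignment.
struct Constant {
    size_t   size;
    uint64_t bits;
};

// Appends the constants to buf, largest alignment first, and returns each
// one's label (byte offset) in the order emitted.
std::vector<size_t> emit_data(std::vector<uint8_t>& buf, std::vector<Constant> data) {
    std::stable_sort(data.begin(), data.end(),
                     [](const Constant& a, const Constant& b) { return a.size > b.size; });

    std::vector<size_t> labels;
    for (const Constant& c : data) {
        // Everything emitted so far was at least c.size bytes wide, so
        // buf.size() is already a multiple of c.size: no padding needed.
        labels.push_back(buf.size());
        for (size_t i = 0; i < c.size; i++) {
            buf.push_back(static_cast<uint8_t>(c.bits >> (8 * i)));  // little-endian
        }
    }
    return labels;
}
```

Code emitted after this can refer to any of the returned labels as a plain byte offset that is already known, which is what makes every reference a backward one.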
Nothing new switched off of Xbyak in this CL, but the rearrangement
makes the rest a lot easier.
The one downside I've found so far is that the disassembly of the
first instruction can get confused into data or other instructions,
e.g.
    63: 01 ff                         add    %edi,%edi
    65: 00 ff                         add    %bh,%bh
    67: 00 00                         add    %al,(%rax)
    69: ff 00                         incl   (%rax)
    6b: ff c4                         inc    %esp
    6d: e2 7d                         loop   ec <skvm-jit-884702985+0xac>
    6f: 18 05 eb ff ff ff             sbb    %al,-0x15(%rip)        # 60 <skvm-jit-884702985+0x20>
    75: c4 e2 7d 18 0d e6 ff ff ff    vbroadcastss -0x1a(%rip),%ymm1        # 64 <skvm-jit-884702985+0x24>
    7e: c4 e2 7d 18 15 e1 ff ff ff    vbroadcastss -0x1f(%rip),%ymm2        # 68 <skvm-jit-884702985+0x28>
There are 3 vbroadcastss instructions here, each starting with c4 e2 7d
18, but the first has been disassembled as if its c4 were part of the
last data entry (0xff00ff00) as inc %esp.
Probably not a big deal for now, particularly since those vbroadcastss
are all outside the loop and never show up on a profile. If it gets too
confusing I think we can dump the programs starting from the beginning
of the code instead of from the data; we won't be able to inspect the
data, but everything should disassemble perfectly.
Change-Id: I0cc864359fd0740fc026070eaf2b6cb130783a57
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222574
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This centralizes the initial-lane-mask logic, and makes the return value
copying much more straightforward by just passing in the width. Lets us
shrink the arrays in the interpreter pipeline stage to the correct size.
Also normalize some formatting and structure.
Change-Id: I446598dcdd550d88ff1db1afe7507f31fa96d1d7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222510
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Introduce textBlobToGlyphRunListWithoutRSX to convert text blob into
glyph runs. Convert the core of the code from working over text blobs
to working over glyph runs.
+ Misc cleanups
Change-Id: I33c1fc5e948dd7270031496325a96409f2cfeeb6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222277
Reviewed-by: Ben Wagner <bungeman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Implement radial wipe with a sweep gradient shader mask filter.
The implementation is slightly convoluted because edge feathering requires a real blur, which in turn requires content layer isolation.
So there are two distinct operation modes:
- no feather -> draw the content directly into the dest buffer, with the mask filter
deferred in SG context
- feather -> draw the content into a separate layer, then blend (dstOut) the composed
blur+shader mask on top
Change-Id: I253701aff42db8010ce463762252c262e2c5d92b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222596
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Change-Id: I1d133259264adfdc872b0f4aeaa9390363c46341
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222040
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Since explicit resource allocation has stuck, these instantiate calls are no longer required.
Change-Id: I5a8a7fa714eb1e9550f4f645ce8fced2d5f7aa4e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222457
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
He's being set as reviewer for recipe rolls, which causes the rest of us
not to see them.
Change-Id: Idaa59e32ba3fa28d2843a263e6fd8a0d0e234657
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222776
Reviewed-by: Ravi Mistry <rmistry@google.com>
Commit-Queue: Eric Boren <borenet@google.com>