This loads 32 bits instead of gathering 256 in the tail part of loops.
To make it work, add a vmovd with SIB addressing.
I also remembered that the mysterious 0b100 is actually a signal that
the instruction uses SIB addressing, and is usually denoted by `rsp`.
(SIB addressing may be something we'd want to generalize over like we
did recently with YmmOrLabel, but I'll leave that for Future Me.)
Slight rewording where "scratch" is mentioned to keep it focused on
scratch GP registers, not "tmp" ymm registers. Not a hugely important
distinction but helps when I'm grepping through code.
Change-Id: I39a6ab1a76ea0c103ae7d3ebc97a1b7d4b530e73
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264376
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Some TODOs left over to make the scalar
tail case better... as is it issues a
256-bit gather for each 32-bit load!
I added a trimmed down variant of the existing
SkVM_gathers unit test to test just gather32,
covering this new JIT code.
Change-Id: Iabd2e6a61f0213b6d02d222b9f7aec2be000b70b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264217
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This does the equivalent of dst = *(src + off),
which we use to find our gather base pointer in gather32.
Change-Id: I09ca7bfd404d7dce6de454ef1ed4eee78ab29932
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264216
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
A complicated instruction to say the least!
A "fun" wrinkle is that all the ymm registers must be unique!
(And the mask register is cleared by the instruction...)
Still kind of TODO is what that 0b100 r/m in the mod_rm() means. Every
variant of the instruction I've assembled seems to have it set to 0b100
(e.g. 0x0c or 0x04) but I'd feel better if I knew what it meant.
Change-Id: Ia4ff5f8175bff545e2d10bb2d1b14f49073445a3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264116
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Chrome layout test suppression has landed.
This reverts commit 67d0f3fd72.
Change-Id: I5b9963b306f29a41cf36e1802e7eebda010f186d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264016
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
This reverts commit 4a46758db8.
Add guard to Flutter
Change-Id: Ief0e5cb36af13c8f00a36a617d0384622012d644
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263937
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
This is necessary to support the async readback API on Windows Chrome.
Transfer from buffer to texture is still disabled as the unit test
still fails.
Bug: chromium:1040643
Change-Id: I0e16437c066fadfdd8dbbbcaf376c6901a538063
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263528
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
We were never calling specialize() to bake in the values of ins,
so do that. Add uniformSize() to get the size of just the uniform
values. (The interpreter asserts that the size of the uniforms
being passed in matches the expected size from the ByteCode,
so these need to match up).
Added a unit test that uses both 'in' and 'uniform'.
Change-Id: I595822171211d35a17d5977fa790de0d1bbd6c78
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263519
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
1. Feature: Clients need to override didConcat44() (new data)
2. Perf: Clients need to override didTranslate (and now didScale) so our
default impls can be empty.
Need SK_SUPPORT_LEGACY_CANVAS_MATRIX_VIRTUALS flag to stage this in
clients (anyone who subclasses SkCanvas)
Before (with flag)
120.87 canvas_matrix_4x4 8888
108.10 ? canvas_matrix_3x3 8888
108.13 ? canvas_matrix_2x3 8888
141.54 canvas_matrix_scale 8888
128.04 canvas_matrix_trans 8888
After (without the flag)
...
90.79 canvas_matrix_scale 8888
94.51 canvas_matrix_trans 8888
bug: skia:9768
Change-Id: I6f500138dd6b2b24754dc065c650d0bd3c341540
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263349
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
This reverts commit a92320d4e6.
Reason for revert: Blocking Chrome roll.
Original change's description:
> Remove GrPaint::addColorTextureProcessor
>
> Just make the effect and then add it.
>
> Makes it easier to make changes to GrTextureEffect::Make going forward.
>
> Also add default param for matrix (identity).
>
> Change-Id: I52073f11a0a78b971bb512627198ee1724bfdac7
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263518
> Reviewed-by: Michael Ludwig <michaelludwig@google.com>
> Commit-Queue: Brian Salomon <bsalomon@google.com>
TBR=bsalomon@google.com,michaelludwig@google.com
Change-Id: I2618844f5a8f5f1873dd79142caafd8939384e9e
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263560
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Just make the effect and then add it.
Makes it easier to make changes to GrTextureEffect::Make going forward.
Also add default param for matrix (identity).
Change-Id: I52073f11a0a78b971bb512627198ee1724bfdac7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263518
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
constexprify GrSamplerState
pass/return GrSamplerState by value (it's 3 bytes).
Remove unused function from GrTexturePriv
Change-Id: Iffecd941500acf5653f01cc88b42ff1d45678b54
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263346
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
- Add YmmOrLabel struct to represent the concept that many
x86 instructions can take a final argument as either a
register or memory address, and that they all handle them
the same way.
- Convert existing overloads like vmulps() to use YmmOrLabel.
- upgrade some other instructions to take YmmOrLabel
- use them to implement today's new _imm ops
This feels like a good spot for implicit constructors, no?
Change-Id: I435028acc3fbfcc16f634cfccc98fe38bbce9d19
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263207
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This mostly handles cases where we are wrapping a GrBackend* or an already
made proxy.
Change-Id: Ieb33eb51f7db84611ade0f8243b6d9023ce8e390
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262234
Commit-Queue: Greg Daniel <egdaniel@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
- Remove extract... it's not going to have any special impl.
I've left it on skvm::Builder as an inline compound method.
- Add no-op shift short circuits.
- Add immediate ops for bit_{and,or,xor,clear}.
This comes from me noticing that the masks for extract today are always
immediates, and then when I started converting it to be (I32, int shift,
int mask), I realized it might be even better to break it up into its
component pieces. There's no backend that can do extract any better
than shift-then-mask, so might as well leave it that way so we can
dedup, reorder, and specialize those micro ops.
Will follow up soon to get this all JITing again,
and these can-we-JIT test changes will be reverted.
Change-Id: I0835bcd825e417104ccc7efc79e9a0f2f4897841
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263217
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Add framework for unit tests that draw (CPU and GPU) with a runtime
shader, as well as couple example tests.
Change-Id: I43b3b39e86634ec55521a2689a4c55c21939dce5
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262809
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
The reason for the assert was breaking an assert, that if the CTM was scale/translate, that after
a preTranslate, it should still be that.
This is true... unless the new translate values are non-finite. In that case, we might turn a zero
into a NaN, (0 * non_finite --> nan), so we either have to require finite args (which we don't
at the moment) or we can't make this assert. This re-land removes that assert.
This reverts commit 268ed57d71.
Change-Id: I3c48a0aa17649351a246c1fbab5449f2d59aaf84
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263023
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: Mike Reed <reed@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
This reverts commit 98bfcc7ff3.
Reason for revert: Flutter hitting assert:
../../third_party/skia/src/core/SkCanvas.cpp:1432: fatal error: "assert(fIsScaleTranslate == fMCRec->fMatrix.isScaleTranslate())"
Original change's description:
> Extend SkCanvas matrix stack to be 4x4, but with (basically) the same public API.
>
> Devices receive the 4x4, but by default they simply downsample it to SkMatrix.
>
> New SkM44 matrix for the impl. It differs from SkMatrix44 in a few ways
> - no tracking of "type"
> - faster for concat, as it does not use doubles for intermediates
> - much simpler API
>
> There are some low-bit differences in some gms, so adding a flag for clients to
> stage this change. (due to faster but lower-precision in SkM44::concat)
>
> Performance: running canvas_matrix bench
>
> 3x3 version:
>
> 167.93 canvas_matrix_3x3 8888
> 209.97 canvas_matrix_2x3 8888
> 174.87 canvas_matrix_scale 8888
> 135.30 canvas_matrix_trans 8888
>
> 4x4 version:
>
> 116.59 canvas_matrix_3x3 8888
> 105.40 canvas_matrix_2x3 8888
> 159.83 ? canvas_matrix_scale 8888
> 113.47 canvas_matrix_trans 8888
>
> Why faster?
> - not tracking matrix_type helps a lot it seems
> - faster full concat (no doubles)
>
> Before adding the specialized preConcats...
>
> 318.11 ? canvas_matrix_3x3 8888
> 339.38 canvas_matrix_2x3 8888
> 383.28 canvas_matrix_scale 8888
> 251.67 canvas_matrix_trans 8888
>
> Change-Id: I68eac942919fa5418081e789f31710a1e2a752da
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262056
> Reviewed-by: Brian Salomon <bsalomon@google.com>
> Reviewed-by: Florin Malita <fmalita@chromium.org>
> Commit-Queue: Mike Reed <reed@google.com>
TBR=mtklein@google.com,bsalomon@google.com,herb@google.com,fmalita@chromium.org,fmalita@google.com,reed@google.com,michaelludwig@google.com
Change-Id: I28c3d69c19ba44ab65ca7c059221b64c7dffef22
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263021
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This reverts commit 1c16b43033.
Reason for revert: Red on tree
Original change's description:
> Move makeDeferredRenderTargetContext calls to factory on RTC.
>
> Change-Id: Iaa8f5829d9f8650ff27a60f75fb2216f016ab85e
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262058
> Commit-Queue: Greg Daniel <egdaniel@google.com>
> Reviewed-by: Brian Salomon <bsalomon@google.com>
TBR=egdaniel@google.com,bsalomon@google.com
Change-Id: I9e3c9d13c66b5437c87ad7136d283fa4ac81df1f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263019
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Jim Van Verth <jvanverth@google.com>
This reverts commit d7436a37ff.
Restores old file order in gpu.gni until Mac/Metal issue can be
debugged.
Change-Id: I6e2ee3bdc3b39270aeaaf28b9613e4ac49d38e1e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262801
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Devices receive the 4x4, but by default they simply downsample it to SkMatrix.
New SkM44 matrix for the impl. It differs from SkMatrix44 in a few ways
- no tracking of "type"
- faster for concat, as it does not use doubles for intermediates
- much simpler API
There are some low-bit differences in some gms, so adding a flag for clients to
stage this change. (due to faster but lower-precision in SkM44::concat)
Performance: running canvas_matrix bench
3x3 version:
167.93 canvas_matrix_3x3 8888
209.97 canvas_matrix_2x3 8888
174.87 canvas_matrix_scale 8888
135.30 canvas_matrix_trans 8888
4x4 version:
116.59 canvas_matrix_3x3 8888
105.40 canvas_matrix_2x3 8888
159.83 ? canvas_matrix_scale 8888
113.47 canvas_matrix_trans 8888
Why faster?
- not tracking matrix_type helps a lot it seems
- faster full concat (no doubles)
Before adding the specialized preConcats...
318.11 ? canvas_matrix_3x3 8888
339.38 canvas_matrix_2x3 8888
383.28 canvas_matrix_scale 8888
251.67 canvas_matrix_trans 8888
Change-Id: I68eac942919fa5418081e789f31710a1e2a752da
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262056
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
This reverts commit 90673ec665.
Reason for revert: Causes metal bot failures
Original change's description:
> Rename GrSimpleTextureEffect->GrTextureEffect
>
> It will become less simple.
>
> Change-Id: I409d0faba386597ae05738273d5ff773501eb358
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262383
> Auto-Submit: Brian Salomon <bsalomon@google.com>
> Commit-Queue: Brian Osman <brianosman@google.com>
> Reviewed-by: Brian Osman <brianosman@google.com>
TBR=bsalomon@google.com,brianosman@google.com
Change-Id: Id25c9cde3c2048149409745f163e42c588de70c1
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262514
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
It will become less simple.
Change-Id: I409d0faba386597ae05738273d5ff773501eb358
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262383
Auto-Submit: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
Validating that only certain types are allowed as in or uniform.
Change-Id: I941da448bffec06158e67fbf653833846d9fe3db
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262222
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Basic support for clamp/clamp/nearest/RGBA-8888/premul images.
A couple little changes to make this work:
- add pixel-center offset to shader (x,y)
- change the signature of gather??() calls to work
more naturally with how we let effects build uniforms.
Instead of gathering directly from one of the program
arguments, load the gather base pointer off another
uniforms pointer, just like any other uniform.
- remove the default argument to uniform??() so that
they parallel the new gather??() calls more closely.
There was only one place that was using the default
and I think it's clearer as an explicit 0 offset.
- centralize some more helpers onto skvm::Builder so
we can use the in both SkVMBlitter and SkImageShader.
Some diffs:
- very, very small color diffs probably due to slightly
different math converting between byte and float or blending;
- small sampling coordinate diffs where skvm + SkRP agree,
and the legacy shaders disagree. That's fine by me.
Change-Id: I72634e7fed4f13e6cb41b8067104760f392ea3bf
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262368
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Remove some unused functions/macros.
Move two functions only used by GrBufferAllocPool there.
We only ever used GrSizeAlignUp with pow 2 alignments. Require that,
rename, move to GrTypesPriv.h (along with GrSizeDivRoundUp).
Change-Id: I1a7248952d1905f16f02de2028d65768b186acee
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/262061
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
As they are a significant chunk of code, this reduces the size of the
release executable.
Change-Id: Ib764b3ec6244629e50941b0101a540e49d56c320
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261955
Reviewed-by: Greg Daniel <egdaniel@google.com>
Reviewed-by: Kevin Lubick <kjlubick@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Change-Id: I0b11d4210c6e663cfb4854fc33e1396fd79fe9a4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261780
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Use the original alpha in the unpremul color.
Bug: chromium:1024935
Change-Id: I6a721431781f0ef42a3f162d39f8bbac924a2c30
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261680
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
- 'in' variables can't be arrays (applies to all SkSL)
- 'in' and 'uniform' variables are restricted to a fixed
list of types.
Change-Id: I957ce1ad33aaf6b5970ca7204c568bb533bc86d6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261436
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Change-Id: I1e49ab46975bb8e5eff08bc5afe7ffed1c078309
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261550
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Change-Id: Ic43bebee6a0036eff5b718cc505d85bf839cb8a3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/260898
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Causes shader compilation to fail in Programs test on Tecno Spark 3 Pro.
Hopefully this doesn't affect performance as this GPU has framebuffer
fetch support.
Also reduce the effect depth limit in ProgramsTest.
Change-Id: I1fa5b8a670b3a074808cdba09c2ac97492067d33
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261192
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
This is working in the GL and Vulkan back ends. MacOS only supports the RGBA8 variants.
For mobile devices, probably only nVidia GPUs will support this.
Bug: skia:9680
Change-Id: I9d886b72232a031603e93e46059a97a8aa288b3c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261093
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Change-Id: I47ac0b069334cb9702473b1bb923f585712f38ce
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261087
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Replaces several of the arguments with a single GrMesh object. The
call signatures were a bit unruly, and would have become more so with
extra required arguments for tessellation.
Change-Id: I4834b03d9c107dd233a91019c0a3189030e52bef
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/260977
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Will follow-up with a change that doesn't use SkRegions!
Note:
The previous code flow gathered a region for all devices (in the canvas method).
However, it only tried to draw into the top device. The new code just focuses on
the top device, which ought to give the same results.
Change-Id: Ic5ed47e7908d646700c09b10faa538415522c645
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/261283
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>