Calling add32 can re-allocate the key storage, so it isn't safe to call
add32n and then keep writing into the returned pointer if an extra
sampler key is encountered.
Also fix the GP sampler key code to actually set the last bit, not the
16th bit.
Change-Id: I9c83435a164ab0391e2e6a9e1a9bdf6839490a15
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229278
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This leaves just width, height, and GrPixelConfig. Once we remove the
latter we can replace GrSurfaceDesc with SkISize.
Also remove unused GrRenderTarget::overrideResolveRect
Also remove GrSurfaceProxy::Renderable and use GrRenderable instead.
Change-Id: I652fe6169a22ca33d199b144ec6385286ac07b5a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228570
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Adding cache
Caching shaped results
Base+Index for referencing arrays
The very first and naive version of cache
Cache measurement, lines and picture
Added text blob cache for lines
Removed Run* from Cluster
Removed const char* from Cluster and Run
Few minor changes
Change-Id: I444a1defa950aed5999cfa1c3545fd83ccb54ce9
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/227840
Commit-Queue: Julia Lavrova <jlavrova@google.com>
Reviewed-by: Julia Lavrova <jlavrova@google.com>
Reviewed-by: Mike Reed <reed@google.com>
9ec3f51d11..02407743bd
git log 9ec3f51d11d9..02407743bd72 --date=short --no-merges --format='%ad %ae %s'
2019-07-22 dongja@google.com Vulkan: implement indirect dispatch
2019-07-22 jmadill@chromium.org Vulkan: Rename 'extents' param to 'glExtents'.
2019-07-22 clemendeng@google.com Implicit conversions for Desktop GL shaders
2019-07-22 ynovikov@chromium.org Skip ProgramBinaryTransformFeedbackTest.GetTransformFeedbackVarying
2019-07-22 jmadill@chromium.org Vulkan: Store VkExtents3D in ImageHelper.
2019-07-22 jmadill@chromium.org Texture: Make ImageIndex store layer counts.
2019-07-22 ynovikov@chromium.org Skip ShaderStorageBufferTest31.ActiveSSBOButNotStaticallyUsed
2019-07-22 syoussefi@chromium.org Vulkan: Atomic counter buffer support
2019-07-22 ynovikov@chromium.org Revert "Reland "Temporarily disable creating D3D debug device.""
2019-07-22 syoussefi@chromium.org Vulkan: Generalize buffers desc set name to include images
2019-07-22 fei.yang@arm.com Vulkan: Intermittent failures in many GLES2 CTS
2019-07-22 jonahr@google.com Port adjust_src_dst_region_for_blitframebuffer workaround to ANGLE.
Created with:
gclient setdep -r third_party/externals/angle2@02407743bd72
The AutoRoll server is located here: https://autoroll.skia.org/r/angle-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Debian9-Clang-x86_64-Release-ANGLE;skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-Golo-GPU-QuadroP400-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC8i5BEK-GPU-IntelIris655-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE
Bug: None
TBR=borenet@google.com
Change-Id: Ibc366c1a343269d76c0f7a329f6a7514dd8b8aa3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229170
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
This is an automated CL created by the recipe roller. This CL rolls recipe
changes from upstream projects (e.g. depot_tools) into downstream projects
(e.g. tools/build).
More info is at https://goo.gl/zkKdpD. Use https://goo.gl/noib3a to file a bug.
depot_tools:
https://crrev.com/c420221f1d94fd0799e9e7aed40928bf1b321a97 [git-cl] Proofread error messages for consistency (qyearsley@chromium.org)
TBR=borenet@google.com
Recipe-Tryjob-Bypass-Reason: Autoroller
Bugdroid-Send-Email: False
Change-Id: I3fa79aa3f5db2a383f1c31496505afcd390e9b4f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229176
Reviewed-by: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Commit-Queue: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
This is an automated CL created by the recipe roller. This CL rolls recipe
changes from upstream projects (e.g. depot_tools) into downstream projects
(e.g. tools/build).
More info is at https://goo.gl/zkKdpD. Use https://goo.gl/noib3a to file a bug.
depot_tools:
https://crrev.com/4a8e453fbda7dde50721b479fa9ca5dd4a65ef50 bot_update: Disable metrics collection for bots, since it's not supported. (ehmaldonado@chromium.org)
TBR=borenet@google.com
Recipe-Tryjob-Bypass-Reason: Autoroller
Bugdroid-Send-Email: False
Change-Id: Ie58fda3684b16f23acbedfc5c2fce1f6e97fa004
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229082
Reviewed-by: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Commit-Queue: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
This adds a warmup phase to let each instruction do any setup
it needs, adding lookup entries for splat and bytes, and
on aarch64, hoisting the mask to a register when we can.
Oddly, this measures as a ~3x slowdown on the phone I'm testing, an
international Galaxy S9 with a Samsung Mongoose 3 processor. I've got
to imagine this somehow makes the processor think there's a carried loop
dependency when there is not? Anyway, we already know that that's a
pretty crazy CPU (reports FP16 compute but cannot), and this does deliver
a speedup on the Pixel 2's Kryo 280 / Cortex A73, so I think maybe I'll
just swap back to testing with the Pixel 2 and forget about that S9.
Here's a before/after codelisting with a hoisted tbl mask. In the
before case it's loaded in the loop with `ldr q3, #152`, and becomes
`ldr q0, #168` outside the loop. llvm-mca says this should cut one
cycle per loop, and with optimal out of order execution the loop cost
would drop from ~8.7 cycles to ~8.3. In practice, it looks like about a
15% speedup.
before:
ldr q0, #188
ldr q1, #200
cmp x0, #4 // =4
b.lt #76
ldr q2, [x1]
ldr q3, #152
tbl v3.16b, { v2.16b }, v3.16b
sub v3.8h, v0.8h, v3.8h
ldr q4, [x2]
and v5.16b, v4.16b, v1.16b
ushr v4.8h, v4.8h, #8
mul v5.8h, v5.8h, v3.8h
ushr v5.8h, v5.8h, #8
mul v3.8h, v4.8h, v3.8h
bic v3.16b, v3.16b, v1.16b
orr v3.16b, v5.16b, v3.16b
add v2.4s, v2.4s, v3.4s
str q2, [x2]
add x1, x1, #16 // =16
add x2, x2, #16 // =16
sub x0, x0, #4 // =4
b.al #-76
cmp x0, #1 // =1
b.lt #76
ldr s2, [x1]
ldr q3, #72
tbl v3.16b, { v2.16b }, v3.16b
sub v3.8h, v0.8h, v3.8h
ldr s4, [x2]
and v5.16b, v4.16b, v1.16b
ushr v4.8h, v4.8h, #8
mul v5.8h, v5.8h, v3.8h
ushr v5.8h, v5.8h, #8
mul v3.8h, v4.8h, v3.8h
bic v3.16b, v3.16b, v1.16b
orr v3.16b, v5.16b, v3.16b
add v2.4s, v2.4s, v3.4s
str s2, [x2]
add x1, x1, #4 // =4
add x2, x2, #4 // =4
sub x0, x0, #1 // =1
b.al #-76
ret
after: ldr q0, #168
ldr q1, #180
ldr q2, #192
cmp x0, #4 // =4
b.lt #72
ldr q3, [x1]
tbl v4.16b, { v3.16b }, v0.16b
sub v4.8h, v1.8h, v4.8h
ldr q5, [x2]
and v6.16b, v5.16b, v2.16b
ushr v5.8h, v5.8h, #8
mul v6.8h, v6.8h, v4.8h
ushr v6.8h, v6.8h, #8
mul v4.8h, v5.8h, v4.8h
bic v4.16b, v4.16b, v2.16b
orr v4.16b, v6.16b, v4.16b
add v3.4s, v3.4s, v4.4s
str q3, [x2]
add x1, x1, #16 // =16
add x2, x2, #16 // =16
sub x0, x0, #4 // =4
b.al #-72
cmp x0, #1 // =1
b.lt #72
ldr s3, [x1]
tbl v4.16b, { v3.16b }, v0.16b
sub v4.8h, v1.8h, v4.8h
ldr s5, [x2]
and v6.16b, v5.16b, v2.16b
ushr v5.8h, v5.8h, #8
mul v6.8h, v6.8h, v4.8h
ushr v6.8h, v6.8h, #8
mul v4.8h, v5.8h, v4.8h
bic v4.16b, v4.16b, v2.16b
orr v4.16b, v6.16b, v4.16b
add v3.4s, v3.4s, v4.4s
str s3, [x2]
add x1, x1, #4 // =4
add x2, x2, #4 // =4
sub x0, x0, #1 // =1
b.al #-72
ret
Change-Id: I352a98d3ac2ad84c338330ef4cfae0292a0b32da
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229064
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This first tries to JIT while hoisting all constants,
and if that fails, tries again hoisting no constants.
I figure this is one of those 80/20 deals for how to
handle constant hoisting and register pressure. This
probably mostly moots doing anything fancy like using
memory operands with AVX or lane operands with NEON.
This _doesn't_ moot hoisting the NEON tbl arguments,
which is not yet done here, but probably my next CL.
Change-Id: Id09d5cdddcdb45207bdfc914a5a3128a481a26f3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229058
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Even if a JIT ultimately doesn't end up hoisting any values, it's going
to want this information while it decides. Writing it in one place also
ensures we only get it wrong in one place...
I'm no_ extending the lifetime of hoisted instructions here in Builder.
That's something to leave to the backend so they have the flexibility of
which of these values to hoist, if any. If they don't hoist, they'll
need to know when the value dies.
Moving this information back here lets the test expectation goldens
reflect the hoist bit again too. Kind of nice.
Change-Id: Ib165ca898a97c1d822cb28fe24f15bae4d570a17
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229024
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This is useful to avoid redrawing unnecessarily when the animation
doesn't progress.
Bug: skia:9267
Change-Id: Id4184ae8308b8abd959fbfd1768e3e22d1efe0a4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229006
Reviewed-by: Ravi Mistry <rmistry@google.com>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Change-Id: Ib4b00712f27a47111ea1213d558e975f25eb8347
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229007
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Eric Boren <borenet@google.com>
- shift the revalidation phase from Scene::render() to Scene::animate()
- pass an optional inval controller to Scene::animate() and Animation::seek()
- hoist the showInval logic out of SkSG, into clients
This allows clients to track dirty regions and detect cases where no updates are needed.
Bug: skia:9267
Change-Id: I3d35bf58b6eee9bfeb6e127ba58e2b96713b772d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229001
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Mike Reed <reed@google.com>
Change-Id: Iabfc1106fce9926547278ec1335f4888ca86511e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/229002
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
Switch to Dawn's version of dawn_generator.gni. This depends on having
a file called build_overrides/dawn.gni. However, this will also enable
us to use the upstream Dawn BUILD.gn files more easily in the future.
This required adding it to compile.isolate, so the bots can pick it up.
Keeping up with Dawn:
Rename TextureFormat enums.
Rename dawn::BufferUsageBit::TransferDst -> CopyDst.
Removal of GLAD dependency.
SPIRV-Tools update.
Change-Id: Idcd5d1035ed106485dd2503b829e3c3b57a5688b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228568
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Stephen White <senorblanco@chromium.org>
This will allow chromium to do a dump without crashing so we can get
more information about what's going wrong on webview in the field.
Bug: chromium:977231
Change-Id: I9022921aded735764d36868bb6606674654439bc
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228389
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Khushal Sagar <khushalsagar@chromium.org>
Commit-Queue: Mike Klein <mtklein@google.com>
Auto-Submit: Adrienne Walker <enne@chromium.org>
When Chrome has a LUM16F texture they tell Skia it is R16F. Although this has been working for them so far it causes trouble with some upcoming changes.
Change-Id: I2473f70e4f725128f143c2dfb08adb79f3c7c166
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228565
Reviewed-by: Brian Salomon <bsalomon@google.com>
We need to sort out an internal compiler error and a crash.
TBR=bsalomon@google.com
Bug: skia:
Change-Id: I8ecccc6ab696fd8c49735ba4690c4ec1f873c15e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228936
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
6a02f06dfd..9ec3f51d11
git log 6a02f06dfd4e..9ec3f51d11d9 --date=short --no-merges --format='%ad %ae %s'
2019-07-19 m.maiya@samsung.com Reland "Vulkan: Implement OES_get_program_binary extension"
2019-07-19 ynovikov@chromium.org Guard ID3DUserDefinedAnnotation access in DebugAnnotator11.
2019-07-19 lujc@google.com Add script to apply clang-format on all sources
2019-07-19 jmadill@chromium.org Functional revert of "Signal different dirty bit for vertex buffer change."
2019-07-19 dongja@google.com Vulkan: support for new vertex attribs in GLES 3.0
2019-07-19 geofflang@chromium.org Vulkan: Use the correct context in ImageVk::orphan.
2019-07-19 geofflang@chromium.org Call ImageImpl::destroy before destroying the image source.
2019-07-19 geofflang@chromium.org Return backwards compatible context versions in Vulkan, GL and D3D11.
2019-07-19 m.maiya@samsung.com Rectify bug in initialization of offsets for uniform variables
2019-07-19 clemendeng@google.com Add GL versions to desktop implementation
2019-07-19 clemendeng@google.com Rename ProvokingVertex and TextureBarrier
2019-07-19 angle-autoroll@skia-public.iam.gserviceaccount.com Roll ./third_party/spirv-tools/src aa9e8f538041..76b75c40a1e2 (1 commits)
Created with:
gclient setdep -r third_party/externals/angle2@9ec3f51d11d9
The AutoRoll server is located here: https://autoroll.skia.org/r/angle-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Debian9-Clang-x86_64-Release-ANGLE;skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-Golo-GPU-QuadroP400-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC8i5BEK-GPU-IntelIris655-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE
Bug: None
TBR=borenet@google.com
Change-Id: I9ee7a4a943fe0163dfcff1e26d379c00d8f39b1b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228896
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
e02114c8fa..5c3e9d87e1
Created with:
gclient setdep -r ../src@5c3e9d87e1
The AutoRoll server is located here: https://autoroll.skia.org/r/chromium-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Release-All-CommandBuffer;skia.primary:Test-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Debug-All-CommandBuffer
TBR=borenet@google.com
Change-Id: Ibbcd91099c6b6b2c73ca5a394b7c4bb049b92fb3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228898
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
This is an automated CL created by the recipe roller. This CL rolls recipe
changes from upstream projects (e.g. depot_tools) into downstream projects
(e.g. tools/build).
More info is at https://goo.gl/zkKdpD. Use https://goo.gl/noib3a to file a bug.
depot_tools:
https://crrev.com/ee7b9dda90e409fb92031d511151debe5db7db9f gclient: Make some changes to make gclient compatible with Python 3. (ehmaldonado@chromium.org)
TBR=borenet@google.com
Recipe-Tryjob-Bypass-Reason: Autoroller
Bugdroid-Send-Email: False
Change-Id: Ic6559bb81567a128b407414ebc2fb7a59527d5f3
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228676
Reviewed-by: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Commit-Queue: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Change-Id: I5c89d5760c16097d658c454950a6632bd427c6ab
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228637
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Adds the option to use a multisampled (or mixed sampled) atlas, and
uses the sample mask and stencil buffer instead of coverage counts.
Bug: skia:
Change-Id: I9fb76d17895ae25208124f6c27e37977ac31b5eb
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/227428
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Brian Salomon <bsalomon@google.com>
We can always move data around so that an FMA is possible using no more
registers than we would otherwise, and on x86, evne using no more
instructions.
The basic idea here is that if we can't reuse one of the inputs to
destructively host the FMA instruction, the next best thing is to copy
one of the arguments into tmp() and accumulate the FMA there.
Once the FMA has happened, we just need to copy that result to dst().
We can of course skip that copy if dst() == tmp(). On x86 we never need
that copy; dst() and tmp() are picked using the same logic except that
dst may alias one of its inputs, and we only fall into this case after
we've already found it doesn't. So we can just assert dst() == tmp()
rather than check it like we do on ARM.
It's subtle, but I think sound.
I'm using logical-or to copy registers around. This is a little lazy,
but maybe not as lazy as it looks: on ARM that is _the_ way to copy
registers. There's a vmovdqa instruction I could use on x86, TBD.
All paths through this new code were being exercised on ARM, but we
didn't have anything hitting the tmp case on x86, so I've added a new
unit test that hits the corner cases of both implementations.
Change-Id: I5422414fc50c64d491b4933b4b580b784596f291
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228630
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Change-Id: I82aec74524f33b3b8ea7592a9a4bf904127b87b6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228569
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Stephen White <senorblanco@chromium.org>
When SkBaseDevice switched to drawGlyphRunList(), we lost the ability to
detect a) constant-Y text and b) default-positioned text.
As a result, the emitted SVG contains lots of redundant/repeating glyph
positions.
This CL enhances SVGTextBuilder to detect and consolidate constant-Y
glyph positions.
Also restore a useful whitespace unit test.
Change-Id: I50568aef1955f75898ebab41441ad5fe418dac43
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228563
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Mike Reed <reed@google.com>
Change-Id: I5fb08b64da127d88a32d8be35055565eb96b816d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/224257
Commit-Queue: Eric Boren <borenet@google.com>
Reviewed-by: Ben Wagner aka dogben <benjaminwagner@google.com>
We have pack(x,y,imm) = x | (y<<imm) assuming (x & (y<<imm)) == 0.
If we can destroy x, sli (shift-left-insert) lets us implement that
as x |= y << imm. This happens quite often, so you'll see sequences
of pack that used to look like this
shl v4.4s, v2.4s, #8
orr v1.16b, v4.16b, v1.16b
shl v2.4s, v0.4s, #8
orr v0.16b, v2.16b, v3.16b
shl v2.4s, v0.4s, #16
orr v0.16b, v2.16b, v1.16b
now look like this
sli v1.4s, v2.4s, #8
sli v3.4s, v0.4s, #8
sli v1.4s, v3.4s, #16
We can do this thanks to the new simultaneous register assignment
and instruction selection I added. We used to never hit this case.
Change-Id: I75fa3defc1afd38779b3993887ca302a0885c5b1
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228611
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Trying to keep most of the structural parts shared between x86_64 and
aarch64. Not sure if this will stay factored like this long-term, but
the last version felt like there was a bit too much redundancy, and I
don't want to write things like register management more often than have
to.
Change-Id: Ieeb21f433715a730c41c85d657c5b33fa4702696
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/228608
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>