Change-Id: I66713976f08b1dbf0966d9a901f666b9f834b659
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221096
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Yet another way to transform a layer, disguised as a distort effect.
TBR=
Change-Id: Ic2d5479fa6ae27b460de60875924f73f77fc7f71
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221001
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Slightly sharper, but far easier to hold:
- Remove Value union from interface, everything is a 32-bit
value type, or a collection thereof.
- Collapse to one version of Run (that takes count), and make
it a member on ByteCode.
- Similarly, move disassemble to ByteCodeFunction.
Change-Id: I07c85e65991178b3f52e20e815c25f36bc9c4257
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220753
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Bug: chromium:974177
Change-Id: I725f1ba63cbd57348419654a45934f09d1bcb6f4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220996
Commit-Queue: Greg Daniel <egdaniel@google.com>
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Auto-Submit: Greg Daniel <egdaniel@google.com>
Reviewed-by: Jim Van Verth <jvanverth@google.com>
85fef1bc62..853ebacf99
git log 85fef1bc62f8..853ebacf99a4 --date=short --no-merges --format='%ad %ae %s'
2019-06-14 ynovikov@chromium.org Suppress Mac 10.13.6 Intel OpenGL dEQP failures
2019-06-13 dongja@google.com Vulkan: implement primitive restart
2019-06-13 syoussefi@chromium.org Vulkan: Output render pass loadOp in graph dump
2019-06-13 jmadill@chromium.org Vulkan: Implement a texture descriptor cache.
2019-06-13 shrekshao@google.com Enable floatBlend for D3D9
2019-06-13 dongja@google.com Vulkan: fix array size for internal shaders
2019-06-13 jmadill@chromium.org Refactor DrawCallPerfParams.
2019-06-13 spang@chromium.org Reland "Vulkan: Build validation layers with asserts only"
2019-06-13 jonahr@google.com Clean up and expose frontend features to egl.
2019-06-13 angle-autoroll@skia-public.iam.gserviceaccount.com Roll ./third_party/spirv-tools/src 9c0830133b07..42830e5a68c3 (2 commits)
2019-06-13 jmadill@chromium.org Vulkan: Prefer immediate present mode for benchmarking.
2019-06-13 jmadill@chromium.org Revert "Vulkan: Build validation layers with asserts only"
2019-06-13 angle-autoroll@skia-public.iam.gserviceaccount.com Roll ./third_party/spirv-headers/src 9cf7c3a7d2d2..de99d4d834ae (1 commits)
2019-06-13 jiajia.qin@intel.com Fix the crash in angle_deqp_gles31_tests
Created with:
gclient setdep -r third_party/externals/angle2@853ebacf99a4
The AutoRoll server is located here: https://autoroll.skia.org/r/angle-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Debian9-Clang-x86_64-Release-ANGLE;skia.primary:Perf-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE
TBR=herb@google.com
Change-Id: I66cf86ba54e2d38f3be65bd353207b787f69c06c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220942
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
f6ed31446f..a2197be674
Created with:
gclient setdep -r ../src@a2197be674
The AutoRoll server is located here: https://autoroll.skia.org/r/chromium-skia-autoroll
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Release-All-CommandBuffer;skia.primary:Test-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Debug-All-CommandBuffer
TBR=herb@google.com
Change-Id: I3cd00f5edd315664e413f2cc72b51a305fb526b4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220943
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
This is a reland of 084fa1b52f
Original change's description:
> [skottie] Use metrics for Shaper vertical alignment
>
> Relying on visual bounds yields incorrect results in some cases (e.g.
> leading/trailing empty lines).
>
> Update the vertical alignment logic to use metrics instead:
>
> - track the first line ascent and last line descent
> - compute content height as
>
> first_ascent + last_descent + line_height * (line_count - 1)
>
> - relocate Result::computeBounds() to the unit test (only user)
>
> Empirically, this causes top-alignment to be less snug (likely due to
> ascent slack in the tested fonts).
>
> Bug: skia:9098
> Change-Id: Ib92bf907af8889d6b0d0fda22ef41a2cc8b50901
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220656
> Reviewed-by: Mike Reed <reed@google.com>
> Commit-Queue: Florin Malita <fmalita@chromium.org>
TBR=
No-Try: true
Bug: skia:9098
Change-Id: Iaba53968840749e35b9c3ed04b15d6e2cda55e72
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220916
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Now that we've got shr_16x2, extract(..., 8, splat(0x00ff00ff)) is
better done as shr_16x2(..., 8). This swaps a 16-bit shift in for
the 32-bit shift, a wash, but lets us drop the bit_and at the end,
saving one whole instruction.
This puts I32_SWAR slightly ahead of the code in Opts,
about 0.19 ns/px vs 0.20 ns/px.
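A minimal scalar sketch of why the swap is safe, assuming extract(x, bits,
mask) means (x >> bits) & mask and shr_16x2 shifts each 16-bit half of a
32-bit word on its own. The helper names below are hypothetical stand-ins,
not the actual builder API:

```cpp
#include <cstdint>

// Hypothetical scalar model of extract(x, bits, mask):
// one 32-bit logical shift, then a mask.
inline uint32_t extract_model(uint32_t x, int bits, uint32_t mask) {
    return (x >> bits) & mask;
}

// Hypothetical scalar model of shr_16x2: shift each 16-bit half separately,
// so no bits leak from the high half into the low half and no mask is needed.
inline uint32_t shr_16x2_model(uint32_t x, int bits) {
    uint32_t lo = (x & 0xffffu) >> bits;
    uint32_t hi = (x >> 16) >> bits;
    return (hi << 16) | lo;
}

// True when both forms agree for a given packed pair of 16-bit values.
inline bool same_result(uint32_t x) {
    return extract_model(x, 8, 0x00ff00ffu) == shr_16x2_model(x, 8);
}
```

The mask in the extract form only exists to clear the bits that the 32-bit
shift drags across the lane boundary; the per-lane shift never produces them.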
Change-Id: I4160dc03ecc8b855c0773a927f1510ad5cbb4b87
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220856
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This is the final bunny I've got in my hat, I think...
Remembering that none of the s += d*invA adds can overflow,
we can use a single 32-bit add to add them all at once.
This means we don't have to unpack the src pixel into rb/ga
halves. We need only extract the alpha for invA.
This brings I32_SWAR even with the Opts code!
curr/maxrss  loops  min      median   mean     max      stddev  samples     config        bench
36/36 MB     133    0.206ns  0.211ns  0.208ns  0.211ns  1%      ▁▇▁█▁▇▁▇▁▇  nonrendering  SkVM_4096_I32_SWAR
37/37 MB     152    0.432ns  0.432ns  0.434ns  0.444ns  1%      ▃▁▁▁▁▃▁▁█▁  nonrendering  SkVM_4096_I32
37/37 MB     50     0.781ns  0.794ns  0.815ns  0.895ns  5%      ▆▂█▃▅▂▂▁▂▁  nonrendering  SkVM_4096_F32
37/37 MB     76     0.773ns  0.78ns   0.804ns  0.907ns  6%      ▄█▅▁▁▁▁▂▁▁  nonrendering  SkVM_4096_RP
37/37 MB     268    0.201ns  0.203ns  0.203ns  0.204ns  0%      █▇▆▆▆▆▁▆▆▆  nonrendering  SkVM_4096_Opts
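The single-add trick can be sketched in scalar C++ as below. This is an
illustration under assumptions, not the SkVM program itself: pixels are
packed 8888 with alpha in the top byte, the source is premultiplied (which
is what keeps every channel sum at or below 255), and both function names
are made up for the example.

```cpp
#include <cstdint>

// Reference: blend each 8-bit channel on its own, out = s + d*(256 - sa)/256.
inline uint32_t srcover_per_channel(uint32_t s, uint32_t d) {
    uint32_t invA = 256 - (s >> 24);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t sc = (s >> shift) & 0xff;
        uint32_t dc = (d >> shift) & 0xff;
        out |= (sc + ((dc * invA) >> 8)) << shift;
    }
    return out;
}

// SWAR version: scale the rb and ga halves of d, recombine them, then add
// the whole src pixel with one 32-bit add. Only alpha is extracted from s.
inline uint32_t srcover_single_add(uint32_t s, uint32_t d) {
    uint32_t invA = 256 - (s >> 24);
    uint32_t rb = (((d & 0x00ff00ffu) * invA) >> 8) & 0x00ff00ffu;
    uint32_t ga = (((d >> 8) & 0x00ff00ffu) * invA) & 0xff00ff00u;
    return s + (rb | ga);   // no channel carries into its neighbor
}
```

With a non-premultiplied source a channel sum could exceed 255 and carry
into the next channel, so the single add relies on that precondition.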
Change-Id: Ibf0a9c5d90b35f1e9cf7265868bd18b7e0a76c43
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220805
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
It's doing an arithmetic shift at head,
but we want a logical shift there...
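For reference, the two shifts differ only when the top bit is set. A small
sketch with illustrative helper names, where the arithmetic shift is written
out by hand so it does not depend on how a compiler treats >> on negative
signed values:

```cpp
#include <cstdint>

// Logical shift: vacated high bits are filled with zeros.
inline uint32_t shr_logical(uint32_t x, int bits) {
    return x >> bits;
}

// Arithmetic shift: vacated high bits are filled with copies of the sign bit.
// Valid for bits in [0, 31].
inline uint32_t shr_arithmetic(uint32_t x, int bits) {
    uint32_t shifted = x >> bits;
    if ((x & 0x80000000u) && bits > 0) {
        shifted |= ~(0xffffffffu >> bits);   // set the top `bits` bits
    }
    return shifted;
}
```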
Change-Id: I82eba87ccc3fba6a9511bf3a4d3ff88d90c29585
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220855
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I873a17712c77413ceb09ccc9a0813a5838fe62e7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220754
Commit-Queue: Greg Daniel <egdaniel@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Auto-Submit: Greg Daniel <egdaniel@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
When extracting contours for the even/odd fill type, connected holes
in the interior of a shape don't get rendered correctly.
The fix is to extract the subcontours separately from the outer contour.
To do this, we abort contour extraction the first time we re-encounter
the starting vertex. This causes the hole to be extracted as a
separate contour.
Bug: 908646
Change-Id: I047b77c74605987c40c12a228fd2898c9aa74e55
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220776
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Stephen White <senorblanco@chromium.org>
Check for HSW feature set, and that Programs use 15 or fewer registers
(reserving ymm15 for tmp).
With those in place, we can try playing with instructions past AVX2
even; for instance, AVX-512F added vpmovusdb which is exactly our
store8. Check for SkCPU::SKX support and use that then!
(This does not widen to 512-bit vectors... still just working with ymm.)
Change-Id: I49ae9fe4ad98d1b74daa84fcdd0697e1c5b5063f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220842
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
The new API eliminates all need to access the path inner workings.
There are some uses of the cast (SkGlyph*); these facilitate the
larger change this is a part of. They will be eliminated when all
is done.
Some of the code has been changed to use strike->glyph(id) and SkGlyph*
to help with the flow of the code.
Change-Id: Id8dc84076f56e1e39450367a0440d15954dbdc71
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220523
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
This reverts commit f9a8688b4e.
Reason for revert: breaking processor unit tests
Original change's description:
> Make GrBicubicEffect also support centripetal Catmull-Rom kernel.
>
> Use new kernel in async rescale APIs.
>
> Bug: skia:8962
> Change-Id: Ife8f56f54b5df58cedd65b54083c7c0716b8c633
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/216352
> Reviewed-by: Brian Osman <brianosman@google.com>
> Commit-Queue: Brian Salomon <bsalomon@google.com>
TBR=bsalomon@google.com,brianosman@google.com
Change-Id: Idf317e76b870407060113dc60dd3776abc07f810
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:8962
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220751
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Change-Id: I70c7c4769e0a8d95b590a85fab34529041f8af8a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220841
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Change-Id: I59dd970b6fd512f2e2ee08cc821b758b950a2b53
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220743
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
Use new kernel in async rescale APIs.
Bug: skia:8962
Change-Id: Ife8f56f54b5df58cedd65b54083c7c0716b8c633
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/216352
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
I figure the easiest way to expose 16-bit operations
is to expose 16x2 pair operations... this means we
can continue to always work with the same size vector.
Switching from 32-bit multiplies to 16-bit multiplies
is going to deliver the most oomph... they cost roughly
half what 32-bit multiplies do on x86.
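A scalar sketch of what a 16x2 pair operation means, assuming each 32-bit
value holds two independent 16-bit lanes and that results wrap within their
lane. The function names are illustrative only:

```cpp
#include <cstdint>

// Pack two 16-bit lanes into one 32-bit word.
inline uint32_t pack_16x2(uint16_t hi, uint16_t lo) {
    return (uint32_t(hi) << 16) | lo;
}

// Lane-wise add: each 16-bit half wraps on its own, no carry between halves.
inline uint32_t add_16x2(uint32_t x, uint32_t y) {
    uint32_t lo = (x + y) & 0xffffu;
    uint32_t hi = ((x >> 16) + (y >> 16)) & 0xffffu;
    return (hi << 16) | lo;
}

// Lane-wise multiply, keeping the low 16 bits of each product.
inline uint32_t mul_16x2(uint32_t x, uint32_t y) {
    uint32_t lo = ((x & 0xffffu) * (y & 0xffffu)) & 0xffffu;
    uint32_t hi = ((x >> 16) * (y >> 16)) & 0xffffu;
    return (hi << 16) | lo;
}
```

Because the container stays 32 bits wide, a vector of these has the same
size and lane count as a vector of ordinary 32-bit values.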
Speed now:
I32_SWAR: 0.27 ns/px
I32: 0.43 ns/px
F32: 0.76 ns/px
RP: 0.8 ns/px
Opts: 0.2 ns/px
Change-Id: I8350c71722a9bde714ba18f97b8687fe35cc749f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220709
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This is an automated CL created by the recipe roller. This CL rolls recipe
changes from upstream projects (e.g. depot_tools) into downstream projects
(e.g. tools/build).
More info is at https://goo.gl/zkKdpD. Use https://goo.gl/noib3a to file a bug.
depot_tools:
https://crrev.com/d05ca1537ccd7abb2e753b2e109567e18f0df4b7 Invert ios_internal fetch spec. (jbudorick@chromium.org)
TBR=borenet@google.com
Recipe-Tryjob-Bypass-Reason: Autoroller
Bugdroid-Send-Email: False
Change-Id: I4eca8a99fedac79f2230f960650adc9627756198
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220744
Reviewed-by: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Commit-Queue: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
I just kind of remembered that if we're doing (xy+x)/256
and x is a destination channel and y is 255-sa, then you
can get the +x for free by multiplying by 256-sa instead.
(d * (255-sa) + d)
(d * (255-sa + 1))
(d * (256-sa) )
Duh. This is a trick we play in a lot of legacy code and
I've just now realized it's exactly equivalent to the trick
I want to play here... sigh.
Folding this math in kind of makes mul/mad_unorm8 moot.
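The identity is easy to confirm exhaustively for 8-bit inputs. A small
sketch (the function name is made up for the example):

```cpp
#include <cstdint>

// Counts (d, sa) pairs where folding the "+ d" into the multiplier
// changes the result of the divide by 256.
inline int count_mismatches() {
    int mismatches = 0;
    for (uint32_t d = 0; d < 256; d++) {
        for (uint32_t sa = 0; sa < 256; sa++) {
            uint32_t two_step = (d * (255 - sa) + d) / 256;
            uint32_t folded   = (d * (256 - sa)) / 256;
            if (two_step != folded) { mismatches++; }
        }
    }
    return mismatches;
}
```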
Speed's getting good:
I32_SWAR: 0.3 ns/px
I32 : 0.55 ns/px
F32 : 0.8 ns/px
RP : 0.8 ns/px
Opts : 0.2 ns/px
Change-Id: I4d10db51ea80a3258c36e97b6b334ad253804613
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220708
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This reverts commit 084fa1b52f.
Reason for revert: Breaks google3 roller
Original change's description:
> [skottie] Use metrics for Shaper vertical alignment
>
> Relying on visual bounds yields incorrect results in some cases (e.g.
> leading/trailing empty lines).
>
> Update the vertical alignment logic to use metrics instead:
>
> - track the first line ascent and last line descent
> - compute content height as
>
> first_ascent + last_descent + line_height * (line_count - 1)
>
> - relocate Result::computeBounds() to the unit test (only user)
>
> Empirically, this causes top-alignment to be less snug (likely due to
> ascent slack in the tested fonts).
>
> Bug: skia:9098
> Change-Id: Ib92bf907af8889d6b0d0fda22ef41a2cc8b50901
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220656
> Reviewed-by: Mike Reed <reed@google.com>
> Commit-Queue: Florin Malita <fmalita@chromium.org>
TBR=bungeman@google.com,fmalita@chromium.org,reed@google.com
Change-Id: I2da2bf9b3bc4a2f333c0fbbd5a88434ef7ea65d5
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:9098
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220746
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Herb Derby <herb@google.com>
depot_tools:
https://crrev.com/a74bd78e9ccd242af225629c3a105e209f5bf400 Make it clear that compile_single_file.py doesn't support Jumbo builds (sebmarchand@chromium.org)
TBR=borenet@google.com
Recipe-Tryjob-Bypass-Reason: Autoroller
Bugdroid-Send-Email: False
Change-Id: Ib1aa978f0bc027023f99d27332f31a34f40a9004
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220742
Reviewed-by: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Commit-Queue: Recipe Roller <recipe-mega-autoroller@chops-service-accounts.iam.gserviceaccount.com>
Change-Id: I86374ad15433675626f477243a9c66177eb0e21a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220740
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Auto-Submit: Brian Osman <brianosman@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Any time we implement a Program::Instruction with multiple low-level
operations, we risk overwriting any arguments that alias the
destination.
This is why the _I32 tests are failing, mad_unorm8 where d == x. We
want (x*y+x)/256+z, but end up calculating (x*y+x*y)/256+z when x == d.
We could fix this by never allowing any arguments to alias any
destinations, but most instructions don't have this problem, and doing
that blindly would bloat the register count significantly.
We could fix this by knowing which Ops may be prone to aliasing in any
backend, but I find that somewhat error prone and also a little
abstraction-level-violatey. I would have thought, for instance, that
the mad_f32 Op might be vulnerable here, but it's actually not... in any
situation where there is aliasing, we actually lower it to a single
vfmadd instruction, never mul-then-add.
This sort of aliasing issue is going to keep coming back up again and
again, especially with 2-argument architectures like SSE. Luckily it's
trivially easy to fix by reserving a single tmp register to use as the
result of all but the final instructions.
The interpreter is safe because all its switch cases are single r(d) =
... statements. The right hand sides are evaluated before anything is
written back to a destination register slot. Had it been written a
little differently, it could have easily had this same aliasing issue.
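A toy model of the hazard and the fix, using a four-slot register file.
Everything here (Regs, the function names, the step-by-step lowering) is
illustrative and not the JIT's real code:

```cpp
#include <cstdint>

// A toy register file; indices stand in for machine registers.
struct Regs { uint32_t r[4]; };

// Lowering of d = (x*y + x)/256 + z that writes d at every step.
// If d and x are the same register, the "+ x" step reads the product.
inline void mad_unorm8_clobbering(Regs& R, int d, int x, int y, int z) {
    R.r[d] = R.r[x] * R.r[y];
    R.r[d] = R.r[d] + R.r[x];
    R.r[d] = R.r[d] / 256;
    R.r[d] = R.r[d] + R.r[z];
}

// Same lowering with intermediates held in a reserved tmp;
// only the final step writes the destination.
inline void mad_unorm8_with_tmp(Regs& R, int d, int x, int y, int z) {
    uint32_t tmp = R.r[x] * R.r[y];
    tmp = tmp + R.r[x];
    tmp = tmp / 256;
    R.r[d] = tmp + R.r[z];
}
```

With x = 200, y = 128, z = 10 the intended result is 110. When d aliases x
the clobbering version yields 210, which is (x*y+x*y)/256+z, while the tmp
version still yields 110.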
Change-Id: I996392ef6af48268238ecae4a97d3bf3b4fba002
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220600
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
This converts the SkSL interpreter to operate in SIMT fashion. It handles
all the same features as the previous scalar implementation, but operates
on N lanes at a time (currently 8).
It's modeled after GPU and other parallel architectures, using execution
masks to handle control flow, including divergent control-flow.
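As a rough illustration of execution masks (not the interpreter's actual
code), here is a divergent if/else evaluated across 8 lanes. Only the lane
count comes from the description above; the types and function are
hypothetical:

```cpp
#include <array>
#include <cstdint>

constexpr int kLanes = 8;
using Vec  = std::array<int32_t, kLanes>;
using Mask = std::array<bool, kLanes>;

// Evaluates, per lane:  if (x < 0) { x = -x; } else { x = x * 2; }
// Both sides run over all lanes; the mask decides which lanes commit.
inline Vec abs_or_double(Vec x) {
    Mask cond;
    for (int i = 0; i < kLanes; i++) { cond[i] = x[i] < 0; }

    Vec out = x;
    for (int i = 0; i < kLanes; i++) {   // "then" side, active where cond is set
        if (cond[i]) { out[i] = -x[i]; }
    }
    for (int i = 0; i < kLanes; i++) {   // "else" side, active where cond is clear
        if (!cond[i]) { out[i] = x[i] * 2; }
    }
    return out;
}
```

Lanes that disagree on the condition simply sit out the side they did not
take, which is how divergent control flow stays correct without branching
per lane.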
Change-Id: Ieb38ffe2f55a10f72bdab844c297126fe9bedb6c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/217122
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Bug: skia:8962
Change-Id: I0ab208063b6b7eca010f86d4d851ade23df5f849
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220529
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Bug: chromium:972587
Change-Id: I05ab4393d4df20a8f55c4352d09f92f275cedf5d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220736
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
Relying on visual bounds yields incorrect results in some cases (e.g.
leading/trailing empty lines).
Update the vertical alignment logic to use metrics instead:
- track the first line ascent and last line descent
- compute content height as
first_ascent + last_descent + line_height * (line_count - 1)
- relocate Result::computeBounds() to the unit test (only user)
Empirically, this causes top-alignment to be less snug (likely due to
ascent slack in the tested fonts).
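The height formula above, as a tiny helper. The parameter names come from
the formula; the function itself is hypothetical, and it assumes ascent and
descent are both passed as positive distances:

```cpp
// Content height from line metrics rather than visual bounds.
// Assumes first_ascent and last_descent are positive magnitudes.
inline float content_height(float first_ascent, float last_descent,
                            float line_height, int line_count) {
    return first_ascent + last_descent + line_height * (line_count - 1);
}
```

For example, three lines with ascent 8, descent 2, and line height 12 give
8 + 2 + 12 * 2 = 34, regardless of whether the first or last line is empty.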
Bug: skia:9098
Change-Id: Ib92bf907af8889d6b0d0fda22ef41a2cc8b50901
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220656
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Florin Malita <fmalita@chromium.org>
2589cdcc88..85fef1bc62
git log 2589cdcc88ec..85fef1bc62f8 --date=short --no-merges --format='%ad %ae %s'
2019-06-13 spang@chromium.org Vulkan: Build validation layers with asserts only
2019-06-12 clemendeng@google.com implement core egl image entry points
2019-06-12 jmadill@chromium.org Vulkan: Fix build with custom secondaries disabled.
2019-06-12 jmadill@chromium.org Roll SPIR-V headers and Tools.
2019-06-12 dongja@google.com Vulkan: add support for shadow samplers.
2019-06-12 jonahr@google.com Extend eglGetPlatformDisplay to allow feature overrides.
Created with:
gclient setdep -r third_party/externals/angle2@85fef1bc62f8
The AutoRoll server is located here: https://autoroll.skia.org/r/angle-skia-autoroll
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Debian9-Clang-x86_64-Release-ANGLE;skia.primary:Perf-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE
TBR=herb@google.com
Change-Id: I86cb59bef2fe3db96ded65c507a95134b717baf8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220680
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
5dc4d131fb..f6ed31446f
Created with:
gclient setdep -r ../src@f6ed31446f
The AutoRoll server is located here: https://autoroll.skia.org/r/chromium-skia-autoroll
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Release-All-CommandBuffer;skia.primary:Test-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Debug-All-CommandBuffer
TBR=herb@google.com
Change-Id: I155b13c1864cf0a9d10ed884342bcf455c6b49af
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220681
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
This invoked operator bool(), so all arrays were parsed as having 1 column.
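A sketch of how this class of bug can arise in C++. The type below is
hypothetical and is not the parser's real class; it only shows that a
non-explicit operator bool() lets an object convert silently to int:

```cpp
// Hypothetical token-like type, for illustration only.
struct ParsedInt {
    bool ok;      // whether parsing succeeded
    int  value;   // the parsed integer
    operator bool() const { return ok; }   // non-explicit: also converts to int
};

// Buggy pattern: the object converts to bool, then to int, giving 0 or 1.
inline int columns_via_bool(const ParsedInt& p) { return p; }

// Intended pattern: read the stored value.
inline int columns_via_value(const ParsedInt& p) { return p.value; }
```

Marking the conversion `explicit` would turn the buggy pattern into a
compile error.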
Change-Id: Iccd2a4fc80c905d8a5912f5639b0efbad050cbcf
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220530
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
This trims one instruction of loop overhead, from
sub(N, K) // N -= 8
cmp(N, K-1) // if (N >= 8)
jg loop // goto loop;
to
sub(N, K) // N -= 8
... // if (N != 0)
jne loop // goto loop;
To make this work we pass only multiples of K into the
JIT'd code, so it always hits exactly N = 0 to exit.
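A C++ sketch of the calling convention this implies. The loop body, the
function names, and the scalar tail handling are assumptions made for the
example; only K = 8 and the "multiples of K" rule come from the text:

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t K = 8;

// Stand-in for the JIT'd loop: n must be a nonzero multiple of K,
// so counting down by K lands exactly on zero.
inline void jit_like_loop(uint8_t* px, size_t n) {
    do {
        for (size_t i = 0; i < K; i++) { px[i] = uint8_t(255 - px[i]); }
        px += K;
        n  -= K;
    } while (n != 0);   // plays the role of the jne on the flags from sub
}

// Caller: hand only whole multiples of K to the fast loop,
// then finish any remaining elements one at a time.
inline void invert(uint8_t* px, size_t n) {
    size_t body = (n / K) * K;
    if (body > 0) { jit_like_loop(px, body); }
    for (size_t i = body; i < n; i++) { px[i] = uint8_t(255 - px[i]); }
}
```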
Change-Id: I81c7a2f8d5927971059a68c5bce2e3d6fb0b8ff2
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220576
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Change-Id: I43847609bfbdc769487cab5bf19f754615cc8ddd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220528
Commit-Queue: Greg Daniel <egdaniel@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Auto-Submit: Greg Daniel <egdaniel@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
The mask-only special case for extract is wrong...
it never looked at its input!
This not only makes things correct-er, but oddly it also
makes them faster by breaking inter-loop data dependencies.
Disable tests for _I32... they're actually still broken
because of a much more systemic flaw in how I've evaluated
programs. The _F32 and _I32_SWAR JIT code and all interpreted
code is just getting lucky. o_O
While here, update the I32_SWAR code to use the same math as I32,
(x*y+x)/256 for unorm8 mul. This just helps keep me sane.
Change-Id: I1acc09adb84c426fca4b2be5ca8c2d46d9678dd8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220577
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
At head we're redoing any n<8 tail from the start,
not continuing from (n/8)*8 like we'd want.
Change-Id: I1a3d24cdffc843bbe6f3e01a163b6e3a20fdd0ca
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220556
Reviewed-by: Brian Osman <brianosman@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Like any other Instruction the store*s are assigned
a destination register d, which doesn't really make
sense, but works perfectly as a temporary register.
This means store8 doesn't need to reserve xmm/ymm15
as a temporary... it already has one naturally. As
you might expect, the examples we have so far assign
the consumed input x register as the d register, so
things that used to look like
vpackusdw %ymm6 ,%ymm6 ,%ymm15
vpermq $0xd8 ,%ymm15,%ymm15
vpackuswb %ymm15,%ymm15,%ymm15
vmovq %xmm15,(%rdx)
now look more like
vpackusdw %ymm6,%ymm6,%ymm6
vpermq $0xd8,%ymm6,%ymm6
vpackuswb %ymm6,%ymm6,%ymm6
vmovq %xmm6,(%rdx)
Should be no perf difference, just simplified register bookkeeping.
This may suggest splitting load8/store8 into finer instructions,
two to do the physical loads and stores, and two for the 8->32
and 32->8 widen and narrow? On the other hand load8 really is
just one vpmovzxbd instruction, so it'd be a shame to split it.
I suspect this will become more clear as I add 16-bit support.
Change-Id: I7c2b4d6b1689d40b50382f65fc00c01c54529c8a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220543
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Change-Id: Id61b7b9d9bc7611727a27be0172fcabc2ef4345a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220522
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Auto-Submit: Brian Osman <brianosman@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Change-Id: I0263b4ea8c60695c890f31d45198a7bd45ff3db8
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220536
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>