--motion_angle ... [default is 180]
--motion_samples ... [default is 1, for no motion blur]
Change-Id: Iec0f31655b3369f51e0b398efb2d5b156dcbaf2e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221416
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Auto-Submit: Mike Reed <reed@google.com>
Optimizations to JSON size (%f -> %g) changed the meaning of the digits
argument, causing these timestamps to become severely truncated. Traces
have been fairly useless as a result (too many events starting/stopping
at the same time). This adds enough digits back that things are better.
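To illustrate the %f vs %g difference described above, here is a minimal sketch. The helper name format_with and the sample timestamps are hypothetical and not taken from the tracing code; only the standard printf precision semantics are relied on: with %f the precision counts digits after the decimal point, while with %g it counts significant digits.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format `value` with "%.*f" or "%.*g",
// passing `digits` as the precision argument.
std::string format_with(char conv, int digits, double value) {
    char buf[64];
    if (conv == 'f') {
        std::snprintf(buf, sizeof(buf), "%.*f", digits, value);
    } else {
        std::snprintf(buf, sizeof(buf), "%.*g", digits, value);
    }
    return std::string(buf);
}

void demo() {
    // With %f, 3 digits means 3 digits after the decimal point.
    assert(format_with('f', 3, 123456.789) == "123456.789");
    // With %g, 3 digits means 3 significant digits, so two distinct
    // (made-up) timestamps collapse to the same string.
    assert(format_with('g', 3, 123456.789) == format_with('g', 3, 123457.123));
}
```

Raising the %g precision restores the lost digits, at the cost of somewhat larger JSON.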
Change-Id: I3f2d2a3dd064daf8449ac34ab5440f95e339a392
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221346
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Auto-Submit: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I7236a30040ab532086e68d6e9de2898dd7acaa32
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221098
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I042b90e7c405505447662e6d187ca1519efd4743
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221342
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Like SkConvertPixels but knows about all GrColorTypes, origin, and can
apply an arbitrary GrSwizzle.
Use in GrSurfaceContext read/write pixels methods.
Add support for '0' to GrSwizzle.
Change-Id: Ib9dd215fcb0ee8b33c4020893c22b4ab7ce1f40b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220761
Commit-Queue: Brian Salomon <bsalomon@google.com>
Reviewed-by: Greg Daniel <egdaniel@google.com>
NoTry: true
Bug: skia:9169
Change-Id: I9fca155d0e8b238e5a38348a40fb3351f0aef2fc
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221276
Reviewed-by: Ravi Mistry <rmistry@google.com>
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Ravi Mistry <rmistry@google.com>
Make sure we're using five or fewer arguments.
Today all programs use one or two arguments, so this doesn't
really have any immediate effect, but it should be there.
Change-Id: Ia85e56ef63ceb442702546c402cd11a13daa2c25
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221270
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
I'm staring at this assembly,
    vmovups (%rsi), %ymm3
    vpsrld $24, %ymm3, %ymm4
    vpslld $16, %ymm4, %ymm15
    vorps %ymm4, %ymm15, %ymm4
    vpsubw %ymm4, %ymm0, %ymm4
just knowing that it could be
    vmovups (%rsi), %ymm3
    vpshufb 0x??(%rip), %ymm3, %ymm4
    vpsubw %ymm4, %ymm0, %ymm4
That is, instead of shifting, shifting, and bit-oring
to create the 0a0a scale factor from ymm3, we could just
byte shuffle directly using some pre-baked control pattern
(stored at the end of the program like other constants).
pshufb lets you arbitrarily remix bytes from its argument and
zero bytes, and NEON has a similar family of vtbl instructions,
even including that same feature of injecting zeroes.
I think I've got this working, and the speedup is great,
from 0.19 to 0.16 ns/px for I32_SWAR, and
from 0.43 to 0.38 ns/px for I32.
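A scalar sketch of the equivalence, assuming one 32-bit lane with alpha in the top byte. The function names are hypothetical, and shuffle_bytes is a simplified model of a byte shuffle (real vpshufb indexes within a 128-bit lane); it only keeps the two properties used here: a control byte selects a source byte, or injects zero when its high bit is set.

```cpp
#include <cstdint>

// Shift, shift, or: alpha from the top byte copied into both 16-bit halves.
constexpr uint32_t scale_by_shifts(uint32_t px) {
    return (px >> 24) | ((px >> 24) << 16);
}

// One output byte of the simplified shuffle: zero if the control's
// high bit is set, otherwise the source byte chosen by its low 2 bits.
constexpr uint32_t pick(uint32_t px, uint8_t ctrl) {
    return (ctrl & 0x80) ? 0u : ((px >> (8 * (ctrl & 3))) & 0xffu);
}

// Simplified byte shuffle over a single 32-bit lane.
constexpr uint32_t shuffle_bytes(uint32_t px,
                                 uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3) {
    return pick(px, c0) | (pick(px, c1) << 8) | (pick(px, c2) << 16) | (pick(px, c3) << 24);
}

// Pre-baked control pattern: byte 3, zero, byte 3, zero.
constexpr uint32_t scale_by_shuffle(uint32_t px) {
    return shuffle_bytes(px, 3, 0x80, 3, 0x80);
}

static_assert(scale_by_shifts(0xAABBCCDDu) == 0x00AA00AAu, "0a0a via shifts");
static_assert(scale_by_shuffle(0xAABBCCDDu) == 0x00AA00AAu, "0a0a via shuffle");
```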
Change-Id: Iab850275e826b4187f0efc9495a4b9eab4402c38
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220871
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Change-Id: Ie4c8e8c5df8f3d37ea49d0c0f7e432e6999b7f0b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221243
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Auto-Submit: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
With the modified gm, the time (on an iMac Pro) goes from 2.4 to 1.6.
Change-Id: I9f940220c129f74771f3b17126657bcf3739044f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221176
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Bug: skia:9168
Change-Id: Id931959278a0638a333a5eb6a70117a9d04e25dd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221237
Reviewed-by: Eric Boren <borenet@google.com>
Commit-Queue: Eric Boren <borenet@google.com>
Change-Id: I20e652f2b6f9bf606b03c6dd4e346c3439ea8a0b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220876
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
Change-Id: I66713976f08b1dbf0966d9a901f666b9f834b659
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221096
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
Yet another way to transform a layer, disguised as a distort effect.
TBR=
Change-Id: Ic2d5479fa6ae27b460de60875924f73f77fc7f71
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/221001
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Slightly sharper, but far easier to hold:
- Remove Value union from interface, everything is a 32-bit
value type, or a collection thereof.
- Collapse to one version of Run (that takes count), and make
it a member on ByteCode.
- Similarly, move disassemble to ByteCodeFunction.
Change-Id: I07c85e65991178b3f52e20e815c25f36bc9c4257
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220753
Commit-Queue: Brian Osman <brianosman@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Bug: chromium:974177
Change-Id: I725f1ba63cbd57348419654a45934f09d1bcb6f4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220996
Commit-Queue: Greg Daniel <egdaniel@google.com>
Commit-Queue: Jim Van Verth <jvanverth@google.com>
Auto-Submit: Greg Daniel <egdaniel@google.com>
Reviewed-by: Jim Van Verth <jvanverth@google.com>
85fef1bc62..853ebacf99
git log 85fef1bc62f8..853ebacf99a4 --date=short --no-merges --format='%ad %ae %s'
2019-06-14 ynovikov@chromium.org Suppress Mac 10.13.6 Intel OpenGL dEQP failures
2019-06-13 dongja@google.com Vulkan: implement primitive restart
2019-06-13 syoussefi@chromium.org Vulkan: Output render pass loadOp in graph dump
2019-06-13 jmadill@chromium.org Vulkan: Implement a texture descriptor cache.
2019-06-13 shrekshao@google.com Enable floatBlend for D3D9
2019-06-13 dongja@google.com Vulkan: fix array size for internal shaders
2019-06-13 jmadill@chromium.org Refactor DrawCallPerfParams.
2019-06-13 spang@chromium.org Reland "Vulkan: Build validation layers with asserts only"
2019-06-13 jonahr@google.com Clean up and expose frontend features to egl.
2019-06-13 angle-autoroll@skia-public.iam.gserviceaccount.com Roll ./third_party/spirv-tools/src 9c0830133b07..42830e5a68c3 (2 commits)
2019-06-13 jmadill@chromium.org Vulkan: Prefer immediate present mode for benchmarking.
2019-06-13 jmadill@chromium.org Revert "Vulkan: Build validation layers with asserts only"
2019-06-13 angle-autoroll@skia-public.iam.gserviceaccount.com Roll ./third_party/spirv-headers/src 9cf7c3a7d2d2..de99d4d834ae (1 commits)
2019-06-13 jiajia.qin@intel.com Fix the crash in angle_deqp_gles31_tests
Created with:
gclient setdep -r third_party/externals/angle2@853ebacf99a4
The AutoRoll server is located here: https://autoroll.skia.org/r/angle-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Build-Debian9-Clang-x86_64-Release-ANGLE;skia.primary:Perf-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Perf-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-All-ANGLE;skia.primary:Test-Win10-Clang-ShuttleC-GPU-GTX960-x86_64-Debug-All-ANGLE
TBR=herb@google.com
Change-Id: I66cf86ba54e2d38f3be65bd353207b787f69c06c
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220942
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
f6ed31446f..a2197be674
Created with:
gclient setdep -r ../src@a2197be674
The AutoRoll server is located here: https://autoroll.skia.org/r/chromium-skia-autoroll
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.
CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Release-All-CommandBuffer;skia.primary:Test-Mac10.13-Clang-MacBookPro11.5-GPU-RadeonHD8870M-x86_64-Debug-All-CommandBuffer
TBR=herb@google.com
Change-Id: I3cd00f5edd315664e413f2cc72b51a305fb526b4
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220943
Reviewed-by: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: skia-autoroll <skia-autoroll@skia-public.iam.gserviceaccount.com>
This is a reland of 084fa1b52f
Original change's description:
> [skottie] Use metrics for Shaper vertical alignment
>
> Relying on visual bounds yields incorrect results in some cases (e.g.
> leading/trailing empty lines).
>
> Update the vertical alignment logic to use metrics instead:
>
> - track the first line ascent and last line descent
> - compute content height as
>
> first_ascent + last_descent + line_height * (line_count - 1)
>
> - relocate Result::computeBounds() to the unit test (only user)
>
> Empirically, this causes top-alignment to be less snug (likely due to
> ascent slack in the tested fonts).
>
> Bug: skia:9098
> Change-Id: Ib92bf907af8889d6b0d0fda22ef41a2cc8b50901
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220656
> Reviewed-by: Mike Reed <reed@google.com>
> Commit-Queue: Florin Malita <fmalita@chromium.org>
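The content height formula quoted above can be sketched as follows. The function and parameter names are hypothetical, and ascent and descent are assumed here to be positive distances from the baseline; the actual Shaper code may use a different sign convention.

```cpp
// Hypothetical helper mirroring the formula in the description:
//   first_ascent + last_descent + line_height * (line_count - 1)
// Assumes ascent/descent are given as positive distances.
constexpr float content_height(float first_ascent, float last_descent,
                               float line_height, int line_count) {
    return first_ascent + last_descent +
           line_height * static_cast<float>(line_count - 1);
}

// A single line is just ascent + descent.
static_assert(content_height(10.0f, 3.0f, 16.0f, 1) == 13.0f, "one line");
// Three lines add two line heights between first and last baselines.
static_assert(content_height(10.0f, 3.0f, 16.0f, 3) == 45.0f, "three lines");
```

Because the result depends only on metrics, empty leading or trailing lines still contribute their line height.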
TBR=
No-Try: true
Bug: skia:9098
Change-Id: Iaba53968840749e35b9c3ed04b15d6e2cda55e72
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220916
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Now that we've got shr_16x2, extract(..., 8, splat(0x00ff00ff)) is
better done as shr_16x2(..., 8). This swaps a 16-bit shift in for
the 32-bit shift, a wash, but lets us drop the bit_and at the end,
saving one whole instruction.
This places I32_SWAR a tiny little bit faster than the code in Opts,
like .19 ns/px vs .20 ns/px for Opts.
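A scalar sketch of why the two forms agree. The names match the ops mentioned above, but these bodies are simplified single-lane models written for illustration, not the SkVM implementation.

```cpp
#include <cstdint>

// Model of extract: 32-bit logical shift right, then mask.
constexpr uint32_t extract(uint32_t x, int bits, uint32_t mask) {
    return (x >> bits) & mask;
}

// Model of shr_16x2: shift each 16-bit half right independently,
// so no bits leak from the high half into the low half.
constexpr uint32_t shr_16x2(uint32_t x, int bits) {
    return ((x & 0xffffu) >> bits) | (((x >> 16) >> bits) << 16);
}

// The 16-bit shift already zeroes the top byte of each half,
// so the bit_and with 0x00ff00ff is no longer needed.
static_assert(extract(0xAABBCCDDu, 8, 0x00ff00ffu) == 0x00AA00CCu, "shift + mask");
static_assert(shr_16x2(0xAABBCCDDu, 8) == 0x00AA00CCu, "16x2 shift alone");
```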
Change-Id: I4160dc03ecc8b855c0773a927f1510ad5cbb4b87
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220856
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
This is the final bunny I've got in my hat, I think...
Remembering that none of the s += d*invA adds can overflow,
we can use a single 32-bit add to add them all at once.
This means we don't have to unpack the src pixel into rb/ga
halves. We need only extract the alpha for invA.
This brings I32_SWAR even with the Opts code!
curr/maxrss  loops  min      median   mean     max      stddev  samples     config        bench
36/36 MB     133    0.206ns  0.211ns  0.208ns  0.211ns  1%      ▁▇▁█▁▇▁▇▁▇  nonrendering  SkVM_4096_I32_SWAR
37/37 MB     152    0.432ns  0.432ns  0.434ns  0.444ns  1%      ▃▁▁▁▁▃▁▁█▁  nonrendering  SkVM_4096_I32
37/37 MB     50     0.781ns  0.794ns  0.815ns  0.895ns  5%      ▆▂█▃▅▂▂▁▂▁  nonrendering  SkVM_4096_F32
37/37 MB     76     0.773ns  0.78ns   0.804ns  0.907ns  6%      ▄█▅▁▁▁▁▂▁▁  nonrendering  SkVM_4096_RP
37/37 MB     268    0.201ns  0.203ns  0.203ns  0.204ns  0%      █▇▆▆▆▆▁▆▆▆  nonrendering  SkVM_4096_Opts
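A scalar sketch of the single-add idea described above, under stated assumptions: pixels are premultiplied 8888 with alpha in the top byte, invA is taken as 255 - alpha, and d*invA is approximated with a shift by 8. Those choices and all function names are illustrative and may differ from what the benchmark program does. With them, each channel sum stays at or below 255, so one 32-bit add cannot carry between bytes.

```cpp
#include <cstdint>

// Scale all four channels of d by invA (0..255), two channels at a time.
// Each 16-bit product is at most 255*255, so halves never interfere.
constexpr uint32_t scale_dst(uint32_t d, uint32_t invA) {
    return ((((d & 0x00ff00ffu) * invA) >> 8) & 0x00ff00ffu)
         | ((((d >> 8) & 0x00ff00ffu) * invA) & 0xff00ff00u);
}

// src is never unpacked; only its alpha is extracted for invA.
constexpr uint32_t srcover_one_add(uint32_t s, uint32_t d) {
    return s + scale_dst(d, 255u - (s >> 24));
}

// Reference: the same math done one channel at a time.
constexpr uint32_t srcover_per_channel(uint32_t s, uint32_t d) {
    uint32_t invA = 255u - (s >> 24);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t sc = (s >> shift) & 0xffu;
        uint32_t dc = (d >> shift) & 0xffu;
        out |= ((sc + ((dc * invA) >> 8)) & 0xffu) << shift;
    }
    return out;
}

// Premultiplied src (every channel <= alpha) over opaque white.
static_assert(srcover_one_add(0x80402010u, 0xFFFFFFFFu) == 0xFEBE9E8Eu, "one add");
static_assert(srcover_per_channel(0x80402010u, 0xFFFFFFFFu) == 0xFEBE9E8Eu, "reference");
```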
Change-Id: Ibf0a9c5d90b35f1e9cf7265868bd18b7e0a76c43
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220805
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
It's doing an arithmetic shift at head,
but we want a logical shift there...
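For reference, a small sketch of the difference between the two shifts on a value whose top bit is set. The function names are hypothetical. The arithmetic variant relies on signed right shift replicating the sign bit, which is guaranteed from C++20 and is what GCC, Clang, and MSVC do in earlier modes.

```cpp
#include <cstdint>

// Logical shift: zeros are shifted in from the top.
constexpr uint32_t shr_logical(uint32_t x, int bits) {
    return x >> bits;
}

// Arithmetic shift: the sign bit is replicated from the top
// (assumed behavior of signed >> before C++20, guaranteed since).
constexpr uint32_t shr_arithmetic(uint32_t x, int bits) {
    return static_cast<uint32_t>(static_cast<int32_t>(x) >> bits);
}

// Extracting a top byte of 0xAA: only the logical shift gives 0xAA.
static_assert(shr_logical(0xAABBCCDDu, 24) == 0x000000AAu, "zero fill");
static_assert(shr_arithmetic(0xAABBCCDDu, 24) == 0xFFFFFFAAu, "sign fill");
```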
Change-Id: I82eba87ccc3fba6a9511bf3a4d3ff88d90c29585
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220855
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
Change-Id: I873a17712c77413ceb09ccc9a0813a5838fe62e7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220754
Commit-Queue: Greg Daniel <egdaniel@google.com>
Commit-Queue: Robert Phillips <robertphillips@google.com>
Auto-Submit: Greg Daniel <egdaniel@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
When extracting contours for the even/odd fill type, connected holes
in the interior of a shape don't get rendered correctly.
The fix is to extract the subcontours separately from the outer contour.
To do this, we abort contour extraction the first time we re-encounter
the starting vertex. This causes the hole to be extracted as a
separate contour.
Bug: 908646
Change-Id: I047b77c74605987c40c12a228fd2898c9aa74e55
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220776
Reviewed-by: Robert Phillips <robertphillips@google.com>
Commit-Queue: Stephen White <senorblanco@chromium.org>
Check for HSW feature set, and that Programs use 15 or fewer registers
(reserving ymm15 for tmp).
With those in place, we can try playing with instructions past AVX2
even; for instance, AVX-512F added vpmovusdb which is exactly our
store8. Check for SkCPU::SKX support and use that then!
(This does not widen to 512-bit vectors... still just working with ymm.)
Change-Id: I49ae9fe4ad98d1b74daa84fcdd0697e1c5b5063f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220842
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
The new API eliminates all need to access the path inner workings.
There are some uses of the cast (SkGlyph*); these facilitate
the larger change this is a part of. They will be eliminated when all
is done.
Some of the code has been changed to use strike->glyph(id) and SkGlyph*
to help with the flow of the code.
Change-Id: Id8dc84076f56e1e39450367a0440d15954dbdc71
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220523
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
This reverts commit f9a8688b4e.
Reason for revert: breaking processor unit tests
Original change's description:
> Make GrBicubicEffect also support centripetal Catmull-Rom kernel.
>
> Use new kernel in async rescale APIs.
>
> Bug: skia:8962
> Change-Id: Ife8f56f54b5df58cedd65b54083c7c0716b8c633
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/216352
> Reviewed-by: Brian Osman <brianosman@google.com>
> Commit-Queue: Brian Salomon <bsalomon@google.com>
TBR=bsalomon@google.com,brianosman@google.com
Change-Id: Idf317e76b870407060113dc60dd3776abc07f810
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: skia:8962
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220751
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
Change-Id: I70c7c4769e0a8d95b590a85fab34529041f8af8a
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220841
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
Change-Id: I59dd970b6fd512f2e2ee08cc821b758b950a2b53
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220743
Reviewed-by: Jim Van Verth <jvanverth@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
Use new kernel in async rescale APIs.
Bug: skia:8962
Change-Id: Ife8f56f54b5df58cedd65b54083c7c0716b8c633
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/216352
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Brian Salomon <bsalomon@google.com>
I figure the easiest way to expose 16-bit operations
is to expose 16x2 pair operations... this means we
can continue to always work with the same size vector.
Switching from 32-bit multiplies to 16-bit multiplies
is going to deliver the most oomph... they cost roughly
half what 32-bit multiplies do on x86.
Speed now:
I32_SWAR: 0.27 ns/px
I32: 0.43 ns/px
F32: 0.76 ns/px
RP: 0.8 ns/px
Opts: 0.2 ns/px
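A scalar sketch of what a 16x2 pair operation means, assuming each 32-bit lane is treated as two independent 16-bit values. The name mul_16x2 and its body are an illustrative model, not the SkVM implementation.

```cpp
#include <cstdint>

// Multiply the low halves and the high halves separately, keeping
// the low 16 bits of each product, then repack into one 32-bit lane.
constexpr uint32_t mul_16x2(uint32_t x, uint32_t y) {
    return (((x & 0xffffu) * (y & 0xffffu)) & 0xffffu)
         | ((((x >> 16) * (y >> 16)) & 0xffffu) << 16);
}

// 3*5 in the high half, 2*7 in the low half.
static_assert(mul_16x2(0x00030002u, 0x00050007u) == 0x000F000Eu, "pairwise");
// Overflow in the low half wraps instead of carrying into the high half.
static_assert(mul_16x2(0x0000FFFFu, 0x00000002u) == 0x0000FFFEu, "no carry across");
```

The vector stays the same width either way; only the interpretation of each lane changes.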
Change-Id: I8350c71722a9bde714ba18f97b8687fe35cc749f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220709
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>