Commit Graph

859 Commits

Author SHA1 Message Date
Chris Dalton
6f8fa4e6ca Implement Sk2f::Store4
Bug: skia:
Change-Id: I2adb983d68625d327e7c00e53b6ae4703b46252f
Reviewed-on: https://skia-review.googlesource.com/104761
Commit-Queue: Chris Dalton <csmartdalton@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
2018-02-07 05:06:15 +00:00
Kevin Lubick
4284613cfe Enable conditional-uninitialized flag
Bug: skia:7462
Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-All-SKNX_NO_SIMD
Change-Id: I1c0a09984bf28a5c620a89af56040f018bae6310
Reviewed-on: https://skia-review.googlesource.com/90941
Commit-Queue: Kevin Lubick <kjlubick@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
2018-01-05 18:03:25 +00:00
Herbert Derby
d8e2b13ad4 Remove previous blur image implementation. Try 2
The last version had a problem with the no simd compilation.

TBR=mtklein@google.com

Change-Id: I139388cf3bf1b55cb4a49133a98be129fd9711c6
Reviewed-on: https://skia-review.googlesource.com/80983
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Herb Derby <herb@google.com>
2017-12-06 17:37:49 +00:00
Herb Derby
2a3985e98d Revert "Remove previous blur image implementation"
This reverts commit a11346dd24.

Reason for revert: Compile problems on linux

Original change's description:
> Remove previous blur image implementation
> 
> Change-Id: Ie3bada767f0ba945cb17f174f179510768eb178d
> Reviewed-on: https://skia-review.googlesource.com/77583
> Commit-Queue: Herb Derby <herb@google.com>
> Reviewed-by: Mike Klein <mtklein@google.com>

TBR=mtklein@google.com,herb@google.com

Change-Id: I5f157fbc2fd4e2a37618dc9c0e72621ff9892ef6
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/81100
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Herb Derby <herb@google.com>
2017-12-06 15:45:10 +00:00
Herbert Derby
a11346dd24 Remove previous blur image implementation
Change-Id: Ie3bada767f0ba945cb17f174f179510768eb178d
Reviewed-on: https://skia-review.googlesource.com/77583
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@google.com>
2017-12-06 15:27:19 +00:00
Chris Dalton
0cb75879a5 Add Store3 to Sk2f
Bug: skia:
Change-Id: I0377e6a1dd8259e944f7902a5c68af524fa588c7
Reviewed-on: https://skia-review.googlesource.com/79382
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
2017-12-01 21:12:49 +00:00
Mike Klein
213d821462 add Load2() to Sk4f
and test it.

Change-Id: Ib0c2cf93c63d8d3c36a7d4d60bbec4ecede29bc7
Reviewed-on: https://skia-review.googlesource.com/78480
Reviewed-by: Chris Dalton <csmartdalton@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-12-01 19:26:02 +00:00
Herb Derby
b8b3086de3 Fix Sk8b reading too many bytes
Change-Id: I0e94ef1620b54405a23470507e2b2c4bb54731c9
Reviewed-on: https://skia-review.googlesource.com/72860
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Herb Derby <herb@google.com>
2017-11-16 21:39:19 +00:00
Herb Derby
d1b3c7846d Support for direct gaussian blur evaluation
Change-Id: I1b00ba2720648b75fce47d3f4d0f56fb8f2cd171
Reviewed-on: https://skia-review.googlesource.com/67041
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Herb Derby <herb@google.com>
2017-11-02 19:34:11 +00:00
Herb Derby
5eb1528234 Add mulHi to SkNx
Add mulHi to base SkNx, and specialize implementations for Sk4u for
neon and sse.

Add casts for converting from uint8_t by 4 to uint32_t by 4.

Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I29a32e2ad9812a47fff841ceca334e562362836f
Reviewed-on: https://skia-review.googlesource.com/57960
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Herb Derby <herb@google.com>
2017-10-11 08:21:09 +00:00
Cary Clark
a4083c97d4 make most of SkColorPriv.h private
created new file src/core/SkColorData.h for
internal consumption. Note that many of the
functions there are unused as well.

Bug: skia: 6898
R: reed@google.com
Change-Id: I25bfd5a9c21f53558c4ca65a77eb5d322d897c6d
Reviewed-on: https://skia-review.googlesource.com/46848
Commit-Queue: Cary Clark <caryclark@google.com>
Reviewed-by: Mike Reed <reed@google.com>
2017-09-15 16:31:35 +00:00
Herb Derby
0f96bb303a Sk4i version of blur.
For the blur_1.50_normal_low_quality benchmark, this code goes from about 120us to 85us.
The original implementation executes at about 95us.

This changed in controlled by the flag:

SK_SUPPORT_LEGACY_SLOW_SMALL_BLUR

BUG=chromium:759070

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: If722cb8ffd8c47a94b7a6b4e6dd26fd1474b6209
Reviewed-on: https://skia-review.googlesource.com/45300
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
2017-09-14 03:42:50 +00:00
Chris Dalton
7732f4f8f2 Add missing methods to neon/sse SkNx implementations
Adds negate, abs, sqrt to Sk2f and/or Sk4f.

Bug: skia:
Change-Id: I0688dae45b32ff94abcc0525ef1f09d666f9c6e9
Reviewed-on: https://skia-review.googlesource.com/39642
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Chris Dalton <csmartdalton@google.com>
2017-08-28 21:21:36 +00:00
Mike Klein
cd71f115a8 make SkOpts functions inline, not static
When Skia's built with an interestingly advanced instruction set
baseline like SSSE3 or SSE4.1, we end up with two distinct copies of
some SkOpts functions, one default in SkOpts.o and one specialization
from SkOpts_{ssse3,sse41}.o.  These functions are static, and so are
technically unrelated, even though they're the same code compiled with
the same instructions available.  They're going to be identical.

What we want here is to remove static but mark them as inline instead.
In this case inline means "if the linker sees multiple copies of this,
that's cool, just pick any one arbitrarily".  That's just what we want.

Now, when I disassemble a binary before and after this change, I do see
the redundant routines removed.  However, the file size change is
minimal... I suspect that this must mean the linker has noticed that we
had identical code and physically folded the two logically independent
routines.  I don't know how prevalent this optimization is, though, so
it doesn't hurt to give it more of a "one copy please" hint with inline.
There may also be a difference here between the binary size (~unchanged)
and the in-memory layout of that binary?

Change-Id: Id9c8f0ffc84aa1c9a066c22b623d34adab281857
Reviewed-on: https://skia-review.googlesource.com/37501
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Ben Wagner <bungeman@google.com>
2017-08-23 17:25:26 +00:00
Mike Reed
dc03ddeb7e remove code associated with legacy affine imageshaders
requires https://skia-review.googlesource.com/c/33180

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: I226e120cc5aebe393bda8bc069e7927fdc981a0e
Reviewed-on: https://skia-review.googlesource.com/36800
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
2017-08-23 12:41:35 +00:00
Mike Klein
25954b64c0 explicitly vectorize sk_memset{16,32,64}
This ought to help clients who don't enable autovectorization.

With autovectorization enabled, this new version is like,
hyper-vectorized compared to the old autovectorization.
Instead of handling 128 bytes max per loop, it now
handles up to 512 bytes per loop.  Pretty exciting.

Locally perf effects are a mix, but we'd expect this to help
Chrome unambiguously if they've turned off autovectorization.

  $ out/ok bench:samples=100 sw filter:match=memset32_\\d\* serial

  Before:
    [memset32_100000]   16ms    @0  20.1ms  @99 20.2ms  @100
    [memset32_10000]    1.07ms  @0  1.26ms  @99 1.31ms  @100
    [memset32_1000]     73.9µs  @0  89.4µs  @99 90.1µs  @100
    [memset32_100]      8.59µs  @0  9.74µs  @99 9.96µs  @100
    [memset32_10]       7.45µs  @0  8.96µs  @99 8.99µs  @100
    [memset32_1]        2.29µs  @0  2.81µs  @99 2.92µs  @100

  After:
    [memset32_100000]   16.2ms  @0  17.3ms  @99 17.3ms  @100
    [memset32_10000]    1.06ms  @0  1.18ms  @99 1.23ms  @100
    [memset32_1000]     72µs    @0  75.6µs  @99 84.7µs  @100
    [memset32_100]      9.14µs  @0  10.6µs  @99 10.7µs  @100
    [memset32_10]       5.43µs  @0  5.88µs  @99 5.99µs  @100
    [memset32_1]        3.43µs  @0  3.65µs  @99 3.83µs  @100

BUG=chromium:755391

Change-Id: If9059a30ca7a345f1f7c37bd51473c29e8bb8922
Reviewed-on: https://skia-review.googlesource.com/34746
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-08-15 22:28:17 +00:00
Mike Klein
0fddb2d7c1 Retry cleaning up SkLinearBitmapPipeline.
This is mostly dead code.

In order to make it truly dead, we need to opt drawing unpremul images
into SkRasterPipelineBlitter.  They had been handled by
SkLinearBitmapPipeline, but can't be draw by SkBitmapProcLegacyShader.

Drawing unpremul images is tested by the GM all_variants_8888, which
gave us trouble last time around (serialize-8888 drew right, 8888 wrong)
but now draws fine.  I think this was probably also the root of the
revert, drawing some unpremul image in Chrome's tests somewhere.

Change-Id: I453f9df44ade807316935921cbae82961e2f08aa
Reviewed-on: https://skia-review.googlesource.com/24862
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-07-20 16:45:52 +00:00
Mike Reed
e32500f064 Assume HQ is handled by pipeline, delete legacy code-path
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: If6f0d0a57463bf99a66d674e65a62ce3931d0116
Reviewed-on: https://skia-review.googlesource.com/24644
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
2017-07-20 00:43:37 +00:00
Yuqian Li
e94865e934 Simplify Sk4i's Min/Max
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: Id82a58f2f4d947894bb710cb3190c873b20b98eb
Reviewed-on: https://skia-review.googlesource.com/23404
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
2017-07-14 18:14:04 +00:00
Yuqian Li
7da6ba2d63 Implement Sk4i's abs, min, max
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: Ia9ec3f72095e1c744f88df7bb990d99e0f87d578
Reviewed-on: https://skia-review.googlesource.com/22720
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Herb Derby <herb@google.com>
2017-07-12 20:20:43 +00:00
Mike Reed
6ee2965820 remove unreachable perspective code for imageshader
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: If9a7df3e1c387098b00bf1cc1a37c36c6d256ef1
Reviewed-on: https://skia-review.googlesource.com/22348
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
2017-07-12 13:19:35 +00:00
Hal Canary
466c7d6597 header cleanup
Change-Id: I3f7667a1357194ae2bdd341ad9d46eb93920f404
Reviewed-on: https://skia-review.googlesource.com/21374
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Hal Canary <halcanary@google.com>
2017-07-05 15:18:52 +00:00
Mike Reed
41dc6cc621 remove unreachable samples for non-N32 imageshaders
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: I2d9a22b22d72c81a742b8fd497797bff8174915b
Reviewed-on: https://skia-review.googlesource.com/21264
Commit-Queue: Eric Boren <borenet@google.com>
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
2017-06-29 19:13:50 +00:00
Mike Reed
5e78c61075 Revert "remove a bit more dead code"
This reverts commit d9b1fe02a6.

Reason for revert: try to fix chrome roll

Original change's description:
> remove a bit more dead code
> 
> Change-Id: I61484672e88d6bb4f75833ee89e7178c4f34d610
> Reviewed-on: https://skia-review.googlesource.com/20780
> Commit-Queue: Mike Klein <mtklein@google.com>
> Reviewed-by: Herb Derby <herb@google.com>

TBR=mtklein@chromium.org,mtklein@google.com,herb@google.com

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: I03dcd344dfb138261d9421b0692d12e4ed431100
Reviewed-on: https://skia-review.googlesource.com/20822
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
2017-06-26 13:53:22 +00:00
Mike Klein
d9b1fe02a6 remove a bit more dead code
Change-Id: I61484672e88d6bb4f75833ee89e7178c4f34d610
Reviewed-on: https://skia-review.googlesource.com/20780
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
2017-06-24 16:08:23 +00:00
Mike Reed
0215312f5f remove unused SkFilterProcs
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: I7f9abb3ecb1c7b4fd18a703198c54c6d8f5e1ef7
Reviewed-on: https://skia-review.googlesource.com/20429
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
2017-06-21 19:17:02 +00:00
Mike Klein
f1b41a73d4 Refactor SkBlurImageFilter_opts.h for readability.
This is mostly reindentation to make each platform's code clear, and
rewriting comments to match the usual way we orient vector comments
these days (in memory order).

Change-Id: I7ceb98c5af88980e74b6a124507e0ef1900fc731
Reviewed-on: https://skia-review.googlesource.com/19663
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-06-13 17:35:17 +00:00
Mike Reed
ce9514c6cd remove unneeded proc fields
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: Ibf997c8d19a045d41d3e92b8db63c36f8fa10b3e
Reviewed-on: https://skia-review.googlesource.com/19441
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
2017-06-12 14:44:30 +00:00
Mike Reed
f066ac908e replace 4f procs with pipeline (only called in 2 places by ganesh)
enables lots of code to delete

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: I13631ead68a9232bd8c13c5ef54727f44def26ca
Reviewed-on: https://skia-review.googlesource.com/19278
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Reed <reed@google.com>
2017-06-09 19:51:04 +00:00
Mike Reed
1608a1dd17 remove unused xfermode methods
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Bug: skia:
Change-Id: Ibc7d581bcc40134ee7cf57bb65fee2d70e119bc7
Reviewed-on: https://skia-review.googlesource.com/18842
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Reed <reed@google.com>
2017-06-06 13:41:19 +00:00
Mike Klein
25e90055a0 simplify sse41::srcover_srgb_srgb
Most importantly, remove the undefined behavior implied by "delta".

Change-Id: I8f9740804ec74dd40b049eafd4f0d51b36ce3237
Reviewed-on: https://skia-review.googlesource.com/18140
Reviewed-by: Herb Derby <herb@google.com>
2017-05-30 19:49:32 +00:00
Mike Klein
836e6c1f7d remove sse2::srcover_srgb_srgb
Change-Id: Icc570d8a8f1df1dea202e1d234433491122b9b67
Reviewed-on: https://skia-review.googlesource.com/18135
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-05-30 19:16:46 +00:00
Matteo Franchin
a132c3869f Faster and more accurate blit_row_s32a_opaque for ARM
Change ARM implementation of alpha blending to work on 8 pixels at a
time (using NEON). Also improve the accuracy of alpha blending by using
a formula based on SkMulDiv255Round rather than SkPMSrcOver.

Note that a number of variations of this code were considered. Here are
some notes:

- A 16 pixels at a time version was considered. This performs well for
  the case of extreme alpha (all-opaque or all-transparent pixels), but
  performs worst than the 8 pixels version when there are frequent
  transitions of alpha. Also gcc 6.2.1 seems to have troubles with
  register pressure when using this version.

- If the branch to detect the fully-opaque or fully-transparent cases
  is removed, then the performance increases significantly for images
  which are all partially transparent (especially on ARM Cortex A72),
  but can significantly decrease for images that are almost fully
  opaque or fully transparent.

This implementation is a compromise to the effects described above.
This patch produces a ~10% improvement on the nanobench's sub-scores
repeatTile_BGRA_8888_A, constXTile_MM_filter_trans, constXTile_CC_trans,
constXTile_RR_filter_trans when running on ARM Cortex A72. Improvements
of greater magnitude (20% to 30%) are observed when running on ARM
Cortex A53.

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I1f0c9f549057613bbffd26e6651f3beeb0019af9
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/16520
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-05-26 19:02:11 +00:00
Mike Klein
87db001115 move sk_memset?? to SkOpts
This lets the compiler generate AVX versions with wider writes.

Change-Id: Ia63825e70c72bdb4d14bef97d8b4ea4be54c9d84
Reviewed-on: https://skia-review.googlesource.com/17715
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-05-23 17:02:53 +00:00
Mike Klein
c6820383b2 remove old 565 destination opts
This is not an important format, and the code is dead or close to it.
The code is an occasional maintenance burden so I'd like it gone.

Change-Id: I4ad921533abf3211e6a81e6e475b848795eea060
Reviewed-on: https://skia-review.googlesource.com/15600
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-05-06 14:57:12 +00:00
Amaury Le Leyzour
4c29633ca4 CRC32 no longer restricted to ARM64
On a simple benchmark the CRC32 version is about 3x faster on
ARM Cortex A57 (Aarch32) than the Murmur3 scalar version.

BUG=skia:

Change-Id: I71515e8463a33924998b837ff9f32202690dd2fe
Reviewed-on: https://skia-review.googlesource.com/15480
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-05-04 22:27:59 +00:00
Amaury Le Leyzour
ac0e705af1 Fix new IT blocks ARMv8
ARMv8 specifies that an IT block should be followed by only one 16-bit instruction.
* SkFloatToFix is back to a C implementation that mirrors the assembly code.

* S32A_D565_Opaque_neon switched the usage of the temporary 'ip' register to let
the compiler choose what is best in the context of the IT block. And replaced
'keep_dst' by 'ip' where low register or high register does not matter.

BUG=skia:

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: If587110a0c74b637ae99460419d46cf969c694fc
Reviewed-on: https://skia-review.googlesource.com/9346
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-26 22:45:15 +00:00
Mike Klein
1be0db86aa long live SkJumper
Change-Id: I5de3c8daae80e437b3553ab6afcee7120a1bb775
Reviewed-on: https://skia-review.googlesource.com/14038
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-21 19:02:28 +00:00
Mike Klein
dc80eaa971 kill off shader_adapter
I still plan to replace this more thoroughly with a different blitter,
but for now just implement it using callback.

This is the last stage not supported by SkJumper!  Will follow up by
removing all of SkRasterPipeline_opts.h and anything that indicates
SkJumper might not work.

Change-Id: I96ba2bb0a26266f3b658e5f3153ec7d5bbd46799
Reviewed-on: https://skia-review.googlesource.com/14037
Reviewed-by: Florin Malita <fmalita@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-21 17:21:37 +00:00
Mike Klein
c17dc24fa9 jumper, rework callback a bit, use it for color_lookup_table
Looks like the color-space images have this well tested (even without
lab_to_xyz) and the diffs look like rounding/FMA.

The old plan to keep loads and stores outside callback was:
  1) awkward, with too many pointers and pointers to pointers to track
  2) misguided... load and store stages march ahead by x,
     working at ptr+0, ptr+8, ptr+16, etc. while callback
     always wants to be working at the same spot in the buffer.

I spent a frustrating day in lldb to understood 2).  :/

So now the stage always store4's its pixels to a buffer in the context
before the callback, and when the callback returns it load4's them back
from a pointer in the context, defaulting to that same buffer.

Instead of passing a void* into the callback, we pass the context
itself.  This lets us subclass the context and add our own data...
C-compatible object-oriented programming.

Change-Id: I7a03439b3abd2efb000a6973631a9336452e9a43
Reviewed-on: https://skia-review.googlesource.com/13985
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-21 16:00:06 +00:00
Mike Klein
7fee90cb5e add a callback stage to SkRasterPipeline
This lets us temporarily escape to piece of code outside
SkRasterPipeline.  We should be able to use this to replace
   - parametric_{r,g,b,a}
   - table_{r,g,b,a}
   - color_lookup_table
   - shader_adapter*

* We want to obsolete shader_adapter for other reasons anyway,
but we _could_ replace it with this if we want to.

Change-Id: I42b657b3c19c679796ed1876856cae0c8471307e
Reviewed-on: https://skia-review.googlesource.com/12102
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Matt Sarett <msarett@google.com>
2017-04-17 14:06:33 +00:00
Mike Klein
0a9044950c jumper, bilinear and bicubic sampling stages
This splits SkImageShaderContext into three parts:
   - SkJumper_GatherCtx: always, already done
   - SkJumper_SamplerCtx: when bilinear or bicubic
   - MiscCtx: other little bits (the matrix, paint color, tiling limits)

Thanks for the snazzy allocator that allows this Herb!

Both SkJumper and SkRasterPipeline_opts.h should be speaking all the
same types now.

I've copied the comments about bilinear/bicubic to SkJumper with little
typo fixes and clarifications.

Change-Id: I4ba7b7c02feba3f65f5292169a22c060e34933c6
Reviewed-on: https://skia-review.googlesource.com/13269
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-12 18:57:09 +00:00
Mike Klein
994ef97339 make all gather_*() use SkJumper_GatherCtx
SkJumper_GatherCtx is a prefix of SkImageShaderContext, so
this is a no-op.  It helps to keep things straight, and I
do want to split apart the GatherCtx from a new SamplingCtx.

Change-Id: I9c5f436b096624c2809e1f810e9bcd6c6b00b883
Reviewed-on: https://skia-review.googlesource.com/13264
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-12 16:44:31 +00:00
Mike Klein
d177ae18d7 remove SkNx AVX code
We can't realistically use AVX and SkNx together because of ODR
problems, so remove the code that may tempt us to try.

Remaining code paths using AVX:
  - one intrinsics-only routine in SkOpts_hsw.cpp
  - SkJumper

Change-Id: I0d2d03b47ea4a0eec27f2de2b28a4c3d1ff8376f
Reviewed-on: https://skia-review.googlesource.com/13121
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-11 14:59:49 +00:00
Mike Klein
4482a14db9 fix -Fast bot
N=8 on that bot.

CQ_INCLUDE_TRYBOTS=skia.primary:Build-Ubuntu-Clang-x86_64-Release-Fast

Change-Id: If54ae800b50d9dffb9f983b23ff6f522657943b1
Reviewed-on: https://skia-review.googlesource.com/13061
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-10 16:57:02 +00:00
Herb Derby
7b4202de0e Add multi-stop SkJumper stage.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: I954d02638a785bec284d2fdf8f46abfccd474e7a
Reviewed-on: https://skia-review.googlesource.com/10211
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
2017-04-10 15:39:47 +00:00
Mike Klein
cf1b022c53 tweaks to make gather_* easier in SkJumper
This moves all the values that gather_8888, gather_a8, etc. need to
the front of SkImageShaderContext, and dereferences the color table.

This should be a no-op, but will make these stages easier to write
in SkJumper.

Change-Id: I0dff97d5113d14e941e7b717cd85f0036764eb88
Reviewed-on: https://skia-review.googlesource.com/11492
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-06 20:00:37 +00:00
Mike Klein
2d2ac3d162 remove trace and registers stages
These can't really be done with SkJumper.

Change-Id: Ic357f00695eacd2766f6dfb9a3be13b0c07c3650
Reviewed-on: https://skia-review.googlesource.com/11386
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
2017-04-05 19:53:33 +00:00
Matt Sarett
4c55027dbf Add support for F32 sources to SkColorSpaceXform
This also subtlely allows clients to convert between F32 and F16.

BUG=skia:

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: Ied5f2295fce00c69d8cf85730be899f3f8597915
Reviewed-on: https://skia-review.googlesource.com/9914
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Matt Sarett <msarett@google.com>
2017-03-21 12:46:37 +00:00
Florin Malita
52fe583a7b Remove SK_SUPPORT_LEGACY_BROKEN_LERP support
Chromium change landed.

BUG=chromium:696216

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: I3e67392b0fdad8c5a3ad256e4f190123dff6c846
Reviewed-on: https://skia-review.googlesource.com/9551
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
2017-03-13 15:08:29 +00:00