Commit Graph

8 Commits

Author SHA1 Message Date
mtklein
8d3e9dff3f Revert of Mike's radial gradient CL with better float -> int. (patchset #7 id:120001 of https://codereview.chromium.org/1109643002/)
Reason for revert:
compile failures.

Original issue's description:
> Mike's radial gradient CL with better float -> int.
>
> patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001)
>
> This looks quite launchable.  radial_gradient3, min of 100 samples:
>   N5:  985µs -> 946µs
>   MBP: 395µs -> 279µs
>
> On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time.  Is that something we could do in float math rather than with a lookup table?
>
> BUG=skia:
>
> CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Debug-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Debug-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/abf6c5cf95e921fae59efb487480e5b5081cf0ec

TBR=reed@google.com,robertphillips@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1109883003
2015-04-27 11:21:16 -07:00
mtklein
abf6c5cf95 Mike's radial gradient CL with better float -> int.
patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001)

This looks quite launchable.  radial_gradient3, min of 100 samples:
  N5:  985µs -> 946µs
  MBP: 395µs -> 279µs

On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time.  Is that something we could do in float math rather than with a lookup table?

BUG=skia:

CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Debug-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Debug-Trybot

Review URL: https://codereview.chromium.org/1109643002
2015-04-27 11:13:53 -07:00
mtklein
8fe8fffdfa Rename SkNi to SkNb.
As used today, SkNi is used in bool-y contexts.  This keeps that, but under a
new name, SkNb.  This makes room for a new SkNi that's focused on integer-y
things like loads, stores, arithmetic, etc.

The main reason to split these is that we want different specializations for
each use case: for bools, it's important for us to specialize 32- and 64-bit to
support efficient float- and double- comparisons, but for integer work we're
more likely to be looking at 8- and 16- bit lanes.  Keeping these use cases
siloed helps me manage the compexity of the backend NEON and SSE code.

BUG=skia:

Review URL: https://codereview.chromium.org/1083123002
2015-04-14 11:49:14 -07:00
mtklein
7792dbf7ea Code's more readable when SkPMFloat is an Sk4f.
#floats

BUG=skia:
BUG=skia:3592

Committed: https://skia.googlesource.com/skia/+/6b5dab889579f1cc9e1b5278f4ecdc4c63fe78c9

CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu-GCC-Arm64-Debug-Android-Trybot

Review URL: https://codereview.chromium.org/1061603002
2015-04-03 13:08:28 -07:00
mtklein
e758579d4f Revert of Code's more readable when SkPMFloat is an Sk4f. (patchset #3 id:40001 of https://codereview.chromium.org/1061603002/)
Reason for revert:
missed some neon code

Original issue's description:
> Code's more readable when SkPMFloat is an Sk4f.
>  #floats
>
> BUG=skia:
> BUG=skia:3592
>
> Committed: https://skia.googlesource.com/skia/+/6b5dab889579f1cc9e1b5278f4ecdc4c63fe78c9

TBR=reed@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1056143004
2015-04-03 12:47:48 -07:00
mtklein
6b5dab8895 Code's more readable when SkPMFloat is an Sk4f.
#floats

BUG=skia:
BUG=skia:3592

Review URL: https://codereview.chromium.org/1061603002
2015-04-03 12:45:05 -07:00
mtklein
a156a8ffbe Use switch operator[](int) to kth<int>() so we can use vget_lane.
#floats

BUG=skia:
BUG=skia:3592

Review URL: https://codereview.chromium.org/1059743002
2015-04-03 06:16:13 -07:00
mtklein
c9adb05b64 Refactor Sk2x<T> + Sk4x<T> into SkNf<N,T> and SkNi<N,T>
The primary feature this delivers is SkNf and SkNd for arbitrary power-of-two N.  Non-specialized types or types larger than 128 bits should now Just Work (and we can drop in a specialization to make them faster).  Sk4s is now just a typedef for SkNf<4, SkScalar>; Sk4d is SkNf<4, double>, Sk2f SkNf<2, float>, etc.

This also makes implementing new specializations easier and more encapsulated.  We're now using template specialization, which means the specialized versions don't have to leak out so much from SkNx_sse.h  and SkNx_neon.h.

This design leaves us room to grow up, e.g to SkNf<8, SkScalar> == Sk8s, and to grown down too, to things like SkNi<8, uint16_t> == Sk8h.

To simplify things, I've stripped away most APIs (swizzles, casts, reinterpret_casts) that no one's using yet.  I will happily add them back if they seem useful.

You shouldn't feel bad about using any of the typedef Sk4s, Sk4f, Sk4d, Sk2s, Sk2f, Sk2d, Sk4i, etc.  Here's how you should feel:
  - Sk4f, Sk4s, Sk2d: feel awesome
  - Sk2f, Sk2s, Sk4d: feel pretty good

No public API changes.
TBR=reed@google.com

BUG=skia:3592

Review URL: https://codereview.chromium.org/1048593002
2015-03-30 10:50:27 -07:00