Refactor Sk2x<T> + Sk4x<T> into SkNf<N,T> and SkNi<N,T>
The primary feature this delivers is SkNf and SkNd for arbitrary power-of-two N. Non-specialized types or types larger than 128 bits should now Just Work (and we can drop in a specialization to make them faster). Sk4s is now just a typedef for SkNf<4, SkScalar>; Sk4d is SkNf<4, double>, Sk2f is SkNf<2, float>, etc.
This also makes implementing new specializations easier and more encapsulated. We're now using template specialization, which means the specialized versions don't have to leak out so much from SkNx_sse.h and SkNx_neon.h.
This design leaves us room to grow up, e.g. to SkNf<8, SkScalar> == Sk8s, and to grow down too, to things like SkNi<8, uint16_t> == Sk8h.
To simplify things, I've stripped away most APIs (swizzles, casts, reinterpret_casts) that no one's using yet. I will happily add them back if they seem useful.
You shouldn't feel bad about using any of the typedefs Sk4s, Sk4f, Sk4d, Sk2s, Sk2f, Sk2d, Sk4i, etc. Here's how you should feel:
- Sk4f, Sk4s, Sk2d: feel awesome
- Sk2f, Sk2s, Sk4d: feel pretty good
No public API changes.
TBR=reed@google.com
BUG=skia:3592
Review URL: https://codereview.chromium.org/1048593002
2015-03-30 17:50:27 +00:00
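A minimal sketch of the template-specialization idea described in the message above. The names are hypothetical (VecN is not Skia's SkNf or SkNi), and the layout is an assumption: here the generic N-wide type is built from two N/2-wide halves, and a one-lane specialization ends the recursion. A platform-specific specialization, for example an SSE-backed 4-float type, could be added later without changing any caller.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; not the Skia implementation.

// Generic case: an N-wide vector is two N/2-wide halves, so every
// power-of-two N works without a hand-written specialization.
template <int N, typename T>
struct VecN {
    static_assert(N > 1 && (N & (N - 1)) == 0, "N must be a power of two");

    VecN<N/2, T> lo, hi;

    VecN() = default;
    explicit VecN(T v) : lo(v), hi(v) {}
    VecN(const VecN<N/2, T>& l, const VecN<N/2, T>& h) : lo(l), hi(h) {}

    static VecN Load(const T* p) {
        return VecN(VecN<N/2, T>::Load(p), VecN<N/2, T>::Load(p + N/2));
    }
    void store(T* p) const { lo.store(p); hi.store(p + N/2); }

    VecN operator+(const VecN& o) const { return VecN(lo + o.lo, hi + o.hi); }
    VecN operator*(const VecN& o) const { return VecN(lo * o.lo, hi * o.hi); }
};

// Base case: a single lane. The recursion above bottoms out here.
template <typename T>
struct VecN<1, T> {
    T val;

    VecN() = default;
    explicit VecN(T v) : val(v) {}

    static VecN Load(const T* p) { return VecN(*p); }
    void store(T* p) const { *p = val; }

    VecN operator+(const VecN& o) const { return VecN(val + o.val); }
    VecN operator*(const VecN& o) const { return VecN(val * o.val); }
};
```

With this shape, VecN<8, float> compiles even though no 8-wide code was written by hand; a faster specialization would simply replace the generic definition for that (N, T) pair.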
/*
 * Copyright 2015 Google Inc.
 *
 * Use of this source code is governed by a BSD-style license that can be
 * found in the LICENSE file.
 */

#include "Sk4px.h"
#include "SkNx.h"
#include "SkRandom.h"
#include "Test.h"

template <int N>
static void test_Nf(skiatest::Reporter* r) {

    auto assert_nearly_eq = [&](float eps, const SkNx<N, float>& v,
                                float a, float b, float c, float d) {
        auto close = [=](float a, float b) { return fabsf(a-b) <= eps; };
        float vals[4];
        v.store(vals);
        bool ok = close(vals[0], a) && close(vals[1], b)
               && close(   v[0], a) && close(   v[1], b);
        REPORTER_ASSERT(r, ok);
        if (N == 4) {
            ok = close(vals[2], c) && close(vals[3], d)
              && close(   v[2], c) && close(   v[3], d);
            REPORTER_ASSERT(r, ok);
        }
    };
    auto assert_eq = [&](const SkNx<N, float>& v, float a, float b, float c, float d) {
        return assert_nearly_eq(0, v, a,b,c,d);
    };

    float vals[] = {3, 4, 5, 6};
    SkNx<N,float> a = SkNx<N,float>::Load(vals),
                  b(a),
                  c = a;
    SkNx<N,float> d;
    d = a;

    assert_eq(a, 3, 4, 5, 6);
    assert_eq(b, 3, 4, 5, 6);
    assert_eq(c, 3, 4, 5, 6);
    assert_eq(d, 3, 4, 5, 6);

    assert_eq(a+b, 6, 8, 10, 12);
    assert_eq(a*b, 9, 16, 25, 36);
    assert_eq(a*b-b, 6, 12, 20, 30);
    assert_eq((a*b).sqrt(), 3, 4, 5, 6);
    assert_eq(a/b, 1, 1, 1, 1);
    assert_eq(SkNx<N,float>(0)-a, -3, -4, -5, -6);

    SkNx<N,float> fours(4);

    assert_eq(fours.sqrt(), 2,2,2,2);
    assert_nearly_eq(0.001f, fours.rsqrt(), 0.5, 0.5, 0.5, 0.5);

    assert_nearly_eq(0.001f, fours.invert(), 0.25, 0.25, 0.25, 0.25);

    assert_eq(SkNx<N,float>::Min(a, fours), 3, 4, 4, 4);
    assert_eq(SkNx<N,float>::Max(a, fours), 4, 4, 5, 6);

    // Test some comparisons.  This is not exhaustive.
    REPORTER_ASSERT(r, (a == b).allTrue());
    REPORTER_ASSERT(r, (a+b == a*b-b).anyTrue());
    REPORTER_ASSERT(r, !(a+b == a*b-b).allTrue());
    REPORTER_ASSERT(r, !(a+b == a*b).anyTrue());
    REPORTER_ASSERT(r, !(a != b).anyTrue());
    REPORTER_ASSERT(r, (a < fours).anyTrue());
    REPORTER_ASSERT(r, (a <= fours).anyTrue());
    REPORTER_ASSERT(r, !(a > fours).allTrue());
    REPORTER_ASSERT(r, !(a >= fours).allTrue());
}

DEF_TEST(SkNf, r) {
    test_Nf<2>(r);
    test_Nf<4>(r);
}

template <int N, typename T>
void test_Ni(skiatest::Reporter* r) {
    auto assert_eq = [&](const SkNx<N,T>& v, T a, T b, T c, T d, T e, T f, T g, T h) {
        T vals[8];
        v.store(vals);

        // The cases fall through on purpose: N == 8 also checks the lanes covered by 4 and 2.
        switch (N) {
          case 8: REPORTER_ASSERT(r, vals[4] == e && vals[5] == f && vals[6] == g && vals[7] == h);
          case 4: REPORTER_ASSERT(r, vals[2] == c && vals[3] == d);
          case 2: REPORTER_ASSERT(r, vals[0] == a && vals[1] == b);
        }
        // Same checks through operator[], again with intentional fallthrough.
        switch (N) {
          case 8: REPORTER_ASSERT(r, v[4] == e && v[5] == f &&
                                     v[6] == g && v[7] == h);
          case 4: REPORTER_ASSERT(r, v[2] == c && v[3] == d);
          case 2: REPORTER_ASSERT(r, v[0] == a && v[1] == b);
        }
    };

    T vals[] = { 1,2,3,4,5,6,7,8 };
    SkNx<N,T> a = SkNx<N,T>::Load(vals),
              b(a),
              c = a;
    SkNx<N,T> d;
    d = a;

    assert_eq(a, 1,2,3,4,5,6,7,8);
    assert_eq(b, 1,2,3,4,5,6,7,8);
    assert_eq(c, 1,2,3,4,5,6,7,8);
    assert_eq(d, 1,2,3,4,5,6,7,8);

    assert_eq(a+a, 2,4,6,8,10,12,14,16);
    assert_eq(a*a, 1,4,9,16,25,36,49,64);
    assert_eq(a*a-a, 0,2,6,12,20,30,42,56);

    assert_eq(a >> 2, 0,0,0,1,1,1,1,2);
    assert_eq(a << 1, 2,4,6,8,10,12,14,16);

    REPORTER_ASSERT(r, a[1] == 2);
}

DEF_TEST(SkNx, r) {
    test_Ni<2, uint16_t>(r);
    test_Ni<4, uint16_t>(r);
    test_Ni<8, uint16_t>(r);

    test_Ni<2, int>(r);
    test_Ni<4, int>(r);
    test_Ni<8, int>(r);
}

DEF_TEST(SkNi_min_lt, r) {
    // Exhaustively check the 8x8 bit space.
    for (int a = 0; a < (1<<8); a++) {
    for (int b = 0; b < (1<<8); b++) {
        Sk16b aw(a), bw(b);
        REPORTER_ASSERT(r, Sk16b::Min(aw, bw)[0] == SkTMin(a, b));
        REPORTER_ASSERT(r, !(aw < bw)[0] == !(a < b));
    }}

    // Exhausting the 16x16 bit space is kind of slow, so only do that in release builds.
#ifdef SK_DEBUG
    SkRandom rand;
    for (int i = 0; i < (1<<16); i++) {
        uint16_t a = rand.nextU() >> 16,
                 b = rand.nextU() >> 16;
        REPORTER_ASSERT(r, Sk16h::Min(Sk16h(a), Sk16h(b))[0] == SkTMin(a, b));
    }
#else
    for (int a = 0; a < (1<<16); a++) {
    for (int b = 0; b < (1<<16); b++) {
        REPORTER_ASSERT(r, Sk16h::Min(Sk16h(a), Sk16h(b))[0] == SkTMin(a, b));
    }}
#endif
}

DEF_TEST(SkNi_saturatedAdd, r) {
    for (int a = 0; a < (1<<8); a++) {
        for (int b = 0; b < (1<<8); b++) {
            int exact = a+b;
            if (exact > 255) { exact = 255; }
            if (exact <   0) { exact =   0; }

            REPORTER_ASSERT(r, Sk16b(a).saturatedAdd(Sk16b(b))[0] == exact);
        }
    }
}

DEF_TEST(Sk4px_muldiv255round, r) {
    for (int a = 0; a < (1<<8); a++) {
        for (int b = 0; b < (1<<8); b++) {
            int exact = (a*b+127)/255;

            // Duplicate a and b 16x each.
            auto av = Sk4px::DupAlpha(a),
                 bv = Sk4px::DupAlpha(b);

            // This way should always be exactly correct.
            int correct = (av * bv).div255()[0];
            REPORTER_ASSERT(r, correct == exact);

            // We're a bit more flexible on this method: correct for 0 or 255, otherwise off by <=1.
            int fast = av.approxMulDiv255(bv)[0];
            REPORTER_ASSERT(r, fast-exact >= -1 && fast-exact <= 1);
            if (a == 0 || a == 255 || b == 0 || b == 255) {
                REPORTER_ASSERT(r, fast == exact);
            }
        }
    }
}
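
The reference value in the test above, (a*b+127)/255, is a round-to-nearest division by 255. As a hedged illustration of how such a division can be done without a divide, the sketch below compares it with a well-known shift-based formula. Both function names are hypothetical, and nothing here is meant to describe how Sk4px::div255() is actually implemented.

```cpp
#include <cassert>
#include <cstdint>

// Reference: round-to-nearest a*b/255 using a real division,
// the same expression the test uses for its expected value.
inline int muldiv255_exact(int a, int b) {
    return (a*b + 127) / 255;
}

// Shift-only variant (hypothetical helper for illustration).
// For 8-bit a and b, adding 128 and folding the high byte back in
// gives the same rounded result as the division above.
inline int muldiv255_shift(int a, int b) {
    int x = a*b + 128;
    return (x + (x >> 8)) >> 8;
}
```

Because the inputs are only 8 bits each, the two functions can be compared over all 65536 input pairs, which is the same exhaustive style the tests in this file use.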

DEF_TEST(Sk4px_widening, r) {
    SkPMColor colors[] = {
        SkPreMultiplyColor(0xff00ff00),
        SkPreMultiplyColor(0x40008000),
        SkPreMultiplyColor(0x7f020406),
        SkPreMultiplyColor(0x00000000),
    };
    auto packed = Sk4px::Load4(colors);

    auto wideLo = packed.widenLo(),
         wideHi = packed.widenHi(),
         wideLoHi    = packed.widenLoHi(),
         wideLoHiAlt = wideLo + wideHi;
    REPORTER_ASSERT(r, 0 == memcmp(&wideLoHi, &wideLoHiAlt, sizeof(wideLoHi)));
}

DEF_TEST(SkNx_abs, r) {
    auto fs = Sk4f(0.0f, -0.0f, 2.0f, -4.0f).abs();
    REPORTER_ASSERT(r, fs[0] == 0.0f);
    REPORTER_ASSERT(r, fs[1] == 0.0f);
    REPORTER_ASSERT(r, fs[2] == 2.0f);
    REPORTER_ASSERT(r, fs[3] == 4.0f);
}

DEF_TEST(SkNx_floor, r) {
    auto fs = Sk4f(0.4f, -0.4f, 0.6f, -0.6f).floor();
    REPORTER_ASSERT(r, fs[0] ==  0.0f);
    REPORTER_ASSERT(r, fs[1] == -1.0f);
    REPORTER_ASSERT(r, fs[2] ==  0.0f);
    REPORTER_ASSERT(r, fs[3] == -1.0f);
}

sknx refactoring
- trim unused specializations (Sk4i, Sk2d) and apis (SkNx_dup)
- expand apis a little
* v[0] == v.kth<0>()
* SkNx_shuffle can now convert to different-sized vectors, e.g. Sk2f <-> Sk4f
- remove anonymous namespace
I believe it's safe to remove the anonymous namespace right now.
We're worried about violating the One Definition Rule; the anonymous namespace protected us from that.
In Release builds, this is mostly moot, as everything tends to inline completely.
In Debug builds, violating the ODR is at worst an inconvenience, time spent trying to figure out why the bot is broken.
Now that we're building with SSE2/NEON everywhere, very few bots have even a chance of getting confused by two definitions of the same type or function. Where we do compile variants depending on, e.g., SSSE3, we do so in static inline functions. These are not subject to the ODR.
I plan to follow up with a tedious .kth<...>() -> [...] auto-replace.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1683543002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1683543002
2016-02-09 18:35:27 +00:00
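The size-changing shuffle mentioned in the message above can be illustrated with a variadic index pack: the number of indices passed decides the width of the result. This is a sketch under stated assumptions, using std::array as a stand-in vector type; the shuffle helper here is hypothetical and is not SkNx_shuffle itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a shuffle over fixed-width vectors.
// Ix... are given explicitly; T and N are deduced from the argument.
// The output has one lane per index, so it may be narrower or wider
// than the input.
template <int... Ix, typename T, std::size_t N>
std::array<T, sizeof...(Ix)> shuffle(const std::array<T, N>& v) {
    return {{ v[Ix]... }};
}
```

Called as shuffle<2,1>(v4) this yields a two-lane result from a four-lane input, and shuffle<0,1,1,0>(v2) widens a two-lane input back to four lanes, mirroring the calls in the test that follows.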
DEF_TEST(SkNx_shuffle, r) {
    Sk4f f4(0,10,20,30);

    Sk2f f2 = SkNx_shuffle<2,1>(f4);
    REPORTER_ASSERT(r, f2[0] == 20);
    REPORTER_ASSERT(r, f2[1] == 10);

    f4 = SkNx_shuffle<0,1,1,0>(f2);
    REPORTER_ASSERT(r, f4[0] == 20);
    REPORTER_ASSERT(r, f4[1] == 10);
    REPORTER_ASSERT(r, f4[2] == 10);
    REPORTER_ASSERT(r, f4[3] == 20);
}

DEF_TEST(SkNx_int_float, r) {
    Sk4f f(-2.3f, 1.0f, 0.45f, 0.6f);

    Sk4i i = SkNx_cast<int>(f);
    REPORTER_ASSERT(r, i[0] == -2);
    REPORTER_ASSERT(r, i[1] ==  1);
    REPORTER_ASSERT(r, i[2] ==  0);
    REPORTER_ASSERT(r, i[3] ==  0);

    f = SkNx_cast<float>(i);
    REPORTER_ASSERT(r, f[0] == -2.0f);
    REPORTER_ASSERT(r, f[1] ==  1.0f);
    REPORTER_ASSERT(r, f[2] ==  0.0f);
    REPORTER_ASSERT(r, f[3] ==  0.0f);
}

#include "SkRandom.h"

DEF_TEST(SkNx_u16_float, r) {
    {
        // u16 --> float
        auto h4 = Sk4h(15, 17, 257, 65535);
        auto f4 = SkNx_cast<float>(h4);
        REPORTER_ASSERT(r, f4[0] == 15.0f);
        REPORTER_ASSERT(r, f4[1] == 17.0f);
        REPORTER_ASSERT(r, f4[2] == 257.0f);
        REPORTER_ASSERT(r, f4[3] == 65535.0f);
    }
    {
        // float -> u16
        auto f4 = Sk4f(15, 17, 257, 65535);
        auto h4 = SkNx_cast<uint16_t>(f4);
        REPORTER_ASSERT(r, h4[0] == 15);
        REPORTER_ASSERT(r, h4[1] == 17);
        REPORTER_ASSERT(r, h4[2] == 257);
        REPORTER_ASSERT(r, h4[3] == 65535);
    }

    // starting with any u16 value, we should be able to have a perfect round-trip in/out of floats
    //
    SkRandom rand;
    for (int i = 0; i < 10000; ++i) {
        const uint16_t s16[4] {
            (uint16_t)rand.nextU16(), (uint16_t)rand.nextU16(),
            (uint16_t)rand.nextU16(), (uint16_t)rand.nextU16(),
        };
        auto u4_0 = Sk4h::Load(s16);
        auto f4 = SkNx_cast<float>(u4_0);
        auto u4_1 = SkNx_cast<uint16_t>(f4);
        uint16_t d16[4];
        u4_1.store(d16);
        REPORTER_ASSERT(r, !memcmp(s16, d16, sizeof(s16)));
    }
}