Include cmath in a few source files which use signbit and are relying on
magic to happen to include it.
Also fix nuttiness in SampleClip. No need to #define single character
identifiers.
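A minimal sketch of the cmath point, for illustration only. The helper name is made up here and is not from the CL; the only thing it shows is that a file calling std::signbit should include <cmath> itself instead of hoping another header drags it in.

```cpp
#include <cmath>  // declares std::signbit; included directly, not via some other header

// Hypothetical helper for illustration: tells -0.0f apart from +0.0f,
// which a plain comparison such as (x < 0) cannot do.
inline bool is_negative_zero(float x) {
    return x == 0.0f && std::signbit(x);
}
```

With the direct include the file keeps compiling even if an unrelated header stops including <cmath>.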
Change-Id: Iae3352d0cab9aaa6c37d6424f064b3d86fa2e011
Reviewed-on: https://skia-review.googlesource.com/4626
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Ben Wagner <bungeman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
I think we convinced ourselves that denorms, while a good chunk of half floats,
cover a rather small fraction of the representable range, which is always
close enough to zero to flush.
This makes both paths of the conversion to or from float considerably simpler.
These functions now work for zero-or-normal half floats (excluding infinite, NaN).
I'm not aware of a term for this class so I've called them "ordinary".
A handful of GMs and SKPs draw differently in --config f16, but all imperceptibly.
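For illustration, a scalar sketch of what conversion restricted to "ordinary" halfs can look like. This is not the code in the CL: the function names are made up, it assumes IEEE 754 binary16/binary32 layouts, and the float -> half direction here truncates instead of rounding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical: half -> float for zero-or-normal halfs. Denormal halfs flush to zero.
inline float ordinary_half_to_float(uint16_t h) {
    uint32_t sign = (h & 0x8000u) << 16;
    uint32_t em   = h & 0x7FFFu;              // exponent and mantissa bits
    assert((em >> 10) != 0x1Fu);              // infinite and NaN are out of scope
    uint32_t bits = sign;
    if (em >= 0x0400u) {                      // normal: widen mantissa, rebias exponent 15 -> 127
        bits |= (em << 13) + (112u << 23);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical: float -> half, flushing anything below the smallest normal half (2^-14) to zero.
inline uint16_t float_to_ordinary_half(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t em   = bits & 0x7FFFFFFFu;
    if (em < (113u << 23)) {                  // |f| < 2^-14: flush to signed zero
        return static_cast<uint16_t>(sign);
    }
    assert(em < (143u << 23));                // |f| < 2^16, so the result stays finite
    return static_cast<uint16_t>(sign | ((em >> 13) - (112u << 10)));
}
```

Neither direction needs a denorm branch beyond the single compare, which is the simplification described above.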
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2256023002
Review-Url: https://codereview.chromium.org/2256023002
This should give us a good baseline to explore using SkRasterPipeline.
A particular colorxform to half float drops from 425us to 282us on my desktop.
Color Xform to Half Float (HP z620)
  Original                                 425us
  Trans16 (not 32)                         355us
  Vector Trans16                           378us
  Trans16 + Keep Halfs in Vector           335us
  Vector Trans16 + Keep Halfs in Vector    282us
  Final                                    282us
Color Xform to Half Float (Nexus 5X)
  Original                                 556us
  Final                                    472us
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159993003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2159993003
It's become clear we need to sometimes deal with values <0 or >1.
I'm not yet convinced we care about NaN or +-inf.
We had some fairly clever tricks and optimizations here for NEON
and SSE. I've thrown them out in favor of a single implementation.
If we find the specializations mattered, we can certainly figure out
how to extend them to this new range/domain.
This happens to add a vectorized float -> half for ARMv7, which was
missing from the _01 version. (The SSE strategy was not portable to
platforms that flush denorm floats to zero.)
I've tested the full float range for FloatToHalf on my desktop and a 5x.
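A rough scalar sketch of a half -> float conversion covering every finite half, including negative values, values above 1, and denormal halfs. The name is hypothetical and this is not the vectorized code in the CL; it assumes IEEE 754 binary16/binary32 layouts.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical: half -> float for any finite half (zero, denormal, or normal).
inline float finite_half_to_float(uint16_t h) {
    uint32_t em = h & 0x7FFFu;                // exponent and mantissa bits
    assert(em < 0x7C00u);                     // infinite and NaN are out of scope
    float mag;
    if (em < 0x0400u) {
        // Zero or denormal half: the value is mantissa * 2^-24.
        // The product is zero or a normal float, so no float denormal is produced.
        mag = static_cast<float>(em) * (1.0f / 16777216.0f);
    } else {
        // Normal half: widen the mantissa and rebias the exponent from 15 to 127.
        uint32_t bits = (em << 13) + (112u << 23);
        std::memcpy(&mag, &bits, sizeof mag);
    }
    return (h & 0x8000u) ? -mag : mag;
}
```

Because there are only 65536 halfs, this direction is cheap to check exhaustively; the float -> half direction is the one that needs the full 32-bit sweep mentioned above.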
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90
Review-Url: https://codereview.chromium.org/2145663003
Reason for revert:
Unit tests fail on Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast
Original issue's description:
> Expand _01 half<->float limitation to _finite. Simplify.
>
> It's become clear we need to sometimes deal with values <0 or >1.
> I'm not yet convinced we care about NaN or +-inf.
>
> We had some fairly clever tricks and optimizations here for NEON
> and SSE. I've thrown them out in favor of a single implementation.
> If we find the specializations mattered, we can certainly figure out
> how to extend them to this new range/domain.
>
> This happens to add a vectorized float -> half for ARMv7, which was
> missing from the _01 version. (The SSE strategy was not portable to
> platforms that flush denorm floats to zero.)
>
> I've tested the full float range for FloatToHalf on my desktop and a 5x.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
> CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90
TBR=msarett@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review-Url: https://codereview.chromium.org/2151023003
It's become clear we need to sometimes deal with values <0 or >1.
I'm not yet convinced we care about NaN or +-inf.
We had some fairly clever tricks and optimizations here for NEON
and SSE. I've thrown them out in favor of a single implementation.
If we find the specializations mattered, we can certainly figure out
how to extend them to this new range/domain.
This happens to add a vectorized float -> half for ARMv7, which was
missing from the _01 version. (The SSE strategy was not portable to
platforms that flush denorm floats to zero.)
I've tested the full float range for FloatToHalf on my desktop and a 5x.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review-Url: https://codereview.chromium.org/2145663003
These are basically inlined, 4-at-a-time versions of our existing functions,
but cut down to avoid any work that's only necessary outside [0,1].
Both f16 and f32 denorms should work fine modulo the usual ARMv7 NEON denorm==zero caveat.
In exchange for a little speed, f32->f16 does not round properly.
Instead it truncates, so it's never off by more than 1 bit.
Support for finite values >1 or <0 is straightforward to add back.
>1 might already work as-is.
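To make the truncation trade-off concrete, a scalar sketch comparing a truncating f32->f16 against a rounding one. Both functions are hypothetical and are not the 4-at-a-time code in this CL. For brevity they only accept normal results in [2^-14, 1], and the rounding variant rounds half up, not ties-to-even.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical: drop the low 13 mantissa bits, then rebias the exponent from 127 to 15.
inline uint16_t float_to_half_truncate(float f) {
    assert(f >= 6.103515625e-05f && f <= 1.0f);   // 2^-14 <= f <= 1
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return static_cast<uint16_t>((bits >> 13) - (112u << 10));
}

// Hypothetical: same as above, but add half of the dropped range before shifting.
inline uint16_t float_to_half_round(float f) {
    assert(f >= 6.103515625e-05f && f <= 1.0f);
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return static_cast<uint16_t>(((bits + 0x1000u) >> 13) - (112u << 10));
}
```

For 0.3f the truncating version yields 0x34CC and the rounding version 0x34CD, a one-bit difference; for values such as 0.1f or 1.0f the two agree.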
Getting close to _u16 performance:
micros bench
261.13 xferu64_bw_1_opaque_u16
1833.51 xferu64_bw_1_alpha_u16
2762.32 ? xferu64_aa_1_opaque_u16
3334.29 xferu64_aa_1_alpha_u16
249.78 xferu64_bw_1_opaque_f16
3383.18 xferu64_bw_1_alpha_f16
4214.72 xferu64_aa_1_opaque_f16
4701.19 xferu64_aa_1_alpha_f16
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685133005
Committed: https://skia.googlesource.com/skia/+/9ea11a4235b3e3521cc8bf914a27c2d0dc062db9
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1685133005
Reason for revert:
Gotta fix Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Original issue's description:
> SkHalfToFloat_01 / SkFloatToHalf_01
>
> These are basically inlined, 4-at-a-time versions of our existing functions,
> but cut down to avoid any work that's only necessary outside [0,1].
>
> Both f16 and f32 denorms should work fine modulo the usual ARMv7 NEON denorm==zero caveat.
>
> In exchange for a little speed, f32->f16 does not round properly.
> Instead it truncates, so it's never off by more than 1 bit.
>
> Support for finite values >1 or <0 is straightforward to add back.
> >1 might already work as-is.
>
> Getting close to _u16 performance:
> micros bench
> 261.13 xferu64_bw_1_opaque_u16
> 1833.51 xferu64_bw_1_alpha_u16
> 2762.32 ? xferu64_aa_1_opaque_u16
> 3334.29 xferu64_aa_1_alpha_u16
> 249.78 xferu64_bw_1_opaque_f16
> 3383.18 xferu64_bw_1_alpha_f16
> 4214.72 xferu64_aa_1_opaque_f16
> 4701.19 xferu64_aa_1_alpha_f16
>
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685133005
>
> Committed: https://skia.googlesource.com/skia/+/9ea11a4235b3e3521cc8bf914a27c2d0dc062db9
TBR=jvanverth@google.com,reed@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1693443003
These are basically inlined, 4-at-a-time versions of our existing functions,
but cut down to avoid any work that's only necessary outside [0,1].
Both f16 and f32 denorms should work fine modulo the usual ARMv7 NEON denorm==zero caveat.
In exchange for a little speed, f32->f16 does not round properly.
Instead it truncates, so it's never off by more than 1 bit.
Support for finite values >1 or <0 is straightforward to add back.
>1 might already work as-is.
Getting close to _u16 performance:
micros bench
261.13 xferu64_bw_1_opaque_u16
1833.51 xferu64_bw_1_alpha_u16
2762.32 ? xferu64_aa_1_opaque_u16
3334.29 xferu64_aa_1_alpha_u16
249.78 xferu64_bw_1_opaque_f16
3383.18 xferu64_bw_1_alpha_f16
4214.72 xferu64_aa_1_opaque_f16
4701.19 xferu64_aa_1_alpha_f16
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685133005
Review URL: https://codereview.chromium.org/1685133005