Commit Graph

6 Commits

Author SHA1 Message Date
Matthias Clasen
155a4fac5c Add vectorized half-float conversion
We can't make the -4 versions inline, since
we use ifuncs for them, so make vectorized
versions.

Test included.
2021-09-10 22:17:31 -04:00
Matthias Clasen
fc9c34897a ngl: Make the C half-float implementation accessible
Make this accessible for tests.
2021-07-13 09:03:49 -04:00
Matthias Clasen
930ff499ee Confine -mf16c to a single source file
We can't use this flag for any code that may get run
outside the __builtin_cpu_supports() check, and meson
doesn't allow per-file cflags. So we have to split this
code off into its own static library.
2021-05-05 18:58:23 -04:00
Chun-wei Fan
d5ced21264 gsk/ngl/fp16.c: Implement runtime F16C detection on MSVC
We need to use __cpuid() to check for the presence of F16C instructions on
Visual Studio builds, and call the half_to_float4() or float_to_half4()
implementation accordingly, as the __builtin_cpu...() functions are strictly
for GCC or CLang only.

Also, since __m128i_u is not a standard intrisics type across the board, just
use __m128i on Visual Studio as it is safe to do so there for use for
_mm_loadl_epi64().

Like running on Darwin, we cannot use the alias __attribute__ as __attribute__
is also for GCC and CLang only.
2021-04-12 18:13:42 +08:00
Matthias Clasen
2d7169fd5f Work around compiler shortcomings on macOS
alias attributes don't work on Darwin, so
do without.
2021-04-07 22:38:47 -04:00
Matthias Clasen
885a6b8ebc gsk: Add runtime checks for F16C
Use an IFUNC resolver to determine whether we can use
intrinsics for FP16 conversion. This requires the functions
to be no longer inline.

Sadly, it turns out that __builtin_cpu_supports ("f16c")
doesn't compile on the systems where we want it to prevent
us from getting a SIGILL at runtime.
2021-04-07 22:21:23 -04:00