mirror of
https://sourceware.org/git/glibc.git
synced 2024-12-03 18:31:04 +00:00
e1f59bebd8
29 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
H.J. Lu
|
e1f59bebd8 |
x86-64: Replace assembly versions of e_expf with generic e_expf.c
This patch replaces x86-64 assembly versions of e_expf with generic e_expf.c. For workload-spec2017.wrf, on Nehalem, it improves performance by: Before After Improvement reciprocal-throughput 36.039 20.7749 73% latency 58.8096 40.8715 43% On Skylake, it improves Before After Improvement reciprocal-throughput 18.4436 11.1693 65% latency 47.5162 37.5411 26% * sysdeps/x86_64/fpu/e_expf.S: Removed. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: Likewise. * sysdeps/x86_64/fpu/w_expf.c: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Updated for generic e_expf.c. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_expf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Renamed to ... (__redirect_expf): This. (SYMBOL_NAME): Changed to expf. (__ieee754_expf): Renamed to ... (__expf): This. (__GI___expf): This. (__ieee754_expf): Add strong_alias. (__expf_finite): Likewise. (__expf): New. Include <sysdeps/ieee754/flt-32/e_expf.c>. |
||
Joseph Myers
|
ae8372d7e4 |
Add SSE4.1 trunc, truncf (bug 20142).
This patch adds SSE4.1 versions of trunc and truncf, using the roundsd / roundss instructions, similar to the versions of ceil, floor, rint and nearbyint functions we already have. In my testing with the glibc benchtests these are about 30% faster than the C versions for double, 20% faster for float. Tested for x86_64. [BZ #20142] * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1. * sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file. * sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. |
||
H.J. Lu
|
24a2e6588d |
x86-64: Optimize e_expf with FMA [BZ #21912]
FMA optimized e_expf improves performance by more than 50% on Skylake. [BZ #21912] * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_expf-fma. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file. * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise. |
||
H.J. Lu
|
57a72fa350 |
x86-64: Add FMA multiarch functions to libm
This patch adds multiarch functions optimized with -mfma -mavx2 to libm. e_pow-fma.c is compiled with $(config-cflags-nofma) due to PR 19003. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp-fma, e_log-fma, e_pow-fma, s_atan-fma, e_asin-fma, e_atan2-fma, s_sin-fma, s_tan-fma, mplog-fma, mpa-fma, slowexp-fma, slowpow-fma, sincos32-fma, doasin-fma, dosincos-fma, halfulp-fma, mpexp-fma, mpatan2-fma, mpatan-fma, mpsqrt-fma, and mptan-fma. (CFLAGS-doasin-fma.c): New. (CFLAGS-dosincos-fma.c): Likewise. (CFLAGS-e_asin-fma.c): Likewise. (CFLAGS-e_atan2-fma.c): Likewise. (CFLAGS-e_exp-fma.c): Likewise. (CFLAGS-e_log-fma.c): Likewise. (CFLAGS-e_pow-fma.c): Likewise. (CFLAGS-halfulp-fma.c): Likewise. (CFLAGS-mpa-fma.c): Likewise. (CFLAGS-mpatan-fma.c): Likewise. (CFLAGS-mpatan2-fma.c): Likewise. (CFLAGS-mpexp-fma.c): Likewise. (CFLAGS-mplog-fma.c): Likewise. (CFLAGS-mpsqrt-fma.c): Likewise. (CFLAGS-mptan-fma.c): Likewise. (CFLAGS-s_atan-fma.c): Likewise. (CFLAGS-sincos32-fma.c): Likewise. (CFLAGS-slowexp-fma.c): Likewise. (CFLAGS-slowpow-fma.c): Likewise. (CFLAGS-s_sin-fma.c): Likewise. (CFLAGS-s_tan-fma.c): Likewise. * sysdeps/x86_64/fpu/multiarch/doasin-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/dosincos-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_asin-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_atan2-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-avx-fma4.h: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-fma4.h: Likewise. * sysdeps/x86_64/fpu/multiarch/mpa-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpatan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpatan2-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpsqrt-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mptan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/sincos32-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_asin.c: Rewrite. * sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise. |
||
H.J. Lu
|
8537e0f6cf |
x86-64: Implement libmathvec IFUNC selectors in C
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines) Add svml_d_cos2_core-sse2, svml_d_cos4_core-sse, svml_d_cos8_core-avx2, svml_d_exp2_core-sse2, svml_d_exp4_core-sse, svml_d_exp8_core-avx2, svml_d_log2_core-sse2, svml_d_log4_core-sse, svml_d_log8_core-avx2, svml_d_pow2_core-sse2, svml_d_pow4_core-sse, svml_d_pow8_core-avx2 svml_d_sin2_core-sse2, svml_d_sin4_core-sse, svml_d_sin8_core-avx2, svml_d_sincos2_core-sse2, svml_d_sincos4_core-sse, svml_d_sincos8_core-avx2, svml_s_cosf16_core-avx2, svml_s_cosf4_core-sse2, svml_s_cosf8_core-sse, svml_s_expf16_core-avx2, svml_s_expf4_core-sse2, svml_s_expf8_core-sse, svml_s_logf16_core-avx2, svml_s_logf4_core-sse2, svml_s_logf8_core-sse, svml_s_powf16_core-avx2, svml_s_powf4_core-sse2, svml_s_powf8_core-sse, svml_s_sincosf16_core-avx2, svml_s_sincosf4_core-sse2, svml_s_sincosf8_core-sse, svml_s_sinf16_core-avx2, svml_s_sinf4_core-sse2 and svml_s_sinf8_core-sse. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h: New file. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-sse4_1.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN8v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_sinf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_sinf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_sinf): Removed. |
||
H.J. Lu
|
10a87ca476 |
x86-64: Implement libm IFUNC selectors in C
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_ceil-sse4_1, s_ceilf-sse4_1, s_floor-sse4_1, s_floorf-sse4_1, s_nearbyint-sse4_1, s_nearbyintf-sse4_1, s_rint-sse4_1 and s_rintf-sse4_1. * sysdeps/x86_64/fpu/multiarch/ifunc-sse4_1.h: New file. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__ceil): Removed. * sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__ceilf): Removed. * sysdeps/x86_64/fpu/multiarch/s_floor.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__floor): Removed. * sysdeps/x86_64/fpu/multiarch/s_floorf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__floorf): Removed. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__nearbyint): Removed. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__nearbyintf): Removed. * sysdeps/x86_64/fpu/multiarch/s_rint.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__rint): Removed. * sysdeps/x86_64/fpu/multiarch/s_rintf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__rintf): Removed. |
||
Joseph Myers
|
b7848899a5 |
Remove configure tests for FMA4 support.
GCC added support for -mfma4 in version 4.5. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_fma4): Remove configure test. * sysdeps/i386/configure: Regenerated. * sysdeps/x86_64/configure.ac (libc_cv_cc_fma4): Remove configure test. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile [$(have-mfma4) = yes]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT]: Likewise. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * config.h.in (HAVE_FMA4_SUPPORT): Remove #undef. |
||
Joseph Myers
|
1b12cd7f4d |
Remove configure tests for AVX support.
GCC added support for -mavx and -msse2avx in version 4.4. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_avx): Remove configure test. (libc_cv_cc_sse2avx): Likewise. * sysdeps/i386/configure: Regenerated. * sysdeps/i386/i686/multiarch/Makefile [$(subdir)$(config-cflags-avx) = mathyes]: Change conditional to [$(subdir) = math]. * sysdeps/i386/i686/multiarch/s_fma-fma.c [HAVE_AVX_SUPPORT]: Make code unconditional. * sysdeps/i386/i686/multiarch/s_fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/i386/i686/multiarch/s_fmaf-fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/i386/i686/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/configure.ac (libc_cv_cc_avx): Remove configure test. (libc_cv_cc_sse2avx): Likewise. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/Makefile [$(config-cflags-avx) = yes]: Make code unconditional. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_profile) [HAVE_AVX_SUPPORT || HAVE_AVX512_ASM_SUPPORT]: Make code unconditional. (_dl_runtime_profile) [!(HAVE_AVX_SUPPORT || HAVE_AVX512_ASM_SUPPORT)]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/Makefile [$(config-cflags-sse2avx) = yes]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/multiarch/strcmp.S [HAVE_AVX_SUPPORT]: Likewise. * config.h.in (HAVE_AVX_SUPPORT): Remove #undef. (HAVE_SSE2AVX_SUPPORT): Likewise. |
||
Joseph Myers
|
51df260506 |
Fix x86_64 fma4 pow inappropriate contraction (bug 19003).
The x86_64 fma4 version of pow fails to disable contraction of operations other than those explicitly intended to use fma instructions, so resulting in large ulps errors on processors with fma4 instructions, as in bug 18104 (165ulp for the test added for that bug; error originally reported by "blaaa" on #glibc). This patch adds $(config-cflags-nofma) for e_pow-fma4.c, corresponding to the use for e_pow.c in sysdeps/ieee754/dbl-64/Makefile. Tested for x86_64 on a processor with fma4. [BZ #19003] * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma4.c): Add $(config-cflags-nofma). |
||
Andrew Senkevich
|
a6336cc446 |
Vector sincosf for x86_64 and tests.
Here is implementation of vectorized sincosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincosf. * math/test-float-vlen16.h: Added wrapper for sincosf tests. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers. * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. |
||
Andrew Senkevich
|
c9a8c526ac |
Vector sincos for x86_64 and tests.
Here is implementation of vectorized sincos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincos. * bits/libm-simd-decl-stubs.h: Added stubs for sincos. * math/math.h (__MATHDECL_VEC): New macro. * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC. * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper declaration under condition. * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored. * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected TEST_VEC_LOOP change. * math/test-double-vlen4.h: Likewise. * math/test-double-vlen8.h: Likewise. * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. |
||
Andrew Senkevich
|
8aa92022e2 |
Vector powf for x86_64 and tests.
Here is implementation of vectorized powf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for powf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * math/test-float-vlen16.h: Fixed 2 argument macro. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * NEWS: Mention addition of x86_64 vector powf. |
||
Andrew Senkevich
|
c10b9b13f7 |
Vector pow for x86_64 and tests.
Here is implementation of vectorized pow containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for pow. * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for pow. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector pow. |
||
Andrew Senkevich
|
1663be053d |
Vector expf for x86_64 and tests.
Here is implementation of vectorized expf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for expf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector expf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector expf. |
||
Andrew Senkevich
|
9c02f663f6 |
Vector exp for x86_64 and tests.
Here is implementation of vectorized exp containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for exp. * math/bits/mathcalls.h: Added exp declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for exp. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector exp test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector exp. |
||
Andrew Senkevich
|
774488f88a |
Vector logf for x86_64 and tests.
Here is implementation of vectorized logf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for logf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector logf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector logf. |
||
Andrew Senkevich
|
6af25acc7b |
Vector log for x86_64 and tests.
Here is implementation of vectorized log containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for log. * math/bits/mathcalls.h: Added log declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for log. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector log test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector log. |
||
Andrew Senkevich
|
2a8c2c7b33 |
Vector sinf for x86_64 and tests.
Here is implementation of vectorized sinf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for sinf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector sinf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sinf. |
||
Andrew Senkevich
|
4b9c2b707b |
Vector sin for x86_64 and tests.
Here is implementation of vectorized sin containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for sin. * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sin. |
||
Andrew Senkevich
|
04f496d602 |
Vector cosf for x86_64.
Here is implementation of vectorized cosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf. * NEWS: Mention addition of x86_64 vector cosf. |
||
Andrew Senkevich
|
2193311288 |
Start of series of patches with x86_64 vector math functions.
Here is implementation of cos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI which had been discussed in <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. Vector math library build and ABI testing enabled by default for x86_64. * sysdeps/x86_64/fpu/Makefile: New file. * sysdeps/x86_64/fpu/Versions: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos. * math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC. * sysdeps/x86_64/configure.ac: Options for libmvec build. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file. * manual/install.texi (Configuring and compiling): Document --disable-mathvec. * INSTALL: Regenerated. * NEWS: Mention addition of libmvec and x86_64 vector cos. |
||
Andreas Schwab
|
7998fa7899 | Disable use of FMA instructions in branred | ||
Joseph Myers
|
3b1004624e | Fix makefile/configure problems with sse2avx changes. | ||
Ulrich Drepper
|
56f6f6a240 | Use -msse2avx option for x86-64 libm functions | ||
Ulrich Drepper
|
a5b81e1fb7 |
Remove code without too much effects
Some of the AVX-specific code is not giving enough speed-up to justify the extra code. |
||
Ulrich Drepper
|
e0016b11d6 | Add AVX optimized versions for some x86-64 math functions | ||
Ulrich Drepper
|
af968f62f2 | Optimize accurate 64-bit routines for FMA4 on x86-64 | ||
Ulrich Drepper
|
581d30e386 | Add optimized nearbyint{,f} for x86-64 | ||
Ulrich Drepper
|
ad0f5cad15 | Use rounds{s,d} for x86 rint, ceil, floor |