glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-26 23:10:06 +00:00

Author	SHA1	Message	Date
Andreas K. Hüttel	180d5a045f	Update x86-64 libm-test-ulps x86_64 Intel(R) Core(TM) i5-8265U gcc (Gentoo 10.1.0-r2 p3) 10.1.0 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-25 17:10:53 -04:00
H.J. Lu	107e6a3c22	x86: Support usable check for all CPU features Support usable check for all CPU features with the following changes: 1. Change struct cpu_features to struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; ... }; so that there is a usable bit for each cpuid bit. 2. After the cpuid bits have been initialized, copy the known bits to the usable bits. EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for CPU feature detection. 3. Clear the usable bits which require OS support. 4. If the feature is supported by OS, copy its cpuid bit to its usable bit. 5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE and CPU_FEATURE_USABLE_P to check if a feature is usable. 6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13. 7. Unset MPX feature since it has been deprecated. The results are 1. If the feature is known and doesn't requre OS support, its usable bit is copied from the cpuid bit. 2. Otherwise, its usable bit is copied from the cpuid bit only if the feature is known to supported by OS. 3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the feature can be used. 4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports the feature.	2020-07-13 06:05:16 -07:00
Adhemerval Zanella	b24381e50f	i386: Use builtin sqrtl Checked on i686-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	d19d25dd06	x86_64: Use builtin sqrt{f,l} Checked on x86_64-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	f721171632	Revert "x86_64: Add SSE sfp-exceptions" The __sfp_handle_exceptions is not fully correct regarding raising exceptions, since there is no direct way to raise only FP_EX_OVERFLOW nor FP_EX_UNDERFLOW for SSE mode. Both libgcc and feraiseexcept rely on x87 mode to accomplish it. This reverts commit `460ee50de0`. Checked on x86_64.	2020-04-20 14:56:05 -03:00
Adhemerval Zanella	460ee50de0	x86_64: Add SSE sfp-exceptions The exported x86_64 fenv.h functions operate on both i387 and SSE (since they should work on both float, double, and long double) while the internal libc_fe* set either SSE (float, double, and float128) or i387 (long double). The libgcc __sfp_handle_exceptions (used on float128 implementation), however, will set either SEE or i387 exception depending of the exception to raise. This broke the internal assumption of float128 where only SSE operations will be used. This patch reimplements the libgcc __sfp_handle_exceptions to use only SSE operations and sets libgcc to use it instead of its own implementation. And I think we should fix libgcc in a similar manner, since checking on config/i386/64/sfp-machine.h it already only supports SSE rounding mode and x86_64 ABI also expectes float128 to use SSE registers [1] (although it is not clear on how future implementation might implement it). Checked on x86_64-linux-gnu. [1] https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI	2020-04-17 11:42:29 -03:00
Paul Zimmermann	a9d42c09a3	math: Add inputs that yield larger errors for float type (x86_64) The corner cases included were generated using exhaustive search for all float/binary32 values on x86_64 (comparing to MPFR for correct rounding to nearest). For the j0/j1/y0 functions, only cases with ulp error <= 9 were included. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-03-31 21:48:54 -04:00
Adhemerval Zanella	1c15464ca0	math: Remove inline math tests With mathinline removal there is no need to keep building and testing inline math tests. The gen-libm-tests.py support to generate ULP_I_* is removed and all libm-test-ulps files are updated to longer have the i{float,double,ldouble} entries. The support for no-test-inline is also removed from both gen-auto-libm-tests and the auto-libm-test-out-* were regenerated. Checked on x86_64-linux-gnu and i686-linux-gnu.	2020-03-19 11:45:44 -03:00
Wilco Dijkstra	220622dde5	Add libm_alias_finite for _finite symbols This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:04 -03:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
Stefan Liebler	1c94bf0f0a	Always use wordsize-64 version of s_trunc.c. This patch replaces s_trunc.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:14 +01:00
Stefan Liebler	9f234eafe8	Always use wordsize-64 version of s_ceil.c. This patch replaces s_ceil.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:13 +01:00
Stefan Liebler	95b0c2c431	Always use wordsize-64 version of s_floor.c. This patch replaces s_floor.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	ab48bdd098	Always use wordsize-64 version of s_rint.c. This patch replaces s_rint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	af123aa950	Always use wordsize-64 version of s_nearbyint.c. This patch replaces s_nearbyint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:11 +01:00
Wilco Dijkstra	d0007dc53c	Remove x64 _finite tests and references Remove _finite tests and references from x86_64. Rather than calling __exp_finite, use exp directly (since it's the same entry point). x86_64 builds and passes testsuite. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-10-21 14:29:12 -03:00
Paul Eggert	5a82c74822	Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http\|ftp)(://(.\.)?(gnu\|fsf\|sourceware)\.org($\|[^.]\|\.[^a-z])),https\2,g s,(http\|ftp)(://(.\.)?)sources\.redhat\.com($\|[^.]\|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '.po' \ ! -name 'ChangeLog' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: * error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: * error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S	2019-09-07 02:43:31 -07:00
H.J. Lu	7e681561a3	x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603 ] When compiled with -O3 and AVX, GCC 8 and 9 optimize some loops in sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions, which leads to store forward stall: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579 There is no easy fix in compiler. This patch limits vector width to 128 bits to work around this issue. It improves performance of sin and cos by more than 40% on Skylake compiled with -O3 -march=skylake. Tested with GCC 7/8/9 on x86-64. [BZ #24603] * sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128 works. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New. Set to -mprefer-vector-width=128 if supported.	2019-07-24 14:48:43 -07:00
Joseph Myers	04277e02d7	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2019-01-01 00:11:28 +00:00
H.J. Lu	ba4b8fab20	x86-64: Remove s_sincosf-sse2.S The current s_sincosf.c is faster than s_sincosf-sse2.S. On Broadwell with FMA disabled, bench-sincosf shows: Before After Improvement max 154.032 114.517 34% min 6.25 5.609 11% mean 14.8728 12.8589 15% * sysdeps/x86_64/fpu/s_sincosf.S: Removed. * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.c: New file.	2018-12-26 06:58:31 -08:00
H.J. Lu	9412979a43	Regenerate sysdeps/x86_64/fpu/libm-test-ulps * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.	2018-12-26 06:57:30 -08:00
H.J. Lu	8700a7851b	x86-64: Vectorize sincosf_poly and update s_sincosf-fma.c Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly. On Broadwell, bench-sincosf shows: Before After Improvement max 160.273 114.198 40% min 6.25 5.625 11% mean 13.0325 10.6462 22% Vectorized sincosf_poly shows Before After Improvement max 138.653 114.198 21% min 5.004 5.625 -11% mean 11.5934 10.6462 9% Tested on x86-64 and i686 as well as with build-many-glibcs.py. * sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ... * sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file. * sysdeps/x86/fpu/s_sincosf_data.c: New file. * sysdeps/x86/fpu/sincosf_poly.h: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>.	2018-12-26 06:56:13 -08:00
Szabolcs Nagy	a502c5294b	Remove the error handling wrapper from pow Introduce new pow symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_pow.c and enabled for targets with their own pow implementation or ifunc dispatch on __ieee754_pow by including math/w_pow.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously powl was an alias of pow, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __pow_finite symbol is now an alias of pow. Both __pow_finite and pow set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add pow. * math/w_pow_compat.c (__pow_compat): Change to versioned compat symbol. * math/w_pow.c: New file. * sysdeps/i386/fpu/w_pow.c: New file. * sysdeps/ia64/fpu/e_pow.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow and add necessary aliases. * sysdeps/ieee754/dbl-64/w_pow.c: New file. * sysdeps/m68k/m680x0/fpu/w_pow.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to __pow. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.	2018-11-21 09:58:36 +00:00
Szabolcs Nagy	f29b7c492d	Remove the error handling wrapper from log Introduce new log symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log.c and enabled for targets with their own log implementation by including math/w_log.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously logl was an alias of log, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log_finite symbol is now an alias of log. Both __log_finite and log set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log. * math/w_log_compat.c (__log_compat): Change to versioned compat symbol. * math/w_log.c: New file. * sysdeps/i386/fpu/w_log.c: New file. * sysdeps/ia64/fpu/e_log.S: Update. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log.c: New file. * sysdeps/m68k/m680x0/fpu/w_log.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to __log. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/w_log.c: New file.	2018-11-21 09:56:27 +00:00
Szabolcs Nagy	c20a10561a	Remove the error handling wrapper from exp and exp2 Introduce new exp and exp2 symbol version that don't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The double precision wrappers are disabled for sysdeps/ieee754/dbl-64 by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and math/w_exp2.c files use the wrapper template and can be included by targets that have their own exp and exp2 implementations or use ifunc on the glibc internal __ieee754_exp symbol. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously expl and exp2l were aliases of exp and exp2, now they point to the compatibility symbols with the wrapper, because they still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The _finite symbols are now aliases of the standard symbols (they have no performance advantage anymore). Both the standard symbols and _finite symbols set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header (the new macro name is __exp instead of __ieee754_exp which breaks some math.h macros). Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add exp and exp2. * math/w_exp2_compat.c (__exp2_compat): Change to versioned compat symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly. * math/w_exp_compat.c (__exp_compat): Likewise. * math/w_exp.c: New file. * math/w_exp2.c: New file. * sysdeps/i386/fpu/w_exp.c: New file. * sysdeps/i386/fpu/w_exp2.c: New file. * sysdeps/ia64/fpu/e_exp.S: Add versioned symbols. * sysdeps/ia64/fpu/e_exp2.S: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp and add necessary aliases. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_exp.c: New file. * sysdeps/ieee754/dbl-64/w_exp2.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.	2018-11-21 09:55:02 +00:00
Joseph Myers	7abf97bed9	Use trunc functions not __trunc functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise.	2018-09-20 21:11:10 +00:00
Szabolcs Nagy	424c4f60ed	Add new pow implementation The algorithm is exp(y * log(x)), where log(x) is computed with about 1.32^-68 relative error (1.52^-68 without fma), returning the result in two doubles, and the exp part uses the same algorithm (and lookup tables) as exp, but takes the input as two doubles and a sign (to handle negative bases with odd integer exponent). The __exp1 internal symbol is no longer necessary. There is separate code path when fma is not available but the worst case error is about 0.54 ULP in both cases. The lookup table and consts for log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1] pow latency: 1.84x in [0.01 11.1]x[0.01 11.1] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets. * NEWS: Mention pow improvements. * math/Makefile (type-double-routines): Add e_pow_log_data. * sysdeps/generic/math_private.h (__exp1): Remove. * sysdeps/i386/fpu/e_pow_log_data.c: New file. * sysdeps/ia64/fpu/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma contraction. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove. (exp_inline): Remove. (__ieee754_exp): Only single double input is handled. * sysdeps/ieee754/dbl-64/e_pow.c: Rewrite. * sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define. (__pow_log_data): Define. * sysdeps/ieee754/dbl-64/upow.h: Remove. * sysdeps/ieee754/dbl-64/upow.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma contraction. (CFLAGS-e_pow-fma4.c): Likewise.	2018-09-19 10:04:51 +01:00
Joseph Myers	71223ef909	Use ceil functions not __ceil functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-17 20:42:06 +00:00
Joseph Myers	f29b6f17e4	Use rint functions not __rint functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __rint functions to call the corresponding rint names instead, with asm redirection to __rint when the calls are not inlined. The x86_64 math_private.h is removed as no longer useful after this patch. This patch is relative to a tree with my floor patch <https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied, and much the same considerations arise regarding possibly replacing an IFUNC call with a direct inline expansion. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_rintf.c: Likewise. * sysdeps/alpha/fpu/s_rint.c: Likewise. * sysdeps/alpha/fpu/s_rintf.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/ieee754/dbl-64/s_rint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise. * sysdeps/ieee754/float128/s_rintf128.c: Likewise. * sysdeps/ieee754/flt-32/s_rintf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/powerpc/fpu/s_rint.c: Likewise. * sysdeps/powerpc/fpu/s_rintf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Likewise. * sysdeps/riscv/rvf/s_rintf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/math_private.h: Remove file. * math/e_scalb.c (invalid_fn): Use rint functions instead of __rint variants. * math/e_scalbf.c (invalid_fn): Likewise. * math/e_scalbl.c (invalid_fn): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.	2018-09-14 13:10:39 +00:00
Joseph Myers	e44acb2063	Use floor functions not __floor functions in glibc libm. Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. (After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-14 13:09:01 +00:00
Joseph Myers	4e7fbdd7c2	Remove x86_64 math_private.h asms. The x86_64 math_private.h has asm versions of the macros to reinterpret between floating-point and integer types. This is the sort of thing we now strongly discourage; the expectation in such cases, where the generic C code gives the compiler all the information needed about the required semantics, is that you should get the compiler to do the right thing for the generic C code rather than writing an asm version. Trivial tests showed GCC generates the expected single instructions for reinterpretation from floating point to integer. In the other direction, it goes via memory when the asms don't; I asked about this in GCC bug 87236 and was advised this was deliberate for generic tuning because it was faster that way on some AMD processors (but -mtune=intel, and -Os with the latest GCC, avoid going via memory). The asms don't and can't know about those tuning details, so that's evidence that they are actually making the code worse. This patch removes the asms accordingly. Tested for x86_64. * sysdeps/x86_64/fpu/math_private.h (MOVD): Remove macro. (MOVQ): Likewise. (EXTRACT_WORDS64): Likewise. (INSERT_WORDS64): Likewise. (GET_FLOAT_WORD): Likewise. (SET_FLOAT_WORD): Likewise.	2018-09-11 14:51:40 +00:00
Szabolcs Nagy	e70c176825	Add new exp and exp2 implementations Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants, see e_exp_data.c, they can be selected by modifying math_config.h allowing different tradeoffs. The default selection should be acceptable as generic libm code. Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on aarch64 the rodata size is 2160 bytes, shared between exp and exp2. On aarch64 .text + .rodata size decreased by 24912 bytes. The non-nearest rounding error is less than 1 ULP even on targets without efficient round implementation (although the error rate is higher in that case). Targets with single instruction, rounding mode independent, to nearest integer rounding and conversion can use them by setting TOINT_INTRINSICS and adding the necessary code to their math_private.h. The __exp1 code uses the same algorithm, so the error bound of pow increased a bit. New double precision error handling code was added following the style of the single precision error handling code. Improvements on Cortex-A72 compared to current glibc master: exp thruput: 1.61x in [-9.9 9.9] exp latency: 1.53x in [-9.9 9.9] exp thruput: 1.13x in [0.5 1] exp latency: 1.30x in [0.5 1] exp2 thruput: 2.03x in [-9.9 9.9] exp2 latency: 1.64x in [-9.9 9.9] For small (< 1) inputs the current exp code uses a separate algorithm so the speed up there is less. Was tested on aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets, only non-nearest rounding ulp errors increase and they are within acceptable bounds (ulp updates are in separate patches). * NEWS: Mention exp and exp2 improvements. * math/Makefile (libm-support): Remove t_exp. (type-double-routines): Add math_err and e_exp_data. * sysdeps/aarch64/libm-test-ulps: Update. * sysdeps/arm/libm-test-ulps: Update. * sysdeps/i386/fpu/e_exp_data.c: New file. * sysdeps/i386/fpu/math_err.c: New file. * sysdeps/i386/fpu/t_exp.c: Remove. * sysdeps/ia64/fpu/e_exp_data.c: New file. * sysdeps/ia64/fpu/math_err.c: New file. * sysdeps/ia64/fpu/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/e_exp.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp_data.c: New file. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound. * sysdeps/ieee754/dbl-64/eexp.tbl: Remove. * sysdeps/ieee754/dbl-64/math_config.h: New file. * sysdeps/ieee754/dbl-64/math_err.c: New file. * sysdeps/ieee754/dbl-64/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/t_exp2.h: Remove. * sysdeps/ieee754/dbl-64/uexp.h: Remove. * sysdeps/ieee754/dbl-64/uexp.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_err.c: New file. * sysdeps/m68k/m680x0/fpu/t_exp.c: Remove. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2018-09-05 16:22:00 +01:00
Joseph Myers	ff6b24501f	Split fenv_private.h out of math_private.h more consistently. On some architectures, the parts of math_private.h relating to the floating-point environment are in a separate file fenv_private.h included from math_private.h. As this is purely an architecture-specific convention used by several architectures, however, all such architectures still need their own math_private.h, even if it has nothing to do beyond #include <fenv_private.h> and peculiarity of including the i386 file directly instead of having a shared file in sysdeps/x86. This patch makes the fenv_private.h name an architecture-independent convention in glibc. The include of fenv_private.h from math_private.h becomes architecture-independent (until callers are updated to include fenv_private.h directly so the include from math_private.h is no longer needed). Some architecture math_private.h headers are removed if no longer needed, or renamed to fenv_private.h if all they define belongs in that header; architecture fenv_private.h headers now do require #include_next <fenv_private.h>. The i386 fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is actually shared with x86_64. The generic math_private.h gets a new include of <stdbool.h>, as needed for bool in some prototypes in that header (previously that was indirectly included via include/fenv.h, which now only gets included too late in math_private.h, after those prototypes). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/aarch64/fpu/fenv_private.h: New file. Based on .... * sysdeps/aarch64/fpu/math_private.h: ... this file. All contents moved to fenv_private.h except for ... (TOINT_INTRINSICS): Kept in math_private.h. (roundtoint): Likewise. (converttoint): Likewise. * sysdeps/arm/fenv_private.h: Change multiple-include guard to [ARM_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/arm/math_private.h: Remove. * sysdeps/generic/fenv_private.h: New file. Contents moved from .... * sysdeps/generic/math_private.h: ... this file. Include <stdbool.h>. Do not include <fenv.h> or <get-rounding-mode.h>. Include <fenv_private.h>. Remove functions and macros moved to fenv_private.h. * sysdeps/i386/fpu/math_private.h: Remove. * sysdeps/mips/math_private.h: Move to .... * sysdeps/mips/fpu/fenv_private.h: ... here. Change multiple-include guard to [MIPS_FENV_PRIVATE_H]. Remove [__mips_hard_float] conditional. Include next <fenv_private.h>. * sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include guard to [POWERPC_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/powerpc/fpu/math_private.h: Do not include <fenv_private.h>. * sysdeps/riscv/rvf/math_private.h: Move to .... * sysdeps/riscv/rvf/fenv_private.h: ... here. Change multiple-include guard to [RISCV_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard to [SPARC_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/sparc/fpu/math_private.h: Remove. * sysdeps/i386/fpu/fenv_private.h: Move to .... * sysdeps/x86/fpu/fenv_private.h: ... here. Change multiple-include guard to [X86_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/x86_64/fpu/math_private.h: Do not include <sysdeps/i386/fpu/fenv_private.h>.	2018-08-28 20:48:49 +00:00
Wilco Dijkstra	49acec179c	Fix spaces in x86_64 ULP file Fix a few missing spaces, it's now identical to the regenerated version. Passes GLIBC tests on x64. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerate to fix spaces.	2018-08-15 12:56:22 +01:00
Wilco Dijkstra	599cf39766	Improve performance of sinf and cosf The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x64 are updated. sinf/cosf througput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.2x * \|x\| < M_PI_4 : 1.8x * \|x\| < 2 * M_PI: 1.7x * \|x\| < 120.0 : 2.3x * \|x\| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.	2018-08-14 10:45:59 +01:00
Joseph Myers	2ce7ba7d15	Move SNAN_TESTS_* out of math-tests.h. Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the SNAN_TESTS_* macros for individual types out to their own sysdeps header (while the type-generic SNAN_TESTS wrapper for those macros remains in math-tests.h). Tested for x86_64 and x86, and with build-many-glibcs.py. * sysdeps/generic/math-tests-snan.h: New file. * sysdeps/generic/math-tests.h: Include <math-tests-snan.h>. (SNAN_TESTS_float): Do not define here. (SNAN_TESTS_double): Likewise. (SNAN_TESTS_long_double): Likewise. (SNAN_TESTS_float128): Likewise. * sysdeps/i386/fpu/math-tests-snan.h: New file. * sysdeps/i386/fpu/math-tests.h: Remove file. * sysdeps/ia64/math-tests-snan.h: New file. * sysdeps/ia64/math-tests.h: Remove file. * sysdeps/x86/math-tests.h: Likewise. * sysdeps/x86_64/fpu/math-tests-snan.h: New file.	2018-08-10 19:22:01 +00:00
Wilco Dijkstra	ea5c662c62	Improve performance of sincosf This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.6x * \|x\| < M_PI_4 : 1.7x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.8x * \|x\| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.	2018-08-10 17:34:39 +01:00
H.J. Lu	430388d5dc	x86: Don't include <init-arch.h> in assembly codes There is no need to include <init-arch.h> in assembly codes since all x86 IFUNC selector functions are written in C. Tested on i686 and x86-64. There is no code change in libc.so, ld.so and libmvec.so. * sysdeps/i386/i686/multiarch/bzero-ia32.S: Don't include <init-arch.h>. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core-avx2.S: Likewise. * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Likewise.	2018-08-03 08:05:00 -07:00
Paul Pluzhnikov	50d004c91c	Update ulps with "make regen-ulps" on AMD Ryzen 7 1800X. 2018-05-30 Paul Pluzhnikov <ppluzhnikov@google.com> * sysdeps/x86_64/fpu/libm-test-ulps (log_vlen8_avx2): Update for AMD Ryzen 7 1800X.	2018-05-30 09:17:47 -07:00
Wilco Dijkstra	19a8b9a300	[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs This series of patches removes the slow patchs from sin, cos and sincos. Besides greatly simplifying the implementation, the new version is also much faster for inputs up to PI (41% faster) and for large inputs needing range reduction (27% faster). ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most of the range with mpsin and mpcos. The number of incorrectly rounded results (ie. ULP >0.5) is at most ~2750 per million inputs between 0.125 and 0.5, the average is ~850 per million between 0 and PI. Tested on AArch64 and x86_64 with no regressions. The first patch removes the slow paths for the cases where the input is small and doesn't require range reduction. Update ULP tables for sin, cos and sincos on AArch64 and x86_64. * sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small inputs. (__cos): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.	2018-04-03 16:52:16 +01:00
Wilco Dijkstra	700593fdd7	Remove all target specific __ieee754_sqrt(f/l) inlines Remove the now unused target specific__ieee754_sqrt(f/l) inlines. Also remove inlines of sqrt which are for really old GCC versions. Removing these is desirable, under the general principle of leaving such inlining to the compiler rather than trying to do it in installed headers, especially when only very old compilers are affected. Note that removing inlines for __ieee754_sqrt disables inlining in the sqrt wrapper functions. Given the sqrt function will typically only be called for negative arguments, it doesn't matter whether the inlining happens or not. * sysdeps/aarch64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/generic/math-type-macros.h (M_SQRT): Use sqrt. * sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove. * sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/s390/fpu/bits/mathinline.h: Remove file. * sysdeps/sparc/fpu/bits/mathinline.h (sqrt) Remove. (sqrtf): Remove. (sqrtl): Remove. (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove. * sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove. * sysdeps/x86/fpu/math_private.h (__ieee754_sqrt): Remove. * sysdeps/x86_64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove.	2018-03-15 19:21:36 +00:00
Wilco Dijkstra	610ee1fc93	Remove mplog and mpexp Remove the now unused mplog and mpexp files. * math/Makefile: Remove mpexp.c and mplog.c * sysdeps/i386/fpu/mpexp.c: Delete file. * sysdeps/i386/fpu/mplog.c: Likewise. * sysdeps/ia64/fpu/mpexp.c: Likewise. * sysdeps/ia64/fpu/mplog.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog. * sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function. * sysdeps/ieee754/dbl-64/mpexp.c: Delete file. * sysdeps/ieee754/dbl-64/mplog.c: Likewise. * sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/mplog.c: Likewise. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog. sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.	2018-02-15 12:41:05 +00:00
Szabolcs Nagy	de800d8305	Remove slow paths from exp Remove the __slowexp code, so exp is no longer correctly rounded. The result is computed to about 70 bits precision so the worst case ulp error is about 0.500007 in nearest rounding mode. * manual/probes.texi: Remove slowexp probes. * math/Makefile: Remove slowexp. * sysdeps/generic/math_private.h (__slowexp): Remove. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and document error bounds. * sysdeps/i386/fpu/slowexp.c: Remove. * sysdeps/ia64/fpu/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove. * sysdeps/m68k/m680x0/fpu/slowexp.c: Remove. * sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.	2018-02-12 11:33:33 +00:00
Wilco Dijkstra	c3d466cba1	Remove slow paths from pow Remove the slow paths from pow. Like several other double precision math functions, pow is exactly rounded. This is not required from math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1). The worst case error is ~0.506ULP. A simple test over a few hundred million values shows pow is 10% faster on average. This fixes BZ #13932. [BZ #13932] * sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove. * benchtests/pow-inputs: Update comment for slow path cases. * manual/probes.texi (slowpow_p10): Delete removed probe. (slowpow_p10): Likewise. * math/Makefile: Remove halfulp.c and slowpow.c. * sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1. * sysdeps/generic/math_private.h (__exp1): Remove error argument. (__halfulp): Remove. (__slowpow): Remove. * sysdeps/i386/fpu/halfulp.c: Delete file. * sysdeps/i386/fpu/slowpow.c: Likewise. * sysdeps/ia64/fpu/halfulp.c: Likewise. * sysdeps/ia64/fpu/slowpow.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument, improve comments and add error analysis. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis. (power1): Remove function: (log1): Remove error argument, add error analysis. (my_log2): Remove function. * sysdeps/ieee754/dbl-64/halfulp.c: Delete file. * sysdeps/ieee754/dbl-64/slowpow.c: Likewise. * sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise. * sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c. * sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c, slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file. * sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.	2018-02-12 10:47:09 +00:00
H.J. Lu	c70e4e9c9e	x86-64: Add sincosf with vector FMA Since the x86-64 assembly version of sincosf is higly optimized with vector instructions, there isn't much room for improvement. However s_sincosf.c written in C with vector math and intrinsics can be optimized by GCC with FMA. On Skylake, bench-sincosf reports performance improvement: Assembly FMA improvement max 104.042 101.008 3% min 9.426 8.586 10% mean 20.6209 18.2238 13% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_sincosf-sse2 and s_sincosf-fma. (CFLAGS-s_sincosf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise. * sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if __sincosf is defined.	2018-01-08 08:04:40 -08:00
Joseph Myers	688903eb3e	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2018-01-01 00:32:25 +00:00
Joseph Myers	f1e005022e	Revert exp reimplementation (causes test failures). Revert: 2017-12-19 Joseph Myers <joseph@codesourcery.com> * sysdeps/x86_64/fpu/libm-test-ulps: Update. 2017-12-19 Patrick McGehearty <patrick.mcgehearty@oracle.com> * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise.	2017-12-19 18:11:37 +00:00
Joseph Myers	6f58c10ded	Update x86_64 libm-test-ulps. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2017-12-19 17:38:41 +00:00
Patrick McGehearty	6fd0a3c6a8	Improve __ieee754_exp() performance by greater than 5x on sparc/x86. These changes will be active for all platforms that don't provide their own exp() routines. They will also be active for ieee754 versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and erf. Typical performance gains is typically around 5x when measured on Sparc s7 for common values between exp(1) and exp(40). Using the glibc perf tests on sparc, sparc (nsec) x86 (nsec) old new old new max 17629 395 5173 144 min 399 54 15 13 mean 5317 200 1349 23 The extreme max times for the old (ieee754) exp are due to the multiprecision computation in the old algorithm when the true value is very near 0.5 ulp away from an value representable in double precision. The new algorithm does not take special measures for those cases. The current glibc exp perf tests overrepresent those values. Informal testing suggests approximately one in 200 cases might invoke the high cost computation. The performance advantage of the new algorithm for other values is still large but not as large as indicated by the chart above. Glibc correctness tests for exp() and expf() were run. Within the test suite 3 input values were found to cause 1 bit differences (ulp) when "FE_TONEAREST" rounding mode is set. No differences in exp() were seen for the tested values for the other rounding modes. Typical example: exp(-0x1.760cd2p+0) (-1.46113312244415283203125) new code: 2.31973271630014299393707e-01 0x1.db14cd799387ap-3 old code: 2.31973271630014271638132e-01 0x1.db14cd7993879p-3 exp = 2.31973271630014285508337 (high precision) Old delta: off by 0.49 ulp New delta: off by 0.51 ulp In addition, because ieee754_exp() is used by other routines, cexp() showed test results with very small imaginary input values where the imaginary portion of the result was off by 3 ulp when in upward rounding mode, but not in the other rounding modes. For x86, tgamma showed a few values where the ulp increased to 6 (max ulp for tgamma is 5). Sparc tgamma did not show these failures. I presume the tgamma differences are due to compiler optimization differences within the gamma function.The gamma function is known to be difficult to compute accurately. * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise.	2017-12-19 17:27:31 +00:00
H.J. Lu	4ca945e9c5	x86-64: Remove sysdeps/x86_64/fpu/s_cosf.S On Ivy Bridge, bench-cosf reports performance improvement: s_cosf.S s_cosf.c Improvement max 114.136 82.401 39% min 13.119 11.301 16% mean 22.1882 21.8938 1% * sysdeps/x86_64/fpu/s_cosf.S: Removed.	2017-12-14 04:39:53 -08:00
H.J. Lu	ac817e083b	x86-64: Add cosf with FMA On Skylake, bench-cosf reports performance improvement: Before After Improvement max 135.362 94.552 43% min 8.532 7.688 11% mean 17.1446 11.8128 45% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_cosf-sse2 and s_cosf-fma. (CFLAGS-s_cosf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/s_cosf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/s_cosf-sse2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_cosf.c: Likewise. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2017-12-12 15:32:58 -08:00
H.J. Lu	9d0ffa60ad	x86-64: Add sinf with FMA On Skylake, bench-sinf reports performance improvement: Before After Improvement max 153.996 100.094 54% min 8.546 6.852 25% mean 18.1223 11.802 54% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_sinf-sse2 and s_sinf-fma. (CFLAGS-s_sinf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/s_sinf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/s_sinf-sse2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sinf.c: Likewise.	2017-12-07 10:11:16 -08:00
H.J. Lu	9574c7b68d	x86-64: Remove sysdeps/x86_64/fpu/s_sinf.S On Ivy Bridge, bench-sinf reports performance improvement: s_sinf.S s_sinf.c Improvement max 91.521 86.148 6% min 14.061 11.265 25% mean 23.3758 23.3344 0.2% * sysdeps/x86_64/fpu/s_sinf.S: Removed.	2017-12-07 09:44:17 -08:00
Joseph Myers	34bb10aabf	Use libm_alias_float for x86_64. Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes x86_64 libm function implementations use libm_alias_float to define function aliases, or libm_alias_float_other where the main name is defined with versioned_symbol. Tested with the glibc testsuite for x86_64, and tested with build-many-glibcs.py for all its x86_64 configurations that installed stripped shared libraries are unchanged by the patch. * sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_expf.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_log2f.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_logf.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_powf.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Include <libm-alias-float.h>. (fmaf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Include <libm-alias-float.h>. (nearbyintf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_copysignf.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_cosf.S: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_fabsf.c: Include <libm-alias-float.h>. (fabsf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_fmaxf.S: Include <libm-alias-float.h>. (fmaxf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_fminf.S: Include <libm-alias-float.h>. (fminf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_llrintf.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. [!__ILP32__] (lrintf): Likewise. * sysdeps/x86_64/fpu/s_sincosf.S: Include <libm-alias-float.h>. (sincosf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_sinf.S: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float. * sysdeps/x86_64/x32/fpu/s_lrintf.S: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float.	2017-11-29 21:25:41 +00:00
Joseph Myers	011fba7e48	Use libm_alias_double for x86_64. Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes x86_64 libm function implementations use libm_alias_double to define function aliases. Tested with the glibc testsuite for x86_64, and tested with build-many-glibcs.py for all its x86_64 configurations that installed stripped shared libraries are unchanged by the patch. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Include <libm-alias-double.h>. (atan): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Include <libm-alias-double.h>. (ceil): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Include <libm-alias-double.h>. (floor): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_fma.c: Include <libm-alias-double.h>. (fma): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Include <libm-alias-double.h>. (nearbyint): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Include <libm-alias-double.h>. (rint): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Include <libm-alias-double.h>. (sin): Define using libm_alias_double. (cos): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Include <libm-alias-double.h>. (tan): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Include <libm-alias-double.h>. (trunc): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_copysign.S: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_fabs.c: Include <libm-alias-double.h>. (fabs): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_fmax.S: Include <libm-alias-double.h>. (fmax): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_fmin.S: Include <libm-alias-double.h>. (fmin): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. [!__ILP32__] (lrint): Likewise. * sysdeps/x86_64/x32/fpu/s_lrint.S: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double.	2017-11-29 19:01:21 +00:00
Joseph Myers	f58e5f4809	Use libm_alias_ldouble in sysdeps/x86_64/fpu. This patch continues the preparation for additional _FloatN / _FloatNx function aliases by using libm_alias_ldouble for sysdeps/x86_64/fpu long double functions, so that they can have _Float64x aliases added in future. Tested for x86_64, including build-many-glibcs.py tests that installed stripped shared libraries are unchanged by the patch. * sysdeps/x86_64/fpu/e_expl.S: Include <libm-alias-ldouble.h>. [USE_AS_EXPM1L] (expm1l): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_ceill.S: Include <libm-alias-ldouble.h>. (ceill): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_copysignl.S: Include <libm-alias-ldouble.h>. (copysignl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_fabsl.S: Include <libm-alias-ldouble.h>. (fabsl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_floorl.S: Include <libm-alias-ldouble.h>. (floorl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_fmaxl.S: Include <libm-alias-ldouble.h>. (fmaxl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_fminl.S: Include <libm-alias-ldouble.h>. (fminl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_llrintl.S: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble. (lrintl): Likewise. * sysdeps/x86_64/fpu/s_nearbyintl.S: Include <libm-alias-ldouble.h>. (nearbyintl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_truncl.S: Include <libm-alias-ldouble.h>. (truncl): Define using libm_alias_ldouble. * sysdeps/x86_64/x32/fpu/s_lrintl.S: Include <libm-alias-ldouble.h>. (lrintl): Define using libm_alias_ldouble.	2017-11-17 23:39:11 +00:00
H.J. Lu	a122dbfb2e	Replace "if if " with "if " in comments * include/alloc_buffer.h: Replace "if if " with "if " in comments. * sysdeps/mips/memcpy.S: Likkewise. * sysdeps/mips/memset.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise.	2017-10-25 08:05:51 -07:00
H.J. Lu	80bb593563	x86-64: Add powf with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 35.4713 27.3842 29% latency 82.4537 66.3175 24% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_powf-fma. (CFLAGS-e_powf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_powf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_powf.c: Likewise.	2017-10-22 08:08:00 -07:00
H.J. Lu	5c7adbd8ed	x86-64: Add log2f with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 16.5937 14.0789 17% latency 41.7755 35.3586 18% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_log2f-fma. (CFLAGS-e_log2f-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_log2f-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_log2f.c: Likewise.	2017-10-22 08:06:58 -07:00
H.J. Lu	0ccc7153cc	x86-64: Add logf with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 16.1534 13.8874 16% latency 41.9642 34.3072 22% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_logf-fma. (CFLAGS-e_logf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_logf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_logf.c: Likewise.	2017-10-22 08:05:15 -07:00
H.J. Lu	5d15c96975	x86-64: Add exp2f with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 13.0291 11.2225 16% latency 44.5154 37.5766 18% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp2f-fma. (CFLAGS-e_exp2f-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_exp2f-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Likewise.	2017-10-22 07:57:50 -07:00
H.J. Lu	e1f59bebd8	x86-64: Replace assembly versions of e_expf with generic e_expf.c This patch replaces x86-64 assembly versions of e_expf with generic e_expf.c. For workload-spec2017.wrf, on Nehalem, it improves performance by: Before After Improvement reciprocal-throughput 36.039 20.7749 73% latency 58.8096 40.8715 43% On Skylake, it improves Before After Improvement reciprocal-throughput 18.4436 11.1693 65% latency 47.5162 37.5411 26% * sysdeps/x86_64/fpu/e_expf.S: Removed. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: Likewise. * sysdeps/x86_64/fpu/w_expf.c: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Updated for generic e_expf.c. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_expf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Renamed to ... (__redirect_expf): This. (SYMBOL_NAME): Changed to expf. (__ieee754_expf): Renamed to ... (__expf): This. (__GI___expf): This. (__ieee754_expf): Add strong_alias. (__expf_finite): Likewise. (__expf): New. Include <sysdeps/ieee754/flt-32/e_expf.c>.	2017-10-22 07:49:55 -07:00
Joseph Myers	527cd19c3d	Make dbl-64 atan and tan into weak aliases. This patch converts the dbl-64 implementations of atan and tan into weak aliases of __atan and __tan, in preparation for making them use libm_alias_double. Consequent changes are made to the x86_64 multiarch versions wrapping round them (with the dbl-64 functions, like other such functions, being made not to define their aliases at all if __atan or __tan are defined as macros by an including file). Tested for x86_64, and with build-many-glibcs.py. * sysdeps/ieee754/dbl-64/s_atan.c (atan): Rename to __atan and define as weak alias of __atan. Do not define any aliases if [__atan]. [NO_LONG_DOUBLE] (__atanl): Define as strong alias of __atan. [NO_LONG_DOUBLE] (atanl): Define as weak alias of __atanl. * sysdeps/ieee754/dbl-64/s_tan.c (tan): Rename to __tan and define as weak alias of __tan. Do not define any aliases if [__tan]. [NO_LONG_DOUBLE] (__tanl): Define as strong alias of __tan. [NO_LONG_DOUBLE] (tanl): Define as weak alias of __tanl. * sysdeps/x86_64/fpu/multiarch/s_atan-avx.c (atan): Rename to __atan. * sysdeps/x86_64/fpu/multiarch/s_atan-fma.c (atan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan-fma4.c (atan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c (atan): Rename to __atan and define as weak alias of __atan. * sysdeps/x86_64/fpu/multiarch/s_tan-avx.c (tan): Rename to __atan. * sysdeps/x86_64/fpu/multiarch/s_tan-fma.c (tan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan-fma4.c (tan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c (tan): Rename to __tan and define as weak alias of __tan.	2017-10-02 20:20:52 +00:00
Szabolcs Nagy	f7a0b063e7	Do not wrap expf and exp2f The new generic expf and exp2f code don't need wrappers any more, they set errno inline, so only use the wrappers on targets that need it. (If the wrapper is needed, then the top level wrapper code is included, otherwise empty w_expf.c is used to suppress the wrapper.) A powerpc64 expf implementation includes the expf c code directly which needed some changes. sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise * sysdeps/ieee754/flt-32/w_exp2f.c: New file. * sysdeps/ieee754/flt-32/w_expf.c: New file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for the new expf code. * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file. * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_expf.c: New file. * sysdeps/i386/fpu/w_exp2f.c: New file. * sysdeps/i386/fpu/w_expf.c: New file. * sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file. * sysdeps/x86_64/fpu/w_expf.c: New file.	2017-10-02 14:38:54 +01:00
Joseph Myers	2f92505d20	Update x86_64 libm-test-ulps. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2017-09-29 18:03:48 +00:00
Joseph Myers	ae8372d7e4	Add SSE4.1 trunc, truncf (bug 20142). This patch adds SSE4.1 versions of trunc and truncf, using the roundsd / roundss instructions, similar to the versions of ceil, floor, rint and nearbyint functions we already have. In my testing with the glibc benchtests these are about 30% faster than the C versions for double, 20% faster for float. Tested for x86_64. [BZ #20142] * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1. * sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file. * sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.	2017-09-20 16:54:05 +00:00
H.J. Lu	ef8adeb041	x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967 ] AVX512 functions in mathvec are used on machines with AVX512. An AVX2 wrapper is also provided and it can be used when the AVX512 version isn't profitable. MathVec_Prefer_No_AVX512 is addded to cpu-features. If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in GLIBC_TUNABLES environment variable, the AVX2 wrapper will be used. Tested on x86-64 machines with and without AVX512. Also verified glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine. [BZ #21967] * sysdeps/x86/cpu-features.h (bit_arch_MathVec_Prefer_No_AVX512): New. (index_arch_MathVec_Prefer_No_AVX512): Likewise. * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)): Handle MathVec_Prefer_No_AVX512. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h (IFUNC_SELECTOR): Return AVX2 version if MathVec_Prefer_No_AVX512 is set.	2017-09-12 07:54:47 -07:00
Markus Trippelsdorf	4c03a69680	Update x86_64 ulps for AMD Ryzen. * sysdeps/x86_64/fpu/libm-test-ulps: Update for AMD Ryzen.	2017-09-08 19:57:12 +00:00
Joseph Myers	5a80d39d0d	Obsolete pow10 functions. This patch obsoletes the pow10, pow10f and pow10l functions (makes them into compat symbols, not available for new ports or static linking). The exp10 names for these functions are standardized (in TS 18661-4) and were added in the same glibc version (2.1) as pow10 so source code can change to use them without any loss of portability. Since pow10 is deliberately not provided for _Float128, only exp10, this slightly simplifies moving to the new wrapper templates in the !LIBM_SVID_COMPAT case, by avoiding needing to arrange for pow10, pow10f and pow10l to be defined by those templates. Tested for x86_64, and with build-many-glibcs.py. * manual/math.texi (pow10): Do not document. (pow10f): Likewise. (pow10l): Likewise. * math/bits/mathcalls.h [__USE_GNU] (pow10): Do not declare. * math/bits/math-finite.h [__USE_GNU] (pow10): Likewise. * math/libm-test-exp10.inc (pow10_test): Remove. (do_test): Do not call pow10. * math/w_exp10_compat.c (pow10): Make into compat symbol. [NO_LONG_DOUBLE] (pow10l): Likewise. * math/w_exp10f_compat.c (pow10f): Likewise. * math/w_exp10l_compat.c (pow10l): Likewise. * sysdeps/ia64/fpu/e_exp10.S: Include <shlib-compat.h>. (pow10): Make into compat symbol. * sysdeps/ia64/fpu/e_exp10f.S: Include <shlib-compat.h>. (pow10f): Make into compat symbol. * sysdeps/ia64/fpu/e_exp10l.S: Include <shlib-compat.h>. (pow10l): Make into compat symbol. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove pow10. (CFLAGS-nldbl-pow10.c): Remove variable.. * sysdeps/ieee754/ldbl-opt/nldbl-pow10.c: Remove file. * sysdeps/ieee754/ldbl-opt/w_exp10_compat.c (pow10l): Condition on [SHLIB_COMPAT (libm, GLIBC_2_1, GLIBC_2_27)]. * sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (compat_symbol): Undefine and redefine. (pow10l): Make into compat symbol. * sysdeps/aarch64/libm-test-ulps: Remove pow10 ulps. * sysdeps/alpha/fpu/libm-test-ulps: Likewise. * sysdeps/arm/libm-test-ulps: Likewise. * sysdeps/hppa/fpu/libm-test-ulps: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/microblaze/libm-test-ulps: Likewise. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/nios2/libm-test-ulps: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps: Likewise. * sysdeps/s390/fpu/libm-test-ulps: Likewise. * sysdeps/sh/libm-test-ulps: Likewise. * sysdeps/sparc/fpu/libm-test-ulps: Likewise. * sysdeps/tile/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-09-01 21:13:18 +00:00
H.J. Lu	5e2bc4ff33	x86_64 __redirect_ieee754_expf: Change double to float __redirect_ieee754_expf has type float, not double. * sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Change double to float.	2017-08-28 08:45:53 -07:00
H.J. Lu	fcaaca412f	x86-64: Regenerate libm-test-ulps for AVX512 mathvec tests Update libm-test-ulps for AVX512 mathvec tests by running “make regen-ulps” on Intel Xeon processor with AVX512. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.	2017-08-23 09:11:55 -07:00
H.J. Lu	b9eaca8fa0	x86_64: Replace AVX512F .byte sequences with instructions Since binutils 2.25 or later is required to build glibc, we can replace AVX512F .byte sequences with AVX512F instructions. Tested on x86-64 and x32. There are no code differences in libmvec.so and libmvec.a. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: Replace AVX512F .byte sequences with AVX512F instructions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise.	2017-08-23 06:26:44 -07:00
H.J. Lu	098b9dd468	x86-64: Check FMA_Usable in ifunc-mathvec-avx2.h [BZ #21966 ] Since the AVX2 version of mathvec functions uses FMA, it can only be used when FMA is usable. [BZ #21966] * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h (IFUNC_SELECTOR): Don't use the AVX2 version if FMA isn't usable.	2017-08-18 06:19:07 -07:00
H.J. Lu	24a2e6588d	x86-64: Optimize e_expf with FMA [BZ #21912 ] FMA optimized e_expf improves performance by more than 50% on Skylake. [BZ #21912] * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_expf-fma. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file. * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.	2017-08-16 08:43:48 -07:00
H.J. Lu	f59f7adb4a	x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955 ] sysdeps/x86_64/fpu/e_expf.S has lea L(SP_RANGE)(%rip), %rdx /* load over/underflow bound / cmpl (%rdx,%rax,4), %ecx / \|x\|<under/overflow bound ? / ... / Here if \|x\| is Inf / lea L(SP_INF_0)(%rip), %rdx / depending on sign of x: / movss (%rdx,%rax,4), %xmm0 / return zero or Inf / ret ... .section .rodata.cst8,"aM",@progbits,8 ... .p2align 2 L(SP_RANGE): / single precision overflow/underflow bounds / .long 0x42b17217 / if x>this bound, then result overflows / .long 0x42cff1b4 / if x<this bound, then result underflows / .type L(SP_RANGE), @object ASM_SIZE_DIRECTIVE(L(SP_RANGE)) .p2align 2 L(SP_INF_0): .long 0x7f800000 / single precision Inf / .long 0 / single precision zero / .type L(SP_INF_0), @object ASM_SIZE_DIRECTIVE(L(SP_INF_0)) Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they must be aligned to 8 bytes. [BZ #21955] sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes. (L(SP_INF_0)): Likewise.	2017-08-15 14:05:14 -07:00
H.J. Lu	57a72fa350	x86-64: Add FMA multiarch functions to libm This patch adds multiarch functions optimized with -mfma -mavx2 to libm. e_pow-fma.c is compiled with $(config-cflags-nofma) due to PR 19003. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp-fma, e_log-fma, e_pow-fma, s_atan-fma, e_asin-fma, e_atan2-fma, s_sin-fma, s_tan-fma, mplog-fma, mpa-fma, slowexp-fma, slowpow-fma, sincos32-fma, doasin-fma, dosincos-fma, halfulp-fma, mpexp-fma, mpatan2-fma, mpatan-fma, mpsqrt-fma, and mptan-fma. (CFLAGS-doasin-fma.c): New. (CFLAGS-dosincos-fma.c): Likewise. (CFLAGS-e_asin-fma.c): Likewise. (CFLAGS-e_atan2-fma.c): Likewise. (CFLAGS-e_exp-fma.c): Likewise. (CFLAGS-e_log-fma.c): Likewise. (CFLAGS-e_pow-fma.c): Likewise. (CFLAGS-halfulp-fma.c): Likewise. (CFLAGS-mpa-fma.c): Likewise. (CFLAGS-mpatan-fma.c): Likewise. (CFLAGS-mpatan2-fma.c): Likewise. (CFLAGS-mpexp-fma.c): Likewise. (CFLAGS-mplog-fma.c): Likewise. (CFLAGS-mpsqrt-fma.c): Likewise. (CFLAGS-mptan-fma.c): Likewise. (CFLAGS-s_atan-fma.c): Likewise. (CFLAGS-sincos32-fma.c): Likewise. (CFLAGS-slowexp-fma.c): Likewise. (CFLAGS-slowpow-fma.c): Likewise. (CFLAGS-s_sin-fma.c): Likewise. (CFLAGS-s_tan-fma.c): Likewise. * sysdeps/x86_64/fpu/multiarch/doasin-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/dosincos-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_asin-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_atan2-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-avx-fma4.h: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-fma4.h: Likewise. * sysdeps/x86_64/fpu/multiarch/mpa-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpatan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpatan2-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpsqrt-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mptan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/sincos32-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_asin.c: Rewrite. * sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise.	2017-08-07 08:20:56 -07:00
H.J. Lu	8537e0f6cf	x86-64: Implement libmathvec IFUNC selectors in C * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines) Add svml_d_cos2_core-sse2, svml_d_cos4_core-sse, svml_d_cos8_core-avx2, svml_d_exp2_core-sse2, svml_d_exp4_core-sse, svml_d_exp8_core-avx2, svml_d_log2_core-sse2, svml_d_log4_core-sse, svml_d_log8_core-avx2, svml_d_pow2_core-sse2, svml_d_pow4_core-sse, svml_d_pow8_core-avx2 svml_d_sin2_core-sse2, svml_d_sin4_core-sse, svml_d_sin8_core-avx2, svml_d_sincos2_core-sse2, svml_d_sincos4_core-sse, svml_d_sincos8_core-avx2, svml_s_cosf16_core-avx2, svml_s_cosf4_core-sse2, svml_s_cosf8_core-sse, svml_s_expf16_core-avx2, svml_s_expf4_core-sse2, svml_s_expf8_core-sse, svml_s_logf16_core-avx2, svml_s_logf4_core-sse2, svml_s_logf8_core-sse, svml_s_powf16_core-avx2, svml_s_powf4_core-sse2, svml_s_powf8_core-sse, svml_s_sincosf16_core-avx2, svml_s_sincosf4_core-sse2, svml_s_sincosf8_core-sse, svml_s_sinf16_core-avx2, svml_s_sinf4_core-sse2 and svml_s_sinf8_core-sse. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h: New file. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-sse4_1.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN8v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_sinf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_sinf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_sinf): Removed.	2017-08-04 13:03:58 -07:00
H.J. Lu	10a87ca476	x86-64: Implement libm IFUNC selectors in C * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_ceil-sse4_1, s_ceilf-sse4_1, s_floor-sse4_1, s_floorf-sse4_1, s_nearbyint-sse4_1, s_nearbyintf-sse4_1, s_rint-sse4_1 and s_rintf-sse4_1. * sysdeps/x86_64/fpu/multiarch/ifunc-sse4_1.h: New file. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__ceil): Removed. * sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__ceilf): Removed. * sysdeps/x86_64/fpu/multiarch/s_floor.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__floor): Removed. * sysdeps/x86_64/fpu/multiarch/s_floorf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__floorf): Removed. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__nearbyint): Removed. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__nearbyintf): Removed. * sysdeps/x86_64/fpu/multiarch/s_rint.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__rint): Removed. * sysdeps/x86_64/fpu/multiarch/s_rintf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__rintf): Removed.	2017-08-04 13:02:13 -07:00
Joseph Myers	c86ed71d63	Add float128 support for x86_64, x86. This patch enables float128 support for x86_64 and x86. All GCC versions that can build glibc provide the required support, but since GCC 6 and before don't provide __builtin_nanq / __builtin_nansq, sNaN tests and some tests of NaN payloads need to be disabled with such compilers (this does not affect the generated glibc binaries at all, just the tests). bits/floatn.h declares float128 support to be available for GCC versions that provide the required libgcc support (4.3 for x86_64, 4.4 for i386 GNU/Linux, 4.5 for i386 GNU/Hurd); compilation-only support was present some time before then, but not really useful without the libgcc functions. fenv_private.h needed updating to avoid trying to put _Float128 values in registers. I make no assertion of optimality of the math_opt_barrier / math_force_eval definitions for this case; they are simply intended to be sufficient to work correctly. Tested for x86_64 and x86, with GCC 7 and GCC 6. (Testing for x32 was compilation tests only with build-many-glibcs.py to verify the ABI baseline updates. I have not done any testing for Hurd, although the float128 support is enabled there as for GNU/Linux.) * sysdeps/i386/Implies: Add ieee754/float128. * sysdeps/x86_64/Implies: Likewise. * sysdeps/x86/bits/floatn.h: New file. * sysdeps/x86/float128-abi.h: Likewise. * manual/math.texi (Mathematics): Document support for _Float128 on x86_64 and x86. * sysdeps/i386/fpu/fenv_private.h: Include <bits/floatn.h>. (math_opt_barrier): Do not put _Float128 values in floating-point registers. (math_force_eval): Likewise. [__x86_64__] (SET_RESTORE_ROUNDF128): New macro. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (CPPFLAGS): Append to Makefile variable. * sysdeps/x86/fpu/e_sqrtf128.c: New file. * sysdeps/x86/fpu/sfp-machine.h: Likewise. Based on libgcc. * sysdeps/x86/math-tests.h: New file. * math/libm-test-support.h (XFAIL_FLOAT128_PAYLOAD): New macro. * math/libm-test-getpayload.inc (getpayload_test_data): Use XFAIL_FLOAT128_PAYLOAD. * math/libm-test-setpayload.inc (setpayload_test_data): Likewise. * math/libm-test-totalorder.inc (totalorder_test_data): Likewise. * math/libm-test-totalordermag.inc (totalordermag_test_data): Likewise. * sysdeps/unix/sysv/linux/i386/libc.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-06-26 22:02:24 +00:00
Zack Weinberg	fd860eaaa8	Remove __need macros from errno.h (__need_Emath, __need_error_t). This is fairly complicated, not because the users of __need_Emath and __need_error_t have complicated requirements, but because the core changes had a lot of fallout. __need_error_t exists for gnulib compatibility in argz.h and argp.h. error_t itself is a Hurdism, an enum containing all the E-constants, so you can do 'p (error_t) errno' in gdb and get a symbolic value. argz.h and argp.h use it for function return values, and they want to fall back to 'int' when that's not available. There is no reason why these nonstandard headers cannot just go ahead and include all of errno.h; so we do that. __need_Emath is defined only by .S files; what they _really_ need is for errno.h to avoid declaring anything other than the E-constants (e.g. 'extern int __errno_location(void);' is a syntax error in assembly language). This is replaced with a check for __ASSEMBLER__ in errno.h, plus a carefully documented requirement for bits/errno.h not to define anything other than macros. That in turn has the consequence that bits/errno.h must not define errno - fortunately, all live ports use the same definition of errno, so I've moved it to errno.h. The Hurd bits/errno.h must also take care not to define error_t when __ASSEMBLER__ is defined, which involves repeating all of the definitions twice, but it's a generated file so that's okay. * stdlib/errno.h: Remove __need_Emath and __need_error_t logic. Reorganize file. Declare errno here. When __ASSEMBLER__ is defined, don't declare anything other than the E-constants. * include/errno.h: Change conditional for exposing internal declarations to (not _ISOMAC and not __ASSEMBLER__). * bits/errno.h: Remove logic for __need_Emath. Document requirements for a port-specific bits/errno.h. * sysdeps/unix/sysv/linux/bits/errno.h * sysdeps/unix/sysv/linux/alpha/bits/errno.h * sysdeps/unix/sysv/linux/hppa/bits/errno.h * sysdeps/unix/sysv/linux/mips/bits/errno.h * sysdeps/unix/sysv/linux/sparc/bits/errno.h: Add multiple-include guard and check against improper inclusion. Remove __need_Emath logic. Don't declare errno here. Ensure all constants are defined as simple integer literals. Consistent formatting. * sysdeps/mach/hurd/errnos.awk: Likewise. Only define error_t and enum __error_t_codes if __ASSEMBLER__ is not defined. * sysdeps/mach/hurd/bits/errno.h: Regenerate. * argp/argp.h, string/argz.h: Don't define __need_error_t before including errno.h. * sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S * sysdeps/i386/i686/fpu/multiarch/s_sincosf-sse2.S * sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S * sysdeps/x86_64/fpu/s_cosf.S * sysdeps/x86_64/fpu/s_sincosf.S * sysdeps/x86_64/fpu/s_sinf.S: Just include errno.h; don't define __need_Emath or include bits/errno.h directly.	2017-06-14 08:14:34 -04:00
Zack Weinberg	7c3018f9e4	Suppress internal declarations for most of the testsuite. This patch adds a new build module called 'testsuite'. IS_IN (testsuite) implies _ISOMAC, as do IS_IN_build and __cplusplus (which means several ad-hoc tests for __cplusplus can go away). libc-symbols.h now suppresses almost all of itself when _ISOMAC is defined; in particular, _ISOMAC mode does not get config.h automatically anymore. There are still quite a few tests that need to see internal gunk of one variety or another. For them, we now have 'tests-internal' and 'test-internal-extras'; files in this category will still be compiled with MODULE_NAME=nonlib, and everything proceeds as it always has. The bulk of this patch is moving tests from 'tests' to 'tests-internal'. There is also 'tests-static-internal', which has the same effect on files in 'tests-static', and 'modules-names-tests', which has the inverse effect on files in 'modules-names' (it's inverted because most of the things in modules-names are not tests). For both of these, the file must appear in both the new variable and the old one. There is also now a special case for when libc-symbols.h is included without MODULE_NAME being defined at all. (This happens during the creation of libc-modules.h, and also when preprocessing Versions files.) When this happens, IS_IN is set to be always false and _ISOMAC is not defined, which was the status quo, but now it's explicit. The remaining changes to C source files in this patch seemed likely to cause problems in the absence of the main change. They should be relatively self-explanatory. In a few cases I duplicated a definition from an internal header rather than move the test to tests-internal; this was a judgement call each time and I'm happy to change those however reviewers feel is more appropriate. * Makerules: New subdir configuration variables 'tests-internal' and 'test-internal-extras'. Test files in these categories will still be compiled with MODULE_NAME=nonlib. Test files in the existing categories (tests, xtests, test-srcs, test-extras) are now compiled with MODULE_NAME=testsuite. New subdir configuration variable 'modules-names-tests'. Files which are in both 'modules-names' and 'modules-names-tests' will be compiled with MODULE_NAME=testsuite instead of MODULE_NAME=extramodules. (gen-as-const-headers): Move to tests-internal. (do-tests-clean, common-mostlyclean): Support tests-internal. * Makeconfig (built-modules): Add testsuite. * Makefile: Change libof-check-installed-headers-c and libof-check-installed-headers-cxx to 'testsuite'. * Rules: Likewise. Support tests-internal. * benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove extra-modules.mk. * config.h.in: Don't check for __OPTIMIZE__ or __FAST_MATH__ here. * include/libc-symbols.h: Move definitions of _GNU_SOURCE, PASTE_NAME, PASTE_NAME1, IN_MODULE, IS_IN, and IS_IN_LIB to the very top of the file and rationalize their order. If MODULE_NAME is not defined at all, define IS_IN to always be false, and don't define _ISOMAC. If any of IS_IN (testsuite), IS_IN_build, or __cplusplus are true, define _ISOMAC and suppress everything else in this file, starting with the inclusion of config.h. Do check for inappropriate definitions of __OPTIMIZE__ and __FAST_MATH__ here, but only if _ISOMAC is not defined. Correct some out-of-date commentary. * include/math.h: If _ISOMAC is defined, undefine NO_LONG_DOUBLE and _Mlong_double_ before including math.h. * include/string.h: If _ISOMAC is defined, don't expose _STRING_ARCH_unaligned. Move a comment to a more appropriate location. * include/errno.h, include/stdio.h, include/stdlib.h, include/string.h * include/time.h, include/unistd.h, include/wchar.h: No need to check __cplusplus nor use __BEGIN_DECLS/__END_DECLS. * misc/sys/cdefs.h (__NTHNL): New macro. * sysdeps/m68k/m680x0/fpu/bits/mathinline.h (__m81_defun): Use __NTHNL to avoid errors with GCC 6. * elf/tst-env-setuid-tunables.c: Include config.h with _LIBC defined, for HAVE_TUNABLES. * inet/tst-checks-posix.c: No need to define _ISOMAC. * intl/tst-gettext2.c: Provide own definition of N_. * math/test-signgam-finite-c99.c: No need to define _ISOMAC. * math/test-signgam-main.c: No need to define _ISOMAC. * stdlib/tst-strtod.c: Convert to test-driver. Split locale_test to... * stdlib/tst-strtod1i.c: ...this new file. * stdlib/tst-strtod5.c: Convert to test-driver and add copyright notice. Split tests of __strtod_internal to... * stdlib/tst-strtod5i.c: ...this new file. * string/test-string.h: Include stdint.h. Duplicate definition of inhibit_loop_to_libcall here (from libc-symbols.h). * string/test-strstr.c: Provide dummy definition of libc_hidden_builtin_def when including strstr.c. * sysdeps/ia64/fpu/libm-symbols.h: Suppress entire file in _ISOMAC mode; no need to test __STRICT_ANSI__ nor __cplusplus as well. * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h. * elf/Makefile: Move tst-ptrguard1-static, tst-stackguard1-static, tst-tls1-static, tst-tls2-static, tst-tls3-static, loadtest, unload, unload2, circleload1, neededtest, neededtest2, neededtest3, neededtest4, tst-tls1, tst-tls2, tst-tls3, tst-tls6, tst-tls7, tst-tls8, tst-dlmopen2, tst-ptrguard1, tst-stackguard1, tst-_dl_addr_inside_object, and all of the ifunc tests to tests-internal. Don't add $(modules-names) to test-extras. * inet/Makefile: Move tst-inet6_scopeid_pton to tests-internal. Add tst-deadline to tests-static-internal. * malloc/Makefile: Move tst-mallocstate and tst-scratch_buffer to tests-internal. * misc/Makefile: Move tst-atomic and tst-atomic-long to tests-internal. * nptl/Makefile: Move tst-typesizes, tst-rwlock19, tst-sem11, tst-sem12, tst-sem13, tst-barrier5, tst-signal7, tst-tls3, tst-tls3-malloc, tst-tls5, tst-stackguard1, tst-sem11-static, tst-sem12-static, and tst-stackguard1-static to tests-internal. Link tests-internal with libpthread also. Don't add $(modules-names) to test-extras. * nss/Makefile: Move tst-field to tests-internal. * posix/Makefile: Move bug-regex5, bug-regex20, bug-regex33, tst-rfc3484, tst-rfc3484-2, and tst-rfc3484-3 to tests-internal. * stdlib/Makefile: Move tst-strtod1i, tst-strtod3, tst-strtod4, tst-strtod5i, tst-tls-atexit, and tst-tls-atexit-nodelete to tests-internal. * sunrpc/Makefile: Move tst-svc_register to tests-internal. * sysdeps/powerpc/Makefile: Move test-get_hwcap and test-get_hwcap-static to tests-internal. * sysdeps/unix/sysv/linux/Makefile: Move tst-setgetname to tests-internal. * sysdeps/x86_64/fpu/Makefile: Add all libmvec test modules to modules-names-tests.	2017-05-11 19:27:59 -04:00
Zack Weinberg	963394a22b	Allow direct use of math_ldbl.h in testsuite. A few 'long double'-related tests include math_private.h just for their variety of math_ldbl.h, which contains macros for assembling and disassembling the binary representation of 'long double'. math_ldbl.h insists on being included from math_private.h, but if we relax this restriction (and fix some portability sloppiness) we can use it directly and not have to expose all of math_private.h to the testsuite. * sysdeps/generic/math_private.h: Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN. * sysdeps/generic/math_ldbl.h * sysdeps/ia64/fpu/math_ldbl.h * sysdeps/ieee754/ldbl-128/math_ldbl.h * sysdeps/ieee754/ldbl-128ibm/math_ldbl.h * sysdeps/ieee754/ldbl-96/math_ldbl.h * sysdeps/powerpc/fpu/math_ldbl.h * sysdeps/x86_64/fpu/math_ldbl.h: Allow direct inclusion. Use uintNN_t instead of u_intNN_t. Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN. Include endian.h and/or stdint.h if necessary. Add copyright notices. * sysdeps/ieee754/ldbl-128ibm/math_ldbl.h (ldbl_canonicalize_int): Don't use EXTRACT_WORDS64. * sysdeps/ieee754/ldbl-96/test-canonical-ldbl-96.c * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c * sysdeps/ieee754/ldbl-128ibm/test-canonical-ldbl-128ibm.c * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c: Include math_ldbl.h, not math_private.h.	2017-02-25 10:40:48 -05:00
Joseph Myers	92061bb033	Run libm tests separately for each function. At present, libm tests for each function get built into a single executable (for each floating point type, for each of normal / inline / finite-math-only functions, plus vector variants) and run together, resulting in a single PASS or FAIL (for each of those nine variants plus vector variants). Building this executable involves reading over 50 MB of libm-test-.c sources. This patch arranges for tests of each function to be run separately from the makefiles instead. There are 121 functions being tested for each (type, variant pair) (actually 126, but run as 121 from the Makefile because each of the pairs (exp10, pow10), (isfinite, finite), (lgamma, gamma), (remainder, drem), (scalbn, ldexp), shares a table of test results and so is run together), so 1089 separate tests run from the Makefile, plus 48 vector tests on x86_64 (six functions for eight vector variants). Each test only involves a libm-test-<func>.c file of no more than about 4 MB, rather than all such files taking about 50 MB. With tests run separately, test summaries will indicate which functions actually have problems (of course, those problems may just be out-of-date libm-test-ulps files if the file hasn't been updated for the architecture in question recently). All the .c files for the 1089+48 tests are generated automatically from the Makefiles. Various checked-in boilerplate .c files are removed as no longer needed. CFLAGS definitions for the different kinds of tests are generated using makefile iterators to apply target-specific variable settings. libm-have-vector-test.h is no longer needed; the list of functions to test for each vector type is now in the sysdeps Makefile. This should reduce the amount of boilerplate needed for float128 testing support; test-float128.h will still be needed, but not various .c files or Makefile CFLAGS definitions. The logic for creating dependencies on libm-test-support-.o files should also render <https://sourceware.org/ml/libc-alpha/2017-02/msg00279.html> unnecessary. Tested for x86_64 and x86. * math/Makefile (libm-tests-generated): Remove variable. (libm-tests-base-normal): New variable. (libm-tests-base-finite): Likewise. (libm-tests-base-inline): Likewise. (libm-tests-base): Likewise. (libm-tests-normal): Likewise. (libm-tests-finite): Likewise. (libm-tests-inline): Likewise. (libm-tests-vector): Likewise. (libm-tests): Define in terms of these new variables. (libm-tests-for-type): New variable. (libm-tests.o): Move definition. (tests): Move addition of $(libm-tests). (generated): Update for new and removed libm test files. ($(objpfx)libm-test.c): Remove target. ($(objpfx)libm-have-vector-test.h): Likewise. (CFLAGS-test-double-vlen2.c): Remove variable. (CFLAGS-test-double-vlen4.c): Likewise. (CFLAGS-test-double-vlen8.c): Likewise. (CFLAGS-test-float-vlen4.c): Likewise. (CFLAGS-test-float-vlen8.c): Likewise. (CFLAGS-test-float-vlen16.c): Likewise. (CFLAGS-test-float.c): Likewise. (CFLAGS-test-float-finite.c): Likewise. (CFLAGS-libm-test-support-float.c): Likewise. (CFLAGS-test-double.c): Likewise. (CFLAGS-test-double-finite.c): Likewise. (CFLAGS-libm-test-support-double.c): Likewise. (CFLAGS-test-ldouble.c): Likewise. (CFLAGS-test-ldouble-finite.c): Likewise. (CFLAGS-libm-test-support-ldouble.c): Likewise. (libm-test-inline-cflags): New variable. (CFLAGS-test-ifloat.c): Remove variable. (CFLAGS-test-idouble.c): Likewise. (CFLAGS-test-ildouble.c): Likewise. ($(addprefix $(objpfx), $(libm-tests.o))): Move target and update dependencies. ($(foreach t,$(libm-tests-normal),$(objpfx)$(t).c)): New rule. ($(foreach t,$(libm-tests-finite),$(objpfx)$(t).c)): Likewise. ($(foreach t,$(libm-tests-inline),$(objpfx)$(t).c)): Likewise. ($(foreach t,$(libm-tests-vector),$(objpfx)$(t).c)): Likewise. ($(foreach t,$(types),$(objpfx)libm-test-support-$(t).c)): Likewise. (dependencies on libm-test-support-.o): Remove. ($(foreach f,$(libm-test-funcs-all),$(objpfx)$(o)-$(f).o)): New rules using iterators. ($(addprefix $(objpfx),$(call libm-tests-for-type,$(o)))): Likewise. ($(objpfx)libm-test-support-$(o).o): Likewise. ($(addprefix $(objpfx),$(filter-out $(tests-static) $(libm-vec-tests),$(tests)))): Filter out $(libm-tests-vector) instead. ($(addprefix $(objpfx), $(libm-vec-tests))): Use iterator to define rule instead. math/README.libm-test: Update. * math/libm-test-acos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-acosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-asin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-asinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-atan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-atan2.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-atanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cabs.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cacos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cacosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-canonicalize.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-carg.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-casin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-casinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-catan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-catanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cbrt.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ccos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ccosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ceil.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cexp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cimag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-clog.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-clog10.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-conj.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-copysign.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cpow.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cproj.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-creal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-csin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-csinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-csqrt.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ctan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ctanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-erf.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-erfc.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-exp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-exp10.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-exp2.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-expm1.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fabs.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fdim.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-floor.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fma.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmax.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmaxmag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fminmag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmod.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fpclassify.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-frexp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fromfp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fromfpx.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-getpayload.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-hypot.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ilogb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-iscanonical.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-iseqsig.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isfinite.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isgreater.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isgreaterequal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isinf.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isless.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-islessequal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-islessgreater.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isnan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isnormal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-issignaling.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-issubnormal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isunordered.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-iszero.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-j0.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-j1.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-jn.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-lgamma.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-llogb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-llrint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-llround.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log10.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log1p.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log2.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-logb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-lrint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-lround.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-modf.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nearbyint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nextafter.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nextdown.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nexttoward.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nextup.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-pow.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-remainder.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-remquo.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-rint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-round.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-roundeven.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-scalb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-scalbln.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-scalbn.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-setpayload.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-setpayloadsig.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-signbit.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-significand.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sincos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sqrt.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-tan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-tanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-tgamma.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-totalorder.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-totalordermag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-trunc.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ufromfp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ufromfpx.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-y0.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-y1.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-yn.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-driver.c: Do not include libm-have-vector-test.h. (HAVE_VECTOR): Remove macro. (START): Do not call HAVE_VECTOR. * math/test-double-vlen2.h (FUNC_TEST): Remove macro. * math/test-double-vlen4.h (FUNC_TEST): Remove macro. * math/test-double-vlen8.h (FUNC_TEST): Remove macro. * math/test-float-vlen16.h (FUNC_TEST): Remove macro. * math/test-float-vlen4.h (FUNC_TEST): Remove macro. * math/test-float-vlen8.h (FUNC_TEST): Remove macro. * math/test-math-vector.h (FUNC_TEST): New macro. (WRAPPER_DECL): Rename to WRAPPER_DECL_f. * sysdeps/x86_64/fpu/Makefile (double-vlen2-funcs): New variable. (double-vlen4-funcs): Likewise. (double-vlen4-avx2-funcs): Likewise. (double-vlen8-funcs): Likewise. (float-vlen4-funcs): Likewise. (float-vlen8-funcs): Likewise. (float-vlen8-avx2-funcs): Likewise. (float-vlen16-funcs): Likewise. (CFLAGS-test-double-vlen4-avx2.c): Remove variable. (CFLAGS-test-float-vlen8-avx2.c): Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.h (TEST_VECTOR_cos): Remove macro. (TEST_VECTOR_sin): Likewise. (TEST_VECTOR_sincos): Likewise. (TEST_VECTOR_log): Likewise. (TEST_VECTOR_exp): Likewise. (TEST_VECTOR_pow): Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.h (TEST_VECTOR_cos): Likewise. (TEST_VECTOR_sin): Likewise. (TEST_VECTOR_sincos): Likewise. (TEST_VECTOR_log): Likewise. (TEST_VECTOR_exp): Likewise. (TEST_VECTOR_pow): Likewise. * sysdeps/x86_64/fpu/test-float-vlen16.h (TEST_VECTOR_cosf): Likewise. (TEST_VECTOR_sinf): Likewise. (TEST_VECTOR_sincosf): Likewise. (TEST_VECTOR_logf): Likewise. (TEST_VECTOR_expf): Likewise. (TEST_VECTOR_powf): Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.h (TEST_VECTOR_cosf): Likewise. (TEST_VECTOR_sinf): Likewise. (TEST_VECTOR_sincosf): Likewise. (TEST_VECTOR_logf): Likewise. (TEST_VECTOR_expf): Likewise. (TEST_VECTOR_powf): Likewise. * math/gen-libm-have-vector-test.sh: Remove file. * math/libm-test.inc: Likewise. * math/libm-test-support-double.c: Likewise. * math/libm-test-support-float.c: Likewise. * math/libm-test-support-ldouble.c: Likewise. * math/test-double-finite.c: Likewise.: Likewise. * math/test-double.c: Likewise. * math/test-float-finite.c: Likewise. * math/test-float.c: Likewise. * math/test-idouble.c: Likewise. * math/test-ifloat.c: Likewise. * math/test-ildouble.c: Likewise. * math/test-ldouble-finite.c: Likewise. * math/test-ldouble.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen2.h: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.h: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.	2017-02-24 00:52:49 +00:00
Joseph Myers	2c51dfd05d	Move tests of catan, catanh to auto-libm-test-. This patch moves tests of catan and catanh with finite inputs (other than the divide-by-zero cases producing an exact infinity) to using the auto-libm-test machinery. Each of auto-libm-test-out-catan and auto-libm-test-out-catanh takes about three seconds to generate on my system (so in fact it wasn't necessary after all to defer the move to auto-libm-test- until the output files were split up by function). Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of catan and catanh. * math/auto-libm-test-out-catan: New generated file. * math/auto-libm-test-out-catanh: Likewise. * math/libm-test-catan.inc (catan_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs, except divide-by-zero cases, to auto-libm-test-in. * math/libm-test-catanh.inc (catanh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add catan and catanh. (libm-test-funcs-noauto): Remove catan and catanh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-02-17 18:42:37 +00:00
Joseph Myers	fa2a3dd7a3	Move tests of casin, casinh to auto-libm-test-. This patch moves tests of casin and casinh with finite inputs to using the auto-libm-test machinery. Each of auto-libm-test-out-casin and auto-libm-test-out-casinh takes about 38 minutes to generate on my system because of MPC slowness on special cases that appear in the tests (with MPC 1.0.3; I don't know to what extent current MPC master might speed it up). Tested for x86_64 and x86 and ulps updated accordingly. math/auto-libm-test-in: Add tests of casin and casinh. * math/auto-libm-test-out-casin: New generated file. * math/auto-libm-test-out-casinh: Likewise. * math/libm-test-casin.inc (casin_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs to auto-libm-test-in. * math/libm-test-casinh.inc (casinh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add casin and casinh. (libm-test-funcs-noauto): Remove casin and casinh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-02-17 18:14:02 +00:00
Joseph Myers	6b8303a383	Move tests of cacos, cacosh to auto-libm-test-. This patch moves tests of cacos and cacosh with finite inputs to using the auto-libm-test machinery. Each of auto-libm-test-out-cacos and auto-libm-test-out-cacosh takes about 80 minutes to generate on my system because of MPC slowness on special cases that appear in the tests (with MPC 1.0.3; I don't know to what extent current MPC master might speed it up). Tested for x86_64 and x86 and ulps updated accordingly. math/auto-libm-test-in: Add tests of cacos and cacosh. * math/auto-libm-test-out-cacos: New generated file. * math/auto-libm-test-out-cacosh: Likewise. * math/libm-test-cacos.inc (cacos_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs to auto-libm-test-in. * math/libm-test-cacosh.inc (cacosh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add cacos and cacosh. (libm-test-funcs-noauto): Remove cacos and cacosh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-02-17 17:44:23 +00:00
Joseph Myers	f7a51347a4	Revert header inclusion changes that break math/ testing on x86_64. Revert: 2017-02-16 Zack Weinberg <zackw@panix.com> * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h.	2017-02-17 17:08:17 +00:00
Zack Weinberg	ceaa98897c	Add missing header files throughout the testsuite. * crypt/md5.h: Test _LIBC with #if defined, not #if. * dirent/opendir-tst1.c: Include sys/stat.h. * dirent/tst-fdopendir.c: Include sys/stat.h. * dirent/tst-fdopendir2.c: Include stdlib.h. * dirent/tst-scandir.c: Include stdbool.h. * elf/tst-auditmod1.c: Include link.h and stddef.h. * elf/tst-tls15.c: Include stdlib.h. * elf/tst-tls16.c: Include stdlib.h. * elf/tst-tls17.c: Include stdlib.h. * elf/tst-tls18.c: Include stdlib.h. * iconv/tst-iconv6.c: Include endian.h. * iconvdata/bug-iconv11.c: Include limits.h. * io/test-utime.c: Include stdint.h. * io/tst-faccessat.c: Include sys/stat.h. * io/tst-fchmodat.c: Include sys/stat.h. * io/tst-fchownat.c: Include sys/stat.h. * io/tst-fstatat.c: Include sys/stat.h. * io/tst-futimesat.c: Include sys/stat.h. * io/tst-linkat.c: Include sys/stat.h. * io/tst-mkdirat.c: Include sys/stat.h and stdbool.h. * io/tst-mkfifoat.c: Include sys/stat.h and stdbool.h. * io/tst-mknodat.c: Include sys/stat.h and stdbool.h. * io/tst-openat.c: Include stdbool.h. * io/tst-readlinkat.c: Include sys/stat.h. * io/tst-renameat.c: Include sys/stat.h. * io/tst-symlinkat.c: Include sys/stat.h. * io/tst-unlinkat.c: Include stdbool.h. * libio/bug-memstream1.c: Include stdlib.h. * libio/bug-wmemstream1.c: Include stdlib.h. * libio/tst-fwrite-error.c: Include stdlib.h. * libio/tst-memstream1.c: Include stdlib.h. * libio/tst-memstream2.c: Include stdlib.h. * libio/tst-memstream3.c: Include stdlib.h. * malloc/tst-interpose-aux.c: Include stdint.h. * misc/tst-preadvwritev-common.c: Include sys/stat.h. * nptl/tst-basic7.c: Include limits.h. * nptl/tst-cancel25.c: Include pthread.h, not pthreadP.h. * nptl/tst-cancel4.c: Include stddef.h, limits.h, and sys/stat.h. * nptl/tst-cancel4_1.c: Include stddef.h. * nptl/tst-cancel4_2.c: Include stddef.h. * nptl/tst-cond16.c: Include limits.h. Use sysconf(_SC_PAGESIZE) instead of __getpagesize. * nptl/tst-cond18.c: Include limits.h. Use sysconf(_SC_PAGESIZE) instead of __getpagesize. * nptl/tst-cond4.c: Include stdint.h. * nptl/tst-cond6.c: Include stdint.h. * nptl/tst-stack2.c: Include limits.h. * nptl/tst-stackguard1.c: Include stddef.h. * nptl/tst-tls4.c: Include stdint.h. Don't include tls.h. * nptl/tst-tls4moda.c: Include stddef.h. Don't include stdio.h, unistd.h, or tls.h. * nptl/tst-tls4modb.c: Include stddef.h. Don't include stdio.h, unistd.h, or tls.h. * nptl/tst-tls5.h: Include stddef.h. Don't include stdlib.h or tls.h. * posix/tst-getaddrinfo2.c: Include stdio.h. * posix/tst-getaddrinfo5.c: Include stdio.h. * posix/tst-pathconf.c: Include sys/stat.h. * posix/tst-posix_fadvise-common.c: Include stdint.h. * posix/tst-preadwrite-common.c: Include sys/stat.h. * posix/tst-regex.c: Include stdint.h. Don't include spawn.h or spawn_int.h. * posix/tst-regexloc.c: Don't include spawn.h or spawn_int.h. * posix/tst-vfork3.c: Include sys/stat.h. * resolv/tst-bug18665-tcp.c: Include stdlib.h. * resolv/tst-res_hconf_reorder.c: Include stdlib.h. * resolv/tst-resolv-search.c: Include stdlib.h. * stdio-common/tst-fmemopen2.c: Include stdint.h. * stdio-common/tst-vfprintf-width-prec.c: Include stdlib.h. * stdlib/test-canon.c: Include sys/stat.h. * stdlib/tst-tls-atexit.c: Include stdbool.h. * string/test-memchr.c: Include stdint.h. * string/tst-cmp.c: Include stdint.h. * sysdeps/pthread/tst-timer.c: Include stdint.h. * sysdeps/unix/sysv/linux/tst-sync_file_range.c: Include stdint.h. * sysdeps/wordsize-64/tst-writev.c: Include limits.h and stdint.h. * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/tst-auditmod10b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod3b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod4b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod5b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod6b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod6c.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod7b.c: Include link.h and stddef.h. * time/clocktest.c: Include stdint.h. * time/tst-posixtz.c: Include stdint.h. * timezone/tst-timezone.c: Include stdint.h.	2017-02-16 17:33:18 -05:00
Joseph Myers	10303eb74b	Move most libmvec test contents from .c to .h files. The libmvec tests put substantive, architecture-specific contents in .c files such as test-double-vlen4.c, so making those files architecture-specific and causing issues for generating such files automatically when splitting up tests by function. This patch moves all the substantive contents to .h files, so the .c files only include the .h file and then libm-test.c. This allows for automatic generation of per-function .c files in future. The .h files in turn #include or #include_next the architecture-independent file and add the architecture-specific definitions to that. (Splitting by function should in fact allow the TEST_VECTOR_* macros to be replaced by sysdeps makefile information on which functions to test in each case, removing the need for gen-libm-have-vector-test.sh as well as removing the need for some of the architecture-specific headers.) Tested for x86_64. * sysdeps/x86_64/fpu/test-double-vlen2.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen2.h: ... here. New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen4-avx2.h: ... here. New file. * sysdeps/x86_64/fpu/test-double-vlen4.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen4.h: ... here. New file. * sysdeps/x86_64/fpu/test-double-vlen8.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen8.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen16.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen16.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen4.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen4.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen8-avx2.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen8.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen8.h: ... here. New file.	2017-02-15 01:13:15 +00:00
Joseph Myers	bfff8b1bec	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
Joseph Myers	0a2546cdaa	Fix x86, x86_64 fmax, fmin sNaN handling, add tests (bug 20947). Various fmax and fmin function implementations mishandle sNaN arguments: (a) When both arguments are NaNs, the return value should be a qNaN, but sometimes it is an sNaN if at least one argument is an sNaN. (b) Under TS 18661-1 semantics, if either argument is an sNaN then the result should be a qNaN (whereas if one argument is a qNaN and the other is not a NaN, the result should be the non-NaN argument). Various implementations treat sNaNs like qNaNs here. This patch fixes the x86 and x86_64 versions (ignoring float and double for 32-bit x86 given the inability to reliably avoid the sNaN turning into a qNaN before it gets to the called function). Tests of sNaN inputs to these functions are added. Note on architecture versions I haven't changed for this issue: AArch64 already gets this right (it uses a hardware instruction with the correct semantics for both quiet and signaling NaNs) and does not need changes. It's possible Alpha, IA64, SPARC might need changes (this would be shown by the testsuite if so). Tested for x86_64 and x86 (both i686 and i586 builds, to cover the different x86 implementations). [BZ #20947] * sysdeps/i386/fpu/s_fmaxl.S (__fmaxl): Add the arguments when either is a signaling NaN. * sysdeps/i386/fpu/s_fminl.S (__fminl): Likewise. Make code follow fmaxl more closely. * sysdeps/i386/i686/fpu/s_fmaxl.S (__fmaxl): Add the arguments when either is a signaling NaN. * sysdeps/i386/i686/fpu/s_fminl.S (__fminl): Likewise. * sysdeps/x86_64/fpu/s_fmax.S (__fmax): Likewise. * sysdeps/x86_64/fpu/s_fmaxf.S (__fmaxf): Likewise. * sysdeps/x86_64/fpu/s_fmaxl.S (__fmaxl): Likewise. * sysdeps/x86_64/fpu/s_fmin.S (__fmin): Likewise. * sysdeps/x86_64/fpu/s_fminf.S (__fminf): Likewise. * sysdeps/x86_64/fpu/s_fminl.S (__fminl): Likewise. * math/libm-test.inc (fmax_test_data): Add tests of sNaN inputs. (fmin_test_data): Likewise.	2016-12-15 23:52:18 +00:00
Joseph Myers	a91fd168a0	Fix x86_64/x86 powl handling of sNaN arguments (bug 20916). The x86_64/x86 powl implementations mishandle sNaN arguments, both by returning sNaN in some cases (instead of doing arithmetic on the arguments to produce the result when NaN arguments result in NaN results) and by treating sNaN the same as qNaN for arguments (1, sNaN) and (sNaN, 0), contrary to TS 18661-1 which requires those cases to return qNaN instead of 1. This patch makes the x86_64/x86 powl implementations follow TS 18661-1 semantics for sNaN arguments; sNaN tests are also added for pow. Given the problems with testing float and double sNaN arguments on 32-bit x86 (sNaN tests disabled because the compiler may convert unnecessarily to a qNaN when passing arguments), no changes are made to the powf and pow implementations there. Tested for x86_64 and x86. [BZ #20916] * sysdeps/i386/fpu/e_powl.S (__ieee754_powl): Do not return 1 for arguments (sNaN, 0) or (1, sNaN). Do arithmetic on NaN arguments to compute result. * sysdeps/x86_64/fpu/e_powl.S (__ieee754_powl): Likewise. * math/libm-test.inc (pow_test_data): Add tests of sNaN arguments.	2016-12-06 00:33:19 +00:00
Joseph Myers	799131036e	Do not hardcode platform names in manual/libm-err-tab.pl (bug 14139). manual/libm-err-tab.pl hardcodes a list of names for particular platforms (mapping from sysdeps directory name to friendly name for the manual). This goes against the principle of keeping information about individual platforms in their corresponding sysdeps directory, and the list is also very out-of-date regarding supported platforms and their corresponding sysdeps directories. This patch fixes this by adding a libm-test-ulps-name file alongside each libm-test-ulps file. The script then gets the friendly name from that file, which is required to exist, so it no longer needs to allow for the mapping being missing. Tested for x86_64. [BZ #14139] * manual/libm-err-tab.pl (%pplatforms): Initialize to empty. (find_files): Obtain platform name from libm-test-ulps-name and store in %pplatforms. (canonicalize_platform): Remove. (print_platforms): Use $pplatforms directly. (by_platforms): Do not allow for platforms missing from %pplatforms. * sysdeps/aarch64/libm-test-ulps-name: New file. * sysdeps/alpha/fpu/libm-test-ulps-name: Likewise. * sysdeps/arm/libm-test-ulps-name: Likewise. * sysdeps/generic/libm-test-ulps-name: Likewise. * sysdeps/hppa/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps-name: Likewise. * sysdeps/ia64/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/coldfire/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps-name: Likewise. * sysdeps/microblaze/libm-test-ulps-name: Likewise. * sysdeps/mips/mips32/libm-test-ulps-name: Likewise. * sysdeps/mips/mips64/libm-test-ulps-name: Likewise. * sysdeps/nios2/libm-test-ulps-name: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps-name: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps-name: Likewise. * sysdeps/s390/fpu/libm-test-ulps-name: Likewise. * sysdeps/sh/libm-test-ulps-name: Likewise. * sysdeps/sparc/fpu/libm-test-ulps-name: Likewise. * sysdeps/tile/libm-test-ulps-name: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps-name: Likewise.	2016-11-04 16:49:06 +00:00
Joseph Myers	f280fa6d17	Use __builtin_fma more in dbl-64 code. sysdeps/ieee754/dbl-64/dla.h can use a macro DLA_FMS for more efficient double-width operations when fused multiply-subtract is supported. However, this macro is only defined for x86_64, conditional on architecture-specific __FMA4__. This patch makes the code use __builtin_fma conditional on __FP_FAST_FMA, as used elsewhere in glibc. Tested for x86_64, x86 and powerpc. On powerpc (where this is causing fused operations to be used where they weren't previously) I see an increase from 1ulp to 2ulp in the imaginary part of clog10: testing double (without inline functions) Failure: Test: Imaginary part of: clog10 (0x1.7a858p+0 - 0x6.d940dp-4 i) Result: is: -1.2237865208199886e-01 -0x1.f5435146bb61ap-4 should be: -1.2237865208199888e-01 -0x1.f5435146bb61cp-4 difference: 2.7755575615628914e-17 0x1.0000000000000p-55 ulp : 2.0000 max.ulp : 1.0000 Maximal error of real part of: clog10 is : 3 ulp accepted: 3 ulp Maximal error of imaginary part of: clog10 is : 2 ulp accepted: 1 ulp This is actually resulting from atan2 becoming more accurate (atan2 (-0x6.d940dp-4, 0x1.7a858p+0) should ideally be -0x1.208cd6e841554p-2 but was -0x1.208cd6e841555p-2 from a powerpc libm built before this change, and is -0x1.208cd6e841554p-2 from a powerpc libm built after this change). Since these functions are not expected to be correctly rounding by glibc's accuracy goals, neither result is a problem, but this does imply that some of this code, although designed to be correctly rounding, is not in fact correctly rounding (possibly because of GCC creating fused operations where the code does not expect it, something we've only disabled for specific functions where it was found to cause large errors). (Of course as previously discussed I think we should remove the slow cases where an error analysis shows this wouldn't increase the errors much above 0.5ulp; it's only functions such as cratan2 that are expected to be correctly rounding, not atan2.) * sysdeps/ieee754/dbl-64/dla.h [__FP_FAST_FMA] (DLA_FMS): Define macro to use __builtin_fma. * sysdeps/x86_64/fpu/dla.h: Remove file.	2016-09-30 15:49:51 +00:00
Joseph Myers	ec94343f59	Add femode_t functions. TS 18661-1 defines a type femode_t to represent the set of dynamic floating-point control modes (such as the rounding mode and trap enablement modes), and functions fegetmode and fesetmode to manipulate those modes (without affecting other state such as the raised exception flags) and a corresponding macro FE_DFL_MODE. This patch series implements those interfaces for glibc. This first patch adds the architecture-independent pieces, the x86 and x86_64 implementations, and the <bits/fenv.h> and ABI baseline updates for all architectures so glibc keeps building and passing the ABI tests on all architectures. Subsequent patches add the fegetmode and fesetmode implementations for other architectures. femode_t is generally an integer type - the same type as fenv_t, or as the single element of fenv_t where fenv_t is a structure containing a single integer (or the single relevant element, where it has elements for both status and control registers) - except where architecture properties or consistency with the fenv_t implementation indicate otherwise. FE_DFL_MODE follows FE_DFL_ENV in whether it's a magic pointer value (-1 cast to const femode_t ), a value that can be distinguished from valid pointers by its high bits but otherwise contains a representation of the desired register contents, or a pointer to a constant variable (the powerpc case; __fe_dfl_mode is added as an exported constant object, an alias to __fe_dfl_env). Note that where architectures (that share a register between control and status bits) gain definitions of new floating-point control or status bits in future, the implementations of fesetmode for those architectures may need updating (depending on whether the new bits are control or status bits and what the implementation does with previously unknown bits), just like existing implementations of <fenv.h> functions that take care not to touch reserved bits may need updating when the set of reserved bits changes. (As any new bits are outside the scope of ISO C, that's just a quality-of-implementation issue for supporting them, not a conformance issue.) As with fenv_t, femode_t should properly include any software DFP rounding mode (and for both fenv_t and femode_t I'd consider that fragment of DFP support appropriate for inclusion in glibc even in the absence of the rest of libdfp; hardware DFP rounding modes should already be included if the definitions of which bits are status / control bits are correct). Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). Other architecture versions are untested. math/fegetmode.c: New file. * math/fesetmode.c: Likewise. * sysdeps/i386/fpu/fegetmode.c: Likewise. * sysdeps/i386/fpu/fesetmode.c: Likewise. * sysdeps/x86_64/fpu/fegetmode.c: Likewise. * sysdeps/x86_64/fpu/fesetmode.c: Likewise. * math/fenv.h: Update comment on inclusion of <bits/fenv.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fegetmode): New function declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetmode): Likewise. * bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/aarch64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/alpha/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/arm/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/hppa/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/ia64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/m68k/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/microblaze/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/mips/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/nios2/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/powerpc/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (__fe_dfl_mode): New variable declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/s390/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sh/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sparc/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/tile/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/x86/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * manual/arith.texi (FE_DFL_MODE): Document macro. (fegetmode): Document function. (fesetmode): Likewise. * math/Versions (fegetmode): New libm symbol at version GLIBC_2.25. (fesetmode): Likewise. * math/Makefile (libm-support): Add fegetmode and fesetmode. (tests): Add test-femode and test-femode-traps. * math/test-femode-traps.c: New file. * math/test-femode.c: Likewise. * sysdeps/powerpc/fpu/fenv_const.c (__fe_dfl_mode): Declare as alias for __fe_dfl_env. * sysdeps/powerpc/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/Versions (__fe_dfl_mode): New libm symbol at version GLIBC_2.25. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.	2016-09-07 16:40:09 +00:00
Paul E. Murphy	2bad840e9d	Remove unneeded stubs for k_rem_pio2l. This is only used for the float and double variants. Instead, just add it to the type specific list of files, and remove all stubs, and remove the declaration from math_private.h. I verified x86_64, i486, ia64, m68k, and ppc64 build.	2016-09-01 09:31:06 -05:00
Joseph Myers	5146356f5a	Add fesetexcept. TS 18661-1 defines an fesetexcept function for setting floating-point exception flags without the side-effect of causing enabled traps to be taken. This patch series implements this function for glibc. The present patch adds the fallback stub implementation, x86 and x86_64 implementations, documentation, tests and ABI baseline updates. The remaining patches, some of them untested, add implementations for other architectures. The implementations generally follow those of the fesetexceptflag function. As for fesetexceptflag, the approach taken for architectures where setting flags causes enabled traps to be taken is to set the flags (and potentially cause traps) rather than refusing to set the flags and returning an error. Since ISO C and TS 18661 provide no way to enable traps, this is formally in accordance with the standards. The NEWS entry should be considered a placeholder, since this patch series is intended to be followed by further such series adding other TS 18661-1 features, so that the NEWS entry would end up looking more like * New <fenv.h> features from TS 18661-1:2014 are added to libm: the fesetexcept, fetestexceptflag, fegetmode and fesetmode functions, the femode_t type and the FE_DFL_MODE macro. with hopefully more such entries for other features, rather than having an entry for a single function in the end. I believe we have consensus for adding TS 18661-1 interfaces as per <https://sourceware.org/ml/libc-alpha/2016-06/msg00421.html>. Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). * math/fesetexcept.c: New file. * sysdeps/i386/fpu/fesetexcept.c: Likewise. * sysdeps/x86_64/fpu/fesetexcept.c: Likewise. * math/fenv.h: Define __GLIBC_INTERNAL_STARTING_HEADER_IMPLEMENTATION and include <bits/libc-header-start.h> instead of including <features.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetexcept): New function declaration. * manual/arith.texi (fesetexcept): Document function. * math/Versions (fesetexcept): New libm symbol at version GLIBC_2.25. * math/Makefile (libm-support): Add fesetexcept. (tests): Add test-fesetexcept and test-fesetexcept-traps. * math/test-fesetexcept.c: New file. * math/test-fesetexcept-traps.c: Likewise. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.	2016-08-16 16:16:10 +00:00
Andrew Senkevich	533f9bebf9	x86_64: Call finite scalar versions in vectorized log, pow, exp (bz #20033 ). Vector math functions require -ffast-math which sets -ffinite-math-only, so it is needed to call finite scalar versions (which are called from vector functions in some cases). Since finite version of pow() returns qNaN instead of 1.0 for several inputs, those inputs are excluded for tests of vector math functions. [BZ #20033] * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: Call finite version. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_log2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: Likewise. * math/libm-test.inc (pow_test_data): Exclude tests for qNaN in power zero.	2016-08-02 16:35:25 +03:00
H.J. Lu	fe0cf86148	Don't compile do_test with -mavx/-mavx/-mavx512 Don't compile do_test with -mavx, -mavx nor -mavx512 since they won't run on non-AVX machines. [BZ #20384] * sysdeps/x86_64/fpu/Makefile (extra-test-objs): Add test-double-libmvec-sincos-avx-main.o, test-double-libmvec-sincos-avx2-main.o, test-double-libmvec-sincos-main.o, test-float-libmvec-sincosf-avx-main.o, test-float-libmvec-sincosf-avx2-main.o and test-float-libmvec-sincosf-main.o. test-float-libmvec-sincosf-avx512-main.o. ($(objpfx)test-double-libmvec-sincos): Also link with $(objpfx)test-double-libmvec-sincos-main.o. ($(objpfx)test-double-libmvec-sincos-avx): Also link with $(objpfx)test-double-libmvec-sincos-avx-main.o. ($(objpfx)test-double-libmvec-sincos-avx2): Also link with $(objpfx)test-double-libmvec-sincos-avx2-main.o. ($(objpfx)test-float-libmvec-sincosf): Also link with $(objpfx)test-float-libmvec-sincosf-main.o. ($(objpfx)test-float-libmvec-sincosf-avx): Also link with $(objpfx)test-float-libmvec-sincosf-avx2-main.o. [$(config-cflags-avx512) == yes] (extra-test-objs): Add test-double-libmvec-sincos-avx512-main.o and ($(objpfx)test-double-libmvec-sincos-avx512): Also link with $(objpfx)test-double-libmvec-sincos-avx512-main.o. ($(objpfx)test-float-libmvec-sincosf-avx512): Also link with $(objpfx)test-float-libmvec-sincosf-avx512-main.o. (CFLAGS-test-double-libmvec-sincos.c): Removed. (CFLAGS-test-float-libmvec-sincosf.c): Likewise. (CFLAGS-test-double-libmvec-sincos-main.c): New. (CFLAGS-test-double-libmvec-sincos-avx-main.c): Likewise. (CFLAGS-test-double-libmvec-sincos-avx2-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-avx-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-avx2-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-avx512-main.c): Likewise. (CFLAGS-test-double-libmvec-sincos-avx.c): Set to -DREQUIRE_AVX. (CFLAGS-test-float-libmvec-sincosf-avx.c ): Likewise. (CFLAGS-test-double-libmvec-sincos-avx2.c): Set to -DREQUIRE_AVX2. (CFLAGS-test-float-libmvec-sincosf-avx2.c ): Likewise. (CFLAGS-test-double-libmvec-sincos-avx512.c): Set to -DREQUIRE_AVX512F. (CFLAGS-test-float-libmvec-sincosf-avx512.c): Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Rewritten. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx-main.c: New file. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-main.c: Likewise.	2016-07-27 11:53:15 -07:00
H.J. Lu	f43cb35c9b	Require binutils 2.24 to build x86-64 glibc [BZ #20139 ] If assembler doesn't support AVX512DQ, _dl_runtime_resolve_avx is used to save the first 8 vector registers, which only saves the lower 256 bits of vector register, for lazy binding. When it is called on AVX512 platform, the upper 256 bits of ZMM registers are clobbered. Parameters passed in ZMM registers will be wrong when the function is called the first time. This patch requires binutils 2.24, whose assembler can store and load ZMM registers, to build x86-64 glibc. Since mathvec library needs assembler support for AVX512DQ, we disable mathvec if assembler doesn't support AVX512DQ. [BZ #20139] * config.h.in (HAVE_AVX512_ASM_SUPPORT): Renamed to ... (HAVE_AVX512DQ_ASM_SUPPORT): This. * sysdeps/x86_64/configure.ac: Require assembler from binutils 2.24 or above. (HAVE_AVX512_ASM_SUPPORT): Removed. (HAVE_AVX512DQ_ASM_SUPPORT): New. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S: Make HAVE_AVX512_ASM_SUPPORT check unconditional. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/x86_64/multiarch/memcpy.S: Likewise. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memmove.S: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Check HAVE_AVX512DQ_ASM_SUPPORT instead of HAVE_AVX512_ASM_SUPPORT. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx51: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise.	2016-07-01 06:03:05 -07:00
Andrew Senkevich	ee2196bb67	Fixed wrong vector sincos/sincosf ABI to have it compatible with current vector function declaration "#pragma omp declare simd notinbranch", according to which vector sincos should have vector of pointers for second and third parameters. It is fixed with implementation as wrapper to version having second and third parameters as pointers. [BZ #20024] * sysdeps/x86/fpu/test-math-vector-sincos.h: New. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Fixed ABI of this implementation of vector function. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S: Likewise. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Use another wrapper for testing vector sincos with fixed ABI. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx.c: New test. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise. * sysdeps/x86_64/fpu/Makefile: Added new tests.	2016-07-01 14:15:38 +03:00
Joseph Myers	30dcf959d2	Avoid "inexact" exceptions in i386/x86_64 trunc functions (bug 15479). As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line trunc function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_trunc.S (__trunc): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_truncf.S (__truncf): Likewise. * sysdeps/i386/fpu/s_truncl.S (__truncl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_truncl.S (__truncl): Likewise. * math/libm-test.inc (trunc_test_data): Do not allow spurious "inexact" exceptions.	2016-06-27 17:26:52 +00:00
Joseph Myers	623629de06	Avoid "inexact" exceptions in i386/x86_64 floor functions (bug 15479). As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line floor function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_floor.S (__floor): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_floorf.S (__floorf): Likewise. * sysdeps/i386/fpu/s_floorl.S (__floorl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_floorl.S (__floorl): Likewise. * math/libm-test.inc (floor_test_data): Do not allow spurious "inexact" exceptions.	2016-06-27 17:25:47 +00:00
Joseph Myers	26b0bf9600	Avoid "inexact" exceptions in i386/x86_64 ceil functions (bug 15479). As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line ceil function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_ceil.S (__ceil): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_ceilf.S (__ceilf): Likewise. * sysdeps/i386/fpu/s_ceill.S (__ceill): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_ceill.S (__ceill): Likewise. * math/libm-test.inc (ceil_test_data): Do not allow spurious "inexact" exceptions.	2016-06-27 17:24:30 +00:00
Joseph Myers	40244be372	Fix i386/x86_64 scalbl with sNaN input (bug 20296). The x86_64 and i386 versions of scalbl return sNaN for some cases of sNaN input and are missing "invalid" exceptions for other cases. This results from overly complicated code that either returns a NaN input, or discards both inputs when one is NaN and loads a NaN from memory. This patch fixes this by simplifying the code to add the arguments when either one is NaN. Tested for x86_64 and x86. [BZ #20296] * sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Add arguments when either argument is a NaN. * sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise. * math/libm-test.inc (scalb_test_data): Add sNaN tests.	2016-06-23 22:17:41 +00:00
Joseph Myers	4e9bf327ad	Simplify x86 nearbyint functions. The i386 implementations of nearbyint functions, and x86_64 nearbyintl, contain code to mask the "inexact" exception. However, the fnstenv instruction has the effect of masking all exceptions, so this masking code has been redundant since fnstenv was added to those implementations (by commit 846d9a4a3acdb4939ca7bf6aed48f9f6f26911be; commit `71d1b0166b` added the test math/test-nearbyint-except-2.c that verifies these functions do work when called with "inexact" traps enabled); this patch removes the redundant code. Tested for x86_64 and x86. * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Do not mask "inexact" exceptions after fnstenv. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.	2016-06-22 15:40:30 +00:00
Andrew Senkevich	df2258c6cb	Added tests to ensure linkage through libmvec _finite aliases which are defined in libmvec_nonshared.a (bug 19654). [BZ #19654] sysdeps/x86_64/fpu/Makefile: Added new tests. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-main.c: New. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias.c: Likewise. * sysdeps/x86_64/fpu/test-libmvec-alias-mod.c: Likewise.	2016-06-20 21:15:50 +03:00
Joseph Myers	f4015c8a86	Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256). Some architectures have their own versions of fdim functions, which are missing errno setting (bug 6796) and may also return sNaN instead of qNaN for sNaN input, in the case of the x86 / x86_64 long double versions (bug 20256). These versions are not actually doing anything that a compiler couldn't generate, just straightforward comparisons / arithmetic (and, in the x86 / x86_64 case, testing for NaNs with fxam, which isn't actually needed once you use an unordered comparison and let the NaNs pass through the same subtraction as non-NaN inputs). This patch removes the x86 / x86_64 / powerpc versions, so that those architectures use the generic C versions, which correctly handle setting errno and deal properly with sNaN inputs. This seems better than dealing with setting errno in lots of .S versions. The i386 versions also return results with excess range and precision, which is not appropriate for a function exactly defined by reference to IEEE operations. For errno setting to work correctly on overflow, it's necessary to remove excess range with math_narrow_eval, which this patch duly does in the float and double versions so that the tests can reliably pass on x86. For float, this avoids any double rounding issues as the long double precision is more than twice that of float. For double, double rounding issues will need to be addressed separately, so this patch does not fully fix bug 20255. Tested for x86_64, x86 and powerpc. [BZ #6796] [BZ #20255] [BZ #20256] * math/s_fdim.c: Include <math_private.h>. (__fdim): Use math_narrow_eval on result. * math/s_fdimf.c: Include <math_private.h>. (__fdimf): Use math_narrow_eval on result. * sysdeps/i386/fpu/s_fdim.S: Remove file. * sysdeps/i386/fpu/s_fdimf.S: Likewise. * sysdeps/i386/fpu/s_fdiml.S: Likewise. * sysdeps/i386/i686/fpu/s_fdim.S: Likewise. * sysdeps/i386/i686/fpu/s_fdimf.S: Likewise. * sysdeps/i386/i686/fpu/s_fdiml.S: Likewise. * sysdeps/powerpc/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/fpu/s_fdimf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise. * sysdeps/x86_64/fpu/s_fdiml.S: Likewise. * math/libm-test.inc (fdim_test_data): Expect errno setting on overflow. Add sNaN tests.	2016-06-14 16:04:19 +00:00
Joseph Myers	f00faa4a43	Fix i386/x86_64 log2l (sNaN) (bug 20235). The i386/x86_64 versions of log2l return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20235] * sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Likewise. * math/libm-test.inc (log2_test_data): Add sNaN tests.	2016-06-09 18:04:30 +00:00
Joseph Myers	8c010e2f71	Fix i386/x86_64 log1pl (sNaN) (bug 20229). The i386/x86_64 versions of log1pl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20229] * sysdeps/i386/fpu/s_log1pl.S (__log1pl): Add NaN input to itself. * sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Likewise. * math/libm-test.inc (log1p_test_data): Add sNaN tests.	2016-06-08 23:11:42 +00:00
Joseph Myers	09096b3615	Fix i386/x86_64 log10l (sNaN) (bug 20228). The i386/x86_64 versions of log10l return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20228] * sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Likewise. * math/libm-test.inc (log10_test_data): Add sNaN tests.	2016-06-08 22:59:18 +00:00
Joseph Myers	df179d8808	Fix i386/x86_64 logl (sNaN) (bug 20227). The i386/x86_64 versions of logl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86 (including a build for i586 to cover the non-i686 logl version). [BZ #20227] * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Add NaN input to itself. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise. * math/libm-test.inc (log_test_data): Add sNaN tests.	2016-06-08 22:24:06 +00:00
Joseph Myers	9bd3ef8e19	Fix i386/x86_64 expl, exp10l, expm1l for sNaN input (bug 20226). The i386 and x86_64 implementations of expl, exp10l and expm1l (code shared between the functions) return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20226] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Add NaN argument to itself. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise. * math/libm-test.inc (exp_test_data): Add sNaN tests. (exp10_test_data): Likewise. (expm1_test_data): Likewise.	2016-06-08 21:55:06 +00:00
Joseph Myers	5ff81530dd	Do not raise "inexact" from x86_64 SSE4.1 ceil, floor (bug 15479). Continuing fixes for ceil and floor functions not to raise the "inexact" exception, this patch fixes the x86_64 SSE4.1 versions. The roundss / roundsd instructions take an immediate operand that determines the rounding mode and whether to raise "inexact"; this just needs bit 3 set to disable "inexact", which this patch does. Remark: we don't have an SSE4.1 version of trunc / truncf (using this instruction with operand 11); I'd expect one to make sense, but of course it should be benchmarked against the existing C code. I'll file a bug in Bugzilla for the lack of such a version. Tested for x86_64. [BZ #15479] * sysdeps/x86_64/fpu/multiarch/s_ceil.S (__ceil_sse41): Set bit 3 of immediate operand to rounding instruction. * sysdeps/x86_64/fpu/multiarch/s_ceilf.S (__ceilf_sse41): Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.S (__floor_sse41): Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf_sse41): Likewise.	2016-05-24 21:11:18 +00:00
Joseph Myers	c898991d8b	Fix x86_64 / x86 powl inaccuracy for integer exponents (bug 19848). Bug 19848 reports cases where powl on x86 / x86_64 has error accumulation, for small integer exponents, larger than permitted by glibc's accuracy goals, at least in some rounding modes. This patch further restricts the exponent range for which the small-integer-exponent logic is used to limit the possible error accumulation. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #19848] * sysdeps/i386/fpu/e_powl.S (p3): Rename to p2 and change value from 8 to 4. (__ieee754_powl): Compare integer exponent against 4 not 8. * sysdeps/x86_64/fpu/e_powl.S (p3): Rename to p2 and change value from 8 to 4. (__ieee754_powl): Compare integer exponent against 4 not 8. * math/auto-libm-test-in: Add more tests of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2016-03-24 01:32:52 +00:00
H.J. Lu	86ed888255	Use JUMPTARGET in x86-64 mathvec When PLT may be used, JUMPTARGET should be used instead calling the function directly. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S (_ZGVbN2v_cos_sse4): Use JUMPTARGET to call cos. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S (_ZGVdN4v_cos_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S (_ZGVdN4v_cos): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S (_ZGVbN2v_exp_sse4): Use JUMPTARGET to call exp. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S (_ZGVdN4v_exp_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S (_ZGVdN4v_exp): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S (_ZGVbN2v_log_sse4): Use JUMPTARGET to call log. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S (_ZGVdN4v_log_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S (_ZGVdN4v_log): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S (_ZGVbN2vv_pow_sse4): Use JUMPTARGET to call pow. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S (_ZGVdN4vv_pow_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S (_ZGVdN4vv_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S (_ZGVbN2v_sin_sse4): Use JUMPTARGET to call sin. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S (_ZGVdN4v_sin_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S (_ZGVdN4v_sin): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S (_ZGVbN2vvv_sincos_sse4): Use JUMPTARGET to call sin and cos. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S (_ZGVdN4vvv_sincos_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S (_ZGVdN4vvv_sincos): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S (_ZGVdN8v_cosf): Use JUMPTARGET to call cosf. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S (_ZGVbN4v_cosf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S (_ZGVdN8v_cosf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S (_ZGVdN8v_expf): Use JUMPTARGET to call expf. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S (_ZGVbN4v_expf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S (_ZGVdN8v_expf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S (_ZGVdN8v_logf): Use JUMPTARGET to call logf. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S (_ZGVbN4v_logf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S (_ZGVdN8v_logf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S (_ZGVdN8vv_powf): Use JUMPTARGET to call powf. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S (_ZGVbN4vv_powf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S (_ZGVdN8vv_powf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S (_ZGVdN8vv_powf): Use JUMPTARGET to call sinf and cosf. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S (_ZGVbN4vvv_sincosf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S (_ZGVdN8vvv_sincosf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S (_ZGVdN8v_sinf): Use JUMPTARGET to call sinf. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S (_ZGVbN4v_sinf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S (_ZGVdN8v_sinf_avx2): Likewise. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h (WRAPPER_IMPL_SSE2): Use JUMPTARGET to call callee. (WRAPPER_IMPL_SSE2_ff): Likewise. (WRAPPER_IMPL_SSE2_fFF): Likewise. (WRAPPER_IMPL_AVX): Likewise. (WRAPPER_IMPL_AVX_ff): Likewise. (WRAPPER_IMPL_AVX_fFF): Likewise. (WRAPPER_IMPL_AVX512): Likewise. (WRAPPER_IMPL_AVX512_ff): Likewise. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h (WRAPPER_IMPL_SSE2): Likewise. (WRAPPER_IMPL_SSE2_ff): Likewise. (WRAPPER_IMPL_SSE2_fFF): Likewise. (WRAPPER_IMPL_AVX): Likewise. (WRAPPER_IMPL_AVX_ff): Likewise. (WRAPPER_IMPL_AVX_fFF): Likewise. (WRAPPER_IMPL_AVX512): Likewise. (WRAPPER_IMPL_AVX512_ff): Likewise. (WRAPPER_IMPL_AVX512_fFF): Likewise.	2016-03-16 14:24:19 -07:00
Andrew Senkevich	a5df3210a6	Use PIC relocation in ALIAS_IMPL Since libmvec_nonshared.a may be linked into shared objects, ALIAS_IMPL should use PIC relocation. [BZ #19590] * sysdeps/x86_64/fpu/svml_finite_alias.S (ALIAS_IMPL): Use PIC relocation.	2016-02-17 14:23:32 -08:00
Joseph Myers	f7a9f785e5	Update copyright dates with scripts/update-copyrights.	2016-01-04 16:05:18 +00:00
Andrew Senkevich	977a30801f	Better workaround for aliases of _finite symbols in vector math library. Old workaround based on assembly aliases can lead to link fail (bug 19058). This patch makes workaround in another way to avoid it. [BZ #19058] math/Makefile ($(inst_libdir)/libm.so): Added libmvec_nonshared.a to AS_NEEDED. * sysdeps/x86/fpu/bits/math-vector.h: Removed code with old workaround. * sysdeps/x86_64/fpu/Makefile (libmvec-support, libmvec-static-only-routines): Added new file. * sysdeps/x86_64/fpu/svml_finite_alias.S: New file.	2015-11-27 16:22:26 +03:00
Joseph Myers	01189b083b	Fix i386/x86_64 log* (1) zero sign for -ffinite-math-only (bug 19213). For the -ffinite-math-only versions of various x86_64 and x86 log* functions, a zero result from log* (1) is returned with incorrect sign in round-downward mode. This patch fixes this in a similar way to the previous fixes for the non-_finite versions of the functions. Tested for x86_64 and x86 (including an i586 build), together with a patch that will be applied separately to enable the main libm-test.inc tests for the finite-math-only functions. [BZ #19213] sysdeps/i386/fpu/e_log.S (__log_finite): Ensure +0 is always returned for argument 1. * sysdeps/i386/fpu/e_logf.S (__logf_finite): Likewise. * sysdeps/i386/fpu/e_logl.S (__logl_finite): Likewise. * sysdeps/i386/i686/fpu/e_logl.S (__logl_finite): Likewise. * sysdeps/x86_64/fpu/e_log10l.S (__log10l_finite): Likewise. * sysdeps/x86_64/fpu/e_log2l.S (__log2l_finite): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__logl_finite): Likewise.	2015-11-05 21:56:31 +00:00
Joseph Myers	2145f97cee	Handle more state in i386/x86_64 fesetenv (bug 16068). fenv_t should include architecture-specific floating-point modes and status flags. i386 and x86_64 fesetenv limit which bits they use from the x87 status and control words, when using saved state, and limit which parts of the state they set to fixed values, when using FE_DFL_ENV / FE_NOMASK_ENV. The following should be included but are excluded in at least some cases: status and masking for the "denormal operand" exception (which isn't part of FE_ALL_EXCEPT); precision control (explicitly mentioned in Annex F as something that counts as part of the floating-point environment); MXCSR FZ and DAZ bits (for FE_DFL_ENV and FE_NOMASK_ENV). This patch arranges for this extra state to be handled by fesetenv (and thereby by feupdateenv, which calls fesetenv). (Note that glibc functions using floating point are not generally expected to work correctly with non-default values of this state, especially precision control, but it is still logically part of the floating-point environment and should be handled as such by fesetenv. Changes to the state relating to subnormals ought generally to work with libm functions when the arguments aren't subnormal and neither are the expected results; that's a consequence of functions avoiding spurious internal underflows.) A question arising from this is whether FE_NOMASK_ENV should or should not mask the "denormal operand" exception. I decided it should mask that exception. This is the status quo - previously that exception could only be unmasked by direct manipulation of control registers (possibly via <fpu_control.h>). In addition, it means that use of FE_NOMASK_ENV leaves a floating-point environment the same as could be obtained by fesetenv (FE_DFL_ENV); feenableexcept (FE_ALL_EXCEPT);, rather than an environment in which an exception is unmasked that could only be masked again by using fesetenv with FE_DFL_ENV (or a previously saved environment) - this exception not being usable with other <fenv.h> functions because it's outside FE_ALL_EXCEPT. Tested for x86_64 and x86. [BZ #16068] * sysdeps/i386/fpu/fesetenv.c: Include <fpu_control.h>. (FE_ALL_EXCEPT_X86): New macro. (__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of FE_ALL_EXCEPT. Ensure precision control is included in floating-point state. Ensure that FE_DFL_ENV and FE_NOMASK_ENV handle "denormal operand exception" and clear FZ and DAZ bits. * sysdeps/x86_64/fpu/fesetenv.c: Include <fpu_control.h>. (FE_ALL_EXCEPT_X86): New macro. (__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of FE_ALL_EXCEPT. Ensure precision control is included in floating-point state. Ensure that FE_DFL_ENV and FE_NOMASK_ENV handle "denormal operand exception" and clear FZ and DAZ bits. * sysdeps/x86/fpu/test-fenv-sse-2.c: New file. * sysdeps/x86/fpu/test-fenv-x87.c: Likewise. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add test-fenv-x87 and test-fenv-sse-2. [$(subdir) = math] (CFLAGS-test-fenv-sse-2.c): New variable.	2015-10-28 22:58:29 +00:00
Joseph Myers	0b9af583a5	Fix i386/x86_64 fesetenv SSE exception clearing (bug 19181). The i386 and x86_64 versions of fesetenv, when called with FE_DFL_ENV or FE_NOMASK_ENV as argument, do not clear SSE exceptions raised in MXCSR. These arguments should, like other fenv_t values, represent the whole of the floating-point state, so such exceptions should be cleared; this patch adds the required clearing. (Discovered while working on bug 16068.) Tested for x86_64 and x86. [BZ #19181] * sysdeps/i386/fpu/fesetenv.c (__fesetenv): Clear already-raised SSE exceptions when argument is FE_DFL_ENV or FE_NOMASK_ENV. * sysdeps/x86_64/fpu/fesetenv.c (__fesetenv): Likewise. * math/test-fenv-clear-main.c: New file. * math/test-fenv-clear.c: Likewise. * math/Makefile (tests): Add test-fenv-clear. * sysdeps/x86/fpu/test-fenv-clear-sse.c: New file. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add test-fenv-clear-sse. [$(subdir) = math] (CFLAGS-test-fenv-clear-sse.c): New variable.	2015-10-28 18:50:20 +00:00
Florian Weimer	e39edb0552	x86_64: Regenerate ulps [BZ #19168 ] This comes from running “make regen-ulps” on AMD Opteron 6272 CPUs.	2015-10-26 13:15:48 +01:00
Joseph Myers	9d1687b2df	Add more libm tests (ilogb, is, j0, j1, jn, lgamma, log). This patch improves the libm test coverage for a few more functions. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of log, log10, log1p and log2. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (MAX_EXP): New macro. (ilogb_test_data): Add more tests. (isfinite_test_data): Likewise. (isgreater_test_data): Likewise. (isgreaterequal_test_data): Likewise. (isinf_test_data): Likewise. (isless_test_data): Likewise. (islessequal_test_data): Likewise. (islessgreater_test_data): Likewise. (isnan_test_data): Likewise. (isnormal_test_data): Likewise. (issignaling_test_data): Likewise. (isunordered_test_data): Likewise. (j0_test_data): Likewise. (j1_test_data): Likewise. (jn_test_data): Likewise. (lgamma_test_data): Likewise. (log_test_data): Likewise. (log10_test_data): Likewise. (log1p_test_data): Likewise. (log2_test_data): Likewise. (logb_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-10-23 22:46:05 +00:00
Joseph Myers	846d9a4a3a	Fix i386 / x86_64 nearbyint exception clearing (bug 15491). The implementations of nearbyint functions using x87 floating point (i386 all versions, x86_64 long double only) use the fclex instruction, which clears any exceptions that were raised before the function was called. These functions must not clear exceptions that were raised before they were called. This patch fixes these functions to save and restore the whole floating-point environment (fnstenv / fldenv) as the way of avoiding raising "inexact" (recall that there isn't an x87 instruction for loading just the status word, so the whole environment has to be saved and loaded instead - the code already saved and loaded the control word, which is now obtained from the saved environment after this patch, to disable traps on "inexact"). In the case of the long double functions, any "invalid" exception from frndint (applied to a signaling NaN) needs merging into the saved state; this issue doesn't apply to the float and double functions because that exception would have been raised when the argument is loaded, before the environment is saved. [BZ #15491] * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Save and restore floating-point environment instead of clearing all exceptions. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise, merging in "invalid" exceptions from frndint. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * math/test-nearbyint-except.c: New file. * math/Makefile (tests): Add test-nearbyint-except.	2015-10-22 23:14:55 +00:00
H.J. Lu	4b71ce6c1a	Update lrint/lrintf/lrintl for x32 The x86_64 versions of lrint/lrintf/ lrintl are aliases for the long long versions which isn't correct for x32, where exceptions must respect overflow for 32-bit long. Separate versions of the long functions for x32 that convert to 32-bit long and raise the right exceptions for that conversion, while keeping the aliases in the non-x32 case. Tested on x86_64 and x32. There are no code changes in libm.so on x86_64. * sysdeps/x86_64/fpu/s_llrint.S (__lrint): Add alias only if __ILP32__ isn't defined. (lrint): Likewise. * sysdeps/x86_64/fpu/s_llrintf.S (__lrintf): Likewise. (lrintf): Likewise. * sysdeps/x86_64/fpu/s_llrintl.S (__lrintl): Likewise. (lrintl): Likewise. * sysdeps/x86_64/x32/fpu/s_lrint.S: New file. * sysdeps/x86_64/x32/fpu/s_lrintf.S: Likewise. * sysdeps/x86_64/x32/fpu/s_lrintl.S: Likewise.	2015-10-09 11:42:10 -07:00
Joseph Myers	b7848899a5	Remove configure tests for FMA4 support. GCC added support for -mfma4 in version 4.5. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_fma4): Remove configure test. * sysdeps/i386/configure: Regenerated. * sysdeps/x86_64/configure.ac (libc_cv_cc_fma4): Remove configure test. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile [$(have-mfma4) = yes]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT]: Likewise. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * config.h.in (HAVE_FMA4_SUPPORT): Remove #undef.	2015-10-09 16:02:54 +00:00
Joseph Myers	1b12cd7f4d	Remove configure tests for AVX support. GCC added support for -mavx and -msse2avx in version 4.4. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_avx): Remove configure test. (libc_cv_cc_sse2avx): Likewise. * sysdeps/i386/configure: Regenerated. * sysdeps/i386/i686/multiarch/Makefile [$(subdir)$(config-cflags-avx) = mathyes]: Change conditional to [$(subdir) = math]. * sysdeps/i386/i686/multiarch/s_fma-fma.c [HAVE_AVX_SUPPORT]: Make code unconditional. * sysdeps/i386/i686/multiarch/s_fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/i386/i686/multiarch/s_fmaf-fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/i386/i686/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/configure.ac (libc_cv_cc_avx): Remove configure test. (libc_cv_cc_sse2avx): Likewise. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/Makefile [$(config-cflags-avx) = yes]: Make code unconditional. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_profile) [HAVE_AVX_SUPPORT \|\| HAVE_AVX512_ASM_SUPPORT]: Make code unconditional. (_dl_runtime_profile) [!(HAVE_AVX_SUPPORT \|\| HAVE_AVX512_ASM_SUPPORT)]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/Makefile [$(config-cflags-sse2avx) = yes]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/multiarch/strcmp.S [HAVE_AVX_SUPPORT]: Likewise. * config.h.in (HAVE_AVX_SUPPORT): Remove #undef. (HAVE_SSE2AVX_SUPPORT): Likewise.	2015-10-08 15:59:32 +00:00
Paul Pluzhnikov	57352c2201	sysdeps/x86_64/fpu/libm-test-ulps: Regenerated on Haswell.	2015-10-03 11:31:21 -07:00
Joseph Myers	93e448cbed	Improve test coverage of real libm functions [a-e]. This patch improves test coverage of the real libm functions [a-e], ensuring that special cases and ranges of input values of potential significance (such as close to overflow and underflow thresholds) are more systematically covered. This is a followup to <https://sourceware.org/ml/libc-alpha/2013-12/msg00757.html> which covered [a-c]* (however, I found more weaknesses in the coverage of those functions when preparing this patch, hence the additional tests being added for them here). Addition of a test for acosh (-qNaN) is temporarily deferred, to be included as part of a fix for bug 19032 which was discovered in the course of adding these tests (and which illustrates the use of testing -qNaN as well as +qNaN as input even to functions for which the sign of a NaN isn't meant to be significant). Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acos, acosh, asin, atan, atan2, atanh, cbrt, cos, cosh, erf, erfc, exp, exp10, exp2 and expm1. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (acos_test_data): Add more tests. (asin_test_data): Likewise. (asinh_test_data): Likewise. (atan_test_data): Likewise. (atanh_test_data): Likewise. (atan2_test_data): Likewise. (cbrt_test_data): Likewise. (ceil_test_data): Likewise. (copysign_test_data): Likewise. (cos_test_data): Likewise. (cosh_test_data): Likewise. (erf_test_data): Likewise. (erfc_test_data): Likewise. (exp_test_data): Likewise. (exp10_test_data): Likewise. (exp2_test_data): Likewise. (expm1_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-09-30 18:06:02 +00:00
Joseph Myers	a5721ebc68	Fix clog, clog10 inaccuracy (bug 19016). For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids cancellation error and then using log1p. However, the thresholds for using that approach still result in log being used on argument as large as sqrt(13/16) > 0.9, leading to significant errors, in some cases above the 9ulp maximum allowed in glibc libm. This patch arranges for the approach using log1p to be used in any cases where \|X\|, \|Y\| < 1 and X^2 + Y^2 >= 0.5 (with the existing allowance for cases where one of X and Y is very small), adjusting the __x2y2m1 functions to work with the wider range of inputs. This way, log only gets used on arguments below sqrt(1/2) (or substantially above 1), where the error involved is much less. Tested for x86_64, x86, mips64 and powerpc. For the ulps regeneration I removed the existing clog and clog10 ulps before regenerating to allow any reduced ulps to appear. Tests added include those found by random test generation to produce large ulps either before or after the patch, and some found by trying inputs close to the (0.75, 0.5) threshold where the potential errors from using log are largest. [BZ #19016] * sysdeps/generic/math_private.h (__x2y2m1f): Update comment to allow more cases with X^2 + Y^2 >= 0.5. * sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment. * sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0] (__x2y2m1): Update comment. * sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * math/s_clog.c (__clog): Handle more cases using log1p without hypot. * math/s_clog10.c (__clog10): Likewise. * math/s_clog10f.c (__clog10f): Likewise. * math/s_clog10l.c (__clog10l): Likewise. * math/s_clogf.c (__clogf): Likewise. * math/s_clogl.c (__clogl): Likewise. * math/auto-libm-test-in: Add more tests of clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-28 22:11:22 +00:00
Joseph Myers	fa752c6981	Fix powf inaccuracy (bug 18956). The flt-32 version of powf can be inaccurate because of bugs in the extra-precision calculation of (x-1)/(x+1) or (x-1.5)/(x+1.5) as part of calculating log(x) with extra precision: a constant used (as part of adding 1 or 1.5 through integer arithmetic) is incorrect, and then the code fails to mask a computed high part before using it in arithmetic that relies on s_ht_h being exactly representable. This patch fixes these bugs. Tested for x86_64 and x86. x86_64 ulps for powf removed and regenerated to reflect reduced ulps from the increased accuracy for existing tests. [BZ #18956] sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Add 0x00400000 not 0x0040000 for high bit of mantissa. Mask with 0xfffff000 when extracting high part. * math/auto-libm-test-in: Add another test of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-09-26 00:27:06 +00:00
Joseph Myers	6ace393821	Fix pow missing underflows (bug 18825). Similar to various other bugs in this area, pow functions can fail to raise the underflow exception when the result is tiny and inexact but one or more low bits of the intermediate result that is scaled down (or, in the i386 case, converted from a wider evaluation format) are zero. This patch forces the exception in a similar way to previous fixes, thereby concluding the fixes for known bugs with missing underflow exceptions currently filed in Bugzilla. Tested for x86_64, x86, mips64 and powerpc. [BZ #18825] * sysdeps/i386/fpu/i386-math-asm.h (FLT_NARROW_EVAL_UFLOW_NONNAN): New macro. (DBL_NARROW_EVAL_UFLOW_NONNAN): Likewise. (LDBL_CHECK_FORCE_UFLOW_NONNAN): Likewise. * sysdeps/i386/fpu/e_pow.S: Use DEFINE_DBL_MIN. (__ieee754_pow): Use DBL_NARROW_EVAL_UFLOW_NONNAN instead of DBL_NARROW_EVAL, reloading the PIC register as needed. * sysdeps/i386/fpu/e_powf.S: Use DEFINE_FLT_MIN. (__ieee754_powf): Use FLT_NARROW_EVAL_UFLOW_NONNAN instead of FLT_NARROW_EVAL. Use separate return path for case when first argument is NaN. * sysdeps/i386/fpu/e_powl.S: Include <i386-math-asm.h>. Use DEFINE_LDBL_MIN. (__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN, reloading the PIC register. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Force underflow for subnormal result. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Use math_check_force_underflow_nonneg. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Use math_check_force_underflow. * sysdeps/x86_64/fpu/x86_64-math-asm.h (LDBL_CHECK_FORCE_UFLOW_NONNAN): New macro. * sysdeps/x86_64/fpu/e_powl.S: Include <x86_64-math-asm.h>. Use DEFINE_LDBL_MIN. (__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN. * math/auto-libm-test-in: Add more tests of pow. * math/auto-libm-test-out: Regenerated.	2015-09-25 22:29:10 +00:00
Joseph Myers	b2a64460ba	Refactor x86_64 libm code forcing underflow exceptions. This patch refactors code in sysdeps/x86_64/fpu that forces underflow exceptions and closely follows corresponding i386 code to use common macros in x86_64-math-asm.h for that purpose. This is mainly about keeping the code similar to the i386 code as far as possible, since each macro apart from DEFINE_LDBL_MIN ends up used only once. It would be possible to do a further refactoring to share these macros between i386 and x86_64 (with i386 using the fcomip / fucomip versions when building for i686 and above), but I have no immediate plans to do so. Tested for x86_64. * sysdeps/x86_64/fpu/x86_64-math-asm.h: New file. * sysdeps/x86_64/fpu/e_exp2l.S: Include <x86_64-math-asm.h>. (ldbl_min): Replace with use of DEFINE_LDBL_MIN. (__ieee754_exp2l): Use LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN. * sysdeps/x86_64/fpu/e_expl.S: Include <x86_64-math-asm.h>. [!USE_AS_EXPM1L] (cmin): Replace with use of DEFINE_LDBL_MIN. (IEEE754_EXPL): Use LDBL_CHECK_FORCE_UFLOW_NONNEG.	2015-09-24 22:25:30 +00:00
Joseph Myers	51df260506	Fix x86_64 fma4 pow inappropriate contraction (bug 19003). The x86_64 fma4 version of pow fails to disable contraction of operations other than those explicitly intended to use fma instructions, so resulting in large ulps errors on processors with fma4 instructions, as in bug 18104 (165ulp for the test added for that bug; error originally reported by "blaaa" on #glibc). This patch adds $(config-cflags-nofma) for e_pow-fma4.c, corresponding to the use for e_pow.c in sysdeps/ieee754/dbl-64/Makefile. Tested for x86_64 on a processor with fma4. [BZ #19003] * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma4.c): Add $(config-cflags-nofma).	2015-09-24 16:48:32 +00:00
Joseph Myers	da2f4f2dd5	Make scalbn set errno (bug 6803). As noted in bug 6803, scalbn fails to set errno on overflow and underflow. This patch fixes this by making scalbn an alias of ldexp, which has exactly the same semantics (for floating-point types with radix 2) and already has wrappers that deal with setting errno, instead of an alias of the internal __scalbn (which ldexp calls). Notes: * Where compat symbols were defined for scalbn functions, I didn't change what they point to (to keep the patch minimal), so such compat symbols continue to go directly to the non-errno-setting functions. * Mike, I didn't do anything with the IA64 versions of these functions, where I think both the ldexp and scalbn functions already deal with setting errno. As a cleanup (not needed to fix this bug) however you might want to make those functions into aliases for IA64; there is no need for them to be separate function implementations at all. * This concludes the fix for bug 6803 since the scalb and scalbln cases of that bug were fixed some time ago. Tested for x86_64, x86, mips64 and powerpc. [BZ #6803] * math/s_ldexp.c (scalbn): Define as weak alias of __ldexp. [NO_LONG_DOUBLE] (scalbnl): Define as weak alias of __ldexp. * math/s_ldexpf.c (scalbnf): Define as weak alias of __ldexpf. * math/s_ldexpl.c (scalbnl): Define as weak alias of __ldexpl. * sysdeps/i386/fpu/s_scalbn.S (scalbn): Remove alias. * sysdeps/i386/fpu/s_scalbnf.S (scalbnf): Likewise. * sysdeps/i386/fpu/s_scalbnl.S (scalbnl): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (scalbn): Likewise. [NO_LONG_DOUBLE] (scalbnl): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (scalbn): Likewise. [NO_LONG_DOUBLE] (scalbnl): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (scalbnf): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (scalbnl): Remove long_double_symbol calls. * sysdeps/ieee754/ldbl-64-128/s_scalbnl.c (scalbnl): Likewise. * sysdeps/ieee754/ldbl-opt/s_ldexpl.c (__ldexpl_2): Define as strong alias of __ldexpl. (scalbnl): Define using long_double_symbol. * sysdeps/m68k/m680x0/fpu/s_scalbn.c (__CONCATX(scalbn,suffix)): Remove alias. * sysdeps/sparc/sparc64/soft-fp/s_scalbnl.c (scalbnl): Likewise. * sysdeps/x86_64/fpu/s_scalbnl.S (scalbnl): Likewise. * math/libm-test.inc (scalbn_test_data): Add errno expectations. (scalbln_test_data): Add more errno expectations.	2015-09-16 21:11:00 +00:00
Joseph Myers	903af5af9a	Fix exp2 missing underflows (bug 16521). Various exp2 implementations in glibc can miss underflow exceptions when the scaling down part of the calculation is exact (or, in the x86 case, when the conversion from extended precision to the target precision is exact). This patch forces the exception in a similar way to previous fixes. The x86 exp2f changes may in fact not be needed for this purpose - it's likely to be the case that no argument of type float has an exp2 result so close to an exact subnormal float value that it equals that value when rounded to 64 bits (even taking account of variation between different x86 implementations). However, they are included for consistency with the changes to exp2 and so as to fix the exp2f part of bug 18875 by ensuring that excess range and precision is removed from underflowing return values. Tested for x86_64, x86 and mips64. [BZ #16521] [BZ #18875] * math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for small results. * sysdeps/i386/fpu/e_exp2.S (dbl_min): New object. (MO): New macro. (__ieee754_exp2): For small results, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/e_exp2f.S (flt_min): New object. (MO): New macro. (__ieee754_exp2f): For small results, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object. (MO): New macro. (__ieee754_exp2l): Force underflow exception for small results. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise. * sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object. (MO): New macro. (__ieee754_exp2l): Force underflow exception for small results. * math/auto-libm-test-in: Add more tests or exp2. * math/auto-libm-test-out: Regenerated.	2015-09-14 22:00:12 +00:00
Joseph Myers	a1f99ba28b	Add more random libm test inputs (mainly for ldbl-128). This patch adds more libm test inputs found through random test generation to increase previously known ulps. This particular test generation was run for mips64, so most of the increased ulps are for ldbl-128 (float and double having been fairly well covered by such testing for x86_64), but there's the odd ulps increase for other formats. Tested for x86_64, x86 and mips64. * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cabs, carg, cos, csqrt, erfc, exp, exp10, exp2, log, log1p, log2, pow, sin, sincos, sinh, tan and tanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-12 00:01:38 +00:00
Joseph Myers	00a7073c38	Add more randomly-generated libm tests. This patch adds more libm test inputs found through random test generation to increase observed ulps on x86_64. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acosh, atanh, cbrt, cosh, csqrt, erfc, expm1 and lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-11 15:03:10 +00:00
Joseph Myers	050f29c188	Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558). The existing implementations of lgamma functions (except for the ia64 versions) use the reflection formula for negative arguments. This suffers large inaccuracy from cancellation near zeros of lgamma (near where the gamma function is +/- 1). This patch fixes this inaccuracy. For arguments above -2, there are no zeros and no large cancellation, while for sufficiently large negative arguments the zeros are so close to integers that even for integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation is not significant. Thus, it is only necessary to take special care about cancellation for arguments around a limited number of zeros. Accordingly, this patch uses precomputed tables of relevant zeros, expressed as the sum of two floating-point values. The log of the ratio of two sines can be computed accurately using log1p in cases where log would lose accuracy. The log of the ratio of two gamma(1-x) values can be computed using Stirling's approximation (the difference between two values of that approximation to lgamma being computable without computing the two values and then subtracting), with appropriate adjustments (which don't reduce accuracy too much) in cases where 1-x is too small to use Stirling's approximation directly. In the interval from -3 to -2, using the ratios of sines and of gamma(1-x) can still produce too much cancellation between those two parts of the computation (and that interval is also the worst interval for computing the ratio between gamma(1-x) values, which computation becomes more accurate, while being less critical for the final result, for larger 1-x). Because this can result in errors slightly above those accepted in glibc, this interval is instead dealt with by polynomial approximations. Separate polynomial approximations to (\|gamma(x)\|-1)(x-n)/(x-x0) are used for each interval of length 1/8 from -3 to -2, where n (-3 or -2) is the nearest integer to the 1/8-interval and x0 is the zero of lgamma in the relevant half-integer interval (-3 to -2.5 or -2.5 to -2). Together, the two approaches are intended to give sufficient accuracy for all negative arguments in the problem range. Outside that range, the previous implementation continues to be used. Tested for x86_64, x86, mips64 and powerpc. The mips64 and powerpc testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm with large negative arguments giving spurious "invalid" exceptions (exposed by newly added tests for cases this patch doesn't affect the logic for); I'll address those problems separately. [BZ #2542] [BZ #2543] [BZ #2558] * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call __lgamma_neg for arguments from -28.0 to -2.0. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call __lgamma_negf for arguments from -15.0 to -2.0. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -33.0 to -2.0. * sysdeps/ieee754/dbl-64/lgamma_neg.c: New file. * sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise. * sysdeps/generic/math_private.h (__lgamma_negf): New prototype. (__lgamma_neg): Likewise. (__lgamma_negl): Likewise. (__lgamma_product): Likewise. (__lgamma_productl): Likewise. * math/Makefile (libm-calls): Add lgamma_neg and lgamma_product. * math/auto-libm-test-in: Add more tests of lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-10 22:27:58 +00:00
Paul Pluzhnikov	db7f8c8fe0	Regenerated sysdeps/x86_64/fpu/libm-test-ulps with AVX2.	2015-08-14 09:59:04 -07:00
Siddhesh Poyarekar	37dd6a19ca	Remove incorrect register mov in floorf/nearbyint on x86_64 The change in `0b5395f052` replaced calls to __get_cpu_features@plt followed by a mov from rax to rdx, with a single macro LOAD_RTLD_GLOBAL_RO_RDX. It is pretty clear that there was a typo in s_floorf and __nearbyint due to which the (now incorrect) mov was not removed. This patch removes that mov. * sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf): Remove unnecessary movq. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S (__nearbyint): Likewise.	2015-08-14 05:30:17 -07:00
Joseph Myers	3ba0ac10fa	Add more random libm-test inputs. This patch adds more test inputs to various libm functions found through random generation to have larger ulps errors than previously listed in libm-test-ulp, on at least one of x86_64 and x86. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc, exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-08-13 23:23:23 +00:00
H.J. Lu	1dfa4a94ae	Update libmvec multiarch functions for <cpu-features.h> This patch updates libmvec multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>. * math/Makefile ($(addprefix $(objpfx), $(libm-vec-tests))): Remove $(objpfx)init-arch.o. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Remove init-arch. * sysdeps/x86_64/fpu/math-tests-arch.h (avx_usable): Removed. (INIT_ARCH_EXT): Defined as empty. (CHECK_ARCH_EXT): Replace HAS_XXX with HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Remove __init_cpu_features call. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: Likewise.	2015-08-13 03:41:47 -07:00
H.J. Lu	0b5395f052	Update x86_64 multiarch functions for <cpu-features.h> This patch updates x86_64 multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>. * sysdeps/x86_64/fpu/multiarch/e_asin.c: Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.S: Use LOAD_RTLD_GLOBAL_RO_RDX and HAS_CPU_FEATURE (SSE4_1). * sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S : Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.S : Likewise. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/x86_64/multiarch/sched_cpucount.c: Likewise. * sysdeps/x86_64/multiarch/strstr.c: Likewise. * sysdeps/x86_64/multiarch/memmove.c: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise. * sysdeps/x86_64/multiarch/test-multiarch.c: Likewise. * sysdeps/x86_64/multiarch/memcmp.S: Remove __init_cpu_features call. Add LOAD_RTLD_GLOBAL_RO_RDX. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/multiarch/memcpy.S: Likewise. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/multiarch/strcat.S: Likewise. * sysdeps/x86_64/multiarch/strchr.S: Likewise. * sysdeps/x86_64/multiarch/strcmp.S: Likewise. * sysdeps/x86_64/multiarch/strcpy.S: Likewise. * sysdeps/x86_64/multiarch/strcspn.S: Likewise. * sysdeps/x86_64/multiarch/strspn.S: Likewise. * sysdeps/x86_64/multiarch/wcscpy.S: Likewise. * sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.	2015-08-13 03:41:30 -07:00
Joseph Myers	4afe4b20ce	Add more tests of various libm functions. This patch adds more tests of various libm functions found through random test generation to give increased ulps on 32-bit x86. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acosh, asin, asinh, atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10, expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-08-11 00:58:28 +00:00
H.J. Lu	72354ab5e1	Align stack to 16 bytes when calling __errno_location We should align stack to 16 bytes when calling __errno_location. [BZ #18661] * sysdeps/x86_64/fpu/s_cosf.S (__cosf): Align stack to 16 bytes when calling __errno_location. * sysdeps/x86_64/fpu/s_sincosf.S (__sincosf): Likewise. * sysdeps/x86_64/fpu/s_sinf.S (__sinf): Likewise.	2015-08-05 08:36:27 -07:00
Andrew Senkevich	a9e8ea51cc	Prevent runtime fail of SSE vector math tests on non SSE4.1 machine. [BZ #18740] * sysdeps/x86_64/fpu/Makefile (double-vlen2-arch-ext-cflags, float-vlen4-arch-ext-cflags): Removed. * math/Makefile (CFLAGS-test-double-vlen2-wrappers.c, CFLAGS-test-float-vlen4-wrappers.c): Likewise.	2015-07-30 18:00:24 +03:00
Andrew Senkevich	febce2ac5f	Added runtime check for AVX vector math tests. [BZ #18731] * sysdeps/x86_64/fpu/math-tests-arch.h: Added AVX runtime check. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.	2015-07-29 19:47:29 +03:00
Andrew Senkevich	9901716135	Fixed several libmvec bugs found during testing on KNL hardware. AVX512 IFUNC implementations, implementations of wrappers to AVX2 versions and KNL expf implementation fixed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Fixed AVX512 IFUNC. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Fixed wrappers to AVX2. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Fixed KNL implementation.	2015-07-24 14:47:23 +03:00
Joseph Myers	e02920bc02	Improve tgamma accuracy (bug 18613). In non-default rounding modes, tgamma can be slightly less accurate than permitted by glibc's accuracy goals. Part of the problem is error accumulation, addressed in this patch by setting round-to-nearest for internal computations. However, there was also a bug in the code dealing with computing pow (x + n, x + n) where x + n is not exactly representable, providing another source of error even in round-to-nearest mode; it was necessary to address both bugs to get errors for all testcases within glibc's accuracy goals. Given this second fix, accuracy in round-to-nearest mode is also improved (hence regeneration of ulps for tgamma should be from scratch - truncate libm-test-ulps or at least remove existing tgamma entries - so that the expected ulps can be reduced). Some additional complications also arose. Certain tgamma tests should strictly, according to IEEE semantics, overflow or not depending on the rounding mode; this is beyond the scope of glibc's accuracy goals for any function without exactly-determined results, but gen-auto-libm-tests doesn't handle being lax there as it does for underflow. (libm-test.inc also doesn't handle being lax about whether the result in cases very close to the overflow threshold is infinity or a finite value close to overflow, but that doesn't cause problems in this case though I've seen it cause problems with random test generation for some functions.) Thus, spurious-overflow markings, with a comment, are added to auto-libm-test-in (no bug in Bugzilla because the issue is with the testsuite, not a user-visible bug in glibc). And on x86, after the patch I saw ERANGE issues as previously reported by Carlos (see my commentary in <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which needed addressing by ensuring excess range and precision were eliminated at various points if FLT_EVAL_METHOD != 0. I also noticed and fixed a cosmetic issue where 1.0f was used in long double functions and should have been 1.0L. This completes the move of all functions to testing in all rounding modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to remove the workaround for some functions not using ALL_RM_TEST. Tested for x86_64, x86, mips64 and powerpc. [BZ #18613] * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gamma_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammaf_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * math/libm-test.inc (tgamma_test_data): Remove one test. Moved to auto-libm-test-in. (tgamma_test): Use ALL_RM_TEST. * math/auto-libm-test-in: Add one test of tgamma. Mark some other tests of tgamma with spurious-overflow. * math/auto-libm-test-out: Regenerated. * math/gen-libm-have-vector-test.sh: Do not check for START. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-06-29 23:29:35 +00:00
Joseph Myers	a8e2112ae3	Use round-to-nearest internally in jn, test with ALL_RM_TEST (bug 18602). Some existing jn tests, if run in non-default rounding modes, produce errors above those accepted in glibc, which causes problems for moving tests of jn to use ALL_RM_TEST. This patch makes jn set rounding to-nearest internally, as was done for yn some time ago, then computes the appropriate underflowing value for results that underflowed to zero in to-nearest, and moves the tests to ALL_RM_TEST. It does nothing about the general inaccuracy of Bessel function implementations in glibc, though it should make jn more accurate on average in non-default rounding modes through reduced error accumulation. The recomputation of results that underflowed to zero should as a side-effect fix some cases of bug 16559, where jn just used an exact zero, but that is not the goal of this patch and other cases of that bug remain unfixed. (Most of the changes in the patch are reindentation to add new scopes for SET_RESTORE_ROUND.) Tested for x86_64, x86, powerpc and mips64. [BZ #16559] [BZ #18602] sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set round-to-nearest internally then recompute results that underflowed to zero in the original rounding mode. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise * math/libm-test.inc (jn_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-06-25 21:46:02 +00:00
Joseph Myers	f9536db790	Refactor libm tests. This patch refactors the libm tests using libm-test.inc to reduce the level of duplicate definitions. New headers are created for the definitions shared by tests for a particular type; by tests of inline functions; by tests of non-inline functions; by scalar tests; and by vector tests. The unused MATHCONST macro is removed. A new macro VEC_LEN is added to the vector headers to allow the macros defining wrappers for vector functions to be defined once, instead of six times each (differing only in vector length) as before. There is still scope for further refactoring, but this seems a useful start. Tested for x86_64. * math/test-double.h: New file. * math/test-float.h: Likewise. * math/test-ldouble.h: Likewise. * math/test-math-inline.h: Likewise. * math/test-math-no-inline.h: Likewise. * math/test-math-scalar.h: Likewise. * math/test-math-vector.h: Likewise. * math/test-vec-loop.h: Remove file. Contents moved into test-math-vector.h. * math/libm-test.inc (MATHCONST): Do not document macro. * math/test-double.c: Include test-double.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-float.c: Include test-float.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-idouble.c: Include test-double.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ifloat.c: Include test-float.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ildoubl.c: Include test-ldouble.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ldouble.c: Include test-ldouble.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-double-vlen2.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen4.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen8.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen4.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen8.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen16.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Do not include test-vec-loop.h. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.	2015-06-24 23:27:18 +00:00
Joseph Myers	a67894c505	Fix cexp, ccos, ccosh, csin, csinh spurious underflows (bug 18594). cexp, ccos, ccosh, csin and csinh have spurious underflows in cases where they compute sin of the smallest normal, that produces an underflow exception (depending on which sin implementation is in use) but the final result does not underflow. ctan and ctanh may also have such underflows, or they may be latent (the issue there is that e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value above DBL_MIN, which under glibc's accuracy goals may not have an underflow exception, but the intermediate computation of sin (DBL_MIN) would legitimately underflow on before-rounding architectures). This patch fixes all those functions so they use plain comparisons (> DBL_MIN etc.) instead of comparing the result of fpclassify with FP_SUBNORMAL (in all these cases, we already know the number being compared is finite). Note that in the case of csin / csinf / csinl, there is no need for fabs calls in the comparison because the real part has already been reduced to its absolute value. As the patch fixes the failures that previously obstructed moving tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST by the patch (two functions remain yet to be converted). Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18594] * math/s_ccosh.c (__ccosh): Compare with least normal value instead of comparing class with FP_SUBNORMAL. * math/s_ccoshf.c (__ccoshf): Likewise. * math/s_ccoshl.c (__ccoshl): Likewise. * math/s_cexp.c (__cexp): Likewise. * math/s_cexpf.c (__cexpf): Likewise. * math/s_cexpl.c (__cexpl): Likewise. * math/s_csin.c (__csin): Likewise. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/s_ctan.c (__ctan): Likewise. * math/s_ctanf.c (__ctanf): Likewise. * math/s_ctanh.c (__ctanh): Likewise. * math/s_ctanhf.c (__ctanhf): Likewise. * math/s_ctanhl.c (__ctanhl): Likewise. * math/s_ctanl.c (__ctanl): Likewise. * math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp, csin, csinh, ctan and ctanh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cexp_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-06-24 21:04:51 +00:00
Joseph Myers	ac831b362a	Fix csin, csinh overflow in directed rounding modes (bug 18593). csin and csinh can produce bad results when overflowing in directed rounding modes, because a multiplication that can overflow is followed by a possible negation. This patch fixes this by negating one of the arguments of the multiplication before the multiplication instead of negating the result. The new tests for this issue are added to auto-libm-test-in, starting use of that file for csin and csinh. The issue was found in the course of moving existing tests for csin and csinh (existing tests, by being enabled in more cases than previously, showed the issue for float and double but not for long double); that move will now be done separately. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18593] * math/s_csin.c (__csin): Negate before rather than after possibly overflowing multiplication. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/auto-libm-test-in: Add some tests of csin and csinh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c. (csinh_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-06-24 16:20:48 +00:00
Andrew Senkevich	36870482d2	Combination of data tables for x86_64 vector functions sinf, cosf and sincosf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_s_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_s_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: Removed file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: Likewise.	2015-06-24 17:44:35 +03:00
Andrew Senkevich	5872b8352a	Combination of data tables for x86_64 vector functions sin, cos and sincos. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_d_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_d_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: Removed unneeded include. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos_data.S: Removed file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: Likewise.	2015-06-23 19:21:50 +03:00
Joseph Myers	cb0937b299	Fix x86_64 / x86 expm1l (-min_subnorm) result sign (bug 18569). In the x86 / x86_64 implementations of expm1l, when expm1l's result should underflow to 0 (argument minus the least subnormal, in some rounding modes), it can be a zero of the wrong sign. This patch fixes this by returning the argument with underflow forced in that case (this is a 1ulp error relative to the correctly rounded result of -0, which is OK in terms of the documented accuracy goals, whereas a result with the wrong sign never is). Tested for x86_64 and x86. [BZ #18569] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force underflow and return argument in case of subnormal argument. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of expm1. * math/auto-libm-test-out: Regenerated.	2015-06-21 18:43:10 +00:00
Joseph Myers	7540cfc5a8	Fix x86 / x86_64 expl, exp10l missing underflows (bug 16361). Similar to various other bugs in this area, the x86 and x86_64 implementations of expl / exp10l can fail to produce underflow exceptions when the unscaled result has trailing 0 bits so the scaling down to subnormal precision is exact. This patch fixes this by forcing the exception in the case of tiny results. Tested for x86_64 and x86. [BZ #16361] * sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * math/auto-libm-test-in: Add more tests of exp and exp10. Do not mark underflow exceptions as possibly missing for bug 16361. * math/auto-libm-test-out: Regenerated.	2015-06-21 17:48:04 +00:00
Andrew Senkevich	a6336cc446	Vector sincosf for x86_64 and tests. Here is implementation of vectorized sincosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincosf. * math/test-float-vlen16.h: Added wrapper for sincosf tests. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers. * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.	2015-06-18 20:11:27 +03:00
Andrew Senkevich	c9a8c526ac	Vector sincos for x86_64 and tests. Here is implementation of vectorized sincos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincos. * bits/libm-simd-decl-stubs.h: Added stubs for sincos. * math/math.h (__MATHDECL_VEC): New macro. * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC. * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper declaration under condition. * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored. * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected TEST_VEC_LOOP change. * math/test-double-vlen4.h: Likewise. * math/test-double-vlen8.h: Likewise. * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.	2015-06-18 17:55:55 +03:00
Andrew Senkevich	8aa92022e2	Vector powf for x86_64 and tests. Here is implementation of vectorized powf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for powf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * math/test-float-vlen16.h: Fixed 2 argument macro. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * NEWS: Mention addition of x86_64 vector powf.	2015-06-18 17:04:07 +03:00
Andrew Senkevich	c10b9b13f7	Vector pow for x86_64 and tests. Here is implementation of vectorized pow containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for pow. * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for pow. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector pow.	2015-06-17 16:22:26 +03:00
Andrew Senkevich	1663be053d	Vector expf for x86_64 and tests. Here is implementation of vectorized expf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for expf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector expf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector expf.	2015-06-17 16:10:51 +03:00
Andrew Senkevich	9c02f663f6	Vector exp for x86_64 and tests. Here is implementation of vectorized exp containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for exp. * math/bits/mathcalls.h: Added exp declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for exp. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector exp test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector exp.	2015-06-17 15:58:05 +03:00
Andrew Senkevich	774488f88a	Vector logf for x86_64 and tests. Here is implementation of vectorized logf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for logf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector logf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector logf.	2015-06-17 15:53:00 +03:00
Andrew Senkevich	6af25acc7b	Vector log for x86_64 and tests. Here is implementation of vectorized log containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for log. * math/bits/mathcalls.h: Added log declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for log. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector log test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector log.	2015-06-17 15:38:29 +03:00
Andrew Senkevich	2a8c2c7b33	Vector sinf for x86_64 and tests. Here is implementation of vectorized sinf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for sinf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector sinf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sinf.	2015-06-15 15:06:53 +03:00
Andrew Senkevich	4b9c2b707b	Vector sin for x86_64 and tests. Here is implementation of vectorized sin containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for sin. * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sin.	2015-06-11 17:12:38 +03:00
Andrew Senkevich	2a523216d5	This patch adds vector cosf tests. * math/Makefile: Added CFLAGS for new tests. * math/test-float-vlen16.h: New file. * math/test-float-vlen4.h: New file. * math/test-float-vlen8.h: New file. * math/test-double-vlen2.h: Fixed 2 argument macro and comment. * sysdeps/x86_64/fpu/Makefile: Added new tests and variables. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen16.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8.c: New file.	2015-06-09 18:32:42 +03:00
Andrew Senkevich	04f496d602	Vector cosf for x86_64. Here is implementation of vectorized cosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf. * NEWS: Mention addition of x86_64 vector cosf.	2015-06-09 18:29:47 +03:00
Andrew Senkevich	24a2718f59	Addition of testing infrastructure for vector math functions. We test vector math functions using scalar tests infrastructure with help of special wrappers from scalar versions to vector ones. Wrapper implemented using platform specific vector types and placed in separate file for compilation with architecture specific options, main part of test has no such options. With help of system of definitions unfolding of which is drived from test code we have wrapper called in individual testing function instead of scalar function. Also system of definitions includes generated during make check header math/libm-have-vector-test.h with series of conditional definitions which help to avoid build fails for functions having no vector versions; runtime architecture check to prevent runtime fails of test run on inappropriate hardware. * math/Makefile: Added rules for vector tests. * math/gen-libm-have-vector-test.sh: Added generation of wrapper declaration under condition. * math/test-double-vlen2.h: New file. * math/test-double-vlen4.h: New file. * math/test-double-vlen8.h: New file. * math/test-vec-loop.h: Added initialization macro. * sysdeps/x86_64/fpu/Makefile: Added variables for vector tests. * sysdeps/x86_64/fpu/libm-test-ulps: Regenarated. * sysdeps/x86_64/fpu/math-tests-arch.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8.c: New file.	2015-06-09 14:51:52 +03:00
Andrew Senkevich	2193311288	Start of series of patches with x86_64 vector math functions. Here is implementation of cos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI which had been discussed in <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. Vector math library build and ABI testing enabled by default for x86_64. * sysdeps/x86_64/fpu/Makefile: New file. * sysdeps/x86_64/fpu/Versions: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos. * math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC. * sysdeps/x86_64/configure.ac: Options for libmvec build. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file. * manual/install.texi (Configuring and compiling): Document --disable-mathvec. * INSTALL: Regenerated. * NEWS: Mention addition of libmvec and x86_64 vector cos.	2015-06-09 14:25:49 +03:00
Wilco Dijkstra	0e9be4db8f	Remove various ABS macros and replace uses with fabs (or in one case abs) which is more efficient on all targets.	2015-05-15 11:04:40 +00:00
Joseph Myers	14f36098f2	Add more tests of csqrt, lgamma, log10, sinh. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of csqrt, lgamma, log10 and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-08 17:55:11 +00:00
Joseph Myers	471dffa12c	Add more tests of acosh, atanh, cos, csqrt, erfc, sin, sincos. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, atanh, cos, csqrt, erfc, sin and sincos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-06 17:30:18 +00:00
Joseph Myers	31450d9a87	Add further tests of libm functions. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. (This process must eventually converge, when my random test generation stops finding inputs that increase the listed ulps, except maybe for any cases uncovered where the errors exceed the maximum allowed 9ulp error and so indicate actual libm bugs needing fixing.) Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, atanh, clog, clog10, csqrt, erfc, exp2, expm1, log10, log2 and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-05 22:59:41 +00:00
Joseph Myers	305392eaca	Add more tests of libm functions. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atan, clog, clog10, cos, csqrt, erf, erfc, exp2, lgamma, log1p, sin, sincos, tanh and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-02 21:06:33 +00:00
Joseph Myers	51e15247c3	Add more tests of tgamma. This patch adds some randomly-generated tests of tgamma that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 23:15:07 +00:00
Joseph Myers	5ffb9a53d7	Add more tests of tanh. This patch adds some randomly-generated tests of tanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 23:06:44 +00:00
Joseph Myers	0957e15d0a	Add more tests of tan. This patch adds some randomly-generated tests of tan that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tan. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 22:54:39 +00:00
Joseph Myers	827bb5859c	Add more tests of cos, sin, sincos. This patch adds some randomly-generated tests of cos, sin and sincos that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cos, sin and sincos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 22:41:00 +00:00
Joseph Myers	86793ae758	Add another test of pow. This patch adds a randomly-generated test of pow that is observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add another test of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-05-01 22:31:24 +00:00
Joseph Myers	038e4be99c	Add more tests of lgamma. This patch adds some randomly-generated tests of lgamma that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 22:17:19 +00:00
Joseph Myers	a0d31f36aa	Add more tests of log, log10, log1p, log2. This patch adds some randomly-generated tests of log, log10, log1p and log2 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of log, log10, log2 and log1p. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 21:08:37 +00:00
Joseph Myers	e1483b365d	Add more tests of exp, exp10, exp2, expm1. This patch adds some randomly-generated tests of exp, exp10, exp2 and expm1 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of exp, exp10, exp2 and expm1. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 20:33:04 +00:00
Joseph Myers	c5a3a509df	Add more tests of erf, erfc. This patch adds some randomly-generated tests of erf and erfc that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of erf and erfc. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 17:49:44 +00:00
Joseph Myers	9862ab1f67	Add more tests of csqrt. This patch adds some randomly-generated tests of csqrt that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of csqrt. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-30 22:51:29 +00:00
Joseph Myers	094fca83ee	Add further tests of cosh and sinh. This patch adds some further randomly-generated tests of cosh and sinh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cosh and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-30 22:32:08 +00:00
Stefan Liebler	de8aadd52c	Set errno for log1p on pole/domain error. According to bug 6792, errno is not set to ERANGE/EDOM by calling log1p/log1pf/log1pl with x = -1 or x < -1. This patch adds a wrapper which sets errno in those cases and returns the value of the existing __log1p function. The log1p is now an alias to the wrapper function instead of __log1p. The files in sysdeps are reflecting these changes. The ia64 implementation sets errno by itself, thus the wrapper-file is empty. The libm-test is adjusted for log1p-tests to check errno. [BZ #6792] * math/w_log1p.c: New file. * math/w_log1pf.c: Likewise. * math/w_log1pl.c: Likewise. * math/Makefile (libm-calls): Add w_log1p. * math/s_log1pl.c (log1pl): Remove weak_alias. * sysdeps/i386/fpu/s_log1p.S (log1p): Likewise. * sysdeps/i386/fpu/s_log1pf.S (log1pf): Likewise. * sysdeps/i386/fpu/s_log1pl.S (log1pl): Likewise. * sysdeps/x86_64/fpu/s_log1pl.S (log1pl): Likewise. * sysdeps/ieee754/dbl-64/s_log1p.c (log1p): Likewise. [NO_LONG_DOUBLE] (log1pl): Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c (log1pf): Likewise. * sysdeps/ieee754/ldbl-128/s_log1pl.c (log1pl): Likewise. * sysdeps/ieee754/ldbl-64-128/s_log1pl.c (log1p): Remove long_double_symbol. * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (log1pl): Likewise. * sysdeps/ieee754/ldbl-64-128/w_log1pl.c: New file. * sysdeps/ieee754/ldbl-128ibm/w_log1pl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_log1p.c: Define empty weak_alias to remove weak_alias for corresponding log1p function. * sysdeps/m68k/m680x0/fpu/s_log1pf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise. * sysdeps/ia64/fpu/w_log1p.c: New file. * sysdeps/ia64/fpu/w_log1pf.c: Likewise. * sysdeps/ia64/fpu/w_log1pl.c: Likewise. * math/libm-test.inc (log1p_test_data): Add errno expectations.	2015-04-13 21:19:27 +02:00
Joseph Myers	b3c66c534f	Add more tests of clog and clog10. This patch adds some randomly-generated tests of clog and clog10 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-09 22:14:34 +00:00
Joseph Myers	787d22bce6	Add more tests of atanh. This patch adds some randomly-generated tests of atanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 21:13:35 +00:00
Joseph Myers	024bcc5106	Add more tests of atan. This patch adds some randomly-generated tests of atan that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atan. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 21:00:03 +00:00
Joseph Myers	da0cf658c6	Add more tests of cbrt. This patch adds some randomly-generated tests of cbrt that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cbrt. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-04-08 17:56:15 +00:00
Joseph Myers	80352c01c1	Add more tests of cabs. This patch adds some randomly-generated tests of cabs that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cabs. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 17:46:07 +00:00
Joseph Myers	8431838dde	Fix dbl-64 atan2 in non-default rounding modes (bug 18210, bug 18211). The dbl-64 implementation of atan2 does computations that expect to run in round-to-nearest mode, and in other modes the errors can accumulate to more than the maximum accepted 9ulp. This patch makes it use FE_TONEAREST internally, similar to other functions with such issues. Tests that previously produced large errors are added for atan2 and the closely related carg, clog and clog10 functions. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18210] [BZ #18211] * sysdeps/ieee754/dbl-64/e_atan2.c: Include <fenv.h>. (__ieee754_atan2): Set FE_TONEAREST mode for internal computations. * math/auto-libm-test-in: Add more tests of atan2, carg, clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 17:32:17 +00:00
Joseph Myers	efd5b641dd	Add more tests of acosh, asinh and atanh. This patch adds some randomly-generated tests of acosh, asinh and atanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, asinh and atanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 22:21:20 +00:00
Joseph Myers	e9b1015112	Add another test of asin. This patch adds a randomly-generated test of asin that is observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add another test of asin. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 21:57:04 +00:00
Joseph Myers	38755f1421	Add more tests of asin. This patch adds some randomly-generated tests of asin that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of asin. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 17:53:58 +00:00
Joseph Myers	8d6439712d	Add more tests of acos. This patch adds some randomly-generated tests of acos that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 00:30:10 +00:00

... 2 3 4 5 6 ...

635 Commits