glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-29 16:21:07 +00:00

Author	SHA1	Message	Date
Stefan Liebler	9f234eafe8	Always use wordsize-64 version of s_ceil.c. This patch replaces s_ceil.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:13 +01:00
Stefan Liebler	95b0c2c431	Always use wordsize-64 version of s_floor.c. This patch replaces s_floor.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	ab48bdd098	Always use wordsize-64 version of s_rint.c. This patch replaces s_rint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	af123aa950	Always use wordsize-64 version of s_nearbyint.c. This patch replaces s_nearbyint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:11 +01:00
Wilco Dijkstra	d0007dc53c	Remove x64 _finite tests and references Remove _finite tests and references from x86_64. Rather than calling __exp_finite, use exp directly (since it's the same entry point). x86_64 builds and passes testsuite. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-10-21 14:29:12 -03:00
Paul Eggert	5a82c74822	Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http\|ftp)(://(.\.)?(gnu\|fsf\|sourceware)\.org($\|[^.]\|\.[^a-z])),https\2,g s,(http\|ftp)(://(.\.)?)sources\.redhat\.com($\|[^.]\|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '.po' \ ! -name 'ChangeLog' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: * error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: * error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S	2019-09-07 02:43:31 -07:00
H.J. Lu	7e681561a3	x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603 ] When compiled with -O3 and AVX, GCC 8 and 9 optimize some loops in sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions, which leads to store forward stall: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579 There is no easy fix in compiler. This patch limits vector width to 128 bits to work around this issue. It improves performance of sin and cos by more than 40% on Skylake compiled with -O3 -march=skylake. Tested with GCC 7/8/9 on x86-64. [BZ #24603] * sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128 works. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New. Set to -mprefer-vector-width=128 if supported.	2019-07-24 14:48:43 -07:00
Joseph Myers	04277e02d7	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2019-01-01 00:11:28 +00:00
H.J. Lu	ba4b8fab20	x86-64: Remove s_sincosf-sse2.S The current s_sincosf.c is faster than s_sincosf-sse2.S. On Broadwell with FMA disabled, bench-sincosf shows: Before After Improvement max 154.032 114.517 34% min 6.25 5.609 11% mean 14.8728 12.8589 15% * sysdeps/x86_64/fpu/s_sincosf.S: Removed. * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.c: New file.	2018-12-26 06:58:31 -08:00
H.J. Lu	9412979a43	Regenerate sysdeps/x86_64/fpu/libm-test-ulps * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.	2018-12-26 06:57:30 -08:00
H.J. Lu	8700a7851b	x86-64: Vectorize sincosf_poly and update s_sincosf-fma.c Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly. On Broadwell, bench-sincosf shows: Before After Improvement max 160.273 114.198 40% min 6.25 5.625 11% mean 13.0325 10.6462 22% Vectorized sincosf_poly shows Before After Improvement max 138.653 114.198 21% min 5.004 5.625 -11% mean 11.5934 10.6462 9% Tested on x86-64 and i686 as well as with build-many-glibcs.py. * sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ... * sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file. * sysdeps/x86/fpu/s_sincosf_data.c: New file. * sysdeps/x86/fpu/sincosf_poly.h: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>.	2018-12-26 06:56:13 -08:00
Szabolcs Nagy	a502c5294b	Remove the error handling wrapper from pow Introduce new pow symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_pow.c and enabled for targets with their own pow implementation or ifunc dispatch on __ieee754_pow by including math/w_pow.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously powl was an alias of pow, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __pow_finite symbol is now an alias of pow. Both __pow_finite and pow set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add pow. * math/w_pow_compat.c (__pow_compat): Change to versioned compat symbol. * math/w_pow.c: New file. * sysdeps/i386/fpu/w_pow.c: New file. * sysdeps/ia64/fpu/e_pow.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow and add necessary aliases. * sysdeps/ieee754/dbl-64/w_pow.c: New file. * sysdeps/m68k/m680x0/fpu/w_pow.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to __pow. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.	2018-11-21 09:58:36 +00:00
Szabolcs Nagy	f29b7c492d	Remove the error handling wrapper from log Introduce new log symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log.c and enabled for targets with their own log implementation by including math/w_log.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously logl was an alias of log, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log_finite symbol is now an alias of log. Both __log_finite and log set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log. * math/w_log_compat.c (__log_compat): Change to versioned compat symbol. * math/w_log.c: New file. * sysdeps/i386/fpu/w_log.c: New file. * sysdeps/ia64/fpu/e_log.S: Update. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log.c: New file. * sysdeps/m68k/m680x0/fpu/w_log.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to __log. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/w_log.c: New file.	2018-11-21 09:56:27 +00:00
Szabolcs Nagy	c20a10561a	Remove the error handling wrapper from exp and exp2 Introduce new exp and exp2 symbol version that don't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The double precision wrappers are disabled for sysdeps/ieee754/dbl-64 by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and math/w_exp2.c files use the wrapper template and can be included by targets that have their own exp and exp2 implementations or use ifunc on the glibc internal __ieee754_exp symbol. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously expl and exp2l were aliases of exp and exp2, now they point to the compatibility symbols with the wrapper, because they still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The _finite symbols are now aliases of the standard symbols (they have no performance advantage anymore). Both the standard symbols and _finite symbols set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header (the new macro name is __exp instead of __ieee754_exp which breaks some math.h macros). Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add exp and exp2. * math/w_exp2_compat.c (__exp2_compat): Change to versioned compat symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly. * math/w_exp_compat.c (__exp_compat): Likewise. * math/w_exp.c: New file. * math/w_exp2.c: New file. * sysdeps/i386/fpu/w_exp.c: New file. * sysdeps/i386/fpu/w_exp2.c: New file. * sysdeps/ia64/fpu/e_exp.S: Add versioned symbols. * sysdeps/ia64/fpu/e_exp2.S: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp and add necessary aliases. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_exp.c: New file. * sysdeps/ieee754/dbl-64/w_exp2.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.	2018-11-21 09:55:02 +00:00
Joseph Myers	7abf97bed9	Use trunc functions not __trunc functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise.	2018-09-20 21:11:10 +00:00
Szabolcs Nagy	424c4f60ed	Add new pow implementation The algorithm is exp(y * log(x)), where log(x) is computed with about 1.32^-68 relative error (1.52^-68 without fma), returning the result in two doubles, and the exp part uses the same algorithm (and lookup tables) as exp, but takes the input as two doubles and a sign (to handle negative bases with odd integer exponent). The __exp1 internal symbol is no longer necessary. There is separate code path when fma is not available but the worst case error is about 0.54 ULP in both cases. The lookup table and consts for log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1] pow latency: 1.84x in [0.01 11.1]x[0.01 11.1] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets. * NEWS: Mention pow improvements. * math/Makefile (type-double-routines): Add e_pow_log_data. * sysdeps/generic/math_private.h (__exp1): Remove. * sysdeps/i386/fpu/e_pow_log_data.c: New file. * sysdeps/ia64/fpu/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma contraction. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove. (exp_inline): Remove. (__ieee754_exp): Only single double input is handled. * sysdeps/ieee754/dbl-64/e_pow.c: Rewrite. * sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define. (__pow_log_data): Define. * sysdeps/ieee754/dbl-64/upow.h: Remove. * sysdeps/ieee754/dbl-64/upow.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma contraction. (CFLAGS-e_pow-fma4.c): Likewise.	2018-09-19 10:04:51 +01:00
Joseph Myers	71223ef909	Use ceil functions not __ceil functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-17 20:42:06 +00:00
Joseph Myers	f29b6f17e4	Use rint functions not __rint functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __rint functions to call the corresponding rint names instead, with asm redirection to __rint when the calls are not inlined. The x86_64 math_private.h is removed as no longer useful after this patch. This patch is relative to a tree with my floor patch <https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied, and much the same considerations arise regarding possibly replacing an IFUNC call with a direct inline expansion. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_rintf.c: Likewise. * sysdeps/alpha/fpu/s_rint.c: Likewise. * sysdeps/alpha/fpu/s_rintf.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/ieee754/dbl-64/s_rint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise. * sysdeps/ieee754/float128/s_rintf128.c: Likewise. * sysdeps/ieee754/flt-32/s_rintf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/powerpc/fpu/s_rint.c: Likewise. * sysdeps/powerpc/fpu/s_rintf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Likewise. * sysdeps/riscv/rvf/s_rintf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/math_private.h: Remove file. * math/e_scalb.c (invalid_fn): Use rint functions instead of __rint variants. * math/e_scalbf.c (invalid_fn): Likewise. * math/e_scalbl.c (invalid_fn): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.	2018-09-14 13:10:39 +00:00
Joseph Myers	e44acb2063	Use floor functions not __floor functions in glibc libm. Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. (After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-14 13:09:01 +00:00
Joseph Myers	4e7fbdd7c2	Remove x86_64 math_private.h asms. The x86_64 math_private.h has asm versions of the macros to reinterpret between floating-point and integer types. This is the sort of thing we now strongly discourage; the expectation in such cases, where the generic C code gives the compiler all the information needed about the required semantics, is that you should get the compiler to do the right thing for the generic C code rather than writing an asm version. Trivial tests showed GCC generates the expected single instructions for reinterpretation from floating point to integer. In the other direction, it goes via memory when the asms don't; I asked about this in GCC bug 87236 and was advised this was deliberate for generic tuning because it was faster that way on some AMD processors (but -mtune=intel, and -Os with the latest GCC, avoid going via memory). The asms don't and can't know about those tuning details, so that's evidence that they are actually making the code worse. This patch removes the asms accordingly. Tested for x86_64. * sysdeps/x86_64/fpu/math_private.h (MOVD): Remove macro. (MOVQ): Likewise. (EXTRACT_WORDS64): Likewise. (INSERT_WORDS64): Likewise. (GET_FLOAT_WORD): Likewise. (SET_FLOAT_WORD): Likewise.	2018-09-11 14:51:40 +00:00
Szabolcs Nagy	e70c176825	Add new exp and exp2 implementations Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants, see e_exp_data.c, they can be selected by modifying math_config.h allowing different tradeoffs. The default selection should be acceptable as generic libm code. Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on aarch64 the rodata size is 2160 bytes, shared between exp and exp2. On aarch64 .text + .rodata size decreased by 24912 bytes. The non-nearest rounding error is less than 1 ULP even on targets without efficient round implementation (although the error rate is higher in that case). Targets with single instruction, rounding mode independent, to nearest integer rounding and conversion can use them by setting TOINT_INTRINSICS and adding the necessary code to their math_private.h. The __exp1 code uses the same algorithm, so the error bound of pow increased a bit. New double precision error handling code was added following the style of the single precision error handling code. Improvements on Cortex-A72 compared to current glibc master: exp thruput: 1.61x in [-9.9 9.9] exp latency: 1.53x in [-9.9 9.9] exp thruput: 1.13x in [0.5 1] exp latency: 1.30x in [0.5 1] exp2 thruput: 2.03x in [-9.9 9.9] exp2 latency: 1.64x in [-9.9 9.9] For small (< 1) inputs the current exp code uses a separate algorithm so the speed up there is less. Was tested on aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets, only non-nearest rounding ulp errors increase and they are within acceptable bounds (ulp updates are in separate patches). * NEWS: Mention exp and exp2 improvements. * math/Makefile (libm-support): Remove t_exp. (type-double-routines): Add math_err and e_exp_data. * sysdeps/aarch64/libm-test-ulps: Update. * sysdeps/arm/libm-test-ulps: Update. * sysdeps/i386/fpu/e_exp_data.c: New file. * sysdeps/i386/fpu/math_err.c: New file. * sysdeps/i386/fpu/t_exp.c: Remove. * sysdeps/ia64/fpu/e_exp_data.c: New file. * sysdeps/ia64/fpu/math_err.c: New file. * sysdeps/ia64/fpu/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/e_exp.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp_data.c: New file. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound. * sysdeps/ieee754/dbl-64/eexp.tbl: Remove. * sysdeps/ieee754/dbl-64/math_config.h: New file. * sysdeps/ieee754/dbl-64/math_err.c: New file. * sysdeps/ieee754/dbl-64/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/t_exp2.h: Remove. * sysdeps/ieee754/dbl-64/uexp.h: Remove. * sysdeps/ieee754/dbl-64/uexp.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_err.c: New file. * sysdeps/m68k/m680x0/fpu/t_exp.c: Remove. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2018-09-05 16:22:00 +01:00
Joseph Myers	ff6b24501f	Split fenv_private.h out of math_private.h more consistently. On some architectures, the parts of math_private.h relating to the floating-point environment are in a separate file fenv_private.h included from math_private.h. As this is purely an architecture-specific convention used by several architectures, however, all such architectures still need their own math_private.h, even if it has nothing to do beyond #include <fenv_private.h> and peculiarity of including the i386 file directly instead of having a shared file in sysdeps/x86. This patch makes the fenv_private.h name an architecture-independent convention in glibc. The include of fenv_private.h from math_private.h becomes architecture-independent (until callers are updated to include fenv_private.h directly so the include from math_private.h is no longer needed). Some architecture math_private.h headers are removed if no longer needed, or renamed to fenv_private.h if all they define belongs in that header; architecture fenv_private.h headers now do require #include_next <fenv_private.h>. The i386 fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is actually shared with x86_64. The generic math_private.h gets a new include of <stdbool.h>, as needed for bool in some prototypes in that header (previously that was indirectly included via include/fenv.h, which now only gets included too late in math_private.h, after those prototypes). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/aarch64/fpu/fenv_private.h: New file. Based on .... * sysdeps/aarch64/fpu/math_private.h: ... this file. All contents moved to fenv_private.h except for ... (TOINT_INTRINSICS): Kept in math_private.h. (roundtoint): Likewise. (converttoint): Likewise. * sysdeps/arm/fenv_private.h: Change multiple-include guard to [ARM_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/arm/math_private.h: Remove. * sysdeps/generic/fenv_private.h: New file. Contents moved from .... * sysdeps/generic/math_private.h: ... this file. Include <stdbool.h>. Do not include <fenv.h> or <get-rounding-mode.h>. Include <fenv_private.h>. Remove functions and macros moved to fenv_private.h. * sysdeps/i386/fpu/math_private.h: Remove. * sysdeps/mips/math_private.h: Move to .... * sysdeps/mips/fpu/fenv_private.h: ... here. Change multiple-include guard to [MIPS_FENV_PRIVATE_H]. Remove [__mips_hard_float] conditional. Include next <fenv_private.h>. * sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include guard to [POWERPC_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/powerpc/fpu/math_private.h: Do not include <fenv_private.h>. * sysdeps/riscv/rvf/math_private.h: Move to .... * sysdeps/riscv/rvf/fenv_private.h: ... here. Change multiple-include guard to [RISCV_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard to [SPARC_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/sparc/fpu/math_private.h: Remove. * sysdeps/i386/fpu/fenv_private.h: Move to .... * sysdeps/x86/fpu/fenv_private.h: ... here. Change multiple-include guard to [X86_FENV_PRIVATE_H]. Include next <fenv_private.h>. * sysdeps/x86_64/fpu/math_private.h: Do not include <sysdeps/i386/fpu/fenv_private.h>.	2018-08-28 20:48:49 +00:00
Wilco Dijkstra	49acec179c	Fix spaces in x86_64 ULP file Fix a few missing spaces, it's now identical to the regenerated version. Passes GLIBC tests on x64. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerate to fix spaces.	2018-08-15 12:56:22 +01:00
Wilco Dijkstra	599cf39766	Improve performance of sinf and cosf The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x64 are updated. sinf/cosf througput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.2x * \|x\| < M_PI_4 : 1.8x * \|x\| < 2 * M_PI: 1.7x * \|x\| < 120.0 : 2.3x * \|x\| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.	2018-08-14 10:45:59 +01:00
Joseph Myers	2ce7ba7d15	Move SNAN_TESTS_* out of math-tests.h. Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the SNAN_TESTS_* macros for individual types out to their own sysdeps header (while the type-generic SNAN_TESTS wrapper for those macros remains in math-tests.h). Tested for x86_64 and x86, and with build-many-glibcs.py. * sysdeps/generic/math-tests-snan.h: New file. * sysdeps/generic/math-tests.h: Include <math-tests-snan.h>. (SNAN_TESTS_float): Do not define here. (SNAN_TESTS_double): Likewise. (SNAN_TESTS_long_double): Likewise. (SNAN_TESTS_float128): Likewise. * sysdeps/i386/fpu/math-tests-snan.h: New file. * sysdeps/i386/fpu/math-tests.h: Remove file. * sysdeps/ia64/math-tests-snan.h: New file. * sysdeps/ia64/math-tests.h: Remove file. * sysdeps/x86/math-tests.h: Likewise. * sysdeps/x86_64/fpu/math-tests-snan.h: New file.	2018-08-10 19:22:01 +00:00
Wilco Dijkstra	ea5c662c62	Improve performance of sincosf This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.6x * \|x\| < M_PI_4 : 1.7x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.8x * \|x\| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.	2018-08-10 17:34:39 +01:00
H.J. Lu	430388d5dc	x86: Don't include <init-arch.h> in assembly codes There is no need to include <init-arch.h> in assembly codes since all x86 IFUNC selector functions are written in C. Tested on i686 and x86-64. There is no code change in libc.so, ld.so and libmvec.so. * sysdeps/i386/i686/multiarch/bzero-ia32.S: Don't include <init-arch.h>. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core-avx2.S: Likewise. * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Likewise.	2018-08-03 08:05:00 -07:00
Paul Pluzhnikov	50d004c91c	Update ulps with "make regen-ulps" on AMD Ryzen 7 1800X. 2018-05-30 Paul Pluzhnikov <ppluzhnikov@google.com> * sysdeps/x86_64/fpu/libm-test-ulps (log_vlen8_avx2): Update for AMD Ryzen 7 1800X.	2018-05-30 09:17:47 -07:00
Wilco Dijkstra	19a8b9a300	[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs This series of patches removes the slow patchs from sin, cos and sincos. Besides greatly simplifying the implementation, the new version is also much faster for inputs up to PI (41% faster) and for large inputs needing range reduction (27% faster). ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most of the range with mpsin and mpcos. The number of incorrectly rounded results (ie. ULP >0.5) is at most ~2750 per million inputs between 0.125 and 0.5, the average is ~850 per million between 0 and PI. Tested on AArch64 and x86_64 with no regressions. The first patch removes the slow paths for the cases where the input is small and doesn't require range reduction. Update ULP tables for sin, cos and sincos on AArch64 and x86_64. * sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small inputs. (__cos): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.	2018-04-03 16:52:16 +01:00
Wilco Dijkstra	700593fdd7	Remove all target specific __ieee754_sqrt(f/l) inlines Remove the now unused target specific__ieee754_sqrt(f/l) inlines. Also remove inlines of sqrt which are for really old GCC versions. Removing these is desirable, under the general principle of leaving such inlining to the compiler rather than trying to do it in installed headers, especially when only very old compilers are affected. Note that removing inlines for __ieee754_sqrt disables inlining in the sqrt wrapper functions. Given the sqrt function will typically only be called for negative arguments, it doesn't matter whether the inlining happens or not. * sysdeps/aarch64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/generic/math-type-macros.h (M_SQRT): Use sqrt. * sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove. * sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/s390/fpu/bits/mathinline.h: Remove file. * sysdeps/sparc/fpu/bits/mathinline.h (sqrt) Remove. (sqrtf): Remove. (sqrtl): Remove. (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove. * sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove. * sysdeps/x86/fpu/math_private.h (__ieee754_sqrt): Remove. * sysdeps/x86_64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove.	2018-03-15 19:21:36 +00:00
Wilco Dijkstra	610ee1fc93	Remove mplog and mpexp Remove the now unused mplog and mpexp files. * math/Makefile: Remove mpexp.c and mplog.c * sysdeps/i386/fpu/mpexp.c: Delete file. * sysdeps/i386/fpu/mplog.c: Likewise. * sysdeps/ia64/fpu/mpexp.c: Likewise. * sysdeps/ia64/fpu/mplog.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog. * sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function. * sysdeps/ieee754/dbl-64/mpexp.c: Delete file. * sysdeps/ieee754/dbl-64/mplog.c: Likewise. * sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/mplog.c: Likewise. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog. sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.	2018-02-15 12:41:05 +00:00
Szabolcs Nagy	de800d8305	Remove slow paths from exp Remove the __slowexp code, so exp is no longer correctly rounded. The result is computed to about 70 bits precision so the worst case ulp error is about 0.500007 in nearest rounding mode. * manual/probes.texi: Remove slowexp probes. * math/Makefile: Remove slowexp. * sysdeps/generic/math_private.h (__slowexp): Remove. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and document error bounds. * sysdeps/i386/fpu/slowexp.c: Remove. * sysdeps/ia64/fpu/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove. * sysdeps/m68k/m680x0/fpu/slowexp.c: Remove. * sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.	2018-02-12 11:33:33 +00:00
Wilco Dijkstra	c3d466cba1	Remove slow paths from pow Remove the slow paths from pow. Like several other double precision math functions, pow is exactly rounded. This is not required from math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1). The worst case error is ~0.506ULP. A simple test over a few hundred million values shows pow is 10% faster on average. This fixes BZ #13932. [BZ #13932] * sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove. * benchtests/pow-inputs: Update comment for slow path cases. * manual/probes.texi (slowpow_p10): Delete removed probe. (slowpow_p10): Likewise. * math/Makefile: Remove halfulp.c and slowpow.c. * sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1. * sysdeps/generic/math_private.h (__exp1): Remove error argument. (__halfulp): Remove. (__slowpow): Remove. * sysdeps/i386/fpu/halfulp.c: Delete file. * sysdeps/i386/fpu/slowpow.c: Likewise. * sysdeps/ia64/fpu/halfulp.c: Likewise. * sysdeps/ia64/fpu/slowpow.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument, improve comments and add error analysis. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis. (power1): Remove function: (log1): Remove error argument, add error analysis. (my_log2): Remove function. * sysdeps/ieee754/dbl-64/halfulp.c: Delete file. * sysdeps/ieee754/dbl-64/slowpow.c: Likewise. * sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise. * sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c. * sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c, slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file. * sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.	2018-02-12 10:47:09 +00:00
H.J. Lu	c70e4e9c9e	x86-64: Add sincosf with vector FMA Since the x86-64 assembly version of sincosf is higly optimized with vector instructions, there isn't much room for improvement. However s_sincosf.c written in C with vector math and intrinsics can be optimized by GCC with FMA. On Skylake, bench-sincosf reports performance improvement: Assembly FMA improvement max 104.042 101.008 3% min 9.426 8.586 10% mean 20.6209 18.2238 13% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_sincosf-sse2 and s_sincosf-fma. (CFLAGS-s_sincosf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise. * sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if __sincosf is defined.	2018-01-08 08:04:40 -08:00
Joseph Myers	688903eb3e	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2018-01-01 00:32:25 +00:00
Joseph Myers	f1e005022e	Revert exp reimplementation (causes test failures). Revert: 2017-12-19 Joseph Myers <joseph@codesourcery.com> * sysdeps/x86_64/fpu/libm-test-ulps: Update. 2017-12-19 Patrick McGehearty <patrick.mcgehearty@oracle.com> * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise.	2017-12-19 18:11:37 +00:00
Joseph Myers	6f58c10ded	Update x86_64 libm-test-ulps. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2017-12-19 17:38:41 +00:00
Patrick McGehearty	6fd0a3c6a8	Improve __ieee754_exp() performance by greater than 5x on sparc/x86. These changes will be active for all platforms that don't provide their own exp() routines. They will also be active for ieee754 versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and erf. Typical performance gains is typically around 5x when measured on Sparc s7 for common values between exp(1) and exp(40). Using the glibc perf tests on sparc, sparc (nsec) x86 (nsec) old new old new max 17629 395 5173 144 min 399 54 15 13 mean 5317 200 1349 23 The extreme max times for the old (ieee754) exp are due to the multiprecision computation in the old algorithm when the true value is very near 0.5 ulp away from an value representable in double precision. The new algorithm does not take special measures for those cases. The current glibc exp perf tests overrepresent those values. Informal testing suggests approximately one in 200 cases might invoke the high cost computation. The performance advantage of the new algorithm for other values is still large but not as large as indicated by the chart above. Glibc correctness tests for exp() and expf() were run. Within the test suite 3 input values were found to cause 1 bit differences (ulp) when "FE_TONEAREST" rounding mode is set. No differences in exp() were seen for the tested values for the other rounding modes. Typical example: exp(-0x1.760cd2p+0) (-1.46113312244415283203125) new code: 2.31973271630014299393707e-01 0x1.db14cd799387ap-3 old code: 2.31973271630014271638132e-01 0x1.db14cd7993879p-3 exp = 2.31973271630014285508337 (high precision) Old delta: off by 0.49 ulp New delta: off by 0.51 ulp In addition, because ieee754_exp() is used by other routines, cexp() showed test results with very small imaginary input values where the imaginary portion of the result was off by 3 ulp when in upward rounding mode, but not in the other rounding modes. For x86, tgamma showed a few values where the ulp increased to 6 (max ulp for tgamma is 5). Sparc tgamma did not show these failures. I presume the tgamma differences are due to compiler optimization differences within the gamma function.The gamma function is known to be difficult to compute accurately. * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise.	2017-12-19 17:27:31 +00:00
H.J. Lu	4ca945e9c5	x86-64: Remove sysdeps/x86_64/fpu/s_cosf.S On Ivy Bridge, bench-cosf reports performance improvement: s_cosf.S s_cosf.c Improvement max 114.136 82.401 39% min 13.119 11.301 16% mean 22.1882 21.8938 1% * sysdeps/x86_64/fpu/s_cosf.S: Removed.	2017-12-14 04:39:53 -08:00
H.J. Lu	ac817e083b	x86-64: Add cosf with FMA On Skylake, bench-cosf reports performance improvement: Before After Improvement max 135.362 94.552 43% min 8.532 7.688 11% mean 17.1446 11.8128 45% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_cosf-sse2 and s_cosf-fma. (CFLAGS-s_cosf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/s_cosf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/s_cosf-sse2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_cosf.c: Likewise. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2017-12-12 15:32:58 -08:00
H.J. Lu	9d0ffa60ad	x86-64: Add sinf with FMA On Skylake, bench-sinf reports performance improvement: Before After Improvement max 153.996 100.094 54% min 8.546 6.852 25% mean 18.1223 11.802 54% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_sinf-sse2 and s_sinf-fma. (CFLAGS-s_sinf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/s_sinf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/s_sinf-sse2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sinf.c: Likewise.	2017-12-07 10:11:16 -08:00
H.J. Lu	9574c7b68d	x86-64: Remove sysdeps/x86_64/fpu/s_sinf.S On Ivy Bridge, bench-sinf reports performance improvement: s_sinf.S s_sinf.c Improvement max 91.521 86.148 6% min 14.061 11.265 25% mean 23.3758 23.3344 0.2% * sysdeps/x86_64/fpu/s_sinf.S: Removed.	2017-12-07 09:44:17 -08:00
Joseph Myers	34bb10aabf	Use libm_alias_float for x86_64. Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes x86_64 libm function implementations use libm_alias_float to define function aliases, or libm_alias_float_other where the main name is defined with versioned_symbol. Tested with the glibc testsuite for x86_64, and tested with build-many-glibcs.py for all its x86_64 configurations that installed stripped shared libraries are unchanged by the patch. * sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_expf.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_log2f.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_logf.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/e_powf.c: Include <libm-alias-float.h>. (exp2f): Define using libm_alias_float, or libm_alias_float_other if [SHARED]. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Include <libm-alias-float.h>. (fmaf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Include <libm-alias-float.h>. (nearbyintf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_copysignf.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_cosf.S: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_fabsf.c: Include <libm-alias-float.h>. (fabsf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_fmaxf.S: Include <libm-alias-float.h>. (fmaxf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_fminf.S: Include <libm-alias-float.h>. (fminf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_llrintf.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. [!__ILP32__] (lrintf): Likewise. * sysdeps/x86_64/fpu/s_sincosf.S: Include <libm-alias-float.h>. (sincosf): Define using libm_alias_float. * sysdeps/x86_64/fpu/s_sinf.S: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float. * sysdeps/x86_64/x32/fpu/s_lrintf.S: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float.	2017-11-29 21:25:41 +00:00
Joseph Myers	011fba7e48	Use libm_alias_double for x86_64. Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes x86_64 libm function implementations use libm_alias_double to define function aliases. Tested with the glibc testsuite for x86_64, and tested with build-many-glibcs.py for all its x86_64 configurations that installed stripped shared libraries are unchanged by the patch. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Include <libm-alias-double.h>. (atan): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Include <libm-alias-double.h>. (ceil): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Include <libm-alias-double.h>. (floor): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_fma.c: Include <libm-alias-double.h>. (fma): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Include <libm-alias-double.h>. (nearbyint): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Include <libm-alias-double.h>. (rint): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Include <libm-alias-double.h>. (sin): Define using libm_alias_double. (cos): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Include <libm-alias-double.h>. (tan): Define using libm_alias_double. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Include <libm-alias-double.h>. (trunc): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_copysign.S: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_fabs.c: Include <libm-alias-double.h>. (fabs): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_fmax.S: Include <libm-alias-double.h>. (fmax): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_fmin.S: Include <libm-alias-double.h>. (fmin): Define using libm_alias_double. * sysdeps/x86_64/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. [!__ILP32__] (lrint): Likewise. * sysdeps/x86_64/x32/fpu/s_lrint.S: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double.	2017-11-29 19:01:21 +00:00
Joseph Myers	f58e5f4809	Use libm_alias_ldouble in sysdeps/x86_64/fpu. This patch continues the preparation for additional _FloatN / _FloatNx function aliases by using libm_alias_ldouble for sysdeps/x86_64/fpu long double functions, so that they can have _Float64x aliases added in future. Tested for x86_64, including build-many-glibcs.py tests that installed stripped shared libraries are unchanged by the patch. * sysdeps/x86_64/fpu/e_expl.S: Include <libm-alias-ldouble.h>. [USE_AS_EXPM1L] (expm1l): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_ceill.S: Include <libm-alias-ldouble.h>. (ceill): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_copysignl.S: Include <libm-alias-ldouble.h>. (copysignl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_fabsl.S: Include <libm-alias-ldouble.h>. (fabsl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_floorl.S: Include <libm-alias-ldouble.h>. (floorl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_fmaxl.S: Include <libm-alias-ldouble.h>. (fmaxl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_fminl.S: Include <libm-alias-ldouble.h>. (fminl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_llrintl.S: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble. (lrintl): Likewise. * sysdeps/x86_64/fpu/s_nearbyintl.S: Include <libm-alias-ldouble.h>. (nearbyintl): Define using libm_alias_ldouble. * sysdeps/x86_64/fpu/s_truncl.S: Include <libm-alias-ldouble.h>. (truncl): Define using libm_alias_ldouble. * sysdeps/x86_64/x32/fpu/s_lrintl.S: Include <libm-alias-ldouble.h>. (lrintl): Define using libm_alias_ldouble.	2017-11-17 23:39:11 +00:00
H.J. Lu	a122dbfb2e	Replace "if if " with "if " in comments * include/alloc_buffer.h: Replace "if if " with "if " in comments. * sysdeps/mips/memcpy.S: Likkewise. * sysdeps/mips/memset.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise.	2017-10-25 08:05:51 -07:00
H.J. Lu	80bb593563	x86-64: Add powf with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 35.4713 27.3842 29% latency 82.4537 66.3175 24% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_powf-fma. (CFLAGS-e_powf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_powf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_powf.c: Likewise.	2017-10-22 08:08:00 -07:00
H.J. Lu	5c7adbd8ed	x86-64: Add log2f with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 16.5937 14.0789 17% latency 41.7755 35.3586 18% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_log2f-fma. (CFLAGS-e_log2f-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_log2f-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_log2f.c: Likewise.	2017-10-22 08:06:58 -07:00
H.J. Lu	0ccc7153cc	x86-64: Add logf with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 16.1534 13.8874 16% latency 41.9642 34.3072 22% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_logf-fma. (CFLAGS-e_logf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_logf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_logf.c: Likewise.	2017-10-22 08:05:15 -07:00
H.J. Lu	5d15c96975	x86-64: Add exp2f with FMA For workload-spec2017.wrf, on Skylake, it improves performance by: Before After Improvement reciprocal-throughput 13.0291 11.2225 16% latency 44.5154 37.5766 18% * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp2f-fma. (CFLAGS-e_exp2f-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_exp2f-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Likewise.	2017-10-22 07:57:50 -07:00
H.J. Lu	e1f59bebd8	x86-64: Replace assembly versions of e_expf with generic e_expf.c This patch replaces x86-64 assembly versions of e_expf with generic e_expf.c. For workload-spec2017.wrf, on Nehalem, it improves performance by: Before After Improvement reciprocal-throughput 36.039 20.7749 73% latency 58.8096 40.8715 43% On Skylake, it improves Before After Improvement reciprocal-throughput 18.4436 11.1693 65% latency 47.5162 37.5411 26% * sysdeps/x86_64/fpu/e_expf.S: Removed. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: Likewise. * sysdeps/x86_64/fpu/w_expf.c: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Updated for generic e_expf.c. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_expf-fma.c): New. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Renamed to ... (__redirect_expf): This. (SYMBOL_NAME): Changed to expf. (__ieee754_expf): Renamed to ... (__expf): This. (__GI___expf): This. (__ieee754_expf): Add strong_alias. (__expf_finite): Likewise. (__expf): New. Include <sysdeps/ieee754/flt-32/e_expf.c>.	2017-10-22 07:49:55 -07:00
Joseph Myers	527cd19c3d	Make dbl-64 atan and tan into weak aliases. This patch converts the dbl-64 implementations of atan and tan into weak aliases of __atan and __tan, in preparation for making them use libm_alias_double. Consequent changes are made to the x86_64 multiarch versions wrapping round them (with the dbl-64 functions, like other such functions, being made not to define their aliases at all if __atan or __tan are defined as macros by an including file). Tested for x86_64, and with build-many-glibcs.py. * sysdeps/ieee754/dbl-64/s_atan.c (atan): Rename to __atan and define as weak alias of __atan. Do not define any aliases if [__atan]. [NO_LONG_DOUBLE] (__atanl): Define as strong alias of __atan. [NO_LONG_DOUBLE] (atanl): Define as weak alias of __atanl. * sysdeps/ieee754/dbl-64/s_tan.c (tan): Rename to __tan and define as weak alias of __tan. Do not define any aliases if [__tan]. [NO_LONG_DOUBLE] (__tanl): Define as strong alias of __tan. [NO_LONG_DOUBLE] (tanl): Define as weak alias of __tanl. * sysdeps/x86_64/fpu/multiarch/s_atan-avx.c (atan): Rename to __atan. * sysdeps/x86_64/fpu/multiarch/s_atan-fma.c (atan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan-fma4.c (atan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c (atan): Rename to __atan and define as weak alias of __atan. * sysdeps/x86_64/fpu/multiarch/s_tan-avx.c (tan): Rename to __atan. * sysdeps/x86_64/fpu/multiarch/s_tan-fma.c (tan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan-fma4.c (tan): Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c (tan): Rename to __tan and define as weak alias of __tan.	2017-10-02 20:20:52 +00:00
Szabolcs Nagy	f7a0b063e7	Do not wrap expf and exp2f The new generic expf and exp2f code don't need wrappers any more, they set errno inline, so only use the wrappers on targets that need it. (If the wrapper is needed, then the top level wrapper code is included, otherwise empty w_expf.c is used to suppress the wrapper.) A powerpc64 expf implementation includes the expf c code directly which needed some changes. sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise * sysdeps/ieee754/flt-32/w_exp2f.c: New file. * sysdeps/ieee754/flt-32/w_expf.c: New file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for the new expf code. * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file. * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_expf.c: New file. * sysdeps/i386/fpu/w_exp2f.c: New file. * sysdeps/i386/fpu/w_expf.c: New file. * sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file. * sysdeps/x86_64/fpu/w_expf.c: New file.	2017-10-02 14:38:54 +01:00
Joseph Myers	2f92505d20	Update x86_64 libm-test-ulps. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2017-09-29 18:03:48 +00:00
Joseph Myers	ae8372d7e4	Add SSE4.1 trunc, truncf (bug 20142). This patch adds SSE4.1 versions of trunc and truncf, using the roundsd / roundss instructions, similar to the versions of ceil, floor, rint and nearbyint functions we already have. In my testing with the glibc benchtests these are about 30% faster than the C versions for double, 20% faster for float. Tested for x86_64. [BZ #20142] * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1. * sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file. * sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.	2017-09-20 16:54:05 +00:00
H.J. Lu	ef8adeb041	x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967 ] AVX512 functions in mathvec are used on machines with AVX512. An AVX2 wrapper is also provided and it can be used when the AVX512 version isn't profitable. MathVec_Prefer_No_AVX512 is addded to cpu-features. If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in GLIBC_TUNABLES environment variable, the AVX2 wrapper will be used. Tested on x86-64 machines with and without AVX512. Also verified glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine. [BZ #21967] * sysdeps/x86/cpu-features.h (bit_arch_MathVec_Prefer_No_AVX512): New. (index_arch_MathVec_Prefer_No_AVX512): Likewise. * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)): Handle MathVec_Prefer_No_AVX512. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h (IFUNC_SELECTOR): Return AVX2 version if MathVec_Prefer_No_AVX512 is set.	2017-09-12 07:54:47 -07:00
Markus Trippelsdorf	4c03a69680	Update x86_64 ulps for AMD Ryzen. * sysdeps/x86_64/fpu/libm-test-ulps: Update for AMD Ryzen.	2017-09-08 19:57:12 +00:00
Joseph Myers	5a80d39d0d	Obsolete pow10 functions. This patch obsoletes the pow10, pow10f and pow10l functions (makes them into compat symbols, not available for new ports or static linking). The exp10 names for these functions are standardized (in TS 18661-4) and were added in the same glibc version (2.1) as pow10 so source code can change to use them without any loss of portability. Since pow10 is deliberately not provided for _Float128, only exp10, this slightly simplifies moving to the new wrapper templates in the !LIBM_SVID_COMPAT case, by avoiding needing to arrange for pow10, pow10f and pow10l to be defined by those templates. Tested for x86_64, and with build-many-glibcs.py. * manual/math.texi (pow10): Do not document. (pow10f): Likewise. (pow10l): Likewise. * math/bits/mathcalls.h [__USE_GNU] (pow10): Do not declare. * math/bits/math-finite.h [__USE_GNU] (pow10): Likewise. * math/libm-test-exp10.inc (pow10_test): Remove. (do_test): Do not call pow10. * math/w_exp10_compat.c (pow10): Make into compat symbol. [NO_LONG_DOUBLE] (pow10l): Likewise. * math/w_exp10f_compat.c (pow10f): Likewise. * math/w_exp10l_compat.c (pow10l): Likewise. * sysdeps/ia64/fpu/e_exp10.S: Include <shlib-compat.h>. (pow10): Make into compat symbol. * sysdeps/ia64/fpu/e_exp10f.S: Include <shlib-compat.h>. (pow10f): Make into compat symbol. * sysdeps/ia64/fpu/e_exp10l.S: Include <shlib-compat.h>. (pow10l): Make into compat symbol. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove pow10. (CFLAGS-nldbl-pow10.c): Remove variable.. * sysdeps/ieee754/ldbl-opt/nldbl-pow10.c: Remove file. * sysdeps/ieee754/ldbl-opt/w_exp10_compat.c (pow10l): Condition on [SHLIB_COMPAT (libm, GLIBC_2_1, GLIBC_2_27)]. * sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (compat_symbol): Undefine and redefine. (pow10l): Make into compat symbol. * sysdeps/aarch64/libm-test-ulps: Remove pow10 ulps. * sysdeps/alpha/fpu/libm-test-ulps: Likewise. * sysdeps/arm/libm-test-ulps: Likewise. * sysdeps/hppa/fpu/libm-test-ulps: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/microblaze/libm-test-ulps: Likewise. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/nios2/libm-test-ulps: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps: Likewise. * sysdeps/s390/fpu/libm-test-ulps: Likewise. * sysdeps/sh/libm-test-ulps: Likewise. * sysdeps/sparc/fpu/libm-test-ulps: Likewise. * sysdeps/tile/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-09-01 21:13:18 +00:00
H.J. Lu	5e2bc4ff33	x86_64 __redirect_ieee754_expf: Change double to float __redirect_ieee754_expf has type float, not double. * sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Change double to float.	2017-08-28 08:45:53 -07:00
H.J. Lu	fcaaca412f	x86-64: Regenerate libm-test-ulps for AVX512 mathvec tests Update libm-test-ulps for AVX512 mathvec tests by running “make regen-ulps” on Intel Xeon processor with AVX512. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.	2017-08-23 09:11:55 -07:00
H.J. Lu	b9eaca8fa0	x86_64: Replace AVX512F .byte sequences with instructions Since binutils 2.25 or later is required to build glibc, we can replace AVX512F .byte sequences with AVX512F instructions. Tested on x86-64 and x32. There are no code differences in libmvec.so and libmvec.a. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: Replace AVX512F .byte sequences with AVX512F instructions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise.	2017-08-23 06:26:44 -07:00
H.J. Lu	098b9dd468	x86-64: Check FMA_Usable in ifunc-mathvec-avx2.h [BZ #21966 ] Since the AVX2 version of mathvec functions uses FMA, it can only be used when FMA is usable. [BZ #21966] * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h (IFUNC_SELECTOR): Don't use the AVX2 version if FMA isn't usable.	2017-08-18 06:19:07 -07:00
H.J. Lu	24a2e6588d	x86-64: Optimize e_expf with FMA [BZ #21912 ] FMA optimized e_expf improves performance by more than 50% on Skylake. [BZ #21912] * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_expf-fma. * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file. * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.	2017-08-16 08:43:48 -07:00
H.J. Lu	f59f7adb4a	x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955 ] sysdeps/x86_64/fpu/e_expf.S has lea L(SP_RANGE)(%rip), %rdx /* load over/underflow bound / cmpl (%rdx,%rax,4), %ecx / \|x\|<under/overflow bound ? / ... / Here if \|x\| is Inf / lea L(SP_INF_0)(%rip), %rdx / depending on sign of x: / movss (%rdx,%rax,4), %xmm0 / return zero or Inf / ret ... .section .rodata.cst8,"aM",@progbits,8 ... .p2align 2 L(SP_RANGE): / single precision overflow/underflow bounds / .long 0x42b17217 / if x>this bound, then result overflows / .long 0x42cff1b4 / if x<this bound, then result underflows / .type L(SP_RANGE), @object ASM_SIZE_DIRECTIVE(L(SP_RANGE)) .p2align 2 L(SP_INF_0): .long 0x7f800000 / single precision Inf / .long 0 / single precision zero / .type L(SP_INF_0), @object ASM_SIZE_DIRECTIVE(L(SP_INF_0)) Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they must be aligned to 8 bytes. [BZ #21955] sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes. (L(SP_INF_0)): Likewise.	2017-08-15 14:05:14 -07:00
H.J. Lu	57a72fa350	x86-64: Add FMA multiarch functions to libm This patch adds multiarch functions optimized with -mfma -mavx2 to libm. e_pow-fma.c is compiled with $(config-cflags-nofma) due to PR 19003. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp-fma, e_log-fma, e_pow-fma, s_atan-fma, e_asin-fma, e_atan2-fma, s_sin-fma, s_tan-fma, mplog-fma, mpa-fma, slowexp-fma, slowpow-fma, sincos32-fma, doasin-fma, dosincos-fma, halfulp-fma, mpexp-fma, mpatan2-fma, mpatan-fma, mpsqrt-fma, and mptan-fma. (CFLAGS-doasin-fma.c): New. (CFLAGS-dosincos-fma.c): Likewise. (CFLAGS-e_asin-fma.c): Likewise. (CFLAGS-e_atan2-fma.c): Likewise. (CFLAGS-e_exp-fma.c): Likewise. (CFLAGS-e_log-fma.c): Likewise. (CFLAGS-e_pow-fma.c): Likewise. (CFLAGS-halfulp-fma.c): Likewise. (CFLAGS-mpa-fma.c): Likewise. (CFLAGS-mpatan-fma.c): Likewise. (CFLAGS-mpatan2-fma.c): Likewise. (CFLAGS-mpexp-fma.c): Likewise. (CFLAGS-mplog-fma.c): Likewise. (CFLAGS-mpsqrt-fma.c): Likewise. (CFLAGS-mptan-fma.c): Likewise. (CFLAGS-s_atan-fma.c): Likewise. (CFLAGS-sincos32-fma.c): Likewise. (CFLAGS-slowexp-fma.c): Likewise. (CFLAGS-slowpow-fma.c): Likewise. (CFLAGS-s_sin-fma.c): Likewise. (CFLAGS-s_tan-fma.c): Likewise. * sysdeps/x86_64/fpu/multiarch/doasin-fma.c: New file. * sysdeps/x86_64/fpu/multiarch/dosincos-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_asin-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_atan2-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-avx-fma4.h: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-fma4.h: Likewise. * sysdeps/x86_64/fpu/multiarch/mpa-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpatan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpatan2-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpsqrt-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mptan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/sincos32-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_asin.c: Rewrite. * sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise.	2017-08-07 08:20:56 -07:00
H.J. Lu	8537e0f6cf	x86-64: Implement libmathvec IFUNC selectors in C * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines) Add svml_d_cos2_core-sse2, svml_d_cos4_core-sse, svml_d_cos8_core-avx2, svml_d_exp2_core-sse2, svml_d_exp4_core-sse, svml_d_exp8_core-avx2, svml_d_log2_core-sse2, svml_d_log4_core-sse, svml_d_log8_core-avx2, svml_d_pow2_core-sse2, svml_d_pow4_core-sse, svml_d_pow8_core-avx2 svml_d_sin2_core-sse2, svml_d_sin4_core-sse, svml_d_sin8_core-avx2, svml_d_sincos2_core-sse2, svml_d_sincos4_core-sse, svml_d_sincos8_core-avx2, svml_s_cosf16_core-avx2, svml_s_cosf4_core-sse2, svml_s_cosf8_core-sse, svml_s_expf16_core-avx2, svml_s_expf4_core-sse2, svml_s_expf8_core-sse, svml_s_logf16_core-avx2, svml_s_logf4_core-sse2, svml_s_logf8_core-sse, svml_s_powf16_core-avx2, svml_s_powf4_core-sse2, svml_s_powf8_core-sse, svml_s_sincosf16_core-avx2, svml_s_sincosf4_core-sse2, svml_s_sincosf8_core-sse, svml_s_sinf16_core-avx2, svml_s_sinf4_core-sse2 and svml_s_sinf8_core-sse. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h: New file. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h: Likewise. * sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-sse4_1.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.c: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_cos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_exp): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8v_log): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8vv_pow): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN8v_sin): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN2vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN4vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN8vvv_sincos): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_cosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_expf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_logf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8vv_powf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8vvv_sincosf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core-avx2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVeN16v_sinf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core-sse2.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVbN4v_sinf): Removed. * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core-sse.S: This. Don't include <sysdep.h> nor <init-arch.h>. (_ZGVdN8v_sinf): Removed.	2017-08-04 13:03:58 -07:00
H.J. Lu	10a87ca476	x86-64: Implement libm IFUNC selectors in C * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_ceil-sse4_1, s_ceilf-sse4_1, s_floor-sse4_1, s_floorf-sse4_1, s_nearbyint-sse4_1, s_nearbyintf-sse4_1, s_rint-sse4_1 and s_rintf-sse4_1. * sysdeps/x86_64/fpu/multiarch/ifunc-sse4_1.h: New file. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__ceil): Removed. * sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__ceilf): Removed. * sysdeps/x86_64/fpu/multiarch/s_floor.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__floor): Removed. * sysdeps/x86_64/fpu/multiarch/s_floorf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__floorf): Removed. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__nearbyint): Removed. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__nearbyintf): Removed. * sysdeps/x86_64/fpu/multiarch/s_rint.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__rint): Removed. * sysdeps/x86_64/fpu/multiarch/s_rintf.S: Renamed to ... * sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S: This. Don't include <machine/asm.h> nor <init-arch.h>. Include <sysdep.h>. (__rintf): Removed.	2017-08-04 13:02:13 -07:00
Joseph Myers	c86ed71d63	Add float128 support for x86_64, x86. This patch enables float128 support for x86_64 and x86. All GCC versions that can build glibc provide the required support, but since GCC 6 and before don't provide __builtin_nanq / __builtin_nansq, sNaN tests and some tests of NaN payloads need to be disabled with such compilers (this does not affect the generated glibc binaries at all, just the tests). bits/floatn.h declares float128 support to be available for GCC versions that provide the required libgcc support (4.3 for x86_64, 4.4 for i386 GNU/Linux, 4.5 for i386 GNU/Hurd); compilation-only support was present some time before then, but not really useful without the libgcc functions. fenv_private.h needed updating to avoid trying to put _Float128 values in registers. I make no assertion of optimality of the math_opt_barrier / math_force_eval definitions for this case; they are simply intended to be sufficient to work correctly. Tested for x86_64 and x86, with GCC 7 and GCC 6. (Testing for x32 was compilation tests only with build-many-glibcs.py to verify the ABI baseline updates. I have not done any testing for Hurd, although the float128 support is enabled there as for GNU/Linux.) * sysdeps/i386/Implies: Add ieee754/float128. * sysdeps/x86_64/Implies: Likewise. * sysdeps/x86/bits/floatn.h: New file. * sysdeps/x86/float128-abi.h: Likewise. * manual/math.texi (Mathematics): Document support for _Float128 on x86_64 and x86. * sysdeps/i386/fpu/fenv_private.h: Include <bits/floatn.h>. (math_opt_barrier): Do not put _Float128 values in floating-point registers. (math_force_eval): Likewise. [__x86_64__] (SET_RESTORE_ROUNDF128): New macro. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (CPPFLAGS): Append to Makefile variable. * sysdeps/x86/fpu/e_sqrtf128.c: New file. * sysdeps/x86/fpu/sfp-machine.h: Likewise. Based on libgcc. * sysdeps/x86/math-tests.h: New file. * math/libm-test-support.h (XFAIL_FLOAT128_PAYLOAD): New macro. * math/libm-test-getpayload.inc (getpayload_test_data): Use XFAIL_FLOAT128_PAYLOAD. * math/libm-test-setpayload.inc (setpayload_test_data): Likewise. * math/libm-test-totalorder.inc (totalorder_test_data): Likewise. * math/libm-test-totalordermag.inc (totalordermag_test_data): Likewise. * sysdeps/unix/sysv/linux/i386/libc.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-06-26 22:02:24 +00:00
Zack Weinberg	fd860eaaa8	Remove __need macros from errno.h (__need_Emath, __need_error_t). This is fairly complicated, not because the users of __need_Emath and __need_error_t have complicated requirements, but because the core changes had a lot of fallout. __need_error_t exists for gnulib compatibility in argz.h and argp.h. error_t itself is a Hurdism, an enum containing all the E-constants, so you can do 'p (error_t) errno' in gdb and get a symbolic value. argz.h and argp.h use it for function return values, and they want to fall back to 'int' when that's not available. There is no reason why these nonstandard headers cannot just go ahead and include all of errno.h; so we do that. __need_Emath is defined only by .S files; what they _really_ need is for errno.h to avoid declaring anything other than the E-constants (e.g. 'extern int __errno_location(void);' is a syntax error in assembly language). This is replaced with a check for __ASSEMBLER__ in errno.h, plus a carefully documented requirement for bits/errno.h not to define anything other than macros. That in turn has the consequence that bits/errno.h must not define errno - fortunately, all live ports use the same definition of errno, so I've moved it to errno.h. The Hurd bits/errno.h must also take care not to define error_t when __ASSEMBLER__ is defined, which involves repeating all of the definitions twice, but it's a generated file so that's okay. * stdlib/errno.h: Remove __need_Emath and __need_error_t logic. Reorganize file. Declare errno here. When __ASSEMBLER__ is defined, don't declare anything other than the E-constants. * include/errno.h: Change conditional for exposing internal declarations to (not _ISOMAC and not __ASSEMBLER__). * bits/errno.h: Remove logic for __need_Emath. Document requirements for a port-specific bits/errno.h. * sysdeps/unix/sysv/linux/bits/errno.h * sysdeps/unix/sysv/linux/alpha/bits/errno.h * sysdeps/unix/sysv/linux/hppa/bits/errno.h * sysdeps/unix/sysv/linux/mips/bits/errno.h * sysdeps/unix/sysv/linux/sparc/bits/errno.h: Add multiple-include guard and check against improper inclusion. Remove __need_Emath logic. Don't declare errno here. Ensure all constants are defined as simple integer literals. Consistent formatting. * sysdeps/mach/hurd/errnos.awk: Likewise. Only define error_t and enum __error_t_codes if __ASSEMBLER__ is not defined. * sysdeps/mach/hurd/bits/errno.h: Regenerate. * argp/argp.h, string/argz.h: Don't define __need_error_t before including errno.h. * sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S * sysdeps/i386/i686/fpu/multiarch/s_sincosf-sse2.S * sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S * sysdeps/x86_64/fpu/s_cosf.S * sysdeps/x86_64/fpu/s_sincosf.S * sysdeps/x86_64/fpu/s_sinf.S: Just include errno.h; don't define __need_Emath or include bits/errno.h directly.	2017-06-14 08:14:34 -04:00
Zack Weinberg	7c3018f9e4	Suppress internal declarations for most of the testsuite. This patch adds a new build module called 'testsuite'. IS_IN (testsuite) implies _ISOMAC, as do IS_IN_build and __cplusplus (which means several ad-hoc tests for __cplusplus can go away). libc-symbols.h now suppresses almost all of itself when _ISOMAC is defined; in particular, _ISOMAC mode does not get config.h automatically anymore. There are still quite a few tests that need to see internal gunk of one variety or another. For them, we now have 'tests-internal' and 'test-internal-extras'; files in this category will still be compiled with MODULE_NAME=nonlib, and everything proceeds as it always has. The bulk of this patch is moving tests from 'tests' to 'tests-internal'. There is also 'tests-static-internal', which has the same effect on files in 'tests-static', and 'modules-names-tests', which has the inverse effect on files in 'modules-names' (it's inverted because most of the things in modules-names are not tests). For both of these, the file must appear in both the new variable and the old one. There is also now a special case for when libc-symbols.h is included without MODULE_NAME being defined at all. (This happens during the creation of libc-modules.h, and also when preprocessing Versions files.) When this happens, IS_IN is set to be always false and _ISOMAC is not defined, which was the status quo, but now it's explicit. The remaining changes to C source files in this patch seemed likely to cause problems in the absence of the main change. They should be relatively self-explanatory. In a few cases I duplicated a definition from an internal header rather than move the test to tests-internal; this was a judgement call each time and I'm happy to change those however reviewers feel is more appropriate. * Makerules: New subdir configuration variables 'tests-internal' and 'test-internal-extras'. Test files in these categories will still be compiled with MODULE_NAME=nonlib. Test files in the existing categories (tests, xtests, test-srcs, test-extras) are now compiled with MODULE_NAME=testsuite. New subdir configuration variable 'modules-names-tests'. Files which are in both 'modules-names' and 'modules-names-tests' will be compiled with MODULE_NAME=testsuite instead of MODULE_NAME=extramodules. (gen-as-const-headers): Move to tests-internal. (do-tests-clean, common-mostlyclean): Support tests-internal. * Makeconfig (built-modules): Add testsuite. * Makefile: Change libof-check-installed-headers-c and libof-check-installed-headers-cxx to 'testsuite'. * Rules: Likewise. Support tests-internal. * benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove extra-modules.mk. * config.h.in: Don't check for __OPTIMIZE__ or __FAST_MATH__ here. * include/libc-symbols.h: Move definitions of _GNU_SOURCE, PASTE_NAME, PASTE_NAME1, IN_MODULE, IS_IN, and IS_IN_LIB to the very top of the file and rationalize their order. If MODULE_NAME is not defined at all, define IS_IN to always be false, and don't define _ISOMAC. If any of IS_IN (testsuite), IS_IN_build, or __cplusplus are true, define _ISOMAC and suppress everything else in this file, starting with the inclusion of config.h. Do check for inappropriate definitions of __OPTIMIZE__ and __FAST_MATH__ here, but only if _ISOMAC is not defined. Correct some out-of-date commentary. * include/math.h: If _ISOMAC is defined, undefine NO_LONG_DOUBLE and _Mlong_double_ before including math.h. * include/string.h: If _ISOMAC is defined, don't expose _STRING_ARCH_unaligned. Move a comment to a more appropriate location. * include/errno.h, include/stdio.h, include/stdlib.h, include/string.h * include/time.h, include/unistd.h, include/wchar.h: No need to check __cplusplus nor use __BEGIN_DECLS/__END_DECLS. * misc/sys/cdefs.h (__NTHNL): New macro. * sysdeps/m68k/m680x0/fpu/bits/mathinline.h (__m81_defun): Use __NTHNL to avoid errors with GCC 6. * elf/tst-env-setuid-tunables.c: Include config.h with _LIBC defined, for HAVE_TUNABLES. * inet/tst-checks-posix.c: No need to define _ISOMAC. * intl/tst-gettext2.c: Provide own definition of N_. * math/test-signgam-finite-c99.c: No need to define _ISOMAC. * math/test-signgam-main.c: No need to define _ISOMAC. * stdlib/tst-strtod.c: Convert to test-driver. Split locale_test to... * stdlib/tst-strtod1i.c: ...this new file. * stdlib/tst-strtod5.c: Convert to test-driver and add copyright notice. Split tests of __strtod_internal to... * stdlib/tst-strtod5i.c: ...this new file. * string/test-string.h: Include stdint.h. Duplicate definition of inhibit_loop_to_libcall here (from libc-symbols.h). * string/test-strstr.c: Provide dummy definition of libc_hidden_builtin_def when including strstr.c. * sysdeps/ia64/fpu/libm-symbols.h: Suppress entire file in _ISOMAC mode; no need to test __STRICT_ANSI__ nor __cplusplus as well. * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h. * elf/Makefile: Move tst-ptrguard1-static, tst-stackguard1-static, tst-tls1-static, tst-tls2-static, tst-tls3-static, loadtest, unload, unload2, circleload1, neededtest, neededtest2, neededtest3, neededtest4, tst-tls1, tst-tls2, tst-tls3, tst-tls6, tst-tls7, tst-tls8, tst-dlmopen2, tst-ptrguard1, tst-stackguard1, tst-_dl_addr_inside_object, and all of the ifunc tests to tests-internal. Don't add $(modules-names) to test-extras. * inet/Makefile: Move tst-inet6_scopeid_pton to tests-internal. Add tst-deadline to tests-static-internal. * malloc/Makefile: Move tst-mallocstate and tst-scratch_buffer to tests-internal. * misc/Makefile: Move tst-atomic and tst-atomic-long to tests-internal. * nptl/Makefile: Move tst-typesizes, tst-rwlock19, tst-sem11, tst-sem12, tst-sem13, tst-barrier5, tst-signal7, tst-tls3, tst-tls3-malloc, tst-tls5, tst-stackguard1, tst-sem11-static, tst-sem12-static, and tst-stackguard1-static to tests-internal. Link tests-internal with libpthread also. Don't add $(modules-names) to test-extras. * nss/Makefile: Move tst-field to tests-internal. * posix/Makefile: Move bug-regex5, bug-regex20, bug-regex33, tst-rfc3484, tst-rfc3484-2, and tst-rfc3484-3 to tests-internal. * stdlib/Makefile: Move tst-strtod1i, tst-strtod3, tst-strtod4, tst-strtod5i, tst-tls-atexit, and tst-tls-atexit-nodelete to tests-internal. * sunrpc/Makefile: Move tst-svc_register to tests-internal. * sysdeps/powerpc/Makefile: Move test-get_hwcap and test-get_hwcap-static to tests-internal. * sysdeps/unix/sysv/linux/Makefile: Move tst-setgetname to tests-internal. * sysdeps/x86_64/fpu/Makefile: Add all libmvec test modules to modules-names-tests.	2017-05-11 19:27:59 -04:00
Zack Weinberg	963394a22b	Allow direct use of math_ldbl.h in testsuite. A few 'long double'-related tests include math_private.h just for their variety of math_ldbl.h, which contains macros for assembling and disassembling the binary representation of 'long double'. math_ldbl.h insists on being included from math_private.h, but if we relax this restriction (and fix some portability sloppiness) we can use it directly and not have to expose all of math_private.h to the testsuite. * sysdeps/generic/math_private.h: Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN. * sysdeps/generic/math_ldbl.h * sysdeps/ia64/fpu/math_ldbl.h * sysdeps/ieee754/ldbl-128/math_ldbl.h * sysdeps/ieee754/ldbl-128ibm/math_ldbl.h * sysdeps/ieee754/ldbl-96/math_ldbl.h * sysdeps/powerpc/fpu/math_ldbl.h * sysdeps/x86_64/fpu/math_ldbl.h: Allow direct inclusion. Use uintNN_t instead of u_intNN_t. Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN. Include endian.h and/or stdint.h if necessary. Add copyright notices. * sysdeps/ieee754/ldbl-128ibm/math_ldbl.h (ldbl_canonicalize_int): Don't use EXTRACT_WORDS64. * sysdeps/ieee754/ldbl-96/test-canonical-ldbl-96.c * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c * sysdeps/ieee754/ldbl-128ibm/test-canonical-ldbl-128ibm.c * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c: Include math_ldbl.h, not math_private.h.	2017-02-25 10:40:48 -05:00
Joseph Myers	92061bb033	Run libm tests separately for each function. At present, libm tests for each function get built into a single executable (for each floating point type, for each of normal / inline / finite-math-only functions, plus vector variants) and run together, resulting in a single PASS or FAIL (for each of those nine variants plus vector variants). Building this executable involves reading over 50 MB of libm-test-.c sources. This patch arranges for tests of each function to be run separately from the makefiles instead. There are 121 functions being tested for each (type, variant pair) (actually 126, but run as 121 from the Makefile because each of the pairs (exp10, pow10), (isfinite, finite), (lgamma, gamma), (remainder, drem), (scalbn, ldexp), shares a table of test results and so is run together), so 1089 separate tests run from the Makefile, plus 48 vector tests on x86_64 (six functions for eight vector variants). Each test only involves a libm-test-<func>.c file of no more than about 4 MB, rather than all such files taking about 50 MB. With tests run separately, test summaries will indicate which functions actually have problems (of course, those problems may just be out-of-date libm-test-ulps files if the file hasn't been updated for the architecture in question recently). All the .c files for the 1089+48 tests are generated automatically from the Makefiles. Various checked-in boilerplate .c files are removed as no longer needed. CFLAGS definitions for the different kinds of tests are generated using makefile iterators to apply target-specific variable settings. libm-have-vector-test.h is no longer needed; the list of functions to test for each vector type is now in the sysdeps Makefile. This should reduce the amount of boilerplate needed for float128 testing support; test-float128.h will still be needed, but not various .c files or Makefile CFLAGS definitions. The logic for creating dependencies on libm-test-support-.o files should also render <https://sourceware.org/ml/libc-alpha/2017-02/msg00279.html> unnecessary. Tested for x86_64 and x86. * math/Makefile (libm-tests-generated): Remove variable. (libm-tests-base-normal): New variable. (libm-tests-base-finite): Likewise. (libm-tests-base-inline): Likewise. (libm-tests-base): Likewise. (libm-tests-normal): Likewise. (libm-tests-finite): Likewise. (libm-tests-inline): Likewise. (libm-tests-vector): Likewise. (libm-tests): Define in terms of these new variables. (libm-tests-for-type): New variable. (libm-tests.o): Move definition. (tests): Move addition of $(libm-tests). (generated): Update for new and removed libm test files. ($(objpfx)libm-test.c): Remove target. ($(objpfx)libm-have-vector-test.h): Likewise. (CFLAGS-test-double-vlen2.c): Remove variable. (CFLAGS-test-double-vlen4.c): Likewise. (CFLAGS-test-double-vlen8.c): Likewise. (CFLAGS-test-float-vlen4.c): Likewise. (CFLAGS-test-float-vlen8.c): Likewise. (CFLAGS-test-float-vlen16.c): Likewise. (CFLAGS-test-float.c): Likewise. (CFLAGS-test-float-finite.c): Likewise. (CFLAGS-libm-test-support-float.c): Likewise. (CFLAGS-test-double.c): Likewise. (CFLAGS-test-double-finite.c): Likewise. (CFLAGS-libm-test-support-double.c): Likewise. (CFLAGS-test-ldouble.c): Likewise. (CFLAGS-test-ldouble-finite.c): Likewise. (CFLAGS-libm-test-support-ldouble.c): Likewise. (libm-test-inline-cflags): New variable. (CFLAGS-test-ifloat.c): Remove variable. (CFLAGS-test-idouble.c): Likewise. (CFLAGS-test-ildouble.c): Likewise. ($(addprefix $(objpfx), $(libm-tests.o))): Move target and update dependencies. ($(foreach t,$(libm-tests-normal),$(objpfx)$(t).c)): New rule. ($(foreach t,$(libm-tests-finite),$(objpfx)$(t).c)): Likewise. ($(foreach t,$(libm-tests-inline),$(objpfx)$(t).c)): Likewise. ($(foreach t,$(libm-tests-vector),$(objpfx)$(t).c)): Likewise. ($(foreach t,$(types),$(objpfx)libm-test-support-$(t).c)): Likewise. (dependencies on libm-test-support-.o): Remove. ($(foreach f,$(libm-test-funcs-all),$(objpfx)$(o)-$(f).o)): New rules using iterators. ($(addprefix $(objpfx),$(call libm-tests-for-type,$(o)))): Likewise. ($(objpfx)libm-test-support-$(o).o): Likewise. ($(addprefix $(objpfx),$(filter-out $(tests-static) $(libm-vec-tests),$(tests)))): Filter out $(libm-tests-vector) instead. ($(addprefix $(objpfx), $(libm-vec-tests))): Use iterator to define rule instead. math/README.libm-test: Update. * math/libm-test-acos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-acosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-asin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-asinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-atan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-atan2.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-atanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cabs.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cacos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cacosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-canonicalize.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-carg.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-casin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-casinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-catan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-catanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cbrt.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ccos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ccosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ceil.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cexp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cimag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-clog.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-clog10.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-conj.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-copysign.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cosh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cpow.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-cproj.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-creal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-csin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-csinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-csqrt.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ctan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ctanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-erf.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-erfc.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-exp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-exp10.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-exp2.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-expm1.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fabs.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fdim.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-floor.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fma.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmax.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmaxmag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fminmag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fmod.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fpclassify.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-frexp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fromfp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-fromfpx.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-getpayload.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-hypot.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ilogb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-iscanonical.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-iseqsig.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isfinite.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isgreater.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isgreaterequal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isinf.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isless.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-islessequal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-islessgreater.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isnan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isnormal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-issignaling.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-issubnormal.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-isunordered.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-iszero.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-j0.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-j1.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-jn.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-lgamma.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-llogb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-llrint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-llround.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log10.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log1p.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-log2.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-logb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-lrint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-lround.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-modf.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nearbyint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nextafter.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nextdown.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nexttoward.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-nextup.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-pow.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-remainder.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-remquo.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-rint.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-round.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-roundeven.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-scalb.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-scalbln.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-scalbn.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-setpayload.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-setpayloadsig.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-signbit.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-significand.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sin.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sincos.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sinh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-sqrt.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-tan.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-tanh.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-tgamma.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-totalorder.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-totalordermag.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-trunc.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ufromfp.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-ufromfpx.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-y0.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-y1.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-yn.inc: Include libm-test-driver.c. (do_test): New function. * math/libm-test-driver.c: Do not include libm-have-vector-test.h. (HAVE_VECTOR): Remove macro. (START): Do not call HAVE_VECTOR. * math/test-double-vlen2.h (FUNC_TEST): Remove macro. * math/test-double-vlen4.h (FUNC_TEST): Remove macro. * math/test-double-vlen8.h (FUNC_TEST): Remove macro. * math/test-float-vlen16.h (FUNC_TEST): Remove macro. * math/test-float-vlen4.h (FUNC_TEST): Remove macro. * math/test-float-vlen8.h (FUNC_TEST): Remove macro. * math/test-math-vector.h (FUNC_TEST): New macro. (WRAPPER_DECL): Rename to WRAPPER_DECL_f. * sysdeps/x86_64/fpu/Makefile (double-vlen2-funcs): New variable. (double-vlen4-funcs): Likewise. (double-vlen4-avx2-funcs): Likewise. (double-vlen8-funcs): Likewise. (float-vlen4-funcs): Likewise. (float-vlen8-funcs): Likewise. (float-vlen8-avx2-funcs): Likewise. (float-vlen16-funcs): Likewise. (CFLAGS-test-double-vlen4-avx2.c): Remove variable. (CFLAGS-test-float-vlen8-avx2.c): Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.h (TEST_VECTOR_cos): Remove macro. (TEST_VECTOR_sin): Likewise. (TEST_VECTOR_sincos): Likewise. (TEST_VECTOR_log): Likewise. (TEST_VECTOR_exp): Likewise. (TEST_VECTOR_pow): Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.h (TEST_VECTOR_cos): Likewise. (TEST_VECTOR_sin): Likewise. (TEST_VECTOR_sincos): Likewise. (TEST_VECTOR_log): Likewise. (TEST_VECTOR_exp): Likewise. (TEST_VECTOR_pow): Likewise. * sysdeps/x86_64/fpu/test-float-vlen16.h (TEST_VECTOR_cosf): Likewise. (TEST_VECTOR_sinf): Likewise. (TEST_VECTOR_sincosf): Likewise. (TEST_VECTOR_logf): Likewise. (TEST_VECTOR_expf): Likewise. (TEST_VECTOR_powf): Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.h (TEST_VECTOR_cosf): Likewise. (TEST_VECTOR_sinf): Likewise. (TEST_VECTOR_sincosf): Likewise. (TEST_VECTOR_logf): Likewise. (TEST_VECTOR_expf): Likewise. (TEST_VECTOR_powf): Likewise. * math/gen-libm-have-vector-test.sh: Remove file. * math/libm-test.inc: Likewise. * math/libm-test-support-double.c: Likewise. * math/libm-test-support-float.c: Likewise. * math/libm-test-support-ldouble.c: Likewise. * math/test-double-finite.c: Likewise.: Likewise. * math/test-double.c: Likewise. * math/test-float-finite.c: Likewise. * math/test-float.c: Likewise. * math/test-idouble.c: Likewise. * math/test-ifloat.c: Likewise. * math/test-ildouble.c: Likewise. * math/test-ldouble-finite.c: Likewise. * math/test-ldouble.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen2.h: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.h: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.	2017-02-24 00:52:49 +00:00
Joseph Myers	2c51dfd05d	Move tests of catan, catanh to auto-libm-test-. This patch moves tests of catan and catanh with finite inputs (other than the divide-by-zero cases producing an exact infinity) to using the auto-libm-test machinery. Each of auto-libm-test-out-catan and auto-libm-test-out-catanh takes about three seconds to generate on my system (so in fact it wasn't necessary after all to defer the move to auto-libm-test- until the output files were split up by function). Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of catan and catanh. * math/auto-libm-test-out-catan: New generated file. * math/auto-libm-test-out-catanh: Likewise. * math/libm-test-catan.inc (catan_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs, except divide-by-zero cases, to auto-libm-test-in. * math/libm-test-catanh.inc (catanh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add catan and catanh. (libm-test-funcs-noauto): Remove catan and catanh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-02-17 18:42:37 +00:00
Joseph Myers	fa2a3dd7a3	Move tests of casin, casinh to auto-libm-test-. This patch moves tests of casin and casinh with finite inputs to using the auto-libm-test machinery. Each of auto-libm-test-out-casin and auto-libm-test-out-casinh takes about 38 minutes to generate on my system because of MPC slowness on special cases that appear in the tests (with MPC 1.0.3; I don't know to what extent current MPC master might speed it up). Tested for x86_64 and x86 and ulps updated accordingly. math/auto-libm-test-in: Add tests of casin and casinh. * math/auto-libm-test-out-casin: New generated file. * math/auto-libm-test-out-casinh: Likewise. * math/libm-test-casin.inc (casin_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs to auto-libm-test-in. * math/libm-test-casinh.inc (casinh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add casin and casinh. (libm-test-funcs-noauto): Remove casin and casinh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-02-17 18:14:02 +00:00
Joseph Myers	6b8303a383	Move tests of cacos, cacosh to auto-libm-test-. This patch moves tests of cacos and cacosh with finite inputs to using the auto-libm-test machinery. Each of auto-libm-test-out-cacos and auto-libm-test-out-cacosh takes about 80 minutes to generate on my system because of MPC slowness on special cases that appear in the tests (with MPC 1.0.3; I don't know to what extent current MPC master might speed it up). Tested for x86_64 and x86 and ulps updated accordingly. math/auto-libm-test-in: Add tests of cacos and cacosh. * math/auto-libm-test-out-cacos: New generated file. * math/auto-libm-test-out-cacosh: Likewise. * math/libm-test-cacos.inc (cacos_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs to auto-libm-test-in. * math/libm-test-cacosh.inc (cacosh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add cacos and cacosh. (libm-test-funcs-noauto): Remove cacos and cacosh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2017-02-17 17:44:23 +00:00
Joseph Myers	f7a51347a4	Revert header inclusion changes that break math/ testing on x86_64. Revert: 2017-02-16 Zack Weinberg <zackw@panix.com> * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h.	2017-02-17 17:08:17 +00:00
Zack Weinberg	ceaa98897c	Add missing header files throughout the testsuite. * crypt/md5.h: Test _LIBC with #if defined, not #if. * dirent/opendir-tst1.c: Include sys/stat.h. * dirent/tst-fdopendir.c: Include sys/stat.h. * dirent/tst-fdopendir2.c: Include stdlib.h. * dirent/tst-scandir.c: Include stdbool.h. * elf/tst-auditmod1.c: Include link.h and stddef.h. * elf/tst-tls15.c: Include stdlib.h. * elf/tst-tls16.c: Include stdlib.h. * elf/tst-tls17.c: Include stdlib.h. * elf/tst-tls18.c: Include stdlib.h. * iconv/tst-iconv6.c: Include endian.h. * iconvdata/bug-iconv11.c: Include limits.h. * io/test-utime.c: Include stdint.h. * io/tst-faccessat.c: Include sys/stat.h. * io/tst-fchmodat.c: Include sys/stat.h. * io/tst-fchownat.c: Include sys/stat.h. * io/tst-fstatat.c: Include sys/stat.h. * io/tst-futimesat.c: Include sys/stat.h. * io/tst-linkat.c: Include sys/stat.h. * io/tst-mkdirat.c: Include sys/stat.h and stdbool.h. * io/tst-mkfifoat.c: Include sys/stat.h and stdbool.h. * io/tst-mknodat.c: Include sys/stat.h and stdbool.h. * io/tst-openat.c: Include stdbool.h. * io/tst-readlinkat.c: Include sys/stat.h. * io/tst-renameat.c: Include sys/stat.h. * io/tst-symlinkat.c: Include sys/stat.h. * io/tst-unlinkat.c: Include stdbool.h. * libio/bug-memstream1.c: Include stdlib.h. * libio/bug-wmemstream1.c: Include stdlib.h. * libio/tst-fwrite-error.c: Include stdlib.h. * libio/tst-memstream1.c: Include stdlib.h. * libio/tst-memstream2.c: Include stdlib.h. * libio/tst-memstream3.c: Include stdlib.h. * malloc/tst-interpose-aux.c: Include stdint.h. * misc/tst-preadvwritev-common.c: Include sys/stat.h. * nptl/tst-basic7.c: Include limits.h. * nptl/tst-cancel25.c: Include pthread.h, not pthreadP.h. * nptl/tst-cancel4.c: Include stddef.h, limits.h, and sys/stat.h. * nptl/tst-cancel4_1.c: Include stddef.h. * nptl/tst-cancel4_2.c: Include stddef.h. * nptl/tst-cond16.c: Include limits.h. Use sysconf(_SC_PAGESIZE) instead of __getpagesize. * nptl/tst-cond18.c: Include limits.h. Use sysconf(_SC_PAGESIZE) instead of __getpagesize. * nptl/tst-cond4.c: Include stdint.h. * nptl/tst-cond6.c: Include stdint.h. * nptl/tst-stack2.c: Include limits.h. * nptl/tst-stackguard1.c: Include stddef.h. * nptl/tst-tls4.c: Include stdint.h. Don't include tls.h. * nptl/tst-tls4moda.c: Include stddef.h. Don't include stdio.h, unistd.h, or tls.h. * nptl/tst-tls4modb.c: Include stddef.h. Don't include stdio.h, unistd.h, or tls.h. * nptl/tst-tls5.h: Include stddef.h. Don't include stdlib.h or tls.h. * posix/tst-getaddrinfo2.c: Include stdio.h. * posix/tst-getaddrinfo5.c: Include stdio.h. * posix/tst-pathconf.c: Include sys/stat.h. * posix/tst-posix_fadvise-common.c: Include stdint.h. * posix/tst-preadwrite-common.c: Include sys/stat.h. * posix/tst-regex.c: Include stdint.h. Don't include spawn.h or spawn_int.h. * posix/tst-regexloc.c: Don't include spawn.h or spawn_int.h. * posix/tst-vfork3.c: Include sys/stat.h. * resolv/tst-bug18665-tcp.c: Include stdlib.h. * resolv/tst-res_hconf_reorder.c: Include stdlib.h. * resolv/tst-resolv-search.c: Include stdlib.h. * stdio-common/tst-fmemopen2.c: Include stdint.h. * stdio-common/tst-vfprintf-width-prec.c: Include stdlib.h. * stdlib/test-canon.c: Include sys/stat.h. * stdlib/tst-tls-atexit.c: Include stdbool.h. * string/test-memchr.c: Include stdint.h. * string/tst-cmp.c: Include stdint.h. * sysdeps/pthread/tst-timer.c: Include stdint.h. * sysdeps/unix/sysv/linux/tst-sync_file_range.c: Include stdint.h. * sysdeps/wordsize-64/tst-writev.c: Include limits.h and stdint.h. * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/tst-auditmod10b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod3b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod4b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod5b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod6b.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod6c.c: Include link.h and stddef.h. * sysdeps/x86_64/tst-auditmod7b.c: Include link.h and stddef.h. * time/clocktest.c: Include stdint.h. * time/tst-posixtz.c: Include stdint.h. * timezone/tst-timezone.c: Include stdint.h.	2017-02-16 17:33:18 -05:00
Joseph Myers	10303eb74b	Move most libmvec test contents from .c to .h files. The libmvec tests put substantive, architecture-specific contents in .c files such as test-double-vlen4.c, so making those files architecture-specific and causing issues for generating such files automatically when splitting up tests by function. This patch moves all the substantive contents to .h files, so the .c files only include the .h file and then libm-test.c. This allows for automatic generation of per-function .c files in future. The .h files in turn #include or #include_next the architecture-independent file and add the architecture-specific definitions to that. (Splitting by function should in fact allow the TEST_VECTOR_* macros to be replaced by sysdeps makefile information on which functions to test in each case, removing the need for gen-libm-have-vector-test.sh as well as removing the need for some of the architecture-specific headers.) Tested for x86_64. * sysdeps/x86_64/fpu/test-double-vlen2.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen2.h: ... here. New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen4-avx2.h: ... here. New file. * sysdeps/x86_64/fpu/test-double-vlen4.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen4.h: ... here. New file. * sysdeps/x86_64/fpu/test-double-vlen8.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-double-vlen8.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen16.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen16.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen4.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen4.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen8-avx2.h: ... here. New file. * sysdeps/x86_64/fpu/test-float-vlen8.c: Move most contents to, and include ... * sysdeps/x86_64/fpu/test-float-vlen8.h: ... here. New file.	2017-02-15 01:13:15 +00:00
Joseph Myers	bfff8b1bec	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
Joseph Myers	0a2546cdaa	Fix x86, x86_64 fmax, fmin sNaN handling, add tests (bug 20947). Various fmax and fmin function implementations mishandle sNaN arguments: (a) When both arguments are NaNs, the return value should be a qNaN, but sometimes it is an sNaN if at least one argument is an sNaN. (b) Under TS 18661-1 semantics, if either argument is an sNaN then the result should be a qNaN (whereas if one argument is a qNaN and the other is not a NaN, the result should be the non-NaN argument). Various implementations treat sNaNs like qNaNs here. This patch fixes the x86 and x86_64 versions (ignoring float and double for 32-bit x86 given the inability to reliably avoid the sNaN turning into a qNaN before it gets to the called function). Tests of sNaN inputs to these functions are added. Note on architecture versions I haven't changed for this issue: AArch64 already gets this right (it uses a hardware instruction with the correct semantics for both quiet and signaling NaNs) and does not need changes. It's possible Alpha, IA64, SPARC might need changes (this would be shown by the testsuite if so). Tested for x86_64 and x86 (both i686 and i586 builds, to cover the different x86 implementations). [BZ #20947] * sysdeps/i386/fpu/s_fmaxl.S (__fmaxl): Add the arguments when either is a signaling NaN. * sysdeps/i386/fpu/s_fminl.S (__fminl): Likewise. Make code follow fmaxl more closely. * sysdeps/i386/i686/fpu/s_fmaxl.S (__fmaxl): Add the arguments when either is a signaling NaN. * sysdeps/i386/i686/fpu/s_fminl.S (__fminl): Likewise. * sysdeps/x86_64/fpu/s_fmax.S (__fmax): Likewise. * sysdeps/x86_64/fpu/s_fmaxf.S (__fmaxf): Likewise. * sysdeps/x86_64/fpu/s_fmaxl.S (__fmaxl): Likewise. * sysdeps/x86_64/fpu/s_fmin.S (__fmin): Likewise. * sysdeps/x86_64/fpu/s_fminf.S (__fminf): Likewise. * sysdeps/x86_64/fpu/s_fminl.S (__fminl): Likewise. * math/libm-test.inc (fmax_test_data): Add tests of sNaN inputs. (fmin_test_data): Likewise.	2016-12-15 23:52:18 +00:00
Joseph Myers	a91fd168a0	Fix x86_64/x86 powl handling of sNaN arguments (bug 20916). The x86_64/x86 powl implementations mishandle sNaN arguments, both by returning sNaN in some cases (instead of doing arithmetic on the arguments to produce the result when NaN arguments result in NaN results) and by treating sNaN the same as qNaN for arguments (1, sNaN) and (sNaN, 0), contrary to TS 18661-1 which requires those cases to return qNaN instead of 1. This patch makes the x86_64/x86 powl implementations follow TS 18661-1 semantics for sNaN arguments; sNaN tests are also added for pow. Given the problems with testing float and double sNaN arguments on 32-bit x86 (sNaN tests disabled because the compiler may convert unnecessarily to a qNaN when passing arguments), no changes are made to the powf and pow implementations there. Tested for x86_64 and x86. [BZ #20916] * sysdeps/i386/fpu/e_powl.S (__ieee754_powl): Do not return 1 for arguments (sNaN, 0) or (1, sNaN). Do arithmetic on NaN arguments to compute result. * sysdeps/x86_64/fpu/e_powl.S (__ieee754_powl): Likewise. * math/libm-test.inc (pow_test_data): Add tests of sNaN arguments.	2016-12-06 00:33:19 +00:00
Joseph Myers	799131036e	Do not hardcode platform names in manual/libm-err-tab.pl (bug 14139). manual/libm-err-tab.pl hardcodes a list of names for particular platforms (mapping from sysdeps directory name to friendly name for the manual). This goes against the principle of keeping information about individual platforms in their corresponding sysdeps directory, and the list is also very out-of-date regarding supported platforms and their corresponding sysdeps directories. This patch fixes this by adding a libm-test-ulps-name file alongside each libm-test-ulps file. The script then gets the friendly name from that file, which is required to exist, so it no longer needs to allow for the mapping being missing. Tested for x86_64. [BZ #14139] * manual/libm-err-tab.pl (%pplatforms): Initialize to empty. (find_files): Obtain platform name from libm-test-ulps-name and store in %pplatforms. (canonicalize_platform): Remove. (print_platforms): Use $pplatforms directly. (by_platforms): Do not allow for platforms missing from %pplatforms. * sysdeps/aarch64/libm-test-ulps-name: New file. * sysdeps/alpha/fpu/libm-test-ulps-name: Likewise. * sysdeps/arm/libm-test-ulps-name: Likewise. * sysdeps/generic/libm-test-ulps-name: Likewise. * sysdeps/hppa/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps-name: Likewise. * sysdeps/ia64/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/coldfire/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps-name: Likewise. * sysdeps/microblaze/libm-test-ulps-name: Likewise. * sysdeps/mips/mips32/libm-test-ulps-name: Likewise. * sysdeps/mips/mips64/libm-test-ulps-name: Likewise. * sysdeps/nios2/libm-test-ulps-name: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps-name: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps-name: Likewise. * sysdeps/s390/fpu/libm-test-ulps-name: Likewise. * sysdeps/sh/libm-test-ulps-name: Likewise. * sysdeps/sparc/fpu/libm-test-ulps-name: Likewise. * sysdeps/tile/libm-test-ulps-name: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps-name: Likewise.	2016-11-04 16:49:06 +00:00
Joseph Myers	f280fa6d17	Use __builtin_fma more in dbl-64 code. sysdeps/ieee754/dbl-64/dla.h can use a macro DLA_FMS for more efficient double-width operations when fused multiply-subtract is supported. However, this macro is only defined for x86_64, conditional on architecture-specific __FMA4__. This patch makes the code use __builtin_fma conditional on __FP_FAST_FMA, as used elsewhere in glibc. Tested for x86_64, x86 and powerpc. On powerpc (where this is causing fused operations to be used where they weren't previously) I see an increase from 1ulp to 2ulp in the imaginary part of clog10: testing double (without inline functions) Failure: Test: Imaginary part of: clog10 (0x1.7a858p+0 - 0x6.d940dp-4 i) Result: is: -1.2237865208199886e-01 -0x1.f5435146bb61ap-4 should be: -1.2237865208199888e-01 -0x1.f5435146bb61cp-4 difference: 2.7755575615628914e-17 0x1.0000000000000p-55 ulp : 2.0000 max.ulp : 1.0000 Maximal error of real part of: clog10 is : 3 ulp accepted: 3 ulp Maximal error of imaginary part of: clog10 is : 2 ulp accepted: 1 ulp This is actually resulting from atan2 becoming more accurate (atan2 (-0x6.d940dp-4, 0x1.7a858p+0) should ideally be -0x1.208cd6e841554p-2 but was -0x1.208cd6e841555p-2 from a powerpc libm built before this change, and is -0x1.208cd6e841554p-2 from a powerpc libm built after this change). Since these functions are not expected to be correctly rounding by glibc's accuracy goals, neither result is a problem, but this does imply that some of this code, although designed to be correctly rounding, is not in fact correctly rounding (possibly because of GCC creating fused operations where the code does not expect it, something we've only disabled for specific functions where it was found to cause large errors). (Of course as previously discussed I think we should remove the slow cases where an error analysis shows this wouldn't increase the errors much above 0.5ulp; it's only functions such as cratan2 that are expected to be correctly rounding, not atan2.) * sysdeps/ieee754/dbl-64/dla.h [__FP_FAST_FMA] (DLA_FMS): Define macro to use __builtin_fma. * sysdeps/x86_64/fpu/dla.h: Remove file.	2016-09-30 15:49:51 +00:00
Joseph Myers	ec94343f59	Add femode_t functions. TS 18661-1 defines a type femode_t to represent the set of dynamic floating-point control modes (such as the rounding mode and trap enablement modes), and functions fegetmode and fesetmode to manipulate those modes (without affecting other state such as the raised exception flags) and a corresponding macro FE_DFL_MODE. This patch series implements those interfaces for glibc. This first patch adds the architecture-independent pieces, the x86 and x86_64 implementations, and the <bits/fenv.h> and ABI baseline updates for all architectures so glibc keeps building and passing the ABI tests on all architectures. Subsequent patches add the fegetmode and fesetmode implementations for other architectures. femode_t is generally an integer type - the same type as fenv_t, or as the single element of fenv_t where fenv_t is a structure containing a single integer (or the single relevant element, where it has elements for both status and control registers) - except where architecture properties or consistency with the fenv_t implementation indicate otherwise. FE_DFL_MODE follows FE_DFL_ENV in whether it's a magic pointer value (-1 cast to const femode_t ), a value that can be distinguished from valid pointers by its high bits but otherwise contains a representation of the desired register contents, or a pointer to a constant variable (the powerpc case; __fe_dfl_mode is added as an exported constant object, an alias to __fe_dfl_env). Note that where architectures (that share a register between control and status bits) gain definitions of new floating-point control or status bits in future, the implementations of fesetmode for those architectures may need updating (depending on whether the new bits are control or status bits and what the implementation does with previously unknown bits), just like existing implementations of <fenv.h> functions that take care not to touch reserved bits may need updating when the set of reserved bits changes. (As any new bits are outside the scope of ISO C, that's just a quality-of-implementation issue for supporting them, not a conformance issue.) As with fenv_t, femode_t should properly include any software DFP rounding mode (and for both fenv_t and femode_t I'd consider that fragment of DFP support appropriate for inclusion in glibc even in the absence of the rest of libdfp; hardware DFP rounding modes should already be included if the definitions of which bits are status / control bits are correct). Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). Other architecture versions are untested. math/fegetmode.c: New file. * math/fesetmode.c: Likewise. * sysdeps/i386/fpu/fegetmode.c: Likewise. * sysdeps/i386/fpu/fesetmode.c: Likewise. * sysdeps/x86_64/fpu/fegetmode.c: Likewise. * sysdeps/x86_64/fpu/fesetmode.c: Likewise. * math/fenv.h: Update comment on inclusion of <bits/fenv.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fegetmode): New function declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetmode): Likewise. * bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/aarch64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/alpha/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/arm/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/hppa/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/ia64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/m68k/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/microblaze/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/mips/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/nios2/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/powerpc/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (__fe_dfl_mode): New variable declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/s390/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sh/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sparc/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/tile/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/x86/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * manual/arith.texi (FE_DFL_MODE): Document macro. (fegetmode): Document function. (fesetmode): Likewise. * math/Versions (fegetmode): New libm symbol at version GLIBC_2.25. (fesetmode): Likewise. * math/Makefile (libm-support): Add fegetmode and fesetmode. (tests): Add test-femode and test-femode-traps. * math/test-femode-traps.c: New file. * math/test-femode.c: Likewise. * sysdeps/powerpc/fpu/fenv_const.c (__fe_dfl_mode): Declare as alias for __fe_dfl_env. * sysdeps/powerpc/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/Versions (__fe_dfl_mode): New libm symbol at version GLIBC_2.25. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.	2016-09-07 16:40:09 +00:00
Paul E. Murphy	2bad840e9d	Remove unneeded stubs for k_rem_pio2l. This is only used for the float and double variants. Instead, just add it to the type specific list of files, and remove all stubs, and remove the declaration from math_private.h. I verified x86_64, i486, ia64, m68k, and ppc64 build.	2016-09-01 09:31:06 -05:00
Joseph Myers	5146356f5a	Add fesetexcept. TS 18661-1 defines an fesetexcept function for setting floating-point exception flags without the side-effect of causing enabled traps to be taken. This patch series implements this function for glibc. The present patch adds the fallback stub implementation, x86 and x86_64 implementations, documentation, tests and ABI baseline updates. The remaining patches, some of them untested, add implementations for other architectures. The implementations generally follow those of the fesetexceptflag function. As for fesetexceptflag, the approach taken for architectures where setting flags causes enabled traps to be taken is to set the flags (and potentially cause traps) rather than refusing to set the flags and returning an error. Since ISO C and TS 18661 provide no way to enable traps, this is formally in accordance with the standards. The NEWS entry should be considered a placeholder, since this patch series is intended to be followed by further such series adding other TS 18661-1 features, so that the NEWS entry would end up looking more like * New <fenv.h> features from TS 18661-1:2014 are added to libm: the fesetexcept, fetestexceptflag, fegetmode and fesetmode functions, the femode_t type and the FE_DFL_MODE macro. with hopefully more such entries for other features, rather than having an entry for a single function in the end. I believe we have consensus for adding TS 18661-1 interfaces as per <https://sourceware.org/ml/libc-alpha/2016-06/msg00421.html>. Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). * math/fesetexcept.c: New file. * sysdeps/i386/fpu/fesetexcept.c: Likewise. * sysdeps/x86_64/fpu/fesetexcept.c: Likewise. * math/fenv.h: Define __GLIBC_INTERNAL_STARTING_HEADER_IMPLEMENTATION and include <bits/libc-header-start.h> instead of including <features.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetexcept): New function declaration. * manual/arith.texi (fesetexcept): Document function. * math/Versions (fesetexcept): New libm symbol at version GLIBC_2.25. * math/Makefile (libm-support): Add fesetexcept. (tests): Add test-fesetexcept and test-fesetexcept-traps. * math/test-fesetexcept.c: New file. * math/test-fesetexcept-traps.c: Likewise. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.	2016-08-16 16:16:10 +00:00
Andrew Senkevich	533f9bebf9	x86_64: Call finite scalar versions in vectorized log, pow, exp (bz #20033 ). Vector math functions require -ffast-math which sets -ffinite-math-only, so it is needed to call finite scalar versions (which are called from vector functions in some cases). Since finite version of pow() returns qNaN instead of 1.0 for several inputs, those inputs are excluded for tests of vector math functions. [BZ #20033] * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: Call finite version. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_log2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: Likewise. * math/libm-test.inc (pow_test_data): Exclude tests for qNaN in power zero.	2016-08-02 16:35:25 +03:00
H.J. Lu	fe0cf86148	Don't compile do_test with -mavx/-mavx/-mavx512 Don't compile do_test with -mavx, -mavx nor -mavx512 since they won't run on non-AVX machines. [BZ #20384] * sysdeps/x86_64/fpu/Makefile (extra-test-objs): Add test-double-libmvec-sincos-avx-main.o, test-double-libmvec-sincos-avx2-main.o, test-double-libmvec-sincos-main.o, test-float-libmvec-sincosf-avx-main.o, test-float-libmvec-sincosf-avx2-main.o and test-float-libmvec-sincosf-main.o. test-float-libmvec-sincosf-avx512-main.o. ($(objpfx)test-double-libmvec-sincos): Also link with $(objpfx)test-double-libmvec-sincos-main.o. ($(objpfx)test-double-libmvec-sincos-avx): Also link with $(objpfx)test-double-libmvec-sincos-avx-main.o. ($(objpfx)test-double-libmvec-sincos-avx2): Also link with $(objpfx)test-double-libmvec-sincos-avx2-main.o. ($(objpfx)test-float-libmvec-sincosf): Also link with $(objpfx)test-float-libmvec-sincosf-main.o. ($(objpfx)test-float-libmvec-sincosf-avx): Also link with $(objpfx)test-float-libmvec-sincosf-avx2-main.o. [$(config-cflags-avx512) == yes] (extra-test-objs): Add test-double-libmvec-sincos-avx512-main.o and ($(objpfx)test-double-libmvec-sincos-avx512): Also link with $(objpfx)test-double-libmvec-sincos-avx512-main.o. ($(objpfx)test-float-libmvec-sincosf-avx512): Also link with $(objpfx)test-float-libmvec-sincosf-avx512-main.o. (CFLAGS-test-double-libmvec-sincos.c): Removed. (CFLAGS-test-float-libmvec-sincosf.c): Likewise. (CFLAGS-test-double-libmvec-sincos-main.c): New. (CFLAGS-test-double-libmvec-sincos-avx-main.c): Likewise. (CFLAGS-test-double-libmvec-sincos-avx2-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-avx-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-avx2-main.c): Likewise. (CFLAGS-test-float-libmvec-sincosf-avx512-main.c): Likewise. (CFLAGS-test-double-libmvec-sincos-avx.c): Set to -DREQUIRE_AVX. (CFLAGS-test-float-libmvec-sincosf-avx.c ): Likewise. (CFLAGS-test-double-libmvec-sincos-avx2.c): Set to -DREQUIRE_AVX2. (CFLAGS-test-float-libmvec-sincosf-avx2.c ): Likewise. (CFLAGS-test-double-libmvec-sincos-avx512.c): Set to -DREQUIRE_AVX512F. (CFLAGS-test-float-libmvec-sincosf-avx512.c): Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Rewritten. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx-main.c: New file. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-main.c: Likewise.	2016-07-27 11:53:15 -07:00
H.J. Lu	f43cb35c9b	Require binutils 2.24 to build x86-64 glibc [BZ #20139 ] If assembler doesn't support AVX512DQ, _dl_runtime_resolve_avx is used to save the first 8 vector registers, which only saves the lower 256 bits of vector register, for lazy binding. When it is called on AVX512 platform, the upper 256 bits of ZMM registers are clobbered. Parameters passed in ZMM registers will be wrong when the function is called the first time. This patch requires binutils 2.24, whose assembler can store and load ZMM registers, to build x86-64 glibc. Since mathvec library needs assembler support for AVX512DQ, we disable mathvec if assembler doesn't support AVX512DQ. [BZ #20139] * config.h.in (HAVE_AVX512_ASM_SUPPORT): Renamed to ... (HAVE_AVX512DQ_ASM_SUPPORT): This. * sysdeps/x86_64/configure.ac: Require assembler from binutils 2.24 or above. (HAVE_AVX512_ASM_SUPPORT): Removed. (HAVE_AVX512DQ_ASM_SUPPORT): New. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S: Make HAVE_AVX512_ASM_SUPPORT check unconditional. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/x86_64/multiarch/memcpy.S: Likewise. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memmove.S: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Check HAVE_AVX512DQ_ASM_SUPPORT instead of HAVE_AVX512_ASM_SUPPORT. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx51: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise.	2016-07-01 06:03:05 -07:00
Andrew Senkevich	ee2196bb67	Fixed wrong vector sincos/sincosf ABI to have it compatible with current vector function declaration "#pragma omp declare simd notinbranch", according to which vector sincos should have vector of pointers for second and third parameters. It is fixed with implementation as wrapper to version having second and third parameters as pointers. [BZ #20024] * sysdeps/x86/fpu/test-math-vector-sincos.h: New. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Fixed ABI of this implementation of vector function. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S: Likewise. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Use another wrapper for testing vector sincos with fixed ABI. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx.c: New test. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise. * sysdeps/x86_64/fpu/Makefile: Added new tests.	2016-07-01 14:15:38 +03:00
Joseph Myers	30dcf959d2	Avoid "inexact" exceptions in i386/x86_64 trunc functions (bug 15479). As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line trunc function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_trunc.S (__trunc): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_truncf.S (__truncf): Likewise. * sysdeps/i386/fpu/s_truncl.S (__truncl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_truncl.S (__truncl): Likewise. * math/libm-test.inc (trunc_test_data): Do not allow spurious "inexact" exceptions.	2016-06-27 17:26:52 +00:00
Joseph Myers	623629de06	Avoid "inexact" exceptions in i386/x86_64 floor functions (bug 15479). As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line floor function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_floor.S (__floor): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_floorf.S (__floorf): Likewise. * sysdeps/i386/fpu/s_floorl.S (__floorl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_floorl.S (__floorl): Likewise. * math/libm-test.inc (floor_test_data): Do not allow spurious "inexact" exceptions.	2016-06-27 17:25:47 +00:00
Joseph Myers	26b0bf9600	Avoid "inexact" exceptions in i386/x86_64 ceil functions (bug 15479). As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line ceil function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_ceil.S (__ceil): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_ceilf.S (__ceilf): Likewise. * sysdeps/i386/fpu/s_ceill.S (__ceill): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_ceill.S (__ceill): Likewise. * math/libm-test.inc (ceil_test_data): Do not allow spurious "inexact" exceptions.	2016-06-27 17:24:30 +00:00
Joseph Myers	40244be372	Fix i386/x86_64 scalbl with sNaN input (bug 20296). The x86_64 and i386 versions of scalbl return sNaN for some cases of sNaN input and are missing "invalid" exceptions for other cases. This results from overly complicated code that either returns a NaN input, or discards both inputs when one is NaN and loads a NaN from memory. This patch fixes this by simplifying the code to add the arguments when either one is NaN. Tested for x86_64 and x86. [BZ #20296] * sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Add arguments when either argument is a NaN. * sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise. * math/libm-test.inc (scalb_test_data): Add sNaN tests.	2016-06-23 22:17:41 +00:00
Joseph Myers	4e9bf327ad	Simplify x86 nearbyint functions. The i386 implementations of nearbyint functions, and x86_64 nearbyintl, contain code to mask the "inexact" exception. However, the fnstenv instruction has the effect of masking all exceptions, so this masking code has been redundant since fnstenv was added to those implementations (by commit 846d9a4a3acdb4939ca7bf6aed48f9f6f26911be; commit `71d1b0166b` added the test math/test-nearbyint-except-2.c that verifies these functions do work when called with "inexact" traps enabled); this patch removes the redundant code. Tested for x86_64 and x86. * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Do not mask "inexact" exceptions after fnstenv. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.	2016-06-22 15:40:30 +00:00
Andrew Senkevich	df2258c6cb	Added tests to ensure linkage through libmvec _finite aliases which are defined in libmvec_nonshared.a (bug 19654). [BZ #19654] sysdeps/x86_64/fpu/Makefile: Added new tests. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-main.c: New. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias.c: Likewise. * sysdeps/x86_64/fpu/test-libmvec-alias-mod.c: Likewise.	2016-06-20 21:15:50 +03:00
Joseph Myers	f4015c8a86	Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256). Some architectures have their own versions of fdim functions, which are missing errno setting (bug 6796) and may also return sNaN instead of qNaN for sNaN input, in the case of the x86 / x86_64 long double versions (bug 20256). These versions are not actually doing anything that a compiler couldn't generate, just straightforward comparisons / arithmetic (and, in the x86 / x86_64 case, testing for NaNs with fxam, which isn't actually needed once you use an unordered comparison and let the NaNs pass through the same subtraction as non-NaN inputs). This patch removes the x86 / x86_64 / powerpc versions, so that those architectures use the generic C versions, which correctly handle setting errno and deal properly with sNaN inputs. This seems better than dealing with setting errno in lots of .S versions. The i386 versions also return results with excess range and precision, which is not appropriate for a function exactly defined by reference to IEEE operations. For errno setting to work correctly on overflow, it's necessary to remove excess range with math_narrow_eval, which this patch duly does in the float and double versions so that the tests can reliably pass on x86. For float, this avoids any double rounding issues as the long double precision is more than twice that of float. For double, double rounding issues will need to be addressed separately, so this patch does not fully fix bug 20255. Tested for x86_64, x86 and powerpc. [BZ #6796] [BZ #20255] [BZ #20256] * math/s_fdim.c: Include <math_private.h>. (__fdim): Use math_narrow_eval on result. * math/s_fdimf.c: Include <math_private.h>. (__fdimf): Use math_narrow_eval on result. * sysdeps/i386/fpu/s_fdim.S: Remove file. * sysdeps/i386/fpu/s_fdimf.S: Likewise. * sysdeps/i386/fpu/s_fdiml.S: Likewise. * sysdeps/i386/i686/fpu/s_fdim.S: Likewise. * sysdeps/i386/i686/fpu/s_fdimf.S: Likewise. * sysdeps/i386/i686/fpu/s_fdiml.S: Likewise. * sysdeps/powerpc/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/fpu/s_fdimf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise. * sysdeps/x86_64/fpu/s_fdiml.S: Likewise. * math/libm-test.inc (fdim_test_data): Expect errno setting on overflow. Add sNaN tests.	2016-06-14 16:04:19 +00:00
Joseph Myers	f00faa4a43	Fix i386/x86_64 log2l (sNaN) (bug 20235). The i386/x86_64 versions of log2l return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20235] * sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Likewise. * math/libm-test.inc (log2_test_data): Add sNaN tests.	2016-06-09 18:04:30 +00:00
Joseph Myers	8c010e2f71	Fix i386/x86_64 log1pl (sNaN) (bug 20229). The i386/x86_64 versions of log1pl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20229] * sysdeps/i386/fpu/s_log1pl.S (__log1pl): Add NaN input to itself. * sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Likewise. * math/libm-test.inc (log1p_test_data): Add sNaN tests.	2016-06-08 23:11:42 +00:00
Joseph Myers	09096b3615	Fix i386/x86_64 log10l (sNaN) (bug 20228). The i386/x86_64 versions of log10l return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20228] * sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Likewise. * math/libm-test.inc (log10_test_data): Add sNaN tests.	2016-06-08 22:59:18 +00:00
Joseph Myers	df179d8808	Fix i386/x86_64 logl (sNaN) (bug 20227). The i386/x86_64 versions of logl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86 (including a build for i586 to cover the non-i686 logl version). [BZ #20227] * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Add NaN input to itself. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise. * math/libm-test.inc (log_test_data): Add sNaN tests.	2016-06-08 22:24:06 +00:00
Joseph Myers	9bd3ef8e19	Fix i386/x86_64 expl, exp10l, expm1l for sNaN input (bug 20226). The i386 and x86_64 implementations of expl, exp10l and expm1l (code shared between the functions) return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20226] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Add NaN argument to itself. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise. * math/libm-test.inc (exp_test_data): Add sNaN tests. (exp10_test_data): Likewise. (expm1_test_data): Likewise.	2016-06-08 21:55:06 +00:00
Joseph Myers	5ff81530dd	Do not raise "inexact" from x86_64 SSE4.1 ceil, floor (bug 15479). Continuing fixes for ceil and floor functions not to raise the "inexact" exception, this patch fixes the x86_64 SSE4.1 versions. The roundss / roundsd instructions take an immediate operand that determines the rounding mode and whether to raise "inexact"; this just needs bit 3 set to disable "inexact", which this patch does. Remark: we don't have an SSE4.1 version of trunc / truncf (using this instruction with operand 11); I'd expect one to make sense, but of course it should be benchmarked against the existing C code. I'll file a bug in Bugzilla for the lack of such a version. Tested for x86_64. [BZ #15479] * sysdeps/x86_64/fpu/multiarch/s_ceil.S (__ceil_sse41): Set bit 3 of immediate operand to rounding instruction. * sysdeps/x86_64/fpu/multiarch/s_ceilf.S (__ceilf_sse41): Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.S (__floor_sse41): Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf_sse41): Likewise.	2016-05-24 21:11:18 +00:00
Joseph Myers	c898991d8b	Fix x86_64 / x86 powl inaccuracy for integer exponents (bug 19848). Bug 19848 reports cases where powl on x86 / x86_64 has error accumulation, for small integer exponents, larger than permitted by glibc's accuracy goals, at least in some rounding modes. This patch further restricts the exponent range for which the small-integer-exponent logic is used to limit the possible error accumulation. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #19848] * sysdeps/i386/fpu/e_powl.S (p3): Rename to p2 and change value from 8 to 4. (__ieee754_powl): Compare integer exponent against 4 not 8. * sysdeps/x86_64/fpu/e_powl.S (p3): Rename to p2 and change value from 8 to 4. (__ieee754_powl): Compare integer exponent against 4 not 8. * math/auto-libm-test-in: Add more tests of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2016-03-24 01:32:52 +00:00
H.J. Lu	86ed888255	Use JUMPTARGET in x86-64 mathvec When PLT may be used, JUMPTARGET should be used instead calling the function directly. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S (_ZGVbN2v_cos_sse4): Use JUMPTARGET to call cos. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S (_ZGVdN4v_cos_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S (_ZGVdN4v_cos): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S (_ZGVbN2v_exp_sse4): Use JUMPTARGET to call exp. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S (_ZGVdN4v_exp_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S (_ZGVdN4v_exp): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S (_ZGVbN2v_log_sse4): Use JUMPTARGET to call log. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S (_ZGVdN4v_log_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S (_ZGVdN4v_log): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S (_ZGVbN2vv_pow_sse4): Use JUMPTARGET to call pow. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S (_ZGVdN4vv_pow_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S (_ZGVdN4vv_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S (_ZGVbN2v_sin_sse4): Use JUMPTARGET to call sin. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S (_ZGVdN4v_sin_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S (_ZGVdN4v_sin): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S (_ZGVbN2vvv_sincos_sse4): Use JUMPTARGET to call sin and cos. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S (_ZGVdN4vvv_sincos_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S (_ZGVdN4vvv_sincos): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S (_ZGVdN8v_cosf): Use JUMPTARGET to call cosf. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S (_ZGVbN4v_cosf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S (_ZGVdN8v_cosf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S (_ZGVdN8v_expf): Use JUMPTARGET to call expf. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S (_ZGVbN4v_expf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S (_ZGVdN8v_expf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S (_ZGVdN8v_logf): Use JUMPTARGET to call logf. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S (_ZGVbN4v_logf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S (_ZGVdN8v_logf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S (_ZGVdN8vv_powf): Use JUMPTARGET to call powf. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S (_ZGVbN4vv_powf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S (_ZGVdN8vv_powf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S (_ZGVdN8vv_powf): Use JUMPTARGET to call sinf and cosf. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S (_ZGVbN4vvv_sincosf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S (_ZGVdN8vvv_sincosf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S (_ZGVdN8v_sinf): Use JUMPTARGET to call sinf. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S (_ZGVbN4v_sinf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S (_ZGVdN8v_sinf_avx2): Likewise. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h (WRAPPER_IMPL_SSE2): Use JUMPTARGET to call callee. (WRAPPER_IMPL_SSE2_ff): Likewise. (WRAPPER_IMPL_SSE2_fFF): Likewise. (WRAPPER_IMPL_AVX): Likewise. (WRAPPER_IMPL_AVX_ff): Likewise. (WRAPPER_IMPL_AVX_fFF): Likewise. (WRAPPER_IMPL_AVX512): Likewise. (WRAPPER_IMPL_AVX512_ff): Likewise. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h (WRAPPER_IMPL_SSE2): Likewise. (WRAPPER_IMPL_SSE2_ff): Likewise. (WRAPPER_IMPL_SSE2_fFF): Likewise. (WRAPPER_IMPL_AVX): Likewise. (WRAPPER_IMPL_AVX_ff): Likewise. (WRAPPER_IMPL_AVX_fFF): Likewise. (WRAPPER_IMPL_AVX512): Likewise. (WRAPPER_IMPL_AVX512_ff): Likewise. (WRAPPER_IMPL_AVX512_fFF): Likewise.	2016-03-16 14:24:19 -07:00
Andrew Senkevich	a5df3210a6	Use PIC relocation in ALIAS_IMPL Since libmvec_nonshared.a may be linked into shared objects, ALIAS_IMPL should use PIC relocation. [BZ #19590] * sysdeps/x86_64/fpu/svml_finite_alias.S (ALIAS_IMPL): Use PIC relocation.	2016-02-17 14:23:32 -08:00
Joseph Myers	f7a9f785e5	Update copyright dates with scripts/update-copyrights.	2016-01-04 16:05:18 +00:00
Andrew Senkevich	977a30801f	Better workaround for aliases of _finite symbols in vector math library. Old workaround based on assembly aliases can lead to link fail (bug 19058). This patch makes workaround in another way to avoid it. [BZ #19058] math/Makefile ($(inst_libdir)/libm.so): Added libmvec_nonshared.a to AS_NEEDED. * sysdeps/x86/fpu/bits/math-vector.h: Removed code with old workaround. * sysdeps/x86_64/fpu/Makefile (libmvec-support, libmvec-static-only-routines): Added new file. * sysdeps/x86_64/fpu/svml_finite_alias.S: New file.	2015-11-27 16:22:26 +03:00
Joseph Myers	01189b083b	Fix i386/x86_64 log* (1) zero sign for -ffinite-math-only (bug 19213). For the -ffinite-math-only versions of various x86_64 and x86 log* functions, a zero result from log* (1) is returned with incorrect sign in round-downward mode. This patch fixes this in a similar way to the previous fixes for the non-_finite versions of the functions. Tested for x86_64 and x86 (including an i586 build), together with a patch that will be applied separately to enable the main libm-test.inc tests for the finite-math-only functions. [BZ #19213] sysdeps/i386/fpu/e_log.S (__log_finite): Ensure +0 is always returned for argument 1. * sysdeps/i386/fpu/e_logf.S (__logf_finite): Likewise. * sysdeps/i386/fpu/e_logl.S (__logl_finite): Likewise. * sysdeps/i386/i686/fpu/e_logl.S (__logl_finite): Likewise. * sysdeps/x86_64/fpu/e_log10l.S (__log10l_finite): Likewise. * sysdeps/x86_64/fpu/e_log2l.S (__log2l_finite): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__logl_finite): Likewise.	2015-11-05 21:56:31 +00:00
Joseph Myers	2145f97cee	Handle more state in i386/x86_64 fesetenv (bug 16068). fenv_t should include architecture-specific floating-point modes and status flags. i386 and x86_64 fesetenv limit which bits they use from the x87 status and control words, when using saved state, and limit which parts of the state they set to fixed values, when using FE_DFL_ENV / FE_NOMASK_ENV. The following should be included but are excluded in at least some cases: status and masking for the "denormal operand" exception (which isn't part of FE_ALL_EXCEPT); precision control (explicitly mentioned in Annex F as something that counts as part of the floating-point environment); MXCSR FZ and DAZ bits (for FE_DFL_ENV and FE_NOMASK_ENV). This patch arranges for this extra state to be handled by fesetenv (and thereby by feupdateenv, which calls fesetenv). (Note that glibc functions using floating point are not generally expected to work correctly with non-default values of this state, especially precision control, but it is still logically part of the floating-point environment and should be handled as such by fesetenv. Changes to the state relating to subnormals ought generally to work with libm functions when the arguments aren't subnormal and neither are the expected results; that's a consequence of functions avoiding spurious internal underflows.) A question arising from this is whether FE_NOMASK_ENV should or should not mask the "denormal operand" exception. I decided it should mask that exception. This is the status quo - previously that exception could only be unmasked by direct manipulation of control registers (possibly via <fpu_control.h>). In addition, it means that use of FE_NOMASK_ENV leaves a floating-point environment the same as could be obtained by fesetenv (FE_DFL_ENV); feenableexcept (FE_ALL_EXCEPT);, rather than an environment in which an exception is unmasked that could only be masked again by using fesetenv with FE_DFL_ENV (or a previously saved environment) - this exception not being usable with other <fenv.h> functions because it's outside FE_ALL_EXCEPT. Tested for x86_64 and x86. [BZ #16068] * sysdeps/i386/fpu/fesetenv.c: Include <fpu_control.h>. (FE_ALL_EXCEPT_X86): New macro. (__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of FE_ALL_EXCEPT. Ensure precision control is included in floating-point state. Ensure that FE_DFL_ENV and FE_NOMASK_ENV handle "denormal operand exception" and clear FZ and DAZ bits. * sysdeps/x86_64/fpu/fesetenv.c: Include <fpu_control.h>. (FE_ALL_EXCEPT_X86): New macro. (__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of FE_ALL_EXCEPT. Ensure precision control is included in floating-point state. Ensure that FE_DFL_ENV and FE_NOMASK_ENV handle "denormal operand exception" and clear FZ and DAZ bits. * sysdeps/x86/fpu/test-fenv-sse-2.c: New file. * sysdeps/x86/fpu/test-fenv-x87.c: Likewise. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add test-fenv-x87 and test-fenv-sse-2. [$(subdir) = math] (CFLAGS-test-fenv-sse-2.c): New variable.	2015-10-28 22:58:29 +00:00
Joseph Myers	0b9af583a5	Fix i386/x86_64 fesetenv SSE exception clearing (bug 19181). The i386 and x86_64 versions of fesetenv, when called with FE_DFL_ENV or FE_NOMASK_ENV as argument, do not clear SSE exceptions raised in MXCSR. These arguments should, like other fenv_t values, represent the whole of the floating-point state, so such exceptions should be cleared; this patch adds the required clearing. (Discovered while working on bug 16068.) Tested for x86_64 and x86. [BZ #19181] * sysdeps/i386/fpu/fesetenv.c (__fesetenv): Clear already-raised SSE exceptions when argument is FE_DFL_ENV or FE_NOMASK_ENV. * sysdeps/x86_64/fpu/fesetenv.c (__fesetenv): Likewise. * math/test-fenv-clear-main.c: New file. * math/test-fenv-clear.c: Likewise. * math/Makefile (tests): Add test-fenv-clear. * sysdeps/x86/fpu/test-fenv-clear-sse.c: New file. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add test-fenv-clear-sse. [$(subdir) = math] (CFLAGS-test-fenv-clear-sse.c): New variable.	2015-10-28 18:50:20 +00:00
Florian Weimer	e39edb0552	x86_64: Regenerate ulps [BZ #19168 ] This comes from running “make regen-ulps” on AMD Opteron 6272 CPUs.	2015-10-26 13:15:48 +01:00
Joseph Myers	9d1687b2df	Add more libm tests (ilogb, is, j0, j1, jn, lgamma, log). This patch improves the libm test coverage for a few more functions. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of log, log10, log1p and log2. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (MAX_EXP): New macro. (ilogb_test_data): Add more tests. (isfinite_test_data): Likewise. (isgreater_test_data): Likewise. (isgreaterequal_test_data): Likewise. (isinf_test_data): Likewise. (isless_test_data): Likewise. (islessequal_test_data): Likewise. (islessgreater_test_data): Likewise. (isnan_test_data): Likewise. (isnormal_test_data): Likewise. (issignaling_test_data): Likewise. (isunordered_test_data): Likewise. (j0_test_data): Likewise. (j1_test_data): Likewise. (jn_test_data): Likewise. (lgamma_test_data): Likewise. (log_test_data): Likewise. (log10_test_data): Likewise. (log1p_test_data): Likewise. (log2_test_data): Likewise. (logb_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-10-23 22:46:05 +00:00
Joseph Myers	846d9a4a3a	Fix i386 / x86_64 nearbyint exception clearing (bug 15491). The implementations of nearbyint functions using x87 floating point (i386 all versions, x86_64 long double only) use the fclex instruction, which clears any exceptions that were raised before the function was called. These functions must not clear exceptions that were raised before they were called. This patch fixes these functions to save and restore the whole floating-point environment (fnstenv / fldenv) as the way of avoiding raising "inexact" (recall that there isn't an x87 instruction for loading just the status word, so the whole environment has to be saved and loaded instead - the code already saved and loaded the control word, which is now obtained from the saved environment after this patch, to disable traps on "inexact"). In the case of the long double functions, any "invalid" exception from frndint (applied to a signaling NaN) needs merging into the saved state; this issue doesn't apply to the float and double functions because that exception would have been raised when the argument is loaded, before the environment is saved. [BZ #15491] * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Save and restore floating-point environment instead of clearing all exceptions. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise, merging in "invalid" exceptions from frndint. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * math/test-nearbyint-except.c: New file. * math/Makefile (tests): Add test-nearbyint-except.	2015-10-22 23:14:55 +00:00
H.J. Lu	4b71ce6c1a	Update lrint/lrintf/lrintl for x32 The x86_64 versions of lrint/lrintf/ lrintl are aliases for the long long versions which isn't correct for x32, where exceptions must respect overflow for 32-bit long. Separate versions of the long functions for x32 that convert to 32-bit long and raise the right exceptions for that conversion, while keeping the aliases in the non-x32 case. Tested on x86_64 and x32. There are no code changes in libm.so on x86_64. * sysdeps/x86_64/fpu/s_llrint.S (__lrint): Add alias only if __ILP32__ isn't defined. (lrint): Likewise. * sysdeps/x86_64/fpu/s_llrintf.S (__lrintf): Likewise. (lrintf): Likewise. * sysdeps/x86_64/fpu/s_llrintl.S (__lrintl): Likewise. (lrintl): Likewise. * sysdeps/x86_64/x32/fpu/s_lrint.S: New file. * sysdeps/x86_64/x32/fpu/s_lrintf.S: Likewise. * sysdeps/x86_64/x32/fpu/s_lrintl.S: Likewise.	2015-10-09 11:42:10 -07:00
Joseph Myers	b7848899a5	Remove configure tests for FMA4 support. GCC added support for -mfma4 in version 4.5. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_fma4): Remove configure test. * sysdeps/i386/configure: Regenerated. * sysdeps/x86_64/configure.ac (libc_cv_cc_fma4): Remove configure test. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile [$(have-mfma4) = yes]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT]: Likewise. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT]: Make code unconditional. [!HAVE_FMA4_SUPPORT]: Remove conditional code. * config.h.in (HAVE_FMA4_SUPPORT): Remove #undef.	2015-10-09 16:02:54 +00:00
Joseph Myers	1b12cd7f4d	Remove configure tests for AVX support. GCC added support for -mavx and -msse2avx in version 4.4. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_avx): Remove configure test. (libc_cv_cc_sse2avx): Likewise. * sysdeps/i386/configure: Regenerated. * sysdeps/i386/i686/multiarch/Makefile [$(subdir)$(config-cflags-avx) = mathyes]: Change conditional to [$(subdir) = math]. * sysdeps/i386/i686/multiarch/s_fma-fma.c [HAVE_AVX_SUPPORT]: Make code unconditional. * sysdeps/i386/i686/multiarch/s_fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/i386/i686/multiarch/s_fmaf-fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/i386/i686/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/configure.ac (libc_cv_cc_avx): Remove configure test. (libc_cv_cc_sse2avx): Likewise. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/Makefile [$(config-cflags-avx) = yes]: Make code unconditional. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_profile) [HAVE_AVX_SUPPORT \|\| HAVE_AVX512_ASM_SUPPORT]: Make code unconditional. (_dl_runtime_profile) [!(HAVE_AVX_SUPPORT \|\| HAVE_AVX512_ASM_SUPPORT)]: Remove conditional code. * sysdeps/x86_64/fpu/multiarch/Makefile [$(config-cflags-sse2avx) = yes]: Make code unconditional. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise. * sysdeps/x86_64/multiarch/strcmp.S [HAVE_AVX_SUPPORT]: Likewise. * config.h.in (HAVE_AVX_SUPPORT): Remove #undef. (HAVE_SSE2AVX_SUPPORT): Likewise.	2015-10-08 15:59:32 +00:00
Paul Pluzhnikov	57352c2201	sysdeps/x86_64/fpu/libm-test-ulps: Regenerated on Haswell.	2015-10-03 11:31:21 -07:00
Joseph Myers	93e448cbed	Improve test coverage of real libm functions [a-e]. This patch improves test coverage of the real libm functions [a-e], ensuring that special cases and ranges of input values of potential significance (such as close to overflow and underflow thresholds) are more systematically covered. This is a followup to <https://sourceware.org/ml/libc-alpha/2013-12/msg00757.html> which covered [a-c]* (however, I found more weaknesses in the coverage of those functions when preparing this patch, hence the additional tests being added for them here). Addition of a test for acosh (-qNaN) is temporarily deferred, to be included as part of a fix for bug 19032 which was discovered in the course of adding these tests (and which illustrates the use of testing -qNaN as well as +qNaN as input even to functions for which the sign of a NaN isn't meant to be significant). Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acos, acosh, asin, atan, atan2, atanh, cbrt, cos, cosh, erf, erfc, exp, exp10, exp2 and expm1. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (acos_test_data): Add more tests. (asin_test_data): Likewise. (asinh_test_data): Likewise. (atan_test_data): Likewise. (atanh_test_data): Likewise. (atan2_test_data): Likewise. (cbrt_test_data): Likewise. (ceil_test_data): Likewise. (copysign_test_data): Likewise. (cos_test_data): Likewise. (cosh_test_data): Likewise. (erf_test_data): Likewise. (erfc_test_data): Likewise. (exp_test_data): Likewise. (exp10_test_data): Likewise. (exp2_test_data): Likewise. (expm1_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-09-30 18:06:02 +00:00
Joseph Myers	a5721ebc68	Fix clog, clog10 inaccuracy (bug 19016). For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids cancellation error and then using log1p. However, the thresholds for using that approach still result in log being used on argument as large as sqrt(13/16) > 0.9, leading to significant errors, in some cases above the 9ulp maximum allowed in glibc libm. This patch arranges for the approach using log1p to be used in any cases where \|X\|, \|Y\| < 1 and X^2 + Y^2 >= 0.5 (with the existing allowance for cases where one of X and Y is very small), adjusting the __x2y2m1 functions to work with the wider range of inputs. This way, log only gets used on arguments below sqrt(1/2) (or substantially above 1), where the error involved is much less. Tested for x86_64, x86, mips64 and powerpc. For the ulps regeneration I removed the existing clog and clog10 ulps before regenerating to allow any reduced ulps to appear. Tests added include those found by random test generation to produce large ulps either before or after the patch, and some found by trying inputs close to the (0.75, 0.5) threshold where the potential errors from using log are largest. [BZ #19016] * sysdeps/generic/math_private.h (__x2y2m1f): Update comment to allow more cases with X^2 + Y^2 >= 0.5. * sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment. * sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0] (__x2y2m1): Update comment. * sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * math/s_clog.c (__clog): Handle more cases using log1p without hypot. * math/s_clog10.c (__clog10): Likewise. * math/s_clog10f.c (__clog10f): Likewise. * math/s_clog10l.c (__clog10l): Likewise. * math/s_clogf.c (__clogf): Likewise. * math/s_clogl.c (__clogl): Likewise. * math/auto-libm-test-in: Add more tests of clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-28 22:11:22 +00:00
Joseph Myers	fa752c6981	Fix powf inaccuracy (bug 18956). The flt-32 version of powf can be inaccurate because of bugs in the extra-precision calculation of (x-1)/(x+1) or (x-1.5)/(x+1.5) as part of calculating log(x) with extra precision: a constant used (as part of adding 1 or 1.5 through integer arithmetic) is incorrect, and then the code fails to mask a computed high part before using it in arithmetic that relies on s_ht_h being exactly representable. This patch fixes these bugs. Tested for x86_64 and x86. x86_64 ulps for powf removed and regenerated to reflect reduced ulps from the increased accuracy for existing tests. [BZ #18956] sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Add 0x00400000 not 0x0040000 for high bit of mantissa. Mask with 0xfffff000 when extracting high part. * math/auto-libm-test-in: Add another test of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-09-26 00:27:06 +00:00
Joseph Myers	6ace393821	Fix pow missing underflows (bug 18825). Similar to various other bugs in this area, pow functions can fail to raise the underflow exception when the result is tiny and inexact but one or more low bits of the intermediate result that is scaled down (or, in the i386 case, converted from a wider evaluation format) are zero. This patch forces the exception in a similar way to previous fixes, thereby concluding the fixes for known bugs with missing underflow exceptions currently filed in Bugzilla. Tested for x86_64, x86, mips64 and powerpc. [BZ #18825] * sysdeps/i386/fpu/i386-math-asm.h (FLT_NARROW_EVAL_UFLOW_NONNAN): New macro. (DBL_NARROW_EVAL_UFLOW_NONNAN): Likewise. (LDBL_CHECK_FORCE_UFLOW_NONNAN): Likewise. * sysdeps/i386/fpu/e_pow.S: Use DEFINE_DBL_MIN. (__ieee754_pow): Use DBL_NARROW_EVAL_UFLOW_NONNAN instead of DBL_NARROW_EVAL, reloading the PIC register as needed. * sysdeps/i386/fpu/e_powf.S: Use DEFINE_FLT_MIN. (__ieee754_powf): Use FLT_NARROW_EVAL_UFLOW_NONNAN instead of FLT_NARROW_EVAL. Use separate return path for case when first argument is NaN. * sysdeps/i386/fpu/e_powl.S: Include <i386-math-asm.h>. Use DEFINE_LDBL_MIN. (__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN, reloading the PIC register. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Force underflow for subnormal result. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Use math_check_force_underflow_nonneg. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Use math_check_force_underflow. * sysdeps/x86_64/fpu/x86_64-math-asm.h (LDBL_CHECK_FORCE_UFLOW_NONNAN): New macro. * sysdeps/x86_64/fpu/e_powl.S: Include <x86_64-math-asm.h>. Use DEFINE_LDBL_MIN. (__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN. * math/auto-libm-test-in: Add more tests of pow. * math/auto-libm-test-out: Regenerated.	2015-09-25 22:29:10 +00:00
Joseph Myers	b2a64460ba	Refactor x86_64 libm code forcing underflow exceptions. This patch refactors code in sysdeps/x86_64/fpu that forces underflow exceptions and closely follows corresponding i386 code to use common macros in x86_64-math-asm.h for that purpose. This is mainly about keeping the code similar to the i386 code as far as possible, since each macro apart from DEFINE_LDBL_MIN ends up used only once. It would be possible to do a further refactoring to share these macros between i386 and x86_64 (with i386 using the fcomip / fucomip versions when building for i686 and above), but I have no immediate plans to do so. Tested for x86_64. * sysdeps/x86_64/fpu/x86_64-math-asm.h: New file. * sysdeps/x86_64/fpu/e_exp2l.S: Include <x86_64-math-asm.h>. (ldbl_min): Replace with use of DEFINE_LDBL_MIN. (__ieee754_exp2l): Use LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN. * sysdeps/x86_64/fpu/e_expl.S: Include <x86_64-math-asm.h>. [!USE_AS_EXPM1L] (cmin): Replace with use of DEFINE_LDBL_MIN. (IEEE754_EXPL): Use LDBL_CHECK_FORCE_UFLOW_NONNEG.	2015-09-24 22:25:30 +00:00
Joseph Myers	51df260506	Fix x86_64 fma4 pow inappropriate contraction (bug 19003). The x86_64 fma4 version of pow fails to disable contraction of operations other than those explicitly intended to use fma instructions, so resulting in large ulps errors on processors with fma4 instructions, as in bug 18104 (165ulp for the test added for that bug; error originally reported by "blaaa" on #glibc). This patch adds $(config-cflags-nofma) for e_pow-fma4.c, corresponding to the use for e_pow.c in sysdeps/ieee754/dbl-64/Makefile. Tested for x86_64 on a processor with fma4. [BZ #19003] * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma4.c): Add $(config-cflags-nofma).	2015-09-24 16:48:32 +00:00
Joseph Myers	da2f4f2dd5	Make scalbn set errno (bug 6803). As noted in bug 6803, scalbn fails to set errno on overflow and underflow. This patch fixes this by making scalbn an alias of ldexp, which has exactly the same semantics (for floating-point types with radix 2) and already has wrappers that deal with setting errno, instead of an alias of the internal __scalbn (which ldexp calls). Notes: * Where compat symbols were defined for scalbn functions, I didn't change what they point to (to keep the patch minimal), so such compat symbols continue to go directly to the non-errno-setting functions. * Mike, I didn't do anything with the IA64 versions of these functions, where I think both the ldexp and scalbn functions already deal with setting errno. As a cleanup (not needed to fix this bug) however you might want to make those functions into aliases for IA64; there is no need for them to be separate function implementations at all. * This concludes the fix for bug 6803 since the scalb and scalbln cases of that bug were fixed some time ago. Tested for x86_64, x86, mips64 and powerpc. [BZ #6803] * math/s_ldexp.c (scalbn): Define as weak alias of __ldexp. [NO_LONG_DOUBLE] (scalbnl): Define as weak alias of __ldexp. * math/s_ldexpf.c (scalbnf): Define as weak alias of __ldexpf. * math/s_ldexpl.c (scalbnl): Define as weak alias of __ldexpl. * sysdeps/i386/fpu/s_scalbn.S (scalbn): Remove alias. * sysdeps/i386/fpu/s_scalbnf.S (scalbnf): Likewise. * sysdeps/i386/fpu/s_scalbnl.S (scalbnl): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (scalbn): Likewise. [NO_LONG_DOUBLE] (scalbnl): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (scalbn): Likewise. [NO_LONG_DOUBLE] (scalbnl): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (scalbnf): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (scalbnl): Remove long_double_symbol calls. * sysdeps/ieee754/ldbl-64-128/s_scalbnl.c (scalbnl): Likewise. * sysdeps/ieee754/ldbl-opt/s_ldexpl.c (__ldexpl_2): Define as strong alias of __ldexpl. (scalbnl): Define using long_double_symbol. * sysdeps/m68k/m680x0/fpu/s_scalbn.c (__CONCATX(scalbn,suffix)): Remove alias. * sysdeps/sparc/sparc64/soft-fp/s_scalbnl.c (scalbnl): Likewise. * sysdeps/x86_64/fpu/s_scalbnl.S (scalbnl): Likewise. * math/libm-test.inc (scalbn_test_data): Add errno expectations. (scalbln_test_data): Add more errno expectations.	2015-09-16 21:11:00 +00:00
Joseph Myers	903af5af9a	Fix exp2 missing underflows (bug 16521). Various exp2 implementations in glibc can miss underflow exceptions when the scaling down part of the calculation is exact (or, in the x86 case, when the conversion from extended precision to the target precision is exact). This patch forces the exception in a similar way to previous fixes. The x86 exp2f changes may in fact not be needed for this purpose - it's likely to be the case that no argument of type float has an exp2 result so close to an exact subnormal float value that it equals that value when rounded to 64 bits (even taking account of variation between different x86 implementations). However, they are included for consistency with the changes to exp2 and so as to fix the exp2f part of bug 18875 by ensuring that excess range and precision is removed from underflowing return values. Tested for x86_64, x86 and mips64. [BZ #16521] [BZ #18875] * math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for small results. * sysdeps/i386/fpu/e_exp2.S (dbl_min): New object. (MO): New macro. (__ieee754_exp2): For small results, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/e_exp2f.S (flt_min): New object. (MO): New macro. (__ieee754_exp2f): For small results, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object. (MO): New macro. (__ieee754_exp2l): Force underflow exception for small results. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise. * sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object. (MO): New macro. (__ieee754_exp2l): Force underflow exception for small results. * math/auto-libm-test-in: Add more tests or exp2. * math/auto-libm-test-out: Regenerated.	2015-09-14 22:00:12 +00:00
Joseph Myers	a1f99ba28b	Add more random libm test inputs (mainly for ldbl-128). This patch adds more libm test inputs found through random test generation to increase previously known ulps. This particular test generation was run for mips64, so most of the increased ulps are for ldbl-128 (float and double having been fairly well covered by such testing for x86_64), but there's the odd ulps increase for other formats. Tested for x86_64, x86 and mips64. * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cabs, carg, cos, csqrt, erfc, exp, exp10, exp2, log, log1p, log2, pow, sin, sincos, sinh, tan and tanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-12 00:01:38 +00:00
Joseph Myers	00a7073c38	Add more randomly-generated libm tests. This patch adds more libm test inputs found through random test generation to increase observed ulps on x86_64. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acosh, atanh, cbrt, cosh, csqrt, erfc, expm1 and lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-11 15:03:10 +00:00
Joseph Myers	050f29c188	Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558). The existing implementations of lgamma functions (except for the ia64 versions) use the reflection formula for negative arguments. This suffers large inaccuracy from cancellation near zeros of lgamma (near where the gamma function is +/- 1). This patch fixes this inaccuracy. For arguments above -2, there are no zeros and no large cancellation, while for sufficiently large negative arguments the zeros are so close to integers that even for integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation is not significant. Thus, it is only necessary to take special care about cancellation for arguments around a limited number of zeros. Accordingly, this patch uses precomputed tables of relevant zeros, expressed as the sum of two floating-point values. The log of the ratio of two sines can be computed accurately using log1p in cases where log would lose accuracy. The log of the ratio of two gamma(1-x) values can be computed using Stirling's approximation (the difference between two values of that approximation to lgamma being computable without computing the two values and then subtracting), with appropriate adjustments (which don't reduce accuracy too much) in cases where 1-x is too small to use Stirling's approximation directly. In the interval from -3 to -2, using the ratios of sines and of gamma(1-x) can still produce too much cancellation between those two parts of the computation (and that interval is also the worst interval for computing the ratio between gamma(1-x) values, which computation becomes more accurate, while being less critical for the final result, for larger 1-x). Because this can result in errors slightly above those accepted in glibc, this interval is instead dealt with by polynomial approximations. Separate polynomial approximations to (\|gamma(x)\|-1)(x-n)/(x-x0) are used for each interval of length 1/8 from -3 to -2, where n (-3 or -2) is the nearest integer to the 1/8-interval and x0 is the zero of lgamma in the relevant half-integer interval (-3 to -2.5 or -2.5 to -2). Together, the two approaches are intended to give sufficient accuracy for all negative arguments in the problem range. Outside that range, the previous implementation continues to be used. Tested for x86_64, x86, mips64 and powerpc. The mips64 and powerpc testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm with large negative arguments giving spurious "invalid" exceptions (exposed by newly added tests for cases this patch doesn't affect the logic for); I'll address those problems separately. [BZ #2542] [BZ #2543] [BZ #2558] * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call __lgamma_neg for arguments from -28.0 to -2.0. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call __lgamma_negf for arguments from -15.0 to -2.0. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -33.0 to -2.0. * sysdeps/ieee754/dbl-64/lgamma_neg.c: New file. * sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise. * sysdeps/generic/math_private.h (__lgamma_negf): New prototype. (__lgamma_neg): Likewise. (__lgamma_negl): Likewise. (__lgamma_product): Likewise. (__lgamma_productl): Likewise. * math/Makefile (libm-calls): Add lgamma_neg and lgamma_product. * math/auto-libm-test-in: Add more tests of lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-09-10 22:27:58 +00:00
Paul Pluzhnikov	db7f8c8fe0	Regenerated sysdeps/x86_64/fpu/libm-test-ulps with AVX2.	2015-08-14 09:59:04 -07:00
Siddhesh Poyarekar	37dd6a19ca	Remove incorrect register mov in floorf/nearbyint on x86_64 The change in `0b5395f052` replaced calls to __get_cpu_features@plt followed by a mov from rax to rdx, with a single macro LOAD_RTLD_GLOBAL_RO_RDX. It is pretty clear that there was a typo in s_floorf and __nearbyint due to which the (now incorrect) mov was not removed. This patch removes that mov. * sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf): Remove unnecessary movq. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S (__nearbyint): Likewise.	2015-08-14 05:30:17 -07:00
Joseph Myers	3ba0ac10fa	Add more random libm-test inputs. This patch adds more test inputs to various libm functions found through random generation to have larger ulps errors than previously listed in libm-test-ulp, on at least one of x86_64 and x86. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc, exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-08-13 23:23:23 +00:00
H.J. Lu	1dfa4a94ae	Update libmvec multiarch functions for <cpu-features.h> This patch updates libmvec multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>. * math/Makefile ($(addprefix $(objpfx), $(libm-vec-tests))): Remove $(objpfx)init-arch.o. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Remove init-arch. * sysdeps/x86_64/fpu/math-tests-arch.h (avx_usable): Removed. (INIT_ARCH_EXT): Defined as empty. (CHECK_ARCH_EXT): Replace HAS_XXX with HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Remove __init_cpu_features call. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: Likewise.	2015-08-13 03:41:47 -07:00
H.J. Lu	0b5395f052	Update x86_64 multiarch functions for <cpu-features.h> This patch updates x86_64 multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>. * sysdeps/x86_64/fpu/multiarch/e_asin.c: Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.S: Use LOAD_RTLD_GLOBAL_RO_RDX and HAS_CPU_FEATURE (SSE4_1). * sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S : Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.S : Likewise. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/x86_64/multiarch/sched_cpucount.c: Likewise. * sysdeps/x86_64/multiarch/strstr.c: Likewise. * sysdeps/x86_64/multiarch/memmove.c: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise. * sysdeps/x86_64/multiarch/test-multiarch.c: Likewise. * sysdeps/x86_64/multiarch/memcmp.S: Remove __init_cpu_features call. Add LOAD_RTLD_GLOBAL_RO_RDX. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/multiarch/memcpy.S: Likewise. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/multiarch/strcat.S: Likewise. * sysdeps/x86_64/multiarch/strchr.S: Likewise. * sysdeps/x86_64/multiarch/strcmp.S: Likewise. * sysdeps/x86_64/multiarch/strcpy.S: Likewise. * sysdeps/x86_64/multiarch/strcspn.S: Likewise. * sysdeps/x86_64/multiarch/strspn.S: Likewise. * sysdeps/x86_64/multiarch/wcscpy.S: Likewise. * sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.	2015-08-13 03:41:30 -07:00
Joseph Myers	4afe4b20ce	Add more tests of various libm functions. This patch adds more tests of various libm functions found through random test generation to give increased ulps on 32-bit x86. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acosh, asin, asinh, atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10, expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-08-11 00:58:28 +00:00
H.J. Lu	72354ab5e1	Align stack to 16 bytes when calling __errno_location We should align stack to 16 bytes when calling __errno_location. [BZ #18661] * sysdeps/x86_64/fpu/s_cosf.S (__cosf): Align stack to 16 bytes when calling __errno_location. * sysdeps/x86_64/fpu/s_sincosf.S (__sincosf): Likewise. * sysdeps/x86_64/fpu/s_sinf.S (__sinf): Likewise.	2015-08-05 08:36:27 -07:00
Andrew Senkevich	a9e8ea51cc	Prevent runtime fail of SSE vector math tests on non SSE4.1 machine. [BZ #18740] * sysdeps/x86_64/fpu/Makefile (double-vlen2-arch-ext-cflags, float-vlen4-arch-ext-cflags): Removed. * math/Makefile (CFLAGS-test-double-vlen2-wrappers.c, CFLAGS-test-float-vlen4-wrappers.c): Likewise.	2015-07-30 18:00:24 +03:00
Andrew Senkevich	febce2ac5f	Added runtime check for AVX vector math tests. [BZ #18731] * sysdeps/x86_64/fpu/math-tests-arch.h: Added AVX runtime check. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.	2015-07-29 19:47:29 +03:00
Andrew Senkevich	9901716135	Fixed several libmvec bugs found during testing on KNL hardware. AVX512 IFUNC implementations, implementations of wrappers to AVX2 versions and KNL expf implementation fixed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Fixed AVX512 IFUNC. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Fixed wrappers to AVX2. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Fixed KNL implementation.	2015-07-24 14:47:23 +03:00
Joseph Myers	e02920bc02	Improve tgamma accuracy (bug 18613). In non-default rounding modes, tgamma can be slightly less accurate than permitted by glibc's accuracy goals. Part of the problem is error accumulation, addressed in this patch by setting round-to-nearest for internal computations. However, there was also a bug in the code dealing with computing pow (x + n, x + n) where x + n is not exactly representable, providing another source of error even in round-to-nearest mode; it was necessary to address both bugs to get errors for all testcases within glibc's accuracy goals. Given this second fix, accuracy in round-to-nearest mode is also improved (hence regeneration of ulps for tgamma should be from scratch - truncate libm-test-ulps or at least remove existing tgamma entries - so that the expected ulps can be reduced). Some additional complications also arose. Certain tgamma tests should strictly, according to IEEE semantics, overflow or not depending on the rounding mode; this is beyond the scope of glibc's accuracy goals for any function without exactly-determined results, but gen-auto-libm-tests doesn't handle being lax there as it does for underflow. (libm-test.inc also doesn't handle being lax about whether the result in cases very close to the overflow threshold is infinity or a finite value close to overflow, but that doesn't cause problems in this case though I've seen it cause problems with random test generation for some functions.) Thus, spurious-overflow markings, with a comment, are added to auto-libm-test-in (no bug in Bugzilla because the issue is with the testsuite, not a user-visible bug in glibc). And on x86, after the patch I saw ERANGE issues as previously reported by Carlos (see my commentary in <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which needed addressing by ensuring excess range and precision were eliminated at various points if FLT_EVAL_METHOD != 0. I also noticed and fixed a cosmetic issue where 1.0f was used in long double functions and should have been 1.0L. This completes the move of all functions to testing in all rounding modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to remove the workaround for some functions not using ALL_RM_TEST. Tested for x86_64, x86, mips64 and powerpc. [BZ #18613] * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gamma_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammaf_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * math/libm-test.inc (tgamma_test_data): Remove one test. Moved to auto-libm-test-in. (tgamma_test): Use ALL_RM_TEST. * math/auto-libm-test-in: Add one test of tgamma. Mark some other tests of tgamma with spurious-overflow. * math/auto-libm-test-out: Regenerated. * math/gen-libm-have-vector-test.sh: Do not check for START. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-06-29 23:29:35 +00:00
Joseph Myers	a8e2112ae3	Use round-to-nearest internally in jn, test with ALL_RM_TEST (bug 18602). Some existing jn tests, if run in non-default rounding modes, produce errors above those accepted in glibc, which causes problems for moving tests of jn to use ALL_RM_TEST. This patch makes jn set rounding to-nearest internally, as was done for yn some time ago, then computes the appropriate underflowing value for results that underflowed to zero in to-nearest, and moves the tests to ALL_RM_TEST. It does nothing about the general inaccuracy of Bessel function implementations in glibc, though it should make jn more accurate on average in non-default rounding modes through reduced error accumulation. The recomputation of results that underflowed to zero should as a side-effect fix some cases of bug 16559, where jn just used an exact zero, but that is not the goal of this patch and other cases of that bug remain unfixed. (Most of the changes in the patch are reindentation to add new scopes for SET_RESTORE_ROUND.) Tested for x86_64, x86, powerpc and mips64. [BZ #16559] [BZ #18602] sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set round-to-nearest internally then recompute results that underflowed to zero in the original rounding mode. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise * math/libm-test.inc (jn_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-06-25 21:46:02 +00:00
Joseph Myers	f9536db790	Refactor libm tests. This patch refactors the libm tests using libm-test.inc to reduce the level of duplicate definitions. New headers are created for the definitions shared by tests for a particular type; by tests of inline functions; by tests of non-inline functions; by scalar tests; and by vector tests. The unused MATHCONST macro is removed. A new macro VEC_LEN is added to the vector headers to allow the macros defining wrappers for vector functions to be defined once, instead of six times each (differing only in vector length) as before. There is still scope for further refactoring, but this seems a useful start. Tested for x86_64. * math/test-double.h: New file. * math/test-float.h: Likewise. * math/test-ldouble.h: Likewise. * math/test-math-inline.h: Likewise. * math/test-math-no-inline.h: Likewise. * math/test-math-scalar.h: Likewise. * math/test-math-vector.h: Likewise. * math/test-vec-loop.h: Remove file. Contents moved into test-math-vector.h. * math/libm-test.inc (MATHCONST): Do not document macro. * math/test-double.c: Include test-double.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-float.c: Include test-float.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-idouble.c: Include test-double.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ifloat.c: Include test-float.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ildoubl.c: Include test-ldouble.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ldouble.c: Include test-ldouble.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-double-vlen2.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen4.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen8.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen4.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen8.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen16.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Do not include test-vec-loop.h. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.	2015-06-24 23:27:18 +00:00
Joseph Myers	a67894c505	Fix cexp, ccos, ccosh, csin, csinh spurious underflows (bug 18594). cexp, ccos, ccosh, csin and csinh have spurious underflows in cases where they compute sin of the smallest normal, that produces an underflow exception (depending on which sin implementation is in use) but the final result does not underflow. ctan and ctanh may also have such underflows, or they may be latent (the issue there is that e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value above DBL_MIN, which under glibc's accuracy goals may not have an underflow exception, but the intermediate computation of sin (DBL_MIN) would legitimately underflow on before-rounding architectures). This patch fixes all those functions so they use plain comparisons (> DBL_MIN etc.) instead of comparing the result of fpclassify with FP_SUBNORMAL (in all these cases, we already know the number being compared is finite). Note that in the case of csin / csinf / csinl, there is no need for fabs calls in the comparison because the real part has already been reduced to its absolute value. As the patch fixes the failures that previously obstructed moving tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST by the patch (two functions remain yet to be converted). Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18594] * math/s_ccosh.c (__ccosh): Compare with least normal value instead of comparing class with FP_SUBNORMAL. * math/s_ccoshf.c (__ccoshf): Likewise. * math/s_ccoshl.c (__ccoshl): Likewise. * math/s_cexp.c (__cexp): Likewise. * math/s_cexpf.c (__cexpf): Likewise. * math/s_cexpl.c (__cexpl): Likewise. * math/s_csin.c (__csin): Likewise. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/s_ctan.c (__ctan): Likewise. * math/s_ctanf.c (__ctanf): Likewise. * math/s_ctanh.c (__ctanh): Likewise. * math/s_ctanhf.c (__ctanhf): Likewise. * math/s_ctanhl.c (__ctanhl): Likewise. * math/s_ctanl.c (__ctanl): Likewise. * math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp, csin, csinh, ctan and ctanh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cexp_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-06-24 21:04:51 +00:00
Joseph Myers	ac831b362a	Fix csin, csinh overflow in directed rounding modes (bug 18593). csin and csinh can produce bad results when overflowing in directed rounding modes, because a multiplication that can overflow is followed by a possible negation. This patch fixes this by negating one of the arguments of the multiplication before the multiplication instead of negating the result. The new tests for this issue are added to auto-libm-test-in, starting use of that file for csin and csinh. The issue was found in the course of moving existing tests for csin and csinh (existing tests, by being enabled in more cases than previously, showed the issue for float and double but not for long double); that move will now be done separately. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18593] * math/s_csin.c (__csin): Negate before rather than after possibly overflowing multiplication. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/auto-libm-test-in: Add some tests of csin and csinh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c. (csinh_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-06-24 16:20:48 +00:00
Andrew Senkevich	36870482d2	Combination of data tables for x86_64 vector functions sinf, cosf and sincosf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_s_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_s_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: Removed file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: Likewise.	2015-06-24 17:44:35 +03:00
Andrew Senkevich	5872b8352a	Combination of data tables for x86_64 vector functions sin, cos and sincos. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_d_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_d_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: Removed unneeded include. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos_data.S: Removed file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: Likewise.	2015-06-23 19:21:50 +03:00
Joseph Myers	cb0937b299	Fix x86_64 / x86 expm1l (-min_subnorm) result sign (bug 18569). In the x86 / x86_64 implementations of expm1l, when expm1l's result should underflow to 0 (argument minus the least subnormal, in some rounding modes), it can be a zero of the wrong sign. This patch fixes this by returning the argument with underflow forced in that case (this is a 1ulp error relative to the correctly rounded result of -0, which is OK in terms of the documented accuracy goals, whereas a result with the wrong sign never is). Tested for x86_64 and x86. [BZ #18569] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force underflow and return argument in case of subnormal argument. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of expm1. * math/auto-libm-test-out: Regenerated.	2015-06-21 18:43:10 +00:00
Joseph Myers	7540cfc5a8	Fix x86 / x86_64 expl, exp10l missing underflows (bug 16361). Similar to various other bugs in this area, the x86 and x86_64 implementations of expl / exp10l can fail to produce underflow exceptions when the unscaled result has trailing 0 bits so the scaling down to subnormal precision is exact. This patch fixes this by forcing the exception in the case of tiny results. Tested for x86_64 and x86. [BZ #16361] * sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * math/auto-libm-test-in: Add more tests of exp and exp10. Do not mark underflow exceptions as possibly missing for bug 16361. * math/auto-libm-test-out: Regenerated.	2015-06-21 17:48:04 +00:00
Andrew Senkevich	a6336cc446	Vector sincosf for x86_64 and tests. Here is implementation of vectorized sincosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincosf. * math/test-float-vlen16.h: Added wrapper for sincosf tests. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers. * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.	2015-06-18 20:11:27 +03:00
Andrew Senkevich	c9a8c526ac	Vector sincos for x86_64 and tests. Here is implementation of vectorized sincos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincos. * bits/libm-simd-decl-stubs.h: Added stubs for sincos. * math/math.h (__MATHDECL_VEC): New macro. * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC. * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper declaration under condition. * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored. * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected TEST_VEC_LOOP change. * math/test-double-vlen4.h: Likewise. * math/test-double-vlen8.h: Likewise. * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.	2015-06-18 17:55:55 +03:00
Andrew Senkevich	8aa92022e2	Vector powf for x86_64 and tests. Here is implementation of vectorized powf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for powf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * math/test-float-vlen16.h: Fixed 2 argument macro. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * NEWS: Mention addition of x86_64 vector powf.	2015-06-18 17:04:07 +03:00
Andrew Senkevich	c10b9b13f7	Vector pow for x86_64 and tests. Here is implementation of vectorized pow containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for pow. * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for pow. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector pow.	2015-06-17 16:22:26 +03:00
Andrew Senkevich	1663be053d	Vector expf for x86_64 and tests. Here is implementation of vectorized expf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for expf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector expf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector expf.	2015-06-17 16:10:51 +03:00
Andrew Senkevich	9c02f663f6	Vector exp for x86_64 and tests. Here is implementation of vectorized exp containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for exp. * math/bits/mathcalls.h: Added exp declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for exp. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector exp test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector exp.	2015-06-17 15:58:05 +03:00
Andrew Senkevich	774488f88a	Vector logf for x86_64 and tests. Here is implementation of vectorized logf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for logf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector logf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector logf.	2015-06-17 15:53:00 +03:00
Andrew Senkevich	6af25acc7b	Vector log for x86_64 and tests. Here is implementation of vectorized log containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for log. * math/bits/mathcalls.h: Added log declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for log. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector log test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector log.	2015-06-17 15:38:29 +03:00
Andrew Senkevich	2a8c2c7b33	Vector sinf for x86_64 and tests. Here is implementation of vectorized sinf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for sinf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector sinf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sinf.	2015-06-15 15:06:53 +03:00
Andrew Senkevich	4b9c2b707b	Vector sin for x86_64 and tests. Here is implementation of vectorized sin containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for sin. * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sin.	2015-06-11 17:12:38 +03:00
Andrew Senkevich	2a523216d5	This patch adds vector cosf tests. * math/Makefile: Added CFLAGS for new tests. * math/test-float-vlen16.h: New file. * math/test-float-vlen4.h: New file. * math/test-float-vlen8.h: New file. * math/test-double-vlen2.h: Fixed 2 argument macro and comment. * sysdeps/x86_64/fpu/Makefile: Added new tests and variables. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen16.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8.c: New file.	2015-06-09 18:32:42 +03:00
Andrew Senkevich	04f496d602	Vector cosf for x86_64. Here is implementation of vectorized cosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf. * NEWS: Mention addition of x86_64 vector cosf.	2015-06-09 18:29:47 +03:00
Andrew Senkevich	24a2718f59	Addition of testing infrastructure for vector math functions. We test vector math functions using scalar tests infrastructure with help of special wrappers from scalar versions to vector ones. Wrapper implemented using platform specific vector types and placed in separate file for compilation with architecture specific options, main part of test has no such options. With help of system of definitions unfolding of which is drived from test code we have wrapper called in individual testing function instead of scalar function. Also system of definitions includes generated during make check header math/libm-have-vector-test.h with series of conditional definitions which help to avoid build fails for functions having no vector versions; runtime architecture check to prevent runtime fails of test run on inappropriate hardware. * math/Makefile: Added rules for vector tests. * math/gen-libm-have-vector-test.sh: Added generation of wrapper declaration under condition. * math/test-double-vlen2.h: New file. * math/test-double-vlen4.h: New file. * math/test-double-vlen8.h: New file. * math/test-vec-loop.h: Added initialization macro. * sysdeps/x86_64/fpu/Makefile: Added variables for vector tests. * sysdeps/x86_64/fpu/libm-test-ulps: Regenarated. * sysdeps/x86_64/fpu/math-tests-arch.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8.c: New file.	2015-06-09 14:51:52 +03:00
Andrew Senkevich	2193311288	Start of series of patches with x86_64 vector math functions. Here is implementation of cos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI which had been discussed in <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. Vector math library build and ABI testing enabled by default for x86_64. * sysdeps/x86_64/fpu/Makefile: New file. * sysdeps/x86_64/fpu/Versions: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos. * math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC. * sysdeps/x86_64/configure.ac: Options for libmvec build. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file. * manual/install.texi (Configuring and compiling): Document --disable-mathvec. * INSTALL: Regenerated. * NEWS: Mention addition of libmvec and x86_64 vector cos.	2015-06-09 14:25:49 +03:00
Wilco Dijkstra	0e9be4db8f	Remove various ABS macros and replace uses with fabs (or in one case abs) which is more efficient on all targets.	2015-05-15 11:04:40 +00:00
Joseph Myers	14f36098f2	Add more tests of csqrt, lgamma, log10, sinh. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of csqrt, lgamma, log10 and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-08 17:55:11 +00:00
Joseph Myers	471dffa12c	Add more tests of acosh, atanh, cos, csqrt, erfc, sin, sincos. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, atanh, cos, csqrt, erfc, sin and sincos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-06 17:30:18 +00:00
Joseph Myers	31450d9a87	Add further tests of libm functions. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. (This process must eventually converge, when my random test generation stops finding inputs that increase the listed ulps, except maybe for any cases uncovered where the errors exceed the maximum allowed 9ulp error and so indicate actual libm bugs needing fixing.) Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, atanh, clog, clog10, csqrt, erfc, exp2, expm1, log10, log2 and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-05 22:59:41 +00:00
Joseph Myers	305392eaca	Add more tests of libm functions. This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atan, clog, clog10, cos, csqrt, erf, erfc, exp2, lgamma, log1p, sin, sincos, tanh and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-02 21:06:33 +00:00
Joseph Myers	51e15247c3	Add more tests of tgamma. This patch adds some randomly-generated tests of tgamma that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 23:15:07 +00:00
Joseph Myers	5ffb9a53d7	Add more tests of tanh. This patch adds some randomly-generated tests of tanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 23:06:44 +00:00
Joseph Myers	0957e15d0a	Add more tests of tan. This patch adds some randomly-generated tests of tan that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tan. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 22:54:39 +00:00
Joseph Myers	827bb5859c	Add more tests of cos, sin, sincos. This patch adds some randomly-generated tests of cos, sin and sincos that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cos, sin and sincos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 22:41:00 +00:00
Joseph Myers	86793ae758	Add another test of pow. This patch adds a randomly-generated test of pow that is observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add another test of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-05-01 22:31:24 +00:00
Joseph Myers	038e4be99c	Add more tests of lgamma. This patch adds some randomly-generated tests of lgamma that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 22:17:19 +00:00
Joseph Myers	a0d31f36aa	Add more tests of log, log10, log1p, log2. This patch adds some randomly-generated tests of log, log10, log1p and log2 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of log, log10, log2 and log1p. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 21:08:37 +00:00
Joseph Myers	e1483b365d	Add more tests of exp, exp10, exp2, expm1. This patch adds some randomly-generated tests of exp, exp10, exp2 and expm1 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of exp, exp10, exp2 and expm1. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 20:33:04 +00:00
Joseph Myers	c5a3a509df	Add more tests of erf, erfc. This patch adds some randomly-generated tests of erf and erfc that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of erf and erfc. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-05-01 17:49:44 +00:00
Joseph Myers	9862ab1f67	Add more tests of csqrt. This patch adds some randomly-generated tests of csqrt that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of csqrt. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-30 22:51:29 +00:00
Joseph Myers	094fca83ee	Add further tests of cosh and sinh. This patch adds some further randomly-generated tests of cosh and sinh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cosh and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-30 22:32:08 +00:00
Stefan Liebler	de8aadd52c	Set errno for log1p on pole/domain error. According to bug 6792, errno is not set to ERANGE/EDOM by calling log1p/log1pf/log1pl with x = -1 or x < -1. This patch adds a wrapper which sets errno in those cases and returns the value of the existing __log1p function. The log1p is now an alias to the wrapper function instead of __log1p. The files in sysdeps are reflecting these changes. The ia64 implementation sets errno by itself, thus the wrapper-file is empty. The libm-test is adjusted for log1p-tests to check errno. [BZ #6792] * math/w_log1p.c: New file. * math/w_log1pf.c: Likewise. * math/w_log1pl.c: Likewise. * math/Makefile (libm-calls): Add w_log1p. * math/s_log1pl.c (log1pl): Remove weak_alias. * sysdeps/i386/fpu/s_log1p.S (log1p): Likewise. * sysdeps/i386/fpu/s_log1pf.S (log1pf): Likewise. * sysdeps/i386/fpu/s_log1pl.S (log1pl): Likewise. * sysdeps/x86_64/fpu/s_log1pl.S (log1pl): Likewise. * sysdeps/ieee754/dbl-64/s_log1p.c (log1p): Likewise. [NO_LONG_DOUBLE] (log1pl): Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c (log1pf): Likewise. * sysdeps/ieee754/ldbl-128/s_log1pl.c (log1pl): Likewise. * sysdeps/ieee754/ldbl-64-128/s_log1pl.c (log1p): Remove long_double_symbol. * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (log1pl): Likewise. * sysdeps/ieee754/ldbl-64-128/w_log1pl.c: New file. * sysdeps/ieee754/ldbl-128ibm/w_log1pl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_log1p.c: Define empty weak_alias to remove weak_alias for corresponding log1p function. * sysdeps/m68k/m680x0/fpu/s_log1pf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise. * sysdeps/ia64/fpu/w_log1p.c: New file. * sysdeps/ia64/fpu/w_log1pf.c: Likewise. * sysdeps/ia64/fpu/w_log1pl.c: Likewise. * math/libm-test.inc (log1p_test_data): Add errno expectations.	2015-04-13 21:19:27 +02:00
Joseph Myers	b3c66c534f	Add more tests of clog and clog10. This patch adds some randomly-generated tests of clog and clog10 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-09 22:14:34 +00:00
Joseph Myers	787d22bce6	Add more tests of atanh. This patch adds some randomly-generated tests of atanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 21:13:35 +00:00
Joseph Myers	024bcc5106	Add more tests of atan. This patch adds some randomly-generated tests of atan that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atan. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 21:00:03 +00:00
Joseph Myers	da0cf658c6	Add more tests of cbrt. This patch adds some randomly-generated tests of cbrt that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cbrt. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2015-04-08 17:56:15 +00:00
Joseph Myers	80352c01c1	Add more tests of cabs. This patch adds some randomly-generated tests of cabs that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cabs. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 17:46:07 +00:00
Joseph Myers	8431838dde	Fix dbl-64 atan2 in non-default rounding modes (bug 18210, bug 18211). The dbl-64 implementation of atan2 does computations that expect to run in round-to-nearest mode, and in other modes the errors can accumulate to more than the maximum accepted 9ulp. This patch makes it use FE_TONEAREST internally, similar to other functions with such issues. Tests that previously produced large errors are added for atan2 and the closely related carg, clog and clog10 functions. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18210] [BZ #18211] * sysdeps/ieee754/dbl-64/e_atan2.c: Include <fenv.h>. (__ieee754_atan2): Set FE_TONEAREST mode for internal computations. * math/auto-libm-test-in: Add more tests of atan2, carg, clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-04-08 17:32:17 +00:00
Joseph Myers	efd5b641dd	Add more tests of acosh, asinh and atanh. This patch adds some randomly-generated tests of acosh, asinh and atanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, asinh and atanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 22:21:20 +00:00
Joseph Myers	e9b1015112	Add another test of asin. This patch adds a randomly-generated test of asin that is observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add another test of asin. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 21:57:04 +00:00
Joseph Myers	38755f1421	Add more tests of asin. This patch adds some randomly-generated tests of asin that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of asin. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 17:53:58 +00:00
Joseph Myers	8d6439712d	Add more tests of acos. This patch adds some randomly-generated tests of acos that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 00:30:10 +00:00
Joseph Myers	bc899ea090	Add more tests of expm1. This patch adds some randomly-generated tests of expm1 that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of expm1. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-25 00:05:13 +00:00
Joseph Myers	239ed6f309	Add more tests of cosh, sinh. This patch adds some randomly-generated tests of cosh and sinh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of cosh and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-24 23:48:04 +00:00
Joseph Myers	a737e8263a	Regenerate x86_64, x86 ulps from scratch. The x86_64 and x86 libm-test-ulps files hadn't been regenerated from scratch for some time, as evidenced by the presence of entries for _tonearest functions (those tests duplicated the default-rounding-mode tests, and such duplicates are no longer run). The aarch64, alpha, hppa, ia64, m68k, microblaze, powerpc, s390, sh, sparc, tile files similarly could do with from-scratch regeneration as evidenced by the presence of such entries. (Truncate the existing file then run "make regen-ulps" and move the resulting file into place.) This patch regenerates the x86_64 and x86 files from scratch. It's likely some of the reduced / removed ulps will need restoring because they appear on processors or compiler versions other than the one I tested on, but in such cases I'd like to first see if I can generate new tests that show such ulps on the Intel processor I'm testing on, to reduce the effects from different people using different processors and compilers to regenerate the ulps. sysdeps/i386/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-24 23:30:19 +00:00
Joseph Myers	7c84a5042f	Add more tests of log2. In testing for x86_64 on an AMD processor, I observed libm test failures of the form: testing long double (without inline functions) Failure: Test: log2_downward (0x2.b7e151628aed4p+0) Result: is: 1.44269504088896356633e+00 0xb.8aa3b295c17f67600000p-3 should be: 1.44269504088896356622e+00 0xb.8aa3b295c17f67500000p-3 difference: 1.08420217248550443400e-19 0x8.00000000000000000000p-66 ulp : 1.0000 max.ulp : 0.0000 Maximal error of `log2_downward' is : 1 ulp accepted: 0 ulp These issues arise because the maximum ulps when regenerating on one processor are not the same as on another processor, so regeneration on several processors may be needed when updating libm-test-ulps to avoid failures for some users testing glibc - but such regeneration on multiple processors is inconvenient. Causes can be: on x86 and, for x86_64, for long double, variation in results of x87 instructions for transcendental operations between processors; on x86, variation in compiler excess precision between compiler versions and configurations; on any processor where the compiler may contract expressions using fused multiply-add, variation in what contraction occurs. Although it's hard to be sure libm-test-ulps covers all ulps that may be seen in any configuration for the given architecture, in practice it helps simply to add wider test coverage to make it more likely that, when testing on one processor, the ulps seen are the biggest that can be seen for that function on that processor, and hopefully they are also the biggest that can be seen for that function in other configurations for that architecture. Thus, this patch adds some tests of log2 that increase the ulps I see on x86_64 on an Intel processor, so that hopefully future from-scratch regenerations on that processor will produce ulps big enough not to have errors from testing on AMD processors. These tests were found by randomly generating inputs and seeing what produced ulps larger than those currently in libm-test-ulps. Of course such increases also improve the accuracy of the empirical table of known ulps generated from libm-test-ulps files that goes in the manual. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of log2. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-03-24 23:06:28 +00:00
Joseph Myers	2ca725c594	Fix ldbl-96, ldbl-128ibm atanhl inaccuracy (bug 18046, bug 18047). The threshold in ldbl-96 atanhl for when to return the argument, 0x1p-28, is a bit too big, and that in ldbl-128ibm atanhl is much too big (the relevant condition being x^3/3 being < 0.5ulp of x), resulting in errors a bit above the limits of those considered acceptable in glibc in the ldbl-96 case, and in large errors in the ldbl-128ibm case. This patch changes those implementations to use more appropriate thresholds and adds tests around the thresholds for various formats. Tested for x86_64, x86 and powerpc. x86_64 and x86 ulps updated accordingly. [BZ #18046] [BZ #18047] * sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl): Use 0x1p-56L as threshold for just returning the argument. * sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Use 0x1p-32L as threshold for just returning the argument. * math/auto-libm-test-in: Add more tests of atanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulp: Likewise.	2015-02-27 17:48:37 +00:00
Joseph Myers	ec0ce0d3be	Fix asin missing underflows (bug 16351). Similar to various other bugs in this area, some asin implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. Tested for x86_64, x86, powerpc and mips64. [BZ #16351] * sysdeps/i386/fpu/e_asin.S (dbl_min): New object. (MO): New macro. (__ieee754_asin): Force underflow exception for results with small absolute value. * sysdeps/i386/fpu/e_asinf.S (flt_min): New object. (MO): New macro. (__ieee754_asinf): Force underflow exception for results with small absolute value. * sysdeps/ieee754/dbl-64/e_asin.c: Include <float.h> and <math.h>. (__ieee754_asin): Force underflow exception for results with small absolute value. * sysdeps/ieee754/flt-32/e_asinf.c: Include <float.h>. (__ieee754_asinf): Force underflow exception for results with small absolute value. * sysdeps/ieee754/ldbl-128/e_asinl.c: Include <float.h>. (__ieee754_asinl): Force underflow exception for results with small absolute value. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Include <float.h>. (__ieee754_asinl): Force underflow exception for results with small absolute value. * sysdeps/ieee754/ldbl-96/e_asinl.c: Include <float.h>. (__ieee754_asinl): Force underflow exception for results with small absolute value. * sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]: Include <math.h>. * math/auto-libm-test-in: Do not mark underflow exceptions as possibly missing for bug 16351. * math/auto-libm-test-out: Regenerated.	2015-02-26 17:18:54 +00:00
Joseph Myers	137cef7d43	Fix ldbl-128ibm asinhl inaccuracy (bug 18020). The ldbl-128ibm implementation of asinhl uses cut-offs of 0x1p28 and 0x1p-29 to determine when to use simpler formulas that avoid possible overflow / underflow. Both those cut-offs are inappropriate for this format, resulting in large errors. This patch changes the code to use more appropriate cut-offs of 0x1p56 and 0x1p-56, adding tests around the cut-offs for various floating-point formats. Tested for powerpc. Also tested for x86_64 and x86 and updated ulps. [BZ #18020] * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use 256 and 2-56 not 228 and 2-29 as thresholds for simpler formulas. * math/auto-libm-test-in: Add more tests of asinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-02-25 11:13:41 +00:00
Joseph Myers	440169d681	Fix ldbl-128ibm acoshl inaccuracy (bug 18019). The ldbl-128ibm implementation of acoshl uses a cut-off of 0x1p28 to determine when to use log(x) + log(2) as a formula. That cut-off is too small for this format, resulting in large errors. This patch changes it to a more appropriate cut-off of 0x1p56, adding tests around the cut-offs for various floating-point formats. Tested for powerpc. Also tested for x86_64 and x86 and updated ulps. [BZ #18019] * sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Use 256 not 228 as threshold for log (2x) formula. * math/auto-libm-test-in: Add more tests of acosh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2015-02-25 00:01:15 +00:00
Joseph Myers	9438b237ab	Fix x86/x86_64 scalb (qNaN, -Inf) (bug 16783). Various x86 / x86_64 versions of scalb / scalbf / scalbl produce spurious "invalid" exceptions for (qNaN, -Inf) arguments, because this is wrongly handled like (+/-Inf, -Inf) which should raise such an exception. (In fact the NaN case of the code determining whether to quietly return a zero or a NaN for second argument -Inf was accidentally dead since the code had been made to return a NaN with exception.) This patch fixes the code to do the proper test for an infinity as distinct from a NaN. (Since the existing code does nothing to distinguish qNaNs and sNaNs here, this patch doesn't either. If in future we systematically implement proper sNaN semantics following TS 18661-1:2014, there will be lots of bugs to address - Thomas found lots of issues with his patch <https://sourceware.org/ml/libc-ports/2013-04/msg00008.html> to add SNaN tests (which never went in and would now require significant reworking).) Tested for x86_64 and x86. Committed. [BZ #16783] * sysdeps/i386/fpu/e_scalb.S (__ieee754_scalb): Do not handle arguments (NaN, -Inf) the same as (+/-Inf, -Inf). * sysdeps/i386/fpu/e_scalbf.S (__ieee754_scalbf): Likewise. * sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Likewise. * sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise. * math/libm-test.inc (scalb_test_data): Add more tests.	2015-02-24 17:30:02 +00:00
Joseph Myers	4629c866ad	Fix atan / atan2 missing underflows (bug 15319). This patch fixes bug 15319, missing underflows from atan / atan2 when the result of atan is very close to its small argument (or that of atan2 is very close to the ratio of its arguments, which may be an exact division). The usual approach of doing an underflowing computation if the computed result is subnormal is followed. For 32-bit x86, there are extra complications: the inline __ieee754_atan2 in bits/mathinline.h needs to be disabled for float and double because other libm functions using it generally rely on getting proper underflow exceptions from it, while the out-of-line functions have to remove excess range and precision from the underflowing result so as to return an exact 0 in the case where errno should be set for underflow to 0. (The failures I saw without that are similar to those Carlos reported for other functions, where I haven't seen a response to <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html> confirming if my diagnosis is correct. Arguably all libm functions with float and double returns should remove excess range and precision, but that's a separate matter.) The x86_64 long double case reported in a comment in bug 15319 is not a bug (it's an argument of LDBL_MIN, and x86_64 is an after-rounding architecture so the correct IEEE result is not to raise underflow in the given rounding mode, in addition to treating the result as an exact LDBL_MIN being within the newly clarified documentation of accuracy goals). I'm presuming that the fpatan instruction can be trusted to raise appropriate exceptions when the (long double) result underflows (after rounding) and so no changes are needed for x86 / x86_64 long double functions here; empirically this is the case for the cases covered in the testsuite, on my system. Tested for x86_64, x86, powerpc and mips64. Only 32-bit x86 needs ulps updates (for the changes to inlines meaning some functions no longer get excess precision from their __ieee754_atan2* calls). [BZ #15319] * sysdeps/i386/fpu/e_atan2.S (dbl_min): New object. (MO): New macro. (__ieee754_atan2): For results with small absolute value, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/e_atan2f.S (flt_min): New object. (MO): New macro. (__ieee754_atan2f): For results with small absolute value, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/s_atan.S (dbl_min): New object. (MO): New macro. (__atan): For results with small absolute value, force underflow exception and remove excess range and precision from return value. * sysdeps/i386/fpu/s_atanf.S (flt_min): New object. (MO): New macro. (__atanf): For results with small absolute value, force underflow exception and remove excess range and precision from return value. * sysdeps/ieee754/dbl-64/e_atan2.c: Include <float.h> and <math.h>. (__ieee754_atan2): Force underflow exception for results with small absolute value. * sysdeps/ieee754/dbl-64/s_atan.c: Include <float.h> and <math_private.h>. (atan): Force underflow exception for results with small absolute value. * sysdeps/ieee754/flt-32/s_atanf.c: Include <float.h>. (__atanf): Force underflow exception for results with small absolute value. * sysdeps/ieee754/ldbl-128/s_atanl.c: Include <float.h> and <math.h>. (__atanl): Force underflow exception for results with small absolute value. * sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Include <float.h>. (__atanl): Force underflow exception for results with small absolute value. * sysdeps/x86/fpu/bits/mathinline.h [!__SSE2_MATH__ && !__x86_64__ && __LIBC_INTERNAL_MATH_INLINES] (__ieee754_atan2): Only define inline for long double. * sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Include <math.h>. * math/auto-libm-test-in: Do not mark underflow exceptions as possibly missing for bug 15319. Add more tests of atan2. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (casin_test_data): Do not mark underflow exceptions as possibly missing for bug 15319. (casinh_test_data): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update.	2015-02-18 21:10:49 +00:00
Joseph Myers	03d95bd483	Fix exp2 spurious underflows (bug 16560). This patch fixes the remaining part of bug 16560, spurious underflows from exp2 of arguments close to 0 (when the result is close to 1, so should not underflow), by just using 1+x instead of a more complicated calculation when the argument is sufficiently small. Tested for x86_64, x86 and mips64. [BZ #16560] * math/e_exp2l.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine and redefine. (__ieee754_exp2l): Do not multiply small fractional parts by M_LN2l. * sysdeps/i386/fpu/e_exp2l.S (__ieee754_exp2l): Just add 1 to small argument. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise. * sysdeps/x86_64/fpu/e_exp2l.S (__ieee754_exp2l): Likewise. * math/auto-libm-test-in: Add more tests of exp2. * math/auto-libm-test-out: Regenerated.	2015-02-12 19:02:45 +00:00
Joseph Myers	8116321f65	Fix libm feupdateenv namespace (bug 17748). Concluding the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of feupdateenv by making it a weak alias for __feupdateenv and making the affected code call __feupdateenv. Tested for x86_64 (testsuite, and that installed stripped shared libraries are unchanged by the patch). Also tested for ARM (soft-float) that the math.h linknamespace tests now pass. [BZ #17748] * include/fenv.h (__feupdateenv): Use libm_hidden_proto. * math/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/arm/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/tile/math_private.h (__feupdateenv): New inline function. * sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/generic/math_private.h (default_libc_feupdateenv): Call __feupdateenv instead of feupdateenv. (default_libc_feupdateenv_test): Likewise. (libc_feresetround_ctx): Likewise.	2015-01-07 19:01:20 +00:00
Joseph Myers	01238691bb	Fix libm fesetround namespace (bug 17748). Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of fesetround by making it a weak alias of __fesetround and making the affected code call __fesetround. An existing __fesetround function in fenv_libc.h for powerpc is renamed to __fesetround_inline. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fesetround failures disappear from the linknamespace test results (feupdateenv remains to be addressed to complete fixing bug 17748). [BZ #17748] * include/fenv.h (__fesetround): Declare. Use libm_hidden_proto. * math/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. * sysdeps/aarch64/fpu/fesetround.c (fesetround): Likewise. * sysdeps/alpha/fpu/fesetround.c (fesetround): Likewise. * sysdeps/arm/fesetround.c (fesetround): Likewise. * sysdeps/hppa/fpu/fesetround.c (fesetround): Likewise. * sysdeps/i386/fpu/fesetround.c (fesetround): Likewise. * sysdeps/ia64/fpu/fesetround.c (fesetround): Likewise. * sysdeps/m68k/fpu/fesetround.c (fesetround): Likewise. * sysdeps/mips/fpu/fesetround.c (fesetround): Likewise. * sysdeps/powerpc/fpu/fenv_libc.h (__fesetround): Rename to __fesetround_inline. * sysdeps/powerpc/fpu/fenv_private.h (libc_fesetround_ppc): Call __fesetround_inline instead of __fesetround. * sysdeps/powerpc/fpu/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. Call __fesetround_inline instead of __fesetround. * sysdeps/powerpc/nofpu/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. * sysdeps/powerpc/powerpc32/e500/nofpu/fesetround.c (fesetround): Likewise. * sysdeps/s390/fpu/fesetround.c (fesetround): Likewise. * sysdeps/sh/sh4/fpu/fesetround.c (fesetround): Likewise. * sysdeps/sparc/fpu/fesetround.c (fesetround): Likewise. * sysdeps/tile/math_private.h (__fesetround): New inline function. * sysdeps/x86_64/fpu/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. * sysdeps/generic/math_private.h (default_libc_fesetround): Call __fesetround instead of fesetround. (default_libc_feholdexcept_setround): Likewise. (libc_feholdsetround_ctx): Likewise. (libc_feholdsetround_noex_ctx): Likewise.	2015-01-07 00:41:23 +00:00
Joseph Myers	cd42798aef	Fix libm fesetenv namespace (bug 17748). Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of fesetenv by making it a weak alias of __fesetenv and making the affected code (including various copies of feupdateenv which also gets called from C90 functions) call __fesetenv. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fesetenv failures disappear from the linknamespace test results (fsetround and feupdateenv remain to be addressed to complete fixing bug 17748). [BZ #17748] * include/fenv.h (__fesetenv): Use libm_hidden_proto. * math/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/aarch64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/alpha/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/arm/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/hppa/fpu/fesetenv.c (fesetenv): Likewise. * sysdeps/i386/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/ia64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/m68k/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/mips/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/powerpc/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/powerpc/nofpu/fesetenv.c (__fesetenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fesetenv.c (__fesetenv): Likewise. * sysdeps/s390/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/fesetenv.c (fesetenv): Likewise. * sysdeps/sparc/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/tile/math_private.h (__fesetenv): New inline function. * sysdeps/x86_64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/generic/math_private.h (default_libc_fesetenv): Use __fesetenv instead of fesetenv. (libc_feresetround_noex_ctx): Likewise. * sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Likewise.	2015-01-06 23:36:20 +00:00
Joseph Myers	ef9faf1385	Fix libm feholdexcept namespace (bug 17748). Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of feholdexcept by making it a weak alias of __feholdexcept and making the affected code call __feholdexcept. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that feholdexcept failures disappear from the linknamespace test failures (fesetenv, fsetround and feupdateenv remain to be addressed to complete fixing bug 17748). [BZ #17748] * include/fenv.h (__feholdexcept): Declare. Use libm_hidden_proto. * math/feholdexcpt.c (feholdexcept): Rename to __feholdexcept and define as weak alias of __feholdexcept. Use libm_hidden_weak. * sysdeps/aarch64/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/alpha/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/arm/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/hppa/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/i386/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/ia64/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/m68k/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/mips/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/powerpc/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/powerpc/nofpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/s390/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/sh/sh4/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/sparc/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/x86_64/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/generic/math_private.h (default_libc_feholdexcept): Use __feholdexcept instead of feholdexcept. (default_libc_feholdexcept_setround): Likewise.	2015-01-05 23:06:14 +00:00
Joseph Myers	b93c2205ec	Fix libm fegetround namespace (bug 17748). Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of fegetround by making it a weak alias of __fegetround and making the affected code call __fegetround. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fegetround failures disappear from the linknamespace test failures (feholdexcept, fesetenv, fesetround and feupdateenv remain to be addressed before bug 17748 is fully fixed, although this patch may suffice to fix the failures in some cases, when the libc_fe* functions are implemented but there is no architecture-specific sqrt implementation in use so there were failures from fegetround used by sqrt but no other such failures). [BZ #17748] * include/fenv.h (__fegetround): Declare. Use libm_hidden_proto. * math/fegetround.c (fegetround): Rename to __fegetround and define as weak alias of __fegetround. Use libm_hidden_weak. * sysdeps/aarch64/fpu/fegetround.c (fegetround): Likewise. * sysdeps/alpha/fpu/fegetround.c (fegetround): Likewise. * sysdeps/arm/fegetround.c (fegetround): Likewise. * sysdeps/hppa/fpu/fegetround.c (fegetround): Likewise. * sysdeps/i386/fpu/fegetround.c (fegetround): Likewise. * sysdeps/ia64/fpu/fegetround.c (fegetround): Likewise. * sysdeps/m68k/fpu/fegetround.c (fegetround): Likewise. * sysdeps/mips/fpu/fegetround.c (fegetround): Likewise. * sysdeps/powerpc/fpu/fegetround.c (fegetround): Likewise. Undefine after rather than before function definition; use parentheses around function name in definition. (__fegetround): Also undefine macro after function definition. * sysdeps/powerpc/nofpu/fegetround.c (fegetround): Rename to __fegetround and define as weak alias of __fegetround. Use libm_hidden_weak. Do not undefine as macro. * sysdeps/powerpc/powerpc32/e500/nofpu/fegetround.c (fegetround): Likewise. * sysdeps/s390/fpu/fegetround.c (fegetround): Rename to __fegetround and define as weak alias of __fegetround. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/fegetround.c (fegetround): Likewise. * sysdeps/sparc/fpu/fegetround.c (fegetround): Likewise. * sysdeps/tile/math_private.h (__fegetround): New inline function. * sysdeps/x86_64/fpu/fegetround.c (fegetround): Rename to __fegetround and define as weak alias of __fegetround. Use libm_hidden_weak. * sysdeps/ieee754/dbl-64/e_sqrt.c (__ieee754_sqrt): Use __fegetround instead of fegetround.	2015-01-02 20:44:42 +00:00
Joseph Myers	b168057aaa	Update copyright dates with scripts/update-copyrights.	2015-01-02 16:29:47 +00:00
Joseph Myers	73a268c759	Fix libm fegetenv namespace (bug 17748). Some C90 libm functions call fegetenv via libc_feholdsetround* functions in math_private.h. This patch makes them call __fegetenv instead, making fegetenv into a weak alias for __fegetenv as needed. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fegetenv failures disappear from the linknamespace test failures (however, similar fixes will also be needed for fegetround, feholdexcept, fesetenv, fesetround and feupdateenv before this set of namespace issues covered by bug 17748 is fully fixed and those linknamespace tests start passing). [BZ #17748] * include/fenv.h (__fegetenv): Use libm_hidden_proto. * math/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/aarch64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/alpha/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/arm/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/hppa/fpu/fegetenv.c (fegetenv): Likewise. * sysdeps/i386/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/ia64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/m68k/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/mips/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/powerpc/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/powerpc/nofpu/fegetenv.c (__fegetenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (__fegetenv): Likewise. * sysdeps/s390/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/fegetenv.c (fegetenv): Likewise. * sysdeps/sparc/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/tile/math_private.h (__fegetenv): New inline function. * sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/generic/math_private.h (libc_feholdsetround_ctx): Use __fegetenv instead of fegetenv. (libc_feholdsetround_noex_ctx): Likewise.	2014-12-31 22:07:52 +00:00
Joseph Myers	0747f81811	Fix libm feraiseexcept namespace (bug 17723). Various C90 and UNIX98 libm functions call feraiseexcept, which is not in those standards. This causes linknamespace test failures - except on x86 / x86_64, where feraiseexcept is inline (for the relevant constant arguments) in bits/fenv.h. This patch fixes this by making those functions call __feraiseexcept instead. All changes are applied to all architectures rather than considering the possibility that some might not be needed in some cases (e.g. x86) as it seems most maintainable to keep architectures consistent. Where __feraiseexcept does not exist, it is added, with feraiseexcept made a weak alias; where it is a strong alias, it is made weak. libm_hidden_def / libm_hidden_proto are used with __feraiseexcept (this might in some cases improve code generation for existing calls to __feraiseexcept in some code on some architectures). Where there are dummy feraiseexcept macros (on architectures without floating-point exceptions support, to avoid compile errors from references to undefined FE_* macros), corresponding dummy __feraiseexcept macros are added. And on x86, to ensure __feraiseexcept calls still get inlined, the inline function in bits/fenv.h is refactored so that most of it can be reused in an inline __feraiseexcept in a separate include/bits/fenv.h. Calls are changed in C90/UNIX98 functions, but generally not in functions missing from those standards. They are also changed in libc_fe* functions (on the basis that those might be used in any libm function), and in feupdateenv (on the same basis - may be used, via default libc_, in any libm function - of course feupdateenv will need changing to __feupdateenv in a subsequent patch to make that fully namespace-clean). No __feraiseexcept is added corresponding to the feraiseexcept in powerpc bits/fenvinline.h, because that macro definition is conditional on !defined __NO_MATH_INLINES, and glibc libm is built with -D__NO_MATH_INLINES, so changing internal calls to use __feraiseexcept should make no difference. Tested for x86_64 (testsuite; the only change in disassembly of installed shared libraries is a slight code reordering in clog10, of no apparent significance). Also tested for MIPS, where (in the configuration tested) it eliminates math.h linknamespace failures for n32 and n64 (some for o32 remain because of other issues). [BZ #17723] include/fenv.h (__feraiseexcept): Use libm_hidden_proto. * math/fraiseexcpt.c (__feraiseexcept): Use libm_hidden_def. * sysdeps/aarch64/fpu/fraiseexcpt.c (feraiseexcept): Rename to __feraiseexcept and define as weak alias of __feraiseexcept. Use libm_hidden_weak. * sysdeps/arm/fraiseexcpt.c (feraiseexcept): Likewise. * sysdeps/hppa/fpu/fraiseexcpt.c (feraiseexcept): Likewise. * sysdeps/i386/fpu/fraiseexcpt.c (__feraiseexcept): Use libm_hidden_def. * sysdeps/ia64/fpu/fraiseexcpt.c (feraiseexcept): Rename to __feraiseexcept and define as weak alias of __feraiseexcept. Use libm_hidden_weak. * sysdeps/m68k/coldfire/fpu/fraiseexcpt.c (feraiseexcept): Likewise. * sysdeps/microblaze/math_private.h (__feraiseexcept): New macro. * sysdeps/mips/fpu/fraiseexcpt.c (feraiseexcept): Rename to __feraiseexcept and define as weak alias of __feraiseexcept. Use libm_hidden_weak. * sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Use libm_hidden_def. * sysdeps/powerpc/nofpu/fraiseexcpt.c (__feraiseexcept): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fraiseexcpt.c (__feraiseexcept): Likewise. * sysdeps/s390/fpu/fraiseexcpt.c (feraiseexcept): Rename to __feraiseexcept and define as weak alias of __feraiseexcept. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/fraiseexcpt.c (feraiseexcept): Likewise. * sysdeps/sparc/fpu/fraiseexcpt.c (__feraiseexcept): Use libm_hidden_def. * sysdeps/tile/math_private.h (__feraiseexcept): New macro. * sysdeps/unix/sysv/linux/alpha/fraiseexcpt.S (__feraiseexcept): Use libm_hidden_def. * sysdeps/x86_64/fpu/fraiseexcpt.c (__feraiseexcept): Use libm_hidden_def. (feraiseexcept): Define as weak not strong alias. Use libm_hidden_weak. * sysdeps/x86/fpu/bits/fenv.h (__feraiseexcept_invalid_divbyzero): New inline function. Factored out of ... (feraiseexcept): ... here. Use __feraiseexcept_invalid_divbyzero. * sysdeps/x86/fpu/include/bits/fenv.h: New file. * math/e_scalb.c (invalid_fn): Call __feraiseexcept instead of feraiseexcept. * math/w_acos.c (__acos): Likewise. * math/w_asin.c (__asin): Likewise. * math/w_ilogb.c (__ilogb): Likewise. * math/w_j0.c (y0): Likewise. * math/w_j1.c (y1): Likewise. * math/w_jn.c (yn): Likewise. * math/w_log.c (__log): Likewise. * math/w_log10.c (__log10): Likewise. * sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/aarch64/fpu/math_private.h (libc_feupdateenv_test_aarch64): Likewise. * sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/arm/fenv_private.h (libc_feupdateenv_test_vfp): Likewise. * sysdeps/arm/feupdateenv.c (feupdateenv): Likewise. * sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Likewise. * sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Likewise.	2014-12-30 17:08:09 +00:00
Joseph Myers	5ae4fe60e6	Remove x86_64 __GNUC_PREREQ (4, 6) conditional. This patch removes a conditional on __GNUC_PREREQ (4, 6) in x86_64 code. Tested for x86_64 that installed shared libraries are unchanged by this patch. Committed (I think this file reasonably comes under math maintainership). * sysdeps/x86_64/fpu/dla.h [__FMA4__ && __GNUC_PREREQ (4, 6)] (DLA_FMS): Make definition conditional only on [__FMA4__]. [__FMA4__ && !__GNUC_PREREQ (4, 6)] (DLA_FMS): Remove conditional definition.	2014-11-14 18:53:07 +00:00
Joseph Myers	be25493251	Fix yn overflow handling in non-default rounding modes (bug 16561, bug 16562). This patch fixes bugs 16561 and 16562, bad results of yn in overflow cases in non-default rounding modes, both because an intermediate overflow in the recurrence does not get detected if the result is not an infinity and because an overflowing result may occur in the wrong sign. The fix is to set FE_TONEAREST mode internally for the parts of the function where such overflows can occur (which includes the call to y1 - where yn is used to compute a Bessel function of order -1, negating the result of y1 isn't correct for overflowing results in directed rounding modes) and then compute an overflowing value in the original rounding mode if the to-nearest result was an infinity. Tested x86_64 and x86 and ulps updated accordingly. Also tested for mips64 and powerpc32 to test the ldbl-128 and ldbl-128ibm changes. (The tests for these bugs were added in my previous y1 patch, so the only thing this patch has to do with the testsuite is enable yn testing in all rounding modes.) [BZ #16561] [BZ #16562] * sysdeps/ieee754/dbl-64/e_jn.c: Include <float.h>. (__ieee754_yn): Set FE_TONEAREST mode internally and then recompute overflowing results in original rounding mode. * sysdeps/ieee754/flt-32/e_jnf.c: Include <float.h>. (__ieee754_ynf): Set FE_TONEAREST mode internally and then recompute overflowing results in original rounding mode. * sysdeps/ieee754/ldbl-128/e_jnl.c: Include <float.h>. (__ieee754_ynl): Set FE_TONEAREST mode internally and then recompute overflowing results in original rounding mode. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Include <float.h>. (__ieee754_ynl): Set FE_TONEAREST mode internally and then recompute overflowing results in original rounding mode. * sysdeps/ieee754/ldbl-96/e_jnl.c: Include <float.h>. (__ieee754_ynl): Set FE_TONEAREST mode internally and then recompute overflowing results in original rounding mode. * sysdeps/i386/fpu/fenv_private.h [!__SSE2_MATH__] (libc_feholdsetround_ctx): New macro. * math/libm-test.inc (yn_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps : Likewise.	2014-06-27 14:52:13 +00:00
Joseph Myers	a638de828d	Fix exp10 spurious underflows (bug 16560). This patch fixes spurious underflows from exp10 for arguments near 0 (part of bug 16560; that bug also includes spurious underflows from exp2, which are not fixed by this patch). The problem is underflows in the internal computation converting the exp10 argument to arguments for exp (with extra precision), and the fix is simply to return 1 early for arguments near enough to 0 (just as arguments with large enough magnitude have their own overflow / underflow logic at the start of the function). Tested x86_64 and x86 and ulps updated accordingly; also tested for powerpc32 and mips64 to validate the ldbl-128ibm and ldbl-128 changes. [BZ #16560] * sysdeps/ieee754/dbl-64/e_exp10.c (__ieee754_exp10): Return 1 for arguments close to 0. * sysdeps/ieee754/ldbl-128/e_exp10l.c (__ieee754_exp10l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_exp10l.c (__ieee754_exp10l): Likewise. * math/auto-libm-test-in: Add more tests of exp10. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2014-06-25 11:33:22 +00:00
Joseph Myers	4060283dec	Fix x86/x86_64 expm1l spurious underflow exceptions (bug 16539). This patch fixes bug 16539, spurious underflow exceptions from x86 / x86-64 expm1l. The problem is that the computation of a base-2 exponent with extra precision involves spurious underflows for arguments that are small but not subnormal, so a check is added to just return the argument in those cases. (If the argument is subnormal, underflowing is correct and the existing code will always underflow, so it suffices to keep using the existing code in that case; some expm1 implementations have a bug (bug 16353) with missing underflow exceptions, but I don't think there's such a bug in this particular version.) Tested x86_64 and x86; no ulps updates needed. (auto-libm-test-out diffs omitted below.) [BZ #16539] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Just return the argument for normal arguments with exponent below -64. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add another test of expm1. * math/auto-libm-test-out: Regenerated.	2014-06-24 21:00:08 +00:00
Joseph Myers	863893ec95	Test cpow in all rounding modes. This patch enables testing of cpow in all rounding modes using ALL_RM_TEST. There were two reasons this was previously deferred: * MPC has complicated rounding-mode-dependent rules for the signs of exact zero real or imaginary parts in the result of mpc_pow. Annex G does not impose any such requirements and I don't think glibc should try to implement any particular logic here. This patch adds support for gen-auto-libm-tests passing the IGNORE_ZERO_INF_SIGN flag to libm-test.inc. * Error accumulations in some tests in non-default rounding modes exceed the maximum error permitted in libm-test.inc. This patch marks the problem tests with xfail-rounding. (It might be possible to reduce the accumulations a bit by using round-to-nearest when cpow calls clog, but I don't think there's much point; the implementation approach for cpow is fundamentally deficient, as discussed in the existing bug for cpow inaccuracy which can reasonably be considered to cover these less-inaccurate cases as well. It's possible that the test "cpow 2 0 10 0" will also need xfail-rounding on some platforms.) Tested x86_64 and x86 and ulps updated accordingly. * math/gen-auto-libm-tests.c: Document use of ignore-zero-inf-sign. (input_flag_type): Add value flag_ignore_zero_inf_sign. (input_flags): Add ignore-zero-inf-sign. (output_for_one_input_case): Handle flag_ignore_zero_inf_sign. * math/gen-libm-test.pl (generate_testfile): Handle ignore-zero-inf-sign. * math/auto-libm-test-in: Mark some cpow tests with ignore-zero-inf-sign and some with xfail-rounding. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cpow_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-06-23 20:15:14 +00:00
Joseph Myers	4da6db5188	Fix pow overflow in non-default rounding modes (bug 16315). This patch fixes bug 16315, bad pow handling of overflow/underflow in non-default rounding modes. Tests of pow are duly converted to ALL_RM_TEST to run all tests in all rounding modes. There are two main issues here. First, various implementations compute a negative result by negating a positive result, but this yields inappropriate overflow / underflow values for directed rounding, so either overflow / underflow results need recomputing in the correct sign, or the relevant overflowing / underflowing operation needs to be made to have a result of the correct sign. Second, the dbl-64 implementation sets FE_TONEAREST internally; in the overflow / underflow case, the result needs recomputing in the original rounding mode. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16315] * sysdeps/i386/fpu/e_pow.S (__ieee754_pow): Ensure possibly overflowing or underflowing operations take place with sign of result. * sysdeps/i386/fpu/e_powf.S (__ieee754_powf): Likewise. * sysdeps/i386/fpu/e_powl.S (__ieee754_powl): Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Include <math.h>. (__ieee754_pow): Recompute overflowing and underflowing results in original rounding mode. * sysdeps/x86/fpu/powl_helper.c: Include <stdbool.h>. (__powl_helper): Allow negative argument X and scale negated value as needed. Avoid passing value outside [-1, 1] to f2xm1. * sysdeps/x86_64/fpu/e_powl.S (__ieee754_powl): Ensure possibly overflowing or underflowing operations take place with sign of result. * sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]: Include <math.h>. * math/auto-libm-test-in: Add more tests of pow. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (pow_test): Use ALL_RM_TEST. (pow_tonearest_test_data): Remove. (pow_test_tonearest): Likewise. (pow_towardzero_test_data): Likewise. (pow_test_towardzero): Likewise. (pow_downward_test_data): Likewise. (pow_test_downward): Likewise. (pow_upward_test_data): Likewise. (pow_test_upward): Likewise. (main): Don't call removed functions. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-06-23 20:12:33 +00:00
Joseph Myers	4ba7a00fe3	Fix __ieee754_logl (-LDBL_MAX) in FE_DOWNWARD mode (bug 17022). This patch fixes __ieee754_logl (-LDBL_MAX) on x86_64 and x86 not to subtract 1 from its argument and so cause spurious overflow in FE_DOWNWARD mode. (For any argument strictly less than -1, it doesn't matter whether or not 1 is subtracted before computing log1p, as long as the result doesn't overflow to -Inf.) Tested x86_64 and x86. (This particular case lacks test coverage, since the testsuite doesn't cover -lieee, but it will be covered by tests after the following patch to test pow in all rounding modes, which was the context in which this bug was found.) [BZ #17022] * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Do not subtract 1 from arguments -2 or below. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise.	2014-06-18 12:32:01 +00:00
Joseph Myers	f8ba1b5654	Fix log2 (1) in round-downward mode (bug 17042). As with other issues of this kind, bug 17042 is log2 (1) wrongly returning -0 instead of +0 in round-downward mode because of implementations effectively in terms of log1p (x - 1). This patch fixes the issue in the same way used for log and log10. Tested x86_64 and x86 and ulps updated accordingly. Also tested for mips64 to confirm a fix was needed for ldbl-128 and to validate that fix (also applied to ldbl-128ibm since that version of log2l is essentially the same as the ldbl-128 one). [BZ #17042] * sysdeps/i386/fpu/e_log2.S (__ieee754_log2): Take absolete value when x - 1 is zero. * sysdeps/i386/fpu/e_log2f.S (__ieee754_log2f): Likewise. * sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Likewise. * sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Return 0.0L for an argument of 1.0L. * sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l): Likewise. * sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Take absolute value when x - 1 is zero. * math/libm-test.inc (log2_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-06-10 12:07:15 +00:00
Joseph Myers	b72592e75f	Fix log10 (1) in round-downward mode (bug 16977). As with various other issues of this kind, bug 16977 is log10 (1) wrongly returning -0 rather than +0 in round-downward mode because of an implementation effectively in terms of log1p (x - 1). This patch fixes the issue in the same way used for log. Tested x86_64 and x86 and ulps updated accordingly. Also tested for mips64 to confirm a fix was needed for ldbl-128 and to validate that fix (also applied to ldbl-128ibm since that version of logl is essentially the same as the ldbl-128 one). [BZ #16977] * sysdeps/i386/fpu/e_log10.S (__ieee754_log10): Take absolute value when x - 1 is zero. * sysdeps/i386/fpu/e_log10f.S (__ieee754_log10f): Likewise. * sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Likewise. * sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Return 0.0L for an argument of 1.0L. * sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise. * sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Take absolute value when x - 1 is zero. * math/libm-test.inc (log10_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-05-23 12:07:50 +00:00
Joseph Myers	1a84c3d6d4	Fix log1pl (LDBL_MAX) in FE_UPWARD mode (bug 16564). Bug 16564 is spurious overflow of log1pl (LDBL_MAX) in FE_UPWARD mode, resulting from log1pl adding 1 to its argument (for arguments not close to 0), which overflows in that mode. This patch fixes this by avoiding adding 1 to large arguments (precisely what counts as large depends on the floating-point format). Tested x86_64 and x86, and spot-checked log1pl tests on mips64 and powerpc64. [BZ #16564] * sysdeps/i386/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive arguments with exponent 65 or above. * sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Do not add 1 to arguments 0x1p113L or above. * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Do not add 1 to arguments 0x1p107L or above. * sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive arguments with exponent 65 or above. * math/auto-libm-test-in: Add more tests of log1p. * math/auto-libm-test-out: Regenerated.	2014-05-14 12:38:56 +00:00
Joseph Myers	01dbacd22a	Fix cacos (+Inf + finitei) in round-downward mode (bug 16928). According to C99/C11 Annex G, cacos applied to a value with real part +Inf and finite imaginary part should produce a result with real part +0. glibc wrongly produces a result with real part -0 in FE_DOWNWARD mode. This patch fixes this by checking for zero results in the relevant case of non-finite arguments (where there should never be a result with -0 real part), and converts the tests of cacos to ALL_RM_TEST. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16928] math/s_cacos.c (__cacos): Ensure zero real part of result from non-finite arguments is +0. * math/s_cacosf.c (__cacosf): Likewise. * math/s_cacosl.c (__cacosl): Likewise. * math/libm-test.inc (cacos_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-05-14 12:37:24 +00:00
Joseph Myers	913d03c864	Fix acosh (1) in round-downward mode (bug 16927). According to C99 and C11 Annex F, acosh (1) should be +0 in all rounding modes. However, some implementations in glibc wrongly return -0 in round-downward mode (which is what you get if you end up computing log1p (-0), via 1 - 1 being -0 in round-downward mode). This patch fixes the problem implementations, by correcting the test for an exact 1 value in the ldbl-96 implementation to allow for the explicit high bit of the mantissa, and by inserting fabs instructions in the i386 implementations; tests of acosh are duly converted to ALL_RM_TEST. I believe all the other sysdeps/ieee754 implementations are already OK (I haven't checked the ia64 versions, but if buggy then that will be obvious from the results of test runs after this patch is in). Tested x86_64 and x86 and ulps updated accordingly. [BZ #16927] * sysdeps/i386/fpu/e_acosh.S (__ieee754_acosh): Use fabs on x-1 value. * sysdeps/i386/fpu/e_acoshf.S (__ieee754_acoshf): Likewise. * sysdeps/i386/fpu/e_acoshl.S (__ieee754_acoshl): Likewise. * sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Correct for explicit high bit of mantissa when testing for argument equal to 1. * math/libm-test.inc (acosh_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-05-14 12:35:40 +00:00
Joseph Myers	a84e78c8b3	Fix catan, catanh, __ieee754_logf in round-downward mode (bug 16799, bug 16800). This patch fixes incorrect results from catan and catanh of certain special inputs in round-downward mode (bug 16799), and incorrect results of __ieee754_logf (+/-0) in round-downward mode (bug 16800) that show up through catan/catanh when tested in all rounding modes, but not directly in the testing for logf because the bug gets hidden by the wrappers. Both bugs involve a zero that should be +0 being -0 instead: one computed as (1-x)(1+x) in the catan/catanh case, and one as (x-x) in the logf case. The fixes ensure positive zero is used. Testing of catan and catanh in all rounding modes is duly enabled. I expect there are various other bugs in special cases in __ieee754_ functions that are normally hidden by the wrappers but would show up for testing with -lieee (or in future with -fno-math-errno if we replace -lieee and _LIB_VERSION with compile-time redirection to new _noerrno symbol names). Tested x86_64 and x86 and ulps updated accordingly. [BZ #16799] [BZ #16800] math/s_catan.c (__catan): Avoid passing -0 denominator to atan2 with 0 numerator. * math/s_catanf.c (__catanf): Likewise. * math/s_catanh.c (__catanh): Likewise. * math/s_catanhf.c (__catanhf): Likewise. * math/s_catanhl.c (__catanhl): Likewise. * math/s_catanl.c (__catanl): Likewise. * sysdeps/ieee754/flt-32/e_logf.c (__ieee754_logf): Always divide by positive zero when computing -Inf result. * math/libm-test.inc (catan_test): Use ALL_RM_TEST. (catanh_test): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-04-02 17:41:02 +00:00
Joseph Myers	6f05bafeba	Fix clog / clog10 sign of zero result in round-downward mode (bug 16789). This patch fixes bug 16789, incorrect sign of (real part) zero result from clog and clog10 in round-downward mode, arising from that real part being computed as 0 - 0. To ensure that an underflow exception occurred, the code used an underflowing value (the next term in the series for log1p) in arithmetic computing the real part of the result, yielding the problematic 0 - 0 computation in some cases even when the mathematical result would be small but positive. The patch changes this code to use the math_force_eval approach to ensuring that an underflowing computation actually occurs. Tests of clog and clog10 are enabled in all rounding modes. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16789] * math/s_clog.c (__clog): Use math_force_eval to ensure underflow instead of using underflowing value in computing result. * math/s_clog10.c (__clog10): Likewise. * math/s_clog10f.c (__clog10f): Likewise. * math/s_clog10l.c (__clog10l): Likewise. * math/s_clogf.c (__clogf): Likewise. * math/s_clogl.c (__clogl): Likewise. * math/libm-test.inc (clog_test): Use ALL_RM_TEST. (clog10_test): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-04-02 13:10:19 +00:00
Joseph Myers	03a7091fa2	Fix x86/x86_64 expl/exp10l spurious underflows (bug 16348). This patch fixes bug 16348, spurious underflows from x86/x86_64 expl on arguments close to 0. These implementations effectively use expm1 (on the fractional part of the argument) internally, so resulting in spurious underflows when the result is very close to 1. For arguments small enough that the round-to-nearest correct result is 1, this patch uses 1+x instead. These implementations are also used for exp10l and so the patch fixes similar issues there (the 0x1p-67 threshold being small enough to be correct for exp10l as well as expl). But because of spurious underflows in other exp10 implementations (bug 16560), the tests aren't added for exp10 at this point - they can be added when the other exp10 parts of that bug are fixed. Tested x86_64 and x86; no ulps updates needed. [BZ #16348] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Use 1+x for argument with exponent below -67. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of exp. * math/auto-libm-test-out: Regenerated.	2014-03-27 18:41:14 +00:00
Joseph Myers	9be36fb8cb	Make x86_64 fegetenv preserve exception mask (bug 16198). Bug 16198 is x86_64 fegetenv wrongly masking exceptions for which traps are enabled, because that's a side-effect of the fnstenv instruction. This patch fixes it to use fldenv immediately after fnstenv, like the i386 version. Tested x86_64 and x86. [BZ #16198] * sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Use fldenv after fnstenv. * math/test-fenv-preserve.c: New file. * math/Makefile (tests): Add test-fenv-preserve.	2014-03-26 18:59:08 +00:00
Joseph Myers	046651c168	Relax gen-auto-libm-tests may-underflow rules, test log1p in all rounding modes. gen-auto-libm-tests presently allows but does not require underflow exceptions for results with magnitude in the range (greatest subnormal, least normal]. In some cases, the magnitude of the exact result is very slightly above the least normal, but rounding in the implementation results in it effectively computing an infinite-precision result that is slightly below the least normal, so raising an underflow exception. This is in accordance with the documented accuracy goals, but results in testsuite failures. This patch changes the logic to allow underflows when the mathematical result is up to 0.5ulp above the least normal (so in any case where the round-to-nearest result is the least normal). Ideally underflows in all these cases would be accepted only when an underflow with the actual result is consistent with the rounding mode (in FE_TOWARDZERO mode, a return value of the least normal implies that the infinite-precision result did not underflow so there should be no underflow exception, for example), so as to match the documented goals more precisely - whereas at present the tests for exceptions are completely independent of the tests of the returned values. (The same applies to overflow exceptions as well - they too should be checked for consistency with the result, as in FE_TOWARDZERO mode a result 1ulp below the largest finite value should be inconsistent with an overflow exception and cause a failure with overflow rather than simply being considered a 1ulp error when overflow is expected.) But the present patch at least deals with the cases causing spurious failures so that (a) certain existing tests no longer need to be marked as having spurious exceptions (such markings in auto-libm-test-in end up applying to more cases than just those they are needed for) and (b) log1p can be tested in all rounding modes without introducing more such failures. This patch duly moves tests of log1p to ALL_RM_TEST. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16357] [BZ #16599] * math/gen-auto-libm-tests.c (fp_format_desc): Add field min_plus_half. (fp_formats): Update initializers. (init_fp_formats): Initialize new field. (output_for_one_input_case): Allow underflow for results up to min_plus_half. * math/libm-test.inc (log1p_test): Use ALL_RM_TEST. * math/auto-libm-test-in: Don't mark some underflows from asin and atanh as spurious. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-25 12:26:06 +00:00
Joseph Myers	f3426898bf	Fix implicit __isinf declarations in exp. My recent exp patch introduced warnings about implicit __isinf declarations in exp because e_exp.c didn't include <math.h>. This patch fixes this. Because <math.h> can't be included after <math_private.h> (because of macro definitions of __nan), it was necessary to put an include in sysdeps/x86_64/fpu/multiarch/e_exp.c as well. Tested x86_64. sysdeps/ieee754/dbl-64/e_exp.c: Include <math.h>. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise.	2014-03-24 22:00:32 +00:00
Joseph Myers	b376a11a19	Fix dbl-64 exp overflow/underflow in non-default rounding modes (bug 16284). The dbl-64 version of exp needs round-to-nearest mode for its internal computations, but that has the consequence of inappropriate overflowing and underflowing results in other rounding modes. This patch fixes this by recomputing the relevant results in cases where the round-to-nearest result overflows to infinity or underflows to zero (most of the diffs are actually just consequent reindentation). Tests are enabled in all rounding modes for complex functions using exp - but not for cexp because it turns out there are bugs causing spurious underflows for cexp for some tests, which will need to be fixed separately (I suspect ccos ccosh csin csinh ctan ctanh have similar bugs, just not shown by the present set of test inputs). Tested x86_64 and x86 and ulps updated accordingly. [BZ #16284] * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Use original rounding mode to recompute results that overflow to infinity or underflow to zero. * math/auto-libm-test-in: Don't mark tests as expected to fail for bug 16284. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (ccos_test): Use ALL_RM_TEST. (ccosh_test): Likewise. (csin_test_data): Use plus_oflow. (csin_test): Use ALL_RM_TEST. (csinh_test_data): Use plus_oflow. (csinh_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-24 12:18:45 +00:00
Joseph Myers	f7be737659	Fix log (1) in round-downward mode (bug 16731). According to ISO C Annex F, log (1) should be +0 in all rounding modes, but some implementations in glibc wrongly return -0 in round-downward mode (mapping to log1p (x - 1) is problematic because 1 - 1 is -0 in round-downward mode, and log1p (-0) is -0). This patch fixes this. (It helps with some implementations of other functions such as acosh, log2 and log10 that call out to log, but not enough to enable all-rounding-modes testing for those functions without further fixes to other implementations of them.) Tested x86_64 and x86 and ulps updated accordingly, and did spot tests for mips64 for the ldbl-128 fix, and i586 for the sysdeps/i386/fpu implementations shadowed by those in sysdeps/i386/i686/fpu. [BZ #16731] * sysdeps/i386/fpu/e_log.S (__ieee754_log): Take absolute value when x - 1 is zero. * sysdeps/i386/fpu/e_logf.S (__ieee754_logf): Likewise. * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Return +0 when argument is 1. * sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S: Take absolute value when x - 1 is zero. * math/libm-test.inc (log_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-21 18:13:58 +00:00
Joseph Myers	8c92dfff41	Test most libm functions in all rounding modes. This patch makes libm-test.inc tests of most functions use ALL_RM_TEST unless there was some reason to defer that change for a particular function. I started out planning to defer the change for pow (bug 16315), cexp / ccos / ccosh / csin / csinh (likely fallout from exp, bug 16284) and cpow (exact expectations for signs of exact zero results not wanted). Testing on x86_64 and x86 showed additional failures for acosh, cacos, catan, catanh, clog, clog10, jn, log, log10, log1p, log2, tgamma, yn, so making the change for those functions was deferred as well, pending investigation to show which of these represent distinct bugs (some such bugs may already be filed) and appropriate fixing / XFAILing. Failures include wrong signs of zero results, errors slightly above the 9ulp bound (in such cases it may make sense for functions to set round-to-nearest internally to reduce error accumulation), large errors and incorrect overflow/underflow for the rounding mode (with consequent missing errno settings in some cases). It's possible some could be issues with test expectations, though I didn't notice any that were obviously like that (I added NO_TEST_INLINE for cases that were failing for ildoubl on x86 and where it seemed reasonable for them to fail for the fast-math inlines). There may of course be failures on other architectures for functions that didn't fail on x86_64 or x86, in which case the usual rule applies: file a bug (preferably identifying the underlying problem function, in cases where function A calls function B and a problem with function B may present in the test results for function A) if not already in Bugzilla then fix or XFAIL. Tested x86_64 and x86 and ulps updated accordingly. * math/libm-test.inc (asinh_test): Use ALL_RM_TEST. (atan_test): Likewise. (atanh_test_data): Use NO_TEST_INLINE for two tests. (atanh_test): Use ALL_RM_TEST. (atan2_test_data): Likewise. (cabs_test): Likewise. (cacosh_test): Likewise. (carg_test): Likewise. (casin_test): Likewise. (casinh_test): Likewise. (cbrt_test): Likewise. (csqrt_test): Likewise. (erf_test): Likewise. (erfc_test): Likewise. (pow10_test): Likewise. (exp2_test): Likewise. (hypot_test): Likewise. (j0_test): Likewise. (j1_test): Likewise. (lgamma_test): Likewise. (gamma_test): Likewise. (sincos_test): Likewise. (tanh_test): Likewise. (y0_test): Likewise. (y1_test): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-21 00:03:38 +00:00
Joseph Myers	e6b6a85705	Don't include individual test ulps in libm-test-ulps. As recently discussed <https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it doesn't seem particularly useful for libm-test-ulps files to contain huge amounts of data on ulps for individual tests; just the global maximum observed ulps for each function, together with the verification of exceptions, errno and special results such as infinities and NaNs for each test, suffices to verify that a function's behavior on the given test inputs is within the expected accuracy. Removing this data reduces source tree churn caused by updates to these files when libm tests are added, and reduces the frequency with which testsuite additions actually need libm-test-ulps changes at all. Accordingly, this patch removes that data, so that individual tests get checked against the global bounds for the given function and only generate an error if those are exceeded. Tested x86_64 (including verifying that if an ulps value is artificially reduced, the tests do indeed fail as they should and "make regen-ulps" generates the expected changes). * math/libm-test.inc (struct ulp_data): Don't refer to ulps for individual tests in comment. (libm-test-ulps.h): Don't refer to test_ulps in #include comment. (prev_max_error): New variable. (prev_real_max_error): Likewise. (prev_imag_max_error): Likewise. (compare_ulp_data): Don't refer to test names in comment. (find_test_ulps): Remove function. (find_function_ulps): Likewise. (find_complex_function_ulps): Likewise. (init_max_error): Take function name as argument. Look up ulps for that function. (print_ulps): Remove function. (print_max_error): Use prev_max_error instead of calling find_function_ulps. (print_complex_max_error): Use prev_real_max_error and prev_imag_max_error instead of calling find_complex_function_ulps. (check_float_internal): Take max_ulp parameter instead of calling find_test_ulps. Don't call print_ulps. (check_float): Update call to check_float_internal. (check_complex): Update calls to check_float_internal. (START): Pass argument to init_max_error. * math/gen-libm-test.pl (%results): Don't include "kind" information. (parse_ulps): Don't handle ulps of individual tests. (print_ulps_file): Likewise. (output_ulps): Likewise. * math/README.libm-test: Update. * manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of individual tests. * sysdeps/aarch64/libm-test-ulps: Remove individual test ulps. * sysdeps/alpha/fpu/libm-test-ulps: Likewise. * sysdeps/arm/libm-test-ulps: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/ia64/fpu/libm-test-ulps: Likewise. * sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise. * sysdeps/microblaze/libm-test-ulps: Likewise. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps: Likewise. * sysdeps/s390/fpu/libm-test-ulps: Likewise. * sysdeps/sh/libm-test-ulps: Likewise. * sysdeps/sparc/fpu/libm-test-ulps: Likewise. * sysdeps/tile/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise. * sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.	2014-03-05 15:02:38 +00:00
Siddhesh Poyarekar	1cadc85813	Fix sign of input to bsloww1 (BZ #16623 ) In `84ba214c`, I removed some redundant sign computations and in the process, I incorrectly got rid of a temporary variable, thus passing the absolute value of the input to bsloww1. This caused #16623. This fix undoes the incorrect change.	2014-02-27 21:12:09 +05:30
Joseph Myers	63689d6165	Move tests of clog10 from libm-test.inc to auto-libm-test-in. This patch moves tests of clog10 to auto-libm-test-in. Note that this means gen-auto-libm-tests will now depend on the recent MPC 1.0.2 release which added a fix for a bug that made gen-auto-libm-tests hang for clog10. (It still can't conveniently be used for cacos cacosh casin casinh catan catanh csin csinh because of extreme slowness of those functions for special cases in MPC; at least some slow cases of csin / csinh are fixed in MPC trunk, but not in a release.) Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of clog10. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (clog10_test_data): Use AUTO_TESTS_c_c. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-02-19 14:26:29 +00:00
Joseph Myers	a4fb786185	Fix gen-auto-libm-tests sticky bit setting for negative results. gen-auto-libm-tests has a bug in the logic for setting a sticky bit based on the ternary value from MPFR: it is correct for positive results, but for negative results mpz_setbit acts as if a two's complement representation is used, whereas the low bit needs setting based on the sign-magnitude representation GMP actually uses. (This showed up in converting fma tests to use auto-libm-test-in / gen-auto-libm-tests.) This patch fixes the problem by negating the mpz_t value to set its low bit. There are lots of changes to auto-libm-test-out (mainly 1ulp fixes to ldbl-128 expected results), but only a few ulps updates are needed on x86 / x86_64. In one case, a corrected expectation showed up a spurious underflow exception where the correct result is slightly outside the underflowing range. Tested x86_64 and x86 and ulps updated accordingly. * math/gen-auto-libm-tests.c (adjust_real): Ensure integers are non-negative before setting low bit. * math/auto-libm-test-in: Mark one asin test possibly having spurious underflow. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-02-18 14:45:41 +00:00
Dylan Alex Simon	fbfdf9cb03	Update x86_64 libm-test-ulps on AMD family 21h model 1 (bug 16545).	2014-02-12 15:55:10 +00:00
Joseph Myers	e8d8d7ec98	Regenerate x86_64 ulps.	2014-02-11 23:15:36 +00:00
Eric Wong	dc98b8f5a9	Update x86_64 ULPs (AMD family 21, model 2) Tested on an AMD FX-8320 CPU	2014-02-04 10:40:56 +10:00
Eric Wong	6c0ce4b45d	Update x86_64 ULPs (AMD Family 10h)	2014-02-04 10:40:44 +10:00
Joseph Myers	97b9a0090e	Regenerate x86 / x86_64 ulps.	2014-01-01 14:34:38 +00:00
Allan McRae	d4697bc93d	Update copyright notices with scripts/update-copyrights	2014-01-01 22:00:23 +10:00
Joseph Myers	5b0626b9c5	Fix x86 / x86_64 expl / expl10l wild results in directed rounding modes (bug 16356). This patch fixes bug 16356, bad results from x86 / x86_64 expl / exp10l in directed rounding modes, the most serious of the bugs shown up by my patch expanding libm test coverage. When I fixed bug 16293, I thought it was only necessary to set round-to-nearest when using frndint in expm1 functions, because in other cases the cancellation error from having the resulting fractional part close to 1 or -1 would not be significant. However, in expl and exp10l, the way the final fractional part gets computed (something more complicated than a simple subtraction, because more precision is needed than you'd get that way) can result in a value outside the range [-1, 1] when the argument to frndint was very close to an integer and was rounded the "wrong" way because of the rounding mode - and the f2xm1 instruction has undefined results if its argument is outside [-1, 1], so resulting in the large errors seen. So this patch removes the USE_AS_EXPM1L conditionals on the round-to-nearest settings, so all of expl, expm1l and exp10l now get round-to-nearest used for frndint (meaning the final fractional part can at most be slightly above 0.5 in magnitude). Associated tests of exp and exp10 are added and testing of exp10 in directed rounding modes enabled. Tested x86_64 and x86 and ulps updated accordingly. * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Also set round-to-nearest for [!USE_AS_EXPM1L]. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise. * math/auto-libm-test-in: Do not expect cosh tests to fail. Add more tests of exp and exp10. Expect some exp10 tests to miss exceptions or fail in directed rounding modes. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (exp10_tonearest_test_data): New array. (exp10_test_tonearest): New function. (exp10_towardzero_test_data): New array. (exp10_test_towardzero): New function. (exp10_downward_test_data): New array. (exp10_test_downward): New function. (exp10_upward_test_data): New array. (exp10_test_upward): New function. (main): Call the new functions. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-21 13:07:16 +00:00
Joseph Myers	31e3a40588	Add more libm-test coverage of [a-c]* real functions. Various libm functions have inadequate test coverage in libm-test.inc / auto-libm-test-in - failing to cover all the usual special cases (infinities, NaNs, zero, large and small finite values, subnormals) as well as a reasonable range of ordinary inputs and, where appropriate, inputs close to the thresholds for underflow and overflow. This patch improves test coverage for real functions [a-c]* (with the expectation of adding more coverage for other functions later). Tested x86_64 and x86 and ulps updated accordingly (and eight glibc bugs and one C11 DR filed for issues found in the process). * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cbrt, cos and cosh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (acosh_test_data): Add more tests. (atanh_test_data): Likewise. (ceil_test_data): Likewise. (copysign_test_data): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-20 21:03:39 +00:00
Joseph Myers	b7867a3bfb	Move tests of cpow from libm-test.inc to auto-libm-test-in. This patch moves tests of cpow to auto-libm-test-in, adding the required support to gen-auto-libm-tests. Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of cpow. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cpow_test_data): Use AUTO_TESTS_cc_c. * * math/gen-auto-libm-tests.c (func_calc_method): Add value mpc_cc_c. (func_calc_desc): Add mpc_cc_c union field. (test_functions): Add cpow. (special_fill_2pi): New function. (special_real_inputs): Add 2pi. (calc_generic_results): Handle mpc_cc_c. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-20 12:35:10 +00:00
Joseph Myers	7fda568229	Move various TEST_c_c tests from libm-test.inc to auto-libm-test-inc. This patch moves tests of ccos, ccosh, cexp, clog, csqrt, ctan and ctanh to auto-libm-test-in, adding the required support to gen-auto-libm-tests. Other TEST_c_c functions aren't moved for now (although the relevant table entries are put in gen-auto-libm-tests for it to know how to handle them): clog10 because of a known MPC bug causing it to hang for at least some pure imaginary inputs (fixed in SVN, but I'd rather not rely on unreleased versions of MPFR or MPC even if relying on very recent releases); the inverse trig and hyperbolic functions because of known slowness in special cases; and csin / csinh because of observed slowness that I need to investigate and report to the MPC maintainers. Slowness can be bypassed by moving to incremental generation (only for new / changed tests) rather than regenerating the whole of auto-libm-test-out every time, but that needs implementing. (This patch takes the time for running gen-auto-libm-tests from about one second to seven, on my system, which I think is reasonable. The slow functions would make it take several minutes at least, which seems unreasonable.) Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of ccos, ccosh, cexp, clog, csqrt, ctan and ctanh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (TEST_COND_x86_64): New macro. (TEST_COND_x86): Likewise. (ccos_test_data): Use AUTO_TESTS_c_c. (ccosh_test_data): Likewise. (cexp_test_data): Likewise. (clog_test_data): Likewise. (csqrt_test_data): Likewise. (ctan_test_data): Likewise. (ctan_tonearest_test_data): Likewise. (ctan_towardzero_test_data): Likewise. (ctan_downward_test_data): Likewise. (ctan_upward_test_data): Likewise. (ctanh_test_data): Likewise. (ctanh_tonearest_test_data): Likewise. (ctanh_towardzero_test_data): Likewise. (ctanh_downward_test_data): Likewise. (ctanh_upward_test_data): Likewise. * math/gen-auto-libm-tests.c (func_calc_method): Add value mpc_c_c. (func_calc_desc): Add mpc_c_c union field. (FUNC_mpc_c_c): New macro. (test_functions): Add cacos, cacosh, casin, casinh, catan, catanh, ccos, ccosh, cexp, clog, clog10, csin, csinh, csqrt, ctan and ctanh. (special_fill_min_subnorm_p120): New function. (special_real_inputs): Add min_subnorm_p120. (calc_generic_results): Handle mpc_c_c. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-20 12:32:44 +00:00
Joseph Myers	6f6fc48226	Move tests of sincos from libm-test.inc to auto-libm-test-in. This patch moves tests of sincos to auto-libm-test-in, adding the required support to gen-auto-libm-tests. Tested x86_64 and x86 and ulps updated accordingly. (auto-libm-test-out diffs omitted below.) * math/auto-libm-test-in: Add tests of sincos. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (sincos_test_data): Use AUTO_TESTS_fFF_11. * math/gen-auto-libm-tests.c (func_calc_method): Add value mpfr_f_11. (func_calc_desc): Add mpfr_f_11 union field. (test_functions): Add sincos. (calc_generic_results): Handle mpfr_f_11. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-19 17:21:01 +00:00
Joseph Myers	335ee09231	Disable libm-test test name beautification for M_* constants. math/gen-libm-test.pl has code to beautify names of various constants, transforming the source form in libm-test.inc into the version appearing in test names in libm-test-ulps files. This has become decreasingly relevant over time for the M_* constants, first as I changed the test names so only the arguments and not the expected results appeared in them, then as tests have moved to auto-libm-test-* so that automatically generated hex float constants get used instead of M_* in test inputs. This patch removes the beautification for all M_* constants. Tested x86_64 and x86 and ulps updated accordingly. Even the one case where this affected the name in the ulps files will disappear once complex function tests are moved to auto-libm-test-. math/gen-libm-test.pl (%beautify): Remove M_* constants. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-19 14:59:22 +00:00
Joseph Myers	f88acd39da	Fix x86/x86_64 expm1 inaccuracy near 0 in directed rounding modes (bug 16293). Bug 16293 is inaccuracy of x86/x86_64 versions of expm1, near 0 in directed rounding modes, that arises from frndint rounding the exponent to 1 or -1 instead of 0, resulting in large cancellation error. This inaccuracy in turn affects other functions such as sinh that use expm1. This patch fixes the problem by setting round-to-nearest mode temporarily around the affected calls to frndint. I don't think this is needed for other uses of frndint, such as in exp itself, as only for expm1 is the cancellation error significant. Tested x86_64 and x86 and ulps updated accordingly. * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Set round-to-nearest mode when using frndint. * sysdeps/i386/fpu/s_expm1.S (__expm1): Likewise. * sysdeps/i386/fpu/s_expm1f.S (__expm1f): Likewise. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of expm1. Do not expect sinh test to fail. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (TEST_COND_x86_64): Remove macro. (TEST_COND_x86): Likewise. (expm1_tonearest_test_data): New array. (expm1_test_tonearest): New function. (expm1_towardzero_test_data): New array. (expm1_test_towardzero): New function. (expm1_downward_test_data): New array. (expm1_test_downward): New function. (expm1_upward_test_data): New array. (expm1_test_upward): New function. (main): Run the new test functions. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-19 13:36:10 +00:00
Joseph Myers	f889953b44	Move tests of jn and yn from libm-test.inc to auto-libm-test-in. This patch moves tests of jn and yn to auto-libm-test-in, adding the required support for gen-auto-libm-tests (and adding a missing assertion there and fixing logic that was broken for functions with integer arguments). Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of jn and yn. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (jn_test_data): Use AUTO_TESTS_if_f. (yn_test_data): Likewise. * math/gen-auto-libm-tests.c (func_calc_method): Add value mpfr_if_f. (func_calc_desc): Add mpfr_if_f union field. (FUNC_mpfr_if_f): New macro. (test_functions): Add jn and yn. (calc_generic_results): Assert type of second input for mpfr_ff_f. Handle mpfr_if_f. (output_for_one_input_case): Disable all checking for arguments fitting floating-point types in case of an integer argument. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-18 17:59:29 +00:00
Joseph Myers	2dec468fd8	Fix ldbl-128 logl for subnormals (bug 16338). This patch fixes bug 16338, ldbl-128 logl not handling subnormals (with consequent inaccuracy for lgammal as well). The fix is simply to use __frexpl when determining the exponent, as done already in log2l and log10l. Given the lack of testing of small arguments to any of the log* functions, appropriate tests are added for all of them. Tested x86_64 and x86 and ulps updated accordingly, and spot tests also run for mips64 to confirm the ldbl-128 fix. Note that while this fixes lgammal inaccuracy for small positive arguments, I suspect that there will still be problems with spurious underflows in that case. * sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Use __frexpl to determine exponent and adjust argument to have exponent of -1. * math/auto-libm-test-in: Add more tests of log, log10, log1p and log2. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2013-12-18 11:38:27 +00:00
Joseph Myers	ff362e5b93	Move tests of atan2, hypot and pow from libm-test.inc to auto-libm-test-in.	2013-12-16 21:18:07 +00:00
Markus Trippelsdorf	e7b914bd0e	Update x86_64 ULps for AMD K10	2013-12-09 11:06:11 +05:30

... 3 4 5 6 7 ...

674 Commits