glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-30 00:31:08 +00:00

Author	SHA1	Message	Date
Ling Ma	5c74e47cd6	Add x86_64 memset optimized for AVX2 In this patch we take advantage of HSW memory bandwidth, manage to reduce miss branch prediction by avoiding using branch instructions and force destination to be aligned with avx & avx2 instruction. The CPU2006 403.gcc benchmark indicates this patch improves performance from 26% to 59%. * sysdeps/x86_64/multiarch/Makefile: Add memset-avx2. * sysdeps/x86_64/multiarch/memset-avx2.S: New file. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/multiarch/rtld-memset.S: Likewise.	2014-06-19 15:14:08 -07:00
Joseph Myers	4ba7a00fe3	Fix __ieee754_logl (-LDBL_MAX) in FE_DOWNWARD mode (bug 17022). This patch fixes __ieee754_logl (-LDBL_MAX) on x86_64 and x86 not to subtract 1 from its argument and so cause spurious overflow in FE_DOWNWARD mode. (For any argument strictly less than -1, it doesn't matter whether or not 1 is subtracted before computing log1p, as long as the result doesn't overflow to -Inf.) Tested x86_64 and x86. (This particular case lacks test coverage, since the testsuite doesn't cover -lieee, but it will be covered by tests after the following patch to test pow in all rounding modes, which was the context in which this bug was found.) [BZ #17022] * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Do not subtract 1 from arguments -2 or below. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise.	2014-06-18 12:32:01 +00:00
Roland McGrath	f6b07b3d48	Move i386 code out of nptl/ subdirectory.	2014-06-12 10:08:24 -07:00
Roland McGrath	14642b8511	Move x86_64 code out of nptl/ subdirectory.	2014-06-11 21:33:32 -07:00
Joseph Myers	f8ba1b5654	Fix log2 (1) in round-downward mode (bug 17042). As with other issues of this kind, bug 17042 is log2 (1) wrongly returning -0 instead of +0 in round-downward mode because of implementations effectively in terms of log1p (x - 1). This patch fixes the issue in the same way used for log and log10. Tested x86_64 and x86 and ulps updated accordingly. Also tested for mips64 to confirm a fix was needed for ldbl-128 and to validate that fix (also applied to ldbl-128ibm since that version of log2l is essentially the same as the ldbl-128 one). [BZ #17042] * sysdeps/i386/fpu/e_log2.S (__ieee754_log2): Take absolete value when x - 1 is zero. * sysdeps/i386/fpu/e_log2f.S (__ieee754_log2f): Likewise. * sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Likewise. * sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Return 0.0L for an argument of 1.0L. * sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l): Likewise. * sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Take absolute value when x - 1 is zero. * math/libm-test.inc (log2_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-06-10 12:07:15 +00:00
Marko Myllynen	30477995dc	Replace __int128 with __int128_t * sysdeps/x86_64/link-defines.sym (BND_SIZE): Replace __int128 with __int128_t.	2014-05-30 10:50:21 -07:00
Joseph Myers	b72592e75f	Fix log10 (1) in round-downward mode (bug 16977). As with various other issues of this kind, bug 16977 is log10 (1) wrongly returning -0 rather than +0 in round-downward mode because of an implementation effectively in terms of log1p (x - 1). This patch fixes the issue in the same way used for log. Tested x86_64 and x86 and ulps updated accordingly. Also tested for mips64 to confirm a fix was needed for ldbl-128 and to validate that fix (also applied to ldbl-128ibm since that version of logl is essentially the same as the ldbl-128 one). [BZ #16977] * sysdeps/i386/fpu/e_log10.S (__ieee754_log10): Take absolute value when x - 1 is zero. * sysdeps/i386/fpu/e_log10f.S (__ieee754_log10f): Likewise. * sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Likewise. * sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Return 0.0L for an argument of 1.0L. * sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise. * sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Take absolute value when x - 1 is zero. * math/libm-test.inc (log10_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-05-23 12:07:50 +00:00
Roland McGrath	1b731c35e6	Move NPTL public ABI headers for x86 to sysdeps/x86/nptl/.	2014-05-14 09:44:56 -07:00
Joseph Myers	1a84c3d6d4	Fix log1pl (LDBL_MAX) in FE_UPWARD mode (bug 16564). Bug 16564 is spurious overflow of log1pl (LDBL_MAX) in FE_UPWARD mode, resulting from log1pl adding 1 to its argument (for arguments not close to 0), which overflows in that mode. This patch fixes this by avoiding adding 1 to large arguments (precisely what counts as large depends on the floating-point format). Tested x86_64 and x86, and spot-checked log1pl tests on mips64 and powerpc64. [BZ #16564] * sysdeps/i386/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive arguments with exponent 65 or above. * sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Do not add 1 to arguments 0x1p113L or above. * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Do not add 1 to arguments 0x1p107L or above. * sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive arguments with exponent 65 or above. * math/auto-libm-test-in: Add more tests of log1p. * math/auto-libm-test-out: Regenerated.	2014-05-14 12:38:56 +00:00
Joseph Myers	01dbacd22a	Fix cacos (+Inf + finitei) in round-downward mode (bug 16928). According to C99/C11 Annex G, cacos applied to a value with real part +Inf and finite imaginary part should produce a result with real part +0. glibc wrongly produces a result with real part -0 in FE_DOWNWARD mode. This patch fixes this by checking for zero results in the relevant case of non-finite arguments (where there should never be a result with -0 real part), and converts the tests of cacos to ALL_RM_TEST. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16928] math/s_cacos.c (__cacos): Ensure zero real part of result from non-finite arguments is +0. * math/s_cacosf.c (__cacosf): Likewise. * math/s_cacosl.c (__cacosl): Likewise. * math/libm-test.inc (cacos_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-05-14 12:37:24 +00:00
Joseph Myers	913d03c864	Fix acosh (1) in round-downward mode (bug 16927). According to C99 and C11 Annex F, acosh (1) should be +0 in all rounding modes. However, some implementations in glibc wrongly return -0 in round-downward mode (which is what you get if you end up computing log1p (-0), via 1 - 1 being -0 in round-downward mode). This patch fixes the problem implementations, by correcting the test for an exact 1 value in the ldbl-96 implementation to allow for the explicit high bit of the mantissa, and by inserting fabs instructions in the i386 implementations; tests of acosh are duly converted to ALL_RM_TEST. I believe all the other sysdeps/ieee754 implementations are already OK (I haven't checked the ia64 versions, but if buggy then that will be obvious from the results of test runs after this patch is in). Tested x86_64 and x86 and ulps updated accordingly. [BZ #16927] * sysdeps/i386/fpu/e_acosh.S (__ieee754_acosh): Use fabs on x-1 value. * sysdeps/i386/fpu/e_acoshf.S (__ieee754_acoshf): Likewise. * sysdeps/i386/fpu/e_acoshl.S (__ieee754_acoshl): Likewise. * sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Correct for explicit high bit of mantissa when testing for argument equal to 1. * math/libm-test.inc (acosh_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-05-14 12:35:40 +00:00
Carlos O'Donell	8f1df5cf9d	Fix -Wundef warning for FEATURE_INDEX_1. Define FEATURE_INDEX_1 and FEATURE_INDEX_MAX as macros for use by both assembly and C code. This fixes the -Wundef error for cases where FEATURE_INDEX_1 was not defined but used the correct value of 0 for an undefined macro.	2014-05-03 00:25:21 -04:00
Sihai Yao	f9281df995	Detect if AVX2 is usable This patch checks and sets bit_AVX2_Usable in __cpu_features.feature. * sysdeps/x86_64/multiarch/ifunc-defines.sym (COMMON_CPUID_INDEX_7): New. * sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Check and set bit_AVX2_Usable. * sysdeps/x86_64/multiarch/init-arch.h (bit_AVX2_Usable): New macro. (bit_AVX2): Likewise. (index_AVX2_Usable): Likewise. (CPUID_AVX2): Likewise. (HAS_AVX2): Likewise.	2014-04-17 08:00:21 -07:00
Igor Zamyatin	ea8ba7cd14	Save/restore bound registers for _dl_runtime_profile This patch saves and restores bound registers in x86-64 PLT for ld.so profile and LD_AUDIT: * sysdeps/x86_64/bits/link.h (La_x86_64_regs): Add lr_bnd. (La_x86_64_retval): Add lrv_bnd0 and lrv_bnd1. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Save Intel MPX bound registers before _dl_profile_fixup. * sysdeps/x86_64/dl-trampoline.h: Restore Intel MPX bound registers after _dl_profile_fixup. Save and restore bound registers bnd0/bnd1 when calling _dl_call_pltexit. * sysdeps/x86_64/link-defines.sym (BND_SIZE): New. (LR_BND_OFFSET): Likewise. (LRV_BND0_OFFSET): Likewise. (LRV_BND1_OFFSET): Likewise.	2014-04-16 14:46:49 -07:00
Igor Zamyatin	a4c75cfd56	Save/restore bound registers in _dl_runtime_resolve This patch saves and restores bound registers in symbol lookup for x86-64: 1. Branches without BND prefix clear bound registers. 2. x86-64 pass bounds in bound registers as specified in MPX psABI extension on hjl/mpx/master branch at https://github.com/hjl-tools/x86-64-psABI https://groups.google.com/forum/#!topic/x86-64-abi/KFsB0XTgWYc Binutils has been updated to create an alternate PLT to add BND prefix when branching to ld.so. * config.h.in (HAVE_MPX_SUPPORT): New #undef. * sysdeps/x86_64/configure.ac: Set HAVE_MPX_SUPPORT. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S (REGISTER_SAVE_AREA): New macro. (REGISTER_SAVE_RAX): Likewise. (REGISTER_SAVE_RCX): Likewise. (REGISTER_SAVE_RDX): Likewise. (REGISTER_SAVE_RSI): Likewise. (REGISTER_SAVE_RDI): Likewise. (REGISTER_SAVE_R8): Likewise. (REGISTER_SAVE_R9): Likewise. (REGISTER_SAVE_BND0): Likewise. (REGISTER_SAVE_BND1): Likewise. (REGISTER_SAVE_BND2): Likewise. (_dl_runtime_resolve): Use them. Save and restore Intel MPX bound registers when calling _dl_fixup.	2014-04-09 15:38:09 -07:00
Roland McGrath	fcccd51286	Factor mmap/munmap of PT_LOAD segments out of _dl_map_object_from_fd et al.	2014-04-03 10:47:14 -07:00
Joseph Myers	a84e78c8b3	Fix catan, catanh, __ieee754_logf in round-downward mode (bug 16799, bug 16800). This patch fixes incorrect results from catan and catanh of certain special inputs in round-downward mode (bug 16799), and incorrect results of __ieee754_logf (+/-0) in round-downward mode (bug 16800) that show up through catan/catanh when tested in all rounding modes, but not directly in the testing for logf because the bug gets hidden by the wrappers. Both bugs involve a zero that should be +0 being -0 instead: one computed as (1-x)(1+x) in the catan/catanh case, and one as (x-x) in the logf case. The fixes ensure positive zero is used. Testing of catan and catanh in all rounding modes is duly enabled. I expect there are various other bugs in special cases in __ieee754_ functions that are normally hidden by the wrappers but would show up for testing with -lieee (or in future with -fno-math-errno if we replace -lieee and _LIB_VERSION with compile-time redirection to new _noerrno symbol names). Tested x86_64 and x86 and ulps updated accordingly. [BZ #16799] [BZ #16800] math/s_catan.c (__catan): Avoid passing -0 denominator to atan2 with 0 numerator. * math/s_catanf.c (__catanf): Likewise. * math/s_catanh.c (__catanh): Likewise. * math/s_catanhf.c (__catanhf): Likewise. * math/s_catanhl.c (__catanhl): Likewise. * math/s_catanl.c (__catanl): Likewise. * sysdeps/ieee754/flt-32/e_logf.c (__ieee754_logf): Always divide by positive zero when computing -Inf result. * math/libm-test.inc (catan_test): Use ALL_RM_TEST. (catanh_test): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-04-02 17:41:02 +00:00
Joseph Myers	6f05bafeba	Fix clog / clog10 sign of zero result in round-downward mode (bug 16789). This patch fixes bug 16789, incorrect sign of (real part) zero result from clog and clog10 in round-downward mode, arising from that real part being computed as 0 - 0. To ensure that an underflow exception occurred, the code used an underflowing value (the next term in the series for log1p) in arithmetic computing the real part of the result, yielding the problematic 0 - 0 computation in some cases even when the mathematical result would be small but positive. The patch changes this code to use the math_force_eval approach to ensuring that an underflowing computation actually occurs. Tests of clog and clog10 are enabled in all rounding modes. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16789] * math/s_clog.c (__clog): Use math_force_eval to ensure underflow instead of using underflowing value in computing result. * math/s_clog10.c (__clog10): Likewise. * math/s_clog10f.c (__clog10f): Likewise. * math/s_clog10l.c (__clog10l): Likewise. * math/s_clogf.c (__clogf): Likewise. * math/s_clogl.c (__clogl): Likewise. * math/libm-test.inc (clog_test): Use ALL_RM_TEST. (clog10_test): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-04-02 13:10:19 +00:00
Joseph Myers	03a7091fa2	Fix x86/x86_64 expl/exp10l spurious underflows (bug 16348). This patch fixes bug 16348, spurious underflows from x86/x86_64 expl on arguments close to 0. These implementations effectively use expm1 (on the fractional part of the argument) internally, so resulting in spurious underflows when the result is very close to 1. For arguments small enough that the round-to-nearest correct result is 1, this patch uses 1+x instead. These implementations are also used for exp10l and so the patch fixes similar issues there (the 0x1p-67 threshold being small enough to be correct for exp10l as well as expl). But because of spurious underflows in other exp10 implementations (bug 16560), the tests aren't added for exp10 at this point - they can be added when the other exp10 parts of that bug are fixed. Tested x86_64 and x86; no ulps updates needed. [BZ #16348] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Use 1+x for argument with exponent below -67. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of exp. * math/auto-libm-test-out: Regenerated.	2014-03-27 18:41:14 +00:00
Joseph Myers	9be36fb8cb	Make x86_64 fegetenv preserve exception mask (bug 16198). Bug 16198 is x86_64 fegetenv wrongly masking exceptions for which traps are enabled, because that's a side-effect of the fnstenv instruction. This patch fixes it to use fldenv immediately after fnstenv, like the i386 version. Tested x86_64 and x86. [BZ #16198] * sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Use fldenv after fnstenv. * math/test-fenv-preserve.c: New file. * math/Makefile (tests): Add test-fenv-preserve.	2014-03-26 18:59:08 +00:00
Joseph Myers	046651c168	Relax gen-auto-libm-tests may-underflow rules, test log1p in all rounding modes. gen-auto-libm-tests presently allows but does not require underflow exceptions for results with magnitude in the range (greatest subnormal, least normal]. In some cases, the magnitude of the exact result is very slightly above the least normal, but rounding in the implementation results in it effectively computing an infinite-precision result that is slightly below the least normal, so raising an underflow exception. This is in accordance with the documented accuracy goals, but results in testsuite failures. This patch changes the logic to allow underflows when the mathematical result is up to 0.5ulp above the least normal (so in any case where the round-to-nearest result is the least normal). Ideally underflows in all these cases would be accepted only when an underflow with the actual result is consistent with the rounding mode (in FE_TOWARDZERO mode, a return value of the least normal implies that the infinite-precision result did not underflow so there should be no underflow exception, for example), so as to match the documented goals more precisely - whereas at present the tests for exceptions are completely independent of the tests of the returned values. (The same applies to overflow exceptions as well - they too should be checked for consistency with the result, as in FE_TOWARDZERO mode a result 1ulp below the largest finite value should be inconsistent with an overflow exception and cause a failure with overflow rather than simply being considered a 1ulp error when overflow is expected.) But the present patch at least deals with the cases causing spurious failures so that (a) certain existing tests no longer need to be marked as having spurious exceptions (such markings in auto-libm-test-in end up applying to more cases than just those they are needed for) and (b) log1p can be tested in all rounding modes without introducing more such failures. This patch duly moves tests of log1p to ALL_RM_TEST. Tested x86_64 and x86 and ulps updated accordingly. [BZ #16357] [BZ #16599] * math/gen-auto-libm-tests.c (fp_format_desc): Add field min_plus_half. (fp_formats): Update initializers. (init_fp_formats): Initialize new field. (output_for_one_input_case): Allow underflow for results up to min_plus_half. * math/libm-test.inc (log1p_test): Use ALL_RM_TEST. * math/auto-libm-test-in: Don't mark some underflows from asin and atanh as spurious. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-25 12:26:06 +00:00
Joseph Myers	f3426898bf	Fix implicit __isinf declarations in exp. My recent exp patch introduced warnings about implicit __isinf declarations in exp because e_exp.c didn't include <math.h>. This patch fixes this. Because <math.h> can't be included after <math_private.h> (because of macro definitions of __nan), it was necessary to put an include in sysdeps/x86_64/fpu/multiarch/e_exp.c as well. Tested x86_64. sysdeps/ieee754/dbl-64/e_exp.c: Include <math.h>. * sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT \|\| HAVE_AVX_SUPPORT]: Likewise.	2014-03-24 22:00:32 +00:00
Joseph Myers	b376a11a19	Fix dbl-64 exp overflow/underflow in non-default rounding modes (bug 16284). The dbl-64 version of exp needs round-to-nearest mode for its internal computations, but that has the consequence of inappropriate overflowing and underflowing results in other rounding modes. This patch fixes this by recomputing the relevant results in cases where the round-to-nearest result overflows to infinity or underflows to zero (most of the diffs are actually just consequent reindentation). Tests are enabled in all rounding modes for complex functions using exp - but not for cexp because it turns out there are bugs causing spurious underflows for cexp for some tests, which will need to be fixed separately (I suspect ccos ccosh csin csinh ctan ctanh have similar bugs, just not shown by the present set of test inputs). Tested x86_64 and x86 and ulps updated accordingly. [BZ #16284] * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Use original rounding mode to recompute results that overflow to infinity or underflow to zero. * math/auto-libm-test-in: Don't mark tests as expected to fail for bug 16284. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (ccos_test): Use ALL_RM_TEST. (ccosh_test): Likewise. (csin_test_data): Use plus_oflow. (csin_test): Use ALL_RM_TEST. (csinh_test_data): Use plus_oflow. (csinh_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-24 12:18:45 +00:00
Joseph Myers	f7be737659	Fix log (1) in round-downward mode (bug 16731). According to ISO C Annex F, log (1) should be +0 in all rounding modes, but some implementations in glibc wrongly return -0 in round-downward mode (mapping to log1p (x - 1) is problematic because 1 - 1 is -0 in round-downward mode, and log1p (-0) is -0). This patch fixes this. (It helps with some implementations of other functions such as acosh, log2 and log10 that call out to log, but not enough to enable all-rounding-modes testing for those functions without further fixes to other implementations of them.) Tested x86_64 and x86 and ulps updated accordingly, and did spot tests for mips64 for the ldbl-128 fix, and i586 for the sysdeps/i386/fpu implementations shadowed by those in sysdeps/i386/i686/fpu. [BZ #16731] * sysdeps/i386/fpu/e_log.S (__ieee754_log): Take absolute value when x - 1 is zero. * sysdeps/i386/fpu/e_logf.S (__ieee754_logf): Likewise. * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Return +0 when argument is 1. * sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S: Take absolute value when x - 1 is zero. * math/libm-test.inc (log_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-21 18:13:58 +00:00
Joseph Myers	8c92dfff41	Test most libm functions in all rounding modes. This patch makes libm-test.inc tests of most functions use ALL_RM_TEST unless there was some reason to defer that change for a particular function. I started out planning to defer the change for pow (bug 16315), cexp / ccos / ccosh / csin / csinh (likely fallout from exp, bug 16284) and cpow (exact expectations for signs of exact zero results not wanted). Testing on x86_64 and x86 showed additional failures for acosh, cacos, catan, catanh, clog, clog10, jn, log, log10, log1p, log2, tgamma, yn, so making the change for those functions was deferred as well, pending investigation to show which of these represent distinct bugs (some such bugs may already be filed) and appropriate fixing / XFAILing. Failures include wrong signs of zero results, errors slightly above the 9ulp bound (in such cases it may make sense for functions to set round-to-nearest internally to reduce error accumulation), large errors and incorrect overflow/underflow for the rounding mode (with consequent missing errno settings in some cases). It's possible some could be issues with test expectations, though I didn't notice any that were obviously like that (I added NO_TEST_INLINE for cases that were failing for ildoubl on x86 and where it seemed reasonable for them to fail for the fast-math inlines). There may of course be failures on other architectures for functions that didn't fail on x86_64 or x86, in which case the usual rule applies: file a bug (preferably identifying the underlying problem function, in cases where function A calls function B and a problem with function B may present in the test results for function A) if not already in Bugzilla then fix or XFAIL. Tested x86_64 and x86 and ulps updated accordingly. * math/libm-test.inc (asinh_test): Use ALL_RM_TEST. (atan_test): Likewise. (atanh_test_data): Use NO_TEST_INLINE for two tests. (atanh_test): Use ALL_RM_TEST. (atan2_test_data): Likewise. (cabs_test): Likewise. (cacosh_test): Likewise. (carg_test): Likewise. (casin_test): Likewise. (casinh_test): Likewise. (cbrt_test): Likewise. (csqrt_test): Likewise. (erf_test): Likewise. (erfc_test): Likewise. (pow10_test): Likewise. (exp2_test): Likewise. (hypot_test): Likewise. (j0_test): Likewise. (j1_test): Likewise. (lgamma_test): Likewise. (gamma_test): Likewise. (sincos_test): Likewise. (tanh_test): Likewise. (y0_test): Likewise. (y1_test): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-03-21 00:03:38 +00:00
H.J. Lu	aa4de9cea5	Check AVX-512 assembler support first It checks AVX-512 assembler support first and sets libc_cv_cc_avx512 to $libc_cv_asm_avx512, instead of yes. GCC won't support AVX-512 if assembler doesn't support it. * sysdeps/x86_64/configure.ac: Check AVX-512 assembler support first. Disable AVX-512 GCC support if assembler doesn't support it. * sysdeps/x86_64/configure: Regenerated.	2014-03-14 08:51:25 -07:00
Igor Zamyatin	2d63a517e4	Save and restore AVX-512 zmm registers to x86-64 ld.so AVX-512 ISA adds 512-bit zmm registers. This patch updates _dl_runtime_profile to pass zmm registers to run-time audit. It also changes _dl_x86_64_save_sse and _dl_x86_64_restore_sse to upport zmm registers, which are called when only when RTLD_PREPARE_FOREIGN_CALL is used. Its performance impact is minimum. * config.h.in (HAVE_AVX512_SUPPORT): New #undef. (HAVE_AVX512_ASM_SUPPORT): Likewise. * sysdeps/x86_64/bits/link.h (La_x86_64_zmm): New. (La_x86_64_vector): Add zmm. * sysdeps/x86_64/Makefile (tests): Add tst-audit10. (modules-names): Add tst-auditmod10a and tst-auditmod10b. ($(objpfx)tst-audit10): New target. ($(objpfx)tst-audit10.out): Likewise. (tst-audit10-ENV): New. (AVX512-CFLAGS): Likewise. (CFLAGS-tst-audit10.c): Likewise. (CFLAGS-tst-auditmod10a.c): Likewise. (CFLAGS-tst-auditmod10b.c): Likewise. * sysdeps/x86_64/configure.ac: Set config-cflags-avx512, HAVE_AVX512_SUPPORT and HAVE_AVX512_ASM_SUPPORT. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Add AVX-512 zmm register support. (_dl_x86_64_save_sse): Likewise. (_dl_x86_64_restore_sse): Likewise. * sysdeps/x86_64/dl-trampoline.h: Updated to support different size vector registers. * sysdeps/x86_64/link-defines.sym (YMM_SIZE): New. (ZMM_SIZE): Likewise. * sysdeps/x86_64/tst-audit10.c: New file. * sysdeps/x86_64/tst-auditmod10a.c: Likewise. * sysdeps/x86_64/tst-auditmod10b.c: Likewise.	2014-03-13 11:19:08 -07:00
Joseph Myers	e6b6a85705	Don't include individual test ulps in libm-test-ulps. As recently discussed <https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it doesn't seem particularly useful for libm-test-ulps files to contain huge amounts of data on ulps for individual tests; just the global maximum observed ulps for each function, together with the verification of exceptions, errno and special results such as infinities and NaNs for each test, suffices to verify that a function's behavior on the given test inputs is within the expected accuracy. Removing this data reduces source tree churn caused by updates to these files when libm tests are added, and reduces the frequency with which testsuite additions actually need libm-test-ulps changes at all. Accordingly, this patch removes that data, so that individual tests get checked against the global bounds for the given function and only generate an error if those are exceeded. Tested x86_64 (including verifying that if an ulps value is artificially reduced, the tests do indeed fail as they should and "make regen-ulps" generates the expected changes). * math/libm-test.inc (struct ulp_data): Don't refer to ulps for individual tests in comment. (libm-test-ulps.h): Don't refer to test_ulps in #include comment. (prev_max_error): New variable. (prev_real_max_error): Likewise. (prev_imag_max_error): Likewise. (compare_ulp_data): Don't refer to test names in comment. (find_test_ulps): Remove function. (find_function_ulps): Likewise. (find_complex_function_ulps): Likewise. (init_max_error): Take function name as argument. Look up ulps for that function. (print_ulps): Remove function. (print_max_error): Use prev_max_error instead of calling find_function_ulps. (print_complex_max_error): Use prev_real_max_error and prev_imag_max_error instead of calling find_complex_function_ulps. (check_float_internal): Take max_ulp parameter instead of calling find_test_ulps. Don't call print_ulps. (check_float): Update call to check_float_internal. (check_complex): Update calls to check_float_internal. (START): Pass argument to init_max_error. * math/gen-libm-test.pl (%results): Don't include "kind" information. (parse_ulps): Don't handle ulps of individual tests. (print_ulps_file): Likewise. (output_ulps): Likewise. * math/README.libm-test: Update. * manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of individual tests. * sysdeps/aarch64/libm-test-ulps: Remove individual test ulps. * sysdeps/alpha/fpu/libm-test-ulps: Likewise. * sysdeps/arm/libm-test-ulps: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/ia64/fpu/libm-test-ulps: Likewise. * sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise. * sysdeps/microblaze/libm-test-ulps: Likewise. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps: Likewise. * sysdeps/s390/fpu/libm-test-ulps: Likewise. * sysdeps/sh/libm-test-ulps: Likewise. * sysdeps/sparc/fpu/libm-test-ulps: Likewise. * sysdeps/tile/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise. * sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.	2014-03-05 15:02:38 +00:00
Siddhesh Poyarekar	1cadc85813	Fix sign of input to bsloww1 (BZ #16623 ) In `84ba214c`, I removed some redundant sign computations and in the process, I incorrectly got rid of a temporary variable, thus passing the absolute value of the input to bsloww1. This caused #16623. This fix undoes the incorrect change.	2014-02-27 21:12:09 +05:30
Joseph Myers	63689d6165	Move tests of clog10 from libm-test.inc to auto-libm-test-in. This patch moves tests of clog10 to auto-libm-test-in. Note that this means gen-auto-libm-tests will now depend on the recent MPC 1.0.2 release which added a fix for a bug that made gen-auto-libm-tests hang for clog10. (It still can't conveniently be used for cacos cacosh casin casinh catan catanh csin csinh because of extreme slowness of those functions for special cases in MPC; at least some slow cases of csin / csinh are fixed in MPC trunk, but not in a release.) Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of clog10. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (clog10_test_data): Use AUTO_TESTS_c_c. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-02-19 14:26:29 +00:00
Joseph Myers	a4fb786185	Fix gen-auto-libm-tests sticky bit setting for negative results. gen-auto-libm-tests has a bug in the logic for setting a sticky bit based on the ternary value from MPFR: it is correct for positive results, but for negative results mpz_setbit acts as if a two's complement representation is used, whereas the low bit needs setting based on the sign-magnitude representation GMP actually uses. (This showed up in converting fma tests to use auto-libm-test-in / gen-auto-libm-tests.) This patch fixes the problem by negating the mpz_t value to set its low bit. There are lots of changes to auto-libm-test-out (mainly 1ulp fixes to ldbl-128 expected results), but only a few ulps updates are needed on x86 / x86_64. In one case, a corrected expectation showed up a spurious underflow exception where the correct result is slightly outside the underflowing range. Tested x86_64 and x86 and ulps updated accordingly. * math/gen-auto-libm-tests.c (adjust_real): Ensure integers are non-negative before setting low bit. * math/auto-libm-test-in: Mark one asin test possibly having spurious underflow. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2014-02-18 14:45:41 +00:00
Dylan Alex Simon	fbfdf9cb03	Update x86_64 libm-test-ulps on AMD family 21h model 1 (bug 16545).	2014-02-12 15:55:10 +00:00
Joseph Myers	e8d8d7ec98	Regenerate x86_64 ulps.	2014-02-11 23:15:36 +00:00
Ondřej Bílka	a1ffb40e32	Use glibc_likely instead __builtin_expect.	2014-02-10 15:07:12 +01:00
Eric Wong	dc98b8f5a9	Update x86_64 ULPs (AMD family 21, model 2) Tested on an AMD FX-8320 CPU	2014-02-04 10:40:56 +10:00
Eric Wong	6c0ce4b45d	Update x86_64 ULPs (AMD Family 10h)	2014-02-04 10:40:44 +10:00
H.J. Lu	4959e284ca	Include generic symbol-hacks.h for x32 In BZ #15605 fix with addding memset/memmove alias in symbol-hacks.h, x32 symbol-hacks.h change was missing. Fixed by including <sysdeps/generic/symbol-hacks.h> in x32 symbol-hacks.h.	2014-01-20 11:11:01 -08:00
Joseph Myers	97b9a0090e	Regenerate x86 / x86_64 ulps.	2014-01-01 14:34:38 +00:00
Allan McRae	d4697bc93d	Update copyright notices with scripts/update-copyrights	2014-01-01 22:00:23 +10:00
Joseph Myers	5b0626b9c5	Fix x86 / x86_64 expl / expl10l wild results in directed rounding modes (bug 16356). This patch fixes bug 16356, bad results from x86 / x86_64 expl / exp10l in directed rounding modes, the most serious of the bugs shown up by my patch expanding libm test coverage. When I fixed bug 16293, I thought it was only necessary to set round-to-nearest when using frndint in expm1 functions, because in other cases the cancellation error from having the resulting fractional part close to 1 or -1 would not be significant. However, in expl and exp10l, the way the final fractional part gets computed (something more complicated than a simple subtraction, because more precision is needed than you'd get that way) can result in a value outside the range [-1, 1] when the argument to frndint was very close to an integer and was rounded the "wrong" way because of the rounding mode - and the f2xm1 instruction has undefined results if its argument is outside [-1, 1], so resulting in the large errors seen. So this patch removes the USE_AS_EXPM1L conditionals on the round-to-nearest settings, so all of expl, expm1l and exp10l now get round-to-nearest used for frndint (meaning the final fractional part can at most be slightly above 0.5 in magnitude). Associated tests of exp and exp10 are added and testing of exp10 in directed rounding modes enabled. Tested x86_64 and x86 and ulps updated accordingly. * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Also set round-to-nearest for [!USE_AS_EXPM1L]. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise. * math/auto-libm-test-in: Do not expect cosh tests to fail. Add more tests of exp and exp10. Expect some exp10 tests to miss exceptions or fail in directed rounding modes. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (exp10_tonearest_test_data): New array. (exp10_test_tonearest): New function. (exp10_towardzero_test_data): New array. (exp10_test_towardzero): New function. (exp10_downward_test_data): New array. (exp10_test_downward): New function. (exp10_upward_test_data): New array. (exp10_test_upward): New function. (main): Call the new functions. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-21 13:07:16 +00:00
Joseph Myers	31e3a40588	Add more libm-test coverage of [a-c]* real functions. Various libm functions have inadequate test coverage in libm-test.inc / auto-libm-test-in - failing to cover all the usual special cases (infinities, NaNs, zero, large and small finite values, subnormals) as well as a reasonable range of ordinary inputs and, where appropriate, inputs close to the thresholds for underflow and overflow. This patch improves test coverage for real functions [a-c]* (with the expectation of adding more coverage for other functions later). Tested x86_64 and x86 and ulps updated accordingly (and eight glibc bugs and one C11 DR filed for issues found in the process). * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cbrt, cos and cosh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (acosh_test_data): Add more tests. (atanh_test_data): Likewise. (ceil_test_data): Likewise. (copysign_test_data): Likewise. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-20 21:03:39 +00:00
Joseph Myers	b7867a3bfb	Move tests of cpow from libm-test.inc to auto-libm-test-in. This patch moves tests of cpow to auto-libm-test-in, adding the required support to gen-auto-libm-tests. Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of cpow. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cpow_test_data): Use AUTO_TESTS_cc_c. * * math/gen-auto-libm-tests.c (func_calc_method): Add value mpc_cc_c. (func_calc_desc): Add mpc_cc_c union field. (test_functions): Add cpow. (special_fill_2pi): New function. (special_real_inputs): Add 2pi. (calc_generic_results): Handle mpc_cc_c. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-20 12:35:10 +00:00
Joseph Myers	7fda568229	Move various TEST_c_c tests from libm-test.inc to auto-libm-test-inc. This patch moves tests of ccos, ccosh, cexp, clog, csqrt, ctan and ctanh to auto-libm-test-in, adding the required support to gen-auto-libm-tests. Other TEST_c_c functions aren't moved for now (although the relevant table entries are put in gen-auto-libm-tests for it to know how to handle them): clog10 because of a known MPC bug causing it to hang for at least some pure imaginary inputs (fixed in SVN, but I'd rather not rely on unreleased versions of MPFR or MPC even if relying on very recent releases); the inverse trig and hyperbolic functions because of known slowness in special cases; and csin / csinh because of observed slowness that I need to investigate and report to the MPC maintainers. Slowness can be bypassed by moving to incremental generation (only for new / changed tests) rather than regenerating the whole of auto-libm-test-out every time, but that needs implementing. (This patch takes the time for running gen-auto-libm-tests from about one second to seven, on my system, which I think is reasonable. The slow functions would make it take several minutes at least, which seems unreasonable.) Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of ccos, ccosh, cexp, clog, csqrt, ctan and ctanh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (TEST_COND_x86_64): New macro. (TEST_COND_x86): Likewise. (ccos_test_data): Use AUTO_TESTS_c_c. (ccosh_test_data): Likewise. (cexp_test_data): Likewise. (clog_test_data): Likewise. (csqrt_test_data): Likewise. (ctan_test_data): Likewise. (ctan_tonearest_test_data): Likewise. (ctan_towardzero_test_data): Likewise. (ctan_downward_test_data): Likewise. (ctan_upward_test_data): Likewise. (ctanh_test_data): Likewise. (ctanh_tonearest_test_data): Likewise. (ctanh_towardzero_test_data): Likewise. (ctanh_downward_test_data): Likewise. (ctanh_upward_test_data): Likewise. * math/gen-auto-libm-tests.c (func_calc_method): Add value mpc_c_c. (func_calc_desc): Add mpc_c_c union field. (FUNC_mpc_c_c): New macro. (test_functions): Add cacos, cacosh, casin, casinh, catan, catanh, ccos, ccosh, cexp, clog, clog10, csin, csinh, csqrt, ctan and ctanh. (special_fill_min_subnorm_p120): New function. (special_real_inputs): Add min_subnorm_p120. (calc_generic_results): Handle mpc_c_c. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-20 12:32:44 +00:00
Joseph Myers	6f6fc48226	Move tests of sincos from libm-test.inc to auto-libm-test-in. This patch moves tests of sincos to auto-libm-test-in, adding the required support to gen-auto-libm-tests. Tested x86_64 and x86 and ulps updated accordingly. (auto-libm-test-out diffs omitted below.) * math/auto-libm-test-in: Add tests of sincos. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (sincos_test_data): Use AUTO_TESTS_fFF_11. * math/gen-auto-libm-tests.c (func_calc_method): Add value mpfr_f_11. (func_calc_desc): Add mpfr_f_11 union field. (test_functions): Add sincos. (calc_generic_results): Handle mpfr_f_11. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-19 17:21:01 +00:00
Joseph Myers	335ee09231	Disable libm-test test name beautification for M_* constants. math/gen-libm-test.pl has code to beautify names of various constants, transforming the source form in libm-test.inc into the version appearing in test names in libm-test-ulps files. This has become decreasingly relevant over time for the M_* constants, first as I changed the test names so only the arguments and not the expected results appeared in them, then as tests have moved to auto-libm-test-* so that automatically generated hex float constants get used instead of M_* in test inputs. This patch removes the beautification for all M_* constants. Tested x86_64 and x86 and ulps updated accordingly. Even the one case where this affected the name in the ulps files will disappear once complex function tests are moved to auto-libm-test-. math/gen-libm-test.pl (%beautify): Remove M_* constants. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-19 14:59:22 +00:00
Joseph Myers	f88acd39da	Fix x86/x86_64 expm1 inaccuracy near 0 in directed rounding modes (bug 16293). Bug 16293 is inaccuracy of x86/x86_64 versions of expm1, near 0 in directed rounding modes, that arises from frndint rounding the exponent to 1 or -1 instead of 0, resulting in large cancellation error. This inaccuracy in turn affects other functions such as sinh that use expm1. This patch fixes the problem by setting round-to-nearest mode temporarily around the affected calls to frndint. I don't think this is needed for other uses of frndint, such as in exp itself, as only for expm1 is the cancellation error significant. Tested x86_64 and x86 and ulps updated accordingly. * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Set round-to-nearest mode when using frndint. * sysdeps/i386/fpu/s_expm1.S (__expm1): Likewise. * sysdeps/i386/fpu/s_expm1f.S (__expm1f): Likewise. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of expm1. Do not expect sinh test to fail. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (TEST_COND_x86_64): Remove macro. (TEST_COND_x86): Likewise. (expm1_tonearest_test_data): New array. (expm1_test_tonearest): New function. (expm1_towardzero_test_data): New array. (expm1_test_towardzero): New function. (expm1_downward_test_data): New array. (expm1_test_downward): New function. (expm1_upward_test_data): New array. (expm1_test_upward): New function. (main): Run the new test functions. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-19 13:36:10 +00:00
Joseph Myers	f889953b44	Move tests of jn and yn from libm-test.inc to auto-libm-test-in. This patch moves tests of jn and yn to auto-libm-test-in, adding the required support for gen-auto-libm-tests (and adding a missing assertion there and fixing logic that was broken for functions with integer arguments). Tested x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of jn and yn. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (jn_test_data): Use AUTO_TESTS_if_f. (yn_test_data): Likewise. * math/gen-auto-libm-tests.c (func_calc_method): Add value mpfr_if_f. (func_calc_desc): Add mpfr_if_f union field. (FUNC_mpfr_if_f): New macro. (test_functions): Add jn and yn. (calc_generic_results): Assert type of second input for mpfr_ff_f. Handle mpfr_if_f. (output_for_one_input_case): Disable all checking for arguments fitting floating-point types in case of an integer argument. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.	2013-12-18 17:59:29 +00:00
Joseph Myers	2dec468fd8	Fix ldbl-128 logl for subnormals (bug 16338). This patch fixes bug 16338, ldbl-128 logl not handling subnormals (with consequent inaccuracy for lgammal as well). The fix is simply to use __frexpl when determining the exponent, as done already in log2l and log10l. Given the lack of testing of small arguments to any of the log* functions, appropriate tests are added for all of them. Tested x86_64 and x86 and ulps updated accordingly, and spot tests also run for mips64 to confirm the ldbl-128 fix. Note that while this fixes lgammal inaccuracy for small positive arguments, I suspect that there will still be problems with spurious underflows in that case. * sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Use __frexpl to determine exponent and adjust argument to have exponent of -1. * math/auto-libm-test-in: Add more tests of log, log10, log1p and log2. * math/auto-libm-test-out: Regenerated. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2013-12-18 11:38:27 +00:00
Joseph Myers	ff362e5b93	Move tests of atan2, hypot and pow from libm-test.inc to auto-libm-test-in.	2013-12-16 21:18:07 +00:00
Allan McRae	6f8e37ebf8	Update file name in x86_64 ifunc list File name update missed in commit `584b18eb`.	2013-12-16 13:00:39 +10:00

1 2 3 4 5 ...

776 Commits