This patch fixes incorrect results from catan and catanh of certain
special inputs in round-downward mode (bug 16799), and incorrect
results of __ieee754_logf (+/-0) in round-downward mode (bug 16800)
that show up through catan/catanh when tested in all rounding modes,
but not directly in the testing for logf because the bug gets hidden
by the wrappers.
Both bugs involve a zero that should be +0 being -0 instead: one
computed as (1-x)*(1+x) in the catan/catanh case, and one as (x-x) in
the logf case. The fixes ensure positive zero is used. Testing of
catan and catanh in all rounding modes is duly enabled.
I expect there are various other bugs in special cases in __ieee754_*
functions that are normally hidden by the wrappers but would show up
for testing with -lieee (or in future with -fno-math-errno if we
replace -lieee and _LIB_VERSION with compile-time redirection to new
*_noerrno symbol names).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16799]
[BZ #16800]
* math/s_catan.c (__catan): Avoid passing -0 denominator to atan2
with 0 numerator.
* math/s_catanf.c (__catanf): Likewise.
* math/s_catanh.c (__catanh): Likewise.
* math/s_catanhf.c (__catanhf): Likewise.
* math/s_catanhl.c (__catanhl): Likewise.
* math/s_catanl.c (__catanl): Likewise.
* sysdeps/ieee754/flt-32/e_logf.c (__ieee754_logf): Always divide
by positive zero when computing -Inf result.
* math/libm-test.inc (catan_test): Use ALL_RM_TEST.
(catanh_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 16789, incorrect sign of (real part) zero result
from clog and clog10 in round-downward mode, arising from that real
part being computed as 0 - 0. To ensure that an underflow exception
occurred, the code used an underflowing value (the next term in the
series for log1p) in arithmetic computing the real part of the result,
yielding the problematic 0 - 0 computation in some cases even when the
mathematical result would be small but positive. The patch changes
this code to use the math_force_eval approach to ensuring that an
underflowing computation actually occurs. Tests of clog and clog10
are enabled in all rounding modes.
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16789]
* math/s_clog.c (__clog): Use math_force_eval to ensure underflow
instead of using underflowing value in computing result.
* math/s_clog10.c (__clog10): Likewise.
* math/s_clog10f.c (__clog10f): Likewise.
* math/s_clog10l.c (__clog10l): Likewise.
* math/s_clogf.c (__clogf): Likewise.
* math/s_clogl.c (__clogl): Likewise.
* math/libm-test.inc (clog_test): Use ALL_RM_TEST.
(clog10_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 16348, spurious underflows from x86/x86_64 expl
on arguments close to 0. These implementations effectively use expm1
(on the fractional part of the argument) internally, so resulting in
spurious underflows when the result is very close to 1. For arguments
small enough that the round-to-nearest correct result is 1, this patch
uses 1+x instead.
These implementations are also used for exp10l and so the patch fixes
similar issues there (the 0x1p-67 threshold being small enough to be
correct for exp10l as well as expl). But because of spurious
underflows in other exp10 implementations (bug 16560), the tests
aren't added for exp10 at this point - they can be added when the
other exp10 parts of that bug are fixed.
Tested x86_64 and x86; no ulps updates needed.
[BZ #16348]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Use
1+x for argument with exponent below -67.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of exp.
* math/auto-libm-test-out: Regenerated.
Bug 16198 is x86_64 fegetenv wrongly masking exceptions for which
traps are enabled, because that's a side-effect of the fnstenv
instruction. This patch fixes it to use fldenv immediately after
fnstenv, like the i386 version. Tested x86_64 and x86.
[BZ #16198]
* sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Use fldenv after
fnstenv.
* math/test-fenv-preserve.c: New file.
* math/Makefile (tests): Add test-fenv-preserve.
gen-auto-libm-tests presently allows but does not require underflow
exceptions for results with magnitude in the range (greatest
subnormal, least normal].
In some cases, the magnitude of the exact result is very slightly
above the least normal, but rounding in the implementation results in
it effectively computing an infinite-precision result that is slightly
below the least normal, so raising an underflow exception. This is in
accordance with the documented accuracy goals, but results in
testsuite failures.
This patch changes the logic to allow underflows when the mathematical
result is up to 0.5ulp above the least normal (so in any case where
the round-to-nearest result is the least normal). Ideally underflows
in all these cases would be accepted only when an underflow with the
actual result is consistent with the rounding mode (in FE_TOWARDZERO
mode, a return value of the least normal implies that the
infinite-precision result did not underflow so there should be no
underflow exception, for example), so as to match the documented goals
more precisely - whereas at present the tests for exceptions are
completely independent of the tests of the returned values. (The same
applies to overflow exceptions as well - they too should be checked
for consistency with the result, as in FE_TOWARDZERO mode a result
1ulp below the largest finite value should be inconsistent with an
overflow exception and cause a failure with overflow rather than
simply being considered a 1ulp error when overflow is expected.) But
the present patch at least deals with the cases causing spurious
failures so that (a) certain existing tests no longer need to be
marked as having spurious exceptions (such markings in
auto-libm-test-in end up applying to more cases than just those they
are needed for) and (b) log1p can be tested in all rounding modes
without introducing more such failures. This patch duly moves tests
of log1p to ALL_RM_TEST.
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16357]
[BZ #16599]
* math/gen-auto-libm-tests.c (fp_format_desc): Add field
min_plus_half.
(fp_formats): Update initializers.
(init_fp_formats): Initialize new field.
(output_for_one_input_case): Allow underflow for results up to
min_plus_half.
* math/libm-test.inc (log1p_test): Use ALL_RM_TEST.
* math/auto-libm-test-in: Don't mark some underflows from asin and
atanh as spurious.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
My recent exp patch introduced warnings about implicit __isinf
declarations in exp because e_exp.c didn't include <math.h>. This
patch fixes this. Because <math.h> can't be included after
<math_private.h> (because of macro definitions of __nan*), it was
necessary to put an include in sysdeps/x86_64/fpu/multiarch/e_exp.c as
well.
Tested x86_64.
* sysdeps/ieee754/dbl-64/e_exp.c: Include <math.h>.
* sysdeps/x86_64/fpu/multiarch/e_exp.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
The dbl-64 version of exp needs round-to-nearest mode for its internal
computations, but that has the consequence of inappropriate
overflowing and underflowing results in other rounding modes. This
patch fixes this by recomputing the relevant results in cases where
the round-to-nearest result overflows to infinity or underflows to
zero (most of the diffs are actually just consequent reindentation).
Tests are enabled in all rounding modes for complex functions using
exp - but not for cexp because it turns out there are bugs causing
spurious underflows for cexp for some tests, which will need to be
fixed separately (I suspect ccos ccosh csin csinh ctan ctanh have
similar bugs, just not shown by the present set of test inputs).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16284]
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Use original
rounding mode to recompute results that overflow to infinity or
underflow to zero.
* math/auto-libm-test-in: Don't mark tests as expected to fail for
bug 16284.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (ccos_test): Use ALL_RM_TEST.
(ccosh_test): Likewise.
(csin_test_data): Use plus_oflow.
(csin_test): Use ALL_RM_TEST.
(csinh_test_data): Use plus_oflow.
(csinh_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
According to ISO C Annex F, log (1) should be +0 in all rounding
modes, but some implementations in glibc wrongly return -0 in
round-downward mode (mapping to log1p (x - 1) is problematic because 1
- 1 is -0 in round-downward mode, and log1p (-0) is -0). This patch
fixes this. (It helps with some implementations of other functions
such as acosh, log2 and log10 that call out to log, but not enough to
enable all-rounding-modes testing for those functions without further
fixes to other implementations of them.)
Tested x86_64 and x86 and ulps updated accordingly, and did spot tests
for mips64 for the ldbl-128 fix, and i586 for the sysdeps/i386/fpu
implementations shadowed by those in sysdeps/i386/i686/fpu.
[BZ #16731]
* sysdeps/i386/fpu/e_log.S (__ieee754_log): Take absolute value
when x - 1 is zero.
* sysdeps/i386/fpu/e_logf.S (__ieee754_logf): Likewise.
* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Return +0 when
argument is 1.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/x86_64/fpu/e_logl.S: Take absolute value when x - 1 is
zero.
* math/libm-test.inc (log_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch makes libm-test.inc tests of most functions use ALL_RM_TEST
unless there was some reason to defer that change for a particular
function.
I started out planning to defer the change for pow (bug 16315), cexp /
ccos / ccosh / csin / csinh (likely fallout from exp, bug 16284) and
cpow (exact expectations for signs of exact zero results not wanted).
Testing on x86_64 and x86 showed additional failures for acosh, cacos,
catan, catanh, clog, clog10, jn, log, log10, log1p, log2, tgamma, yn,
so making the change for those functions was deferred as well, pending
investigation to show which of these represent distinct bugs (some
such bugs may already be filed) and appropriate fixing / XFAILing.
Failures include wrong signs of zero results, errors slightly above
the 9ulp bound (in such cases it may make sense for functions to set
round-to-nearest internally to reduce error accumulation), large
errors and incorrect overflow/underflow for the rounding mode (with
consequent missing errno settings in some cases). It's possible some
could be issues with test expectations, though I didn't notice any
that were obviously like that (I added NO_TEST_INLINE for cases that
were failing for ildoubl on x86 and where it seemed reasonable for
them to fail for the fast-math inlines).
There may of course be failures on other architectures for functions
that didn't fail on x86_64 or x86, in which case the usual rule
applies: file a bug (preferably identifying the underlying problem
function, in cases where function A calls function B and a problem
with function B may present in the test results for function A) if not
already in Bugzilla then fix or XFAIL.
Tested x86_64 and x86 and ulps updated accordingly.
* math/libm-test.inc (asinh_test): Use ALL_RM_TEST.
(atan_test): Likewise.
(atanh_test_data): Use NO_TEST_INLINE for two tests.
(atanh_test): Use ALL_RM_TEST.
(atan2_test_data): Likewise.
(cabs_test): Likewise.
(cacosh_test): Likewise.
(carg_test): Likewise.
(casin_test): Likewise.
(casinh_test): Likewise.
(cbrt_test): Likewise.
(csqrt_test): Likewise.
(erf_test): Likewise.
(erfc_test): Likewise.
(pow10_test): Likewise.
(exp2_test): Likewise.
(hypot_test): Likewise.
(j0_test): Likewise.
(j1_test): Likewise.
(lgamma_test): Likewise.
(gamma_test): Likewise.
(sincos_test): Likewise.
(tanh_test): Likewise.
(y0_test): Likewise.
(y1_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
It checks AVX-512 assembler support first and sets libc_cv_cc_avx512 to
$libc_cv_asm_avx512, instead of yes. GCC won't support AVX-512 if
assembler doesn't support it.
* sysdeps/x86_64/configure.ac: Check AVX-512 assembler support
first. Disable AVX-512 GCC support if assembler doesn't support
it.
* sysdeps/x86_64/configure: Regenerated.
AVX-512 ISA adds 512-bit zmm registers. This patch updates
_dl_runtime_profile to pass zmm registers to run-time audit. It also
changes _dl_x86_64_save_sse and _dl_x86_64_restore_sse to upport zmm
registers, which are called when only when RTLD_PREPARE_FOREIGN_CALL
is used. Its performance impact is minimum.
* config.h.in (HAVE_AVX512_SUPPORT): New #undef.
(HAVE_AVX512_ASM_SUPPORT): Likewise.
* sysdeps/x86_64/bits/link.h (La_x86_64_zmm): New.
(La_x86_64_vector): Add zmm.
* sysdeps/x86_64/Makefile (tests): Add tst-audit10.
(modules-names): Add tst-auditmod10a and tst-auditmod10b.
($(objpfx)tst-audit10): New target.
($(objpfx)tst-audit10.out): Likewise.
(tst-audit10-ENV): New.
(AVX512-CFLAGS): Likewise.
(CFLAGS-tst-audit10.c): Likewise.
(CFLAGS-tst-auditmod10a.c): Likewise.
(CFLAGS-tst-auditmod10b.c): Likewise.
* sysdeps/x86_64/configure.ac: Set config-cflags-avx512,
HAVE_AVX512_SUPPORT and HAVE_AVX512_ASM_SUPPORT.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Add
AVX-512 zmm register support.
(_dl_x86_64_save_sse): Likewise.
(_dl_x86_64_restore_sse): Likewise.
* sysdeps/x86_64/dl-trampoline.h: Updated to support different
size vector registers.
* sysdeps/x86_64/link-defines.sym (YMM_SIZE): New.
(ZMM_SIZE): Likewise.
* sysdeps/x86_64/tst-audit10.c: New file.
* sysdeps/x86_64/tst-auditmod10a.c: Likewise.
* sysdeps/x86_64/tst-auditmod10b.c: Likewise.
As recently discussed
<https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it
doesn't seem particularly useful for libm-test-ulps files to contain
huge amounts of data on ulps for individual tests; just the global
maximum observed ulps for each function, together with the
verification of exceptions, errno and special results such as
infinities and NaNs for each test, suffices to verify that a
function's behavior on the given test inputs is within the expected
accuracy. Removing this data reduces source tree churn caused by
updates to these files when libm tests are added, and reduces the
frequency with which testsuite additions actually need libm-test-ulps
changes at all.
Accordingly, this patch removes that data, so that individual tests
get checked against the global bounds for the given function and only
generate an error if those are exceeded. Tested x86_64 (including
verifying that if an ulps value is artificially reduced, the tests do
indeed fail as they should and "make regen-ulps" generates the
expected changes).
* math/libm-test.inc (struct ulp_data): Don't refer to ulps for
individual tests in comment.
(libm-test-ulps.h): Don't refer to test_ulps in #include comment.
(prev_max_error): New variable.
(prev_real_max_error): Likewise.
(prev_imag_max_error): Likewise.
(compare_ulp_data): Don't refer to test names in comment.
(find_test_ulps): Remove function.
(find_function_ulps): Likewise.
(find_complex_function_ulps): Likewise.
(init_max_error): Take function name as argument. Look up ulps
for that function.
(print_ulps): Remove function.
(print_max_error): Use prev_max_error instead of calling
find_function_ulps.
(print_complex_max_error): Use prev_real_max_error and
prev_imag_max_error instead of calling find_complex_function_ulps.
(check_float_internal): Take max_ulp parameter instead of calling
find_test_ulps. Don't call print_ulps.
(check_float): Update call to check_float_internal.
(check_complex): Update calls to check_float_internal.
(START): Pass argument to init_max_error.
* math/gen-libm-test.pl (%results): Don't include "kind"
information.
(parse_ulps): Don't handle ulps of individual tests.
(print_ulps_file): Likewise.
(output_ulps): Likewise.
* math/README.libm-test: Update.
* manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of
individual tests.
* sysdeps/aarch64/libm-test-ulps: Remove individual test ulps.
* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
* sysdeps/arm/libm-test-ulps: Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Likewise.
* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise.
* sysdeps/microblaze/libm-test-ulps: Likewise.
* sysdeps/mips/mips32/libm-test-ulps: Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
* sysdeps/s390/fpu/libm-test-ulps: Likewise.
* sysdeps/sh/libm-test-ulps: Likewise.
* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
* sysdeps/tile/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.
In 84ba214c, I removed some redundant sign computations and in the
process, I incorrectly got rid of a temporary variable, thus passing
the absolute value of the input to bsloww1. This caused #16623.
This fix undoes the incorrect change.
This patch moves tests of clog10 to auto-libm-test-in. Note that this
means gen-auto-libm-tests will now depend on the recent MPC 1.0.2
release which added a fix for a bug that made gen-auto-libm-tests hang
for clog10. (It still can't conveniently be used for cacos cacosh
casin casinh catan catanh csin csinh because of extreme slowness of
those functions for special cases in MPC; at least some slow cases of
csin / csinh are fixed in MPC trunk, but not in a release.)
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of clog10.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (clog10_test_data): Use AUTO_TESTS_c_c.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
gen-auto-libm-tests has a bug in the logic for setting a sticky bit
based on the ternary value from MPFR: it is correct for positive
results, but for negative results mpz_setbit acts as if a two's
complement representation is used, whereas the low bit needs setting
based on the sign-magnitude representation GMP actually uses. (This
showed up in converting fma tests to use auto-libm-test-in /
gen-auto-libm-tests.)
This patch fixes the problem by negating the mpz_t value to set its
low bit. There are lots of changes to auto-libm-test-out (mainly 1ulp
fixes to ldbl-128 expected results), but only a few ulps updates are
needed on x86 / x86_64. In one case, a corrected expectation showed
up a spurious underflow exception where the correct result is slightly
outside the underflowing range.
Tested x86_64 and x86 and ulps updated accordingly.
* math/gen-auto-libm-tests.c (adjust_real): Ensure integers are
non-negative before setting low bit.
* math/auto-libm-test-in: Mark one asin test possibly having
spurious underflow.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
In BZ #15605 fix with addding memset/memmove alias in symbol-hacks.h,
x32 symbol-hacks.h change was missing. Fixed by including
<sysdeps/generic/symbol-hacks.h> in x32 symbol-hacks.h.
This patch fixes bug 16356, bad results from x86 / x86_64 expl /
exp10l in directed rounding modes, the most serious of the bugs shown
up by my patch expanding libm test coverage. When I fixed bug 16293,
I thought it was only necessary to set round-to-nearest when using
frndint in expm1 functions, because in other cases the cancellation
error from having the resulting fractional part close to 1 or -1 would
not be significant. However, in expl and exp10l, the way the final
fractional part gets computed (something more complicated than a
simple subtraction, because more precision is needed than you'd get
that way) can result in a value outside the range [-1, 1] when the
argument to frndint was very close to an integer and was rounded the
"wrong" way because of the rounding mode - and the f2xm1 instruction
has undefined results if its argument is outside [-1, 1], so resulting
in the large errors seen. So this patch removes the USE_AS_EXPM1L
conditionals on the round-to-nearest settings, so all of expl, expm1l
and exp10l now get round-to-nearest used for frndint (meaning the
final fractional part can at most be slightly above 0.5 in
magnitude). Associated tests of exp and exp10 are added and testing
of exp10 in directed rounding modes enabled.
Tested x86_64 and x86 and ulps updated accordingly.
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Also set
round-to-nearest for [!USE_AS_EXPM1L].
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise.
* math/auto-libm-test-in: Do not expect cosh tests to fail. Add
more tests of exp and exp10. Expect some exp10 tests to miss
exceptions or fail in directed rounding modes.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (exp10_tonearest_test_data): New array.
(exp10_test_tonearest): New function.
(exp10_towardzero_test_data): New array.
(exp10_test_towardzero): New function.
(exp10_downward_test_data): New array.
(exp10_test_downward): New function.
(exp10_upward_test_data): New array.
(exp10_test_upward): New function.
(main): Call the new functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Various libm functions have inadequate test coverage in libm-test.inc
/ auto-libm-test-in - failing to cover all the usual special cases
(infinities, NaNs, zero, large and small finite values, subnormals) as
well as a reasonable range of ordinary inputs and, where appropriate,
inputs close to the thresholds for underflow and overflow.
This patch improves test coverage for real functions [a-c]* (with the
expectation of adding more coverage for other functions later).
Tested x86_64 and x86 and ulps updated accordingly (and eight glibc
bugs and one C11 DR filed for issues found in the process).
* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
asinh, atan, atan2, atanh, cbrt, cos and cosh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (acosh_test_data): Add more tests.
(atanh_test_data): Likewise.
(ceil_test_data): Likewise.
(copysign_test_data): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of cpow to auto-libm-test-in, adding the
required support to gen-auto-libm-tests.
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of cpow.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (cpow_test_data): Use AUTO_TESTS_cc_c.
* * math/gen-auto-libm-tests.c (func_calc_method): Add value
mpc_cc_c.
(func_calc_desc): Add mpc_cc_c union field.
(test_functions): Add cpow.
(special_fill_2pi): New function.
(special_real_inputs): Add 2pi.
(calc_generic_results): Handle mpc_cc_c.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of ccos, ccosh, cexp, clog, csqrt, ctan and
ctanh to auto-libm-test-in, adding the required support to
gen-auto-libm-tests. Other TEST_c_c functions aren't moved for now
(although the relevant table entries are put in gen-auto-libm-tests
for it to know how to handle them): clog10 because of a known MPC bug
causing it to hang for at least some pure imaginary inputs (fixed in
SVN, but I'd rather not rely on unreleased versions of MPFR or MPC
even if relying on very recent releases); the inverse trig and
hyperbolic functions because of known slowness in special cases; and
csin / csinh because of observed slowness that I need to investigate
and report to the MPC maintainers. Slowness can be bypassed by moving
to incremental generation (only for new / changed tests) rather than
regenerating the whole of auto-libm-test-out every time, but that
needs implementing. (This patch takes the time for running
gen-auto-libm-tests from about one second to seven, on my system,
which I think is reasonable. The slow functions would make it take
several minutes at least, which seems unreasonable.)
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of ccos, ccosh, cexp, clog,
csqrt, ctan and ctanh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (TEST_COND_x86_64): New macro.
(TEST_COND_x86): Likewise.
(ccos_test_data): Use AUTO_TESTS_c_c.
(ccosh_test_data): Likewise.
(cexp_test_data): Likewise.
(clog_test_data): Likewise.
(csqrt_test_data): Likewise.
(ctan_test_data): Likewise.
(ctan_tonearest_test_data): Likewise.
(ctan_towardzero_test_data): Likewise.
(ctan_downward_test_data): Likewise.
(ctan_upward_test_data): Likewise.
(ctanh_test_data): Likewise.
(ctanh_tonearest_test_data): Likewise.
(ctanh_towardzero_test_data): Likewise.
(ctanh_downward_test_data): Likewise.
(ctanh_upward_test_data): Likewise.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpc_c_c.
(func_calc_desc): Add mpc_c_c union field.
(FUNC_mpc_c_c): New macro.
(test_functions): Add cacos, cacosh, casin, casinh, catan, catanh,
ccos, ccosh, cexp, clog, clog10, csin, csinh, csqrt, ctan and
ctanh.
(special_fill_min_subnorm_p120): New function.
(special_real_inputs): Add min_subnorm_p120.
(calc_generic_results): Handle mpc_c_c.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of sincos to auto-libm-test-in, adding the
required support to gen-auto-libm-tests.
Tested x86_64 and x86 and ulps updated accordingly.
(auto-libm-test-out diffs omitted below.)
* math/auto-libm-test-in: Add tests of sincos.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (sincos_test_data): Use AUTO_TESTS_fFF_11.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpfr_f_11.
(func_calc_desc): Add mpfr_f_11 union field.
(test_functions): Add sincos.
(calc_generic_results): Handle mpfr_f_11.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
math/gen-libm-test.pl has code to beautify names of various constants,
transforming the source form in libm-test.inc into the version
appearing in test names in libm-test-ulps files.
This has become decreasingly relevant over time for the M_* constants,
first as I changed the test names so only the arguments and not the
expected results appeared in them, then as tests have moved to
auto-libm-test-* so that automatically generated hex float constants
get used instead of M_* in test inputs.
This patch removes the beautification for all M_* constants. Tested
x86_64 and x86 and ulps updated accordingly. Even the one case where
this affected the name in the ulps files will disappear once complex
function tests are moved to auto-libm-test-*.
* math/gen-libm-test.pl (%beautify): Remove M_* constants.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16293 is inaccuracy of x86/x86_64 versions of expm1, near 0 in
directed rounding modes, that arises from frndint rounding the
exponent to 1 or -1 instead of 0, resulting in large cancellation
error. This inaccuracy in turn affects other functions such as sinh
that use expm1. This patch fixes the problem by setting
round-to-nearest mode temporarily around the affected calls to
frndint. I don't think this is needed for other uses of frndint, such
as in exp itself, as only for expm1 is the cancellation error
significant.
Tested x86_64 and x86 and ulps updated accordingly.
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Set
round-to-nearest mode when using frndint.
* sysdeps/i386/fpu/s_expm1.S (__expm1): Likewise.
* sysdeps/i386/fpu/s_expm1f.S (__expm1f): Likewise.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of expm1. Do not expect
sinh test to fail.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (TEST_COND_x86_64): Remove macro.
(TEST_COND_x86): Likewise.
(expm1_tonearest_test_data): New array.
(expm1_test_tonearest): New function.
(expm1_towardzero_test_data): New array.
(expm1_test_towardzero): New function.
(expm1_downward_test_data): New array.
(expm1_test_downward): New function.
(expm1_upward_test_data): New array.
(expm1_test_upward): New function.
(main): Run the new test functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of jn and yn to auto-libm-test-in, adding the
required support for gen-auto-libm-tests (and adding a missing
assertion there and fixing logic that was broken for functions with
integer arguments).
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of jn and yn.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (jn_test_data): Use AUTO_TESTS_if_f.
(yn_test_data): Likewise.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpfr_if_f.
(func_calc_desc): Add mpfr_if_f union field.
(FUNC_mpfr_if_f): New macro.
(test_functions): Add jn and yn.
(calc_generic_results): Assert type of second input for
mpfr_ff_f. Handle mpfr_if_f.
(output_for_one_input_case): Disable all checking for arguments
fitting floating-point types in case of an integer argument.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 16338, ldbl-128 logl not handling subnormals
(with consequent inaccuracy for lgammal as well). The fix is simply
to use __frexpl when determining the exponent, as done already in
log2l and log10l. Given the lack of testing of small arguments to any
of the log* functions, appropriate tests are added for all of them.
Tested x86_64 and x86 and ulps updated accordingly, and spot tests
also run for mips64 to confirm the ldbl-128 fix.
Note that while this fixes lgammal inaccuracy for small positive
arguments, I suspect that there will still be problems with spurious
underflows in that case.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Use __frexpl
to determine exponent and adjust argument to have exponent of -1.
* math/auto-libm-test-in: Add more tests of log, log10, log1p and
log2.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
A sse42 version of strstr used pcmpistr instruction which is quite
ineffective. A faster way is look for pairs of characters which is uses
sse2, is faster than pcmpistr and for real strings a pairs we look for
are relatively rare.
For linear time complexity we use buy or rent technique which switches
to two-way algorithm when superlinear behaviour is detected.
Autoconf has been deprecating configure.in for quite a long time.
Rename all our configure.in and preconfigure.in files to .ac.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
http://sourceware.org/ml/libc-alpha/2013-08/msg00081.html
This is the first of a series of patches to ban ieee854_long_double
and the ieee854_long_double macros when using IBM long double. union
ieee854_long_double just isn't correct for IBM long double, especially
when little-endian, and pretending it is OK has allowed a number of
bugs to remain undetected in sysdeps/ieee754/ldbl-128ibm/.
This changes the few places in generic code that use it.
* stdio-common/printf_size.c (__printf_size): Don't use
union ieee854_long_double in fpnum union.
* stdio-common/printf_fphex.c (__printf_fphex): Likewise. Use
signbit macro to retrieve sign from long double.
* stdio-common/printf_fp.c (___printf_fp): Use signbit macro to
retrieve sign from long double.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Adjust for fpnum change.
* sysdeps/ieee754/ldbl-128/printf_fphex.c: Likewise.
* sysdeps/ieee754/ldbl-96/printf_fphex.c: Likewise.
* sysdeps/x86_64/fpu/printf_fphex.c: Likewise.
* math/test-misc.c (main): Don't use union ieee854_long_double.
ports/
* sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change.
The pointer guard used for pointer mangling was not initialized for
static applications resulting in the security feature being disabled.
The pointer guard is now correctly initialized to a random value for
static applications. Existing static applications need to be
recompiled to take advantage of the fix.
The test tst-ptrguard1-static and tst-ptrguard1 add regression
coverage to ensure the pointer guards are sufficiently random
and initialized to a default value.
Resolves: #15465
The program name may be unavailable if the user application tampers
with argc and argv[]. Some parts of the dynamic linker caters for
this while others don't, so this patch consolidates the check and
fallback into a single macro and updates all users.
This implementation speed up memset in several ways. First is avoiding
expensive computed jump. Second is using fact that arguments of memset
are most of time aligned to 8 bytes.
Benchmark results on:
kam.mff.cuni.cz/~ondra/benchmark_string/memset_profile_result27_04_13.tar.bz2
We add new memcpy version that uses unaligned loads which are fast
on modern processors. This allows second improvement which is avoiding
computed jump which is relatively expensive operation.
Tests available here:
http://kam.mff.cuni.cz/~ondra/memcpy_profile_result27_04_13.tar.bz2
The EXTRACT_WORDS64 and INSERT_WORDS64 macros use movd for a 64-bit
operation. Somehow gcc manages to turn this into movq, but LLVM won't.
2013-05-15 Peter Collingbourne <pcc@google.com>
* sysdeps/x86_64/fpu/math_private.h (MOVQ): New macro.
(EXTRACT_WORDS64) Use where appropriate.
(INSERT_WORDS64) Likewise.
While these instructions accept memory operands, only one operand
may be a memory operand. Giving two operands xm constraints gives
the compiler the option of using memory for both operands, which
would result in invalid assembly code. Using x for all operands is
more appropriate, as most x86_64 calling conventions will pass the
arguments in registers anyway.
2013-05-15 Peter Collingbourne <pcc@google.com>
* sysdeps/x86_64/fpu/multiarch/s_fma.c (__fma_fma4): Replace xm
constraints with x constraints.
* sysdeps/x86_64/fpu/multiarch/s_fmaf.c (__fmaf_fma4): Likewise.
The value of PI is never exactly PI in any floating point representation,
and the value of PI/2 is never PI/2. It is wrong to expect cos(M_PI_2l)
to return 0, instead it will return an answer that is non-zero because
M_PI_2l doesn't round to exactly PI/2 in the type used.
That is to say that the correct answer is to do the following:
* Take PI or PI/2.
* Round to the floating point representation.
* Take the rounded value and compute an infinite precision cos or sin.
* Use the rounded result of the infinite precision cos or sin as the
answer to the test.
I used printf to do the type rounding, and Wolfram's Alpha to do the
infinite precision cos calculations.
The following changes bring x86-64 and x86 to 1/2 ulp for two tests.
It shows that the x86 cos implementation is quite good, and that
our test are flawed.
Unfortunately given that the rounding errors are type dependent we
need to fix this for each type. No regressions on x86-64 or x86.
---
2013-04-11 Carlos O'Donell <carlos@redhat.com>
* math/libm-test.inc (cos_test): Fix PI/2 test.
(sincos_test): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerate.
* sysdeps/i386/fpu/libm-test-ulps: Regenerate.
Due to a typo repeated several times, this bug hasn't been fixed yet,
despite being marked as resolved in glibc 2.12.
* sysdeps/x86_64/strcmp.S: Replace all occurrences of NOT_IN_lib
with NOT_IN_libc.
With help from Joseph Myers.
* sysdeps/ieee754/ldbl-128/s_atanl.c (__atanl): Handle tiny and
very large arguments properly.
* math/libm-test.inc (atan_test): New tests.
(atan2_test): New tests.
* sysdeps/sparc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
With help from Joseph Myers.
* sysdeps/ieee754/flt-32/e_j0f.c (__ieee754_y0f): Adjust tinyness
cutoff to 2**-13.
* sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_y1f): Adjust tinyness
cutoff to 2**-25.
* sysdeps/ieee754/ldbl-128/e_j0l.c (U0): New constant.
( __ieee754_y0l): Avoid arithmetic underflow when 'x' is very
small.
* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_y1l): Likewise.
* math/libm-test.inc (y0_test): New tests.
(y1_test): New tests.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
* sysdeps/sparc/fpu/libm-test-ulps: Update.
* sysdeps/i386/i686/fpu/multiarch/Makefile (sysdep_routines):
Add s_sinf-sse2, s_conf-sse2.
* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: New file.
* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: New file.
* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: New file.
* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: New file.
* sysdeps/ieee754/flt-32/s_sinf.c (SINF, SINF_FUNC): Add macros
for using routine as __sinf_ia32.
Use macro for function declaration and weak_alias.
* sysdeps/ieee754/flt-32/s_cosf.c (COSF, COSF_FUNC): Add macros
for using routine as __cosf_ia32.
Use macro for function declaration and weak_alias.
* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Fix Copyright.
* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Fix Copyright.
* sysdeps/x86_64/fpu/s_sinf.S: New file.
* sysdeps/x86_64/fpu/s_cosf.S: New file.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
* math/libm-test.inc (cos_test): Add more test cases.
(sin_test): Likewise.
(sincos_test): Likewise.
[BZ #14538]
* sysdeps/x86_64/dl-machine.h (elf_machine_dynamic): Use the
first element of the GOT.
(elf_machine_load_address): Return the difference between
the runtime address of _DYNAMIC and elf_machine_dynamic ().
Pretty sure we require recent enough versions of gcc/binutils to make this
check pointless. I can't any logs in the last few years where this check
didn't return "yes".
Signed-off-by: Mike Frysinger <vapier@gentoo.org>