math/gen-libm-test.pl has code to beautify names of various constants,
transforming the source form in libm-test.inc into the version
appearing in test names in libm-test-ulps files.
This has become decreasingly relevant over time for the M_* constants,
first as I changed the test names so only the arguments and not the
expected results appeared in them, then as tests have moved to
auto-libm-test-* so that automatically generated hex float constants
get used instead of M_* in test inputs.
This patch removes the beautification for all M_* constants. Tested
x86_64 and x86 and ulps updated accordingly. Even the one case where
this affected the name in the ulps files will disappear once complex
function tests are moved to auto-libm-test-*.
* math/gen-libm-test.pl (%beautify): Remove M_* constants.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16293 is inaccuracy of x86/x86_64 versions of expm1, near 0 in
directed rounding modes, that arises from frndint rounding the
exponent to 1 or -1 instead of 0, resulting in large cancellation
error. This inaccuracy in turn affects other functions such as sinh
that use expm1. This patch fixes the problem by setting
round-to-nearest mode temporarily around the affected calls to
frndint. I don't think this is needed for other uses of frndint, such
as in exp itself, as only for expm1 is the cancellation error
significant.
Tested x86_64 and x86 and ulps updated accordingly.
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Set
round-to-nearest mode when using frndint.
* sysdeps/i386/fpu/s_expm1.S (__expm1): Likewise.
* sysdeps/i386/fpu/s_expm1f.S (__expm1f): Likewise.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of expm1. Do not expect
sinh test to fail.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (TEST_COND_x86_64): Remove macro.
(TEST_COND_x86): Likewise.
(expm1_tonearest_test_data): New array.
(expm1_test_tonearest): New function.
(expm1_towardzero_test_data): New array.
(expm1_test_towardzero): New function.
(expm1_downward_test_data): New array.
(expm1_test_downward): New function.
(expm1_upward_test_data): New array.
(expm1_test_upward): New function.
(main): Run the new test functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of jn and yn to auto-libm-test-in, adding the
required support for gen-auto-libm-tests (and adding a missing
assertion there and fixing logic that was broken for functions with
integer arguments).
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of jn and yn.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (jn_test_data): Use AUTO_TESTS_if_f.
(yn_test_data): Likewise.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpfr_if_f.
(func_calc_desc): Add mpfr_if_f union field.
(FUNC_mpfr_if_f): New macro.
(test_functions): Add jn and yn.
(calc_generic_results): Assert type of second input for
mpfr_ff_f. Handle mpfr_if_f.
(output_for_one_input_case): Disable all checking for arguments
fitting floating-point types in case of an integer argument.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 16338, ldbl-128 logl not handling subnormals
(with consequent inaccuracy for lgammal as well). The fix is simply
to use __frexpl when determining the exponent, as done already in
log2l and log10l. Given the lack of testing of small arguments to any
of the log* functions, appropriate tests are added for all of them.
Tested x86_64 and x86 and ulps updated accordingly, and spot tests
also run for mips64 to confirm the ldbl-128 fix.
Note that while this fixes lgammal inaccuracy for small positive
arguments, I suspect that there will still be problems with spurious
underflows in that case.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Use __frexpl
to determine exponent and adjust argument to have exponent of -1.
* math/auto-libm-test-in: Add more tests of log, log10, log1p and
log2.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
The most common use case of math functions is with default rounding
mode, i.e. rounding to nearest. Setting and restoring rounding mode
is an unnecessary overhead for this, so I've added support for a
context, which does the set/restore only if the FP status needs a
change. The code is written such that only x86 uses these. Other
architectures should be unaffected by it, but would definitely benefit
if the set/restore has as much overhead relative to the rest of the
code, as the x86 bits do.
Here's a summary of the performance improvement due to these
improvements; I've only mentioned functions that use the set/restore
and have benchmark inputs for x86_64:
Before:
cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy
exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy
pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy
sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy
tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy
After:
cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy
exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy
pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy
sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy
tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy
So the improvements are:
cos: 27.9089%
exp: 22.6919%
pow: 4.01564%
sin: 19.1585%
tan: 1.96086%
The downside of the change is that it will have an adverse performance
impact on non-default rounding modes, but I think the tradeoff is
justified.
The value of PI is never exactly PI in any floating point representation,
and the value of PI/2 is never PI/2. It is wrong to expect cos(M_PI_2l)
to return 0, instead it will return an answer that is non-zero because
M_PI_2l doesn't round to exactly PI/2 in the type used.
That is to say that the correct answer is to do the following:
* Take PI or PI/2.
* Round to the floating point representation.
* Take the rounded value and compute an infinite precision cos or sin.
* Use the rounded result of the infinite precision cos or sin as the
answer to the test.
I used printf to do the type rounding, and Wolfram's Alpha to do the
infinite precision cos calculations.
The following changes bring x86-64 and x86 to 1/2 ulp for two tests.
It shows that the x86 cos implementation is quite good, and that
our test are flawed.
Unfortunately given that the rounding errors are type dependent we
need to fix this for each type. No regressions on x86-64 or x86.
---
2013-04-11 Carlos O'Donell <carlos@redhat.com>
* math/libm-test.inc (cos_test): Fix PI/2 test.
(sincos_test): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerate.
* sysdeps/i386/fpu/libm-test-ulps: Regenerate.