The flt-32 implementation of powf wrongly uses x-1 instead of |x|-1
when computing log (x) for the case where |x| is close to 1 and y is
large. This patch fixes the logic accordingly. Relevant tests
existed for x close to 1, and corresponding tests are added for x
close to -1, as well as for some new variant cases.
Tested for x86_64 and x86.
[BZ #18647]
* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): For large y
and |x| close to 1, use absolute value of x when computing log.
* math/auto-libm-test-in: Add more tests of pow.
* math/auto-libm-test-out: Regenerated.
In the ldbl-128 implementation of expm1l, when expm1l's result should
underflow to 0 (argument minus the least subnormal, in some rounding
modes), it can be a zero of the wrong sign. This patch fixes this in
the same way previously used for the x86 / x86_64 versions.
Tested for mips64.
[BZ #18619]
* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Force underflow
and return argument in case of subnormal argument.
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.
Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations. However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).
Some additional complications also arose. Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow. (libm-test.inc also doesn't handle being lax about whether
the result in cases very close to the overflow threshold is infinity
or a finite value close to overflow, but that doesn't cause problems
in this case though I've seen it cause problems with random test
generation for some functions.) Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc). And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.
I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.
This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18613]
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
X_ADJ not X when adjusting exponent.
(__ieee754_gamma_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
of X_ADJ not X when adjusting exponent.
(__ieee754_gammaf_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
log of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
log of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* math/libm-test.inc (tgamma_test_data): Remove one test. Moved
to auto-libm-test-in.
(tgamma_test): Use ALL_RM_TEST.
* math/auto-libm-test-in: Add one test of tgamma. Mark some other
tests of tgamma with spurious-overflow.
* math/auto-libm-test-out: Regenerated.
* math/gen-libm-have-vector-test.sh: Do not check for START.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The ldbl-128 implementation of j1l produces spurious underflow
exceptions for some small arguments, as a result of squaring the
argument. This patch fixes it just to use a linear approximation for
sufficiently small arguments, and then to force an underflow exception
only in the cases where it is required.
Tested for mips64.
[BZ #18612]
* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): For small
arguments, just return 0.5 times the argument, with underflow
forced as needed.
* math/auto-libm-test-in: Add more tests of j1.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, j1 and jn implementations
can fail to raise the underflow exception when the internal
computation is exact although the actual function is inexact. This
patch forces the exception in a similar way to other such fixes. (The
ldbl-128 / ldbl-128ibm j1l implementation is different and doesn't
need a change for this until spurious underflows in it are fixed.)
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16559]
* sysdeps/ieee754/dbl-64/e_j1.c: Include <float.h>.
(__ieee754_j1): Force underflow exception for small results.
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
* sysdeps/ieee754/flt-32/e_j1f.c: Include <float.h>.
(__ieee754_j1f): Force underflow exception for small results.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_j1l.c: Include <float.h>.
(__ieee754_j1l): Force underflow exception for small results.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
* math/auto-libm-test-in: Add more tests of j1 and jn.
* math/auto-libm-test-out: Regenerated.
Some existing jn tests, if run in non-default rounding modes, produce
errors above those accepted in glibc, which causes problems for moving
tests of jn to use ALL_RM_TEST. This patch makes jn set rounding
to-nearest internally, as was done for yn some time ago, then computes
the appropriate underflowing value for results that underflowed to
zero in to-nearest, and moves the tests to ALL_RM_TEST. It does
nothing about the general inaccuracy of Bessel function
implementations in glibc, though it should make jn more accurate on
average in non-default rounding modes through reduced error
accumulation. The recomputation of results that underflowed to zero
should as a side-effect fix some cases of bug 16559, where jn just
used an exact zero, but that is *not* the goal of this patch and other
cases of that bug remain unfixed.
(Most of the changes in the patch are reindentation to add new scopes
for SET_RESTORE_ROUND*.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16559]
[BZ #18602]
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set
round-to-nearest internally then recompute results that
underflowed to zero in the original rounding mode.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise
* math/libm-test.inc (jn_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Similar to various other bugs in this area, the ldbl-128 expl
implementation does not raise the underflow exception for all
subnormal results, if the scaling down is exact although the actual
result is inexact. This patch fixes this by forcing the exception in
this case (the tests that failed before and pass after the test are
already in the testsuite).
Tested for mips64.
[BZ #18586]
* sysdeps/ieee754/ldbl-128/e_expl.c (__ieee754_expl): Force
underflow exception for small results.
Similar to various other bugs in this area, some sin and sincos
implementations do not raise the underflow exception for subnormal
arguments, when the result is tiny and inexact. This patch forces the
exception in a similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16526]
[BZ #16538]
* sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>.
(__sin): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sin and sincos.
* math/auto-libm-test-out: Regenerated.
__kernel_standard_l converts long double arguments to double for use
in SVID "struct exception". This has special-case handling for when
that conversion would overflow or underflow but the original long
double function wouldn't. However, it turns out that "inexact"
exceptions can be spurious here as well, when the function is exactly
determined and __kernel_standard_l is being called for a domain error.
This patch fixes this by using feholdexcept / fesetenv to avoid
exceptions from the conversion, replacing the previous special-case
logic for overflow and underflow (this covers all functions using
__kernel_standard_l, not just those that actually need a change, since
there doesn't seem to be much point in restricting things just to the
functions that mustn't get "inexact" here).
Tested for x86_64 and x86.
[BZ #18245]
[BZ #18583]
* sysdeps/ieee754/k_standardl.c: Include <fenv.h>.
(__kernel_standard_l): Use feholdexcept and fesetenv around
conversion to double instead of special-casing overflow and
underflow.
* math/libm-test.inc (fmod_test_data): Add more tests.
(remainder_test_data): Likewise.
(sqrt_test_data): Likewise.
The dbl-64 and flt-32 implementations of exp2 functions produce
spurious underflow exceptions. The underlying reason is the same in
both cases: the computation works as (2^a - 1)*2^b + 2^b for suitably
chosen a and b, where a has small magnitude so 2^a - 1 can be computed
with a low-degree polynomial approximation, and (2^a - 1)*2^b can
underflow even when the final result does not. This patch fixes this
by adjusting the threshold for when scaling is used to avoid
intermediate underflow so it works for any possible value of a where
the final result would not underflow.
Tested for x86_64 and x86.
[BZ #18219]
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Reduce
threshold on absolute value of exponent for which scaling is used.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some expm1 implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
(The issue does not apply to the ldbl-* implementations or to those
for x86 / x86_64 long double. The change to
sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c is one I missed when
previously fixing bug 16354; the bug in that implementation was
previously latent, but the expm1 fixes stopped it being latent and so
required it to be fixed to avoid spurious underflows from cosh.)
Tested for x86_64 and x86.
[BZ #16353]
* sysdeps/i386/fpu/s_expm1.S (dbl_min): New object.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/i386/fpu/s_expm1f.S (flt_min): New object.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/s_expm1.c: Include <float.h>.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_expm1f.c: Include <float.h>.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c (__ieee754_cosh):
Check for small arguments before calling __expm1.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16353.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86 and mips64.
[BZ #16350]
* sysdeps/i386/fpu/s_asinh.S (__asinh): Force underflow exception
for arguments with small absolute value.
* sysdeps/i386/fpu/s_asinhf.S (__asinhf): Likewise.
* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Likewise.
* sysdeps/ieee754/dbl-64/s_asinh.c: Include <float.h>.
(__asinh): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_asinhf.c: Include <float.h>.
(__asinhf): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16350.
* math/auto-libm-test-out: Regenerated.
sysdeps/ieee754/ldbl-128ibm has its own versions of cprojl, ctanhl and
ctanl.
Having its own versions, where otherwise the math/ copies are
generally used for all floating-point formats, means they are liable
to get out of sync and not benefit from bug fixes to the generic
versions. The substantive differences (not arising from getting out
of sync and slightly different fixes for the same issues) are: long
double compat handling (also done in the ldbl-opt versions, so doesn't
require special versions for ldbl-128ibm); handling of LDBL_EPSILON
(conditionally undefined and redefined in other math/ implementations,
so doesn't justify a special version), and:
/* __gcc_qmul does not respect -0.0 so we need the following fixup. */
if ((__real__ res == 0.0L) && (__real__ x == 0.0L))
__real__ res = __real__ x;
if ((__real__ res == 0.0L) && (__imag__ x == 0.0L))
__imag__ res = __imag__ x;
But if that statement about __gcc_qmul was ever true for an old
version of that libgcc function, it's not the case for any GCC version
now supported to build glibc; there's explicit logic early in that
function (and similarly in __gcc_qdiv) to return an appropriately
signed zero if the product of the high parts is zero. So this patch
adds the special LDBL_EPSILON handling to the generic functions and
removes the ldbl-128ibm versions.
Tested for powerpc32 (compared test-ldouble.out before and after the
changes; there are slight changes to results for ctanl / ctanhl,
arising from divergence of the implementations, but nothing that
affects the overall nature of the issues shown by the testsuite, and
in particular nothing related to signs of zero resutls).
* math/s_ctanhl.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine
and redefine.
* math/s_ctanl.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine
and redefine.
* sysdeps/ieee754/ldbl-128ibm/s_cprojl.c: Remove file.
* sysdeps/ieee754/ldbl-128ibm/s_ctanhl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_ctanl.c: Likewise.
The ldbl-128 and ldbl-128ibm implementations of tanl produce
uninitialized variable warnings with -Wuninitialized because of a
variable that is initialized only conditionally, then used under the
same conditions under which it is set. This patch uses DIAG_* macros
to suppress those warnings.
Tested for powerpc and mips64.
* sysdeps/ieee754/ldbl-128/k_tanl.c: Include <libc-internal.h>.
(__kernel_tanl): Ignore uninitialized warnings around use of SIGN.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Include <libc-internal.h>.
(__kernel_tanl): Ignore uninitialized warnings around use of SIGN.
The ldbl-128 and ldbl-128ibm implementations of erfcl produce
uninitialized variable warnings with -Wuninitialized because of switch
statements where in fact one of the cases will always be executed, but
the compiler does not see that these cases cover all possibilities
(and because the reasoning that it does involves inequalities on the
representation of a floating point value leading to a set of possible
values for 8.0 times that value, converted to int, it's highly
nontrivial for the compiler to see that). This patch fixes those
warnings by converting the last case in those switch statements to a
"default" case.
Tested for powerpc and mips64.
* sysdeps/ieee754/ldbl-128/s_erfl.c (__erfcl): Make case 9 in
switch statement into default case.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfcl): Likewise.
The ldbl-128 and ldbl-128ibm implementations of asinl produce
uninitialized variable warnings with -Wuninitialized because the code
for small arguments in fact always returns but the compiler cannot see
this and instead sees that a variable would be uninitialized if the
"if (huge + x > one)" conditional used to force the "inexact"
exception were false.
All the code in libm trying to force "inexact" for functions that are
not exactly defined is suspect and should be removed at some point
given that we now have a clear definition of the accuracy goals for
libm functions which, following C99/C11, does not require anything
about "inexact" for most functions (likewise, the multi-precision code
that tries to give correctly-rounded results, very slowly, for
functions for which the goals clearly do not include correct rounding,
if the faster paths are accurate enough). However, for now this patch
simply changes the code to use math_force_eval, rather than "if", to
ensure the evaluation of the inexact computation.
Tested for powerpc and mips64.
* sysdeps/ieee754/ldbl-128/e_asinl.c (__ieee754_asinl): Don't use
a conditional in forcing "inexact".
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl):
Likewise.
If you remove the "override CFLAGS += -Wno-uninitialized" in
math/Makefile, you get errors from lgamma implementations of the form:
../sysdeps/ieee754/dbl-64/e_lgamma_r.c: In function '__ieee754_lgamma_r':
../sysdeps/ieee754/dbl-64/e_lgamma_r.c:297:13: error: 'nadj' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if(hx<0) r = nadj - r;
This is one of the standard kinds of false positive uninitialized
warnings: nadj is set under a certain condition, and then later used
under the same condition. This patch uses DIAG_* macros to suppress
the warning on the use of nadj. The ldbl-128 / ldbl-128ibm
implementation has a substantially different structure that avoids
this issue.
Tested for x86_64. (In fact this patch eliminates the need for that
-Wno-uninitialized on x86_64, but I want to test on more architectures
before removing it.)
* sysdeps/ieee754/dbl-64/e_lgamma_r.c: Include <libc-internal.h>.
(__ieee754_lgamma_r): Ignore uninitialized warnings around use of
NADJ.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c: Include <libc-internal.h>.
(__ieee754_lgammaf_r): Ignore uninitialized warnings around use of
NADJ.
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c: Include <libc-internal.h>.
(__ieee754_lgammal_r): Ignore uninitialized warnings around use of
NADJ.
If you remove the "override CFLAGS += -Wno-uninitialized" in
math/Makefile, one of the errors you get is:
../sysdeps/ieee754/dbl-64/mpa.c: In function '__mp_dbl.part.0':
../sysdeps/ieee754/dbl-64/mpa.c:183:5: error: 'c' may be used uninitialized in this function [-Werror=maybe-uninitialized]
c *= X[0];
The problem is that the p < 5 case initializes c if p is 1, 2, 3 or 4
but not otherwise, and in fact p is positive for all calls to this
function so the uninitialized case can't actually occur. This patch
replaces the "if (p == 4)" last case with a comment so the compiler
can see that all paths do initialize c.
Tested for x86_64.
* sysdeps/ieee754/dbl-64/mpa.c (norm): Remove if condition on
(p == 4) case.
ldbl-96 remquol wrongly handles the case where the first argument is
finite and the second infinite, because the check for the second
argument being a NaN fails to disregard the explicit high mantissa bit
and so wrongly interprets an infinity as being a NaN. This patch
fixes this by masking off that bit, and improves test coverage for
both remainder and remquo (various cases were missing tests, or, as in
the case of the bug, were tested only for one of the two functions).
Tested for x86_64 and x86.
[BZ #18244]
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Ignore explicit
high mantissa bit when testing whether P is a NaN.
* math/libm-test.inc (remainder_test_data): Add more tests.
(remquo_test_data): Likewise.
Similar to various other bugs in this area, some atanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (No change in this regard is needed
for the i386 implementation; special handling to force underflows in
these cases will only be needed there when the spurious underflows,
bug 18049, get fixed.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16352]
* sysdeps/i386/fpu/e_atanh.S (dbl_min): New object.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/i386/fpu/e_atanhf.S (flt_min): New object.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_atanh.c: Include <float.h>.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/flt-32/e_atanhf.c: Include <float.h>.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from atanh.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of tanf produces spurious underflow
exceptions for some small arguments, through computing values on the
order of x^5. This patch fixes this by adjusting the threshold for
returning x (or, as applicable, +/- 1/x) to 2**-13 (the next term in
the power series being x^3/3).
Tested for x86_64 and x86.
[BZ #18221]
* sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Use 2**-13 not
2**-28 as threshold for returning x or +/- 1/x.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of lgammaf produces spurious underflow
exceptions for some large arguments, because of calculations involving
x^-2 multiplied by small constants. This patch fixes this by
adjusting the threshold for a simpler computation to 2**26 (the error
in the simpler computation is on the order of 0.5 * log (x), for a
result on the order of x * log (x)).
Tested for x86_64 and x86.
[BZ #18220]
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Use
2**26 not 2**58 as threshold for returning x * (log (x) - 1).
* math/auto-libm-test-in: Add another test of lgamma.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of erfcf produces spurious underflow
exceptions for some arguments close to 0, because of calculations
squaring the argument and then multiplying by small constants. This
patch fixes this by adjusting the threshold for arguments for which
the result is so close to 1 that 1 - x will give the right result from
2**-56 to 2**-26. (If 1 - x * 2/sqrt(pi) were used, the errors would be
on the order of x^3 and a much larger threshold could be used.)
Tested for x86_64 and x86.
[BZ #18217]
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Use 2**-26 not 2**-56
as threshold for returning 1 - x.
* math/auto-libm-test-in: Add more tests of erfc.
* math/auto-libm-test-out: Regenerated.
The sysdeps/ieee754/flt-32 version of atanf produces spurious
underflow exceptions for some large arguments, because of computations
that compute x^-4. This patch fixes this by adjusting the threshold
for large arguments (for which +/- pi/2 can just be returned, the
correct result being roughly +/- pi/2 - 1/x) from 2^34 to 2^25.
Tested for x86_64 and x86.
[BZ #18196]
* sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Use 2^25 not 2^34 as
threshold for large arguments.
* math/auto-libm-test-in: Add another test of atan.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some log1p implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (The ldbl-128ibm implementation
doesn't currently need any change as it already generates this
exception, albeit through code that would generate spurious exceptions
in other cases; special code for this issue will only be needed there
when fixing the spurious exceptions.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16339]
* sysdeps/i386/fpu/s_log1p.S (dbl_min): New object.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/s_log1pf.S (flt_min): New object.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/dbl-64/s_log1p.c: Include <float.h>.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/s_log1pf.c: Include <float.h>.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_log1pl.c: Include <float.h>.
(__log1pl): Force underflow exception for results with small
absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from log1p.
* math/auto-libm-test-out: Regenerated.
The implementation of roundl for ldbl-128 involves undefined behavior
for arguments with exponents from 31 to 47 inclusive, from the shift:
u_int64_t i = -1ULL >> (j0 - 48);
For example, on mips64, this means roundl (0xffffffffffff.8p0L)
wrongly returns its argument, which is not an integer. A condition
checking for exponents < 31 should actually be checking for exponents
< 48, and this patch makes it do so. (That condition is for whether
the bit representing 0.5 is in the high 64-bit half of the
floating-point number. The value 31 might have arisen from an
incorrect conversion of the ldbl-96 version to handle ldbl-128.)
This was originally reported as a GCC libquadmath bug
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65757>.
Tested for mips64; also tested for x86_64 and x86 to make sure the new
tests pass there.
[BZ #18346]
* sysdeps/ieee754/ldbl-128/s_roundl.c (__roundl): Handle all
exponents less than 48 as cases where high part of mantissa needs
examining to determine whether argument is integral.
* math/libm-test.inc (round_test_data): Add more tests.
According to bug 6792, errno is not set to ERANGE/EDOM
by calling log1p/log1pf/log1pl with x = -1 or x < -1.
This patch adds a wrapper which sets errno in those cases
and returns the value of the existing __log1p function.
The log1p is now an alias to the wrapper function
instead of __log1p.
The files in sysdeps are reflecting these changes.
The ia64 implementation sets errno by itself,
thus the wrapper-file is empty.
The libm-test is adjusted for log1p-tests to check errno.
[BZ #6792]
* math/w_log1p.c: New file.
* math/w_log1pf.c: Likewise.
* math/w_log1pl.c: Likewise.
* math/Makefile (libm-calls): Add w_log1p.
* math/s_log1pl.c (log1pl): Remove weak_alias.
* sysdeps/i386/fpu/s_log1p.S (log1p): Likewise.
* sysdeps/i386/fpu/s_log1pf.S (log1pf): Likewise.
* sysdeps/i386/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/x86_64/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/ieee754/dbl-64/s_log1p.c (log1p): Likewise.
[NO_LONG_DOUBLE] (log1pl): Likewise.
* sysdeps/ieee754/flt-32/s_log1pf.c (log1pf): Likewise.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/s_log1pl.c
(log1p): Remove long_double_symbol.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/w_log1pl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/w_log1pl.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1p.c: Define empty weak_alias to
remove weak_alias for corresponding log1p function.
* sysdeps/m68k/m680x0/fpu/s_log1pf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise.
* sysdeps/ia64/fpu/w_log1p.c: New file.
* sysdeps/ia64/fpu/w_log1pf.c: Likewise.
* sysdeps/ia64/fpu/w_log1pl.c: Likewise.
* math/libm-test.inc (log1p_test_data): Add errno expectations.
The dbl-64 implementation of atan2 does computations that expect to
run in round-to-nearest mode, and in other modes the errors can
accumulate to more than the maximum accepted 9ulp. This patch makes
it use FE_TONEAREST internally, similar to other functions with such
issues. Tests that previously produced large errors are added for
atan2 and the closely related carg, clog and clog10 functions.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18210]
[BZ #18211]
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <fenv.h>.
(__ieee754_atan2): Set FE_TONEAREST mode for internal
computations.
* math/auto-libm-test-in: Add more tests of atan2, carg, clog and
clog10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The dbl-64 implementation of atan does computations that expect to run
in round-to-nearest mode, and in other modes the errors can accumulate
to more than the maximum accepted 9ulp. This patch makes it use
FE_TONEAREST internally, similar to other functions with such issues.
Tested for x86_64 and x86; no ulps updates needed.
[BZ #18197]
* sysdeps/ieee754/dbl-64/s_atan.c: Include <fenv.h>.
(atan): Set FE_TONEAREST mode for internal computations.
* math/auto-libm-test-in: Add more tests of atan.
* math/auto-libm-test-out: Regenerated.
The threshold in ldbl-96 atanhl for when to return the argument,
0x1p-28, is a bit too big, and that in ldbl-128ibm atanhl is much too
big (the relevant condition being x^3/3 being < 0.5ulp of x),
resulting in errors a bit above the limits of those considered
acceptable in glibc in the ldbl-96 case, and in large errors in the
ldbl-128ibm case. This patch changes those implementations to use
more appropriate thresholds and adds tests around the thresholds for
various formats.
Tested for x86_64, x86 and powerpc. x86_64 and x86 ulps updated
accordingly.
[BZ #18046]
[BZ #18047]
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl): Use
0x1p-56L as threshold for just returning the argument.
* sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Use
0x1p-32L as threshold for just returning the argument.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulp: Likewise.
We want to avoid -Wno- options in makefiles as far as possible, by
cleaning up the underlying issues if possible or failing that by using
diagnostic pragmas. This patch eliminates the use of
-Wno-write-strings for sysdeps/ieee754/k_standard.c by using casts in
the source file to cast away const; those casts are encapsulated in a
macro that also deals with the choice of strings for float / double /
long double functions (for which the logic was previously replicated
many times).
Tested for x86_64; the only change to disassembly of installed
stripped shared libraries was a line number in an assertion.
* sysdeps/ieee754/k_standard.c (CSTR): New macro.
(__kernel_standard): Use CSTR macro when setting exc.name.
* sysdeps/ieee754/Makefile [$(subdir) = math]
(CFLAGS-k_standard.c): Remove variable.
math/Makefile currently has:
# The fdlibm code generates a lot of these warnings but is otherwise clean.
override CFLAGS += -Wno-uninitialized
This is of course undesirable; warnings should be disabled as narrowly
as possible. To remove this override, we need to fix files that
generate such warnings, or put warning-disabling pragmas in them.
This patch does so for Bessel function implementations, one of the
cases that have the warnings if the override is removed. The warnings
arise because functions set pointer variables p and q only for certain
values of the function argument, then use them unconditionally. As
the static functions in question only get called for arguments that
satisfy the last condition in the if/else chain, the natural fix is to
change the last "else if" to just "else", which this patch does. (The
ldbl-128 / ldbl-128ibm implementation of these functions is
substantially different and looks like it already does use "else" in
the last case in the nearest corresponding code.)
Tested for x86_64 and x86.
* sysdeps/ieee754/dbl-64/e_j0.c (pzero): Change last case for
setting p and q from "else if" to "else".
(qzero): Likewise.
* sysdeps/ieee754/dbl-64/e_j1.c (pone): Likewise.
(qone): Likewise.
* sysdeps/ieee754/flt-32/e_j0f.c (pzerof): Likewise.
(qzerof): Likewise.
* sysdeps/ieee754/flt-32/e_j1f.c (ponef): Likewise.
(qonef): Likewise.
* sysdeps/ieee754/ldbl-96/e_j0l.c (pzero): Likewise.
(qzero): Likewise.
* sysdeps/ieee754/ldbl-96/e_j1l.c (pone): Likewise.
(qone): Likewise.
The ldbl-128 and ldbl-128ibm implementations of acosl have similar
bugs, using a threshold of 0x1p-57L to determine when they just return
pi/2. Since the result pi/2 - asinl (x) is roughly pi/2 - x for small
x, the relevant cut-off is actually x being < 0.5ulp of 1. This patch
fixes the implementations to use that cut-off and adds tests of small
acos arguments.
Tested for powerpc and mips64. Also tested for x86_64 and x86; no
ulps updates needed.
[BZ #18038]
[BZ #18039]
* sysdeps/ieee754/ldbl-128/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-113L.
* sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-106L.
* math/auto-libm-test-in: Add more tests of acos.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asin implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, powerpc and mips64.
[BZ #16351]
* sysdeps/i386/fpu/e_asin.S (dbl_min): New object.
(MO): New macro.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/e_asinf.S (flt_min): New object.
(MO): New macro.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_asin.c: Include <float.h> and <math.h>.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/e_asinf.c: Include <float.h>.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]:
Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16351.
* math/auto-libm-test-out: Regenerated.
The ldbl-128ibm implementation of logbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, logbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and (fixed) bug 18029 for
ilogbl.) This patch adds checks for that case.
Tested for powerpc.
[BZ #18030]
* sysdeps/ieee754/ldbl-128ibm/s_logbl.c (__logbl): Adjust exponent
of power of 2 down when low part has opposite sign.
* math/libm-test.inc (logb_test_data): Add more tests.
The ldbl-128ibm implementation of ilogbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, ilogbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and bug 18030 for logbl.)
This patch adds checks for that case.
Tested for powerpc.
[BZ #18029]
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl):
Adjust exponent of power of 2 down when low part has opposite
sign.
* math/libm-test.inc (ilogb_test_data): Add more tests.
The ldbl-128ibm implementation of asinhl uses cut-offs of 0x1p28 and
0x1p-29 to determine when to use simpler formulas that avoid possible
overflow / underflow. Both those cut-offs are inappropriate for this
format, resulting in large errors. This patch changes the code to use
more appropriate cut-offs of 0x1p56 and 0x1p-56, adding tests around
the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18020]
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use 2**56 and
2**-56 not 2**28 and 2**-29 as thresholds for simpler formulas.
* math/auto-libm-test-in: Add more tests of asinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The ldbl-128ibm implementation of acoshl uses a cut-off of 0x1p28 to
determine when to use log(x) + log(2) as a formula. That cut-off is
too small for this format, resulting in large errors. This patch
changes it to a more appropriate cut-off of 0x1p56, adding tests
around the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18019]
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Use
2**56 not 2**28 as threshold for log (2x) formula.
* math/auto-libm-test-in: Add more tests of acosh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 15319, missing underflows from atan / atan2 when
the result of atan is very close to its small argument (or that of
atan2 is very close to the ratio of its arguments, which may be an
exact division).
The usual approach of doing an underflowing computation if the
computed result is subnormal is followed. For 32-bit x86, there are
extra complications: the inline __ieee754_atan2 in bits/mathinline.h
needs to be disabled for float and double because other libm functions
using it generally rely on getting proper underflow exceptions from
it, while the out-of-line functions have to remove excess range and
precision from the underflowing result so as to return an exact 0 in
the case where errno should be set for underflow to 0. (The failures
I saw without that are similar to those Carlos reported for other
functions, where I haven't seen a response to
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>
confirming if my diagnosis is correct. Arguably all libm functions
with float and double returns should remove excess range and
precision, but that's a separate matter.)
The x86_64 long double case reported in a comment in bug 15319 is not
a bug (it's an argument of LDBL_MIN, and x86_64 is an after-rounding
architecture so the correct IEEE result is not to raise underflow in
the given rounding mode, in addition to treating the result as an
exact LDBL_MIN being within the newly clarified documentation of
accuracy goals). I'm presuming that the fpatan instruction can be
trusted to raise appropriate exceptions when the (long double) result
underflows (after rounding) and so no changes are needed for x86 /
x86_64 long double functions here; empirically this is the case for
the cases covered in the testsuite, on my system.
Tested for x86_64, x86, powerpc and mips64. Only 32-bit x86 needs
ulps updates (for the changes to inlines meaning some functions no
longer get excess precision from their __ieee754_atan2* calls).
[BZ #15319]
* sysdeps/i386/fpu/e_atan2.S (dbl_min): New object.
(MO): New macro.
(__ieee754_atan2): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/e_atan2f.S (flt_min): New object.
(MO): New macro.
(__ieee754_atan2f): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/s_atan.S (dbl_min): New object.
(MO): New macro.
(__atan): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/i386/fpu/s_atanf.S (flt_min): New object.
(MO): New macro.
(__atanf): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <float.h> and
<math.h>.
(__ieee754_atan2): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/s_atan.c: Include <float.h> and
<math_private.h>.
(atan): Force underflow exception for results with small absolute
value.
* sysdeps/ieee754/flt-32/s_atanf.c: Include <float.h>.
(__atanf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_atanl.c: Include <float.h> and
<math.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Include <float.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/x86/fpu/bits/mathinline.h
[!__SSE2_MATH__ && !__x86_64__ && __LIBC_INTERNAL_MATH_INLINES]
(__ieee754_atan2): Only define inline for long double.
* sysdeps/x86_64/fpu/multiarch/e_atan2.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 15319. Add more tests of atan2.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (casin_test_data): Do not mark underflow
exceptions as possibly missing for bug 15319.
(casinh_test_data): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
Various remquo implementations produce a zero remainder with the wrong
sign (a zero remainder should always have the sign of the first
argument, as specified in IEEE 754) in round-downward mode, resulting
from the sign of 0 - 0. This patch checks for zero results and fixes
their sign accordingly.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17987]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Ensure sign of
zero result does not depend on the sign resulting from
subtraction.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
Various remquo implementations, when computing the last three bits of
the quotient, have spurious overflows when 4 times the second argument
to remquo overflows. These overflows can in turn cause bad results in
rounding modes where that overflow results in a finite value. This
patch adds tests to avoid the problem multiplications in cases where
they would overflow, similar to those that control an earlier
multiplication by 8.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17978]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Do not form
products 4 * y and 2 * y where those would overflow.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
The dbl-64/wordsize-64 remquo implementation follows similar logic to
various other implementations, but where that logic computes some
absolute values, it wrongly uses a previously computed bit-pattern for
the absolute value of the first argument, where actually it needs the
absolute value of the first argument mod 8 times the second. This
patch fixes it to compute the correct absolute value.
The integer quotient result of remquo is only specified mod 8
(including its sign); architecture-specific versions may well vary in
what results they give for higher bits of that result (and indeed bug
17569 gives an example correct result from __builtin_remquo giving 9
for that result, where the particular glibc implementation used in
that bug report would give 1 after this fix). Thus, this patch adapts
the tests of remquo to test that result only mod 8, to allow for such
variation when tests with higher quotient are included.
Tested for x86_64 and x86.
[BZ #17569]
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Compute absolute value of x as modified by fmod, not original
value of x.
* math/libm-test.inc (RUN_TEST_ffI_f1): Rename to
RUN_TEST_ffI_f1_mod8. Check extra return value mod 8.
(RUN_TEST_LOOP_ffI_f1): Rename to RUN_TEST_LOOP_ffI_f1_mod8. Call
RUN_TEST_ffI_f1_mod8.
(remquo_test_data): Add more tests.
This patch fixes the remaining part of bug 16560, spurious underflows
from exp2 of arguments close to 0 (when the result is close to 1, so
should not underflow), by just using 1+x instead of a more complicated
calculation when the argument is sufficiently small.
Tested for x86_64, x86 and mips64.
[BZ #16560]
* math/e_exp2l.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine
and redefine.
(__ieee754_exp2l): Do not multiply small fractional parts by
M_LN2l.
* sysdeps/i386/fpu/e_exp2l.S (__ieee754_exp2l): Just add 1 to
small argument.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* sysdeps/x86_64/fpu/e_exp2l.S (__ieee754_exp2l): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
This patch makes sincos set errno to EDOM when passed an infinity,
similarly to sin and cos.
Tested for x86_64, x86, powerpc and mips64. I don't know if the
architecture-specific implementations for ia64 and m68k might need
corresponding fixes.
2015-02-11 Joseph Myers <joseph@codesourcery.com>
[BZ #15467]
* sysdeps/ieee754/dbl-64/s_sincos.c: Include <errno.h>.
(__sincos): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/flt-32/s_sincosf.c: Include <errno.h>.
(SINCOSF_FUNC): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128ibm/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-96/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* math/libm-test.inc (sincos_test_data): Test errno setting.
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) uses
a condition k <= -63 to determine when a standard underflowing result
tiny*__copysignl(tiny,x) should be returned. However, that condition
corresponds to values with exponent -16446 or less, and in the case of
-16446, the correct result for round-to-nearest depends on whether the
value is exactly 0x1p-16446 (half the least subnormal) or more than
that. This patch fixes the bug by changing the condition to k <= -64
and accordingly adjusting the exponent by 64 not 63 when converting to
a normal value.
Tested for x86_64.
[BZ #17803]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (twom63): Rename to
twom64. Adjust value to 0x1p-64L.
(__scalblnl): Only return standard underflowing result for K <=
-64 not K <= -63; adjust exponent for underflowing result by 64
not 63.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) is
incorrect for subnormal arguments (this is a separate bug from bug
17803, which is about underflowing results). There are two problems
with the adjustments of subnormal arguments: the "two63" variable
multiplied by is actually 0x1p52L not 0x1p63L, so is insufficient to
make values normal, and then GET_LDOUBLE_EXP(es,x), used to extract
the new exponent, extracts it into a variable that isn't used, while
the value taken to by the new exponent is wrongly taken from the high
part of the mantissa before the adjustment (hx). This patch fixes
both those problems and adds appropriate tests.
Tested for x86_64.
[BZ #17834]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (two63): Change value to
0x1p63L.
(__scalblnl): Get new exponent of adjusted subnormal value from ES
not HX.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
Platforms with 64-bit registers where 32-bit values need to have the
high 32 bits set in a particular way need to have an explicit cast
when using the 64-bit sysdeps/ieee754/dbl-64/wordsize-64 version
of llround() as lround(). This includes tilegx32, and likely MIPS.
x32 does not need this, and AArch64 ILP32 will not either. Require
it to be specified in sysdep.h to be explicit.
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fegetround by making it a weak alias of
__fegetround and making the affected code call __fegetround.
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that fegetround failures disappear from the linknamespace
test failures (feholdexcept, fesetenv, fesetround and feupdateenv
remain to be addressed before bug 17748 is fully fixed, although this
patch may suffice to fix the failures in some cases, when the libc_fe*
functions are implemented but there is no architecture-specific sqrt
implementation in use so there were failures from fegetround used by
sqrt but no other such failures).
[BZ #17748]
* include/fenv.h (__fegetround): Declare. Use libm_hidden_proto.
* math/fegetround.c (fegetround): Rename to __fegetround and
define as weak alias of __fegetround. Use libm_hidden_weak.
* sysdeps/aarch64/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/alpha/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/arm/fegetround.c (fegetround): Likewise.
* sysdeps/hppa/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/i386/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/ia64/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/m68k/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/mips/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/powerpc/fpu/fegetround.c (fegetround): Likewise.
Undefine after rather than before function definition; use
parentheses around function name in definition.
(__fegetround): Also undefine macro after function definition.
* sysdeps/powerpc/nofpu/fegetround.c (fegetround): Rename to
__fegetround and define as weak alias of __fegetround. Use
libm_hidden_weak. Do not undefine as macro.
* sysdeps/powerpc/powerpc32/e500/nofpu/fegetround.c (fegetround):
Likewise.
* sysdeps/s390/fpu/fegetround.c (fegetround): Rename to
__fegetround and define as weak alias of __fegetround. Use
libm_hidden_weak.
* sysdeps/sh/sh4/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/sparc/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/tile/math_private.h (__fegetround): New inline function.
* sysdeps/x86_64/fpu/fegetround.c (fegetround): Rename to
__fegetround and define as weak alias of __fegetround. Use
libm_hidden_weak.
* sysdeps/ieee754/dbl-64/e_sqrt.c (__ieee754_sqrt): Use
__fegetround instead of fegetround.
The natural fix for some linknamespace test failures, where C90 libm
functions call C99 <fenv.h> functions, is to make fe* into weak
aliases for __fe* and call __fe* from within libm as needed.
To do this, the __fe* names need to be available for that purpose -
that is, they must not be used for something other than aliases of
fe*. On powerpc, however, __fegetround is an inline function in
fenv_libc.h, with no corresponding fegetround inline function;
fegetround has an equivalent macro expansion in bits/fenvinline.h, but
that is disabled if __NO_MATH_INLINES (which is defined for building
libm).
I see no need for that disabling; it's not even clear that
__NO_MATH_INLINES should affect <fenv.h>, and the results of
fegetround are completely defined so there is no semantic effect of
that disabling at all outside glibc. The x86 inline feraiseexcept is
conditioned on __USE_EXTERN_INLINES not __NO_MATH_INLINES (but that's
an inline function rather than a macro).
This patch removes the __NO_MATH_INLINES conditional on that
fegetround macro, so resulting in it being expanded inline inside
glibc. In turn, this means that direct calls to __fegetround from C99
functions in ldbl-128ibm can be changed to calls to fegetround, so
that nofpu fenv_libc.h files don't need to define __fegetround at all
and, by changing ldbl-128ibm files to use <fenv.h> not <fenv_libc.h>,
non-e500 nofpu no longer needs an fenv_libc.h file.
The other macros in fenvinline.h are left conditional on
__NO_MATH_INLINES, although since the only case where this should make
a difference is one involving undefined behavior (if the argument to
the function is not a valid exception macro).
The out-of-line definition for fegetround uses __fegetround (the
inline function removed by this patch). So this continues to work,
the fenvinline.h header is made to define __fegetround, and then to
define fegetround to call __fegetround.
Tested for powerpc32 (hard float) that installed stripped shared
libraries are unchanged by this patch; also tested that powerpc-nofpu
build still works. (This patch does not itself fix any bugs; it
simply cleans things up in preparation for separate bug fixes.)
* sysdeps/powerpc/bits/fenvinline.h (fegetround): Rename macro to
__fegetround and redefine to call __fegetround. Remove condition
on [!__NO_MATH_INLINES].
* sysdeps/powerpc/fpu/fenv_libc.h (__fegetround): Remove inline
function.
* sysdeps/powerpc/nofpu/fenv_libc.h: Remove file.
* sysdeps/powerpc/powerpc32/e500/nofpu/fenv_libc.h (__fegetround):
Remove macro.
* sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Include <fenv.h>
instead of <fenv_libc.h>.
(__llrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise.
(__lrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
(__rintl): Call fegetround instead of __fegetround.
Bug 17724 reports references to fesetround being brought in by
ldbl-128ibm rintl via references to __rintl from __kernel_standard_l.
Because all three __kernel_standard* functions are in the same file,
this gets brought in even though only the long double version
__kernel_standard_l needs __rintl, and the C90 functions use only
__kernel_standard.
This patch fixes this by splitting the three versions into separate
files; it's fine for long double functions to refer to fe* functions
directly, unless they get called by C90 double functions.
Tested for x86_64 (testsuite; the reordering of code means disassembly
of shared libraries can't usefully be compared). Tested for powerpc
that the relevant issue disappears from the linknamespace test
output.
[BZ #17724]
* sysdeps/ieee754/k_standard.c: Don't include <float.h>.
(__kernel_standard_f): Remove. Moved to k_standardf.c.
(__kernel_standard_l): Remove. Moved to k_standardl.c with
(char *) casts added.
* sysdeps/ieee754/k_standardf.c: New file.
* sysdeps/ieee754/k_standardl.c: Likewise.
* math/Makefile (libm-support): Remove k_standard.
(libm-calls): Add k_standard.
ldbl-128ibm uses ldbl-128 e_lgammal_r implementation as is, however some
constants definitions overflows for IBM long double range. This patch
suppress the compiler warnings until the ldbl-128ibm implementation is
fixed.
This patch fixes bugs in ldbl-128ibm frexpl for 32-bit systems shown
up by warnings:
../sysdeps/ieee754/ldbl-128ibm/s_frexpl.c:82:4: warning: left shift count >= width of type
../sysdeps/ieee754/ldbl-128ibm/s_frexpl.c:129:5: warning: left shift count >= width of type
This did in fact show up in test-ldouble.out (alongside all the other
problems there ... maybe we should again consider running the libm
tests at finer granularity from the makefiles) as already covered by
the testsuite after the previous patch that fixed these bugs for
64-bit systems. The fix is simply using 1LL instead of 1L when
shifting by 52.
Tested for powerpc32 (soft float).
[BZ #16619]
[BZ #16740]
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Use 1LL << 52
instead of 1L << 52.
libm uses symbols mpone and mptwo for internal purposes. This patch
moves them to the implementation namespace (__mpone and __mptwo).
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).
[BZ #17616]
* sysdeps/ieee754/dbl-64/mpa.c (mpone): Rename to __mpone.
(mptwo): Rename to __mptwo.
(__inv): Use __mptwo instead of mptwo.
* sysdeps/ieee754/dbl-64/mpa.h (mpone): Rename to __mpone.
(mptwo): Rename to __mptwo.
* sysdeps/ieee754/dbl-64/mpatan.c (__mpatan): Use __mpone instead
of mpone and __mptwo instead of mptwo.
* sysdeps/ieee754/dbl-64/mpatan2.c (__mpatan2): Use __mpone
instead of mpone.
* sysdeps/ieee754/dbl-64/mpexp.c (__mpexp): Likewise.
* sysdeps/ieee754/dbl-64/mplog.c (__mplog): Likewise.
* sysdeps/ieee754/dbl-64/sincos32.c (__c32): Use __mpone instead
of mpone and __mptwo instead of mptwo.
(__mpranred): Use __mpone instead of mpone.
* conform/Makefile (test-xfail-ISO/math.h/linknamespace): Remove
variable.
(test-xfail-ISO99/complex.h/linknamespace): Likewise.
(test-xfail-ISO99/math.h/linknamespace): Likewise.
(test-xfail-ISO99/tgmath.h/linknamespace): Likewise.
(test-xfail-ISO11/complex.h/linknamespace): Likewise.
(test-xfail-ISO11/math.h/linknamespace): Likewise.
(test-xfail-ISO11/tgmath.h/linknamespace): Likewise.
(test-xfail-XPG3/math.h/linknamespace): Likewise.
(test-xfail-XPG4/math.h/linknamespace): Likewise.
(test-xfail-POSIX/math.h/linknamespace): Likewise.
(test-xfail-UNIX98/math.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/complex.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/math.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/tgmath.h/linknamespace): Likewise.
(test-xfail-POSIX2008/complex.h/linknamespace): Likewise.
(test-xfail-POSIX2008/math.h/linknamespace): Likewise.
(test-xfail-POSIX2008/tgmath.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/complex.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/math.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/tgmath.h/linknamespace): Likewise.
This patch fixes spurious underflows from ldbl-128 expm1l, as reported
in <https://sourceware.org/ml/libc-alpha/2014-06/msg00835.html> and
exposed by the tests added for such a bug in the x86 / x86-64
version. The bug and fix are essentially the same, so no separate bug
is filed in Bugzilla.
Tested for mips64.
[BZ #16539]
* sysdeps/ieee754/ldbl-128/s_expm1l.c: Include <float.h>.
(__expm1l): Return argument unchanged when small but not
subnormal.
This patch fixes bug 17097, ldbl-128 powl producing overflowing /
underflowing results with positive sign when the result should have
been negative. This was shown up by the tests in non-default rounding
modes added by my patch for bug 16315, but isn't actually limited to
non-default rounding modes: rather, when rounding to nearest the
wrappers produced a result with the correct sign and so always hid the
bug unless -lieee was used to disable the wrappers. The problem is
that in the cases where Y is large enough that the result overflows or
underflows for X not very close to 1, but not large enough to overflow
or underflow for all X != +/- 1 (in the latter case Y is always an
even integer), a positive overflowing / underflowing result is always
returned, rather than one with the correct sign. This patch moves the
relevant part of computation of the sign earlier and returns a result
of the correct sign.
Tested for mips64.
[BZ #17097]
* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Return
result with correct sign in case of exponents that produce
overflow except for X very close to 1.
This patch fixes bugs 16561 and 16562, bad results of yn in overflow
cases in non-default rounding modes, both because an intermediate
overflow in the recurrence does not get detected if the result is not
an infinity and because an overflowing result may occur in the wrong
sign. The fix is to set FE_TONEAREST mode internally for the parts of
the function where such overflows can occur (which includes the call
to y1 - where yn is used to compute a Bessel function of order -1,
negating the result of y1 isn't correct for overflowing results in
directed rounding modes) and then compute an overflowing value in the
original rounding mode if the to-nearest result was an infinity.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to test the ldbl-128 and ldbl-128ibm changes.
(The tests for these bugs were added in my previous y1 patch, so the
only thing this patch has to do with the testsuite is enable yn
testing in all rounding modes.)
[BZ #16561]
[BZ #16562]
* sysdeps/ieee754/dbl-64/e_jn.c: Include <float.h>.
(__ieee754_yn): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/flt-32/e_jnf.c: Include <float.h>.
(__ieee754_ynf): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-128/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-96/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/i386/fpu/fenv_private.h [!__SSE2_MATH__]
(libc_feholdsetround_ctx): New macro.
* math/libm-test.inc (yn_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps : Likewise.
This patch fixes spurious underflows from exp10 for arguments near 0
(part of bug 16560; that bug also includes spurious underflows from
exp2, which are not fixed by this patch). The problem is underflows
in the internal computation converting the exp10 argument to arguments
for exp (with extra precision), and the fix is simply to return 1
early for arguments near enough to 0 (just as arguments with large
enough magnitude have their own overflow / underflow logic at the
start of the function).
Tested x86_64 and x86 and ulps updated accordingly; also tested for
powerpc32 and mips64 to validate the ldbl-128ibm and ldbl-128 changes.
[BZ #16560]
* sysdeps/ieee754/dbl-64/e_exp10.c (__ieee754_exp10): Return 1 for
arguments close to 0.
* sysdeps/ieee754/ldbl-128/e_exp10l.c (__ieee754_exp10l):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c (__ieee754_exp10l):
Likewise.
* math/auto-libm-test-in: Add more tests of exp10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
This patch fixes bug 16287, spurious underflows from ldbl-128 erfl
arising from it calling erfcl for arguments with absolute value at
least 1.0, although for large positive arguments erfcl correctly
underflows but erfl shouldn't. The fix is simply to avoid calling
erfcl, and just return 1, for arguments above a cut-off large enough
that erfl correctly rounds to-nearest as 1 but not so large that erfcl
underflows.
Tested mips64. Also tested x86_64 and x86 to confirm the new tests
(taken from the tests of erfc) don't cause any problems there; no ulps
updates needed.
[BZ #16287]
* sysdeps/ieee754/ldbl-128/s_erfl.c (__erfl): Return 1 without
calling __erfcl for arguments at least 16.
* math/auto-libm-test-in: Add more tests of erf.
* math/auto-libm-test-out: Regenerated.
This patch fixes bug 16354, spurious underflows from cosh when a tiny
argument is passed to expm1 and expm1 correctly underflows although
the final result of cosh should be 1. As noted in that bug, some
cases are latent because of expm1 implementations not raising
underflow (bug 16353), but all the implementations are fixed
similarly. They already contained checks for tiny arguments, but the
checks were too late to avoid underflow from expm1 (although they
would avoid underflow from subsequent squaring of the result of
expm1); they are moved before the expm1 calls.
The thresholds used for considering arguments tiny are not
particularly consistent in how they relate to the precision of the
floating-point format in question. They are, however, all sufficient
to ensure that the round-to-nearest result of cosh is indeed 1 below
the threshold (although sometimes they are smaller than necessary).
But the previous logic did not return 1, but the previously computed 1
+ expm1(abs(x)) value. And the thresholds in the ldbl-128 and
ldbl-128ibm code (0x1p-71L - I suspect 0x3f8b was intended in the code
instead of 0x3fb8 - and (roughly) 0x1p-55L) are not sufficient for
that value to be 1. So by moving the test for tiny arguments, and
consequently returning 1 directly now the expm1 value hasn't been
computed by that point, this patch also fixes bug 17061, the (large
number of ulps) inaccuracy for small arguments in those
implementations. Tests for that bug are duly added.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to validate the ldbl-128 and ldbl-128ibm changes.
[BZ #16354]
[BZ #17061]
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Check for
small arguments before calling __expm1.
* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Check for
small arguments before calling __expm1f.
* sysdeps/ieee754/ldbl-128/e_coshl.c (__ieee754_coshl): Check for
small arguments before calling __expm1l.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Likewise.
* sysdeps/ieee754/ldbl-96/e_coshl.c (__ieee754_coshl): Likewise.
* math/auto-libm-test-in: Add more cosh tests. Do not allow
spurious underflow for some cosh tests.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
This patch fixes bug 17050, missing errno setting for y1 overflow (for
small positive arguments). An appropriate check is added for overflow
directly in the __ieee754_y1 implementation, similar to the check
present for yn (doing it there rather than in the wrapper also avoids
yn needing to repeat the check when called for order 1 or -1 and it
uses __ieee754_y1).
Tested x86_64 and x86; no ulps update needed. Also tested for mips64
to verify the ldbl-128 fix (the ldbl-128ibm code just #includes the
ldbl-128 file).
[BZ #17050]
* sysdeps/ieee754/dbl-64/e_j1.c: Include <errno.h>.
(__ieee754_y1): Set errno if return value overflows.
* sysdeps/ieee754/flt-32/e_j1f.c: Include <errno.h>.
(__ieee754_y1f): Set errno if return value overflows.
* sysdeps/ieee754/ldbl-128/e_j1l.c: Include <errno.h>.
(__ieee754_y1l): Set errno if return value overflows.
* sysdeps/ieee754/ldbl-96/e_j1l.c: Include <errno.h>.
(__ieee754_y1l): Set errno if return value overflows.
* math/auto-libm-test-in: Add more tests of y0, y1 and yn.
* math/auto-libm-test-out: Regenerated.
This patch fixes bug 16315, bad pow handling of overflow/underflow in
non-default rounding modes. Tests of pow are duly converted to
ALL_RM_TEST to run all tests in all rounding modes.
There are two main issues here. First, various implementations
compute a negative result by negating a positive result, but this
yields inappropriate overflow / underflow values for directed
rounding, so either overflow / underflow results need recomputing in
the correct sign, or the relevant overflowing / underflowing operation
needs to be made to have a result of the correct sign. Second, the
dbl-64 implementation sets FE_TONEAREST internally; in the overflow /
underflow case, the result needs recomputing in the original rounding
mode.
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16315]
* sysdeps/i386/fpu/e_pow.S (__ieee754_pow): Ensure possibly
overflowing or underflowing operations take place with sign of
result.
* sysdeps/i386/fpu/e_powf.S (__ieee754_powf): Likewise.
* sysdeps/i386/fpu/e_powl.S (__ieee754_powl): Likewise.
* sysdeps/ieee754/dbl-64/e_pow.c: Include <math.h>.
(__ieee754_pow): Recompute overflowing and underflowing results in
original rounding mode.
* sysdeps/x86/fpu/powl_helper.c: Include <stdbool.h>.
(__powl_helper): Allow negative argument X and scale negated value
as needed. Avoid passing value outside [-1, 1] to f2xm1.
* sysdeps/x86_64/fpu/e_powl.S (__ieee754_powl): Ensure possibly
overflowing or underflowing operations take place with sign of
result.
* sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]:
Include <math.h>.
* math/auto-libm-test-in: Add more tests of pow.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (pow_test): Use ALL_RM_TEST.
(pow_tonearest_test_data): Remove.
(pow_test_tonearest): Likewise.
(pow_towardzero_test_data): Likewise.
(pow_test_towardzero): Likewise.
(pow_downward_test_data): Likewise.
(pow_test_downward): Likewise.
(pow_upward_test_data): Likewise.
(pow_test_upward): Likewise.
(main): Don't call removed functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Errno is not set and the testcases will fail.
Now the scalbln-aliases are removed in i386/m68
and the wrappers are used when calling the scalbln-functions.
On ia64 only scalblnf has its own implementation.
For scalbln and scalblnl the ieee754/dbl-64 and ieee754/ldbl-96 are used, thus
the wrappers are needed, too.
This patch fixes few failures in nearbyintl() where the fraction part is
close to 0.5.i The new tests added report few extra failures in
nearbyint_downward and nearbyint_towardzero which is a known issue.
Fixes#17031.
As with other issues of this kind, bug 17042 is log2 (1) wrongly
returning -0 instead of +0 in round-downward mode because of
implementations effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log and log10.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of log2l is
essentially the same as the ldbl-128 one).
[BZ #17042]
* sysdeps/i386/fpu/e_log2.S (__ieee754_log2): Take absolete value
when x - 1 is zero.
* sysdeps/i386/fpu/e_log2f.S (__ieee754_log2f): Likewise.
* sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l):
Likewise.
* sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log2_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
As with various other issues of this kind, bug 16977 is log10 (1)
wrongly returning -0 rather than +0 in round-downward mode because of
an implementation effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of logl is
essentially the same as the ldbl-128 one).
[BZ #16977]
* sysdeps/i386/fpu/e_log10.S (__ieee754_log10): Take absolute
value when x - 1 is zero.
* sysdeps/i386/fpu/e_log10f.S (__ieee754_log10f): Likewise.
* sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l):
Likewise.
* sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log10_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16564 is spurious overflow of log1pl (LDBL_MAX) in FE_UPWARD mode,
resulting from log1pl adding 1 to its argument (for arguments not
close to 0), which overflows in that mode. This patch fixes this by
avoiding adding 1 to large arguments (precisely what counts as large
depends on the floating-point format).
Tested x86_64 and x86, and spot-checked log1pl tests on mips64 and
powerpc64.
[BZ #16564]
* sysdeps/i386/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive
arguments with exponent 65 or above.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Do not add 1 to
arguments 0x1p113L or above.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Do not add 1
to arguments 0x1p107L or above.
* sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Do not add 1 to
positive arguments with exponent 65 or above.
* math/auto-libm-test-in: Add more tests of log1p.
* math/auto-libm-test-out: Regenerated.
According to C99 and C11 Annex F, acosh (1) should be +0 in all
rounding modes. However, some implementations in glibc wrongly return
-0 in round-downward mode (which is what you get if you end up
computing log1p (-0), via 1 - 1 being -0 in round-downward mode).
This patch fixes the problem implementations, by correcting the test
for an exact 1 value in the ldbl-96 implementation to allow for the
explicit high bit of the mantissa, and by inserting fabs instructions
in the i386 implementations; tests of acosh are duly converted to
ALL_RM_TEST. I believe all the other sysdeps/ieee754 implementations
are already OK (I haven't checked the ia64 versions, but if buggy then
that will be obvious from the results of test runs after this patch is
in).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16927]
* sysdeps/i386/fpu/e_acosh.S (__ieee754_acosh): Use fabs on x-1
value.
* sysdeps/i386/fpu/e_acoshf.S (__ieee754_acoshf): Likewise.
* sysdeps/i386/fpu/e_acoshl.S (__ieee754_acoshl): Likewise.
* sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Correct
for explicit high bit of mantissa when testing for argument equal
to 1.
* math/libm-test.inc (acosh_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16516 reports spurious underflows from erf (for all floating-point
types), when the result is close to underflowing but does not actually
underflow.
erf (x) is about (2/sqrt(pi))*x for x close to 0, so there are
subnormal arguments for which it does not underflow. The various
implementations do (x + efx*x) (for efx = 2/sqrt(pi) - 1), for greater
accuracy than if just using a single multiplication by an
approximation to 2/sqrt(pi) (effectively, this way there are a few
more bits in the approximation to 2/sqrt(pi)). This can introduce
underflows when efx*x underflows even though the final result does
not, so a scaled calculation with 8*efx is done in these cases - but 8
is not a big enough scale factor to avoid all such underflows. 16 is
(any underflows with a scale factor of 16 would only occur when the
final result underflows), so this patch changes the code to use that
factor. Rather than recomputing all the values of the efx8 variable,
it is removed, leaving it to the compiler's constant folding to
compute 16*efx. As such scaling can also lose underflows when the
final scaling down happens to be exact, appropriate checks are added
to ensure underflow exceptions occur when required in such cases.
Tested x86_64 and x86; no ulps updates needed. Also spot-checked for
powerpc32 and mips64 to verify the changes to the ldbl-128ibm and
ldbl-128 implementations.
[BZ #16516]
* sysdeps/ieee754/dbl-64/s_erf.c (efx8): Remove variable.
(__erf): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/flt-32/s_erff.c (efx8): Remove variable.
(__erff): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-128/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-96/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* math/auto-libm-test-in: Add more tests of erf.
* math/auto-libm-test-out: Regenerated.
Besides fixing the bugzilla, this also fixes corner-cases where the high
and low double differ greatly in magnitude, and handles a denormal
input without resorting to a fp rescale.
[BZ #16740]
[BZ #16619]
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Rewrite.
* math/libm-test.inc (frexp_test_data): Add tests.
This patch fixes incorrect results from catan and catanh of certain
special inputs in round-downward mode (bug 16799), and incorrect
results of __ieee754_logf (+/-0) in round-downward mode (bug 16800)
that show up through catan/catanh when tested in all rounding modes,
but not directly in the testing for logf because the bug gets hidden
by the wrappers.
Both bugs involve a zero that should be +0 being -0 instead: one
computed as (1-x)*(1+x) in the catan/catanh case, and one as (x-x) in
the logf case. The fixes ensure positive zero is used. Testing of
catan and catanh in all rounding modes is duly enabled.
I expect there are various other bugs in special cases in __ieee754_*
functions that are normally hidden by the wrappers but would show up
for testing with -lieee (or in future with -fno-math-errno if we
replace -lieee and _LIB_VERSION with compile-time redirection to new
*_noerrno symbol names).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16799]
[BZ #16800]
* math/s_catan.c (__catan): Avoid passing -0 denominator to atan2
with 0 numerator.
* math/s_catanf.c (__catanf): Likewise.
* math/s_catanh.c (__catanh): Likewise.
* math/s_catanhf.c (__catanhf): Likewise.
* math/s_catanhl.c (__catanhl): Likewise.
* math/s_catanl.c (__catanl): Likewise.
* sysdeps/ieee754/flt-32/e_logf.c (__ieee754_logf): Always divide
by positive zero when computing -Inf result.
* math/libm-test.inc (catan_test): Use ALL_RM_TEST.
(catanh_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Fix for values near a power of two, and some tidies.
[BZ #16739]
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Correct
output when value is near a power of two. Use int64_t for lx and
remove casts. Use decimal rather than hex exponent constants.
Don't use long double multiplication when double will suffice.
* math/libm-test.inc (nextafter_test_data): Add tests.
* NEWS: Add 16739 and 16786 to bug list.
My recent exp patch introduced warnings about implicit __isinf
declarations in exp because e_exp.c didn't include <math.h>. This
patch fixes this. Because <math.h> can't be included after
<math_private.h> (because of macro definitions of __nan*), it was
necessary to put an include in sysdeps/x86_64/fpu/multiarch/e_exp.c as
well.
Tested x86_64.
* sysdeps/ieee754/dbl-64/e_exp.c: Include <math.h>.
* sysdeps/x86_64/fpu/multiarch/e_exp.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
The dbl-64 version of exp needs round-to-nearest mode for its internal
computations, but that has the consequence of inappropriate
overflowing and underflowing results in other rounding modes. This
patch fixes this by recomputing the relevant results in cases where
the round-to-nearest result overflows to infinity or underflows to
zero (most of the diffs are actually just consequent reindentation).
Tests are enabled in all rounding modes for complex functions using
exp - but not for cexp because it turns out there are bugs causing
spurious underflows for cexp for some tests, which will need to be
fixed separately (I suspect ccos ccosh csin csinh ctan ctanh have
similar bugs, just not shown by the present set of test inputs).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16284]
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Use original
rounding mode to recompute results that overflow to infinity or
underflow to zero.
* math/auto-libm-test-in: Don't mark tests as expected to fail for
bug 16284.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (ccos_test): Use ALL_RM_TEST.
(ccosh_test): Likewise.
(csin_test_data): Use plus_oflow.
(csin_test): Use ALL_RM_TEST.
(csinh_test_data): Use plus_oflow.
(csinh_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
According to ISO C Annex F, log (1) should be +0 in all rounding
modes, but some implementations in glibc wrongly return -0 in
round-downward mode (mapping to log1p (x - 1) is problematic because 1
- 1 is -0 in round-downward mode, and log1p (-0) is -0). This patch
fixes this. (It helps with some implementations of other functions
such as acosh, log2 and log10 that call out to log, but not enough to
enable all-rounding-modes testing for those functions without further
fixes to other implementations of them.)
Tested x86_64 and x86 and ulps updated accordingly, and did spot tests
for mips64 for the ldbl-128 fix, and i586 for the sysdeps/i386/fpu
implementations shadowed by those in sysdeps/i386/i686/fpu.
[BZ #16731]
* sysdeps/i386/fpu/e_log.S (__ieee754_log): Take absolute value
when x - 1 is zero.
* sysdeps/i386/fpu/e_logf.S (__ieee754_logf): Likewise.
* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Return +0 when
argument is 1.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/x86_64/fpu/e_logl.S: Take absolute value when x - 1 is
zero.
* math/libm-test.inc (log_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
ISO C requires the result of nextafter to be independent of the
rounding mode, even when underflow or overflow occurs. This patch
fixes the bug in various nextafter implementations that, having done
an overflowing computation to force an overflow exception (correct),
they then return the result of that computation rather than an
infinity computed some other way (incorrect, when the overflowing
result of arithmetic with that sign and rounding mode is finite but
the correct result is infinite) - generally by falling through to
existing code to return a value that in fact is correct for this case
(but was computed by an integer increment and so without generating
the exceptions required). Having fixed the bug, the previously
deferred conversion of nextafter testing in libm-test.inc to
ALL_RM_TEST is also included.
Tested x86_64 and x86; also spot-checked results of nextafter tests
for powerpc32 and mips64 to test the ldbl-128ibm and ldbl-128
changes. (The m68k change is untested.)
[BZ #16677]
* math/s_nextafter.c (__nextafter): Do not return value from
overflowing computation.
* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Likewise.
* sysdeps/ieee754/flt-32/s_nextafterf.c (__nextafterf): Likewise.
* sysdeps/ieee754/ldbl-128/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/m68k/m680x0/fpu/s_nextafterl.c (__nextafterl): Likewise.
* math/libm-test.inc (nextafter_test): Use ALL_RM_TEST.
In 84ba214c, I removed some redundant sign computations and in the
process, I incorrectly got rid of a temporary variable, thus passing
the absolute value of the input to bsloww1. This caused #16623.
This fix undoes the incorrect change.
The truncl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_truncl.S)
returns wrong results for some inputs where first double is a exact integer
and the precision is determined by second long double.
Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).
By just removing the implementation and make the build select
sysdeps/ieee754/ldbl-128ibm/s_truncl.c instead it fixes tgammal
issues regarding wrong result sign.
This patch fixes bug 16408, ldbl-128ibm expm1l returning NaN for some
large arguments.
The basic problem is that the approach of converting the exponent to
the form n * log(2) + y, where -0.5 <= y <= 0.5, then computing 2^n *
expm1(y) + (2^n - 1) falls over when 2^n overflows (starting slightly
before the point where expm1 overflows, when y is negative and n is
the least integer for which 2^n overflows). The ldbl-128 code, and
the x86/x86_64 code, make expm1l fall back to expl for large positive
arguments to avoid this issue. This patch makes the ldbl-128ibm code
do the same. (The problem appears for the particular argument in the
testsuite because the ldbl-128ibm code also uses an overflow threshold
that's for ldbl-128 and is too big for ldbl-128ibm, but the problem
described applies for large non-overflowing cases as well, although
during the freeze is not a suitable time for making the expm1 tests
cover cases close to overflow more thoroughly.)
This leaves some code for large positive arguments in expm1l that is
now dead. To keep the code for ldbl-128 and ldbl-128ibm similar, and
to avoid unnecessary changes during the freeze, the patch doesn't
remove it; instead I propose to file a bug in Bugzilla as a reminder
that this code (for overflow, including errno setting, and for
arguments of +Inf) is no longer needed and should be removed from both
those expm1l implementations.
Tested powerpc32.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __expl
for large positive arguments.
This patch fixes bug 16407, spurious overflows from ldbl-128ibm coshl.
The implementation assumed that a high part (reinterpreted as an
integer) of the absolute value of the argument of 0x408633ce8fb9f87dLL
or more meant overflow, but the actual threshold has high part
0x408633ce8fb9f87eLL (and a negative low part). The patch adjusts the
threshold accordingly.
sinhl probably has the same issue, but I didn't get that far in adding
tests of special cases (such as just below and above overflow) before
the freeze and during the freeze is not a suitable time to add them
(as they'd require ulps to be regenerated again), so I'm not changing
that function for now; when I add more tests of special cases, we'll
discover whether sinhl indeed has this problem.
Tested powerpc32.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Increase overflow threshold.
This patch fixes bug 16400, spurious underflow exceptions for ldbl-128
/ ldbl-128ibm lgammal with small positive arguments, by just using
-__logl (x) as the result in the problem cases (similar to the
previous fix for problems with small negative arguments).
Tested powerpc32, and also tested on mips64 that this does not require
ulps regeneration for the ldbl-128 case.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Return -__logl (x) for small positive arguments without evaluating
a polynomial.
This patch fixes bug 16386, ldbl-128ibm logl inaccuracy (with
consequent inaccuracy for lgammal) for arguments where the high double
is subnormal, which showed up while attempting to regenerate ulps for
powerpc-nofpu for 2.19. The problem here is logic failing to allow
for subnormals when calculating the exponent of the argument. Tested
for powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Adjust
numbers with subnormal high part when calculating exponent.
This patch fixes bug 16385, ldbl-128ibm asinhl inaccuracy, which
showed up while attempting to regenerate ulps for powerpc-nofpu for
2.19. The problem here was use of fabs instead of fabsl meaning large
arguments were reduced to the precision of double. Tested for
powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use fabsl not
fabs.
This patch fixes bug 16384, ldbl-128ibm acoshl inaccuracy, which
showed up while attempting to regenerate ulps for powerpc-nofpu for
2.19. There were two separate problems, use of __log1p instead of
__log1pl and an insufficiently accurate constant value for log 2
(which this patch replaces by use of M_LN2l), each of which could
cause substantial inaccuracy in affected cases.
Tested for powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (ln2): Initialize with
M_LN2l.
(__ieee754_acoshl): Use __log1pl not __log1p.
This patch fixes bug 16337, ldbl-128 lgammal spurious overflows for
small negative arguments (the arguments in question are already in the
testsuite). The implementation uses the reflection formula to compute
lgamma of negative x from lgamma of -x, effectively resulting in a
calculation -log(x^2) + log(-x); cancellation isn't problematic in
this case (bugs for problematic cancellation in lgamma are 2542, 2543,
2558), but the x^2 calculation can underflow (in which case there is
spurious logic to return an overflowing value - lgamma can only ever
correctly overflow for large positive arguments, though tgamma can
overflow for small arguments of either sign as well as large positive
arguments). The fix is simply to calculate the result directly with
logl when the argument is a small enough negative number.
Tested mips64.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Calculate results for small negative arguments directly rather
than using reflection formula with special underflow handling.
This patch consolidates the multiple copies of code that looks up sin
and cos of a number from the lookup table and computes the final
value, into static functions. This does not have a noticeable
performance impact since the functions are inlined by gcc.
There is further scope for consolidation in the functions but they
cause a more noticable impact on performance (>5%) due to which I have
held back on them.
Removed more redundant computations in the slow paths of the sin and
cos functions. The notable change is the passing of the most
significant bits of X to the slow functions to check if X is positive
so that just the absolute value of x can be passed and the repeated
ABS() operation is avoided.
There are multiple points in the code where the absolute value of a
number is computed multiple times or is computed even though the value
can only be positive. This change removes those redundant
computations. Tested on x86_64 to verify that there were no
regressions in the testsuite.
This patch fixes bug 16338, ldbl-128 logl not handling subnormals
(with consequent inaccuracy for lgammal as well). The fix is simply
to use __frexpl when determining the exponent, as done already in
log2l and log10l. Given the lack of testing of small arguments to any
of the log* functions, appropriate tests are added for all of them.
Tested x86_64 and x86 and ulps updated accordingly, and spot tests
also run for mips64 to confirm the ldbl-128 fix.
Note that while this fixes lgammal inaccuracy for small positive
arguments, I suspect that there will still be problems with spurious
underflows in that case.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Use __frexpl
to determine exponent and adjust argument to have exponent of -1.
* math/auto-libm-test-in: Add more tests of log, log10, log1p and
log2.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
- Remove redundant mynumber union definitions
- Clean up a clumsy ternary operator
- Rename TAYLOR_SINCOS to TAYLOR_SIN since we're only expanding the
sin Taylor series in it.
This patch helps some math functions performance by adding the libc_fexxx
variant of inline functions to handle both FPU round and exception set/restore
and by using them on the libc_fexxx_ctx functions. It is based on already coded
fexxx family functions for PPC with fpu.
Here is the summary of performance improvements due this patch (measured on a
POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
Autoconf has been deprecating configure.in for quite a long time.
Rename all our configure.in and preconfigure.in files to .ac.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Add systemtap probes to various slow paths in libm so that application
developers may use systemtap to find out if their applications are
hitting these slow paths. We have added probes for pow, exp, log,
tan, atan and atan2.
http://sourceware.org/ml/libc-alpha/2013-07/msg00197.html
A rewrite to make this code correct for little-endian.
* sysdeps/ieee754/ldbl-128ibm/e_sqrtl.c (mynumber): Replace
union 32-bit int array member with 64-bit int array.
(t515, tm256): Double rather than long double.
(__ieee754_sqrtl): Rewrite using 64-bit arithmetic.
http://sourceware.org/ml/libc-alpha/2013-08/msg00085.html
Rid ourselves of ieee854.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h (union ieee854_long_double):
Delete.
(IEEE854_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Don't include ieee854
version of math_ldbl.h.
http://sourceware.org/ml/libc-alpha/2013-08/msg00084.html
Another batch of ieee854 macros and union replacement. These four
files also have bugs fixed with this patch. The fact that the two
doubles in an IBM long double may have different signs means that
negation and absolute value operations can't just twiddle one sign bit
as you can with ieee864 style extended double. fmodl, remainderl,
erfl and erfcl all had errors of this type. erfl also returned +1 for
large magnitude negative input where it should return -1. The hypotl
error is innocuous since the value adjusted twice is only used as a
flag. The e_hypotl.c tests for large "a" and small "b" are mutually
exclusive because we've already exited when x/y > 2**120. That allows
some further small simplifications.
[BZ #15734], [BZ #15735]
* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Rewrite
all uses of ieee875 long double macros and unions. Simplify test
for 0.0L. Correct |x|<|y| and |x|=|y| test. Use
ldbl_extract_mantissa value for ix,iy exponents. Properly
normalize after ldbl_extract_mantissa, and don't add hidden bit
already handled. Don't treat low word of ieee854 mantissa like
low word of IBM long double and mask off bit when testing for
zero.
* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl): Rewrite
all uses of ieee875 long double macros and unions. Simplify tests
for 0.0L and inf. Correct double adjustment of k. Delete dead code
adjusting ha,hb. Simplify code setting kld. Delete two600 and
two1022, instead use their values. Recognise that tests for large
"a" and small "b" are mutually exclusive. Rename vars. Comment.
* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c (__ieee754_remainderl):
Rewrite all uses of ieee875 long double macros and unions. Simplify
test for 0.0L and nan. Correct negation.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfl): Rewrite all uses of
ieee875 long double macros and unions. Correct output for large
magnitude x. Correct absolute value calculation.
(__erfcl): Likewise.
* math/libm-test.inc: Add tests for errors discovered in IBM long
double versions of fmodl, remainderl, erfl and erfcl.
http://sourceware.org/ml/libc-alpha/2013-08/msg00083.html
Further replacement of ieee854 macros and unions. These files also
have some optimisations for comparison against 0.0L, infinity and nan.
Since the ABI specifies that the high double of an IBM long double
pair is the value rounded to double, a high double of 0.0 means the
low double must also be 0.0. The ABI also says that infinity and
nan are encoded in the high double, with the low double unspecified.
This means that tests for 0.0L, +/-Infinity and +/-NaN need only check
the high double.
* sysdeps/ieee754/ldbl-128ibm/e_atan2l.c (__ieee754_atan2l): Rewrite
all uses of ieee854 long double macros and unions. Simplify tests
for long doubles that are fully specified by the high double.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_isinf_nsl.c (__isinf_nsl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_isinfl.c (___isinfl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Likewise.
Comment on variable precision.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Adjust tan_towardzero ulps.
http://sourceware.org/ml/libc-alpha/2013-08/msg00081.html
This is the first of a series of patches to ban ieee854_long_double
and the ieee854_long_double macros when using IBM long double. union
ieee854_long_double just isn't correct for IBM long double, especially
when little-endian, and pretending it is OK has allowed a number of
bugs to remain undetected in sysdeps/ieee754/ldbl-128ibm/.
This changes the few places in generic code that use it.
* stdio-common/printf_size.c (__printf_size): Don't use
union ieee854_long_double in fpnum union.
* stdio-common/printf_fphex.c (__printf_fphex): Likewise. Use
signbit macro to retrieve sign from long double.
* stdio-common/printf_fp.c (___printf_fp): Use signbit macro to
retrieve sign from long double.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Adjust for fpnum change.
* sysdeps/ieee754/ldbl-128/printf_fphex.c: Likewise.
* sysdeps/ieee754/ldbl-96/printf_fphex.c: Likewise.
* sysdeps/x86_64/fpu/printf_fphex.c: Likewise.
* math/test-misc.c (main): Don't use union ieee854_long_double.
ports/
* sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change.
http://sourceware.org/ml/libc-alpha/2013-06/msg00919.html
I discovered a number of places where denormals and other corner cases
were being handled wrongly.
- printf_fphex.c: Testing for the low double exponent being zero is
unnecessary. If the difference in exponents is less than 53 then the
high double exponent must be nearing the low end of its range, and the
low double exponent hit rock bottom.
- ldbl2mpn.c: A denormal (ie. exponent of zero) value is treated as
if the exponent was one, so shift mantissa left by one. Code handling
normalisation of the low double mantissa lacked a test for shift count
greater than bits in type being shifted, and lacked anything to handle
the case where the difference in exponents is less than 53 as in
printf_fphex.c.
- math_ldbl.h (ldbl_extract_mantissa): Same as above, but worse, with
code testing for exponent > 1 for some reason, probably a typo for >= 1.
- math_ldbl.h (ldbl_insert_mantissa): Round the high double as per
mpn2ldbl.c (hi is odd or explicit mantissas non-zero) so that the
number we return won't change when applying ldbl_canonicalize().
Add missing overflow checks and normalisation of high mantissa.
Correct misleading comment: "The hidden bit of the lo mantissa is
zero" is not always true as can be seen from the code rounding the hi
mantissa. Also by inspection, lzcount can never be less than zero so
remove that test. Lastly, masking bitfields to their widths can be
left to the compiler.
- mpn2ldbl.c: The overflow checks here on rounding of high double were
just plain wrong. Incrementing the exponent must be accompanied by a
shift right of the mantissa to keep the value unchanged. Above notes
for ldbl_insert_mantissa are also relevant.
[BZ #15680]
* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c: Comment fix.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
(PRINT_FPHEX_LONG_DOUBLE): Tidy code by moving -53 into ediff
calculation. Remove unnecessary test for denormal exponent.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c (__mpn_extract_long_double):
Correct handling of denormals. Avoid undefined shift behaviour.
Correct normalisation of low mantissa when low double is denormal.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
(ldbl_extract_mantissa): Likewise. Comment. Use uint64_t* for hi64.
(ldbl_insert_mantissa): Make both hi64 and lo64 parms uint64_t.
Correct normalisation of low mantissa. Test for overflow of high
mantissa and normalise.
(ldbl_nearbyint): Use more readable constant for two52.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
(__mpn_construct_long_double): Fix test for overflow of high
mantissa and correct normalisation. Avoid undefined shift.
http://sourceware.org/ml/libc-alpha/2013-07/msg00001.html
This patch starts the process of supporting powerpc64 little-endian
long double in glibc. IBM long double is an array of two ieee
doubles, so making union ibm_extended_long_double reflect this fact is
the correct way to access fields of the doubles.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
(union ibm_extended_long_double): Define as an array of ieee754_double.
(IBM_EXTENDED_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Update all references
to ibm_extended_long_double and IBM_EXTENDED_LONG_DOUBLE_BIAS.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Likewise.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise.
The compiler is smart enough to convert those into double for powerpc,
but if we put them as doubles, it adds overhead by performing those
operations in floating point mode.
The mantissa of mp_no is intended to take only integral values. This
is a relatively good choice for powerpc due to its 4 fpus, but not for
other architectures, which suffer due to this choice. This change
makes the default mantissa a long integer and allows powerpc to
override it. Additionally, some operations have been optimized for
integer manipulation, resulting in a significant improvement in
performance.
The patch increase the high value to check if expl overflows. Current
high mark value is not really correct, the algorithm accepts high values.
It also adds a correct wrapper function to check for overflow and underflow.