This patch fixes bugs in ldbl-128ibm frexpl for 32-bit systems shown
up by warnings:
../sysdeps/ieee754/ldbl-128ibm/s_frexpl.c:82:4: warning: left shift count >= width of type
../sysdeps/ieee754/ldbl-128ibm/s_frexpl.c:129:5: warning: left shift count >= width of type
This did in fact show up in test-ldouble.out (alongside all the other
problems there ... maybe we should again consider running the libm
tests at finer granularity from the makefiles) as already covered by
the testsuite after the previous patch that fixed these bugs for
64-bit systems. The fix is simply using 1LL instead of 1L when
shifting by 52.
Tested for powerpc32 (soft float).
[BZ #16619]
[BZ #16740]
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Use 1LL << 52
instead of 1L << 52.
This patch fixes bugs 16561 and 16562, bad results of yn in overflow
cases in non-default rounding modes, both because an intermediate
overflow in the recurrence does not get detected if the result is not
an infinity and because an overflowing result may occur in the wrong
sign. The fix is to set FE_TONEAREST mode internally for the parts of
the function where such overflows can occur (which includes the call
to y1 - where yn is used to compute a Bessel function of order -1,
negating the result of y1 isn't correct for overflowing results in
directed rounding modes) and then compute an overflowing value in the
original rounding mode if the to-nearest result was an infinity.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to test the ldbl-128 and ldbl-128ibm changes.
(The tests for these bugs were added in my previous y1 patch, so the
only thing this patch has to do with the testsuite is enable yn
testing in all rounding modes.)
[BZ #16561]
[BZ #16562]
* sysdeps/ieee754/dbl-64/e_jn.c: Include <float.h>.
(__ieee754_yn): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/flt-32/e_jnf.c: Include <float.h>.
(__ieee754_ynf): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-128/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-96/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/i386/fpu/fenv_private.h [!__SSE2_MATH__]
(libc_feholdsetround_ctx): New macro.
* math/libm-test.inc (yn_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps : Likewise.
This patch fixes spurious underflows from exp10 for arguments near 0
(part of bug 16560; that bug also includes spurious underflows from
exp2, which are not fixed by this patch). The problem is underflows
in the internal computation converting the exp10 argument to arguments
for exp (with extra precision), and the fix is simply to return 1
early for arguments near enough to 0 (just as arguments with large
enough magnitude have their own overflow / underflow logic at the
start of the function).
Tested x86_64 and x86 and ulps updated accordingly; also tested for
powerpc32 and mips64 to validate the ldbl-128ibm and ldbl-128 changes.
[BZ #16560]
* sysdeps/ieee754/dbl-64/e_exp10.c (__ieee754_exp10): Return 1 for
arguments close to 0.
* sysdeps/ieee754/ldbl-128/e_exp10l.c (__ieee754_exp10l):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c (__ieee754_exp10l):
Likewise.
* math/auto-libm-test-in: Add more tests of exp10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
This patch fixes bug 16354, spurious underflows from cosh when a tiny
argument is passed to expm1 and expm1 correctly underflows although
the final result of cosh should be 1. As noted in that bug, some
cases are latent because of expm1 implementations not raising
underflow (bug 16353), but all the implementations are fixed
similarly. They already contained checks for tiny arguments, but the
checks were too late to avoid underflow from expm1 (although they
would avoid underflow from subsequent squaring of the result of
expm1); they are moved before the expm1 calls.
The thresholds used for considering arguments tiny are not
particularly consistent in how they relate to the precision of the
floating-point format in question. They are, however, all sufficient
to ensure that the round-to-nearest result of cosh is indeed 1 below
the threshold (although sometimes they are smaller than necessary).
But the previous logic did not return 1, but the previously computed 1
+ expm1(abs(x)) value. And the thresholds in the ldbl-128 and
ldbl-128ibm code (0x1p-71L - I suspect 0x3f8b was intended in the code
instead of 0x3fb8 - and (roughly) 0x1p-55L) are not sufficient for
that value to be 1. So by moving the test for tiny arguments, and
consequently returning 1 directly now the expm1 value hasn't been
computed by that point, this patch also fixes bug 17061, the (large
number of ulps) inaccuracy for small arguments in those
implementations. Tests for that bug are duly added.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to validate the ldbl-128 and ldbl-128ibm changes.
[BZ #16354]
[BZ #17061]
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Check for
small arguments before calling __expm1.
* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Check for
small arguments before calling __expm1f.
* sysdeps/ieee754/ldbl-128/e_coshl.c (__ieee754_coshl): Check for
small arguments before calling __expm1l.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Likewise.
* sysdeps/ieee754/ldbl-96/e_coshl.c (__ieee754_coshl): Likewise.
* math/auto-libm-test-in: Add more cosh tests. Do not allow
spurious underflow for some cosh tests.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
Errno is not set and the testcases will fail.
Now the scalbln-aliases are removed in i386/m68
and the wrappers are used when calling the scalbln-functions.
On ia64 only scalblnf has its own implementation.
For scalbln and scalblnl the ieee754/dbl-64 and ieee754/ldbl-96 are used, thus
the wrappers are needed, too.
This patch fixes few failures in nearbyintl() where the fraction part is
close to 0.5.i The new tests added report few extra failures in
nearbyint_downward and nearbyint_towardzero which is a known issue.
Fixes#17031.
As with other issues of this kind, bug 17042 is log2 (1) wrongly
returning -0 instead of +0 in round-downward mode because of
implementations effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log and log10.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of log2l is
essentially the same as the ldbl-128 one).
[BZ #17042]
* sysdeps/i386/fpu/e_log2.S (__ieee754_log2): Take absolete value
when x - 1 is zero.
* sysdeps/i386/fpu/e_log2f.S (__ieee754_log2f): Likewise.
* sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l):
Likewise.
* sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log2_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
As with various other issues of this kind, bug 16977 is log10 (1)
wrongly returning -0 rather than +0 in round-downward mode because of
an implementation effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of logl is
essentially the same as the ldbl-128 one).
[BZ #16977]
* sysdeps/i386/fpu/e_log10.S (__ieee754_log10): Take absolute
value when x - 1 is zero.
* sysdeps/i386/fpu/e_log10f.S (__ieee754_log10f): Likewise.
* sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l):
Likewise.
* sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log10_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16564 is spurious overflow of log1pl (LDBL_MAX) in FE_UPWARD mode,
resulting from log1pl adding 1 to its argument (for arguments not
close to 0), which overflows in that mode. This patch fixes this by
avoiding adding 1 to large arguments (precisely what counts as large
depends on the floating-point format).
Tested x86_64 and x86, and spot-checked log1pl tests on mips64 and
powerpc64.
[BZ #16564]
* sysdeps/i386/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive
arguments with exponent 65 or above.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Do not add 1 to
arguments 0x1p113L or above.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Do not add 1
to arguments 0x1p107L or above.
* sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Do not add 1 to
positive arguments with exponent 65 or above.
* math/auto-libm-test-in: Add more tests of log1p.
* math/auto-libm-test-out: Regenerated.
Bug 16516 reports spurious underflows from erf (for all floating-point
types), when the result is close to underflowing but does not actually
underflow.
erf (x) is about (2/sqrt(pi))*x for x close to 0, so there are
subnormal arguments for which it does not underflow. The various
implementations do (x + efx*x) (for efx = 2/sqrt(pi) - 1), for greater
accuracy than if just using a single multiplication by an
approximation to 2/sqrt(pi) (effectively, this way there are a few
more bits in the approximation to 2/sqrt(pi)). This can introduce
underflows when efx*x underflows even though the final result does
not, so a scaled calculation with 8*efx is done in these cases - but 8
is not a big enough scale factor to avoid all such underflows. 16 is
(any underflows with a scale factor of 16 would only occur when the
final result underflows), so this patch changes the code to use that
factor. Rather than recomputing all the values of the efx8 variable,
it is removed, leaving it to the compiler's constant folding to
compute 16*efx. As such scaling can also lose underflows when the
final scaling down happens to be exact, appropriate checks are added
to ensure underflow exceptions occur when required in such cases.
Tested x86_64 and x86; no ulps updates needed. Also spot-checked for
powerpc32 and mips64 to verify the changes to the ldbl-128ibm and
ldbl-128 implementations.
[BZ #16516]
* sysdeps/ieee754/dbl-64/s_erf.c (efx8): Remove variable.
(__erf): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/flt-32/s_erff.c (efx8): Remove variable.
(__erff): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-128/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-96/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* math/auto-libm-test-in: Add more tests of erf.
* math/auto-libm-test-out: Regenerated.
Besides fixing the bugzilla, this also fixes corner-cases where the high
and low double differ greatly in magnitude, and handles a denormal
input without resorting to a fp rescale.
[BZ #16740]
[BZ #16619]
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Rewrite.
* math/libm-test.inc (frexp_test_data): Add tests.
Fix for values near a power of two, and some tidies.
[BZ #16739]
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Correct
output when value is near a power of two. Use int64_t for lx and
remove casts. Use decimal rather than hex exponent constants.
Don't use long double multiplication when double will suffice.
* math/libm-test.inc (nextafter_test_data): Add tests.
* NEWS: Add 16739 and 16786 to bug list.
ISO C requires the result of nextafter to be independent of the
rounding mode, even when underflow or overflow occurs. This patch
fixes the bug in various nextafter implementations that, having done
an overflowing computation to force an overflow exception (correct),
they then return the result of that computation rather than an
infinity computed some other way (incorrect, when the overflowing
result of arithmetic with that sign and rounding mode is finite but
the correct result is infinite) - generally by falling through to
existing code to return a value that in fact is correct for this case
(but was computed by an integer increment and so without generating
the exceptions required). Having fixed the bug, the previously
deferred conversion of nextafter testing in libm-test.inc to
ALL_RM_TEST is also included.
Tested x86_64 and x86; also spot-checked results of nextafter tests
for powerpc32 and mips64 to test the ldbl-128ibm and ldbl-128
changes. (The m68k change is untested.)
[BZ #16677]
* math/s_nextafter.c (__nextafter): Do not return value from
overflowing computation.
* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Likewise.
* sysdeps/ieee754/flt-32/s_nextafterf.c (__nextafterf): Likewise.
* sysdeps/ieee754/ldbl-128/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/m68k/m680x0/fpu/s_nextafterl.c (__nextafterl): Likewise.
* math/libm-test.inc (nextafter_test): Use ALL_RM_TEST.
The truncl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_truncl.S)
returns wrong results for some inputs where first double is a exact integer
and the precision is determined by second long double.
Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).
By just removing the implementation and make the build select
sysdeps/ieee754/ldbl-128ibm/s_truncl.c instead it fixes tgammal
issues regarding wrong result sign.
This patch fixes bug 16408, ldbl-128ibm expm1l returning NaN for some
large arguments.
The basic problem is that the approach of converting the exponent to
the form n * log(2) + y, where -0.5 <= y <= 0.5, then computing 2^n *
expm1(y) + (2^n - 1) falls over when 2^n overflows (starting slightly
before the point where expm1 overflows, when y is negative and n is
the least integer for which 2^n overflows). The ldbl-128 code, and
the x86/x86_64 code, make expm1l fall back to expl for large positive
arguments to avoid this issue. This patch makes the ldbl-128ibm code
do the same. (The problem appears for the particular argument in the
testsuite because the ldbl-128ibm code also uses an overflow threshold
that's for ldbl-128 and is too big for ldbl-128ibm, but the problem
described applies for large non-overflowing cases as well, although
during the freeze is not a suitable time for making the expm1 tests
cover cases close to overflow more thoroughly.)
This leaves some code for large positive arguments in expm1l that is
now dead. To keep the code for ldbl-128 and ldbl-128ibm similar, and
to avoid unnecessary changes during the freeze, the patch doesn't
remove it; instead I propose to file a bug in Bugzilla as a reminder
that this code (for overflow, including errno setting, and for
arguments of +Inf) is no longer needed and should be removed from both
those expm1l implementations.
Tested powerpc32.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __expl
for large positive arguments.
This patch fixes bug 16407, spurious overflows from ldbl-128ibm coshl.
The implementation assumed that a high part (reinterpreted as an
integer) of the absolute value of the argument of 0x408633ce8fb9f87dLL
or more meant overflow, but the actual threshold has high part
0x408633ce8fb9f87eLL (and a negative low part). The patch adjusts the
threshold accordingly.
sinhl probably has the same issue, but I didn't get that far in adding
tests of special cases (such as just below and above overflow) before
the freeze and during the freeze is not a suitable time to add them
(as they'd require ulps to be regenerated again), so I'm not changing
that function for now; when I add more tests of special cases, we'll
discover whether sinhl indeed has this problem.
Tested powerpc32.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Increase overflow threshold.
This patch fixes bug 16386, ldbl-128ibm logl inaccuracy (with
consequent inaccuracy for lgammal) for arguments where the high double
is subnormal, which showed up while attempting to regenerate ulps for
powerpc-nofpu for 2.19. The problem here is logic failing to allow
for subnormals when calculating the exponent of the argument. Tested
for powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Adjust
numbers with subnormal high part when calculating exponent.
This patch fixes bug 16385, ldbl-128ibm asinhl inaccuracy, which
showed up while attempting to regenerate ulps for powerpc-nofpu for
2.19. The problem here was use of fabs instead of fabsl meaning large
arguments were reduced to the precision of double. Tested for
powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use fabsl not
fabs.
This patch fixes bug 16384, ldbl-128ibm acoshl inaccuracy, which
showed up while attempting to regenerate ulps for powerpc-nofpu for
2.19. There were two separate problems, use of __log1p instead of
__log1pl and an insufficiently accurate constant value for log 2
(which this patch replaces by use of M_LN2l), each of which could
cause substantial inaccuracy in affected cases.
Tested for powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (ln2): Initialize with
M_LN2l.
(__ieee754_acoshl): Use __log1pl not __log1p.
This patch helps some math functions performance by adding the libc_fexxx
variant of inline functions to handle both FPU round and exception set/restore
and by using them on the libc_fexxx_ctx functions. It is based on already coded
fexxx family functions for PPC with fpu.
Here is the summary of performance improvements due this patch (measured on a
POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
http://sourceware.org/ml/libc-alpha/2013-07/msg00197.html
A rewrite to make this code correct for little-endian.
* sysdeps/ieee754/ldbl-128ibm/e_sqrtl.c (mynumber): Replace
union 32-bit int array member with 64-bit int array.
(t515, tm256): Double rather than long double.
(__ieee754_sqrtl): Rewrite using 64-bit arithmetic.
http://sourceware.org/ml/libc-alpha/2013-08/msg00085.html
Rid ourselves of ieee854.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h (union ieee854_long_double):
Delete.
(IEEE854_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Don't include ieee854
version of math_ldbl.h.
http://sourceware.org/ml/libc-alpha/2013-08/msg00084.html
Another batch of ieee854 macros and union replacement. These four
files also have bugs fixed with this patch. The fact that the two
doubles in an IBM long double may have different signs means that
negation and absolute value operations can't just twiddle one sign bit
as you can with ieee864 style extended double. fmodl, remainderl,
erfl and erfcl all had errors of this type. erfl also returned +1 for
large magnitude negative input where it should return -1. The hypotl
error is innocuous since the value adjusted twice is only used as a
flag. The e_hypotl.c tests for large "a" and small "b" are mutually
exclusive because we've already exited when x/y > 2**120. That allows
some further small simplifications.
[BZ #15734], [BZ #15735]
* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Rewrite
all uses of ieee875 long double macros and unions. Simplify test
for 0.0L. Correct |x|<|y| and |x|=|y| test. Use
ldbl_extract_mantissa value for ix,iy exponents. Properly
normalize after ldbl_extract_mantissa, and don't add hidden bit
already handled. Don't treat low word of ieee854 mantissa like
low word of IBM long double and mask off bit when testing for
zero.
* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl): Rewrite
all uses of ieee875 long double macros and unions. Simplify tests
for 0.0L and inf. Correct double adjustment of k. Delete dead code
adjusting ha,hb. Simplify code setting kld. Delete two600 and
two1022, instead use their values. Recognise that tests for large
"a" and small "b" are mutually exclusive. Rename vars. Comment.
* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c (__ieee754_remainderl):
Rewrite all uses of ieee875 long double macros and unions. Simplify
test for 0.0L and nan. Correct negation.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfl): Rewrite all uses of
ieee875 long double macros and unions. Correct output for large
magnitude x. Correct absolute value calculation.
(__erfcl): Likewise.
* math/libm-test.inc: Add tests for errors discovered in IBM long
double versions of fmodl, remainderl, erfl and erfcl.
http://sourceware.org/ml/libc-alpha/2013-08/msg00083.html
Further replacement of ieee854 macros and unions. These files also
have some optimisations for comparison against 0.0L, infinity and nan.
Since the ABI specifies that the high double of an IBM long double
pair is the value rounded to double, a high double of 0.0 means the
low double must also be 0.0. The ABI also says that infinity and
nan are encoded in the high double, with the low double unspecified.
This means that tests for 0.0L, +/-Infinity and +/-NaN need only check
the high double.
* sysdeps/ieee754/ldbl-128ibm/e_atan2l.c (__ieee754_atan2l): Rewrite
all uses of ieee854 long double macros and unions. Simplify tests
for long doubles that are fully specified by the high double.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_isinf_nsl.c (__isinf_nsl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_isinfl.c (___isinfl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Likewise.
Comment on variable precision.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Adjust tan_towardzero ulps.
http://sourceware.org/ml/libc-alpha/2013-08/msg00081.html
This is the first of a series of patches to ban ieee854_long_double
and the ieee854_long_double macros when using IBM long double. union
ieee854_long_double just isn't correct for IBM long double, especially
when little-endian, and pretending it is OK has allowed a number of
bugs to remain undetected in sysdeps/ieee754/ldbl-128ibm/.
This changes the few places in generic code that use it.
* stdio-common/printf_size.c (__printf_size): Don't use
union ieee854_long_double in fpnum union.
* stdio-common/printf_fphex.c (__printf_fphex): Likewise. Use
signbit macro to retrieve sign from long double.
* stdio-common/printf_fp.c (___printf_fp): Use signbit macro to
retrieve sign from long double.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Adjust for fpnum change.
* sysdeps/ieee754/ldbl-128/printf_fphex.c: Likewise.
* sysdeps/ieee754/ldbl-96/printf_fphex.c: Likewise.
* sysdeps/x86_64/fpu/printf_fphex.c: Likewise.
* math/test-misc.c (main): Don't use union ieee854_long_double.
ports/
* sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change.
http://sourceware.org/ml/libc-alpha/2013-06/msg00919.html
I discovered a number of places where denormals and other corner cases
were being handled wrongly.
- printf_fphex.c: Testing for the low double exponent being zero is
unnecessary. If the difference in exponents is less than 53 then the
high double exponent must be nearing the low end of its range, and the
low double exponent hit rock bottom.
- ldbl2mpn.c: A denormal (ie. exponent of zero) value is treated as
if the exponent was one, so shift mantissa left by one. Code handling
normalisation of the low double mantissa lacked a test for shift count
greater than bits in type being shifted, and lacked anything to handle
the case where the difference in exponents is less than 53 as in
printf_fphex.c.
- math_ldbl.h (ldbl_extract_mantissa): Same as above, but worse, with
code testing for exponent > 1 for some reason, probably a typo for >= 1.
- math_ldbl.h (ldbl_insert_mantissa): Round the high double as per
mpn2ldbl.c (hi is odd or explicit mantissas non-zero) so that the
number we return won't change when applying ldbl_canonicalize().
Add missing overflow checks and normalisation of high mantissa.
Correct misleading comment: "The hidden bit of the lo mantissa is
zero" is not always true as can be seen from the code rounding the hi
mantissa. Also by inspection, lzcount can never be less than zero so
remove that test. Lastly, masking bitfields to their widths can be
left to the compiler.
- mpn2ldbl.c: The overflow checks here on rounding of high double were
just plain wrong. Incrementing the exponent must be accompanied by a
shift right of the mantissa to keep the value unchanged. Above notes
for ldbl_insert_mantissa are also relevant.
[BZ #15680]
* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c: Comment fix.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
(PRINT_FPHEX_LONG_DOUBLE): Tidy code by moving -53 into ediff
calculation. Remove unnecessary test for denormal exponent.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c (__mpn_extract_long_double):
Correct handling of denormals. Avoid undefined shift behaviour.
Correct normalisation of low mantissa when low double is denormal.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
(ldbl_extract_mantissa): Likewise. Comment. Use uint64_t* for hi64.
(ldbl_insert_mantissa): Make both hi64 and lo64 parms uint64_t.
Correct normalisation of low mantissa. Test for overflow of high
mantissa and normalise.
(ldbl_nearbyint): Use more readable constant for two52.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
(__mpn_construct_long_double): Fix test for overflow of high
mantissa and correct normalisation. Avoid undefined shift.
http://sourceware.org/ml/libc-alpha/2013-07/msg00001.html
This patch starts the process of supporting powerpc64 little-endian
long double in glibc. IBM long double is an array of two ieee
doubles, so making union ibm_extended_long_double reflect this fact is
the correct way to access fields of the doubles.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
(union ibm_extended_long_double): Define as an array of ieee754_double.
(IBM_EXTENDED_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Update all references
to ibm_extended_long_double and IBM_EXTENDED_LONG_DOUBLE_BIAS.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Likewise.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise.
The patch increase the high value to check if expl overflows. Current
high mark value is not really correct, the algorithm accepts high values.
It also adds a correct wrapper function to check for overflow and underflow.