sysdeps/i386/fpu/e_atanh.S, unlike all other functions in that
directory, loads the PIC register with its own code using
_GLOBAL_OFFSET_TABLE_, rather than with the LOAD_PIC_REG macro. I see
no good reason for the difference; this patch makes it use the common
macro.
Tested for x86.
* sysdeps/i386/fpu/e_atanh.S (__ieee754_atanh) [PIC]: Use
LOAD_PIC_REG.
This patch refactors code in sysdeps/i386/fpu that forces underflow
exceptions to use common macros for that purpose as far as possible.
(Although some of the macros end up used in only one place, I think
it's cleanest to define all these macros together so that all the code
forcing underflow uses such macros. Some more uses of such macros
will also be introduced when fixing remaining bugs about missing
underflow exceptions, and it would be possible to do further
refactoring of the macros in i386-math-asm.h to share more code by
using other macros internally. Places that test for underflow by
examining the representation of the argument with integer operations,
rather that using floating-point comparisons on the argument or
result, are unchanged by this patch.)
Most of this code uses a macro MO to abstract away the differences
between PIC and non-PIC memory references to constants. log1p
functions, however, hardcoded PIC conditionals for this. Because the
common macros rely on the use of MO, I changed the log1p functions to
use the normal style here, and, for consistency, also made that change
to log1pl which is otherwise unaffected by this patch.
Tested for x86.
* sysdeps/i386/fpu/i386-math-asm.h (DEFINE_LDBL_MIN): New macro.
(FLT_CHECK_FORCE_UFLOW): Likewise.
(DBL_CHECK_FORCE_UFLOW): Likewise.
(FLT_CHECK_FORCE_UFLOW_NARROW): Likewise.
(DBL_CHECK_FORCE_UFLOW_NARROW): Likewise.
(LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN): Likewise.
(FLT_CHECK_FORCE_UFLOW_NONNAN): Likewise.
(DBL_CHECK_FORCE_UFLOW_NONNAN): Likewise.
(FLT_CHECK_FORCE_UFLOW_NONNEG): Likewise.
(DBL_CHECK_FORCE_UFLOW_NONNEG): Likewise.
(LDBL_CHECK_FORCE_UFLOW_NONNEG): Likewise.
* sysdeps/i386/fpu/e_asin.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_asin): Use DBL_CHECK_FORCE_UFLOW.
* sysdeps/i386/fpu/e_asinf.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_asinf): Use FLT_CHECK_FORCE_UFLOW.
* sysdeps/i386/fpu/e_atan2.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_atan2): Use DBL_CHECK_FORCE_UFLOW_NARROW.
* sysdeps/i386/fpu/e_atan2f.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_atan2f): Use FLT_CHECK_FORCE_UFLOW_NARROW.
* sysdeps/i386/fpu/e_atanh.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_atanh): Use DBL_CHECK_FORCE_UFLOW_NONNEG.
* sysdeps/i386/fpu/e_atanhf.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_atanhf): Use FLT_CHECK_FORCE_UFLOW_NONNEG.
* sysdeps/i386/fpu/e_exp2l.S: Include <i386-math-asm.h>.
(ldbl_min): Replace with use of DEFINE_LDBL_MIN.
(__ieee754_exp2l): Use LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_expl.S: Include <i386-math-asm.h>.
[!USE_AS_EXPM1L] (cmin): Replace with use of DEFINE_LDBL_MIN.
(IEEE754_EXPL): Use LDBL_CHECK_FORCE_UFLOW_NONNEG.
* sysdeps/i386/fpu/s_atan.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__atan): Use DBL_CHECK_FORCE_UFLOW.
* sysdeps/i386/fpu/s_atanf.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__atanf): Use FLT_CHECK_FORCE_UFLOW.
* sysdeps/i386/fpu/s_expm1.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__expm1): Use DBL_CHECK_FORCE_UFLOW. Move underflow check after
main computation.
* sysdeps/i386/fpu/s_expm1f.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__expm1f): Use FLT_CHECK_FORCE_UFLOW. Move underflow check after
main computation.
* sysdeps/i386/fpu/s_log1p.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(MO): New macro.
(__log1p): Use MO. Use DBL_CHECK_FORCE_UFLOW_NONNAN.
* sysdeps/i386/fpu/s_log1pf.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(MO): New macro.
(__log1pf): Use MO. Use FLT_CHECK_FORCE_UFLOW_NONNAN.
* sysdeps/i386/fpu/s_log1pl.S (MO): New macro.
(__log1pl): Use MO.
The x86_64 fma4 version of pow fails to disable contraction of
operations other than those explicitly intended to use fma
instructions, so resulting in large ulps errors on processors with
fma4 instructions, as in bug 18104 (165ulp for the test added for that
bug; error originally reported by "blaaa" on #glibc). This patch adds
$(config-cflags-nofma) for e_pow-fma4.c, corresponding to the use for
e_pow.c in sysdeps/ieee754/dbl-64/Makefile.
Tested for x86_64 on a processor with fma4.
[BZ #19003]
* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma4.c): Add
$(config-cflags-nofma).
sysdeps/ieee754/flt-32/e_exp2f.c declares two variable as "static
const volatile float". Maybe this use of "volatile" was originally
intended to inhibit optimization of underflowing / overflowing
operations such as TWOM100 * TWOM100; in any case, it's not currently
needed, as given -frounding-math constant folding of such expressions
is properly disabled when it would be unsafe. This patch removes the
unnecessary use of "volatile".
Tested for x86_64.
* sysdeps/ieee754/flt-32/e_exp2f.c (TWOM100): Remove volatile.
(TWO127): Likewise.
Where glibc code needs to avoid excess range and precision in
floating-point arithmetic, code variously uses either asms or volatile
to force the results of that arithmetic to memory; mostly this is
conditional on FLT_EVAL_METHOD, but in the case of lrint / llrint
functions some use of volatile is unconditional (and is present
unnecessarily in versions for long double). This patch make such code
use the recently-added math_narrow_eval macro consistently, removing
the unnecessary uses of volatile in long double lrint / llrint
implementations completely.
Tested for x86_64, x86, mips64 and powerpc.
* math/s_nexttowardf.c (__nexttowardf): Use math_narrow_eval.
* stdlib/strtod_l.c: Include <math_private.h>.
(overflow_value): Use math_narrow_eval.
(underflow_value): Likewise.
* sysdeps/i386/fpu/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/i386/fpu/s_nexttowardf.c (__nexttowardf): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Likewise.
(__ieee754_gamma_r): Likewise.
* sysdeps/ieee754/dbl-64/gamma_productf.c (__gamma_productf):
Likewise.
* sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2):
Likewise.
* sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise.
* sysdeps/ieee754/dbl-64/s_erf.c (__erfc): Likewise.
* sysdeps/ieee754/dbl-64/s_llrint.c (__llrint): Likewise.
* sysdeps/ieee754/dbl-64/s_lrint.c (__lrint): Likewise.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
(__ieee754_gammaf_r): Likewise.
* sysdeps/ieee754/flt-32/k_rem_pio2f.c (__kernel_rem_pio2f):
Likewise.
* sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise.
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Likewise.
* sysdeps/ieee754/flt-32/s_llrintf.c (__llrintf): Likewise.
* sysdeps/ieee754/flt-32/s_lrintf.c (__lrintf): Likewise.
* sysdeps/ieee754/ldbl-128/s_llrintl.c (__llrintl): Do not use
volatile.
* sysdeps/ieee754/ldbl-128/s_lrintl.c (__lrintl): Likewise.
* sysdeps/ieee754/ldbl-128/s_nexttoward.c (__nexttoward): Use
math_narrow_eval.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-96/gamma_product.c (__gamma_product):
Likewise.
* sysdeps/ieee754/ldbl-96/s_llrintl.c (__llrintl): Do not use
volatile.
* sysdeps/ieee754/ldbl-96/s_lrintl.c (__lrintl): Likewise.
* sysdeps/ieee754/ldbl-96/s_nexttoward.c (__nexttoward): Use
math_narrow_eval.
* sysdeps/ieee754/ldbl-96/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c (__nldbl_nexttowardf):
Likewise.
i386 exp, hypot and pow functions can return overflowing and
underflowing values with excess range and precision; ; Wilco
Dijkstra's patches to make isfinite etc. expand inline cause this
pre-existing issue to result in test failures.
This patch fixes those functions to avoid excess range and precision
in their return values. Appropriate macros are added for the repeated
code sequences; in future I'll add more such macros and refactor
existing code forcing underflow (with or without also eliminating
excess range and precision from the return value) to use such macros.
Tested for x86. If, after this patch, you still see x86 libm test
failures with excess range or precision, please file bugs in Bugzilla.
[BZ #18980]
* sysdeps/i386/fpu/i386-math-asm.h (DEFINE_FLT_MIN): New macro.
(DEFINE_DBL_MIN): Likewise.
(FLT_NARROW_EVAL_UFLOW_NONNEG_NAN): Likewise.
(DBL_NARROW_EVAL_UFLOW_NONNEG_NAN): Likewise.
(FLT_NARROW_EVAL_UFLOW_NONNEG): Likewise.
(DBL_NARROW_EVAL_UFLOW_NONNEG): Likewise.
* sysdeps/i386/fpu/e_exp.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_exp): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
(__exp_finite): Use DBL_NARROW_EVAL_UFLOW_NONNEG.
* sysdeps/i386/fpu/e_exp10.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_exp10): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_exp10f.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_exp10f): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_exp2.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_exp2): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_exp2f.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_exp2f): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_expf.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_expf): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
(__expf_finite): Use FLT_NARROW_EVAL_UFLOW_NONNEG.
* sysdeps/i386/fpu/e_hypot.S: Include <i386-math-asm.h>.
(__ieee754_hypot): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/e_hypotf.S: Include <i386-math-asm.h>.
(__ieee754_hypotf): Use FLT_NARROW_EVAL.
* sysdeps/i386/fpu/e_pow.S: Include <i386-math-asm.h>.
(__ieee754_pow): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/e_powf.S: Include <i386-math-asm.h>.
(__ieee754_powf): Use FLT_NARROW_EVAL.
* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S
(__ieee754_expf_sse2): Convert double-precision result to single
precision.
* sysdeps/i386/fpu/libm-test-ulps: Update.
i386 scalb / scalbn / scalbln (and thus ldexp) functions for float and
double can return results with excess range (and consequently excess
precision for subnormal results). As the results of these functions
are fully determined by reference to IEEE 754 operations, this is
unambiguously a bug, apart from the testsuite failures it causes.
This patch makes those functions store their results on the stack and
load them back to eliminate the excess range. Double rounding is not
a problem, as the only cases where it could occur are when the result
overflows or underflows for extended precision, and then the
double-rounded results are the same as the single-rounded results.
The new macros will be used for more functions, more such macros
added, and existing code refactored to use such macros, in subsequent
patches.
Tested for x86. Committed.
[BZ #18981]
* sysdeps/i386/fpu/i386-math-asm.h: New file.
* sysdeps/i386/fpu/e_scalb.S: Include <i386-math-asm.h>.
(__ieee754_scalb): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/e_scalbf.S: Include <i386-math-asm.h>.
(__ieee754_scalbf): Use FLT_NARROW_EVAL.
* sysdeps/i386/fpu/s_scalbn.S: Include <i386-math-asm.h>.
(__scalbn): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/s_scalbnf.S: Include <i386-math-asm.h>.
(__scalbnf): Use FLT_NARROW_EVAL.
Various i386 libm functions return values with excess range and
precision; Wilco Dijkstra's patches to make isfinite etc. expand
inline cause this pre-existing issue to result in test failures (when
e.g. a result that overflows float but not long double gets counted as
overflowing for some purposes but not others).
This patch addresses those cases arising from functions defined in C,
adding a math_narrow_eval macro that forces values to memory to
eliminate excess precision if FLT_EVAL_METHOD indicates this is
needed, and is a no-op otherwise. I'll convert existing uses of
volatile and asm for this purpose to use the new macro later, once
i386 has clean test results again (which requires fixes for .S files
as well).
Tested for x86_64 and x86. Committed.
[BZ #18980]
* sysdeps/generic/math_private.h: Include <float.h>.
(math_narrow_eval): New macro.
[FLT_EVAL_METHOD != 0] (excess_precision): Likewise.
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Use
math_narrow_eval on overflowing return value.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r):
Likewise.
* sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise.
* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Likewise.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r):
Likewise.
* sysdeps/ieee754/flt-32/e_sinhf.c (__ieee754_sinhf): Likewise.
The logic in setjmp/__longjmp incorrectly uses "PIC" to figure out
whether the code is going into a shared library when it should be
using "SHARED". If you build glibc with a gcc version that has PIE
enabled by default, then the code will try to use symbols that are
only in the shared library.
URL: https://bugs.gentoo.org/336914
Since we require a new enough kernel all the time, the __ASSUME_FDATASYNC
define has been hardcoded to 1. That means we can delete the alpha file
for fdatasync now and rely on the syscalls list like other ports.
Bug 15384 notes that in __finite, two different constants are used
that could be the same constant (the result only depends on the
exponent of the floating-point representation), and that using the
same constant is better for architectures where constants need loading
from a constant pool. This patch implements that change.
Tested for x86_64, mips64 and powerpc.
[BZ #15384]
* sysdeps/ieee754/dbl-64/s_finite.c (FINITE): Use same constant as
bit-mask as in subtraction.
* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c (__finite):
Likewise.
* sysdeps/ieee754/flt-32/s_finitef.c (FINITEF): Likewise.
* sysdeps/ieee754/ldbl-128/s_finitel.c (__finitel): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_finitel.c (__finitel): Likewise.
Similar to various other bugs in this area, tgamma functions can fail
to raise the underflow exception when the result is tiny and inexact
but one or more low bits of the intermediate result that is scaled
down are zero. This patch forces the exception in a similar way to
previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18951]
* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Force
underflow exception for small results.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* math/auto-libm-test-in: Add more tests of tgamma.
* math/auto-libm-test-out: Regenerated.
As noted in bug 6803, scalbn fails to set errno on overflow and
underflow. This patch fixes this by making scalbn an alias of ldexp,
which has exactly the same semantics (for floating-point types with
radix 2) and already has wrappers that deal with setting errno,
instead of an alias of the internal __scalbn (which ldexp calls).
Notes:
* Where compat symbols were defined for scalbn functions, I didn't
change what they point to (to keep the patch minimal), so such
compat symbols continue to go directly to the non-errno-setting
functions.
* Mike, I didn't do anything with the IA64 versions of these
functions, where I think both the ldexp and scalbn functions already
deal with setting errno. As a cleanup (not needed to fix this bug)
however you might want to make those functions into aliases for
IA64; there is no need for them to be separate function
implementations at all.
* This concludes the fix for bug 6803 since the scalb and scalbln
cases of that bug were fixed some time ago.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #6803]
* math/s_ldexp.c (scalbn): Define as weak alias of __ldexp.
[NO_LONG_DOUBLE] (scalbnl): Define as weak alias of __ldexp.
* math/s_ldexpf.c (scalbnf): Define as weak alias of __ldexpf.
* math/s_ldexpl.c (scalbnl): Define as weak alias of __ldexpl.
* sysdeps/i386/fpu/s_scalbn.S (scalbn): Remove alias.
* sysdeps/i386/fpu/s_scalbnf.S (scalbnf): Likewise.
* sysdeps/i386/fpu/s_scalbnl.S (scalbnl): Likewise.
* sysdeps/ieee754/dbl-64/s_scalbn.c (scalbn): Likewise.
[NO_LONG_DOUBLE] (scalbnl): Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (scalbn):
Likewise.
[NO_LONG_DOUBLE] (scalbnl): Likewise.
* sysdeps/ieee754/flt-32/s_scalbnf.c (scalbnf): Likewise.
* sysdeps/ieee754/ldbl-128/s_scalbnl.c (scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (scalbnl): Remove
long_double_symbol calls.
* sysdeps/ieee754/ldbl-64-128/s_scalbnl.c (scalbnl): Likewise.
* sysdeps/ieee754/ldbl-opt/s_ldexpl.c (__ldexpl_2): Define as
strong alias of __ldexpl.
(scalbnl): Define using long_double_symbol.
* sysdeps/m68k/m680x0/fpu/s_scalbn.c (__CONCATX(scalbn,suffix)):
Remove alias.
* sysdeps/sparc/sparc64/soft-fp/s_scalbnl.c (scalbnl): Likewise.
* sysdeps/x86_64/fpu/s_scalbnl.S (scalbnl): Likewise.
* math/libm-test.inc (scalbn_test_data): Add errno expectations.
(scalbln_test_data): Add more errno expectations.
This way we do not
need to call the kernel just to get the port. Furthermore, we no
longer increase the reference count on every invocation of
`mach_host_self'.
* mach/mach/mach_traps.h (__mach_host_self, mach_host_self):
Protect declarations against the macro expansion.
* mach/mach_init.c (__mach_host_self_): New variable.
(mach_init): Initialize `__mach_host_self_'.
* mach/mach_init.h (__mach_host_self_): New declaration.
(__mach_host_self, mach_host_self): New macros.
* sysdeps/mach/hurd/dl-sysdep.c (_dl_sysdep_start_cleanup):
Release reference.
The ldbl-128 and ldbl-128ibm expm1l implementations have code to
handle +Inf and finite arguments above an overflow threshold. Since
they now use __expl for large positive arguments to fix other
problems, this code is unreachable; this patch removes it.
Tested for mips64 and powerpc.
[BZ #16415]
* sysdeps/ieee754/ldbl-128/s_expm1l.c (maxlog): Remove variable.
(__expm1l): Remove code to handle positive infinity and overflow.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (maxlog): Remove
variable.
(__expm1l): Remove code to handle positive infinity and overflow.
The ldbl-128ibm implementation of nearbyintl wrongly uses signaling
comparisons such as "if (fabs (u.d[0].d) < TWO52)" on arguments that
might be NaNs, when "invalid" exceptions should not be raised. (For
hard float, this issue may be hidden by
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684>, powerpc GCC
wrongly only using unordered comparison instructions.) This patch
fixes this by just returning the argument if it is not finite (because
of the arbitrary value of the low part of a NaN in IBM long double,
there are quite a lot of comparisons that could end up involving a NaN
when the argument to nearbyintl is a NaN, so excluding NaN arguments
at the start is the simplest and safest fix).
Tested for powerpc-nofpu, where it removes failures for spurious
"invalid" exceptions from nearbyintl.
[BZ #18857]
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c (__nearbyintl): Just
return non-finite argument without doing ordered comparisons on
it.
Bug 15918 points out that the handling of infinities in hypotf can be
simplified: it's enough to return the absolute value of the infinite
argument without first comparing it to the other argument and possibly
returning that other argument's absolute value. This patch makes that
cleanup (which should not change how hypotf behaves on any input).
Tested for x86_64.
[BZ #15918]
* sysdeps/ieee754/flt-32/e_hypotf.c (__ieee754_hypotf): Simplify
handling of cases where one argument is an infinity.
On i386, the double version of exp10 can miss underflow exceptions if
the result is in the subnormal range for double but the last 11 bits
of the 64-bit extended-precision mantissa happen to be zero. This
patch forces the exception in a similar way to previous fixes.
As with the exp2 and exp fixes, the exp10f changes may in fact not be
needed to ensure underflow exceptions, but are included for
consistency and to fix the exp10 part of bug 18875 by ensuring that
excess range and precision is removed from underflowing return values.
Tested for x86_64 and x86.
[BZ #18875]
[BZ #18966]
* sysdeps/i386/fpu/e_exp10.S (dbl_min): New object.
(MO): New macro.
(__ieee754_exp10): For small results, force underflow exception
and remove excess range and precision from return value.
* sysdeps/i386/fpu/e_exp10f.S (flt_min): New object.
(MO): New macro.
(__ieee754_exp10f): For small results, force underflow exception
and remove excess range and precision from return value.
* math/auto-libm-test-in: Add more tests of exp10.
* math/auto-libm-test-out: Regenerated.
On i386, the double version of exp can miss underflow exceptions if
the result is in the subnormal range for double but the last 11 bits
of the 64-bit extended-precision mantissa happen to be zero. This
patch forces the exception in a similar way to previous fixes.
As with the exp2 fixes, the expf changes may in fact not be needed to
ensure underflow exceptions, but are included for consistency and to
fix the exp part of bug 18875 by ensuring that excess range and
precision is removed from underflowing return values.
Tested for x86_64 and x86.
[BZ #18875]
[BZ #18961]
* sysdeps/i386/fpu/e_exp.S (dbl_min): New object.
(MO): New macro.
(__ieee754_exp): For small results, force underflow exception and
remove excess range and precision from return value.
(__exp_finite): Likewise.
* sysdeps/i386/fpu/e_expf.S (flt_min): New object.
(MO): New macro.
(__ieee754_expf): For small results, force underflow exception and
remove excess range and precision from return value.
(__expf_finite): Likewise.
* math/auto-libm-test-in: Add more tests of exp.
* math/auto-libm-test-out: Regenerated.
Various exp2 implementations in glibc can miss underflow exceptions
when the scaling down part of the calculation is exact (or, in the x86
case, when the conversion from extended precision to the target
precision is exact). This patch forces the exception in a similar way
to previous fixes.
The x86 exp2f changes may in fact not be needed for this purpose -
it's likely to be the case that no argument of type float has an exp2
result so close to an exact subnormal float value that it equals that
value when rounded to 64 bits (even taking account of variation
between different x86 implementations). However, they are included
for consistency with the changes to exp2 and so as to fix the exp2f
part of bug 18875 by ensuring that excess range and precision is
removed from underflowing return values.
Tested for x86_64, x86 and mips64.
[BZ #16521]
[BZ #18875]
* math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for
small results.
* sysdeps/i386/fpu/e_exp2.S (dbl_min): New object.
(MO): New macro.
(__ieee754_exp2): For small results, force underflow exception and
remove excess range and precision from return value.
* sysdeps/i386/fpu/e_exp2f.S (flt_min): New object.
(MO): New macro.
(__ieee754_exp2f): For small results, force underflow exception
and remove excess range and precision from return value.
* sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object.
(MO): New macro.
(__ieee754_exp2l): Force underflow exception for small results.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object.
(MO): New macro.
(__ieee754_exp2l): Force underflow exception for small results.
* math/auto-libm-test-in: Add more tests or exp2.
* math/auto-libm-test-out: Regenerated.
Profiling git's test suite, Linus noted [1] that a disproportionately
large amount of time was spent reading /proc/meminfo. This is done by
the glibc functions get_phys_pages and get_avphys_pages, but they only
need the MemTotal and MemFree fields, respectively. That same
information can be obtained with a single syscall, sysinfo, instead of
six: open, fstat, mmap, read, close, munmap. While sysinfo also
provides more than necessary, it does a lot less work than what the
kernel needs to do to provide the entire /proc/meminfo. Both strace -T
and in-app microbenchmarks shows that the sysinfo() approach is
roughly an order of magnitude faster.
sysinfo() is much older than what glibc currently requires, so I don't
think there's any reason to keep the old parsing code. Moreover, this
makes get_[av]phys_pages work even in the absence of /proc.
Linus noted that something as simple as 'bash -c "echo"' would trigger
the reading of /proc/meminfo, but gdb says that many more applications
than just bash are affected:
Starting program: /bin/bash "-c" "echo"
Breakpoint 1, __get_phys_pages () at ../sysdeps/unix/sysv/linux/getsysstats.c:283
283 ../sysdeps/unix/sysv/linux/getsysstats.c: No such file or directory.
(gdb) bt
So it seems that any application that uses qsort on a moderately sized
array will incur this cost (once), which is obviously proportionately
more expensive for lots of short-lived processes (such as the git test
suite).
[1] http://thread.gmane.org/gmane.linux.kernel/2019285
Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
* sysdeps/unix/sysv/linux/getsysstats.c (__get_phys_pages):
Use sysinfo system call instead of parsing /proc/meminfo.
* sysdeps/unix/sysv/linux/getsysstats.c (__get_avphys_pages):
Likewise.
This patch adds more libm test inputs found through random test
generation to increase previously known ulps. This particular test
generation was run for mips64, so most of the increased ulps are for
ldbl-128 (float and double having been fairly well covered by such
testing for x86_64), but there's the odd ulps increase for other
formats.
Tested for x86_64, x86 and mips64.
* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
asinh, atan, atan2, atanh, cabs, carg, cos, csqrt, erfc, exp,
exp10, exp2, log, log1p, log2, pow, sin, sincos, sinh, tan and
tanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/mips/mips32/libm-test-ulps: Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/atomic.h to atomic-machine.h to follow that
convention.
This is the only change in this series that needs to change the
filename rather than simply removing a directory level (because both
atomic.h and bits/atomic.h exist at present).
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).
[BZ #14912]
* sysdeps/aarch64/bits/atomic.h: Move to ...
* sysdeps/aarch64/atomic-machine.h: ...here.
(_AARCH64_BITS_ATOMIC_H): Rename macro to
_AARCH64_ATOMIC_MACHINE_H.
* sysdeps/alpha/bits/atomic.h: Move to ...
* sysdeps/alpha/atomic-machine.h: ...here.
* sysdeps/arm/bits/atomic.h: Move to ...
* sysdeps/arm/atomic-machine.h: ...here. Update comments.
* bits/atomic.h: Move to ...
* sysdeps/generic/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/i386/bits/atomic.h: Move to ...
* sysdeps/i386/atomic-machine.h: ...here.
* sysdeps/ia64/bits/atomic.h: Move to ...
* sysdeps/ia64/atomic-machine.h: ...here.
* sysdeps/m68k/coldfire/bits/atomic.h: Move to ...
* sysdeps/m68k/coldfire/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ...
* sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here.
* sysdeps/microblaze/bits/atomic.h: Move to ...
* sysdeps/microblaze/atomic-machine.h: ...here.
* sysdeps/mips/bits/atomic.h: Move to ...
* sysdeps/mips/atomic-machine.h: ...here.
(_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H.
* sysdeps/powerpc/bits/atomic.h: Move to ...
* sysdeps/powerpc/atomic-machine.h: ...here. Update comments.
* sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ...
* sysdeps/powerpc/powerpc32/atomic-machine.h: ...here. Update
comments. Include <atomic-machine.h> instead of <bits/atomic.h>.
* sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ...
* sysdeps/powerpc/powerpc64/atomic-machine.h: ...here. Include
<atomic-machine.h> instead of <bits/atomic.h>.
* sysdeps/s390/bits/atomic.h: Move to ...
* sysdeps/s390/atomic-machine.h: ...here.
* sysdeps/sparc/sparc32/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc32/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here.
* sysdeps/sparc/sparc64/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc64/atomic-machine.h: ...here.
* sysdeps/tile/bits/atomic.h: Move to ...
* sysdeps/tile/atomic-machine.h: ...here.
* sysdeps/tile/tilegx/bits/atomic.h: Move to ...
* sysdeps/tile/tilegx/atomic-machine.h: ...here. Include
<sysdeps/tile/atomic-machine.h> instead of
<sysdeps/tile/bits/atomic.h>.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/tile/tilepro/bits/atomic.h: Move to ...
* sysdeps/tile/tilepro/atomic-machine.h: ...here. Include
<sysdeps/tile/atomic-machine.h> instead of
<sysdeps/tile/bits/atomic.h>.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here. Include
<sysdeps/arm/atomic-machine.h> instead of
<sysdeps/arm/bits/atomic.h>.
* sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here.
(_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here.
* sysdeps/x86_64/bits/atomic.h: Move to ...
* sysdeps/x86_64/atomic-machine.h: ...here.
* include/atomic.h: Include <atomic-machine.h> instead of
<bits/atomic.h>.
The ldbl-128 / ldbl-128ibm implementation of lgammal converts (the
floor of minus) non-integer negative arguments to int to determine the
value of signgam. When those values are outside the range of int,
this produces spurious "invalid" exceptions and incorrect values of
signgam. This patch fixes this by instead determining signgam through
comparing half the integer in question to floor of half the integer.
Tested for mips64, x86_64 and x86.
[BZ #18952]
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Do
not convert non-integer negative arguments to int to determine the
value of signgam.
* math/auto-libm-test-in: Add more tests of lgamma.
* math/auto-libm-test-out: Regenerated.
This patch adds more libm test inputs found through random test
generation to increase observed ulps on x86_64.
Tested for x86_64 and x86.
* math/auto-libm-test-in: Add more tests of acosh, atanh, cbrt,
cosh, csqrt, erfc, expm1 and lgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The existing implementations of lgamma functions (except for the ia64
versions) use the reflection formula for negative arguments. This
suffers large inaccuracy from cancellation near zeros of lgamma (near
where the gamma function is +/- 1).
This patch fixes this inaccuracy. For arguments above -2, there are
no zeros and no large cancellation, while for sufficiently large
negative arguments the zeros are so close to integers that even for
integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation
is not significant. Thus, it is only necessary to take special care
about cancellation for arguments around a limited number of zeros.
Accordingly, this patch uses precomputed tables of relevant zeros,
expressed as the sum of two floating-point values. The log of the
ratio of two sines can be computed accurately using log1p in cases
where log would lose accuracy. The log of the ratio of two gamma(1-x)
values can be computed using Stirling's approximation (the difference
between two values of that approximation to lgamma being computable
without computing the two values and then subtracting), with
appropriate adjustments (which don't reduce accuracy too much) in
cases where 1-x is too small to use Stirling's approximation directly.
In the interval from -3 to -2, using the ratios of sines and of
gamma(1-x) can still produce too much cancellation between those two
parts of the computation (and that interval is also the worst interval
for computing the ratio between gamma(1-x) values, which computation
becomes more accurate, while being less critical for the final result,
for larger 1-x). Because this can result in errors slightly above
those accepted in glibc, this interval is instead dealt with by
polynomial approximations. Separate polynomial approximations to
(|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8
from -3 to -2, where n (-3 or -2) is the nearest integer to the
1/8-interval and x0 is the zero of lgamma in the relevant half-integer
interval (-3 to -2.5 or -2.5 to -2).
Together, the two approaches are intended to give sufficient accuracy
for all negative arguments in the problem range. Outside that range,
the previous implementation continues to be used.
Tested for x86_64, x86, mips64 and powerpc. The mips64 and powerpc
testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm
with large negative arguments giving spurious "invalid" exceptions
(exposed by newly added tests for cases this patch doesn't affect the
logic for); I'll address those problems separately.
[BZ #2542]
[BZ #2543]
[BZ #2558]
* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call
__lgamma_neg for arguments from -28.0 to -2.0.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call
__lgamma_negf for arguments from -15.0 to -2.0.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0.
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r):
Call __lgamma_negl for arguments from -33.0 to -2.0.
* sysdeps/ieee754/dbl-64/lgamma_neg.c: New file.
* sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise.
* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
* sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise.
* sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise.
* sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise.
* sysdeps/generic/math_private.h (__lgamma_negf): New prototype.
(__lgamma_neg): Likewise.
(__lgamma_negl): Likewise.
(__lgamma_product): Likewise.
(__lgamma_productl): Likewise.
* math/Makefile (libm-calls): Add lgamma_neg and lgamma_product.
* math/auto-libm-test-in: Add more tests of lgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
There are a few .set mips* assembler directives used in the MIPS specific
sysdep code that force an instruction to be assembled for a specific ISA.
The reason for these is to allow an instruction to be encoded when it might
not be supported in the current ISA (when the code is run the Linux kernel
will trap and emulate any unsupported instructions). Unfortunately forcing
a specific ISA means that when assembling for a newer ISA, where the
instruction has a different encoding, the wrong encoding will be used.
* sysdeps/mips/bits/atomic.h [_MIPS_SIM == _ABIO32] (MIPS_PUSH_MIPS2):
Only use .set mips2 if the current ISA is below mips2.
* sysdeps/mips/sys/tas.h [_MIPS_SIM == _ABIO32] (_test_and_set):
Likewise.
* sysdeps/mips/nptl/tls.h (READ_THREAD_POINTER): Only use .set
mips32r2 if the current ISA is below mips32r2.
* sysdeps/mips/tls-macros.h (TLS_RDHWR): New define.
(TLS_IE): Updated to use the TLD_RDHWR macro.
(TLS_LE): Likewise.
* sysdeps/unix/mips/sysdep.h (__mips_isa_rev): Moved out of #ifdef
__ASSEMBLER__ condition.
when initial make call has subdir= explicitly set.
* sysdeps/mach/Makefile ($(patsubst
mach%,m\%h%,$(mach-before-compile))): Force subdir to mach when
calling $(MAKE).
* sysdeps/mach/hurd/Makefile ($(patsubst %,$(hurd-objpfx)hurd/%.%,auth
io fs process)): Force subdir to hurd when calling $(MAKE).
($(common-objpfx)hurd/../mach/RPC_task_get_sampled_pcs.c): Force
subdir to mach when calling $(MAKE).
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/linkmap.h to plain linkmap.h to follow that
convention.
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).
[BZ #14912]
* bits/linkmap.h: Move to ...
* sysdeps/generic/linkmap.h: ...here.
* sysdeps/aarch64/bits/linkmap.h: Move to ...
* sysdeps/aarch64/linkmap.h: ...here.
* sysdeps/arm/bits/linkmap.h: Move to ...
* sysdeps/arm/linkmap.h: ...here.
* sysdeps/hppa/bits/linkmap.h: Move to ...
* sysdeps/hppa/linkmap.h: ...here.
* sysdeps/ia64/bits/linkmap.h: Move to ...
* sysdeps/ia64/linkmap.h: ...here.
* sysdeps/mips/bits/linkmap.h: Move to ...
* sysdeps/mips/linkmap.h: ...here.
* sysdeps/s390/bits/linkmap.h: Move to ...
* sysdeps/s390/linkmap.h: ...here.
* sysdeps/sh/bits/linkmap.h: Move to ...
* sysdeps/sh/linkmap.h: ...here.
* sysdeps/x86/bits/linkmap.h: Move to ...
* sysdeps/x86/linkmap.h: ...here.
* include/link.h: Include <linkmap.h> instead of <bits/linkmap.h>.
Commit f4491417cc introduced some warnings
when building GLIBC with GCC 5.x. similar to those fixed by commit
dd6e8af6ba. This patch fixes those warnings.
* sysdeps/unix/sysv/linux/socketpair.c: Use the address of the
first member of struct sv in syscall macro.