This patch regenerates libm-test-ulps for ARM. As before it may be
useful for someone building for a configuration with VFMA enabled to
do a followup regeneration for any additional ulps in that
configuration.
Committed.
* sysdeps/arm/libm-test-ulps: Regenerated.
This patch fixes spurious underflows from ldbl-128 expm1l, as reported
in <https://sourceware.org/ml/libc-alpha/2014-06/msg00835.html> and
exposed by the tests added for such a bug in the x86 / x86-64
version. The bug and fix are essentially the same, so no separate bug
is filed in Bugzilla.
Tested for mips64.
[BZ #16539]
* sysdeps/ieee754/ldbl-128/s_expm1l.c: Include <float.h>.
(__expm1l): Return argument unchanged when small but not
subnormal.
This patch fixes bug 17097, ldbl-128 powl producing overflowing /
underflowing results with positive sign when the result should have
been negative. This was shown up by the tests in non-default rounding
modes added by my patch for bug 16315, but isn't actually limited to
non-default rounding modes: rather, when rounding to nearest the
wrappers produced a result with the correct sign and so always hid the
bug unless -lieee was used to disable the wrappers. The problem is
that in the cases where Y is large enough that the result overflows or
underflows for X not very close to 1, but not large enough to overflow
or underflow for all X != +/- 1 (in the latter case Y is always an
even integer), a positive overflowing / underflowing result is always
returned, rather than one with the correct sign. This patch moves the
relevant part of computation of the sign earlier and returns a result
of the correct sign.
Tested for mips64.
[BZ #17097]
* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Return
result with correct sign in case of exponents that produce
overflow except for X very close to 1.
Define MEMCPY_OK_FOR_FWD_MEMMOVE in memcopy.h and let arch-specific
implementations of that file override the value if necessary. This
override is only useful for tile and moving this macro to memcopy.h
allows us to remove the tile-specific memmove.c.
shlib-versions files can contain ABI lines that map triplets to a
canonical ABI name. This name was once used for various purposes
where test baseline files for different ABIs went in a single
directory; now these purposes use sysdeps files, generation of headers
which have per-ABI variants uses abi-variants and related Makefile
variables and the shlib-versions ABI names are unused. This patch
duly removes those lines and associated build system support for them.
Tested for x86_64 (both a full testsuite run and confirming the
installed shared libraries are unchanged by the patch).
* Makeconfig ($(common-objpfx)soversions.mk): Do not generate
abi-name definition.
* scripts/soversions.awk: Do not handle or generate ABI lines.
* shlib-versions: Remove ABI entries.
* sysdeps/powerpc/nofpu/shlib-versions: Remove file.
* sysdeps/x86_64/x32/shlib-versions: Remove ABI entry.
This patch removes the configure test for working -z relro.
The use of -z relro in Makeconfig became unconditional with
commit 2e6ab1df44c412bb9d30b26a4d8a679150a7e375
Author: Ulrich Drepper <drepper@redhat.com>
Date: Sat Oct 28 06:44:04 2006 +0000
Remove conditional code which now is unnecessary.
(commit reference from git://repo.or.cz/glibc/history), so since then
the configure test has not controlled anything about how glibc is
built - simply about whether configure succeeds and allows a build to
be attempted. The test for whether the option did something useful
(as opposed to whether it exists - which we can certainly just assume
by now) was originally added in
<https://sourceware.org/ml/libc-hacker/2004-09/msg00069.html> to
disable the option in a case when it did nothing useful on ia64 (as a
result of something deliberate in the linker on ia64). Since 2006
that disabling has been of no effect, and given that the current test
does not set libc_relro_required for ia64, it does nothing whatever
useful for the original motivating case. Also at around the same time
in 2006 the test was made to give an error for missing or broken -z
relro support on various architectures.
So effectively all the test does now is verify that, on certain
architectures, the linker has not been changed deliberately to make
the option ineffective. I see no apparent reason why such a change
should be expected, or why the build should be stopped if it were to
be made (any more than we disallow build on ia64); I think we can
trust binutils patch review to point out the consequences of any
change to COMMONPAGESIZE setting. The only thing that might now make
sense would be disabling the -z relro use on an architecture-specific
basis if there were an architecture-specific reason to consider that
to make sense; it would be for the ia64 maintainer to decide if that
makes sense for ia64 at present, but I think that could be done
through sysdeps Makefiles - no special configure tests needed.
Tested for x86_64 that this patch makes no change to the installed
shared libraries.
Together with
<https://sourceware.org/ml/libc-alpha/2014-06/msg00788.html> (pending
review) this substantially eliminates architecture-specific cases from
architecture-independent configure.ac files. There remains an i386
case in sysdeps/mach/hurd/configure.ac that should properly move to
the i386 subdirectory. (There are also OS-specific cases outside
OS-specific directories; in principle I think should should also
move.)
* configure.ac (libc_commonpagesize): Remove variable.
(libc_relro_required): Likewise.
(libc_cv_z_relro): Remove configure test.
* configure: Regenerated.
* sysdeps/aarch64/preconfigure (libc_commonpagesize): Do not set
variable.
(libc_relro_required): Likewise.
* sysdeps/alpha/preconfigure (libc_commonpagesize): Likewise.
(libc_relro_required): Likewise.
* sysdeps/arm/preconfigure.ac (libc_commonpagesize): Likewise.
(libc_relro_required): Likewise.
* sysdeps/arm/preconfigure: Regenerated.
* sysdeps/ia64/preconfigure: Remove file.
* sysdeps/tile/preconfigure (libc_commonpagesize): Do not set
variable.
(libc_relro_required): Likewise.
This patch fixes bugs 16561 and 16562, bad results of yn in overflow
cases in non-default rounding modes, both because an intermediate
overflow in the recurrence does not get detected if the result is not
an infinity and because an overflowing result may occur in the wrong
sign. The fix is to set FE_TONEAREST mode internally for the parts of
the function where such overflows can occur (which includes the call
to y1 - where yn is used to compute a Bessel function of order -1,
negating the result of y1 isn't correct for overflowing results in
directed rounding modes) and then compute an overflowing value in the
original rounding mode if the to-nearest result was an infinity.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to test the ldbl-128 and ldbl-128ibm changes.
(The tests for these bugs were added in my previous y1 patch, so the
only thing this patch has to do with the testsuite is enable yn
testing in all rounding modes.)
[BZ #16561]
[BZ #16562]
* sysdeps/ieee754/dbl-64/e_jn.c: Include <float.h>.
(__ieee754_yn): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/flt-32/e_jnf.c: Include <float.h>.
(__ieee754_ynf): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-128/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/ieee754/ldbl-96/e_jnl.c: Include <float.h>.
(__ieee754_ynl): Set FE_TONEAREST mode internally and then
recompute overflowing results in original rounding mode.
* sysdeps/i386/fpu/fenv_private.h [!__SSE2_MATH__]
(libc_feholdsetround_ctx): New macro.
* math/libm-test.inc (yn_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps : Likewise.
The 64-bit MIPS ABIs involve the caller setting up t9 ($25) to the
address of the called function, and the called function then using
this in a .cpsetup directive to compute gp. The .cpsetup directive
needs to name the function to which t9 points for this purpose. In
the definition of *_nocancel functions, the directive pointed to the
normal entry point rather than the _nocancel one, resulting in
segfaults when the _nocancel functions were used. This patch corrects
the function name used in the directive. (It seems the bug was latent
until Roland's not-cancel.h unification, with the _nocancel entry
points not previously being used - so not user-visible in a release,
so no Bugzilla entry required.)
Tested mips64 sufficiently to confirm the previously seen segfaults
are fixed.
* sysdeps/unix/sysv/linux/mips/mips64/nptl/sysdep-cancel.h
[__PIC__] (PSEUDO): Use name of _nocancel entry point in
corresponding .cpsetup call.
This patch removes configure tests for assembler CFI support (and
thereby eliminates an architecture-specific case in the main
configure.ac), instead assuming that support is present
unconditionally.
The main test was added in 2003 around the time CFI support was added
to the assembler. cfi_personality and cfi_lsda support were added to
the assembler in 2006. cfi_sections support was added in 2009, a few
weeks before binutils 2.20 was released; it's in 2.20, the minimum
supported version, so even that configure test is obsolete.
Tested x86_64 that the installed shared libraries are unchanged by
this patch.
* configure.ac (libc_cv_asm_cfi_directives): Remove configure
test.
* configure: Regenerated.
* config.h.in (HAVE_ASM_CFI_DIRECTIVES): Remove macro undefine.
* sysdeps/arm/configure.ac (libc_cv_asm_cfi_directive_sections):
Remove configure test.
* sysdeps/arm/configure: Regenerated.
* sysdeps/nptl/configure.ac: Do not check
libc_cv_asm_cfi_directives.
* sysdeps/nptl/configure: Regenerated.
* sysdeps/x86_64/nptl/configure.ac: Remove file.
* sysdeps/x86_64/nptl/configure: Remove generated file.
* b/sysdeps/generic/sysdep.h [HAVE_ASM_CFI_DIRECTIVES]: Make code
unconditional.
[!HAVE_ASM_CFI_DIRECTIVES]: Remove conditional code.
This patch defines ELF_MACHINE_NO_RELA on all architectures. Tested
only on x86_64 to verify that the sources before and after are
identical except for two instructions that pass the current line
number in dl-machine.h to assert_fail.
This patch removes conditionals on __ASSUME_O_CLOEXEC, and on
O_CLOEXEC being defined, in sysdeps/unix/sysv/linux/, now that
O_CLOEXEC support can be unconditionally assumed.
The patch is conservative in what it changes and further followup
cleanups may be possible. It may be possible to remove dl-opendir.c,
but the patch does not do so, just removing a redundant undefine and
redefine of __ASSUME_O_CLOEXEC. Also, __ASSUME_O_CLOEXEC is defined
unconditionally for Hurd as well as Linux. Thus, if we decide that
O_CLOEXEC support is a required feature of any glibc port, we could
remove __ASSUME_O_CLOEXEC and all conditionals on it throughout glibc,
rather than just cleaning up sysdeps/unix/sysv/linux/.
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* sysdeps/unix/sysv/linux/dl-opendir.c (__ASSUME_O_CLOEXEC): Do
not undefine and redefine.
* sysdeps/unix/sysv/linux/getsysstats.c (__get_nprocs)
[O_CLOEXEC]: Make code unconditional.
(__get_nprocs) [!O_CLOEXEC]: Remove conditional code.
* sysdeps/unix/sysv/linux/shm_open.c: Do not include
<kernel-features.h>.
[O_CLOEXEC && !__ASSUME_O_CLOEXEC] (have_o_cloexec): Remove
conditional variable definition.
(shm_open) [O_CLOEXEC]: Make code unconditional.
(shm_open) [!O_CLOEXEC || !__ASSUME_O_CLOEXEC]: Remove conditional
code.
This patch moves the USE_REGPARMS define from the toplevel
configure.ac to sysdeps/i386/configure.ac.
Tested x86 that the disassembly of installed shared libraries is
unchanged by this patch.
* configure.ac (USE_REGPARMS): Don't define here.
* configure: Regenerated.
* sysdeps/i386/configure.ac (USE_REGPARMS): Define here.
* sysdeps/i386/configure: Regenerated.
This patch makes non-ex-ports architectures set base_machine and
machine based on the original configured machine value in preconfigure
fragments, like ex-ports architectures, rather than in the toplevel
configure.ac.
Tested x86 that the disassembly of installed shared libraries is
unchanged by the patch.
* configure.ac (base_machine): Do not set specially for particular
machines here.
* configure: Regenerated.
* sysdeps/powerpc/preconfigure: Move machine and base_machine
settings from configure.ac.
* sysdeps/i386/preconfigure: New file.
* sysdeps/s390/preconfigure: Likewise.
* sysdeps/sh/preconfigure: Likewise.
* sysdeps/sparc/preconfigure: Likewise.
This patch removes the __ASSUME_XFS_RESTRICTED_CHOWN macro, now it can
be presumed to be defined unconditionally. I'm not sure if what's
left of __statfs_chown_restricted is actually useful (if not, a
followup could remove it), but I left it there to keep the patch
conservative and avoid changing the code generated for glibc.
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by the patch.
* sysdeps/unix/sysv/linux/kernel-features.h
(__ASSUME_XFS_RESTRICTED_CHOWN): Remove macro.
* sysdeps/unix/sysv/linux/pathconf.c (__statfs_chown_restricted)
[__ASSUME_XFS_RESTRICTED_CHOWN]: Make code unconditional.
(__statfs_chown_restricted) [!__ASSUME_XFS_RESTRICTED_CHOWN]:
Remove conditional code.
Add support for the new HWCAP2 values for ARMv8 added in the
3.15 kernel. Tested using QEMU which supports these extensions.
ChangeLog:
2014-06-25 Will Newton <will.newton@linaro.org>
* sysdeps/unix/sysv/linux/arm/dl-procinfo.c
(_dl_arm_cap_flags): Add HWCAP2 values.
* sysdeps/unix/sysv/linux/arm/dl-procinfo.h
(_DL_HWCAP_COUNT): Increase to 37.
(_DL_HWCAP_LAST): New define.
(_DL_HWCAP2_LAST): New define.
(_dl_procinfo): Add support for printing
AT_HWCAP2 entries.
(_dl_string_hwcap): Use _dl_hwcap_string.
This patch removes the __ASSUME_UTIMENSAT macro, now it can be
unconditionally assumed to be true.
This shows that the only live uses of __ASSUME_UTIMES are in utimes.c
and they are only live for hppa. I intend a followup patch to make
__ASSUME_UTIMES into an hppa-specific macro (not used or defined
outside sysdeps/unix/sysv/linux/hppa/).
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_UTIMENSAT):
Remove macro.
* sysdeps/unix/sysv/linux/futimes.c: Do not include
<kernel-features.h>.
[__NR_utimensat && !__ASSUME_UTIMENSAT] (miss_utimensat): Remove
conditional variable definition.
(__futimes): Update comment.
(__futimes) [__ASSUME_UTIMENSAT]: Make code unconditional.
(__futimes) [!__ASSUME_UTIMENSAT]: Remove conditional code.
This patch fixes spurious underflows from exp10 for arguments near 0
(part of bug 16560; that bug also includes spurious underflows from
exp2, which are not fixed by this patch). The problem is underflows
in the internal computation converting the exp10 argument to arguments
for exp (with extra precision), and the fix is simply to return 1
early for arguments near enough to 0 (just as arguments with large
enough magnitude have their own overflow / underflow logic at the
start of the function).
Tested x86_64 and x86 and ulps updated accordingly; also tested for
powerpc32 and mips64 to validate the ldbl-128ibm and ldbl-128 changes.
[BZ #16560]
* sysdeps/ieee754/dbl-64/e_exp10.c (__ieee754_exp10): Return 1 for
arguments close to 0.
* sysdeps/ieee754/ldbl-128/e_exp10l.c (__ieee754_exp10l):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c (__ieee754_exp10l):
Likewise.
* math/auto-libm-test-in: Add more tests of exp10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
This patch removes the __ASSUME_COMPLETE_READV_WRITEV
kernel-features.h macro, now that it can be unconditionally assumed to
be true. (The relevant kernel feature was added some time between 2.0
and 2.2, and this macro is only used in sysdeps/unix/sysv/linux/.)
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* sysdeps/unix/sysv/linux/kernel-features.h
(__ASSUME_COMPLETE_READV_WRITEV): Remove macro.
* sysdeps/unix/sysv/linux/readv.c: Do not include
<kernel-features.h>.
[!__ASSUME_COMPLETE_READV_WRITEV]: Remove conditional code.
[!UIO_FASTIOV] (UIO_FASTIOV): Remove macro.
(__libc_readv) [__ASSUME_COMPLETE_READV_WRITEV]: Make code
unconditional.
(__libc_readv) [!__ASSUME_COMPLETE_READV_WRITEV]: Remove
conditional code.
* sysdeps/unix/sysv/linux/writev.c: Do not include
<kernel-features.h>.
[!__ASSUME_COMPLETE_READV_WRITEV]: Remove conditional code.
[!UIO_FASTIOV] (UIO_FASTIOV): Remove macro.
(__libc_writev) [__ASSUME_COMPLETE_READV_WRITEV]: Make code
unconditional.
(__libc_writev) [!__ASSUME_COMPLETE_READV_WRITEV]: Remove
conditional code.
Partial merge from gnulib which fixes a number of -Wundef warnings.
The parts that differ from gnulib are the header comment, use of
__glibc_unlikely, a #define of __secure_getenv and the use of tabs.
The majority of the patch is cosmetic comment changes, the only runtime
change is an abort if an unknown kind is passed to __gen_tempname.
ChangeLog:
2014-06-25 Will Newton <will.newton@linaro.org>
* sysdeps/posix/tempname.c: Merge from gnulib, cosmetic
comment changes throughout the file. Remove checks
for HAVE_*_H definitions that are not required.
(__gen_tempname): Call abort if an unknown kind value is
passed.
This patch fixes bug 16539, spurious underflow exceptions from x86 /
x86-64 expm1l. The problem is that the computation of a base-2
exponent with extra precision involves spurious underflows for
arguments that are small but not subnormal, so a check is added to
just return the argument in those cases. (If the argument *is*
subnormal, underflowing is correct and the existing code will always
underflow, so it suffices to keep using the existing code in that
case; some expm1 implementations have a bug (bug 16353) with missing
underflow exceptions, but I don't think there's such a bug in this
particular version.)
Tested x86_64 and x86; no ulps updates needed.
(auto-libm-test-out diffs omitted below.)
[BZ #16539]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Just
return the argument for normal arguments with exponent below -64.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add another test of expm1.
* math/auto-libm-test-out: Regenerated.
This patch fixes bug 16287, spurious underflows from ldbl-128 erfl
arising from it calling erfcl for arguments with absolute value at
least 1.0, although for large positive arguments erfcl correctly
underflows but erfl shouldn't. The fix is simply to avoid calling
erfcl, and just return 1, for arguments above a cut-off large enough
that erfl correctly rounds to-nearest as 1 but not so large that erfcl
underflows.
Tested mips64. Also tested x86_64 and x86 to confirm the new tests
(taken from the tests of erfc) don't cause any problems there; no ulps
updates needed.
[BZ #16287]
* sysdeps/ieee754/ldbl-128/s_erfl.c (__erfl): Return 1 without
calling __erfcl for arguments at least 16.
* math/auto-libm-test-in: Add more tests of erf.
* math/auto-libm-test-out: Regenerated.
Continuing the process of making non-ex-ports architectures follow the
preferred sysdeps practices followed by ex-ports architectures - that
is, putting things in architecture-specific sysdeps files rather than
having architecture-specific cases in architecture-independent files -
this patch moves architecture cases out of
sysdeps/unix/sysv/linux/configure.ac into (new or existing) configure
fragments for each architecture. (In the case of the
arch_minimum_kernel setting for x32,
sysdeps/unix/sysv/linux/x86_64/x32/configure already has such a
setting so the setting in sysdeps/unix/sysv/linux/configure.ac was a
duplicate that could just be removed - though I haven't tested for
x32.)
Tested for x86_64 and x86 that the patch causes no changes to the
installed shared libraries or ldd (or any part of the installation
except for the parts that always change because the files contain
timestamps - nscd and static libraries).
* sysdeps/unix/sysv/linux/configure.ac: Remove cases for
individual architectures.
* sysdeps/unix/sysv/linux/configure: Regenerated.
* sysdeps/unix/sysv/linux/i386/configure.ac: New file.
* sysdeps/unix/sysv/linux/i386/configure: New generated file.
* sysdeps/unix/sysv/linux/powerpc/configure.ac
(ldd_rewrite_script): Define variable.
* sysdeps/unix/sysv/linux/powerpc/configure: Regenerated.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/configure.ac: New
file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/configure: New
generated file.
* sysdeps/unix/sysv/linux/s390/configure.ac: New file.
* sysdeps/unix/sysv/linux/s390/configure: New generated file.
* sysdeps/unix/sysv/linux/sh/configure.ac: New file.
* sysdeps/unix/sysv/linux/sh/configure: New generated file.
* sysdeps/unix/sysv/linux/sparc/configure.ac: New file.
* sysdeps/unix/sysv/linux/sparc/configure: New generated file.
* sysdeps/unix/sysv/linux/x86_64/configure.ac: New file.
* sysdeps/unix/sysv/linux/x86_64/configure: New generated file.
Improve fesetenv to use an optimized implementation similar to
feupdateenv.
2014-06-24 Wilco <wdijkstr@arm.com>
* sysdeps/arm/fesetenv.c (fesetenv): Optimize implementation.
This patch rewrites feupdateenv to improve performance by avoiding
unnecessary FPSCR reads/writes. It fixes bug 16918 by passing the
correct return value.
2014-06-24 Wilco <wdijkstr@arm.com>
[BZ #16918]
* sysdeps/arm/feupdateenv.c (feupdateenv):
Rewrite to reduce FPSCR accesses and fix return value.
rather than duplicating functionality. To make this work for softfp builds,
ensure functions in fenv_private are not conditionally compiled.
2014-06-24 Wilco <wdijkstr@arm.com>
* sysdeps/arm/fegetround.c (fegetround): Call get_rounding_mode.
* sysdeps/arm/feholdexcpt.c (feholdexcept): Call libc_feholdexcept_vfp.
* sysdeps/arm/fesetround.c (fesetround): Call libc_fesetround_vfp.
* sysdeps/arm/fgetexcptflg.c (fegetexceptflag):
Call libc_fetestexcept_vfp.
* sysdeps/arm/ftestexcept.c (fetestexcept): Call libc_fetestexcept_vfp.
* sysdeps/arm/fenv_private.h: Move libc_*_vfp functions outside of
__SOFTFP__ ifdef so that they can be built for softfp.
The first argument of elision_adapt and that of ELISION_*LOCK have
different signs since __elision_rwcount is signed char * and the
argument of elision_adapt is uint8_t *. Modified elision_adapt to
accept signed char * instead of uint8_t *.
This patch fixes bug 16354, spurious underflows from cosh when a tiny
argument is passed to expm1 and expm1 correctly underflows although
the final result of cosh should be 1. As noted in that bug, some
cases are latent because of expm1 implementations not raising
underflow (bug 16353), but all the implementations are fixed
similarly. They already contained checks for tiny arguments, but the
checks were too late to avoid underflow from expm1 (although they
would avoid underflow from subsequent squaring of the result of
expm1); they are moved before the expm1 calls.
The thresholds used for considering arguments tiny are not
particularly consistent in how they relate to the precision of the
floating-point format in question. They are, however, all sufficient
to ensure that the round-to-nearest result of cosh is indeed 1 below
the threshold (although sometimes they are smaller than necessary).
But the previous logic did not return 1, but the previously computed 1
+ expm1(abs(x)) value. And the thresholds in the ldbl-128 and
ldbl-128ibm code (0x1p-71L - I suspect 0x3f8b was intended in the code
instead of 0x3fb8 - and (roughly) 0x1p-55L) are not sufficient for
that value to be 1. So by moving the test for tiny arguments, and
consequently returning 1 directly now the expm1 value hasn't been
computed by that point, this patch also fixes bug 17061, the (large
number of ulps) inaccuracy for small arguments in those
implementations. Tests for that bug are duly added.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to validate the ldbl-128 and ldbl-128ibm changes.
[BZ #16354]
[BZ #17061]
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Check for
small arguments before calling __expm1.
* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Check for
small arguments before calling __expm1f.
* sysdeps/ieee754/ldbl-128/e_coshl.c (__ieee754_coshl): Check for
small arguments before calling __expm1l.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Likewise.
* sysdeps/ieee754/ldbl-96/e_coshl.c (__ieee754_coshl): Likewise.
* math/auto-libm-test-in: Add more cosh tests. Do not allow
spurious underflow for some cosh tests.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
This patch fixes bug 17050, missing errno setting for y1 overflow (for
small positive arguments). An appropriate check is added for overflow
directly in the __ieee754_y1 implementation, similar to the check
present for yn (doing it there rather than in the wrapper also avoids
yn needing to repeat the check when called for order 1 or -1 and it
uses __ieee754_y1).
Tested x86_64 and x86; no ulps update needed. Also tested for mips64
to verify the ldbl-128 fix (the ldbl-128ibm code just #includes the
ldbl-128 file).
[BZ #17050]
* sysdeps/ieee754/dbl-64/e_j1.c: Include <errno.h>.
(__ieee754_y1): Set errno if return value overflows.
* sysdeps/ieee754/flt-32/e_j1f.c: Include <errno.h>.
(__ieee754_y1f): Set errno if return value overflows.
* sysdeps/ieee754/ldbl-128/e_j1l.c: Include <errno.h>.
(__ieee754_y1l): Set errno if return value overflows.
* sysdeps/ieee754/ldbl-96/e_j1l.c: Include <errno.h>.
(__ieee754_y1l): Set errno if return value overflows.
* math/auto-libm-test-in: Add more tests of y0, y1 and yn.
* math/auto-libm-test-out: Regenerated.
This patch enables testing of cpow in all rounding modes using
ALL_RM_TEST. There were two reasons this was previously deferred:
* MPC has complicated rounding-mode-dependent rules for the signs of
exact zero real or imaginary parts in the result of mpc_pow. Annex
G does not impose any such requirements and I don't think glibc
should try to implement any particular logic here. This patch adds
support for gen-auto-libm-tests passing the IGNORE_ZERO_INF_SIGN
flag to libm-test.inc.
* Error accumulations in some tests in non-default rounding modes
exceed the maximum error permitted in libm-test.inc. This patch
marks the problem tests with xfail-rounding. (It might be possible
to reduce the accumulations a bit by using round-to-nearest when
cpow calls clog, but I don't think there's much point; the
implementation approach for cpow is fundamentally deficient, as
discussed in the existing bug for cpow inaccuracy which can
reasonably be considered to cover these less-inaccurate cases as
well. It's possible that the test "cpow 2 0 10 0" will also need
xfail-rounding on some platforms.)
Tested x86_64 and x86 and ulps updated accordingly.
* math/gen-auto-libm-tests.c: Document use of
ignore-zero-inf-sign.
(input_flag_type): Add value flag_ignore_zero_inf_sign.
(input_flags): Add ignore-zero-inf-sign.
(output_for_one_input_case): Handle flag_ignore_zero_inf_sign.
* math/gen-libm-test.pl (generate_testfile): Handle
ignore-zero-inf-sign.
* math/auto-libm-test-in: Mark some cpow tests with
ignore-zero-inf-sign and some with xfail-rounding.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (cpow_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 16315, bad pow handling of overflow/underflow in
non-default rounding modes. Tests of pow are duly converted to
ALL_RM_TEST to run all tests in all rounding modes.
There are two main issues here. First, various implementations
compute a negative result by negating a positive result, but this
yields inappropriate overflow / underflow values for directed
rounding, so either overflow / underflow results need recomputing in
the correct sign, or the relevant overflowing / underflowing operation
needs to be made to have a result of the correct sign. Second, the
dbl-64 implementation sets FE_TONEAREST internally; in the overflow /
underflow case, the result needs recomputing in the original rounding
mode.
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16315]
* sysdeps/i386/fpu/e_pow.S (__ieee754_pow): Ensure possibly
overflowing or underflowing operations take place with sign of
result.
* sysdeps/i386/fpu/e_powf.S (__ieee754_powf): Likewise.
* sysdeps/i386/fpu/e_powl.S (__ieee754_powl): Likewise.
* sysdeps/ieee754/dbl-64/e_pow.c: Include <math.h>.
(__ieee754_pow): Recompute overflowing and underflowing results in
original rounding mode.
* sysdeps/x86/fpu/powl_helper.c: Include <stdbool.h>.
(__powl_helper): Allow negative argument X and scale negated value
as needed. Avoid passing value outside [-1, 1] to f2xm1.
* sysdeps/x86_64/fpu/e_powl.S (__ieee754_powl): Ensure possibly
overflowing or underflowing operations take place with sign of
result.
* sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]:
Include <math.h>.
* math/auto-libm-test-in: Add more tests of pow.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (pow_test): Use ALL_RM_TEST.
(pow_tonearest_test_data): Remove.
(pow_test_tonearest): Likewise.
(pow_towardzero_test_data): Likewise.
(pow_test_towardzero): Likewise.
(pow_downward_test_data): Likewise.
(pow_test_downward): Likewise.
(pow_upward_test_data): Likewise.
(pow_test_upward): Likewise.
(main): Don't call removed functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds a generic implementation of HAVE_RM_CTX using standard
fenv calls. As a result math functions using SET_RESTORE_ROUND* macros
do not suffer from a large slowdown on targets which do not implement
optimized libc_fe*_ctx inline functions. Most of the libc_fe* inline
functions are now unused and could be removed in the future (there are
a few math functions left which use a mixture of standard fenv calls
and libc_fe* inline functions - they could be updated to use
SET_RESTORE_ROUND or improved to avoid expensive fenv manipulations
across just a few FP instructions).
libc_feholdsetround*_noex_ctx is added to enable better optimization of
SET_RESTORE_ROUND_NOEX* implementations.
Performance measurements on ARM and x86 of sin() show significant gains
over the current default, fairly close to a highly optimized fenv_private:
ARM x86
no fenv_private : 100% 100%
generic HAVE_RM_CTX : 250% 350%
fenv_private (CTX) : 250% 450%
2014-06-23 Will Newton <will.newton@linaro.org>
Wilco <wdijkstr@arm.com>
* sysdeps/generic/math_private.h: Add generic HAVE_RM_CTX
implementation. Include get-rounding-mode.h.
[!HAVE_RM_CTX]: Define HAVE_RM_CTX to zero.
[!libc_feholdsetround_noex_ctx]: Define
libc_feholdsetround_noex_ctx.
[!libc_feholdsetround_noexf_ctx]: Define
libc_feholdsetround_noexf_ctx.
[!libc_feholdsetround_noexl_ctx]: Define
libc_feholdsetround_noexl_ctx.
(libc_feholdsetround_ctx): New function.
(libc_feresetround_ctx): New function.
(libc_feholdsetround_noex_ctx): New function.
(libc_feresetround_noex_ctx): New function.
This patch updates glibc headers for changes / new definitions in
Linux 3.15. In the course of my review I noticed that
IPV6_PMTUDISC_INTERFACE was absent from glibc despite the inclusion of
IP_PMTUDISC_INTERFACE; I added it along with IP_PMTUDISC_OMIT and
IPV6_PMTUDISC_OMIT. I did not add FALLOC_FL_NO_HIDE_STALE given the
kernel header comment that it is reserved.
Tested x86_64.
* sysdeps/unix/sysv/linux/bits/fcntl-linux.h [__USE_GNU]
(FALLOC_FL_COLLAPSE_RANGE): New macro.
[__USE_GNU] (FALLOC_FL_ZERO_RANGE): Likewise.
* sysdeps/unix/sysv/linux/bits/in.h (IP_PMTUDISC_OMIT): Likewise.
(IPV6_PMTUDISC_INTERFACE): Likewise.
(IPV6_PMTUDISC_OMIT): Likewise.
Linux commit dd58a092c4202f2bd490adab7285b3ff77f8e467 added the
PPC_FEATURE2_VEC_CRYPTO auvx capability to indicate whether to
hardware supports vector crypto hardware instructions. This patch
adds its definition to powerpc hwcap bits.
This patch removes ARM __ASSUME_SIGFRAME_V2 now that the
2.6.18-and-later signal frame layout can be assumed, renaming the
affected functions accordingly now only one version of them is needed
in glibc. (sigrestorer.S did not in fact include <kernel-features.h>
and it appears that, unlike other such cases, it didn't get the header
indirectly, so the v1 functions would have been compiled in even when
sigaction.c didn't reference them.)
(alpha and hppa also have architecture-specific __ASSUME_* macros that
should now be removed: __ASSUME_FDATASYNC and __ASSUME_LWS_CAS
respectively. I don't have any plans to do anything on that myself.)
Tested on ARM.
* sysdeps/unix/sysv/linux/arm/kernel-features.h
(__ASSUME_SIGFRAME_V2): Remove macro.
* sysdeps/unix/sysv/linux/arm/sigrestorer.S: Update comment.
[!__ASSUME_SIGFRAME_V2]: Remove conditional code.
(__default_sa_restorer_v2): Rename to __default_sa_restorer.
(__default_rt_sa_restorer_v2): Rename to __default_rt_sa_restorer.
* sysdeps/unix/sysv/linux/arm/sigaction.c (__default_sa_restorer):
Declare as function. Remove conditional macro definitions.
(__default_rt_sa_restorer): Likewise.
(__default_sa_restorer_v1): Remove declaration.
(__default_sa_restorer_v2): Likewise.
(__default_rt_sa_restorer_v1): Likewise.
(__default_rt_sa_restorer_v2): Likewise.
* sysdeps/unix/sysv/linux/arm/Versions (GLIBC_PRIVATE): Remove
__default_sa_restorer_v1, __default_rt_sa_restorer_v1,
__default_sa_restorer_v2 and __default_rt_sa_restorer_v2.
This patch makes files using __ASSUME_* macros include
<kernel-features.h> explicitly, rather than relying on some other
header (such as tls.h, lowlevellock.h or pthreadP.h) to include it
implicitly. (I omitted cases where I've already posted or am testing
the patch that stops the file from needing __ASSUME_* at all.) This
accords with the general principle of making source files include the
headers for anything they use, and also helps make it safe to remove
<kernel-features.h> includes from any file that doesn't use
__ASSUME_* (some of those may be stray includes left behind after
increasing the minimum kernel version, others may never have been
needed or may have become obsolete after some other change).
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* nptl/pthread_cond_wait.c: Include <kernel-features.h>.
* nptl/pthread_rwlock_timedrdlock.c: Likewise.
* nptl/pthread_rwlock_timedwrlock.c: Likewise.
* nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c: Likewise.
* nscd/nscd.c: Likewise.
* sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
* sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
This patch removes conditionals on __ASSUME_SOCK_CLOEXEC, and on
SOCK_CLOEXEC being defined, in Linux-specific code, now that all
supported Linux kernel versions can be assumed to have this
functionality. (The macro is also used in OS-independent code and is
not defined for Hurd.)
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* nptl/sysdeps/unix/sysv/linux/mq_notify.c: Do not include
<kernel-features.h>.
(init_mq_netlink): Remove conditional have_sock_cloexec
definitions. Remove code conditional on have_sock_cloexec < 0.
(init_mq_netlink) [!SOCK_CLOEXEC]: Remove conditional code.
(init_mq_netlink) [!__ASSUME_SOCK_CLOEXEC]: Likewise.
* sysdeps/unix/sysv/linux/opensock.c: Do not include
<kernel-features.h>.
(__opensock) [SOCK_CLOEXEC]: Make code unconditional.
(__opensock) [!__ASSUME_SOCK_CLOEXEC]: Remove conditional code.
This patch adds ifunc tests for x86_64 memset_chk and memset. It also
defines HAS_AVX2 with AVX2_Usable since AVX2 may not be usable even if
processor has AVX2.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
Add tests for memset_chk and memset.
* sysdeps/x86_64/multiarch/init-arch.h (HAS_AVX2): Defined
with AVX2_Usable.
This patch removes __ASSUME_F_GETOWN_EX now it can be assumed to be
true unconditionally.
Tested x86_64 that disassembly of installed shared libraries is
unchanged by this patch.
* sysdeps/unix/sysv/linux/kernel-features.h
(__ASSUME_F_GETOWN_EX): Remove macro.
* sysdeps/unix/sysv/linux/fcntl.c: Do not include
<kernel-features.h>.
(miss_F_GETOWN_EX): Remove variable or macro.
(do_fcntl): Do not check miss_F_GETOWN_EX.
(do_fcntl) [!__ASSUME_F_GETOWN_EX]: Remove conditional code.
This patch removes __ASSUME_AT_RANDOM now it can be assumed to be true
unconditionally.
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_AT_RANDOM):
Remove macro.
* sysdeps/unix/sysv/linux/dl-osinfo.h (_dl_setup_stack_chk_guard)
[!__ASSUME_AT_RANDOM]: Remove conditional code.
(_dl_setup_pointer_guard) [!__ASSUME_AT_RANDOM]: Likewise.
This patch removes the __ASSUME_ADJ_OFFSET_SS_READ macro (and
conditionals on whether ADJ_OFFSET_SS_READ is defined), now it can be
unconditionally assumed to be true and ADJ_OFFSET_SS_READ can be
assumed to be defined.
Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.
* sysdeps/unix/sysv/linux/kernel-features.h
(__ASSUME_ADJ_OFFSET_SS_READ): Remove macro.
* sysdeps/unix/sysv/linux/adjtime.c (ADJTIME)
[ADJ_OFFSET_SS_READ]: Make code unconditional.
(ADJTIME) [!ADJ_OFFSET_SS_READ]: Remove conditional code.
This fixes the calculation of R_ARM_TLS_DESC relocations for lazy global
symbol references, i.e. created with `-z lazy' in effect with the static
linker, where immediate resolution is requested with LD_BIND_NOW.
This patch cleans up for __ASSUME_ATFCTS now always being true for the
supported Linux kernel versions by removing conditional code in
sysdeps/unix/sysv/linux. Several fchownat.c files that were only
present because of differences in the fallback syscalls used
(depending on the architecture-specific names of chown-related
syscalls for 32-bit uids) are removed. Files that looks like they
could be replaced by syscalls.list entries have the standard "Consider
moving to syscalls.list." comment (see bug 14138) added. Conditionals
on the relevant __NR_* syscall numbers being defined are also removed,
since my analysis indicated that the relevant syscalls are always
defined for all relevant kernel versions using any affected file.
Much of the removed fallback code had unbounded stack allocations, so
this reduces the number of cases to consider for anyone reviewing uses
of alloca and VLAs in glibc.
There remain tests of __ASSUME_ATFCTS in io/openat.c (to determine
whether to define __have_atfcts) and sysdeps/posix/getcwd.c (which
also uses __have_atfcts); thus, the definition of __ASSUME_ATFCTS
remains in kernel-features.h. The logical condition relevant there is
whether openat64_not_cancel_3 is known to work. Hurd doesn't use this
version of getcwd at all, so the conditionals in getcwd.c are always
true in glibc. However, this code is also used in gnulib. So the
best way to deal with the conditionals there may be for gnulib people
to deal with merging all relevant changes in both directions between
the glibc and gnulib versions of this file, at the end of which the
openat conditionals should be in whatever form is best for gnulib, and
hardcoded in the _LIBC case to having openat supported.
Tested by comparing before-and-after disassembly of installed
(stripped) shared libraries, on x86_64 and x86. On x86 the patch made
no change to the disassembly; on x86_64, the only changes were in
readlinkat, where formerly the return value from the readlinkat
syscall was stored in an int variable before being converted to
ssize_t for the return, and now the return value is returned directly
without truncation to int. I think it's clearly correct not to
truncate the return value (although I also think the truncation would
not have been a user-visible bug because the kernel would never have
returned a value it could have affected).
* include/fcntl.h (__atfct_seterrno): Remove prototype.
(__atfct_seterrno_2): Likewise.
* sysdeps/unix/sysv/linux/alpha/dl-fxstatat64.c: Do not include
<kernel-features.h>.
(__ASSUME_ATFCTS): Do not undefine and redefine.
* sysdeps/unix/sysv/linux/alpha/fxstatat.c [__ASSUME_ATFCTS]
(__have_atfcts): Remove conditional definition.
(__fxstatat([__NR_fstatat64]: Make code unconditional.
(__fxstatat) [!__ASSUME_ATFCTS]: Remove conditional code and code
unreachable if [__ASSUME_ATFCTS].
* sysdeps/unix/sysv/linux/dl-fxstatat64.c (__ASSUME_ATFCTS): Do
not undefine and redefine.
* sysdeps/unix/sysv/linux/faccessat.c: Do not include
<kernel-features.h>.
(faccessat) [__NR_faccessat]: Make code unconditional.
(faccessat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/fchmodat.c: Do not include
<kernel-features.h>.
(fchmodat) [__NR_fchmodat]: Make code unconditional.
(fchmodat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/fchownat.c: Do not include
<kernel-features.h>.
(fchownat) [__NR_fchownat]: Make code unconditional.
(fchownat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/futimesat.c: Do not include
<kernel-features.h>.
(futimesat) [__NR_futimesat]: Make code unconditional.
(futimesat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/fxstatat.c: Do not include
<kernel-features.h>.
(__fxstatat) [__NR_newfstatat]: Make code unconditional.
(__fxstatat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/fxstatat64.c: Do not include
<kernel-features.h>.
(__fxstatat64) [__NR_fstatat64]: Make code unconditional.
(__fxstatat64) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/i386/fchownat.c: Remove file.
* sysdeps/unix/sysv/linux/i386/fxstatat.c: Do not include
<kernel-features.h>.
(__fxstatat) [__NR_fstatat64]: Make code unconditional.
(__fxstatat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/linkat.c: Do not include
<kernel-features.h>.
(linkat) [__NR_linkat]: Make code unconditional.
(linkat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/m68k/fchownat.c: Remove file.
* sysdeps/unix/sysv/linux/mips/mips64/fxstatat64.c: Do not include
<kernel-features.h>.
(__fxstatat64) [__NR_newfstatat]: Make code unconditional.
(__fxstatat64) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/mkdirat.c: Do not include
<kernel-features.h>.
(mkdirat) [__NR_mkdirat]: Make code unconditional.
(mkdirat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/openat.c: Do not include
<kernel-features.h>.
[!__ASSUME_ATFCTS] (__atfct_seterrno): Remove function.
[!__ASSUME_ATFCTS] (__have_atfcts): Remove variable.
(OPENAT_NOT_CANCEL) [__NR_openat]: Make code unconditional.
(OPENAT_NOT_CANCEL) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/powerpc/fchownat.c: Remove file.
* sysdeps/unix/sysv/linux/readlinkat.c: Do not include
<kernel-features.h>.
(readlinkat) [__NR_readlinkat]: Make code unconditional.
(readlinkat) [!__ASSUME_ATFCTS]: Remove conditional code. Return
result of INLINE_SYSCALL directly, not via int variable.
* sysdeps/unix/sysv/linux/renameat.c: Do not include
<kernel-features.h>.
[!__ASSUME_ATFCTS] (__atfct_seterrno_2): Remove function.
(renameat) [__NR_renameat]: Make code unconditional.
(renameat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/s390/s390-32/fchownat.c: Remove file.
* sysdeps/unix/sysv/linux/sh/fchownat.c: Remove file.
* sysdeps/unix/sysv/linux/sparc/sparc32/fchownat.c: Remove file.
* sysdeps/unix/sysv/linux/sparc/sparc64/dl-fxstatat64.c
(__ASSUME_ATFCTS): Do not undefine and redefine.
* sysdeps/unix/sysv/linux/symlinkat.c: Do not include
<kernel-features.h>.
(symlinkat) [__NR_symlinkat]: Make code unconditional.
(symlinkat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/unlinkat.c: Do not include
<kernel-features.h>.
(unlinkat) [__NR_unlinkat]: Make code unconditional.
(unlinkat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/wordsize-64/dl-fxstatat64.c
(__ASSUME_ATFCTS): Do not undefine and redefine.
* sysdeps/unix/sysv/linux/wordsize-64/fxstatat.c: Do not include
<kernel-features.h>.
(__fxstatat) [__NR_newfstatat]: Make code unconditional.
(__fxstatat) [!__ASSUME_ATFCTS]: Remove conditional code.
* sysdeps/unix/sysv/linux/xmknodat.c: Do not include
<kernel-features.h>.
(__xmknodat) [__NR_mknodat]: Make code unconditional.
(__xmknodat) [!__ASSUME_ATFCTS]: Remove conditional code.
Errno is not set and the testcases will fail.
Now the scalbln-aliases are removed in i386/m68
and the wrappers are used when calling the scalbln-functions.
On ia64 only scalblnf has its own implementation.
For scalbln and scalblnl the ieee754/dbl-64 and ieee754/ldbl-96 are used, thus
the wrappers are needed, too.
Since long is 4 bytes for x32, we should use 3 bytes for __pad1 when
a long __pad1 is replaced by a byte __rwelision and __pad1.
* sysdeps/x86/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Use
3 bytes for __pad1 for x32.
(__PTHREAD_RWLOCK_ELISION_EXTRA): Likewise.
In this patch we take advantage of HSW memory bandwidth, manage to
reduce miss branch prediction by avoiding using branch instructions and
force destination to be aligned with avx & avx2 instruction.
The CPU2006 403.gcc benchmark indicates this patch improves performance
from 26% to 59%.
* sysdeps/x86_64/multiarch/Makefile: Add memset-avx2.
* sysdeps/x86_64/multiarch/memset-avx2.S: New file.
* sysdeps/x86_64/multiarch/memset.S: Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
* sysdeps/x86_64/multiarch/rtld-memset.S: Likewise.
Implementation of strchr for AArch64. Speedups taken from micro-bench
show the improvements relative to the standard C code.
The use of LD1 means we have identical code for both big- and
little-endian systems.
This patch fixes __ieee754_logl (-LDBL_MAX) on x86_64 and x86 not to
subtract 1 from its argument and so cause spurious overflow in
FE_DOWNWARD mode. (For any argument strictly less than -1, it doesn't
matter whether or not 1 is subtracted before computing log1p, as long
as the result doesn't overflow to -Inf.)
Tested x86_64 and x86. (This particular case lacks test coverage,
since the testsuite doesn't cover -lieee, but it will be covered by
tests after the following patch to test pow in all rounding modes,
which was the context in which this bug was found.)
[BZ #17022]
* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Do not subtract 1
from arguments -2 or below.
* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise.
The glibc makefiles have a standard variable, $(rtld-prefix), to run
the dynamic linker with a default --library-path option; this is used
as the basis of lots of other variables for running programs compiled
with the newly built library.
A few places however use $(elf-objpfx)ld.so or
$(elf-objpfx)${rtld-installed-name} directly, with such a
--library-path option. This patch makes such places use
$(rtld-prefix) instead. I'm not aware of any significance in these
cases to the choice of ld.so or ${rtld-installed-name} when running
the dynamic linker, or to whether $(patsubst
%,:%,$(sysdep-library-path)) is included in the library-path as it is
in $(rtld-prefix) and just one of the places being changed.
Tested x86_64.
* elf/Makefile ($(objpfx)tst-unused-dep.out): Use $(rtld-prefix).
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Likewise.
* sysdeps/s390/s390-64/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Likewise.
localedata/ChangeLog:
* Makefile (LOCALEDEF): Use $(rtld-prefix).
This patch fixes few failures in nearbyintl() where the fraction part is
close to 0.5.i The new tests added report few extra failures in
nearbyint_downward and nearbyint_towardzero which is a known issue.
Fixes#17031.
Add the missing fallback file for elide.h to fix non x86 builds.
Sorry about that. This is just a noop macro file that makes
all elision code to be optimized out.
With the recent tuning the C version of rwlocks is basically the same
performance as the x86 assembler version for uncontended locks (with a
a few cycles near the run-to-run variability). For others it should not
matter anyways.
So remove the assembler code and use the C version like other
architectures.
This patch relies on the C version of the rwlocks posted earlier.
With C rwlocks it is very straight forward to do adaptive elision
using TSX. It is based on the infrastructure added earlier
for mutexes, but uses its own elision macros. The macros
are fairly general purpose and could be used for other
elision purposes too.
This version is much cleaner than the earlier assembler based
version, and in particular implements adaptation which makes
it safer.
I changed the behavior slightly to not require any changes
in the test suite and fully conform to all expected
behaviors (generally at the cost of not eliding in
various situations). In particular this means the timedlock
variants are not elided. Nested trylock aborts.
The implementation of __get_nprocs uses a stactic variable to cache
the value of the current number of processors. The caching breaks when
'time (NULL) == 0':
$ cat nproc.c
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
int main(int argc, char *argv[])
{
time_t t;
struct timeval tv = {0, 0};
printf("settimeofday({0, 0}, NULL) = %d\n", settimeofday(&tv, NULL));
t = time(NULL);
printf("Time: %d, CPUs: %d\n", (unsigned int)t, get_nprocs());
return 0;
}
$ gcc -O3 nproc.c
$ ./a.out
settimeofday({0, 0}, NULL) = -1
Time: 1401311578, CPUs: 4
$ sudo ./a.out
settimeofday({0, 0}, NULL) = 0
Time: 0, CPUs: 0
The problem is with the condition used to check whether a cached
value should be returned or not:
static int cached_result;
static time_t timestamp;
time_t now = time (NULL);
time_t prev = timestamp;
atomic_read_barrier ();
if (now == prev)
return cached_result;
This patch fixes the problem by ensuring that 'cached_result' has
been set at least once before returning it.
Optimization is achieved on 8 byte aligned strings with double word
comparison using cmpb instruction. On unaligned strings loop unrolling
is applied for Power7 gain.
As with other issues of this kind, bug 17042 is log2 (1) wrongly
returning -0 instead of +0 in round-downward mode because of
implementations effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log and log10.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of log2l is
essentially the same as the ldbl-128 one).
[BZ #17042]
* sysdeps/i386/fpu/e_log2.S (__ieee754_log2): Take absolete value
when x - 1 is zero.
* sysdeps/i386/fpu/e_log2f.S (__ieee754_log2f): Likewise.
* sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l):
Likewise.
* sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log2_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The hppa port has no need of a custom lowlevellock.c, it should
use the generic version which is updated and correct. This
similarly fixes bug 15119 for hppa.
Various glibc build / install / test code has C locale settings that
are redundant with LC_ALL=C.
LC_ALL takes precedence over LANG, so anywhere that sets LC_ALL=C
(explicitly, or through it being in the default environment for
running tests) does not need to set LANG=C. LC_ALL=C also takes
precedence over LANGUAGE, since
2001-01-02 Ulrich Drepper <drepper@redhat.com>
* intl/dcigettext.c (guess_category_value): Rewrite so that LANGUAGE
value is ignored if the selected locale is the C locale.
* intl/tst-gettext.c: Set locale for above change.
* intl/tst-translit.c: Likewise.
and so settings of LANGUAGE=C are also redundant when LC_ALL=C is
set. One test also had LC_ALL=C in its -ENV setting, although it's
part of the default environment used for tests.
This patch removes the redundant settings. It removes a suggestion in
install.texi of setting LANGUAGE=C LC_ALL=C for "make install"; the
Makefile.in target "install" already sets LC_ALL_C so there's no need
for the user to set it (and nor should there be any need for the user
to set it).
If some build machine tool used by "make install" uses a version of
libintl predating that 2001 change, and the user has LANGUAGE set, the
removal of LANGUAGE=C from the Makefile.in "install" rule could in
principle affect the user's installation. However, I don't think we
need to be concerned about pre-2001 build tools.
Tested x86_64.
* Makefile (install): Don't set LANGUAGE.
* Makefile.in (install): Likewise.
* assert/Makefile (test-assert-ENV): Remove variable.
(test-assert-perr-ENV): Likewise.
* elf/Makefile (neededtest4-ENV): Likewise.
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Don't set LANGUAGE.
* io/ftwtest-sh (LANG): Remove variable.
* libio/Makefile (tst-widetext-ENV): Likewise.
* manual/install.texi (Running make install): Don't refer to
environment settings for make install.
* INSTALL: Regenerated.
* nptl/tst-tls6.sh: Don't set LANG.
* posix/globtest.sh (LANG): Remove variable.
* string/Makefile (tester-ENV): Likewise.
(inl-tester-ENV): Likewise.
(noinl-tester-ENV): Likewise.
* sysdeps/s390/s390-64/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Don't set LANGUAGE.
* timezone/Makefile (build-testdata): Use $(built-program-cmd)
without explicit environment settings.
localedata/ChangeLog:
* tst-fmon.sh: Don't set LANGUAGE.
* tst-locale.sh: Likewise.
This patch fixes the optimized ppc64/power7 strncat strlen call for
static build without ifunc enabled. The strlen symbol to call in such
situation is just strlen, instead of __GI_strlen (since the __GI_
alias is just created for shared objects).
In several cases we've had asm routines rely on syscalls not clobbering
call-clobbered registers, and that's now deemed ABI. So take advantage
of this in the INLINE_SYSCALL path as well.
Shrinks libc.so by about 1k.
The current code for handling concurrent resolution says that the
ABI for _dl_tlsdesc_resolve_hold is the same as that of
_dl_tlsdesc_lazy_resolver. However _dl_tlsdesc_resolve_hold is
called from the trampoline directly rather than the lazy resolver
stub so, for example, r2 has not been pushed so does not needed
to be restored.
This fixes an intermittent failure in nptl/tst-tls3 when building
glibc for arm-linux-gnueabihf with -mtls-dialect=gnu2.
ChangeLog:
2014-05-27 Will Newton <will.newton@linaro.org>
[BZ #16990]
* sysdeps/arm/dl-tlsdesc.S (_dl_tlsdesc_resolve_hold): Save
and restore r2 rather than just restoring.
This fixes a variety of testsuite failures for me:
tststatic.out Error 1
tststatic2.out Error 1
tst-tls9-static.out Error 1
tst-audit8.out Error 127
tst-audit9.out Error 127
tst-audit1.out Error 127
and also has the added benefit of making LD_AUDIT/sotruss work on
AArch64.
Otherwise, we bail out early in _dl_try_allocate_static_tls as the
alignment requirement of the PT_TLS section in libc is 16.
This macro was removed by
2005-11-16 Daniel Jacobowitz <dan@codesourcery.com>
but not applied to the (still separate) eabi port so necro'd
when the eabi port superceded the old abi. It was thence
copied into the new AArch64 port.
As with various other issues of this kind, bug 16977 is log10 (1)
wrongly returning -0 rather than +0 in round-downward mode because of
an implementation effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of logl is
essentially the same as the ldbl-128 one).
[BZ #16977]
* sysdeps/i386/fpu/e_log10.S (__ieee754_log10): Take absolute
value when x - 1 is zero.
* sysdeps/i386/fpu/e_log10f.S (__ieee754_log10f): Likewise.
* sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l):
Likewise.
* sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log10_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes a similar issue to
736c304a1a, where for PPC32 if the symbol
is defined as hidden (memchr) then compiler will create a local branc
(symbol@local) and the linker will not create a required PLT call to
make the ifunc work. It changes the default hidden symbol (__GI_memchr)
to default memchr symbol for powerpc32 (__memchr_ppc32).
As previously noted
<https://sourceware.org/ml/libc-alpha/2013-05/msg00696.html>,
$(elf-objpfx) and $(elfobjdir) are redundant and should be
consolidated. This patch consolidates on $(elf-objpfx) (for
consistency with $(csu-objpfx)), also changing direct uses of
$(common-objpfx)elf/ to use $(elf-objpfx).
Tested x86_64, including that installed shared libraries are unchanged
by the patch.
* Makeconfig [$(build-hardcoded-path-in-tests) = yes]
(rtld-tests-LDFLAGS): Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
(link-libc-before-gnulib): Likewise.
(elfobjdir): Remove variable.
* Makefile (install): Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
* Makerules (link-libc-args): Use $(elf-objpfx) instead of
$(elfobjdir)/.
(link-libc-deps): Likewise.
($(common-objpfx)libc.so): Likewise.
($(common-objpfx)linkobj/libc.so): Likewise.
[$(cross-compiling) = no] (symbolic-link-prog): Use $(elf-objpfx)
instead of $(common-objpfx)elf/.
(symbolic-link-list): Likewise.
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Likewise.
* sysdeps/arm/Makefile (gnulib-arch): Use $(elf-objpfx) instead of
$(elfobjdir)/.
(static-gnulib-arch): Likewise.
* sysdeps/s390/s390-64/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
localedata/ChangeLog:
* Makefile (LOCALEDEF): Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
This also highlights that we'd been loading 64-bits instead of
the proper 32-bits. Caught by the linker as a relocation error,
since the variable happened to be unaligned for 64-bits.
sysdeps/unix/sysv/linux/arm/unwind-resume.c and
sysdeps/unix/sysv/linux/arm/unwind-forcedunwind.c have static
variables that are written in C code but only read from toplevel asms.
Current GCC trunk now optimizes away such apparently write-only static
variables, so causing a build failure. This patch marks those
variables with __attribute_used__ to avoid that optimization.
Tested that this fixes the build for ARM.
* sysdeps/unix/sysv/linux/arm/unwind-forcedunwind.c
(libgcc_s_resume): Use __attribute_used__.
* sysdeps/unix/sysv/linux/arm/unwind-resume.c (libgcc_s_resume):
Likewise.
This patch fixes the __copysignf optimized macro meant to internal libm
usage when used with constant value. Without the explicit cast to
float, if it is used with const double value (for instance, on
s_casinhf.c) double constants will be used and it may lead to precision
issues in some algorithms.
It fixes the following failures on PPC64/POWER7:
Failure: Test: Real part of: cacos_downward (inf + 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf - 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf + 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf - 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf + 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf - 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf + 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf - 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Using the default header instead. This matches the kernel, which also
uses the generic header. Fixes the sys/wait.h conform issue, where
si_band had the wrong type.
The current code for nocancel syscalls does not do a comparison of
the system call return value. This leads to code being generated
where the b.cs follows the svc instruction directly without setting
the flags on which the branch depends.
ChangeLog:
2014-05-20 Will Newton <will.newton@linaro.org>
* sysdeps/unix/sysv/linux/aarch64/nptl/sysdep-cancel.h (PSEUDO):
Test the return value of the system call in the nocancel case.
This patch fixes an issue observed by the Xen project, where including
signal.h exposes various PSR_MODE #defines. This is due to the usage
in sys/user.h and sys/procfs.h of the struct user_pt_regs and
user_fpsimd_state included via asm/ptrace.h. The namespace pollution
this inclusion introduce is already partially fixed with some #undef
of the PTRACE_* symbols, but other symbols like the PSR_MODE ones are
still present, and undefining them is not safe since a user can
include ptrace.h before user.h.
My proposition is to define the 2 structures we need in user.h and get
rid of the asm/ptrace.h inclusion.
Build and make check are clean on AArch64.
2014-05-20 Will Newton <will.newton@linaro.org>
Yvan Roux <yvan.roux@linaro.org>
* sysdeps/unix/sysv/linux/aarch64/sys/user.h: Remove unused
#include of asm/ptrace.h.
(PTRACE_GET_THREAD_AREA): Remove #undef.
(PTRACE_GETHBPREGS): Likewise.
(PTRACE_SETHBPREGS): Likewise.
(struct user_regs_struct): New structure.
(struct user_fpsimd_struct): New structure.
* sysdeps/unix/sysv/linux/aarch64/sys/procfs.h: Remove unused
#include of asm/ptrace.h and second #include of sys/user.h.
(PTRACE_GET_THREAD_AREA): Remove #undef.
(PTRACE_GETHBPREGS): Likewise.
(PTRACE_SETHBPREGS): Likewise.
(ELF_NGREG): Use new struct user_regs_struct.
(elf_fpregset_t): Use new struct user_fpsimd_struct.
Commit 7d92b78723 [Fix ARM NAN fraction
bits.] removed all the bits set from NANFRAC macros and, when propagated
to libgcc, regressed gcc.dg/torture/builtin-math-7.c on soft-fp arm-eabi
targets, currently ARMv6-M (`-march=armv6-m -mthumb') only. This is
because when used to construct a NaN in the semi-raw mode, they now
build an infinity instead. Consequently operations such as (Inf - Inf)
now produce Inf rather than NaN. The change worked for the original
test case, posted with PR libgcc/60166, because division is made in the
canonical mode, where the quiet bit is set separately, from the fp
class.
This change brings the quiet bit back to these macros, making semi-raw
mode calculations produce the expected results again.
This patch guard the BSD definition for terminal modes in PowerPC
specific header fixing the following conformance failures:
FAIL: conform/POSIX/termios.h/conform
FAIL: conform/POSIX2008/termios.h/conform
FAIL: conform/UNIX98/termios.h/conform
prlimit and prlimit64 have been added in the main <bits/resource.h>, but
not in the SPARC specific version. Fix that.
Note: this is Debian bug#703559, reported by Emilio Pozuelo Monfort
<pochu@debian.org>
If the fd refers to a terminal device, but not a pty master, the
TIOCGPTN ioctl returns with ENOTTY. This error is not caught, and the
possibly undefined buffer passed to ptsname_r is sent directly to the
stat64 syscall.
Fix this by using a fallback to the old method only if the TIOCGPTN
ioctl fails with EINVAL. This also fix the return value in that specific
case (it return ENOENT without this patch).
Also add tests to the ptsname_r function (and ptsname at the same time).
Note: this is Debian bug#741482, reported by Jakub Wilk <jwilk@debian.org>
getaddrinfo correctly returns EAI_AGAIN for AF_INET and AF_INET6
queries. For AF_UNSPEC however, an older change
(a682a1bf55) broke the check and due to
that the returned error was EAI_NONAME.
This patch fixes the check so that a non-authoritative not-found is
returned as EAI_AGAIN to the user instead of EAI_NONAME.
Bug 16564 is spurious overflow of log1pl (LDBL_MAX) in FE_UPWARD mode,
resulting from log1pl adding 1 to its argument (for arguments not
close to 0), which overflows in that mode. This patch fixes this by
avoiding adding 1 to large arguments (precisely what counts as large
depends on the floating-point format).
Tested x86_64 and x86, and spot-checked log1pl tests on mips64 and
powerpc64.
[BZ #16564]
* sysdeps/i386/fpu/s_log1pl.S (__log1pl): Do not add 1 to positive
arguments with exponent 65 or above.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Do not add 1 to
arguments 0x1p113L or above.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Do not add 1
to arguments 0x1p107L or above.
* sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Do not add 1 to
positive arguments with exponent 65 or above.
* math/auto-libm-test-in: Add more tests of log1p.
* math/auto-libm-test-out: Regenerated.
According to C99/C11 Annex G, cacos applied to a value with real part
+Inf and finite imaginary part should produce a result with real part
+0. glibc wrongly produces a result with real part -0 in FE_DOWNWARD
mode. This patch fixes this by checking for zero results in the
relevant case of non-finite arguments (where there should never be a
result with -0 real part), and converts the tests of cacos to
ALL_RM_TEST.
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16928]
* math/s_cacos.c (__cacos): Ensure zero real part of result from
non-finite arguments is +0.
* math/s_cacosf.c (__cacosf): Likewise.
* math/s_cacosl.c (__cacosl): Likewise.
* math/libm-test.inc (cacos_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
According to C99 and C11 Annex F, acosh (1) should be +0 in all
rounding modes. However, some implementations in glibc wrongly return
-0 in round-downward mode (which is what you get if you end up
computing log1p (-0), via 1 - 1 being -0 in round-downward mode).
This patch fixes the problem implementations, by correcting the test
for an exact 1 value in the ldbl-96 implementation to allow for the
explicit high bit of the mantissa, and by inserting fabs instructions
in the i386 implementations; tests of acosh are duly converted to
ALL_RM_TEST. I believe all the other sysdeps/ieee754 implementations
are already OK (I haven't checked the ia64 versions, but if buggy then
that will be obvious from the results of test runs after this patch is
in).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16927]
* sysdeps/i386/fpu/e_acosh.S (__ieee754_acosh): Use fabs on x-1
value.
* sysdeps/i386/fpu/e_acoshf.S (__ieee754_acoshf): Likewise.
* sysdeps/i386/fpu/e_acoshl.S (__ieee754_acoshl): Likewise.
* sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Correct
for explicit high bit of mantissa when testing for argument equal
to 1.
* math/libm-test.inc (acosh_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16516 reports spurious underflows from erf (for all floating-point
types), when the result is close to underflowing but does not actually
underflow.
erf (x) is about (2/sqrt(pi))*x for x close to 0, so there are
subnormal arguments for which it does not underflow. The various
implementations do (x + efx*x) (for efx = 2/sqrt(pi) - 1), for greater
accuracy than if just using a single multiplication by an
approximation to 2/sqrt(pi) (effectively, this way there are a few
more bits in the approximation to 2/sqrt(pi)). This can introduce
underflows when efx*x underflows even though the final result does
not, so a scaled calculation with 8*efx is done in these cases - but 8
is not a big enough scale factor to avoid all such underflows. 16 is
(any underflows with a scale factor of 16 would only occur when the
final result underflows), so this patch changes the code to use that
factor. Rather than recomputing all the values of the efx8 variable,
it is removed, leaving it to the compiler's constant folding to
compute 16*efx. As such scaling can also lose underflows when the
final scaling down happens to be exact, appropriate checks are added
to ensure underflow exceptions occur when required in such cases.
Tested x86_64 and x86; no ulps updates needed. Also spot-checked for
powerpc32 and mips64 to verify the changes to the ldbl-128ibm and
ldbl-128 implementations.
[BZ #16516]
* sysdeps/ieee754/dbl-64/s_erf.c (efx8): Remove variable.
(__erf): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/flt-32/s_erff.c (efx8): Remove variable.
(__erff): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-128/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* sysdeps/ieee754/ldbl-96/s_erfl.c: Include <float.h>.
(efx8): Remove variable.
(__erfl): Scale by 16 instead of 8 in potentially underflowing
case. Ensure exception if result actually underflows.
* math/auto-libm-test-in: Add more tests of erf.
* math/auto-libm-test-out: Regenerated.
This patch reduces duplication between different architectures'
kernel-features.h files by making the architecture-independent file
define various macros unconditionally (instead of only for a
particular list of architectures), with the architecture-specific
files then undefining the macros if necessary.
Specifically, __ASSUME_O_CLOEXEC (O_CLOEXEC flag to open) and
__ASSUME_SOCK_CLOEXEC (SOCK_NONBLOCK and SOCK_CLOEXEC flags to socket)
are supported on all architectures as of 2.6.32 or the minimum kernel
version for the architecture if later. For __ASSUME_IN_NONBLOCK,
__ASSUME_PIPE2, __ASSUME_EVENTFD2, __ASSUME_SIGNALFD4 and
__ASSUME_DUP3, the relevant syscalls were added for alpha in 2.6.33
but otherwise the features are available as of 2.6.32. For
__ASSUME_UTIMES, support is everywhere in 2.6.32 except for
asm-generic architectures and hppa.
Although those were the main cases of duplication among
kernel-features.h files, some other cases of unnecessary definitions
were also cleaned up: the hppa file defined various macros that were
either no longer used at all, or defined by the main file by default
anyway, the ia64 file had duplicative definitions of __ASSUME_PSELECT
and __ASSUME_PPOLL, while mips had such a definition of
__ASSUME_IPC64.
Really, rather than being defined in the main file then undefined for
asm-generic architectures, __ASSUME_UTIMES should become an
hppa-specific macro. Given that __ASSUME_ATFCTS and
__ASSUME_UTIMENSAT are now always true, the only live __ASSUME_UTIMES
conditional is in sysdeps/unix/sysv/linux/utimes.c, which is not used
for asm-generic architectures. I think the desired state would be an
hppa-specific file (that includes sysdeps/unix/sysv/linux/utimes.c if
__ASSUME_UTIMES, and otherwise has fallback code), with the fallback
code being removed from the main utimes.c. But I think that's most
reasonably a separate cleanup once __ASSUME_ATFCTS and
__ASSUME_UTIMESAT have both had conditional code cleaned up.
Given this patch, I think it's straightforward to move non-ex-ports
architectures to having their own kernel-features.h files, like
ex-ports architectures, rather than conditionals in the main file
(i.e., such a move won't require the architecture-specific file to
contain anything that isn't genuinely architecture-specific), and
would encourage architecture maintainers to do so.
Tested x86_64 that the installed shared libraries are unchanged by
this patch. Note that on some architectures this *will* cause
__ASSUME_* macros to be defined in cases where they weren't previously
but should have been (but this is just optimization, not a fix to a
user-visible bug, so doesn't need a bug report in Bugzilla).
* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_UTIMES):
Define unconditionally.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
* sysdeps/unix/sysv/linux/aarch64/kernel-features.h
(__ASSUME_DUP3): Do not define.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_UTIMES): Undefine.
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
(__ASSUME_UTIMES): Do not define.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Undefine if [__LINUX_KERNEL_VERSION <
0x020621] instead of defining if [__LINUX_KERNEL_VERSION >=
0x020621].
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
[__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_DUP3): Undefine.
* sysdeps/unix/sysv/linux/arm/kernel-features.h (__ASSUME_UTIMES):
Do not define.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
(__ASSUME_32BITUIDS): Likewise.
(__ASSUME_TRUNCATE64_SYSCALL): Likewise.
(__ASSUME_IPC64): Likewise.
(__ASSUME_ST_INO_64_BIT): Likewise.
(__ASSUME_GETDENTS64_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION < 0x030e00] (__ASSUME_UTIMES): Undefine.
* sysdeps/unix/sysv/linux/ia64/kernel-features.h
(__ASSUME_UTIMES): Do not define.
(__ASSUME_PSELECT): Likewise.
(__ASSUME_PPOLL): Likewise.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
* sysdeps/unix/sysv/linux/m68k/kernel-features.h
(__ASSUME_UTIMES): Likewise.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
(__ASSUME_UTIMES): Likewise.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
* sysdeps/unix/sysv/linux/mips/kernel-features.h (__ASSUME_IPC64):
Likewise.
(__ASSUME_UTIMES): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
* sysdeps/unix/sysv/linux/tile/kernel-features.h
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
(__ASSUME_UTIMES): Undefine.
This patch cleans up some symbol versioning code in the ARM port that
exists only as relics of the old-ABI port, which was removed some time
ago.
The minimum symbol version in the ARM port is GLIBC_2.4 (the version
where the EABI port was introduced). Thus, any SHLIB_COMPAT
conditionals where the later version is 2.4 or later are obsolete and
can be removed. In addition, there is no need to set symbol versions
before 2.4 explicitly if the symbols would have a version of 2.4 by
default anyway. This includes most of the entries in
sysdeps/unix/sysv/linux/arm/Versions: those for GLIBC_2.0 are for
libgcc unwind functions that aren't actually in ARM EABI glibc at all,
while those for GLIBC_2.2 and GLIBC_2.3.3 are for functions which for
the old-ABI port may have had versions different from the
architecture-independent default, but where for EABI the default
suffices (both the default and the version in that file map to 2.4, so
the entries in that file do nothing). The GLIBC_2.1 entries are
needed (architecture-specific functions), but it seems less confusing
for those to say GLIBC_2.4, as the actual version those symbols in
fact have.
Various cases in the <fenv.h> functions where a function is defined as
__fe* with an fe* versioned alias are cleaned up just to define fe*
directly, as done e.g. on AArch64. If in future we actually need an
__fe* name for use from C90 functions in libm as discussed recently,
of course we can add one on all architectures and make the fe* name
into a weak alias for that particular function, but for now the __fe*
names aren't needed.
In the case of posix_fadvise64, the __posix_fadvise64_l64 name and
posix_fadvise64 alias are kept as __posix_fadvise64_l64 is used in
posix_fadvise. (For that to be a namespace-clean use, posix_fadvise64
needs to be a *weak* alias not a strong one as at present, but that's
an independent preexisting bug.)
(There remain references to GLIBC_2_2 in
sysdeps/unix/sysv/linux/arm/{msgctl.c,semctl.c,shmctl.c}. As those
files are used by alpha which has a genuine 2.2 version for those
functions, I think those references need to stay as-is.)
Tested that the disassembly of installed shared libraries is unchanged
by this patch (though function names shown in disassembly change to no
longer have @@GLIBC_2.4, now those functions get versioned only by the
version map and not redundantly at assembler time) and that the ABI
tests pass.
* sysdeps/arm/fclrexcpt.c (__feclearexcept): Rename to
feclearexcept. Remove symbol versioning code.
* sysdeps/arm/fegetenv.c (__fegetenv): Rename to fegetenv. Remove
symbol versioning code.
* sysdeps/arm/fesetenv.c (__fesetenv): Rename to fesetenv. Remove
symbol versioning code.
* sysdeps/arm/feupdateenv.c (__feupdateenv): Rename to
feupdateenv. Remove symbol versioning code.
* sysdeps/arm/fgetexcptflg.c (__fegetexceptflag): Rename to
fegetexceptflag. Remove symbol versioning code.
* sysdeps/arm/fsetexcptflg.c (__fesetexceptflag): Rename to
fesetexceptflag. Remove symbol versioning code.
* sysdeps/unix/sysv/linux/arm/Versions (libc): Remove GLIBC_2.0,
GLIBC_2.2 and GLIBC_2.3.3 entries. Change GLIBC_2.1 to GLIBC_2.4.
* sysdeps/unix/sysv/linux/arm/posix_fadvise64.c
(__posix_fadvise64_l32): Remove prototype.
[SHLIB_COMPAT(libc, GLIBC_2_2, GLIBC_2_3_3)]: Remove conditional
code.
This patch does some initial cleanup, following the move to 2.6.32
minimum kernel version, by removing __LINUX_KERNEL_VERSION
conditionals that are now always-true or always-false. In the case of
__ASSUME_ARG_MAX_STACK_BASED, where the conditional used a kernel
version that was itself in a macro, the associated sysconf.c code is
also cleaned up and __ASSUME_ARG_MAX_STACK_BASED removed completely.
Tested x86_64 that disassembly of installed shared libraries is
unchanged by the patch.
* sysdeps/unix/sysv/linux/kernel-features.h [__s390__]
(__ASSUME_UTIMES): Do not condition on kernel version.
(__ASSUME_PSELECT): Define unconditionally.
(__ASSUME_PPOLL): Likewise.
(__ASSUME_ATFCTS): Likewise.
(__ASSUME_SET_ROBUST_LIST): Do not condition on kernel version.
(__ASSUME_COMPLETE_READV_WRITEV): Define unconditionally.
(__ASSUME_FUTEX_LOCK_PI): Do not condition on kernel version.
(__ASSUME_UTIMENSAT): Define unconditionally.
(__ASSUME_PRIVATE_FUTEX): Likewise.
(__ASSUME_FALLOCATE): Likewise.
(__ASSUME_O_CLOEXEC): Likewise.
(__LINUX_ARG_MAX_STACK_BASED_MIN_KERNEL): Remove.
(__ASSUME_ARG_MAX_STACK_BASED): Likewise.
(__ASSUME_ADJ_OFFSET_SS_READ): Define unconditionally.
(__ASSUME_SOCK_CLOEXEC): Do not condition on kernel version.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
[__x86_64__ || __sparc__] (__ASSUME_ACCEPT4_SYSCALL): Likewise.
(__ASSUME_FUTEX_CLOCK_REALTIME): Define unconditionally.
(__ASSUME_AT_RANDOM): Likewise.
(__ASSUME_PREADV): Likewise.
(__ASSUME_PWRITEV): Likewise.
(__ASSUME_REQUEUE_PI): Do not condition on kernel version.
(__ASSUME_F_GETOWN_EX): Define unconditionally.
(__ASSUME_XFS_RESTRICTED_CHOWN): Likewise.
* sysdeps/unix/sysv/linux/sysconf.c (__sysconf)
[!__ASSUME_ARG_MAX_STACK_BASED]: Remove conditional code.
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
(__ASSUME_O_CLOEXEC): Define unconditionally.
(__ASSUME_PSELECT): Do not undefine conditionally.
(__ASSUME_PPOLL): Likewise.
(__ASSUME_ATFCTS): Likewise.
(__ASSUME_SET_ROBUST_LIST): Likewise.
(__ASSUME_UTIMENSAT): Likewise.
(__ASSUME_FDATASYNC): Define unconditionally.
* sysdeps/unix/sysv/linux/arm/kernel-features.h
(__ASSUME_SIGFRAME_V2): Likewise.
)__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_PSELECT): Do not undefine conditionally.
(__ASSUME_PPOLL): Likewise.
* sysdeps/unix/sysv/linux/ia64/kernel-features.h
(__ASSUME_PSELECT): Define unconditionally.
(__ASSUME_PPOLL): Likewise.
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
* sysdeps/unix/sysv/linux/m68k/kernel-features.h
(__ASSUME_O_CLOEXEC): Likewise.
(__ASSUME_SOCK_CLOEXEC): Likewise.
(__ASSUME_IN_NONBLOCK): Likewise.
(__ASSUME_PIPE2): Likewise.
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_DUP3): Likewise.
* sysdeps/unix/sysv/linux/mips/kernel-features.h
(__ASSUME_EVENTFD2): Likewise.
(__ASSUME_SIGNALFD4): Likewise.
(__ASSUME_ACCEPT4_SYSCALL): Likewise.
This patch fixes bug 16064, i386 fenv_t not including SSE state, using
the technique suggested there of storing the state in the existing
__eip field of fenv_t to avoid needing to increase the size of fenv_t
and add new symbol versions. The included testcase, which previously
failed for i386 (but passed for x86_64), illustrates how the previous
state was buggy.
This patch causes the SSE state to be included *to the extent it is on
x86_64*. Where some state should logically be included but isn't for
x86_64 (see bug 16068), this patch does not cause it to be included
for i386 either. The idea is that any patch fixing that bug should
fix it for both x86_64 and i386 at once.
Tested i386 and x86_64. (I haven't tested the case of a CPU without
SSE2 disabling the test.)
[BZ #16064]
* sysdeps/i386/fpu/fegetenv.c: Include <unistd.h>, <ldsodefs.h>
and <dl-procinfo.h>.
(__fegetenv): Save SSE state in envp->__eip if supported.
* sysdeps/i386/fpu/feholdexcpt.c (feholdexcept): Save SSE state in
envp->__eip if supported.
* sysdeps/i386/fpu/fesetenv.c: Include <unistd.h>, <ldsodefs.h>
and <dl-procinfo.h>.
(__fesetenv): Always set __eip, __cs_selector, __opcode,
__data_offset and __data_selector in environment to 0. Set SSE
state if supported.
* sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add
test-fenv-sse.
[$(subdir) = math] (CFLAGS-test-fenv-sse.c): Add -msse2
-mfpmath=sse.
* sysdeps/x86/fpu/test-fenv-sse.c: New file.
Set values for libc_commonpagesize and libc_relro_required for the
ARM port to enable relro by default and suppress a warning at
configure time.
ChangeLog:
2014-05-09 Will Newton <will.newton@linaro.org>
* sysdeps/arm/preconfigure.ac: Set libc_commonpagesize
and libc_relro_required for ARM.
* sysdeps/arm/preconfigure: Regenerate.
Added support for TX lock elision of pthread mutexes on s390 and
s390x. This may improve lock scaling of existing programs on TX
capable systems. The lock elision code is only built with
--enable-lock-elision=yes and then requires a GCC version supporting
the TX builtins. With lock elision default mutexes are elided via
__builtin_tbegin, if the cpu supports transactions. By default lock
elision is not enabled and the elision code is not built.
Add an optimized implementation of strcmp for ARMv7-A cores. This
implementation is significantly faster than the current generic C
implementation, particularly for strings of 16 bytes and longer.
Tested with the glibc string tests for arm-linux-gnueabihf and
armeb-linux-gnueabihf.
The code was written by ARM, who have agreed to assign the copyright
to the FSF for integration into glibc.
ChangeLog:
2014-05-09 Will Newton <will.newton@linaro.org>
* sysdeps/arm/armv7/strcmp.S: New file.
* NEWS: Mention addition of ARMv7 optimized strcmp.
The optimization is achieved by following techniques:
> data alignment [gain from aligned memory access on read/write]
> POWER7 gains performance with loop unrolling/unwinding
[gain by reduction of branch penalty].
> zero padding done by calling optimized memset