This patch fixes what I believe to be a bug in the handling of
R_ARM_IRELATIVE RELA relocations. At present, these are handled the
same as REL relocations: i.e. the addend is loaded from the relocation
address. Most of the time this isn't a problem because RELA relocations
aren't used on ARM (GNU/Linux at least) anyway, but it causes problems
with prelink, which uses RELA on all targets for its conflict table.
(Support for ifunc prelinking requires a prelink patch, not yet posted.)
Anyway, this patch works, though I'm not 100% sure if it is correct: I
notice that this code path received attention last year:
https://sourceware.org/ml/libc-ports/2013-07/msg00000.html
I'm not sure under what circumstances that patch would have had an
effect, nor if my patch conflicts with that case.
No regressions using Mentor's usual glibc cross-testing infrastructure.
[BZ #16888]
* sysdeps/arm/dl-machine.h (elf_machine_rela): Fix R_ARM_IRELATIVE
handling.
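The REL/RELA difference described above can be sketched as follows. This is only an illustration under simplifying assumptions: the record types and function names (sketch_rel, sketch_rela, rel_addend, rela_addend) are hypothetical and are not the code in sysdeps/arm/dl-machine.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified 32-bit relocation records for this sketch.  */
typedef struct { uint32_t r_offset; uint32_t r_info; } sketch_rel;
typedef struct { uint32_t r_offset; uint32_t r_info; int32_t r_addend; } sketch_rela;

/* REL: the addend is stored in the word being relocated.  */
static uint32_t
rel_addend (const unsigned char *image, const sketch_rel *r)
{
  uint32_t word;
  assert (image != NULL && r != NULL);
  memcpy (&word, image + r->r_offset, sizeof word);
  return word;
}

/* RELA: the addend is carried in the relocation entry itself, so the
   word at r_offset must not be consulted (it may hold stale data, for
   example after prelinking).  */
static uint32_t
rela_addend (const sketch_rela *r)
{
  assert (r != NULL);
  return (uint32_t) r->r_addend;
}
```

For an IRELATIVE relocation the addend identifies the resolver function, so reading it from the wrong place means calling the wrong address.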
This patch increases the minimum Linux kernel version for glibc to
2.6.32, as discussed in the thread starting at
<https://sourceware.org/ml/libc-alpha/2014-01/msg00511.html>.
This patch just does the minimal change to arch_minimum_kernel
settings (and LIBC_LINUX_VERSION, which determines the minimum kernel
headers version, as it doesn't make sense for that to be older than
the minimum kernel that can be used at runtime). Followups would be
expected to do, roughly and not necessarily precisely in this order:
* Remove __LINUX_KERNEL_VERSION checks in kernel-features.h files
where those checks are always true / always false for kernels 2.6.32
and above.
* Otherwise simplify/improve conditionals in those files (for example,
where defining once in the main file then undefining in
architecture-specific files makes things clearer than having lots of
separate definitions of the same macro), possibly fixing in the
process cases where a macro should optimally have been defined for a
given architecture but wasn't. (In the review in preparation for
this version increase I checked what the right conditions should be
for all macros in the main kernel-features.h whose definitions there
would have been affected by the increase - but I only fixed that
subset of the issues found where --enable-kernel=2.6.32 would have
caused a kernel feature to be wrongly assumed to be present, not any
cases where a feature is not assumed but could be assumed.)
* Remove conditionals on __ASSUME_* where they can now be taken to be
always-true, and the definitions when the macros are only used in
Linux-specific files.
* Split more architectures out of the main kernel-features.h (like
ex-ports architectures), once various of the architecture
conditionals there have been eliminated so the new
architecture-specific files are no larger than actually necessary.
Tested x86_64.
2014-03-27 Joseph Myers <joseph@codesourcery.com>
[BZ #9894]
* sysdeps/unix/sysv/linux/configure.ac (LIBC_LINUX_VERSION):
Change to 2.6.32.
(arch_minimum_kernel): Change all 2.6.16 settings to 2.6.32.
* sysdeps/unix/sysv/linux/configure: Regenerated.
* sysdeps/unix/sysv/linux/microblaze/configure.ac: Remove file.
* sysdeps/unix/sysv/linux/microblaze/configure: Likewise.
* sysdeps/unix/sysv/linux/tile/configure.ac: Likewise.
* sysdeps/unix/sysv/linux/tile/configure: Likewise.
* README: Update reference to required Linux kernel version.
* manual/install.texi (Linux): Update reference to required Linux
kernel headers version.
* INSTALL: Regenerated.
This patch optimizes the FPSCR update in the exception and rounding
change functions by writing the register only if the new value differs
from the current one. It also optimizes fedisableexcept and
feenableexcept by removing an unnecessary FPSCR read.
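A minimal sketch of the "write only on change" idea, using a plain variable as a stand-in for the FPSCR. All names here are hypothetical; the real code reads and writes the register with PowerPC instructions rather than a C variable.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware register, plus a write counter so the
   effect of the optimization can be observed.  */
static uint64_t sketch_fpscr;
static unsigned sketch_fpscr_writes;

static uint64_t
sketch_read_fpscr (void)
{
  return sketch_fpscr;
}

static void
sketch_write_fpscr (uint64_t value)
{
  sketch_fpscr = value;
  sketch_fpscr_writes++;
}

/* Set the bits selected by MASK to BITS, writing the register only
   if the resulting value differs from the current one.  */
static void
sketch_update_fpscr (uint64_t mask, uint64_t bits)
{
  assert ((bits & ~mask) == 0);
  uint64_t old = sketch_read_fpscr ();
  uint64_t updated = (old & ~mask) | bits;
  if (updated != old)
    sketch_write_fpscr (updated);
}
```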
__int128 was added in GCC 4.6 and __int128_t was added before x86-64
was supported. This patch replaces __int128 with __int128_t so that
the installed bits/link.h can be used with older GCC.
* sysdeps/x86/bits/link.h (La_x86_64_regs): Replace __int128
with __int128_t.
(La_x86_64_retval): Likewise.
The current implementation of setcontext uses rt_sigreturn to restore
the contents of registers. This contrasts with the way most other
architectures implement setcontext:
powerpc64, mips, tile:
Call rt_sigreturn if context was created by a call to a signal handler,
otherwise restore in user code.
powerpc32:
Call swapcontext system call and don't call sigreturn or rt_sigreturn.
x86_64, sparc, hppa, sh, ia64, m68k, s390, arm:
Only support restoring "synchronous" contexts, that is, contexts
created by getcontext; restore in user code and don't call
sigreturn or rt_sigreturn.
alpha:
Call sigreturn (but not rt_sigreturn) in all cases to do the restore.
The text of the setcontext manpage suggests that the requirement to be
able to restore a signal handler created context has been dropped from
SUSv2:
If the context was obtained by a call to a signal handler, then old
standard text says that "program execution continues with the program
instruction following the instruction interrupted by the signal".
However, this sentence was removed in SUSv2, and the present verdict
is "the result is unspecified".
Implementing setcontext by calling rt_sigreturn unconditionally causes
problems when used with sigaltstack as in BZ #16629. On this basis it
seems that aarch64 is broken and that new ports should only support
restoring contexts created with getcontext and do not need to call
rt_sigreturn at all.
This patch re-implements the aarch64 setcontext function to restore
the context in user code in a similar manner to x86_64 and other ports.
ChangeLog:
2014-04-17 Will Newton <will.newton@linaro.org>
[BZ #16629]
* sysdeps/unix/sysv/linux/aarch64/setcontext.S (__setcontext):
Re-implement to restore registers in user code and avoid
rt_sigreturn system call.
Besides fixing the bugzilla, this also fixes corner-cases where the high
and low double differ greatly in magnitude, and handles a denormal
input without resorting to a fp rescale.
[BZ #16740]
[BZ #16619]
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Rewrite.
* math/libm-test.inc (frexp_test_data): Add tests.
[BZ #15215] This unifies various pthread_once architecture-specific
implementations which were using the same algorithm with slightly different
implementations. It also adds missing memory barriers that are required for
correctness.
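To illustrate where the barriers matter, here is a deliberately simplified once-initialization sketch using C11 atomics. It is an assumption-laden illustration, not the unified glibc algorithm: it spins instead of blocking and ignores fork and cancellation, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { SKETCH_ONCE_INIT = 0, SKETCH_ONCE_RUNNING = 1, SKETCH_ONCE_DONE = 2 };

typedef struct { atomic_int state; } sketch_once_t;

static void
sketch_once (sketch_once_t *once, void (*init) (void))
{
  assert (once != NULL && init != NULL);

  /* Fast path: the acquire load pairs with the release store below, so
     a thread that sees DONE also sees everything init() wrote.  */
  if (atomic_load_explicit (&once->state, memory_order_acquire)
      == SKETCH_ONCE_DONE)
    return;

  int expected = SKETCH_ONCE_INIT;
  if (atomic_compare_exchange_strong_explicit (&once->state, &expected,
                                               SKETCH_ONCE_RUNNING,
                                               memory_order_acquire,
                                               memory_order_acquire))
    {
      init ();
      /* Release: publish init()'s writes before announcing completion.  */
      atomic_store_explicit (&once->state, SKETCH_ONCE_DONE,
                             memory_order_release);
      return;
    }

  /* Another thread is initializing; wait for it (a real implementation
     would block instead of spinning).  */
  while (atomic_load_explicit (&once->state, memory_order_acquire)
         != SKETCH_ONCE_DONE)
    ;
}

/* Demo initializer that counts how often it runs.  */
static int sketch_init_calls;

static void
sketch_count_init (void)
{
  sketch_init_calls++;
}
```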
This patch saves and restores bound registers in symbol lookup for x86-64:
1. Branches without BND prefix clear bound registers.
2. x86-64 passes bounds in bound registers as specified in the MPX psABI
extension on the hjl/mpx/master branch at
https://github.com/hjl-tools/x86-64-psABI
https://groups.google.com/forum/#!topic/x86-64-abi/KFsB0XTgWYc
Binutils has been updated to create an alternate PLT to add BND prefix
when branching to ld.so.
* config.h.in (HAVE_MPX_SUPPORT): New #undef.
* sysdeps/x86_64/configure.ac: Set HAVE_MPX_SUPPORT.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/dl-trampoline.S (REGISTER_SAVE_AREA): New
macro.
(REGISTER_SAVE_RAX): Likewise.
(REGISTER_SAVE_RCX): Likewise.
(REGISTER_SAVE_RDX): Likewise.
(REGISTER_SAVE_RSI): Likewise.
(REGISTER_SAVE_RDI): Likewise.
(REGISTER_SAVE_R8): Likewise.
(REGISTER_SAVE_R9): Likewise.
(REGISTER_SAVE_BND0): Likewise.
(REGISTER_SAVE_BND1): Likewise.
(REGISTER_SAVE_BND2): Likewise.
(_dl_runtime_resolve): Use them. Save and restore Intel MPX
bound registers when calling _dl_fixup.
pathconf(_PC_NAME_MAX) was implemented on top of statfs(). The 32-bit
version therefore fails with EOVERFLOW if the filesystem block count is
sufficiently large.
Most pathconf() queries use statvfs64(), which avoids this issue. This
patch modifies pathconf(_PC_NAME_MAX) to do likewise.
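For illustration, the name-length limit can be read from the statvfs family roughly as below. This sketch uses the POSIX statvfs() call and its f_namemax field; the helper name is hypothetical, and a 32-bit build would need the 64-bit variant (statvfs64, or _FILE_OFFSET_BITS=64) to get the overflow-safe behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/* Return the maximum file name length of the filesystem holding PATH,
   or -1 if the query fails.  */
static long
sketch_name_max (const char *path)
{
  struct statvfs sv;
  assert (path != NULL);
  if (statvfs (path, &sv) != 0)
    return -1;
  return (long) sv.f_namemax;
}
```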
This patch moves the __PTHREAD_SPINS definition to an arch-specific
header, since the pthread_mutex_t layout is also arch-specific. This
removes the need to define __PTHREAD_MUTEX_HAVE_ELISION and thus removes
the compiler warning about the undefined macro.
This patch fixes some powerpc32 and powerpc64 builds with the
--disable-multi-arch option along with different --with-cpu=powerN.
It cleans up the Implies directories by removing the multiarch
folder for non-multiarch configurations and also fixes two assembly
implementations: powerpc64/power7/strncat.S, which calls the
wrong strlen, and power8/fpu/s_isnan.S, which lacks the hidden_def and
weak_alias directives.
This patch makes configure add -D_CALL_ELF=1 when the compiler does
not define _CALL_ELF (versions before powerpc64le support). It cleans
up compiler warnings on old compilers where _CALL_ELF is not defined
on powerpc64(be) builds.
It does so by adding a new config.make variable for configure-deduced
CPPFLAGS and accumulating into that (config-extra-cppflags). It also
generalizes libc_extra_cflags so it accumulates in sysdeps configure
fragments.
This patch fixes the bogus results of the powerpc32 optimized
nearbyint/nearbyintf in FE_DOWNWARD rounding mode. They are due to the
wrong instruction sequence being used in the rounding calculation (two
subtractions instead of an addition and a subtraction).
Fixes BZ#16815.
This patch fixes incorrect results from catan and catanh of certain
special inputs in round-downward mode (bug 16799), and incorrect
results of __ieee754_logf (+/-0) in round-downward mode (bug 16800)
that show up through catan/catanh when tested in all rounding modes,
but not directly in the testing for logf because the bug gets hidden
by the wrappers.
Both bugs involve a zero that should be +0 being -0 instead: one
computed as (1-x)*(1+x) in the catan/catanh case, and one as (x-x) in
the logf case. The fixes ensure positive zero is used. Testing of
catan and catanh in all rounding modes is duly enabled.
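The "ensure positive zero" step can be sketched like this. The helper name is hypothetical and this is not the code from the patch; it only shows one way to turn a zero of either sign into +0 before it is used as a denominator or atan2 argument.

```c
#include <assert.h>
#include <math.h>

/* Return +0 for a zero of either sign; return other values unchanged.
   -0 compares equal to 0, so the comparison catches both zeros.  */
static double
sketch_positive_zero (double v)
{
  assert (v == v);   /* this sketch does not handle NaN */
  return v == 0.0 ? 0.0 : v;
}
```

In round-downward mode an expression such as x - x evaluates to -0, which is the case this guards against.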
I expect there are various other bugs in special cases in __ieee754_*
functions that are normally hidden by the wrappers but would show up
for testing with -lieee (or in future with -fno-math-errno if we
replace -lieee and _LIB_VERSION with compile-time redirection to new
*_noerrno symbol names).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16799]
[BZ #16800]
* math/s_catan.c (__catan): Avoid passing -0 denominator to atan2
with 0 numerator.
* math/s_catanf.c (__catanf): Likewise.
* math/s_catanh.c (__catanh): Likewise.
* math/s_catanhf.c (__catanhf): Likewise.
* math/s_catanhl.c (__catanhl): Likewise.
* math/s_catanl.c (__catanl): Likewise.
* sysdeps/ieee754/flt-32/e_logf.c (__ieee754_logf): Always divide
by positive zero when computing -Inf result.
* math/libm-test.inc (catan_test): Use ALL_RM_TEST.
(catanh_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes bug 16789, incorrect sign of (real part) zero result
from clog and clog10 in round-downward mode, arising from that real
part being computed as 0 - 0. To ensure that an underflow exception
occurred, the code used an underflowing value (the next term in the
series for log1p) in arithmetic computing the real part of the result,
yielding the problematic 0 - 0 computation in some cases even when the
mathematical result would be small but positive. The patch changes
this code to use the math_force_eval approach to ensuring that an
underflowing computation actually occurs. Tests of clog and clog10
are enabled in all rounding modes.
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16789]
* math/s_clog.c (__clog): Use math_force_eval to ensure underflow
instead of using underflowing value in computing result.
* math/s_clog10.c (__clog10): Likewise.
* math/s_clog10f.c (__clog10f): Likewise.
* math/s_clog10l.c (__clog10l): Likewise.
* math/s_clogf.c (__clogf): Likewise.
* math/s_clogl.c (__clogl): Likewise.
* math/libm-test.inc (clog_test): Use ALL_RM_TEST.
(clog10_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Fix for values near a power of two, and some tidies.
[BZ #16739]
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Correct
output when value is near a power of two. Use int64_t for lx and
remove casts. Use decimal rather than hex exponent constants.
Don't use long double multiplication when double will suffice.
* math/libm-test.inc (nextafter_test_data): Add tests.
* NEWS: Add 16739 and 16786 to bug list.
This patch continues fixing __ASSUME_* issues in preparation for
moving to a 2.6.32 minimum kernel version by addressing assumptions on
robust mutex and PI futex support availability. Those assumptions are
bug 9894, but to be clear this patch does not address all the issues
from that bug about wrong version assumptions, only those still
applicable for --enable-kernel=2.6.32 or later (with the expectation
that the move to that minimum kernel will obsolete the other parts of
the bug). The patch is independent of
<https://sourceware.org/ml/libc-alpha/2014-03/msg00585.html>, my other
pending-review patch preparing for the kernel version change; the two
together complete all the changes I believe are needed in preparation
regarding any macro in sysdeps/unix/sysv/linux/kernel-features.h that
would be affected by such a change. (I have not checked the
correctness of macros whose conditions are unaffected by such a
change, or macros only defined in other kernel-features.h files.)
As discussed in that bug, robust mutexes and PI futexes need
futex_atomic_cmpxchg_inatomic to be implemented, in addition to
certain syscalls needed for robust mutexes (and
architecture-independent kernel pieces for all the features in
question). That is, as I understand it, they need
futex_atomic_cmpxchg_inatomic to *work* (not return an ENOSYS error).
The issues identified in my analysis relate to ARM, M68K, MicroBlaze,
MIPS and SPARC.
On ARM, whether futex_atomic_cmpxchg_inatomic works depends on the
kernel configuration. As of 3.13, the condition for *not* working is
CONFIG_CPU_USE_DOMAINS && CONFIG_SMP. As of 2.6.32 it was simply
CONFIG_SMP that meant the feature was not implemented. I don't know
if there are any circumstances in which we can say "we can assume a
userspace glibc binary built with these options will never run on a
kernel with the problematic configuration", but at least for now I'm
just undefining the relevant __ASSUME_* macros for ARM.
On M68K, two of the three macros are undefined for kernels before
3.10, but as far as I can see __ASSUME_FUTEX_LOCK_PI is in the same
group needing futex_atomic_cmpxchg_inatomic support and so should be
undefined as well.
On MicroBlaze the required support was added in 2.6.33.
On MIPS, the support depends on cpu_has_llsc in the kernel - that is,
actual hardware LL/SC support (GCC and glibc for MIPS GNU/Linux rely
on the instructions being supported in some way, but it may be kernel
emulation; futex_atomic_cmpxchg_inatomic doesn't work with that
emulation). The same condition as in GCC for indicating LL/SC support
may not be available is used for undefining the macros in glibc,
__mips == 1 || defined _MIPS_ARCH_R5900. (Maybe we could in fact
desupport MIPS processors without the hardware support in glibc.)
On SPARC, 32-bit kernels don't support futex_atomic_cmpxchg_inatomic;
__arch64__ || __sparc_v9__ is used as the condition for binaries that
won't run on 32-bit kernels.
This patch is not tested beyond the sanity check of an x86_64 build.
[BZ #9894]
* sysdeps/unix/sysv/linux/kernel-features.h
[__sparc__ && !__arch64__ && !__sparc_v9__]
(__ASSUME_SET_ROBUST_LIST): Do not define.
[__sparc__ && !__arch64__ && !__sparc_v9__]
(__ASSUME_FUTEX_LOCK_PI): Likewise.
[__sparc__ && !__arch64__ && !__sparc_v9__] (__ASSUME_REQUEUE_PI):
Likewise.
* sysdeps/unix/sysv/linux/arm/kernel-features.h
(__ASSUME_FUTEX_LOCK_PI): Undefine.
(__ASSUME_REQUEUE_PI): Likewise.
(__ASSUME_SET_ROBUST_LIST): Likewise.
* sysdeps/unix/sysv/linux/m68k/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x030a00] (__ASSUME_FUTEX_LOCK_PI):
Undefine.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_FUTEX_LOCK_PI):
Likewise.
[__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_REQUEUE_PI):
Likewise.
[__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_SET_ROBUST_LIST):
Likewise.
* sysdeps/unix/sysv/linux/mips/kernel-features.h
[__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_FUTEX_LOCK_PI):
Undefine.
[__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_REQUEUE_PI): Likewise.
[__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_SET_ROBUST_LIST):
Likewise.
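The __LINUX_KERNEL_VERSION thresholds in the entries above pack the version as one byte per component, so 0x020621 is 2.6.33 and 0x030a00 is 3.10.0. A small sketch of that encoding (the function name is hypothetical):

```c
#include <assert.h>

/* Pack major.minor.patch into the form compared against
   __LINUX_KERNEL_VERSION: one byte per component.  */
static unsigned
sketch_kernel_version (unsigned major, unsigned minor, unsigned patch)
{
  assert (major < 256 && minor < 256 && patch < 256);
  return (major << 16) | (minor << 8) | patch;
}
```

With this encoding, a build configured for 2.6.32 compares below 0x020621, so the MicroBlaze macros stay undefined for it.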
Continuing the fixes for __ASSUME_* issues in preparation for moving
to a 2.6.32 minimum kernel version, this *untested* patch fixes bug
16648, the definition of __ASSUME_ATFCTS meaning that the futimesat
syscall is assumed for all MicroBlaze kernels despite not being
present until 2.6.33.
__ASSUME_ATFCTS controls conditionals relating to a lot of different
syscalls in Linux-specific code (fstatat64 faccessat fchmodat fchownat
futimesat newfstatat linkat mkdirat openat readlinkat renameat
symlinkat unlinkat mknodat), where whether newfstatat fstatat64
futimesat are used depends on the architecture, as well as controlling
whether openat64_not_cancel_3 is expected to work in
sysdeps/posix/getcwd.c. The assumptions are all OK as of 2.6.32
except for this MicroBlaze case, and it's generally desirable to get
rid of as many of the __ASSUME_ATFCTS conditionals as possible, to
simplify the code (the fallbacks include potential unbounded dynamic
stack allocations). Thus, rather than the simplest approach of
undefining __ASSUME_ATFCTS for older kernels on MicroBlaze, this patch
takes the approach of using the linux-generic implementation of
futimesat for MicroBlaze kernels before 2.6.33 (all such kernels have
the utimensat syscall).
[BZ #16648]
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_FUTIMESAT): Define.
* sysdeps/unix/sysv/linux/microblaze/futimesat.c: New file.
This patch fixes bug 16348, spurious underflows from x86/x86_64 expl
on arguments close to 0. These implementations effectively use expm1
(on the fractional part of the argument) internally, resulting in
spurious underflows when the result is very close to 1. For arguments
small enough that the round-to-nearest correct result is 1, this patch
uses 1+x instead.
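In C terms the shortcut looks roughly like the sketch below. The real change is in the x87 assembly and tests the exponent field; this illustration compares magnitudes instead, and the function name and interface are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* If |x| < 2^-67, store 1+x in *result and return true.  Otherwise
   return false so the caller continues with the full algorithm.  */
static bool
sketch_expl_tiny (long double x, long double *result)
{
  assert (result != NULL);
  long double ax = x < 0 ? -x : x;
  if (ax < 0x1p-67L)
    {
      *result = 1.0L + x;
      return true;
    }
  return false;
}
```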
These implementations are also used for exp10l and so the patch fixes
similar issues there (the 0x1p-67 threshold being small enough to be
correct for exp10l as well as expl). But because of spurious
underflows in other exp10 implementations (bug 16560), the tests
aren't added for exp10 at this point - they can be added when the
other exp10 parts of that bug are fixed.
Tested x86_64 and x86; no ulps updates needed.
[BZ #16348]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Use
1+x for argument with exponent below -67.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of exp.
* math/auto-libm-test-out: Regenerated.
Bug 16198 is x86_64 fegetenv wrongly masking exceptions for which
traps are enabled, because that's a side-effect of the fnstenv
instruction. This patch fixes it to use fldenv immediately after
fnstenv, like the i386 version. Tested x86_64 and x86.
[BZ #16198]
* sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Use fldenv after
fnstenv.
* math/test-fenv-preserve.c: New file.
* math/Makefile (tests): Add test-fenv-preserve.
gen-auto-libm-tests presently allows but does not require underflow
exceptions for results with magnitude in the range (greatest
subnormal, least normal].
In some cases, the magnitude of the exact result is very slightly
above the least normal, but rounding in the implementation results in
it effectively computing an infinite-precision result that is slightly
below the least normal, so raising an underflow exception. This is in
accordance with the documented accuracy goals, but results in
testsuite failures.
This patch changes the logic to allow underflows when the mathematical
result is up to 0.5ulp above the least normal (so in any case where
the round-to-nearest result is the least normal). Ideally underflows
in all these cases would be accepted only when an underflow with the
actual result is consistent with the rounding mode (in FE_TOWARDZERO
mode, a return value of the least normal implies that the
infinite-precision result did not underflow so there should be no
underflow exception, for example), so as to match the documented goals
more precisely - whereas at present the tests for exceptions are
completely independent of the tests of the returned values. (The same
applies to overflow exceptions as well - they too should be checked
for consistency with the result, as in FE_TOWARDZERO mode a result
1ulp below the largest finite value should be inconsistent with an
overflow exception and cause a failure with overflow rather than
simply being considered a 1ulp error when overflow is expected.) But
the present patch at least deals with the cases causing spurious
failures so that (a) certain existing tests no longer need to be
marked as having spurious exceptions (such markings in
auto-libm-test-in end up applying to more cases than just those they
are needed for) and (b) log1p can be tested in all rounding modes
without introducing more such failures. This patch duly moves tests
of log1p to ALL_RM_TEST.
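The new acceptance window can be sketched for the float format as follows. This is a simplified illustration, not the MPFR-based logic in gen-auto-libm-tests.c: it evaluates the float thresholds in double (where they are exactly representable), treats every result at or below the greatest subnormal as requiring underflow (which assumes the result is inexact), and uses hypothetical names.

```c
#include <assert.h>
#include <float.h>

typedef enum
{
  SKETCH_UFLOW_REQUIRED,
  SKETCH_UFLOW_ALLOWED,
  SKETCH_UFLOW_FORBIDDEN
} sketch_uflow;

/* Classify the magnitude of an exact result for the float format.  */
static sketch_uflow
sketch_classify_underflow (double exact_abs)
{
  const double least_normal = FLT_MIN;         /* 2^-126 */
  const double ulp = 0x1p-149;                 /* float spacing at 2^-126 */
  const double greatest_subnormal = least_normal - ulp;
  const double min_plus_half = least_normal + ulp / 2;

  assert (exact_abs > 0);
  if (exact_abs <= greatest_subnormal)
    return SKETCH_UFLOW_REQUIRED;
  if (exact_abs <= min_plus_half)
    return SKETCH_UFLOW_ALLOWED;
  return SKETCH_UFLOW_FORBIDDEN;
}
```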
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16357]
[BZ #16599]
* math/gen-auto-libm-tests.c (fp_format_desc): Add field
min_plus_half.
(fp_formats): Update initializers.
(init_fp_formats): Initialize new field.
(output_for_one_input_case): Allow underflow for results up to
min_plus_half.
* math/libm-test.inc (log1p_test): Use ALL_RM_TEST.
* math/auto-libm-test-in: Don't mark some underflows from asin and
atanh as spurious.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
My recent exp patch introduced warnings about implicit __isinf
declarations in exp because e_exp.c didn't include <math.h>. This
patch fixes this. Because <math.h> can't be included after
<math_private.h> (because of macro definitions of __nan*), it was
necessary to put an include in sysdeps/x86_64/fpu/multiarch/e_exp.c as
well.
Tested x86_64.
* sysdeps/ieee754/dbl-64/e_exp.c: Include <math.h>.
* sysdeps/x86_64/fpu/multiarch/e_exp.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
This fixes a bug in the way the results from __nscd_getai are collected:
for every returned result a new entry is first added to the
gaih_addrtuple list, but if that result doesn't match the request this
entry remains uninitialized. So for this non-matching result an extra
result with uninitialized content is returned.
To reproduce (with nscd running):
$ getent ahostsv4 localhost
127.0.0.1 STREAM localhost
127.0.0.1 DGRAM
127.0.0.1 RAW
(null) STREAM
(null) DGRAM
(null) RAW
To reproduce:
# ip li add name dummy0 type dummy
# site_id=$(head -c6 /dev/urandom | od -tx2 -An | tr ' ' ':')
# for ((i = 0; i < 65536; i++)) do
> ip ad ad $(printf fd80$site_id::%04x $i)/128 dev dummy0
> done
# (ulimit -s 900; getent ahosts localhost)
# ip li de dummy0
The dbl-64 version of exp needs round-to-nearest mode for its internal
computations, but that has the consequence of inappropriate
overflowing and underflowing results in other rounding modes. This
patch fixes this by recomputing the relevant results in cases where
the round-to-nearest result overflows to infinity or underflows to
zero (most of the diffs are actually just consequent reindentation).
Tests are enabled in all rounding modes for complex functions using
exp - but not for cexp because it turns out there are bugs causing
spurious underflows for cexp for some tests, which will need to be
fixed separately (I suspect ccos ccosh csin csinh ctan ctanh have
similar bugs, just not shown by the present set of test inputs).
Tested x86_64 and x86 and ulps updated accordingly.
[BZ #16284]
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Use original
rounding mode to recompute results that overflow to infinity or
underflow to zero.
* math/auto-libm-test-in: Don't mark tests as expected to fail for
bug 16284.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (ccos_test): Use ALL_RM_TEST.
(ccosh_test): Likewise.
(csin_test_data): Use plus_oflow.
(csin_test): Use ALL_RM_TEST.
(csinh_test_data): Use plus_oflow.
(csinh_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch fixes -Wundef warnings related to the _ABI* macros on MIPS.
GCC predefines only the _ABI* macro related to the ABI actually in
use, meaning that a conditional such as "#if _MIPS_SIM == _ABI64" is
true only for the ABI in question (all the macros are nonzero), but
produces a -Wundef warning for the other ABIs. The normal approach to
using these macros is to include <sgidefs.h>, which ensures that all
three _ABI* macros are defined rather than just one; this patch does
so in the places that caused warnings (the bulk of the warnings
arising from <bits/wordsize.h>). Tested that the warnings are fixed.
* sysdeps/mips/bits/wordsize.h: Include <sgidefs.h>.
* sysdeps/unix/sysv/linux/mips/getrlimit64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/setrlimit64.c: Likewise.
According to ISO C Annex F, log (1) should be +0 in all rounding
modes, but some implementations in glibc wrongly return -0 in
round-downward mode (mapping to log1p (x - 1) is problematic because 1
- 1 is -0 in round-downward mode, and log1p (-0) is -0). This patch
fixes this. (It helps with some implementations of other functions
such as acosh, log2 and log10 that call out to log, but not enough to
enable all-rounding-modes testing for those functions without further
fixes to other implementations of them.)
Tested x86_64 and x86 and ulps updated accordingly, and did spot tests
for mips64 for the ldbl-128 fix, and i586 for the sysdeps/i386/fpu
implementations shadowed by those in sysdeps/i386/i686/fpu.
[BZ #16731]
* sysdeps/i386/fpu/e_log.S (__ieee754_log): Take absolute value
when x - 1 is zero.
* sysdeps/i386/fpu/e_logf.S (__ieee754_logf): Likewise.
* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Return +0 when
argument is 1.
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/x86_64/fpu/e_logl.S: Take absolute value when x - 1 is
zero.
* math/libm-test.inc (log_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds an optimized strpbrk for POWER7 by using a different
algorithm than the default implementation: it constructs a table based
on the 'accept' argument and uses this table to check for any occurrence
in the input string. The idea is similar to what x86_64 uses.
For PowerPC some tunings were added, such as loop unrolling and
clearing memory using VSX instructions.
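A portable C sketch of the table approach is shown below. It only illustrates the algorithm; the POWER7 version is hand-written assembly, and the function name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Table-driven strpbrk: mark every byte of ACCEPT in a 256-entry
   table, then scan S with one table lookup per byte.  */
static char *
sketch_strpbrk (const char *s, const char *accept)
{
  unsigned char table[256];
  assert (s != NULL && accept != NULL);

  memset (table, 0, sizeof table);
  for (const unsigned char *a = (const unsigned char *) accept; *a != '\0'; a++)
    table[*a] = 1;

  for (const unsigned char *p = (const unsigned char *) s; *p != '\0'; p++)
    if (table[*p])
      return (char *) p;
  return NULL;
}
```

Building the table costs a fixed clear plus one pass over 'accept', after which each input byte needs a single lookup instead of a scan of the whole set.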
This patch makes libm-test.inc tests of most functions use ALL_RM_TEST
unless there was some reason to defer that change for a particular
function.
I started out planning to defer the change for pow (bug 16315), cexp /
ccos / ccosh / csin / csinh (likely fallout from exp, bug 16284) and
cpow (exact expectations for signs of exact zero results not wanted).
Testing on x86_64 and x86 showed additional failures for acosh, cacos,
catan, catanh, clog, clog10, jn, log, log10, log1p, log2, tgamma, yn,
so making the change for those functions was deferred as well, pending
investigation to show which of these represent distinct bugs (some
such bugs may already be filed) and appropriate fixing / XFAILing.
Failures include wrong signs of zero results, errors slightly above
the 9ulp bound (in such cases it may make sense for functions to set
round-to-nearest internally to reduce error accumulation), large
errors and incorrect overflow/underflow for the rounding mode (with
consequent missing errno settings in some cases). It's possible some
could be issues with test expectations, though I didn't notice any
that were obviously like that (I added NO_TEST_INLINE for cases that
were failing for ildouble on x86 and where it seemed reasonable for
them to fail for the fast-math inlines).
There may of course be failures on other architectures for functions
that didn't fail on x86_64 or x86, in which case the usual rule
applies: file a bug (preferably identifying the underlying problem
function, in cases where function A calls function B and a problem
with function B may present in the test results for function A) if not
already in Bugzilla then fix or XFAIL.
Tested x86_64 and x86 and ulps updated accordingly.
* math/libm-test.inc (asinh_test): Use ALL_RM_TEST.
(atan_test): Likewise.
(atanh_test_data): Use NO_TEST_INLINE for two tests.
(atanh_test): Use ALL_RM_TEST.
(atan2_test_data): Likewise.
(cabs_test): Likewise.
(cacosh_test): Likewise.
(carg_test): Likewise.
(casin_test): Likewise.
(casinh_test): Likewise.
(cbrt_test): Likewise.
(csqrt_test): Likewise.
(erf_test): Likewise.
(erfc_test): Likewise.
(pow10_test): Likewise.
(exp2_test): Likewise.
(hypot_test): Likewise.
(j0_test): Likewise.
(j1_test): Likewise.
(lgamma_test): Likewise.
(gamma_test): Likewise.
(sincos_test): Likewise.
(tanh_test): Likewise.
(y0_test): Likewise.
(y1_test): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds an optimized strcspn for POWER7 by using a different
algorithm than the default implementation: it constructs a table based
on the 'accept' argument and uses this table to check for any occurrence
in the input string. The idea is similar to what x86_64 uses.
For PowerPC some tunings were added, such as loop unrolling and
aligning the table's stack memory to 16 bytes (so the VSX clear can run
without alignment issues).
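The same table idea applied to strcspn, again as a portable sketch with a hypothetical function name rather than the POWER7 assembly. The set argument (called 'accept' in the text above) is named reject here, which is a naming choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Table-driven strcspn: return the length of the initial part of S
   that contains no byte from REJECT.  */
static size_t
sketch_strcspn (const char *s, const char *reject)
{
  unsigned char table[256];
  assert (s != NULL && reject != NULL);

  memset (table, 0, sizeof table);
  table[0] = 1;   /* the terminator stops the scan as well */
  for (const unsigned char *r = (const unsigned char *) reject; *r != '\0'; r++)
    table[*r] = 1;

  const unsigned char *p = (const unsigned char *) s;
  while (!table[*p])
    p++;
  return (size_t) (p - (const unsigned char *) s);
}
```

Marking the NUL byte in the table lets the scan loop test a single condition per byte.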
Reviewing (for all architectures, with a baseline kernel version of
2.6.32) the kernel support for features for which __ASSUME_* macros
would be affected by a move to 2.6.32 as minimum kernel version showed
that __ASSUME_PREADV and __ASSUME_PWRITEV were wrongly defined for
MicroBlaze (despite the corresponding syscall table entries not being
wired up in the kernel) and Alpha for 2.6.30 and above (although the
support on Alpha was added in 2.6.33). This patch makes the
kernel-features.h files undefine those macros for appropriate
versions.
[BZ #16649]
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_PREADV): Undefine.
[__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_PWRITEV): Likewise.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
(__ASSUME_PREADV): Undefine.
(__ASSUME_PWRITEV): Likewise.
This comment appears to have been copied from the ARM port where it
makes more sense.
2014-03-18 Will Newton <will.newton@linaro.org>
* sysdeps/unix/sysv/linux/aarch64/sysdep.h: Remove
inaccurate comment.
ChangeLog:
2014-03-17 Will Newton <will.newton@linaro.org>
* sysdeps/generic/math_private.h: Check whether
HAVE_RM_CTX is defined with #ifdef rather
than #if.
ChangeLog:
2014-03-17 Will Newton <will.newton@linaro.org>
* sysdeps/generic/ldsodefs.h: Check whether
HP_SMALL_TIMING_AVAIL is defined with #ifdef rather
than #if.
The roundl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_roundl.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second long double.
Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).
Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_roundl.c instead fixes the failing math.
This fixes 16707.
The nearbyintl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_nearbyintl.S)
returns wrong results for some inputs where first double is a exact
integer and the precision is determined by second long double.
Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).
Removing the implementation, so that the build selects
sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c instead, fixes the failing
math tests.
Fixes BZ#16706.
The ceill assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_ceill.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second double.
Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).
Removing the implementation, so that the build selects
sysdeps/ieee754/ldbl-128ibm/s_ceill.c instead, fixes the failing math
tests.
Fixes BZ#16701.
This patch checks AVX-512 assembler support first and sets
libc_cv_cc_avx512 to $libc_cv_asm_avx512 instead of yes. GCC won't
support AVX-512 if the assembler doesn't support it.
* sysdeps/x86_64/configure.ac: Check AVX-512 assembler support
first. Disable AVX-512 GCC support if assembler doesn't support
it.
* sysdeps/x86_64/configure: Regenerated.
AVX-512 ISA adds 512-bit zmm registers. This patch updates
_dl_runtime_profile to pass zmm registers to run-time audit. It also
changes _dl_x86_64_save_sse and _dl_x86_64_restore_sse to support zmm
registers; these are called only when RTLD_PREPARE_FOREIGN_CALL
is used. The performance impact is minimal.
* config.h.in (HAVE_AVX512_SUPPORT): New #undef.
(HAVE_AVX512_ASM_SUPPORT): Likewise.
* sysdeps/x86_64/bits/link.h (La_x86_64_zmm): New.
(La_x86_64_vector): Add zmm.
* sysdeps/x86_64/Makefile (tests): Add tst-audit10.
(modules-names): Add tst-auditmod10a and tst-auditmod10b.
($(objpfx)tst-audit10): New target.
($(objpfx)tst-audit10.out): Likewise.
(tst-audit10-ENV): New.
(AVX512-CFLAGS): Likewise.
(CFLAGS-tst-audit10.c): Likewise.
(CFLAGS-tst-auditmod10a.c): Likewise.
(CFLAGS-tst-auditmod10b.c): Likewise.
* sysdeps/x86_64/configure.ac: Set config-cflags-avx512,
HAVE_AVX512_SUPPORT and HAVE_AVX512_ASM_SUPPORT.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Add
AVX-512 zmm register support.
(_dl_x86_64_save_sse): Likewise.
(_dl_x86_64_restore_sse): Likewise.
* sysdeps/x86_64/dl-trampoline.h: Updated to support different
size vector registers.
* sysdeps/x86_64/link-defines.sym (YMM_SIZE): New.
(ZMM_SIZE): Likewise.
* sysdeps/x86_64/tst-audit10.c: New file.
* sysdeps/x86_64/tst-auditmod10a.c: Likewise.
* sysdeps/x86_64/tst-auditmod10b.c: Likewise.
Reviewing (for all architectures, with a baseline kernel version of
2.6.32) the kernel support for features for which __ASSUME_* macros
would be affected by a move to 2.6.32 as minimum kernel version showed
up that __ASSUME_PSELECT was wrongly defined for MicroBlaze, despite
the corresponding syscall table entry not being wired up in the
MicroBlaze kernel.
This patch makes the MicroBlaze kernel-features.h undefine
__ASSUME_PSELECT. I'd also encourage wiring it up in the kernel (so
you can then make this #undef conditional, and eventually obsolete
once a recent-enough kernel is required). I suspect it wasn't wired
up because of the mistaken comment in asm/unistd.h "obsolete ->
sys_pselect7" (there is no such syscall as pselect7).
[BZ #16642]
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
(__ASSUME_PSELECT): Undefine.
This patch fixes an issue for powerpc32-fpu static builds, which fail
with an undefined reference to 'bzero'. It adds a bzero ifunc selector
for static builds and changes the '__bzero_ppc' reference to the default
memset symbol (since the static memset build does not provide an ifunc
selector).
Fixes BZ#16689.
Testing on mips64 showed missing underflow exceptions (from exp, for
example) in non-default rounding modes, caused by
libc_feresetround*_ctx wrongly restoring a saved environment without
preserving exceptions, when that's only valid for the _noex variants.
(I don't know why Steve didn't see this in his testing.) This patch
fixes this by using libc_feupdateenv_mips_ctx for the relevant macros
and removing the problem definitions.
The problem definitions aren't suitable for the _noex macros either
because they only discard exceptions in non-default rounding modes,
and while for some uses of *_noex/*_NOEX it doesn't matter whether
exceptions are discarded, dbl-64/e_remainder.c requires
SET_RESTORE_ROUND_NOEX to cause exceptions to be discarded. I think
the accumulated set of macros / functions for optimized exception /
rounding mode handling could do with a careful review by now, and
possible refactoring, and at least one new feature (extracting the
saved rounding mode from an environment / context variable - see
dbl-64/e_sqrt.c for a case where this could be used).
Tested mips64.
* sysdeps/mips/math_private.h [__mips_hard_float]
(libc_feresetround_ctx): Define to libc_feupdateenv_mips_ctx not
libc_feresetround_mips_ctx.
[__mips_hard_float] (libc_feresetroundf_ctx): Likewise.
[__mips_hard_float] (libc_feresetroundl_ctx): Likewise.
[__mips_hard_float] (libc_feresetround_mips_ctx): Remove.
ISO C requires the result of nextafter to be independent of the
rounding mode, even when underflow or overflow occurs. This patch
fixes the bug in various nextafter implementations that, having done
an overflowing computation to force an overflow exception (correct),
they then return the result of that computation rather than an
infinity computed some other way (incorrect, when the overflowing
result of arithmetic with that sign and rounding mode is finite but
the correct result is infinite) - generally by falling through to
existing code to return a value that in fact is correct for this case
(but was computed by an integer increment and so without generating
the exceptions required). Having fixed the bug, the previously
deferred conversion of nextafter testing in libm-test.inc to
ALL_RM_TEST is also included.
Tested x86_64 and x86; also spot-checked results of nextafter tests
for powerpc32 and mips64 to test the ldbl-128ibm and ldbl-128
changes. (The m68k change is untested.)
[BZ #16677]
* math/s_nextafter.c (__nextafter): Do not return value from
overflowing computation.
* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Likewise.
* sysdeps/ieee754/flt-32/s_nextafterf.c (__nextafterf): Likewise.
* sysdeps/ieee754/ldbl-128/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/m68k/m680x0/fpu/s_nextafterl.c (__nextafterl): Likewise.
* math/libm-test.inc (nextafter_test): Use ALL_RM_TEST.
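The "integer increment" mentioned above can be sketched as follows,
assuming IEEE 754 binary64 doubles; next_up_positive is a hypothetical
helper, not the glibc code. Because it performs no floating-point
arithmetic, its result cannot depend on the rounding mode, and for
DBL_MAX the incremented bit pattern is that of +Inf. It also raises no
exceptions, which is why the real implementations perform a separate
overflowing computation.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the next representable double above a
   positive finite X by incrementing its bit pattern as an integer.
   Assumes IEEE 754 binary64 with a 64-bit representation.  */
double
next_up_positive (double x)
{
  uint64_t bits;

  assert (x > 0.0 && x <= DBL_MAX);
  memcpy (&bits, &x, sizeof bits);
  bits++;                       /* 0x7fef...ff + 1 == 0x7ff0...00 (+Inf) */
  memcpy (&x, &bits, sizeof x);
  return x;
}
```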
This patch fixes an issue for powerpc64[le] static builds where __bzero
is defined in multiple places (memset-ppc64.o and bzero.o). It is now
defined only in bzero.o, and memset-ppc64.o defines only __bzero_ppc for
both the dynamic and static libraries.
Fixes BZ#16683.
The optimization is achieved by the following techniques:
> hashing of needle.
> hashing avoids scanning of duplicate entries in needle across the string.
> initializing the hash table with Vector instructions (VSX) by quadword access.
> unrolling when scanning for character in string across hash table.
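The table-driven idea in the list above can be sketched in portable C.
This is only an illustration: the function name and the choice of
strspn-style semantics are assumptions, and the VSX initialization and
loop unrolling from the patch are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative strspn-style scan (name and semantics are assumptions):
   return the length of the initial part of S made up only of bytes
   that occur in ACCEPT.  */
size_t
span_with_table (const char *s, const char *accept)
{
  unsigned char table[256] = { 0 };
  size_t n = 0;

  assert (s != NULL && accept != NULL);

  /* Hash the needle: a duplicate byte in ACCEPT just sets the same
     slot again, so duplicates cost nothing during the scan.  */
  for (const unsigned char *a = (const unsigned char *) accept;
       *a != '\0'; a++)
    table[*a] = 1;

  /* One table lookup per byte of S; table[0] is 0, so the terminating
     NUL always ends the loop.  */
  for (const unsigned char *p = (const unsigned char *) s; table[*p]; p++)
    n++;

  return n;
}
```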
The optimization is achieved by the following techniques:
1. Doubleword aligned memory access and compares using
cmpb instruction.
2. Loop unrolling for byte load/store.
3. CPU pre-fetch to avoid cache miss.
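For readers unfamiliar with cmpb: it compares two doublewords byte by
byte and sets each result byte to 0xff where the bytes match and 0x00
where they differ. A portable emulation, with hypothetical function
names and no claim to match the patch's actual code:

```c
#include <stdint.h>

/* Portable emulation of the cmpb result: each byte of the return value
   is 0xff if the corresponding bytes of A and B are equal, else 0x00.  */
uint64_t
cmpb_emulated (uint64_t a, uint64_t b)
{
  uint64_t result = 0;

  for (int i = 0; i < 8; i++)
    {
      uint64_t mask = (uint64_t) 0xff << (8 * i);
      if ((a & mask) == (b & mask))
        result |= mask;
    }
  return result;
}

/* Nonzero if any of the eight bytes in WORD equals C.  Replicating C
   into every byte lets one comparison test all eight positions.  */
int
word_contains_byte (uint64_t word, unsigned char c)
{
  uint64_t pattern = UINT64_C (0x0101010101010101) * c;
  return cmpb_emulated (word, pattern) != 0;
}
```

Testing a whole doubleword for a terminating NUL or a target character
in one step is what makes the aligned doubleword loads worthwhile.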
This patch fixes the optimized powerpc-fpu modf/modff implementation
when used in a non-default rounding mode, where the sign of zero is not
as expected. It fixes the libm testsuite tests
modf_downward (0) == 0.00000000000000000000e+00
modf_downward (20) == 0.00000000000000000000e+00
modf_downward (21) == 0.00000000000000000000e+00
where the sign returned was negative.
Trapping exceptions in AArch64 are optional. The relevant exception
control bits in FPCR are defined as RES0, hence the absence of
support can be detected by reading back the FPCR and comparing with
the desired value.
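The write-then-read-back check can be modelled without touching real
hardware. The sketch below simulates a control register whose
unimplemented bits behave as RES0 (writes ignored, reads as zero); the
structure, the function names and the bit values used with them are
hypothetical and are not the AArch64 FPCR layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated control register.  SUPPORTED_BITS models which bits the
   hardware implements; all other bits are RES0.  */
struct sim_ctrl_reg
{
  uint32_t value;
  uint32_t supported_bits;
};

void
sim_write (struct sim_ctrl_reg *r, uint32_t v)
{
  assert (r != NULL);
  r->value = v & r->supported_bits;   /* RES0 bits drop the write.  */
}

uint32_t
sim_read (const struct sim_ctrl_reg *r)
{
  assert (r != NULL);
  return r->value;
}

/* Try to set TRAP_BITS; return 1 if they stuck, 0 if the read-back
   value differs from the desired one (trapping not supported).  */
int
try_enable_traps (struct sim_ctrl_reg *r, uint32_t trap_bits)
{
  uint32_t desired = sim_read (r) | trap_bits;

  sim_write (r, desired);
  return sim_read (r) == desired;
}
```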
This patch is a revised and updated version of
<https://sourceware.org/ml/libc-alpha/2014-01/msg00196.html>.
In order to generate overall summaries of the results of all tests in
the glibc testsuite, we need to identify and concatenate the files
with the results of individual tests.
Tomas Dohnalek's patch used $(common-objpfx)*/*.test-result for this.
However, the normal glibc approach is explicit enumeration of the
expected set of files with a given property, rather than all files
matching some pattern like that. Furthermore, we would like to be
able to mark tests as UNRESOLVED if the file with their results is for
some reason missing, and in future we would like to be able to mark
tests as UNSUPPORTED if they are disabled for a particular
configuration (rather than simply having them missing from the list of
tests as at present). Such handling of tests that were not run or did
not record results requires an explicit enumeration of tests.
For the tests following the default makefile rules, $(tests) (and
$(xtests)) provides such an enumeration. Others, however, are added
directly as dependencies of the "tests" and "xtests" makefile
targets. This patch changes the makefiles to put them in variables
tests-special and xtests-special, with appropriate dependencies on the
tests listed there then being added centrally.
Those variables are used in Rules and so need to be set before Rules
is included in a subdirectory makefile, which is often earlier in the
makefile than the dependencies were present before. We previously
discussed the question of where to include Rules; see the question at
<https://sourceware.org/ml/libc-alpha/2012-11/msg00798.html>, and a
discussion in
<https://sourceware.org/ml/libc-alpha/2013-01/msg00337.html> of why
Rules is included early rather than late in subdirectory makefiles.
It was necessary to avoid an indirection through the check-abi target
and get the check-abi-* targets for individual libraries into the
tests-special variable. The intl/ test $(objpfx)tst-gettext.out,
previously built only because of dependencies from other tests, was
also added to tests-special for the same reason.
The entries in tests-special are the full makefile targets, complete
with $(objpfx) and .out. If a future change causes tests to be named
consistently with a .out suffix, this can be changed to include just
the path relative to $(objpfx), without .out.
Tested x86_64, including that the same set of files is generated in
the build directory by a build and testsuite run both before and after
the patch (except for changes to the
elf/tst-null-argv.debug.out.<number> file name), and a build with
run-built-tests=no to verify there aren't any more obvious instances
of the issue Marcus Shawcroft reported with a previous version in
<https://sourceware.org/ml/libc-alpha/2014-01/msg00462.html>.
* Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(tests): Depend on $(tests-special).
* Makerules (check-abi-list): New variable.
(check-abi): Depend on $(check-abi-list).
[$(subdir) = elf] (tests-special): Add
$(objpfx)check-abi-libc.out.
[$(build-shared) = yes && subdir] (tests-special): Add
$(check-abi-list).
[$(build-shared) = yes && subdir] (tests): Do not depend on
check-abi.
* Rules (tests): Depend on $(tests-special).
(xtests): Depend on $(xtests-special).
* catgets/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* conform/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* elf/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* grp/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* iconv/Makefile (xtests): Change dependencies to ....
(xtests-special): ... additions to this variable.
* iconvdata/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* intl/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable. Also add
$(objpfx)tst-gettext.out.
* io/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* libio/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* malloc/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* misc/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* nptl/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* nptl_db/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* posix/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(xtests): Change dependencies to ....
(xtests-special): ... additions to this variable.
* resolv/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(xtests): Change dependencies to ....
(xtests-special): ... additions to this variable.
* stdio-common/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(do-tst-unbputc): Remove target.
(do-tst-printf): Likewise.
* stdlib/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* string/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* sysdeps/x86/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
localedata:
* Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
The __ASSUME_UTIMES macro describes whether the utimes syscall is
present. For linux-generic architectures, it isn't (utimensat is
instead), so the macro should not be defined for them; this patch
removes the spurious definitions for such architectures. (Those
definitions don't actually cause any user-visible bug, because
futimes.c doesn't use __ASSUME_UTIMES if __ASSUME_UTIMENSAT is
defined, and futimesat.c and utimes.c are overridden for
linux-generic, but the definitions are still logically incorrect.)
* sysdeps/unix/sysv/linux/aarch64/kernel-features.h
(__ASSUME_UTIMES): Remove.
* sysdeps/unix/sysv/linux/tile/kernel-features.h
(__ASSUME_UTIMES): Likewise.
As recently discussed
<https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it
doesn't seem particularly useful for libm-test-ulps files to contain
huge amounts of data on ulps for individual tests; just the global
maximum observed ulps for each function, together with the
verification of exceptions, errno and special results such as
infinities and NaNs for each test, suffices to verify that a
function's behavior on the given test inputs is within the expected
accuracy. Removing this data reduces source tree churn caused by
updates to these files when libm tests are added, and reduces the
frequency with which testsuite additions actually need libm-test-ulps
changes at all.
Accordingly, this patch removes that data, so that individual tests
get checked against the global bounds for the given function and only
generate an error if those are exceeded. Tested x86_64 (including
verifying that if an ulps value is artificially reduced, the tests do
indeed fail as they should and "make regen-ulps" generates the
expected changes).
* math/libm-test.inc (struct ulp_data): Don't refer to ulps for
individual tests in comment.
(libm-test-ulps.h): Don't refer to test_ulps in #include comment.
(prev_max_error): New variable.
(prev_real_max_error): Likewise.
(prev_imag_max_error): Likewise.
(compare_ulp_data): Don't refer to test names in comment.
(find_test_ulps): Remove function.
(find_function_ulps): Likewise.
(find_complex_function_ulps): Likewise.
(init_max_error): Take function name as argument. Look up ulps
for that function.
(print_ulps): Remove function.
(print_max_error): Use prev_max_error instead of calling
find_function_ulps.
(print_complex_max_error): Use prev_real_max_error and
prev_imag_max_error instead of calling find_complex_function_ulps.
(check_float_internal): Take max_ulp parameter instead of calling
find_test_ulps. Don't call print_ulps.
(check_float): Update call to check_float_internal.
(check_complex): Update calls to check_float_internal.
(START): Pass argument to init_max_error.
* math/gen-libm-test.pl (%results): Don't include "kind"
information.
(parse_ulps): Don't handle ulps of individual tests.
(print_ulps_file): Likewise.
(output_ulps): Likewise.
* math/README.libm-test: Update.
* manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of
individual tests.
* sysdeps/aarch64/libm-test-ulps: Remove individual test ulps.
* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
* sysdeps/arm/libm-test-ulps: Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Likewise.
* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise.
* sysdeps/microblaze/libm-test-ulps: Likewise.
* sysdeps/mips/mips32/libm-test-ulps: Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
* sysdeps/s390/fpu/libm-test-ulps: Likewise.
* sysdeps/sh/libm-test-ulps: Likewise.
* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
* sysdeps/tile/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.
This patch optimizes strrchr() for ppc64. It uses aligned memory
access along with cmpb instruction and CPU prefetch to avoid
cache misses for speed improvement.
This patch adds an optimized llround/llroundf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
This patch adds an optimized llrint/llrintf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
This patch adds an optimized finite/finitef implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
This patch adds an optimized isinf/isinff implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
This patch adds an optimized isnan/isnanf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
The current ARM soft-float implementation violates the RTABI
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf)
Section 4.1.1.1:
When not otherwise specified by IEEE 754, the result on an invalid
operation should be the quiet NaN bit pattern with only the most
significant bit of the significand set, and all other significand bits
zero.
This patch fixes it by setting _FP_NANFRAC_* to zero.
Ran make check with -mfloat-abi=soft; no regressions.
* sysdeps/arm/soft-fp/sfp-machine.h (_FP_NANFRAC_S, _FP_NANFRAC_D)
(_FP_NANFRAC_Q): Set to zero.
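The bit pattern described in the RTABI text quoted above can be written
out explicitly. A minimal sketch for IEEE 754 interchange formats: the
helper name is hypothetical, and the sign bit is assumed to be zero
because the quoted text does not mention it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a quiet NaN with all exponent bits set,
   only the most significant significand (fraction) bit set, and every
   other fraction bit zero.  The sign bit is left at 0 (an assumption).  */
uint64_t
default_qnan_bits (unsigned int exp_bits, unsigned int frac_bits)
{
  assert (exp_bits > 0 && frac_bits > 0 && exp_bits + frac_bits < 64);

  uint64_t exponent = ((UINT64_C (1) << exp_bits) - 1) << frac_bits;
  uint64_t quiet_bit = UINT64_C (1) << (frac_bits - 1);

  return exponent | quiet_bit;
}
```

For single precision (8 exponent bits, 23 fraction bits) this gives
0x7fc00000, and for double precision (11 and 52) it gives
0x7ff8000000000000.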
In 84ba214c, I removed some redundant sign computations and in the
process, I incorrectly got rid of a temporary variable, thus passing
the absolute value of the input to bsloww1. This caused #16623.
This fix undoes the incorrect change.
This is a patch to the MIPS math_private.h file to define HAVE_RM_CTX and
implement the ctx macros. I also defined a few other macros and inline
functions that I skipped the first time.
This commit fixes a bug where the dynamic loader would crash
when loading audit libraries, via LD_AUDIT, where those libraries
used TLS. The dynamic loader was not considering that the audit
libraries would use TLS and failed to bump the TLS generation
counter leaving TLS usage inconsistent after loading the audit
libraries.
https://sourceware.org/ml/libc-alpha/2014-02/msg00569.html
Now that the ARM port implements pointer encryption for jmpbufs, gdb needs
a SystemTap probe point in longjmp to determine the target PC of
a call to longjmp. This patch implements the probe point in longjmp
and a similar probe point in setjmp.
In order to have all the appropriate registers available to pass to the
probe this reorders the layout of jmpbuf, putting the sp and lr registers
at the start rather than the end, allowing them to be read and
written sequentially.
Tested on armv7, no new failures in the glibc testsuite and confirmed
that this fixes the gdb.base/longjmp.exp failures in the gdb testsuite.
ChangeLog:
2014-02-25 Will Newton <will.newton@linaro.org>
* sysdeps/arm/__longjmp.S: Include stap-probe.h.
(__longjmp): Restore sp and lr before restoring callee
saved registers. Add longjmp and longjmp_target
SystemTap probe point.
* sysdeps/arm/bits/setjmp.h (__jmp_buf): Update comment.
* sysdeps/arm/include/bits/setjmp.h (__JMP_BUF_SP):
Define to zero to match jmpbuf layout.
* sysdeps/arm/setjmp.S: Include stap-probe.h.
(__sigsetjmp): Save sp and lr before saving callee
saved registers. Add setjmp SystemTap probe point.
elf/tst-auxv.c includes misc/sys/auxv.h, which ends up not actually
being included due to the guard overlap, and getauxval becomes an
implicit declaration and implicit pointer conversion which means, at
best, the test isn't actually testing what it thinks it is and, at
worst, it'll crash and burn on platforms where implicit pointer
conversion is a Very Bad Thing.
* sysdeps/powerpc/bits/hwcap.h: Allow _SYSDEPS_SYSDEP_H guard as a
synonym for _SYS_AUXV_H to allow direct inclusion.
* sysdeps/sparc/bits/hwcap.h: Likewise.
* sysdeps/powerpc/sysdep.h: Define _SYSDEPS_SYSDEP_H instead of
_SYS_AUXV_H so we can include sysdep.h and sys/auxv.h together.
* sysdeps/sparc/sysdep.h: Likewise.
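The guard overlap can be reproduced in a single file. In this sketch
the DEMO_ names are stand-ins: the first definition plays the role of
the old sysdep.h defining the guard that sys/auxv.h also uses, and the
guarded region plays the role of sys/auxv.h itself.

```c
/* Stand-in for the old sysdep.h: it defines the same guard macro that
   the other header uses for itself.  */
#define DEMO_SYS_AUXV_H 1

/* Stand-in for sys/auxv.h: the body is skipped because the guard is
   already defined, so the declaration below is never seen.  */
#ifndef DEMO_SYS_AUXV_H
# define DEMO_SYS_AUXV_H 1
# define DEMO_HAVE_GETAUXVAL_DECL 1
#endif

/* Return 1 if the stand-in declaration was seen, 0 if the header body
   was skipped.  */
int
demo_declaration_visible (void)
{
#ifdef DEMO_HAVE_GETAUXVAL_DECL
  return 1;
#else
  return 0;
#endif
}
```

Renaming the first guard, as the patch does with _SYSDEPS_SYSDEP_H,
lets the second header's body be processed normally.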
Similar to the issues for accept4 and recvmmsg, __ASSUME_SENDMMSG is
also confused about whether it relates to function availability or
socketcall operation availability, and the conditions for the
definition are always wrong (sendmmsg appeared in Linux kernel 3.0,
not 2.6.39); this is now bug 16611.
This patch splits the macro into separate macros like those for
accept4 and recvmmsg, defining them for appropriate kernel versions.
Tested x86_64, including that disassembly of the installed shared
libraries is unchanged by this patch.
[BZ #16611]
* sysdeps/unix/sysv/linux/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030000 && __ASSUME_SOCKETCALL]
(__ASSUME_SENDMMSG_SOCKETCALL): Define.
[__LINUX_KERNEL_VERSION >= 0x030000 && (__i386__ || __x86_64__ ||
__powerpc__ || __sh__ || __sparc__)] (__ASSUME_SENDMMSG_SYSCALL):
Likewise.
[__i386__ || __powerpc__ || __sh__ || __sparc__]
(__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
[__ASSUME_SENDMMSG_SOCKETCALL || __ASSUME_SENDMMSG_SYSCALL]
(__ASSUME_SENDMMSG): Define instead of using previous
[__LINUX_KERNEL_VERSION >= 0x020627] condition.
* sysdeps/unix/sysv/linux/aarch64/kernel-features.h
(__ASSUME_SENDMMSG_SYSCALL): Define.
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030200] (__ASSUME_SENDMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/arm/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/ia64/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/internal_sendmmsg.S [__ASSUME_SOCKETCALL
&& !__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL &&
!__ASSUME_SENDMMSG_SYSCALL] (__NR_sendmmsg): Undefine.
[__ASSUME_SENDMMSG]: Change conditionals to
[__ASSUME_SENDMMSG_SOCKETCALL].
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030300] (__ASSUME_SENDMMSG_SYSCALL):
Define.
* sysdeps/unix/sysv/linux/mips/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030100] (__ASSUME_SENDMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/sendmmsg.c [__ASSUME_SOCKETCALL &&
!__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL &&
!__ASSUME_SENDMMSG_SYSCALL] (__NR_sendmmsg): Undefine.
[!__ASSUME_SENDMMSG]: Change conditional to
[!__ASSUME_SENDMMSG_SOCKETCALL].
* sysdeps/unix/sysv/linux/tile/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG_SYSCALL):
Define.
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030100] (__ASSUME_SENDMMSG_SYSCALL):
Define.
Similar to the issues for accept4, __ASSUME_RECVMMSG is also confused
about whether it relates to function availability or socketcall
operation availability; this is now bug 16610.
Nothing actually tests __ASSUME_RECVMMSG for function availability,
but implicit in the definition in kernel-features.h is the idea that
it makes sense when the syscall is available and socketcall is not
being used. As with accept4, there are architectures where the
syscall was added later than the socketcall operation, meaning that
assuming glibc is built with recent enough kernel headers, it does not
attempt to use socketcall for these operations and __ASSUME_RECVMMSG
gets defined for kernels >= 2.6.33 even when the syscall was only
added later.
This patch splits the macro into separate macros like those used for
accept4; having similar macro structure in both cases (and for
sendmmsg once I've dealt with that) seems likely to be less confusing
than having a different structure on the basis of nothing actually
needing to assume the recvmmsg function works. Appropriate
definitions are added for all architectures.
Architecture-specific note: Tile's kernel-features.h says "TILE glibc
support starts with 2.6.36", which is accurate in that 2.6.36 was the
first kernel version with Tile support, and on that basis I've made
that header define __ASSUME_RECVMMSG_SYSCALL unconditionally.
However, Tile's configure.ac has arch_minimum_kernel=2.6.32. Since
arch_minimum_kernel is meant to reflect only kernel.org kernel
versions, I think that should change to 2.6.36. (If using glibc with
kernel versions from before a port went in kernel.org, it's your
responsibility to change arch_minimum_kernel in a local patch, and at
the same time to adjust any __ASSUME_* definitions that may not be
correct for your older kernel; for developing the official glibc it
should only ever be necessary to consider what official kernel.org
releases support.)
Tested x86_64, including that disassembly of the installed shared
libraries is unchanged by this patch.
[BZ #16610]
* sysdeps/unix/sysv/linux/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621 && __ASSUME_SOCKETCALL]
(__ASSUME_RECVMMSG_SOCKETCALL): Define.
[(__LINUX_KERNEL_VERSION >= 0x020621 && (__i386__ || __x86_64__ ||
__sparc__)) || (__LINUX_KERNEL_VERSION >= 0x020625 && (__powerpc__
|| __sh__))] (__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__i386__ || __sparc__]
(__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
[__ASSUME_RECVMMSG_SOCKETCALL || __ASSUME_RECVMMSG_SYSCALL]
(__ASSUME_RECVMMSG): Define instead of using previous
[__LINUX_KERNEL_VERSION >= 0x020621] condition.
* sysdeps/unix/sysv/linux/aarch64/kernel-features.h
(__ASSUME_RECVMMSG_SYSCALL): Define.
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/arm/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/ia64/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/internal_recvmmsg.S [__ASSUME_SOCKETCALL
&& !__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL &&
!__ASSUME_RECVMMSG_SYSCALL] (__NR_recvmmsg): Undefine.
[__ASSUME_RECVMMSG]: Change condition to
[__ASSUME_RECVMMSG_SOCKETCALL].
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
Define.
(__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
* sysdeps/unix/sysv/linux/mips/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/recvmmsg.c [__ASSUME_SOCKETCALL &&
!__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL &&
!__ASSUME_RECVMMSG_SYSCALL] (__NR_recvmmsg): Undefine.
[!__ASSUME_RECVMMSG]: Change condition to
[!__ASSUME_RECVMMSG_SOCKETCALL].
* sysdeps/unix/sysv/linux/tile/kernel-features.h
(__ASSUME_RECVMMSG_SYSCALL): Define.
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020622] (__ASSUME_RECVMMSG_SYSCALL):
Define.
In <https://sourceware.org/ml/libc-alpha/2013-12/msg00008.html>,
Aurelien noted issues with the definition of __ASSUME_ACCEPT4, which I
discussed in more detail in
<https://sourceware.org/ml/libc-alpha/2013-12/msg00014.html>; these
are now bug 16609.
As previously noted, __ASSUME_ACCEPT4 is used in two ways:
* In OS-independent code, to mean "accept4 can be assumed to work
rather than fail with ENOSYS". It doesn't matter whether it's
implemented with socketcall or a separate syscall.
* In Linux-specific code, to mean "the socketcall multiplex syscall
can be assumed to handle the accept4 operation". When used in
Linux-specific code, it *never* refers to anything relating to the
accept4 syscall, only to the socketcall multiplexer.
This patch splits the macro into separate __ASSUME_ACCEPT4_SOCKETCALL,
__ASSUME_ACCEPT4_SYSCALL and __ASSUME_ACCEPT4 to clarify the different
cases involved. A macro __ASSUME_SOCKETCALL is added for convenience
in writing logic relating to all socketcall architectures. In
addition, to address the issue of architectures where socketcall
support for accept4 was added before a separate syscall was added (and
so the separate syscall should not be used unless known to be present
or fallback to socketcall is available), a fourth macro
__ASSUME_ACCEPT4_SYSCALL_WITH_SOCKETCALL is added to indicate that the
syscall became available at the same time as socketcall support. This
is then used in the relevant places in a conditional determining
whether to undefine __NR_accept4 (the simple approach to avoiding the
syscall's presence causing problems; I didn't try to implement runtime
fallback from the syscall to socketcall).
Architecture-specific note: alpha defined __ASSUME_ACCEPT4 for 2.6.33
and later, but actually the syscall was added for alpha in 3.2, so
this patch uses the correct condition for __ASSUME_ACCEPT4_SYSCALL
there.
Tested x86_64, including that disassembly of the installed shared
libraries is unchanged by this patch.
[BZ #16609]
* sysdeps/unix/sysv/linux/kernel-features.h [__i386__ ||
__powerpc__ || __s390__ || __sh__ || __sparc__]
(__ASSUME_SOCKETCALL): Define.
[__LINUX_KERNEL_VERSION && __ASSUME_SOCKETCALL]
(__ASSUME_ACCEPT4_SOCKETCALL): Likewise.
[(__LINUX_KERNEL_VERSION >= 0x02061c && (__x86_64__ || __sparc__))
|| (__LINUX_KERNEL_VERSION >= 0x020625 && (__powerpc__ ||
__sh__))] (__ASSUME_ACCEPT4_SYSCALL): Likewise.
[__sparc__] (__ASSUME_ACCEPT4_SYSCALL_WITH_SOCKETCALL): Likewise.
[__ASSUME_ACCEPT4_SOCKETCALL || __ASSUME_ACCEPT4_SYSCALL]
(__ASSUME_ACCEPT4): Define instead of using previous
[__LINUX_KERNEL_VERSION >= 0x02061c && (__i386__ || __x86_64__ ||
__powerpc__ || __sparc__ || __s390__)] condition.
* sysdeps/unix/sysv/linux/aarch64/kernel-features.h
(__ASSUME_ACCEPT4): Change to __ASSUME_ACCEPT4_SYSCALL.
* sysdeps/unix/sysv/linux/accept4.c [__ASSUME_SOCKETCALL &&
!__ASSUME_ACCEPT4_SYSCALL_WITH_SOCKETCALL &&
!__ASSUME_ACCEPT4_SYSCALL] (__NR_accept4): Undefine.
[!__ASSUME_ACCEPT4]: Change condition to
[!__ASSUME_ACCEPT4_SOCKETCALL].
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
(__ASSUME_ACCEPT4): Change to __ASSUME_ACCEPT4_SYSCALL. Correct
condition to [__LINUX_KERNEL_VERSION >= 0x030200].
* sysdeps/unix/sysv/linux/arm/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020624] (__ASSUME_ACCEPT4): Change to
__ASSUME_ACCEPT4_SYSCALL.
* sysdeps/unix/sysv/linux/i386/accept4.S [__ASSUME_ACCEPT4]:
Change conditions to [__ASSUME_ACCEPT4_SOCKETCALL].
* sysdeps/unix/sysv/linux/ia64/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x030300] (__ASSUME_ACCEPT4): Change to
__ASSUME_ACCEPT4_SYSCALL.
* sysdeps/unix/sysv/linux/internal_accept4.S [__ASSUME_SOCKETCALL
&& !__ASSUME_ACCEPT4_SYSCALL_WITH_SOCKETCALL &&
!__ASSUME_ACCEPT4_SYSCALL] (__NR_accept4): Undefine.
[__ASSUME_ACCEPT4]: Change condition to
[__ASSUME_ACCEPT4_SOCKETCALL].
* sysdeps/unix/sysv/linux/m68k/kernel-features.h
(__ASSUME_SOCKETCALL): Define.
[__LINUX_KERNEL_VERSION >= 0x02061c] (__ASSUME_ACCEPT4): Remove.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
(__ASSUME_SOCKETCALL): Define.
(__ASSUME_ACCEPT4): Remove.
[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_ACCEPT4_SYSCALL):
Define.
* sysdeps/unix/sysv/linux/mips/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x02061f] (__ASSUME_ACCEPT4_SYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/tile/kernel-features.h
(__ASSUME_ACCEPT4): Change to __ASSUME_ACCEPT4_SYSCALL.
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
[__LINUX_KERNEL_VERSION >= 0x020622] (__ASSUME_ACCEPT4_SYSCALL):
Define.
This patch updates the ARM HWCAP data (both bits/hwcap.h and
dl-procinfo.[ch]) to match Linux 3.13.
* sysdeps/unix/sysv/linux/arm/bits/hwcap.h (HWCAP_ARM_VFPD32): New
macro.
(HWCAP_ARM_LPAE): Likewise.
(HWCAP_ARM_EVTSTRM): Likewise.
* sysdeps/unix/sysv/linux/arm/dl-procinfo.c (_dl_arm_cap_flags):
Add vfpd32, lpae and evtstrm.
* sysdeps/unix/sysv/linux/arm/dl-procinfo.h (_DL_HWCAP_COUNT):
Increase to 22.
This patch moves tests of clog10 to auto-libm-test-in. Note that this
means gen-auto-libm-tests will now depend on the recent MPC 1.0.2
release which added a fix for a bug that made gen-auto-libm-tests hang
for clog10. (It still can't conveniently be used for cacos, cacosh,
casin, casinh, catan, catanh, csin and csinh because of extreme
slowness of those functions for special cases in MPC; at least some
slow cases of csin / csinh are fixed in MPC trunk, but not in a release.)
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of clog10.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (clog10_test_data): Use AUTO_TESTS_c_c.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
gen-auto-libm-tests has a bug in the logic for setting a sticky bit
based on the ternary value from MPFR: it is correct for positive
results, but for negative results mpz_setbit acts as if a two's
complement representation is used, whereas the low bit needs setting
based on the sign-magnitude representation GMP actually uses. (This
showed up in converting fma tests to use auto-libm-test-in /
gen-auto-libm-tests.)
This patch fixes the problem by negating the mpz_t value to set its
low bit. There are lots of changes to auto-libm-test-out (mainly 1ulp
fixes to ldbl-128 expected results), but only a few ulps updates are
needed on x86 / x86_64. In one case, a corrected expectation showed
up a spurious underflow exception where the correct result is slightly
outside the underflowing range.
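The difference between the two representations can be sketched with
plain C integers standing in for mpz_t values. This is only an
illustration of the arithmetic, not the gen-auto-libm-tests code, and
both function names are hypothetical.

```c
#include <stdint.h>

/* OR-ing in bit 0 of a two's complement value.  This models how
   setting bit 0 behaves on a negative number: the magnitude of an
   even negative value shrinks by one.  */
int64_t
set_low_bit_twos_complement (int64_t v)
{
  return v | 1;
}

/* Set bit 0 of the magnitude instead: negate, set the bit, negate
   back.  INT64_MIN is not handled in this sketch.  */
int64_t
set_low_bit_of_magnitude (int64_t v)
{
  if (v < 0)
    return -((-v) | 1);
  return v | 1;
}
```

For -4 the first function yields -3, while the second yields -5,
which is the value with the low bit of the magnitude set. For
positive inputs the two agree.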
Tested x86_64 and x86 and ulps updated accordingly.
* math/gen-auto-libm-tests.c (adjust_real): Ensure integers are
non-negative before setting low bit.
* math/auto-libm-test-in: Mark one asin test possibly having
spurious underflow.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
GCC trunk now uses soft-fp for MIPS64 long double, so supporting
integration with hardware exceptions and rounding modes. This patch
updates MIPS math-tests.h accordingly not to disable exception and
rounding mode tests in this case.
Tested mips64 and ulps updated to reflect the newly run tests.
* sysdeps/mips/math-tests.h: Include <features.h>.
[!__mips_soft_float && _MIPS_SIM != _ABIO32 && __GNUC_PREREQ (4, 9)]
(ROUNDING_TESTS_long_double): Do not define.
[!__mips_soft_float && _MIPS_SIM != _ABIO32 && __GNUC_PREREQ (4, 9)]
(EXCEPTION_TESTS_long_double): Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Update.
IEEE 754-2008 defines two ways in which tiny results can be detected,
"before rounding" (based on the infinite-precision result) and "after
rounding" (based on the result when rounded to normal precision as if
the exponent range were unbounded). All binary operations on an
architecture must use the same choice of how tininess is detected.
soft-fp has so far implemented only before-rounding tininess
detection. This patch adds support for after-rounding tininess
detection. A new macro _FP_TININESS_AFTER_ROUNDING is added that
sfp-machine.h must define (soft-fp is meant to be self-contained so
the existing tininess.h files aren't used here, though the information
going in sfp-machine.h has been taken from them). The soft-fp macros
dealing with raising underflow exceptions then handle the cases where
the choice matters specially, rounding a copy of the input to the
appropriate precision to see if a value that's tiny before rounding
isn't tiny after rounding.
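The two definitions can be illustrated outside soft-fp with a small
sketch. It assumes a double holds the "infinite-precision" result,
float is the destination format, and the default round-to-nearest
mode is in effect; the function names are hypothetical and none of
this is the soft-fp implementation.

```c
#include <float.h>
#include <stdbool.h>

/* Tiny before rounding: the exact (here: double) result is nonzero
   and smaller in magnitude than the least normal float.  */
bool
tiny_before_rounding (double exact)
{
  double mag = exact < 0 ? -exact : exact;
  return mag != 0 && mag < FLT_MIN;
}

/* Tiny after rounding: round to float precision as if the exponent
   range were unbounded, then compare.  Scaling by 2^64 is exact and
   moves values near the underflow threshold into float's normal
   range, so the conversion rounds them to 24 bits without
   denormalizing.  */
bool
tiny_after_rounding (double exact)
{
  double mag = exact < 0 ? -exact : exact;
  float rounded = (float) (mag * 0x1p64);
  return mag != 0 && rounded < FLT_MIN * 0x1p64f;
}
```

A value such as FLT_MIN * (1 - 0x1p-30) is tiny before rounding but
rounds up to FLT_MIN at float precision, so it is not tiny after
rounding. FLT_MIN * 0.75 is tiny under both definitions.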
Tested for mips64 using GCC trunk (which now uses soft-fp on MIPS, so
supporting exceptions and rounding modes for long double where not
previously supported - this is the immediate motivation for doing this
patch now) together with (a) a patch to sysdeps/mips/math-tests.h to
enable exceptions / rounding modes tests for long double for GCC 4.9
and later, and (b) corresponding changes applied to libgcc's soft-fp
and sfp-machine.h files. In the libgcc context this is also tested on
x86_64 (also an after-rounding architecture) with testcases for
__float128 that I intend to add to the GCC testsuite when updating
soft-fp there.
(To be clear: this patch does not fix any glibc bugs that were
user-visible in past releases, since after-rounding architectures
didn't use soft-fp in any affected case with support for
floating-point exceptions - so there is no corresponding Bugzilla bug.
Rather, it works together with the GCC changes to use soft-fp on MIPS
to allow previously absent long double functionality to work properly,
and allows soft-fp to be used in glibc on after-rounding architectures
in cases where it couldn't previously be used.)
* soft-fp/op-common.h (_FP_DECL): Mark exponent as possibly
unused.
(_FP_PACK_SEMIRAW): Determine tininess based on rounding shifted
value if _FP_TININESS_AFTER_ROUNDING and unrounded value is in
subnormal range.
(_FP_PACK_CANONICAL): Determine tininess based on rounding to
normal precision if _FP_TININESS_AFTER_ROUNDING and unrounded
value has largest subnormal exponent.
* soft-fp/soft-fp.h [FP_NO_EXCEPTIONS]
(_FP_TININESS_AFTER_ROUNDING): Undefine and redefine to 0.
* sysdeps/aarch64/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): New macro.
* sysdeps/alpha/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/arm/soft-fp/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Likewise.
* sysdeps/mips/mips64/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/mips/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/powerpc/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/sh/soft-fp/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Likewise.
* sysdeps/sparc/sparc32/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/sparc/sparc64/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/tile/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Likewise.
Also fixed the following whitespace nits to satisfy the push:
sysdeps/alpha/alphaev6/memset.S:142: space before tab in indent.
sysdeps/alpha/configure:1: new blank line at EOF.
sysdeps/alpha/fpu/e_sqrt.c:126: space before tab in indent.
sysdeps/alpha/preconfigure:1: new blank line at EOF.
sysdeps/unix/sysv/linux/alpha/syscalls.list:1: new blank line at EOF.
MIPS has its own version of dl-lookup.c to deal with differences
between undefined symbol semantics in the PIC and non-PIC ABIs. This
is often liable to get out of date with respect to the generic file
(for example, the recent __builtin_expect changes didn't cover ports,
and it's not obvious to anyone changing dl-lookup.c that there would
be architecture-specific versions).
This patch adds a macro that dl-machine.h can define that is used in
the appropriate place in dl-lookup.c, so that MIPS no longer needs its
own version of that file.
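The general shape of such a hook is sketched below with hypothetical
names (TOY_MACHINE_SYM_NO_MATCH, struct toy_sym, toy_sym_matches).
The real macro is ELF_MACHINE_SYM_NO_MATCH and the real checks in
do_lookup_x are more involved; this only shows the "define if not
already defined" pattern.

```c
/* Generic file: provide a default that never rejects a symbol,
   unless the machine header has already defined the hook.  */
#ifndef TOY_MACHINE_SYM_NO_MATCH
# define TOY_MACHINE_SYM_NO_MATCH(sym) 0
#endif

/* Toy stand-in for a symbol table entry.  */
struct toy_sym
{
  unsigned long value;
  int defined;
};

/* Return nonzero if SYM is an acceptable match.  The machine hook
   gets a chance to veto symbols the generic checks would accept.  */
int
toy_sym_matches (const struct toy_sym *sym)
{
  if (!sym->defined)
    return 0;
  if (TOY_MACHINE_SYM_NO_MATCH (sym))
    return 0;
  return 1;
}
```

An architecture that needs different semantics defines the macro in
its machine header before the generic file is compiled; every other
architecture gets the default and the extra test is optimized away.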
Tested for mips64 that the only changes to disassembly of installed
shared libraries appear to be ld.so changes attributable to different
line numbers and paths in assertions.
* elf/dl-lookup.c (ELF_MACHINE_SYM_NO_MATCH): Define if not
already defined.
(do_lookup_x): Use ELF_MACHINE_SYM_NO_MATCH.
* sysdeps/mips/dl-lookup.c: Remove.
* sysdeps/mips/dl-machine.h (ELF_MACHINE_SYM_NO_MATCH): New macro.
This patch moves the AArch64 port to the main sysdeps hierarchy. The
move is essentially:
git mv ports/sysdeps/aarch64 sysdeps/aarch64
git mv ports/sysdeps/unix/sysv/linux/aarch64 sysdeps/unix/sysv/linux/aarch64
The README is updated and I've updated ChangeLog.aarch64 along the
lines of the ARM move. The AArch64 build has been tested to confirm
that there were no changes in objdump -dr output of the shared
objects.
I've moved the MIPS port from ports to the main sysdeps hierarchy.
Beyond the README update, the move of the files was simply
git mv ports/sysdeps/mips sysdeps/mips
git mv ports/sysdeps/unix/mips sysdeps/unix/mips
git mv ports/sysdeps/unix/sysv/linux/mips sysdeps/unix/sysv/linux/mips
and in addition to the ChangeLog entries here, I put a note at the top
of ports/ChangeLog.mips similar to those in other files.
Tested that disassembly of installed shared libraries for mips is the
same before and after this patch (except for ld.so where paths in
assertions are involved, as for arm).
* sysdeps/mips: Move directory from ports/sysdeps/mips.
* sysdeps/unix/mips: Move directory from ports/sysdeps/unix/mips.
* sysdeps/unix/sysv/linux/mips: Move directory from
ports/sysdeps/unix/sysv/linux/mips.
* README: Update listing for mips-*-linux-gnu and
mips64-*-linux-gnu.
* sysdeps/mips: Move directory to ../sysdeps/mips.
* sysdeps/unix/mips: Move directory to ../sysdeps/unix/mips.
* sysdeps/unix/sysv/linux/mips: Move directory to
../sysdeps/unix/sysv/linux/mips.
I've moved the TILE-Gx and TILEPro ports to the main sysdeps hierarchy,
along with the linux-generic ports infrastructure. Beyond the README
update, the move was just
git mv ports/sysdeps/tile sysdeps/tile
git mv ports/sysdeps/unix/sysv/linux/tile \
sysdeps/unix/sysv/linux/tile
git mv ports/sysdeps/unix/sysv/linux/generic \
sysdeps/unix/sysv/linux/generic
I updated the relevant ChangeLogs along the lines of the ARM move
in commit c6bfe5c4d7 and tested the 64-bit tilegx build to confirm that
there were no changes in "objdump -dr" output in the shared objects.
This pulls in the latest defines for {g,s}etsockopt.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Import the current list of defines available in the kernel headers.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
I've moved the ARM port from ports to the main sysdeps hierarchy.
Beyond the README update, the move of the files was simply
git mv ports/sysdeps/arm sysdeps/arm
git mv ports/sysdeps/unix/arm sysdeps/unix/arm
git mv ports/sysdeps/unix/sysv/linux/arm sysdeps/unix/sysv/linux/arm
and in addition to the ChangeLog entries here, I put a note at the top
of ports/ChangeLog.arm similar to that at the top of
ChangeLog.powerpc. There is deliberately no NEWS change, as I think
it makes the most sense to put in a general note about all ports
having moved if we can achieve that for 2.20.
Tested that disassembly of installed shared libraries for arm is the
same before and after this patch, except for data (not instructions)
in ld.so (there are assertions in sysdeps/arm/dl-machine.h, and the
path by which that file is found, and so by which it appears in the
assertion message, changes as a result of the move).
* sysdeps/arm: Move directory from ports/sysdeps/arm.
* sysdeps/unix/arm: Move directory from ports/sysdeps/unix/arm.
* sysdeps/unix/sysv/linux/arm: Move directory from
ports/sysdeps/unix/sysv/linux/arm.
* README: Update listing for arm-*-linux-gnueabi.
ports/ChangeLog.arm:
* sysdeps/arm: Move directory to ../sysdeps/arm.
* sysdeps/unix/arm: Move directory to ../sysdeps/unix/arm.
* sysdeps/unix/sysv/linux/arm: Move directory to
../sysdeps/unix/sysv/linux/arm.
This reverts commit 1f33d36a8a.
Conflicts:
elf/dl-misc.c
Also reverts the following commits that were bug fixes to new code
introduced in the above commit:
063b2acbceb627fdd585e81c64bba1
Support for /proc/self/task/$tid/comm was added in Linux 2.6.33.
Since the test tst-setgetname relies on this functionality to
operate, we must skip the test on kernels < 2.6.33. We wrap the
checks with __ASSUME_PROC_PID_TASK_COMM so that when we move
arch_minimum_kernel to 2.6.33 in the future we can remove this code.
This patch creates implicit rules to match the abifiles if
abilist-pattern is defined in the architecture Makefile. This allows
machine specific Makefiles to define different abifiles names
(for instance *-le.abilist for powerpc64le).
When the i386 and x86-64 mathinline.h files were merged into a single
mathinline.h, "gcc -m32" started enabling x87 inline functions on x86-64
even when -mfpmath=sse and SSE2 are enabled. This is a regression on
x86-64. We should check __SSE2_MATH__ instead of __x86_64__ when
disabling x87 inline functions.
The BZ #15605 fix, which added memset/memmove aliases in symbol-hacks.h,
missed the x32 symbol-hacks.h. Fixed by including
<sysdeps/generic/symbol-hacks.h> in the x32 symbol-hacks.h.
The IFUNC selector for gettimeofday runs before _libc_vdso_platform_setup, where
__vdso_gettimeofday is set. The selector then sets __gettimeofday (the internal
version used within GLIBC) to use the system call version instead of the vDSO one.
This patch changes the check for vDSO availability to get the vDSO value directly
instead of relying on __vdso_gettimeofday.
It fixes BZ#16431.
See commit 41b1792698 for testcase.
Note: while this works on s390x, the s390 code hangs when using -e.
But it hangs regardless of this code (the hang seems to occur before
the exit func is even called). I didn't look too closely at it as
it seems to be an issue external to this file, so this code shouldn't
make the situation any worse.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This patch fixes BZ#16430 by setting a different symbol for internal
GLIBC calls that points to the ifunc resolvers. For PPC32, if the symbol
is defined as hidden (which is the case for gettimeofday and time) the
compiler will create local branches (symbol@local) and the linker will not
create PLT calls (required for IFUNC). This leads to the internal symbol
calling the IFUNC resolver instead of the resolved symbol.
For PPC64 this behavior does not occur because a call to a function in
another translation unit might use a different toc pointer thus requiring
a PLT call.
We needlessly enabled thread cancellation before it was necessary. As
the only call that needs to be guarded is waitpid, which is itself a
cancellation point, we can remove the cancellation handling altogether.
The truncl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_truncl.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second double.
Checking the implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).
Just removing the implementation, so that the build selects
sysdeps/ieee754/ldbl-128ibm/s_truncl.c instead, also fixes tgammal
issues regarding wrong result sign.
This patch fixes bug 16408, ldbl-128ibm expm1l returning NaN for some
large arguments.
The basic problem is that the approach of converting the exponent to
the form n * log(2) + y, where -0.5 <= y <= 0.5, then computing 2^n *
expm1(y) + (2^n - 1) falls over when 2^n overflows (starting slightly
before the point where expm1 overflows, when y is negative and n is
the least integer for which 2^n overflows). The ldbl-128 code, and
the x86/x86_64 code, make expm1l fall back to expl for large positive
arguments to avoid this issue. This patch makes the ldbl-128ibm code
do the same. (The problem appears for the particular argument in the
testsuite because the ldbl-128ibm code also uses an overflow threshold
that's for ldbl-128 and is too big for ldbl-128ibm, but the problem
described applies for large non-overflowing cases as well, although
during the freeze is not a suitable time for making the expm1 tests
cover cases close to overflow more thoroughly.)
This leaves some code for large positive arguments in expm1l that is
now dead. To keep the code for ldbl-128 and ldbl-128ibm similar, and
to avoid unnecessary changes during the freeze, the patch doesn't
remove it; instead I propose to file a bug in Bugzilla as a reminder
that this code (for overflow, including errno setting, and for
arguments of +Inf) is no longer needed and should be removed from both
those expm1l implementations.
Tested powerpc32.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __expl
for large positive arguments.
This patch fixes bug 16407, spurious overflows from ldbl-128ibm coshl.
The implementation assumed that a high part (reinterpreted as an
integer) of the absolute value of the argument of 0x408633ce8fb9f87dLL
or more meant overflow, but the actual threshold has high part
0x408633ce8fb9f87eLL (and a negative low part). The patch adjusts the
threshold accordingly.
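The kind of comparison involved can be sketched as follows. For
non-negative IEEE doubles the bit patterns, read as integers, order
the same way as the values, so a threshold test can be done on the
reinterpreted high part. The helper names are hypothetical and the
handling of the low part in the real code is not shown.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a double as a signed 64-bit integer.  */
int64_t
double_bits (double x)
{
  int64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* Inverse of double_bits.  */
double
bits_to_double (int64_t bits)
{
  double x;
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Nonzero if the non-negative high part HI is at or above THRESHOLD
   when both are compared as integers.  */
int
high_part_at_or_above (double hi, int64_t threshold)
{
  return double_bits (hi) >= threshold;
}
```

A high part with bit pattern 0x408633ce8fb9f87dLL is flagged when the
threshold is 0x408633ce8fb9f87dLL but not when it is
0x408633ce8fb9f87eLL, which is the one-ulp difference the fix is
about.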
sinhl probably has the same issue, but I didn't get that far in adding
tests of special cases (such as just below and above overflow) before
the freeze and during the freeze is not a suitable time to add them
(as they'd require ulps to be regenerated again), so I'm not changing
that function for now; when I add more tests of special cases, we'll
discover whether sinhl indeed has this problem.
Tested powerpc32.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Increase overflow threshold.
This patch fixes bug 16400, spurious underflow exceptions for ldbl-128
/ ldbl-128ibm lgammal with small positive arguments, by just using
-__logl (x) as the result in the problem cases (similar to the
previous fix for problems with small negative arguments).
Tested powerpc32, and also tested on mips64 that this does not require
ulps regeneration for the ldbl-128 case.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Return -__logl (x) for small positive arguments without evaluating
a polynomial.
All the other ptrace structures in this file have a __ prefix except this
new one. This in turn causes build problems for most packages that try to
use ptrace such as strace:
gcc -DHAVE_CONFIG_H -I. -I../.. -I../../linux/x86_64 -I../../linux \
-I./linux -Wall -Wwrite-strings -g -O2 -MT process.o -MD -MP \
-MF .deps/process.Tpo -c -o process.o ../../process.c
In file included from ../../process.c:63:0:
/usr/include/linux/ptrace.h:58:8: error: redefinition of 'struct ptrace_peeksiginfo_args'
struct ptrace_peeksiginfo_args {
^
In file included from ../../defs.h:159:0,
from ../../process.c:37:
/usr/include/sys/ptrace.h:191:8: note: originally defined here
struct ptrace_peeksiginfo_args
^
Since this struct was introduced in glibc-2.18, there shouldn't be any
real regressions with adding the __ prefix.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This patch fixes bug 16390, incorrect signs of zero results from
ldbl-128ibm atan2l, soft-float only. The problem is a longstanding
GCC bug with fabsl not being correct for signed zero for soft float,
and the fix is using -fno-builtin-fabsl as a workaround, as already
done for various other source files. Tested powerpc-nofpu.
* sysdeps/powerpc/nofpu/Makefile [$(subdir) = math]
(CFLAGS-e_atan2l.c): Use -fno-builtin-fabsl.
This patch fixes bug 16386, ldbl-128ibm logl inaccuracy (with
consequent inaccuracy for lgammal) for arguments where the high double
is subnormal, which showed up while attempting to regenerate ulps for
powerpc-nofpu for 2.19. The problem here is logic failing to allow
for subnormals when calculating the exponent of the argument. Tested
for powerpc-nofpu.
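As an illustration of the class of problem (not the actual
__ieee754_logl change), here is a sketch of reading the exponent of a
double from its bit pattern. The exponent field of a subnormal is
zero, so the value has to be scaled into the normal range first. The
function name is hypothetical; zero, infinities and NaNs are not
handled.

```c
#include <stdint.h>
#include <string.h>

/* Return e such that 2^e <= |x| < 2^(e+1), for finite nonzero x.  */
int
unbiased_exponent (double x)
{
  uint64_t bits;
  int adjust = 0;

  memcpy (&bits, &x, sizeof bits);
  if (((bits >> 52) & 0x7ff) == 0)
    {
      /* Subnormal: the exponent field carries no information.
         Multiply by 2^54 (exact) to normalize, and compensate.  */
      x *= 0x1p54;
      adjust = -54;
      memcpy (&bits, &x, sizeof bits);
    }
  return (int) ((bits >> 52) & 0x7ff) - 1023 + adjust;
}
```

Without the subnormal branch, every subnormal input would report the
same exponent, -1023, regardless of its actual magnitude.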
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Adjust
numbers with subnormal high part when calculating exponent.
This patch fixes bug 16385, ldbl-128ibm asinhl inaccuracy, which
showed up while attempting to regenerate ulps for powerpc-nofpu for
2.19. The problem here was use of fabs instead of fabsl meaning large
arguments were reduced to the precision of double. Tested for
powerpc-nofpu.
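The effect is the ordinary narrowing that happens when a long double
is passed to a function taking a double. A minimal sketch, using a
cast in place of the fabs call and a hypothetical function name:

```c
#include <float.h>

/* Nonzero if passing X through a double, as a call to fabs would,
   changes its value.  The volatile store forces the narrowing to
   actually take place.  */
int
narrowing_loses_bits (long double x)
{
  volatile double narrowed = (double) x;
  return (long double) narrowed != x;
}
```

On targets where long double is wider than double, 1.0L + LDBL_EPSILON
does not survive the round trip, whereas a value such as 1.5L that is
exactly representable as a double does.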
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use fabsl not
fabs.
This patch fixes bug 16384, ldbl-128ibm acoshl inaccuracy, which
showed up while attempting to regenerate ulps for powerpc-nofpu for
2.19. There were two separate problems, use of __log1p instead of
__log1pl and an insufficiently accurate constant value for log 2
(which this patch replaces by use of M_LN2l), each of which could
cause substantial inaccuracy in affected cases.
Tested for powerpc-nofpu.
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (ln2): Initialize with
M_LN2l.
(__ieee754_acoshl): Use __log1pl not __log1p.
We support older kernels that lack this header, so check for it
before we try to use it.
Reported-by: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This patch fixes bug 16337, ldbl-128 lgammal spurious overflows for
small negative arguments (the arguments in question are already in the
testsuite). The implementation uses the reflection formula to compute
lgamma of negative x from lgamma of -x, effectively resulting in a
calculation -log(x^2) + log(-x); cancellation isn't problematic in
this case (bugs for problematic cancellation in lgamma are 2542, 2543,
2558), but the x^2 calculation can underflow (in which case there is
spurious logic to return an overflowing value - lgamma can only ever
correctly overflow for large positive arguments, though tgamma can
overflow for small arguments of either sign as well as large positive
arguments). The fix is simply to calculate the result directly with
logl when the argument is a small enough negative number.
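The underflow of the squared argument is easy to reproduce. The
sketch below uses double rather than the ldbl-128 long double, so the
magnitudes differ from the real case, and the function name is
hypothetical.

```c
/* Nonzero if X is nonzero but X * X underflows to zero.  The
   volatile store forces the product to be rounded to double.  */
int
square_underflows_to_zero (double x)
{
  volatile double square = x * x;
  return x != 0 && square == 0;
}
```

For x = -1e-200 the square is about 1e-400, far below the smallest
double, so a formula containing log(x^2) would take the logarithm of
zero even though log(-x) itself is perfectly representable.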
Tested mips64.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Calculate results for small negative arguments directly rather
than using reflection formula with special underflow handling.
As discussed in
<https://sourceware.org/ml/libc-alpha/2012-04/msg00840.html> and
<https://sourceware.org/ml/libc-alpha/2012-04/msg00989.html>, it seems
appropriate to flatten sysdeps/unix/bsd/bsd4.4 into sysdeps/unix/bsd.
The bulk of the patch is just moving files. The only other changes
are: update paths in sysdeps/mach/hurd/Implies and
sysdeps/unix/sysv/linux/wait3.c; merge the two syscalls.list files,
with the removal of syscalls that were in
sysdeps/unix/bsd/syscalls.list but overridden in the bsd4.4 directory
by .c files there.
Tested x86_64. The installed shared libraries are identical before
and after the patch except for libc.so where the move of wait3.c
(included by sysdeps/unix/sysv/linux/wait3.c) affects debug info, but
the disassembly is unchanged.
* sysdeps/mach/hurd/Implies: Change unix/bsd/bsd4.4 to unix/bsd.
* sysdeps/unix/bsd/syscalls.list (chflags): Add entry from
sysdeps/unix/bsd/bsd4.4/syscalls.list.
(fchflags): Likewise.
(revoke): Likewise.
(setlogin): Likewise.
(sigaltstack): Likewise.
(wait4): Likewise.
(sigblock): Remove.
(sigsetmask): Likewise.
(wait3): Likewise.
(waitpid): Likewise.
* sysdeps/unix/bsd/bsd4.4/syscalls.list: Remove file.
* sysdeps/unix/sysv/linux/wait3.c: Update directory of included
file.
* sysdeps/unix/bsd/bsd4.4/Makefile: Move to ...
* sysdeps/unix/bsd/Makefile: ... here.
* sysdeps/unix/bsd/bsd4.4/Versions: Move to ...
* sysdeps/unix/bsd/Versions: ... here.
* sysdeps/unix/bsd/bsd4.4/bits/sockaddr.h: Move to ...
* sysdeps/unix/bsd/bits/sockaddr.h: ... here.
* sysdeps/unix/bsd/bsd4.4/cmsg_nxthdr.c: Move to ...
* sysdeps/unix/bsd/cmsg_nxthdr.c: ... here.
* sysdeps/unix/bsd/bsd4.4/sigblock.c: Move to ...
* sysdeps/unix/bsd/sigblock.c: ... here.
* sysdeps/unix/bsd/bsd4.4/sigsetmask.c: Move to ...
* sysdeps/unix/bsd/sigsetmask.c: ... here.
* sysdeps/unix/bsd/bsd4.4/sigvec.c: Move to ...
* sysdeps/unix/bsd/sigvec.c: ... here.
* sysdeps/unix/bsd/bsd4.4/tcdrain.c: Move to ...
* sysdeps/unix/bsd/tcdrain.c: ... here.
* sysdeps/unix/bsd/bsd4.4/tcgetattr.c: Move to ...
* sysdeps/unix/bsd/tcgetattr.c: ... here.
* sysdeps/unix/bsd/bsd4.4/tcsetattr.c: Move to ...
* sysdeps/unix/bsd/tcsetattr.c: ... here.
* sysdeps/unix/bsd/bsd4.4/wait.c: Move to ...
* sysdeps/unix/bsd/wait.c: ... here.
* sysdeps/unix/bsd/bsd4.4/wait3.c: Move to ...
* sysdeps/unix/bsd/wait3.c: ... here.
* sysdeps/unix/bsd/bsd4.4/waitpid.c: Move to ...
* sysdeps/unix/bsd/waitpid.c: ... here.
This patch fixes bug 16356, bad results from x86 / x86_64 expl /
exp10l in directed rounding modes, the most serious of the bugs shown
up by my patch expanding libm test coverage. When I fixed bug 16293,
I thought it was only necessary to set round-to-nearest when using
frndint in expm1 functions, because in other cases the cancellation
error from having the resulting fractional part close to 1 or -1 would
not be significant. However, in expl and exp10l, the way the final
fractional part gets computed (something more complicated than a
simple subtraction, because more precision is needed than you'd get
that way) can result in a value outside the range [-1, 1] when the
argument to frndint was very close to an integer and was rounded the
"wrong" way because of the rounding mode - and the f2xm1 instruction
has undefined results if its argument is outside [-1, 1], so resulting
in the large errors seen. So this patch removes the USE_AS_EXPM1L
conditionals on the round-to-nearest settings, so all of expl, expm1l
and exp10l now get round-to-nearest used for frndint (meaning the
final fractional part can at most be slightly above 0.5 in
magnitude). Associated tests of exp and exp10 are added and testing
of exp10 in directed rounding modes enabled.
Tested x86_64 and x86 and ulps updated accordingly.
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Also set
round-to-nearest for [!USE_AS_EXPM1L].
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise.
* math/auto-libm-test-in: Do not expect cosh tests to fail. Add
more tests of exp and exp10. Expect some exp10 tests to miss
exceptions or fail in directed rounding modes.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (exp10_tonearest_test_data): New array.
(exp10_test_tonearest): New function.
(exp10_towardzero_test_data): New array.
(exp10_test_towardzero): New function.
(exp10_downward_test_data): New array.
(exp10_test_downward): New function.
(exp10_upward_test_data): New array.
(exp10_test_upward): New function.
(main): Call the new functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Various libm functions have inadequate test coverage in libm-test.inc
/ auto-libm-test-in - failing to cover all the usual special cases
(infinities, NaNs, zero, large and small finite values, subnormals) as
well as a reasonable range of ordinary inputs and, where appropriate,
inputs close to the thresholds for underflow and overflow.
This patch improves test coverage for real functions [a-c]* (with the
expectation of adding more coverage for other functions later).
Tested x86_64 and x86 and ulps updated accordingly (and eight glibc
bugs and one C11 DR filed for issues found in the process).
* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
asinh, atan, atan2, atanh, cbrt, cos and cosh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (acosh_test_data): Add more tests.
(atanh_test_data): Likewise.
(ceil_test_data): Likewise.
(copysign_test_data): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of cpow to auto-libm-test-in, adding the
required support to gen-auto-libm-tests.
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of cpow.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (cpow_test_data): Use AUTO_TESTS_cc_c.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpc_cc_c.
(func_calc_desc): Add mpc_cc_c union field.
(test_functions): Add cpow.
(special_fill_2pi): New function.
(special_real_inputs): Add 2pi.
(calc_generic_results): Handle mpc_cc_c.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of ccos, ccosh, cexp, clog, csqrt, ctan and
ctanh to auto-libm-test-in, adding the required support to
gen-auto-libm-tests. Other TEST_c_c functions aren't moved for now
(although the relevant table entries are put in gen-auto-libm-tests
for it to know how to handle them): clog10 because of a known MPC bug
causing it to hang for at least some pure imaginary inputs (fixed in
SVN, but I'd rather not rely on unreleased versions of MPFR or MPC
even if relying on very recent releases); the inverse trig and
hyperbolic functions because of known slowness in special cases; and
csin / csinh because of observed slowness that I need to investigate
and report to the MPC maintainers. Slowness can be bypassed by moving
to incremental generation (only for new / changed tests) rather than
regenerating the whole of auto-libm-test-out every time, but that
needs implementing. (This patch takes the time for running
gen-auto-libm-tests from about one second to seven, on my system,
which I think is reasonable. The slow functions would make it take
several minutes at least, which seems unreasonable.)
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of ccos, ccosh, cexp, clog,
csqrt, ctan and ctanh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (TEST_COND_x86_64): New macro.
(TEST_COND_x86): Likewise.
(ccos_test_data): Use AUTO_TESTS_c_c.
(ccosh_test_data): Likewise.
(cexp_test_data): Likewise.
(clog_test_data): Likewise.
(csqrt_test_data): Likewise.
(ctan_test_data): Likewise.
(ctan_tonearest_test_data): Likewise.
(ctan_towardzero_test_data): Likewise.
(ctan_downward_test_data): Likewise.
(ctan_upward_test_data): Likewise.
(ctanh_test_data): Likewise.
(ctanh_tonearest_test_data): Likewise.
(ctanh_towardzero_test_data): Likewise.
(ctanh_downward_test_data): Likewise.
(ctanh_upward_test_data): Likewise.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpc_c_c.
(func_calc_desc): Add mpc_c_c union field.
(FUNC_mpc_c_c): New macro.
(test_functions): Add cacos, cacosh, casin, casinh, catan, catanh,
ccos, ccosh, cexp, clog, clog10, csin, csinh, csqrt, ctan and
ctanh.
(special_fill_min_subnorm_p120): New function.
(special_real_inputs): Add min_subnorm_p120.
(calc_generic_results): Handle mpc_c_c.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch consolidates the multiple copies of code that looks up sin
and cos of a number from the lookup table and computes the final
value, into static functions. This does not have a noticeable
performance impact since the functions are inlined by gcc.
There is further scope for consolidation in the functions, but the
remaining changes have a more noticeable impact on performance (>5%),
so I have held back on them.
Removed more redundant computations in the slow paths of the sin and
cos functions. The notable change is the passing of the most
significant bits of X to the slow functions to check if X is positive
so that just the absolute value of x can be passed and the repeated
ABS() operation is avoided.
There are multiple points in the code where the absolute value of a
number is computed multiple times or is computed even though the value
can only be positive. This change removes those redundant
computations. Tested on x86_64 to verify that there were no
regressions in the testsuite.
sysdeps/powerpc/powerpc32/libgcc-compat.S makes certain symbols that
glibc once accidentally reexported from libgcc into compat symbols.
Where the exports were purely accidental, this is the right thing to
do. However, for powerpc-nofpu the soft-fp symbols are deliberately
exported from libc, given public versions in
sysdeps/powerpc/nofpu/Versions and used by libm in preference to the
libgcc versions that do not support the software exceptions and
rounding modes. The libc versions should also be usable by user
programs, though normally libgcc gets linked in first (meaning,
effectively, that the <fenv.h> functions are broken as regards their
expected effects on user arithmetic).
A longstanding todo item is to remove the functions in question from
libgcc (when built with recent enough glibc) - that is, remove them
from static libgcc and make them compat symbols in shared libgcc - so
that this works properly (this is one of the items mentioned at
<http://gcc.gnu.org/wiki/Software_floating_point> - parts of that page
are obviously out of date, but this item still applies). Doing this
requires first that the functions are actually available from libc for
new links, not just as compat symbols.
This patch stops the symbols in question being compat symbols for
powerpc-nofpu. The nofpu Versions entries for them are removed (the
symbols never were exported at GLIBC_2.3.2, only GLIBC_2.0, because
the compat symbols took precedence).
Tested powerpc-nofpu. The symbols are no longer compat symbols and
libm.so now properly gets undefined references to them (resolved to
libc.so) instead of the libgcc copies getting linked into libm as
before.
* sysdeps/powerpc/powerpc32/libgcc-compat.S
[_SOFT_FLOAT || __NO_FPRS__] (__fixdfdi_v_glibc20): Do not define
as a macro and a compat symbol.
[_SOFT_FLOAT || __NO_FPRS__] (__fixsfdi_v_glibc20): Likewise.
[_SOFT_FLOAT || __NO_FPRS__] (__fixunsdfdi_v_glibc20): Likewise.
[_SOFT_FLOAT || __NO_FPRS__] (__fixunssfdi_v_glibc20): Likewise.
[_SOFT_FLOAT || __NO_FPRS__] (__floatdidf_v_glibc20): Likewise.
[_SOFT_FLOAT || __NO_FPRS__] (__floatdisf_v_glibc20): Likewise.
[HAVE_DOT_HIDDEN && (_SOFT_FLOAT || __NO_FPRS__)] (__fixdfdi): Do
not use .hidden.
[HAVE_DOT_HIDDEN && (_SOFT_FLOAT || __NO_FPRS__)] (__fixsfdi):
Likewise.
[HAVE_DOT_HIDDEN && (_SOFT_FLOAT || __NO_FPRS__)] (__fixunsdfdi):
Likewise.
[HAVE_DOT_HIDDEN && (_SOFT_FLOAT || __NO_FPRS__)] (__fixunssfdi):
Likewise.
[HAVE_DOT_HIDDEN && (_SOFT_FLOAT || __NO_FPRS__)] (__floatdidf):
Likewise.
[HAVE_DOT_HIDDEN && (_SOFT_FLOAT || __NO_FPRS__)] (__floatdisf):
Likewise.
* sysdeps/powerpc/nofpu/Versions (libc): Remove __fixdfdi,
__fixsfdi, __fixunsdfdi, __fixunssfdi, __floatdidf and __floatdisf
from GLIBC_2.3.2.
This patch moves tests of sincos to auto-libm-test-in, adding the
required support to gen-auto-libm-tests.
Tested x86_64 and x86 and ulps updated accordingly.
(auto-libm-test-out diffs omitted below.)
* math/auto-libm-test-in: Add tests of sincos.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (sincos_test_data): Use AUTO_TESTS_fFF_11.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpfr_f_11.
(func_calc_desc): Add mpfr_f_11 union field.
(test_functions): Add sincos.
(calc_generic_results): Handle mpfr_f_11.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
math/gen-libm-test.pl has code to beautify names of various constants,
transforming the source form in libm-test.inc into the version
appearing in test names in libm-test-ulps files.
This has become decreasingly relevant over time for the M_* constants,
first as I changed the test names so only the arguments and not the
expected results appeared in them, then as tests have moved to
auto-libm-test-* so that automatically generated hex float constants
get used instead of M_* in test inputs.
This patch removes the beautification for all M_* constants. Tested
x86_64 and x86 and ulps updated accordingly. Even the one case where
this affected the name in the ulps files will disappear once complex
function tests are moved to auto-libm-test-*.
* math/gen-libm-test.pl (%beautify): Remove M_* constants.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Bug 16293 is inaccuracy of x86/x86_64 versions of expm1, near 0 in
directed rounding modes, that arises from frndint rounding the
exponent to 1 or -1 instead of 0, resulting in large cancellation
error. This inaccuracy in turn affects other functions such as sinh
that use expm1. This patch fixes the problem by setting
round-to-nearest mode temporarily around the affected calls to
frndint. I don't think this is needed for other uses of frndint, such
as in exp itself, as only for expm1 is the cancellation error
significant.
Tested x86_64 and x86 and ulps updated accordingly.
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Set
round-to-nearest mode when using frndint.
* sysdeps/i386/fpu/s_expm1.S (__expm1): Likewise.
* sysdeps/i386/fpu/s_expm1f.S (__expm1f): Likewise.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of expm1. Do not expect
sinh test to fail.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (TEST_COND_x86_64): Remove macro.
(TEST_COND_x86): Likewise.
(expm1_tonearest_test_data): New array.
(expm1_test_tonearest): New function.
(expm1_towardzero_test_data): New array.
(expm1_test_towardzero): New function.
(expm1_downward_test_data): New array.
(expm1_test_downward): New function.
(expm1_upward_test_data): New array.
(expm1_test_upward): New function.
(main): Run the new test functions.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch moves tests of jn and yn to auto-libm-test-in, adding the
required support for gen-auto-libm-tests (and adding a missing
assertion there and fixing logic that was broken for functions with
integer arguments).
Tested x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add tests of jn and yn.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (jn_test_data): Use AUTO_TESTS_if_f.
(yn_test_data): Likewise.
* math/gen-auto-libm-tests.c (func_calc_method): Add value
mpfr_if_f.
(func_calc_desc): Add mpfr_if_f union field.
(FUNC_mpfr_if_f): New macro.
(test_functions): Add jn and yn.
(calc_generic_results): Assert type of second input for
mpfr_ff_f. Handle mpfr_if_f.
(output_for_one_input_case): Disable all checking for arguments
fitting floating-point types in case of an integer argument.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
My recent changes that added libm_hidden_proto / libm_hidden_def for
fegetround had the side effect of removing the need for a
localplt.data entry for fegetround for powerpc-nofpu. This patch
removes that entry. Tested powerpc-nofpu.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/nptl/localplt.data:
Don't expect fegetround reference in libm.so.
This patch fixes bug 16338, ldbl-128 logl not handling subnormals
(with consequent inaccuracy for lgammal as well). The fix is simply
to use __frexpl when determining the exponent, as done already in
log2l and log10l. Given the lack of testing of small arguments to any
of the log* functions, appropriate tests are added for all of them.
Tested x86_64 and x86 and ulps updated accordingly, and spot tests
also run for mips64 to confirm the ldbl-128 fix.
Note that while this fixes lgammal inaccuracy for small positive
arguments, I suspect that there will still be problems with spurious
underflows in that case.
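To illustrate why reading the exponent field directly fails for
subnormals, here is a small sketch using double rather than the ldbl-128
format the fix is about. Both function names are hypothetical and the
code is not taken from e_logl.c.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Exponent E such that X = M * 2^E with M in [0.5, 1), read straight
   from the biased exponent field.  Only correct for normal numbers.  */
static int
exponent_from_bits (double x)
{
  uint64_t bits;
  assert (x > 0.0);
  memcpy (&bits, &x, sizeof bits);
  return (int) ((bits >> 52) & 0x7ff) - 1022;
}

/* The same exponent obtained through frexp, which also normalizes
   subnormal inputs.  */
static int
exponent_from_frexp (double x)
{
  int e;
  (void) frexp (x, &e);
  return e;
}
```

For 1.0 both functions return 1. For the smallest subnormal, 0x1p-1074,
the bit-field version returns -1022 while frexp returns -1073, so only
the frexp result describes the value correctly.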
* sysdeps/ieee754/ldbl-128/e_logl.c (__ieee754_logl): Use __frexpl
to determine exponent and adjust argument to have exponent of -1.
* math/auto-libm-test-in: Add more tests of log, log10, log1p and
log2.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
- Remove redundant mynumber union definitions
- Clean up a clumsy ternary operator
- Rename TAYLOR_SINCOS to TAYLOR_SIN since we're only expanding the
sin Taylor series in it.
The SSE4.2 version of strstr used the pcmpistr instruction, which is
quite inefficient. A faster way is to look for pairs of characters: this
needs only SSE2, is faster than pcmpistr, and for real strings the pairs
we look for are relatively rare.
To keep linear time complexity we use a buy-or-rent technique, which
switches to the two-way algorithm when superlinear behaviour is detected.
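The pair filter can be sketched in scalar C as follows. This is only an
illustration of the idea: it has no SSE2 vectorization and no fallback
to the two-way algorithm, and the name pair_strstr is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find NEEDLE in HAY by first looking for its leading pair of
   characters and only then comparing the rest.  */
static const char *
pair_strstr (const char *hay, const char *needle)
{
  assert (hay != NULL && needle != NULL);

  size_t n = strlen (needle);
  if (n == 0)
    return hay;
  if (n == 1)
    return strchr (hay, needle[0]);

  char a = needle[0];
  char b = needle[1];
  for (const char *p = hay; p[0] != '\0' && p[1] != '\0'; p++)
    {
      /* Cheap filter: most positions fail the pair test, so the full
         comparison runs rarely on typical text.  */
      if (p[0] == a && p[1] == b
          && strncmp (p + 2, needle + 2, n - 2) == 0)
        return p;
    }
  return NULL;
}
```

The worst case of this sketch is still quadratic, which is the behaviour
the switch to the two-way algorithm is meant to bound.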
For PPC64, all the wrappers at sysdeps are superfluous: they are
basically the same implementation from math/w_sqrt.c with the
'#ifdef _IEEE_LIBM'. The power4 version just forces the use of the
'fsqrt' instruction with inline assembly, which is already handled by
the __ieee754_sqrt implementation in math_private.h.
This patch adds optimized __mpn_addmul, __mpn_addsub, __mpn_lshift, and
__mpn_mul_1 implementations for PowerPC64. They are originally from GMP
with adjustments for GLIBC.
This patch adds static probes for setjmp/longjmp in the way gdb expects,
fixing the gdb.base/longjmp.exp gdb testcases.
It changes the symbol_name and uses macros to avoid changing the probe
names, which would end up requiring more logic in GDB (with the expected
names GDB works seamlessly).
To avoid having an ELFv2 binary accidentally picking up an old ABI ld.so,
this patch bumps the soname to ld64.so.2.
In theory (or for testing purposes) this will also allow co-installing
ld.so versions for both ABIs on the same system. Note that the kernel
will already be able to load executables of both ABIs. However, there
is currently no plan to use that theoretical possibility in any
supported distribution environment ...
Note that in order to check which ABI to use, we need to invoke the
compiler to check the _CALL_ELF macro; this is done in a new configure
check in sysdeps/unix/sysv/linux/powerpc/powerpc64/configure.ac,
replacing the hard-coded value of default-abi in the Makefile.
The ELFv2 ABI changes the calling convention by passing and returning
structures in registers in more cases than the old ABI:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01145.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01147.html
For the most part, this does not affect glibc, since glibc assembler
files do not use structure parameters / return values. However, one
place is affected: the LD_AUDIT interface provides a structure to
the audit routine that contains all registers holding function
argument and return values for the intercepted PLT call.
Since the new ABI now sometimes uses registers to return values
that were never used for this purpose in the old ABI, this structure
has to be extended. To force audit routines to be modified for the
new ABI if necessary, the patch defines v2 variants of the la_ppc64
types and routines.
In addition, the patch contains two unrelated changes to the
PLT trampoline routines: it fixes a bug where FPR return values
were stored in the wrong place, and it removes the unnecessary
save/restore of CR.
This updates glibc for the changes in the ELFv2 ABI relating to the
stack frame layout. These are described in more detail here:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01149.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01146.html
Specifically, the "compiler and linker doublewords" were removed,
which has the effect that the save slot for the TOC register is
now at offset 24 rather than 40 to the stack pointer.
In addition, a function may now no longer necessarily assume that
its caller has set up a 64-byte register save area for its use.
To address the first change, the patch goes through all assembler
files and replaces immediate offsets in instructions accessing the
ABI-defined stack slots by symbolic offsets. Those already were
defined in ucontext_i.sym and used in some of the context routines,
but that doesn't really seem like the right place for those defines.
The patch instead defines those symbolic offsets in sysdep.h,
in two variants for the old and new ABI, and uses them systematically
in all assembler files, not just the context routines.
The second change only affected a few assembler files that used
the save area to temporarily store some registers. In those
cases where this happens within a leaf function, this patch
changes the code to store those registers to the "red zone"
below the stack pointer. Otherwise, the functions already allocate
a stack frame, and the patch changes them to add extra space in
these frames as temporary space for the ELFv2 ABI.
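The change of the TOC save slot from offset 40 to offset 24 follows
directly from dropping the two doublewords. The sketch below models the
fixed part of the frame as C structs to make that visible. The struct
and field names and the function toc_save_offset are hypothetical, and
the slot order is my reading of the two ABIs rather than something taken
from this patch; the real definitions are assembler macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of the stack frame in the old ABI, one doubleword per
   slot.  */
struct elfv1_frame_header
{
  uint64_t back_chain;
  uint64_t cr_save;
  uint64_t lr_save;
  uint64_t compiler_dw;         /* removed in ELFv2 */
  uint64_t linker_dw;           /* removed in ELFv2 */
  uint64_t toc_save;
};

/* The same header without the compiler and linker doublewords.  */
struct elfv2_frame_header
{
  uint64_t back_chain;
  uint64_t cr_save;
  uint64_t lr_save;
  uint64_t toc_save;
};

/* Offset of the TOC save slot for ABI version 1 or 2, chosen here at
   run time where real code would test _CALL_ELF at compile time.  */
static size_t
toc_save_offset (int abi_version)
{
  assert (abi_version == 1 || abi_version == 2);
  return abi_version == 2
         ? offsetof (struct elfv2_frame_header, toc_save)
         : offsetof (struct elfv1_frame_header, toc_save);
}
```

toc_save_offset (1) yields 40 and toc_save_offset (2) yields 24, which
is why hard-coded immediates in the assembler files had to be replaced
by symbolic offsets.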
This is a follow-on to the previous patch to support the ELFv2 ABI in the
dynamic loader, split off into its own patch since it is just an optional
optimization.
In the ELFv2 ABI, most functions define both a global and a local entry
point; the local entry requires r2 to be already set up by the caller
to point to the callee's TOC; while the global entry does not require
the caller to know about the callee's TOC, but it needs to set up r12
to the callee's entry point address.
Now, when setting up a PLT slot, the dynamic linker will usually need
to enter the target function's global entry point. However, if the
linker can prove that the target function is in the same DSO as the
PLT slot itself, and the whole DSO only uses a single TOC (which the
linker will let ld.so know via a DT_PPC64_OPT entry), then it is
possible to actually enter the local entry point address into the
PLT slot, for a slight improvement in performance.
Note that this uncovered a problem on the first call via _dl_runtime_resolve,
because that routine neglected to restore the caller's TOC before calling
the target function for the first time, since it assumed that function
would always reload its own TOC anyway ...
This patch adds support for the ELFv2 ABI feature to remove function
descriptors. See this GCC patch for in-depth discussion:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01141.html
This mostly involves two types of changes: updating assembler source
files to the new logic, and updating the dynamic loader.
After the refactoring in the previous patch, most of the assembler source
changes can be handled simply by providing ELFv2 versions of the
macros in sysdep.h. One somewhat non-obvious change is in __GI__setjmp:
this used to "fall through" to the immediately following __setjmp ENTRY
point. This is no longer safe in the ELFv2 ABI since ENTRY defines both
a global and a local entry point, and you cannot simply fall through
to a global entry point as it requires r12 to be set up.
Also, makecontext needs to be updated to set up registers according to
the new ABI for calling into the context's start routine.
The dynamic linker changes mostly consist of removing special code
to handle function descriptors. We also need to support the new PLT
and glink format used by the ELFv2 linker, see:
https://sourceware.org/ml/binutils/2013-10/msg00376.html
In addition, the dynamic linker now verifies that the dynamic libraries
it loads match its own ABI.
The hack in VDSO_IFUNC_RET to "synthesize" a function descriptor
for vDSO routines is also no longer necessary for ELFv2.
This is the first patch to support the new ELFv2 ABI in glibc.
As preparation, this patch simply refactors some of the powerpc64 assembler
code to move all code related to creating function descriptors (.opd section)
or using function descriptors (function pointer call) into a central place
in sysdep.h.
Note that most locations creating .opd entries were already using macros
in sysdep.h; this patch simply extends this to the remaining places.
No relevant change in generated code expected.
This patch updates glibc in accordance with the binutils patch checked in here:
https://sourceware.org/ml/binutils/2013-10/msg00372.html
This changes the various R_PPC64_..._HI and _HA relocations to report
32-bit overflows. The motivation is that existing uses of @h / @ha
are to build up 32-bit offsets (for the "medium model" TOC access
that GCC now defaults to), and we'd really like to see failures at
link / load time rather than silent truncations.
For those rare cases where a modifier is needed to build up a 64-bit
constant, new relocations _HIGH / _HIGHA are supported.
The patch also fixes a bug in overflow checking for the R_PPC64_ADDR30
and R_PPC64_ADDR32 relocations.
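As a worked example of how @ha and @l build up a 32-bit offset, and of
what a 32-bit overflow check on the high part can look like, consider
the following sketch. All function names are hypothetical, and the exact
bound in ha_overflows is an assumption about what "report 32-bit
overflows" means (the adjusted value must fit in a signed 32-bit
quantity); it is not copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* @ha: high 16 bits, adjusted so that adding the sign-extended low
   half gives back the original value.  */
static uint16_t
part_ha (int64_t v)
{
  return (uint16_t) (((uint64_t) v + 0x8000) >> 16);
}

/* @l: low 16 bits.  */
static uint16_t
part_l (int64_t v)
{
  return (uint16_t) v;
}

/* What an addis/addi pair computes from the two halves.  */
static int64_t
rebuild (uint16_t ha, uint16_t l)
{
  return (int64_t) (int16_t) ha * 65536 + (int16_t) l;
}

/* True if V cannot be represented by the @ha/@l pair.  */
static bool
ha_overflows (int64_t v)
{
  return (uint64_t) v + 0x80008000ULL >= 0x100000000ULL;
}

static int64_t
split_and_rebuild (int64_t v)
{
  assert (!ha_overflows (v));
  return rebuild (part_ha (v), part_l (v));
}
```

For 0x12348765 the halves are 0x1235 and 0x8765, and rebuilding gives
the original value back. A value such as 0x7fff8000 would be silently
truncated without the check, which is the failure mode the new overflow
reporting is meant to catch at link or load time.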
The context established by "makecontext" has a link register pointing
back to an error path within the makecontext routine. This is currently
covered by the CFI FDE for makecontext itself, which is simply wrong
for the stack frame *inside* the context. When trying to unwind (e.g.
doing a backtrace) in a routine inside a context created by makecontext,
this can lead to uninitialized stack slots being accessed, causing the
unwinder to crash in the worst case.
Similarly, during parts of the "setcontext" routine, when the stack
pointer has already been switched to point to the new context, the
address range is still covered by the CFI FDE for setcontext. When
trying to unwind in that situation (e.g. backtrace from an async
signal handler for profiling), it is again possible that the unwinder
crashes.
These are all problems in existing code, but the changes in stack
frame layout appear to make the "worst case" much more likely in
the ELFv2 ABI context. This causes regressions e.g. in the libgo
testsuite on ELFv2.
This patch fixes this by ending the makecontext/setcontext FDEs
before those problematic parts of the assembler, similar to what
is already done on other platforms. This fixes the libgo
regression on ELFv2.
Only gaih_inet() and gaih_inet_serv() use a special bit flag denoted
by the GAIH_OKIFUNSPEC macro. Only the return value of
gaih_inet_serv() is actively checked for the bit flag which is
redundant because it just copies the nonzero property of the value
otherwise returned. The return value of gaih_inet() is only checked
for being zero and then the bit flag is filtered out. As the bit flag
is set only for otherwise nonzero return values, it doesn't affect the
zero comparison. GAIH_EAI is just an alias for ~GAIH_OKIFUNSPEC.
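The redundancy argument can be checked mechanically. In the sketch
below, OKIFUNSPEC_FLAG and CODE_MASK stand in for GAIH_OKIFUNSPEC and
GAIH_EAI; the value 0x0100 and the helper functions are assumptions made
for illustration, not code from getaddrinfo.c.

```c
#include <assert.h>

#define OKIFUNSPEC_FLAG 0x0100          /* stand-in for GAIH_OKIFUNSPEC */
#define CODE_MASK (~OKIFUNSPEC_FLAG)    /* stand-in for GAIH_EAI */

/* Tag a nonzero status code with the flag.  Zero never carries the
   flag, matching the description above.  */
static int
tag_status (int code)
{
  assert (code >= 0 && code < OKIFUNSPEC_FLAG);
  return code == 0 ? 0 : (code | OKIFUNSPEC_FLAG);
}

/* Filter the flag out again.  */
static int
strip_flag (int tagged)
{
  return tagged & CODE_MASK;
}

static int
flag_is_set (int tagged)
{
  return (tagged & OKIFUNSPEC_FLAG) != 0;
}
```

For every code the flag is set exactly when the value is nonzero, so
testing the flag and testing for nonzero are the same test, and
stripping the flag always returns the original code.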
The event code is PTRACE_EVENT_SECCOMP, not PTRAVE_EVENT_SECCOMP.
This patch fixes the V->C typo. There are no ABI issues since the
number remains the same for the code. Code using the old wrong
name will need to be updated.
This patch improves the performance of some math functions by adding
libc_fexxx variants of the inline functions that handle both FPU rounding
mode and exception set/restore, and by using them in the libc_fexxx_ctx
functions. It is based on the fexxx family of functions already coded for
PPC with FPU.
Here is a summary of the performance improvements due to this patch
(measured on a POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy