This patch provides optimized versions of strlen and wcslen with the z13 vector
instructions.
The helper macro IFUNC_VX_IMPL is introduced and is used to register all
__<func>_c() and __<func>_vx() functions within __libc_ifunc_impl_list()
to the ifunc test framework.
ChangeLog:
* sysdeps/s390/multiarch/Makefile: New File.
* sysdeps/s390/multiarch/strlen-c.c: Likewise.
* sysdeps/s390/multiarch/strlen-vx.S: Likewise.
* sysdeps/s390/multiarch/strlen.c: Likewise.
* sysdeps/s390/multiarch/wcslen-c.c: Likewise.
* sysdeps/s390/multiarch/wcslen-vx.S: Likewise.
* sysdeps/s390/multiarch/wcslen.c: Likewise.
* string/strlen.c (STRLEN): Define and use macro.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(IFUNC_VX_IMPL): New macro function.
(__libc_ifunc_impl_list): Add ifunc test for strlen, wcslen.
* benchtests/Makefile (wcsmbs-bench): New variable.
(string-bench-all): Added wcsmbs-bench.
* benchtests/bench-wcslen.c: New File.
This patch introduces a s390 specific ifunc resolver macro for 32/64bit,
which chooses <func>_vx with vector instructions if HWCAP_S390_VX flag
in hwcaps is set or <func>_c if not.
ChangeLog:
* sysdeps/s390/multiarch/ifunc-resolve.h (s390_vx_libc_ifunc,
s390_vx_libc_ifunc2): New macro function.
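The selection rule described above can be sketched in plain C. This is an
illustration only, not the actual s390_vx_libc_ifunc macro: the names
select_strlen, strlen_c and strlen_vx are hypothetical, the hwcap word is
passed in explicitly instead of being read from the auxiliary vector, and the
bit value used for the vector facility is an assumption of this sketch.

```c
#include <stddef.h>

/* Assumed bit value for the vector facility; the real definition lives in
   bits/hwcap.h as HWCAP_S390_VX.  */
#define SKETCH_HWCAP_S390_VX 2048UL

typedef size_t (*strlen_fn) (const char *);

/* Hypothetical stand-in for the common code variant.  */
static size_t
strlen_c (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

/* Hypothetical stand-in for the vector variant.  A real one would be
   written with z13 vector instructions; here it only forwards.  */
static size_t
strlen_vx (const char *s)
{
  return strlen_c (s);
}

/* Pick the vector variant only if the hwcap word advertises the facility.  */
static strlen_fn
select_strlen (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_S390_VX) != 0 ? strlen_vx : strlen_c;
}
```

A caller would invoke select_strlen once and reuse the returned pointer,
which is roughly what the dynamic linker does when it resolves an ifunc symbol.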
The S390 specific test checks if the assembler has support for the new z13
vector instructions by compiling a vector instruction. The .machine and
.machinemode directives are needed to compile the vector instruction without
-march=z13 option on 31/64 bit.
On success the macro HAVE_S390_VX_ASM_SUPPORT is defined. This macro is used
to determine if the optimized functions can be built without compile errors.
If the assembler in use lacks vector support, a warning is printed while
configuring and only the common code functions are built.
The z13 instruction support was introduced in
"[Committed] S/390: Add support for IBM z13."
(https://sourceware.org/ml/binutils/2015-01/msg00197.html)
ChangeLog:
* config.h.in (HAVE_S390_VX_ASM_SUPPORT): New macro undefine.
* sysdeps/s390/configure.ac: Add test for S390 vector instruction
assembler support.
* sysdeps/s390/configure: Regenerated.
The new IBM z13 is added to the platform string array.
The macro _DL_PLATFORMS_COUNT is incremented to 8,
because it was not incremented by commit
"S/390: Sync AUXV capabilities and archs with kernel".
ChangeLog:
* sysdeps/s390/dl-procinfo.c (_dl_s390_cap_flags): Add z13.
* sysdeps/s390/dl-procinfo.h (_DL_PLATFORMS_COUNT): Increased.
The HWCAP_S390_VX flag in hwcap field of auxiliary vector indicates
if the vector facility is available and the kernel is aware of it.
This can be tested with LD_SHOW_AUXV=1 <prog>.
Currently it does not show te, because it was not incremented
by commit "S/390: Add hwcap value for transactional execution.".
Thus _DL_HWCAP_COUNT is incremented by two.
ChangeLog:
* sysdeps/s390/dl-procinfo.c (_dl_s390_platforms): Add vector flag.
* sysdeps/s390/dl-procinfo.h: Add vector capability.
* sysdeps/unix/sysv/linux/s390/bits/hwcap.h (HWCAP_S390_VX): Define.
On s390 all ifunc resolvers were implemented in multiarch/ifunc-resolve.c.
The resulting single object file has undefined references to all ifunc functions.
This patch introduces one multiarch/<func>.c file for each of memcpy, memcmp
and memset with the function specific ifunc resolver. The different function
implementations are now implemented in multiarch/<func>-s390x.S
(moved from multiarch/<func>.S).
The new multiarch/ifunc-resolve.h file contains the ifunc-resolver macro
and other helper-macros. They are merged and are now used in common for
32/64bit. Therefore the __<func>_g5/__<func>_z900 functions were renamed to
__<func>_default.
This patch also enables testing the ifunc implementations by implementing
the function __libc_ifunc_impl_list. It uses the helper-macros of ifunc-resolve.h.
ChangeLog:
* sysdeps/s390/s390-32/multiarch/Makefile (sysdep_routines):
Remove ifunc-resolve, add memset-s390, memcpy-s390, memcmp-s390.
* sysdeps/s390/s390-32/multiarch/ifunc-resolve.c: Delete File.
* sysdeps/s390/s390-32/multiarch/memcmp.S: Move to ...
* sysdeps/s390/s390-32/multiarch/memcmp-s390.S: ... here.
(memcmp, bcmp): Use __memcmp_default as alias source.
* sysdeps/s390/s390-32/multiarch/memcmp.c: New File.
* sysdeps/s390/s390-32/memcmp.S (__memcmp_g5):
Rename to __memcmp_default.
* sysdeps/s390/s390-32/multiarch/memcpy.S: Move to ...
* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: ... here.
(memcpy): Use __memcpy_default as alias source.
* sysdeps/s390/s390-32/multiarch/memcpy.c: New File.
* sysdeps/s390/s390-32/memcpy.S (__memcpy_g5):
Rename to __memcpy_default.
* sysdeps/s390/s390-32/multiarch/memset.S: Move to ...
* sysdeps/s390/s390-32/multiarch/memset-s390.S: ... here.
(memset): Use __memset_default as alias source.
* sysdeps/s390/s390-32/multiarch/memset.c: New File.
* sysdeps/s390/s390-32/memset.S (__memset_g5):
Rename to __memset_default.
* sysdeps/s390/s390-64/multiarch/Makefile (sysdep_routines):
Remove ifunc-resolve, add memset-s390x, memcpy-s390x, memcmp-s390x.
* sysdeps/s390/s390-64/multiarch/ifunc-resolve.c: Delete File.
* sysdeps/s390/s390-64/multiarch/memcmp.S: Move to ...
* sysdeps/s390/s390-64/multiarch/memcmp-s390x.S: ... here.
(memcmp, bcmp): Use __memcmp_default as alias source.
* sysdeps/s390/s390-64/multiarch/memcmp.c: New File.
* sysdeps/s390/s390-64/memcmp.S (__memcmp_z900):
Rename to __memcmp_default.
* sysdeps/s390/s390-64/multiarch/memcpy.S: Move to ...
* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: ... here.
(memcpy): Use __memcpy_default as alias source.
* sysdeps/s390/s390-64/multiarch/memcpy.c: New File.
* sysdeps/s390/s390-64/memcpy.S (__memcpy_z900):
Rename to __memcpy_default.
* sysdeps/s390/s390-64/multiarch/memset.S: Move to ...
* sysdeps/s390/s390-64/multiarch/memset-s390x.S: ... here.
(memset): Use __memset_default as alias source.
* sysdeps/s390/s390-64/multiarch/memset.c: New File.
* sysdeps/s390/s390-64/memset.S (__memset_z900):
Rename to __memset_default.
* sysdeps/s390/multiarch/ifunc-resolve.h: New File.
* sysdeps/s390/multiarch/ifunc-impl-list.c: New File.
On s390, the DXC (data-exception-code) byte in the FPC (floating-point-control)
register contains the code of the last exception that occurred.
If bits 6 and 7 of DXC-byte are zero, the bits 0-5 correspond to the
ieee-exception flag bits.
The current implementation always uses these bits as ieee-exception flag bits.
As a result, fetestexcept() reports exceptions after the first use of a
vector instruction in a process, because that first use raises a "vector
instruction exception" with DXC code 0xFE.
This patch fixes the handling of the DXC-byte. The DXC-Byte is only handled
if bits 6 and 7 are zero.
The #define _FPU_RESERVED is extended by the DXC-Byte.
Otherwise the tests math/test-fpucw-static and math/test-fpucw-ieee-static
fail, because the DXC-Byte contains the vector instruction exception when
reaching main(). This exception was triggered by the strrchr() call in
__init_misc().
__init_misc() is called after __setfpucw () in __libc_init_first().
The field __ieee_instruction_pointer in struct fenv_t is renamed to __unused
because it is a relic from commit "Remove PTRACE_PEEKUSER"
(87b9b50f0d) and isn't used anymore.
ChangeLog:
[BZ #18610]
* sysdeps/s390/fpu/bits/fenv.h (fenv_t): Rename
__ieee_instruction_pointer to __unused.
* sysdeps/s390/fpu/fesetenv.c (__fesetenv): Remove usage of
__ieee_instruction_pointer.
* sysdeps/s390/fpu/fclrexcpt.c (feclearexcept): Fix dxc-field handling.
* sysdeps/s390/fpu/fgetexcptflg.c (fegetexceptflag): Likewise.
* sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag): Likewise.
* sysdeps/s390/fpu/ftestexcept.c (fetestexcept): Likewise.
* sysdeps/s390/fpu/fpu_control.h (_FPU_RESERVED):
Mark dxc-field as reserved.
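The rule for interpreting the DXC byte can be sketched as follows. This is a
minimal illustration, not the code from the patch: dxc_to_ieee_flags is a
hypothetical helper that receives the DXC byte already extracted from the FPC
register, and it uses the bit numbering of the text above, where bit 0 is the
most significant bit of the byte.

```c
#include <stdint.h>

/* Hypothetical helper: return the IEEE flag bits encoded in a DXC byte,
   or 0 if the byte does not describe an IEEE exception.  */
static unsigned int
dxc_to_ieee_flags (uint8_t dxc)
{
  /* Bits 6 and 7 are the two least significant bits of the byte.  If
     either is set (as in 0xFE), the code is not an IEEE exception.  */
  if ((dxc & 0x03) != 0)
    return 0;

  /* Otherwise bits 0-5 correspond to the IEEE exception flag bits.  */
  return dxc & 0xFC;
}
```

With this check, the DXC code 0xFE left behind by a vector instruction no
longer shows up as a set of IEEE exception flags.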
Since ld.so preserves vector registers now, we can use the same SSE2
optimized strcmp in x86-64 libc and ld.so.
* sysdeps/x86_64/strcmp.S: Remove "#if !IS_IN (libc)".
Since _dl_x86_64_save_sse and _dl_x86_64_restore_sse are removed now,
we don't need to run tst-getpid2 with LD_BIND_NOW=1.
[BZ #11214]
* sysdeps/unix/sysv/linux/Makefile (tst-getpid2-ENV): Removed.
Explicit system calls for the socket operations were added in Linux kernel
in commit 86250b9d12ca for powerpc. This patch makes use of those instead of
calling socketcall, saving cycles on networking syscalls.
2015-08-25 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* sysdeps/unix/sysv/linux/powerpc/kernel-features.h: Define new macros.
* sysdeps/unix/sysv/linux/accept.c: Call direct system call.
* sysdeps/unix/sysv/linux/bind.c: Call direct system call.
* sysdeps/unix/sysv/linux/connect.c: Call direct system call.
* sysdeps/unix/sysv/linux/getpeername.c: Call direct system call.
* sysdeps/unix/sysv/linux/getsockname.c: Call direct system call.
* sysdeps/unix/sysv/linux/getsockopt.c: Call direct system call.
* sysdeps/unix/sysv/linux/listen.c: Call direct system call.
* sysdeps/unix/sysv/linux/recv.c: Call direct system call.
* sysdeps/unix/sysv/linux/recvfrom.c: Call direct system call.
* sysdeps/unix/sysv/linux/recvmsg.c: Call direct system call.
* sysdeps/unix/sysv/linux/send.c: Call direct system call.
* sysdeps/unix/sysv/linux/sendmsg.c: Call direct system call.
* sysdeps/unix/sysv/linux/sendto.c: Call direct system call.
* sysdeps/unix/sysv/linux/setsockopt.c: Call direct system call.
* sysdeps/unix/sysv/linux/shutdown.c: Call direct system call.
* sysdeps/unix/sysv/linux/socket.c: Call direct system call.
* sysdeps/unix/sysv/linux/socketpair.c: Call direct system call.
Fix usage of tabort in generated syscalls. r0 has special meaning
when used with this instruction, thus it will not generate
persistent errors, nor return an error code. This mitigates poor
CPU usage when performing elided critical sections.
Additionally, transactions should be aborted when entering a user
invoked syscall. Otherwise the results of the transaction may be
undefined.
2015-08-25 Paul E. Murphy <murphyp@linux.vnet.ibm.com>
* sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION): Use
register other than r0 for tabort, it has special meaning.
* sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION): Likewise.
* sysdeps/unix/sysv/linux/powerpc/syscall.S (syscall): Abort
transaction before starting syscall.
Instead of checking the needle length, a constant number 'n' of comparisons
is counted before falling back to the default implementation. This patch is
tested on powerpc64 and powerpc64le.
2015-08-25 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* sysdeps/powerpc/powerpc64/power7/strstr.S: Handle worst case.
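The idea of bounding the work before falling back can be sketched in C. This
is not the POWER7 assembly: bounded_strstr and its max_compares parameter are
hypothetical, the fast path here is a plain naive search, and the libc strstr
stands in for the default implementation.

```c
#include <stddef.h>
#include <string.h>

/* Naive search that gives up after max_compares character comparisons
   and hands the rest of the work to the default strstr.  */
static char *
bounded_strstr (const char *haystack, const char *needle,
                size_t max_compares)
{
  size_t compares = 0;

  if (*needle == '\0')
    return (char *) haystack;

  for (const char *h = haystack; *h != '\0'; h++)
    {
      const char *a = h;
      const char *b = needle;

      while (*b != '\0')
        {
          /* All positions before h are already ruled out, so resuming
             the search at h with the fallback is safe.  */
          if (++compares > max_compares)
            return strstr (h, needle);
          if (*a != *b)
            break;
          a++;
          b++;
        }

      if (*b == '\0')
        return (char *) h;
    }

  return NULL;
}
```

A small limit keeps the simple loop for typical inputs, while pathological
inputs such as long runs of a repeated character reach the fallback quickly.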
Since ld.so preserves vector registers now, we can use %xmm[0-4] to
avoid the REX prefix.
* sysdeps/x86_64/strlen.S: Replace %xmm[8-12] with %xmm[0-4].
Hi, as I wrote in previous patches, the performance of the checked strcpy
and stpcpy is terrible, as these don't use sse2 and are now around four
times slower than strcpy and stpcpy.
As this bug shows that these functions are not performance sensitive, I
decided just to improve the generic implementation instead, for easier
maintenance.
* debug/strcpy_chk.c: Improve performance.
* debug/stpcpy_chk.c: Likewise.
* sysdeps/x86_64/strcpy_chk.S: Remove.
* sysdeps/x86_64/stpcpy_chk.S: Remove.
This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve
and _dl_runtime_profile, which save and restore the first 8 vector
registers used for parameter passing. elf_machine_runtime_setup
selects the proper _dl_runtime_resolve or _dl_runtime_profile based
on _dl_x86_cpu_features. It avoids race condition caused by
FOREIGN_CALL macros, which are only used for x86-64.
The performance impact of saving and restoring 8 vector registers is
negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when
ld.so is optimized with SSE2.
[BZ #15128]
* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
ifuncmain8.
(modules-names): Add ifuncmod8.
($(objpfx)ifuncmain8): New rule.
* sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and
<cpuid.h>.
(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
* sysdeps/x86_64/dl-trampoline.S: Rewrite.
* sysdeps/x86_64/dl-trampoline.h: Likewise.
* sysdeps/x86_64/ifuncmain8.c: New file.
* sysdeps/x86_64/ifuncmod8.c: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
Removed.
* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
Change rtld_savespace_sse to __glibc_unused2.
(RTLD_CHECK_FOREIGN_CALL): Removed.
(RTLD_ENABLE_FOREIGN_CALL): Likewise.
(RTLD_PREPARE_FOREIGN_CALL): Likewise.
(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
PowerPC has always used __IPC_64 like most other architectures, which
means that __ASSUME_IPC64 can be always true. Also, all other
architecture implementations that use the ipc syscall are effectively
identical to the generic version and can be removed.
In powerpc64, memchr was always pointing to the internal __GI_memchr
implementation. This patch fixes that and makes it use the
optimized POWER7 version when adequate.
* sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c: Make
memchr not point to the internal __GI_memchr implementation.
sysdeps/i386/i686/multiarch/strcasestr-c.c became unused after
commit 1818483b15
Author: Andreas Schwab <schwab@suse.de>
Date: Wed Dec 18 11:53:27 2013 +1000
Remove use of SSE4.2 functions for strstr on i686
which contains
-sysdep_routines += strcspn-c strpbrk-c strspn-c strstr-c strcasestr-c
+sysdep_routines += strcspn-c strpbrk-c strspn-c
sysdeps/x86_64/multiarch/strcasestr.c became useless after
commit 584b18eb4d
Author: Ondřej Bílka <neleai@seznam.cz>
Date: Sat Dec 14 19:33:56 2013 +0100
Add strstr with unaligned loads. Fixes bug 12100.
which changes sysdeps/x86_64/multiarch/strcasestr.c to
libc_ifunc (__strcasestr, __strcasestr_sse2);
This patch removes these files.
* i386/i686/multiarch/strcasestr-c.c: Removed.
* x86_64/multiarch/strcasestr.c: Likewise.
* x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
Remove strcasestr.
Removing the use of -Wno-uninitialized for math/ shows errors for
ldbl-128ibm:
../sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: In function '__nearbyintl':
../sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c:119:34: error: 'low' may be used uninitialized in this function [-Werror=maybe-uninitialized]
u.d[1].d = high - u.d[0].d + low;
^
../sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c:119:23: error: 'high' may be used uninitialized in this function [-Werror=maybe-uninitialized]
u.d[1].d = high - u.d[0].d + low;
^
These errors are correct: if the high part of the argument is a NaN,
and the low part is nonzero but has absolute value less than 2^52,
those variables can be used uninitialized. This patch rearranges the
code so that the variables are always initialized with the natural
values, and then possibly modified later, to avoid this uninitialized
use. (Note that there are still other issues with this code and NaNs
that are not fixed by this patch.) No bug filed in Bugzilla or
testcase added for the uninitialized use since it wasn't user-visible
with the compiler I tried (that is, I still got a NaN result).
Tested for powerpc.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Always initialize
variables for high and low parts before possibly modifying them.
Move sysdeps/x86_64/multiarch/init-arch.h to sysdeps/x86/init-arch.h
which can be used for both i386 and x86_64.
* sysdeps/i386/i686/multiarch/init-arch.h: Removed.
* sysdeps/unix/sysv/linux/x86/init-arch.h: Likewise.
* sysdeps/x86_64/cacheinfo.c: Include <init-arch.h> instead
of "multiarch/init-arch.h".
* sysdeps/x86_64/multiarch/init-arch.h: Renamed to ...
* sysdeps/x86/init-arch.h: This.
Both files include sysdeps/x86_64/multiarch/init-arch.c which has been
removed.
* sysdeps/i386/i686/multiarch/init-arch.c: Removed.
* sysdeps/unix/sysv/linux/x86/init-arch.c: Likewise.
The csqrt implementations in glibc can miss underflow exceptions when
the real or imaginary part of the result becomes tiny in the course of
scaling down (in particular, multiplication by 0.5) and that scaling
is exact although the relevant part of the mathematical result isn't.
This patch forces the exception in a similar way to previous fixes.
Tested for x86_64 and x86.
[BZ #18370]
* math/s_csqrt.c (__csqrt): Force underflow exception for results
whose real or imaginary part has small absolute value.
* math/s_csqrtf.c (__csqrtf): Likewise.
* math/s_csqrtl.c (__csqrtl): Likewise.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
This patch adds extra inline functions to change the Program Priority
Register from ISA 2.07.
2015-08-19 Gabriel F. T. Gomes <gftg@linux.vnet.ibm.com>
* sysdeps/powerpc/sys/platform/ppc.h (__ppc_set_ppr_med_high,
__ppc_set_ppr_very_low): New functions.
* manual/platform.texi: Add documentation about
__ppc_set_ppr_med_high and __ppc_set_ppr_very_low.
Fix the bind-now case when DT_REL and DT_JMPREL sections are separate
and there is a gap between them.
[BZ #14341]
* elf/dynamic-link.h (elf_machine_lazy_rel): Properly handle the
case when there is a gap between DT_REL and DT_JMPREL sections.
* sysdeps/x86_64/Makefile (tests): Add tst-split-dynreloc.
(LDFLAGS-tst-split-dynreloc): New.
(tst-split-dynreloc-ENV): Likewise.
* sysdeps/x86_64/tst-split-dynreloc.c: New file.
* sysdeps/x86_64/tst-split-dynreloc.lds: Likewise.
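The handling of a gap can be illustrated with a small range planner. This is
a sketch under stated assumptions, not the code in elf/dynamic-link.h: the
type reloc_range and the function plan_reloc_ranges are hypothetical, and the
sketch only covers the bind-now case, where both ranges are processed the
same way.

```c
#include <stddef.h>

/* Hypothetical description of one relocation table in memory.  */
struct reloc_range
{
  size_t start;
  size_t size;
};

/* Decide how to walk the DT_REL and DT_JMPREL tables.  If the second
   table starts exactly where the first ends, they can be processed as
   one range; if there is a gap, they must stay separate so that the
   bytes in between are never treated as relocations.  Returns the
   number of ranges written to OUT.  */
static int
plan_reloc_ranges (struct reloc_range rel, struct reloc_range jmprel,
                   struct reloc_range out[2])
{
  if (rel.start + rel.size == jmprel.start)
    {
      out[0].start = rel.start;
      out[0].size = rel.size + jmprel.size;
      return 1;
    }

  out[0] = rel;
  out[1] = jmprel;
  return 2;
}
```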
__xstat_conv, __xstat64_conv and __xstat32_conv are internal to glibc.
They should be marked as hidden so that they can be called without
the PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/xstatconv.h (__xstat_conv): Add
attribute_hidden.
(__xstat64_conv): Likewise.
(__xstat32_conv): Likewise.
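For readers unfamiliar with hidden visibility, the following sketch shows the
general pattern with GCC or Clang on an ELF target. The names internal_conv,
public_entry and sketch_attribute_hidden are hypothetical and only mirror the
way attribute_hidden is attached to a declaration.

```c
/* Hypothetical equivalent of glibc's attribute_hidden macro.  */
#define sketch_attribute_hidden __attribute__ ((visibility ("hidden")))

/* Internal helper: hidden, so it is not exported from the shared object
   and calls to it from the same object can be bound locally.  */
extern int internal_conv (int value) sketch_attribute_hidden;

int
internal_conv (int value)
{
  return value + 1;
}

/* Exported entry point that uses the internal helper.  */
int
public_entry (int value)
{
  return internal_conv (value);
}
```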
Since _dl_x86_cpu_features is always available, we can use x86-64
cacheinfo.c and sysconf.c for both i386 and x86-64.
* sysdeps/i386/i686/Makefile
[$(subdir) == string] (sysdep_routines): Moved to ...
* sysdeps/i386/Makefile: Here.
* sysdeps/i386/i686/cacheinfo.c: Moved to ...
* sysdeps/i386/cacheinfo.c: Here.
* sysdeps/unix/sysv/linux/i386/sysconf.c: Removed.
* sysdeps/unix/sysv/linux/i386/i686/sysconf.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/sysconf.c: Moved to ...
* sysdeps/unix/sysv/linux/x86/sysconf.c: Here.
This patch fixes -Wundef warnings relating to __mips_isa_rev being
undefined.
Tested for mips64 (all three ABIs) that there is a clean build and
testsuite run with -Wno-error=undef removed (and my other -Wundef
patches applied).
* sysdeps/mips/dl-machine.h [__mips_isa_rev < 6]: Change
conditionals to [!defined __mips_isa_rev || __mips_isa_rev < 6].
* sysdeps/mips/machine-gmon.h [__mips_isa_rev < 6]: Likewise.
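The changed conditional follows a common pattern for macros that only some
compilers define. A minimal sketch, with the hypothetical names SKETCH_PRE_R6
and uses_pre_r6_path:

```c
/* With -Wundef, "#if __mips_isa_rev < 6" warns whenever __mips_isa_rev
   is not defined.  Testing "defined" first avoids evaluating the
   undefined macro while keeping the same result.  */
#if !defined __mips_isa_rev || __mips_isa_rev < 6
# define SKETCH_PRE_R6 1
#else
# define SKETCH_PRE_R6 0
#endif

/* Report which branch of the conditional was taken at compile time.  */
static int
uses_pre_r6_path (void)
{
  return SKETCH_PRE_R6;
}
```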
Some features in hwcap.h do not have matching string descriptors
to be displayed when LD_SHOW_AUXV=1. This patch fixes the problem.
2015-08-13 Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
* sysdeps/powerpc/dl-procinfo.c:
(_dl_powerpc_cap_flags): Added missing strings for some
hwcap features.
* sysdeps/powerpc/dl-procinfo.h: Updated hwcap bit count.
cpuid, i586 and i686 instructions are available if the processor
specified by -march= supports them. We can use this information
to determine whether those instructions can be used safely.
* sysdeps/x86/cpu-features.c (init_cpu_features): Check
whether cpuid is available only if HAS_CPUID is 0.
* sysdeps/x86/cpu-features.h (HAS_CPUID): New.
(HAS_I586): Likewise.
(HAS_I686): Likewise.
This brings hppa in line with all the other arches and main code where we
require TLS support everywhere. That means dropping the defines USE_TLS
and USE___THREAD, and dropping the binutils check (since we already have
a version requirement that is new enough).
Various fma implementations have logic that, when computing fma (x, y,
z) where z is large (so care needs taking to avoid internal overflow)
but x * y is small, scale x * y up instead of down to avoid internal
underflows resulting from scaling down. (In these cases, x * y is
small enough that only its sign actually matters rather than the exact
value.)
The threshold for scaling up instead of down was correct for "if the
unscaled values were multiplied, the low part of the multiplication
could underflow", and the scaling was sufficient to ensure that the
low part of the multiplication did not underflow (given that cases of
very small x * y - less than half the least subnormal - were
previously dealt with). However, the choice in the functions wasn't
between scaling up or no scaling, but between scaling up and scaling
down (scaling down actually being needed when x * y isn't so small
compared to z and so the exact value does matter). Thus a larger
threshold is needed to ensure that scaling down doesn't produce values
the multiplication of whose low parts underflows. This patch
increases the thresholds accordingly.
Tested for x86_64, x86 and mips64 (with the MIPS version of s_fmal.c
removed so that the ldbl-128 version gets tested instead of the
soft-fp one).
[BZ #18824]
* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Increase threshold for
scaling x * y up instead of down.
* sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
* sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
* math/auto-libm-test-in: Add more tests of fma.
* math/auto-libm-test-out: Regenerated.
The change in 0b5395f052 replaced calls
to __get_cpu_features@plt followed by a mov from rax to rdx, with a
single macro LOAD_RTLD_GLOBAL_RO_RDX. It is pretty clear that there
was a typo in s_floorf and __nearbyint due to which the (now incorrect)
mov was not removed. This patch removes that mov.
* sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf): Remove
unnecessary movq.
* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S (__nearbyint):
Likewise.
This patch adds more test inputs to various libm functions found
through random generation to have larger ulps errors than previously
listed in libm-test-ulp, on at least one of x86_64 and x86.
Tested for x86_64 and x86.
* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc,
exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh
and tgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Similar to various other bugs in this area, some tanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16520]
* sysdeps/ieee754/dbl-64/s_tanh.c: Include <float.h>.
(__tanh): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_tanhf.c: Include <float.h>.
(__tanhf): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_tanhl.c: Include <float.h>.
(__tanhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: Include <float.h>.
(__tanhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-96/s_tanhl.c: Include <float.h>.
(__tanhl): Force underflow exception for arguments with small
absolute value.
* math/auto-libm-test-in: Add more tests of tanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
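The general technique of forcing the exception can be sketched as follows.
This is an illustration of the idea, not the glibc helper used by the patch:
tiny_result_with_underflow is a hypothetical function that returns its
argument as a stand-in for the tiny result, and the volatile store is an
assumption made here to keep the compiler from discarding the multiplication.

```c
#include <float.h>

/* For arguments below the smallest normal double, perform a
   multiplication whose exact result is too small to be represented.
   The multiplication is intended to raise the underflow exception as a
   side effect; its value is not used.  */
static double
tiny_result_with_underflow (double x)
{
  double ax = x < 0 ? -x : x;

  if (ax < DBL_MIN)
    {
      volatile double force_underflow = x * x;
      (void) force_underflow;
    }

  /* Stand-in for the function result, which is close to x here.  */
  return x;
}
```

For a zero argument the product is exactly zero, so no exception is expected
in that case. Whether the flag is set can be checked with fetestexcept
(FE_UNDERFLOW) from <fenv.h>.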
Since not all i486 processors support cpuid, we call __get_cpuid_max to
check whether cpuid is available before using it, unless compiling for
i586, i686 or x86-64.
* sysdeps/x86/cpu-features.c (init_cpu_features): Call
__get_cpuid_max if not compiling for i586, i686 nor x86-64.
This patch updates x86 elision-conf.c to use the newly defined
HAS_CPU_FEATURE from <cpu-features.h>.
* sysdeps/unix/sysv/linux/x86/elision-conf.c (elision_init):
Replace HAS_RTM with HAS_CPU_FEATURE (RTM).
When building with --disable-multi-arch, the memmove and strstr POWER7
optimizations create and use symbols that conflict with what the conform
tests expect.
* sysdeps/powerpc/powerpc64/power7/memmove.S (bcopy): Changing to
__bcopy and add a weak_alias to bcopy.
* sysdeps/powerpc/powerpc64/power7/strstr.S (strstr): Use __strnlen
for static build.
This patch uses the default strcpy/stpcpy implementation for
POWER7/PPC64. It is faster for most benchtests inputs, and for
multiarch the implementation uses the POWER7 strlen and memcpy.
* string/stpcpy.c (__stpcpy): Use STPCPY to redefine symbol name and
cleanup macro usage.
* string/strcpy.c (strcpy): Use STRCPY to redefine symbol name.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.S: Remove file.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/stpcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.
* sysdeps/powerpc/powerpc64/stpcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/strcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
[SHARED && IS_IN (libc)]: Include <string/strcpy.c>.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
[SHARED && IS_IN (libc)]: Include <string/stpcpy.c>.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.c: New file.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.
This patch fixes the strstr build with the --disable-multi-arch option.
The optimization calls the __strstr_ppc symbol, which is always built
for multiarch configurations but not when multiarch is disabled. This patch
fixes it by adding the default C implementation object with the expected
symbol name.
* sysdeps/powerpc/powerpc64/power7/Makefile [$(subdir) = string]
(sysdep_routines): Add strstr-ppc64.
* sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c: New file.
This patch adds more tests of various libm functions found through
random test generation to give increased ulps on 32-bit x86.
Tested for x86_64 and x86.
* math/auto-libm-test-in: Add more tests of acosh, asin, asinh,
atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10,
expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
ldbl-128ibm tanhl uses a too-small threshold to decide when to return
+/-1, resulting in large errors. This patch changes it to a more
appropriate threshold (the requirement is for 2*exp(-2|x|) to be small
in terms of ulps of 1).
Tested for x86_64, x86 and powerpc.
[BZ #18790]
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Increase
threshold for returning +/- 1.
* math/auto-libm-test-in: Add more tests of tanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
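One way to read the stated requirement: since 1 - tanh(|x|) is slightly less
than 2*exp(-2|x|), the result rounds to +/-1 once 2*exp(-2|x|) is below half
an ulp of 1. Assuming half an ulp means 2^-(p+1) for a format with p bits of
precision, this gives |x| > (p + 2) * ln(2) / 2. The sketch below evaluates
that bound; tanh_saturation_bound is a hypothetical helper, and the threshold
actually chosen by the patch is not stated in this message.

```c
/* Approximate |x| above which 2*exp(-2|x|) is below 2^-(p+1), where p is
   the number of precision bits.  Derived from
     2*exp(-2|x|) < 2^-(p+1)  <=>  |x| > (p + 2) * ln(2) / 2.  */
static double
tanh_saturation_bound (int precision_bits)
{
  const double ln2 = 0.69314718055994530942;
  return (precision_bits + 2) * ln2 / 2.0;
}
```

For a 106-bit format (an assumption for ldbl-128ibm) the bound is a little
above 37, so any threshold well below that returns +/-1 too early.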
ldbl-128ibm sinhl uses a too-big threshold to decide when to return
the argument, resulting in large errors. This patch fixes it to use a
more appropriate threshold.
Tested for x86_64, x86 and powerpc.
[BZ #18789]
* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c (__ieee754_sinhl): Use
smaller threshold for returning the argument.
* math/auto-libm-test-in: Add more tests of sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
No other arch exports these defines, and having them in the default
namespace causes conformance header tests to fail. Put them behind
the __USE_MISC define as that is what other arches seem to use.
The attached change fixes the miscompilation of sched_setaffinity() on
hppa. This is an old problem that was fixed on other architectures using
a similar approach to the attached change. See:
https://sourceware.org/ml/libc-hacker/2004-04/msg00016.html
Build tested on trunk. Patch has been applied to debian glibc for some time.
As noted in the bug, the asm operands need to be copied to register
variables to avoid operand reloads in the principal asm of the macro.
See the arm implementation for reference. Otherwise we get:
../sysdeps/unix/sysv/linux/hppa/bits/atomic.h:68:6: error:
can't find a register in class 'R1_REGS' while reloading 'asm'
Build tested on trunk with gcc-4.8. Similar patch has been tested
with 2.19 on Debian hppa-unknown-linux-gnu.
The semi-recent SYSCALL_CANCEL inclusion broke microblaze due to the
sysdep.h header not including the unix/sysdep.h header. Include it
here like all other ports.
Similar to various other bugs in this area, some tan implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16517]
* sysdeps/ieee754/dbl-64/s_tan.c: Include <float.h>.
(tan): Force underflow exception for arguments with small absolute
value.
* sysdeps/ieee754/flt-32/k_tanf.c: Include <float.h>.
(__kernel_tanf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_tanl.c: Include <float.h>.
(__kernel_tanl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Include <float.h>.
(__kernel_tanl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_tanl.c: Include <float.h>.
(__kernel_tanl): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
Commit 2a6ad8142d updated the headers and
the common dl-symaddr.c, but missed that hppa has its own dedicated source
file for this func. Update that too to fix build errors due to missing
exports of the symbol.
Similar to various other bugs in this area, some sinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16519]
* sysdeps/ieee754/dbl-64/e_sinh.c: Include <float.h>.
(__ieee754_sinh): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/flt-32/e_sinhf.c: Include <float.h>.
(__ieee754_sinhf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_sinhl.c: Include <float.h>.
(__ieee754_sinhl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: Include <float.h>.
(__ieee754_sinhl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_sinhl.c: Include <float.h>.
(__ieee754_sinhl): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
Subtract stack by 24 bytes instead of 16 bytes so that stack is aligned
to 16 bytes when calling __gettimeofday.
[BZ #18661]
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
(__lll_timedwait_tid): Align stack to 16 bytes when calling
__gettimeofday.
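The arithmetic behind the change can be checked with a small helper. This is
a sketch under stated assumptions: aligned_before_call is hypothetical, it
relies on the x86-64 convention that the stack pointer is 8 modulo 16 on
function entry (the call instruction has pushed the return address), and the
number of register pushes before the adjustment is taken as a parameter
because it is not stated in this message.

```c
/* Return nonzero if the stack pointer is 16-byte aligned after PUSHES
   8-byte register pushes and a further subtraction of SUB_BYTES,
   starting from the entry state of 8 modulo 16.  */
static int
aligned_before_call (unsigned int pushes, unsigned int sub_bytes)
{
  /* Distance below the caller's 16-byte aligned stack pointer.  */
  unsigned int offset = 8 + 8 * pushes + sub_bytes;
  return offset % 16 == 0;
}
```

With an even number of pushes, subtracting 16 leaves the stack pointer at
8 modulo 16, while subtracting 24 restores 16-byte alignment.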
Don't use pop to restore %rdi so that stack is aligned to 16 bytes
when calling __setcontext.
[BZ #18661]
* sysdeps/unix/sysv/linux/x86_64/__start_context.S
(__start_context): Don't use pop to restore %rdi so that stack
is aligned to 16 bytes when calling __setcontext.
{memcpy,strcmp}-sse2-unaligned.S aren't needed in ld.so.
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Compile
only for libc.
* sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: Likewise.
It uses the same logic as the ARM version. The common case removes 1 FPSR
and 1 FPCR read. For FE_DFL_ENV and FE_NOMASK_ENV a FPCR read is avoided in
case the FPCR does not change.
The flt-32 implementation of powf wrongly uses x-1 instead of |x|-1
when computing log (x) for the case where |x| is close to 1 and y is
large. This patch fixes the logic accordingly. Relevant tests
existed for x close to 1, and corresponding tests are added for x
close to -1, as well as for some new variant cases.
Tested for x86_64 and x86.
[BZ #18647]
* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): For large y
and |x| close to 1, use absolute value of x when computing log.
* math/auto-libm-test-in: Add more tests of pow.
* math/auto-libm-test-out: Regenerated.
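A minimal sketch of the difference between the two expressions, using
hypothetical helper names (the real code in e_powf.c is structured
differently):

```c
#include <math.h>

/* Argument fed to the log approximation before the fix.  */
float
log_arg_buggy (float x)
{
  return x - 1.0f;
}

/* Argument fed to the log approximation after the fix.  */
float
log_arg_fixed (float x)
{
  return fabsf (x) - 1.0f;
}
```

For x close to -1 the first form yields a value near -2 instead of a
value near 0, so an approximation that is only valid for small
arguments is evaluated far outside its range.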
The semi-recent SYSCALL_CANCEL inclusion broke hppa due to the sysdep.h
headers not including the unix/sysdep.h headers. Rework the includes so
we match the other ports:
* hppa/sysdep.h:
- Do not include sys/syscall.h as the unix sysdep.h headers do it.
- Do not include config.h as libc-symbols.h does it, and it has no
#ifdef multiple-include protection, and it breaks when some files
do things like #undef __OPTIMIZE__.
* sysdeps/unix/sysv/linux/hppa/sysdep-cancel.h:
- Drop the generic/sysdep.h as the unix sysdep.h headers include it.
* sysdeps/unix/sysv/linux/hppa/sysdep.h:
- Change to the unix & core hppa sysdep header stacks.
- Undef a few defines that the core headers already set up for us.
The semi-recent SYSCALL_CANCEL macro imposes a slight nuance on the
implementation of INLINE_SYSCALL: the nr argument cannot be expanded
directly but must be passed on to another macro which may expand it.
Most arches don't notice because INLINE_SYSCALL is defined in terms
of INTERNAL_SYSCALL which has the additional layer of expansion, but
on hppa, it was attempting to expand it directly. That causes build
errors like so:
../sysdeps/unix/sysv/linux/sigsuspend.c: In function '__sigsuspend':
../sysdeps/unix/sysv/linux/sigsuspend.c:31:62: error:
implicit declaration of function 'LOAD_ARGS___SYSCALL_NARGS'
../sysdeps/unix/sysv/linux/sigsuspend.c:31:304: error:
called object 'LOAD_ARGS___SYSCALL_NARGS(set, 8)' is not a function
So rewrite hppa's INLINE_SYSCALL to use INTERNAL_SYSCALL like other
arches do. This is also a nice clean up as the two macros had quite
a bit of duplicated logic.
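The expansion problem can be reproduced with a few stand-in macros.
Everything below is hypothetical and only mimics the shape of the real
macros: NARGS plays the role of __SYSCALL_NARGS, and LOAD_ARGS_2 simply
adds its arguments so the result is easy to check.

```c
/* Hypothetical stand-ins; only the expansion order matters here.  */
#define NARGS_PICK(a1, a2, a3, n, ...) n
#define NARGS(...) NARGS_PICK (__VA_ARGS__, 3, 2, 1, 0)

#define LOAD_ARGS_1(a) (a)
#define LOAD_ARGS_2(a, b) ((a) + (b))

/* Broken: NR is an operand of ##, so it is pasted without being
   expanded.  DIRECT_CALL (NARGS (3, 4), 3, 4) would turn into
   LOAD_ARGS_NARGS (3, 4) (3, 4), a call to an undeclared function.  */
#define DIRECT_CALL(nr, ...) LOAD_ARGS_##nr (__VA_ARGS__)

/* Working: the outer macro does not use ## on NR, so the argument is
   fully expanded before INNER_CALL pastes it.  */
#define INNER_CALL(nr, ...) LOAD_ARGS_##nr (__VA_ARGS__)
#define LAYERED_CALL(nr, ...) INNER_CALL (nr, __VA_ARGS__)

int
sum_two (int x, int y)
{
  /* Expands to LOAD_ARGS_2 (x, y), i.e. ((x) + (y)).  */
  return LAYERED_CALL (NARGS (x, y), x, y);
}
```

The extra layer is what INTERNAL_SYSCALL provides on the other arches.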
On x86, linker in binutils 2.26 and newer consolidates R_*_JUMP_SLOT with
R_*_GLOB_DAT relocation against the same symbol. This patch extends
local PLT reference check to support alternate relocations.
[BZ #18078]
* scripts/check-localplt.awk: Support alternate relocations.
* scripts/localplt.awk: Also check relocations in DT_RELA/DT_REL
sections.
* sysdeps/unix/sysv/linux/i386/localplt.data: Mark free and
malloc entries with + REL R_386_GLOB_DAT.
* sysdeps/x86_64/localplt.data: New file.
Way back in 2005 the atomic_exchange_and_add function was cleaned up to
avoid the explicit size checking and instead let gcc handle things itself.
Unfortunately that change ended up leaving behind a cast to int, even when
the incoming value was a long. This has flown under the radar for a long
time due to the function not being heavily used in the tree (especially as
a full 64bit field), but a recent change to semaphores made some nptl tests
fail reliably. This is due to the code packing two 32bit values into one
64bit variable (where the high 32bits contained the number of waiters), and
then the whole variable being atomically updated between threads. On ia64,
that meant we never atomically updated the count, so sometimes the sem_post
would not wake up the waiters.
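The effect of the stray cast can be sketched as follows. The function
names are hypothetical, the GCC __sync_fetch_and_add builtin is used as
a stand-in for the real macro body, and the out-of-range conversion to
int is implementation-defined (GCC reduces it modulo 2^32).

```c
#include <stdint.h>

/* Sketch of the buggy form: the addend is truncated to int first.  */
uint64_t
exchange_and_add_buggy (uint64_t *mem, uint64_t value)
{
  return __sync_fetch_and_add (mem, (int) value);
}

/* Sketch of the fixed form: the addend keeps its full width.  */
uint64_t
exchange_and_add_fixed (uint64_t *mem, uint64_t value)
{
  return __sync_fetch_and_add (mem, value);
}
```

Adding 1 to the waiter count means adding 1 << 32 to the packed
variable; with the cast that addend becomes 0 and the update is lost.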
This define made more sense in the pre-sanitized kernel headers days,
but since we require kernel versions that are sanitized, we don't need
this hack anymore.
Since ia64 is little endian, sa_flags has to come before the padding
when splitting it from 64bits to 32bits.
Reported-by: Joseph Myers <joseph@codesourcery.com>
It turns out tile suffered from the same problem as S390. However,
disabling CFI information for the __startcontext on tile was not
sufficient to fix the problem; I think the backtracer will just
blindly try to follow the link register (lr) in that case.
Instead, the change adds a cfi_undefined directive for "lr"
and then arranges to call __startcontext directly when the new
context starts, rather than just synthesizing a return to it.
In addition to being a bit easier now to understand the control
flow, this also allows the cfi_undefined directive to be placed in
a way that causes it to be in force at the address that the "lr"
from the called function points to.
Commit a059d359d8 changed the sigaction
struct to pass conform tests, but it ended up also changing the ABI for
32 bit builds. For 64 bit builds, changing the long to two ints works,
but for 32 bit builds, it inserts 4 extra bytes. This leads to many
packages failing randomly, such as bash, which spews things like:
configure: line 471: wait_for: No record of process 0
Bracket the new member by a wordsize check to fix the ABI for 32bit.
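The layout issue can be illustrated with two simplified structs. These
are hypothetical and much smaller than the real sigaction struct, and
the ULONG_MAX test stands in for the wordsize check mentioned above.

```c
#include <limits.h>

/* Simplified old layout: the flags are one long.  */
struct sigaction_old
{
  void *handler;
  unsigned long sa_flags;
};

/* Simplified new layout: an int plus padding, where the padding is
   only present when long is wider than 32 bits.  */
struct sigaction_new
{
  void *handler;
  int sa_flags;
#if ULONG_MAX > 0xffffffffUL
  int pad;
#endif
};

/* With the guard in place both layouts have the same size.  */
_Static_assert (sizeof (struct sigaction_old)
                == sizeof (struct sigaction_new),
                "struct size must not change");
```

Without the guard, a 32-bit build would get the int plus 4 bytes of
padding where the old layout had a single 4-byte long.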
X86 struct siginfo in kernel 3.19 has been changed by
commit ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Author: Qiaowei Ren <qiaowei.ren@intel.com>
Date: Fri Nov 14 07:18:19 2014 -0800
mpx: Extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.
This patch updates x86 struct siginfo to enable GDB with MPX support.
[BZ #18696]
* sysdeps/unix/sysv/linux/x86/bits/siginfo.h (_sigfault): Add
si_addr_bnd.
(si_lower): New.
(si_upper): Likewise.
This patch optimizes strstr function for power >= 7 systems. Performance
gain is obtained using aligned memory access and usage of cmpb
instruction for quicker comparison. The average improvement of this
optimization is ~40%. Tested on ppc64 and ppc64le.
2015-07-16 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* sysdeps/powerpc/powerpc64/multiarch/Makefile: Add strstr().
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/strstr.S: New File.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: New File.
* sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c: New File.
* sysdeps/powerpc/powerpc64/multiarch/strstr.c: New File.
This symbol is only used by DL_UNMAP which in turn is only used by
_dl_close_worker in dl-close.c, and _dl_close_worker itself is marked
hidden as it is only used by the ldso. That means _dl_unmap should
be marked hidden. Without this, the elf/check-localplt test fails.
This symbol is defined in the ldso, and is used both there and libc.so.
There is no hidden symbol for it though which leads to relocations in
the ldso and the elf/check-localplt test failing. Add a hidden def for
rtld to fix all of that.
This function/file is only used by hppa & ia64, so no testing is needed
for other arches.
These tests were skipped by the use-test-skeleton conversion done in
commit 29955b5d because they were reused in other tests via the #include
directive, and so deemed worth an inspection before they were modified.
This has now been done.
ChangeLog:
2015-07-09 Arjun Shankar <arjun.is@lostca.se>
* elf/tst-leaks1.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* localedata/tst-langinfo.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* math/test-fpucw.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* math/test-tgmath.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* math/test-tgmath2.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* setjmp/tst-setjmp.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* stdio-common/tst-sscanf.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* sysdeps/x86_64/tst-audit6.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
The testcase stdlib/tst-makecontext fails on i686 because
_Unwind_Backtrace from libgcc produces a segmentation fault if it was
called within a context created by makecontext. See Bug 18635.
ChangeLog:
* sysdeps/i386/i686/Makefile (test-xfail-tst-makecontext):
New variable.
for nul bytes which reverts to separate loop when a non-ASCII char is encountered.
Speedup on test-strlen is ~10%, long ASCII strings are processed ~60% faster,
and on random tests it is ~80% better.
This adds new functions for futex operations, starting with wait,
abstimed_wait, reltimed_wait, wake. They add documentation and error
checking according to the current draft of the Linux kernel futex manpage.
Waiting with absolute or relative timeouts is split into separate functions.
This allows for removing a few cases of code duplication in pthreads code,
which uses absolute timeouts; also, it allows us to put platform-specific
code to go from an absolute to a relative timeout into the platform-specific
futex abstractions.
Futex operations that can be canceled are also split out into separate
functions suffixed by "_cancelable".
There are separate versions for both Linux and NaCl; while they currently
differ only slightly, my expectation is that the separate versions of
lowlevellock-futex.h will eventually be merged into futex-internal.h
when we get to move the lll_ functions over to the new futex API.
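The absolute-to-relative conversion mentioned above could look roughly
like this. abs_to_rel_timeout is a hypothetical helper written for
illustration; it is not one of the new futex functions, and it takes the
current time as a parameter instead of reading a clock.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: store ABSTIME - NOW in *RELTIME.  Return false
   if the deadline has already passed.  */
bool
abs_to_rel_timeout (const struct timespec *abstime,
                    const struct timespec *now,
                    struct timespec *reltime)
{
  reltime->tv_sec = abstime->tv_sec - now->tv_sec;
  reltime->tv_nsec = abstime->tv_nsec - now->tv_nsec;
  if (reltime->tv_nsec < 0)
    {
      /* Borrow one second to keep tv_nsec in [0, 1e9).  */
      reltime->tv_nsec += 1000000000L;
      --reltime->tv_sec;
    }
  return reltime->tv_sec >= 0;
}
```

Keeping this step in one platform-specific place is what lets callers
in pthreads code work with absolute timeouts only.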
If x86-64 assembler doesn't support MPX, we encode bndmov instruction by
hand. When displacement is zero, assembler generates shorter encoding.
This patch improves bndmov encoding with zero displacement so that ld.so
is identical when using assemblers with and without MPX support.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve
bndmov encoding with zero displacement.
We need to save/restore bound registers and add a BND prefix before
branches in _dl_runtime_profile so that bound registers for pointer
pass and return are preserved when LD_AUDIT is used.
[BZ #18134]
* sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
* sysdeps/i386/configure: Regenerated.
* sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_profile): Save and restore Intel MPX return bound
registers when calling _dl_call_pltexit. Add
PRESERVE_BND_REGS_PREFIX before return.
* sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
(LRV_BND1_OFFSET): Likewise.
* sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and
lrv_bnd1.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
typo in bndmov encoding.
* sysdeps/x86_64/dl-trampoline.h: Properly save and restore
Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before
branch instructions to preserve bounds.
* sysdeps/mach/hurd/mlock.c (mlock): When __get_privileged_ports
returns an error, also try to use host port from __mach_host_self for
the __vm_wire call.
* sysdeps/mach/hurd/munlock.c (munlock): Likewise.
This is an ABI breaking change, but
typedef int greg_t;
is not a useful definition on aarch64.
greg_t is usually used for defining gregset_t which is used
in mcontext_t. The general registers in mcontext_t can only
be accessed by target specific code and on aarch64 greg_t
is not needed for that so this change is not supposed to break
existing code, just fix the definition.
[BZ #18648]
* sysdeps/unix/sysv/linux/aarch64/sys/ucontext.h (greg_t): Change the
definition to elf_greg_t.
(Added another BZ entry that was missed in the previous commit).
Kernel uses int pr_uid, pr_gid, but glibc used unsigned short.
This is an ABI breaking change, but the size and alignment of
the struct and the layout of other members is not changed and
there is no known usage of pr_uid and pr_gid so it is expected
to be safe.
[BZ #18400]
* sysdeps/unix/sysv/linux/aarch64/sys/procfs.h (struct elf_prpsinfo):
Fix pr_uid and pr_gid members.
This patch adds a new fmemopen version, for glibc 2.22, that aims to be
POSIX compliant. It fixes some long-standing glibc fmemopen issues, such
as:
* it changes the way fseek with SEEK_END works on fmemopen to seek
relative to buffer size instead of first '\0'. This is default mode and
'b' opening mode does not change internal behavior (bz#6544).
* fix appending opening mode to use as start position either first null
byte or len specified in function call (bz#13152 and #13151).
* remove binary option 'b' and internal different handling (bz#12836)
* fix seek/SEEK_END with negative values (bz#14292).
A compatibility symbol with the old behavior is provided for the older
symbol version (2.2.5).
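The SEEK_END change can be observed with a short program fragment.
This is a sketch assuming a libc that follows the POSIX semantics
described above; fmemopen_end_offset is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Open BUF for reading, seek to the end, and report the offset.
   Returns -1 on any failure.  */
long
fmemopen_end_offset (char *buf, size_t size)
{
  FILE *fp = fmemopen (buf, size, "r");
  long pos = -1;

  if (fp == NULL)
    return -1;
  if (fseek (fp, 0, SEEK_END) == 0)
    pos = ftell (fp);
  fclose (fp);
  return pos;
}
```

For a 16-byte buffer holding "abc", the new behavior should report the
buffer size (16), whereas seeking relative to the first '\0' would
report 3.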
* include/stdio.h (fmemopen): Remove hidden prototype.
(__fmemopen): Add new hidden prototype.
* libio/Makefile: Add oldfmemopen object.
* libio/Versions [GLIBC_2.22]: Add new fmemopen symbol.
* libio/fmemopen.c (__fmemopen): Rewrite function to be POSIX
compliant.
* libio/oldfmemopen.c: New file: old fmemopen implementation for
symbol compatibility.
* stdio-common/Makefile [tests]: Add new tst-fmemopen3.
* stdio-common/psiginfo.c [psiginfo]: Call __fmemopen instead of
fmemopen.
* stdio-common/tst-fmemopen3.c: New file: more fmemopen tests, focus
on append and read mode.
* sysdeps/unix/sysv/linux/aarch64/libc.abilist [GLIBC_2.22]: Add
fmemopen.
* sysdeps/unix/sysv/linux/alpha/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/arm/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/i386/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/ia64/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/microblaze/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/sh/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/hppa/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/nios2/libc.abilist [GLIBC_2.22]: Likewise.
On s390/s390x backtrace(buffer, size) returns the series of called functions until
"makecontext_ret" and additional entries (up to "size") with "makecontext_ret".
GDB's backtrace also warns:
"Backtrace stopped: previous frame identical to this frame (corrupt stack?)"
To reproduce this scenario you have to set up a new context with makecontext()
and activate it with setcontext(). See e.g. the cf() function in testcase stdlib/tst-makecontext.c.
Or see the libgo bug "Bug 66303 - runtime.Caller() returns infinitely deep stack frames
on s390x " (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66303).
This patch omits the cfi_startproc/cfi_endproc directives in ENTRY/END macro of
__makecontext_ret. Thus no frame information is generated in .eh_frame and backtrace
stops after __makecontext_ret. There is also no .eh_frame info for _start or
thread_start functions.
ChangeLog:
[BZ #18508]
* stdlib/Makefile ($(objpfx)tst-makecontext3):
Depend on $(libdl).
* stdlib/tst-makecontext.c (cf): Test if _Unwind_Backtrace
is not called an infinite number of times.
(backtrace_helper): New function.
(trace_arg): New struct.
(st1): Enlarge stack size.
* sysdeps/unix/sysv/linux/s390/s390-32/__makecontext_ret.S:
(__makecontext_ret): Omit cfi_startproc and cfi_endproc.
* sysdeps/unix/sysv/linux/s390/s390-64/__makecontext_ret.S:
Likewise.
Regenerated ulps after recent changes.
Tested on s390/s390x.
All math tests pass on s390 after this patch.
ChangeLog:
* sysdeps/s390/fpu/libm-test-ulps: Regenerated.
On s390 the following tests are failing due to unknown types time_t, pid_t:
FAIL: conform/UNIX98/sys/sem.h/conform
FAIL: conform/XOPEN2K/sys/sem.h/conform
FAIL: conform/XOPEN2K8/sys/sem.h/conform
FAIL: conform/XPG3/sys/sem.h/conform
FAIL: conform/XPG4/sys/sem.h/conform
This patch changes the s390 specific sem.h and includes sys/types.h instead
of bits/types.h. All other archs include sys/types.h, too.
Including bits/wordsize.h is obsolete, because it is already included in
sys/types.h -> bits/types.h.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/bits/sem.h:
Include sys/types.h instead of bits/types.h.
Remove inclusion of bits/wordsize.h.
la_symbind32 is used for x32 in x86-64 audit tests. We should define
both la_symbind32 and la_symbind64 in x86-64 audit tests.
* sysdeps/x86_64/tst-auditmod10b.c (la_symbind32): New.
* sysdeps/x86_64/tst-auditmod4b.c (la_symbind32): Likewise.
* sysdeps/x86_64/tst-auditmod5b.c (la_symbind32): Likewise.
* sysdeps/x86_64/tst-auditmod6b.c (la_symbind32): Likewise.
* sysdeps/x86_64/tst-auditmod6c.c (la_symbind32): Likewise.
* sysdeps/x86_64/tst-auditmod7b.c (la_symbind32): Likewise.
Define macros for fields in La_i86_regs and La_i86_retval and use them
in dl-trampoline.S, instead of hardcoded values.
* sysdeps/i386/Makefile (gen-as-const-headers)[elf]: Add
link-defines.sym.
* sysdeps/i386/dl-trampoline.S: Include <link-defines.h>.
(_dl_runtime_profile): Use LONG_DOUBLE_SIZE, LRV_SIZE,
LRV_EAX_OFFSET, LRV_EDX_OFFSET, LRV_ST0_OFFSET, LRV_ST1_OFFSET
and LR_SIZE.
* sysdeps/i386/link-defines.sym: New file.
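The idea of deriving offsets from the struct instead of hardcoding them
can be sketched with offsetof. The struct below is a hypothetical
mirror used only for illustration, and the enum reuses the macro names
from the entry above without claiming to match the generated header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a return-value register block.  */
struct retval_regs
{
  uint32_t eax;
  uint32_t edx;
  long double st0;
  long double st1;
};

/* Constants computed from the layout rather than written by hand.  */
enum
{
  LRV_SIZE = sizeof (struct retval_regs),
  LRV_EAX_OFFSET = offsetof (struct retval_regs, eax),
  LRV_EDX_OFFSET = offsetof (struct retval_regs, edx),
  LRV_ST0_OFFSET = offsetof (struct retval_regs, st0),
  LRV_ST1_OFFSET = offsetof (struct retval_regs, st1),
  LONG_DOUBLE_SIZE = sizeof (long double)
};
```

If a field is added or reordered, the constants follow automatically,
which is the point of generating them from a .sym file.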
This patch adds a testcase for i386 LD_AUDIT to check function return
and parameters passed in registers.
* sysdeps/i386/Makefile (tests)[elf]: Add tst-audit3.
(modules-names): Add tst-auditmod3a tst-auditmod3b.
($(objpfx)tst-audit3): New rule.
($(objpfx)tst-audit3.out): Likewise.
* sysdeps/i386/tst-audit3.c: New file.
* sysdeps/i386/tst-audit3.h: Likewise.
* sysdeps/i386/tst-auditmod3a.c: Likewise.
* sysdeps/i386/tst-auditmod3b.c: Likewise.
Some of the x86 string functions create pointers based on input strings
that may be outside of the input strings. When this happens in C code,
the compiler can potentially detect this, leading to warnings in
application code when those string functions are inlined. Perform those
operations in the assembly code instead of the C code to fix this.
since
https://sourceware.org/ml/libc-alpha/2014-04/msg00006.html
setcontext etc is no longer tied to the kernel use of ucontext.
in that patch the ucontext reserved space is not used consistently
with the kernel abi: the d8,d9 pair is saved in the slot of q8.
this is ok (*context functions work together), but probably not
desirable (ucontexts created by the kernel and getcontext are
subtly different).
the fix just replaces dN with qN in the save/restore code, which
does a bit more than needed (saves/restores the top half of qN that
is not callee saved), but this should not be an issue (and avoids
having to deal with endianness).
(kernel fpsimd context layout: the first 64bit contains 0x210 the fpsimd
context size and 0x46508001 the FPSIMD_MAGIC, the second 64bit is for
fpsr and fpcr, and the rest is the 128bit q0..q31 registers).
given d8=8.1, d9=9.1,... d15=15.1, the context created by getcontext is
current:
(gdb) x/40xg ctx.uc_mcontext.__reserved
0x410df0 <ctx+464>: 0x0000021046508001 0x0000000000000000
0x410e00 <ctx+480>: 0x0000000000000000 0x0000000000000000
0x410e10 <ctx+496>: 0x0000000000000000 0x0000000000000000
0x410e20 <ctx+512>: 0x0000000000000000 0x0000000000000000
0x410e30 <ctx+528>: 0x0000000000000000 0x0000000000000000
0x410e40 <ctx+544>: 0x0000000000000000 0x0000000000000000
0x410e50 <ctx+560>: 0x0000000000000000 0x0000000000000000
0x410e60 <ctx+576>: 0x0000000000000000 0x0000000000000000
0x410e70 <ctx+592>: 0x0000000000000000 0x0000000000000000
0x410e80 <ctx+608>: 0x4020333333333333 0x4022333333333333
0x410e90 <ctx+624>: 0x0000000000000000 0x0000000000000000
0x410ea0 <ctx+640>: 0x4024333333333333 0x4026333333333333
0x410eb0 <ctx+656>: 0x0000000000000000 0x0000000000000000
0x410ec0 <ctx+672>: 0x4028333333333333 0x402a333333333333
0x410ed0 <ctx+688>: 0x0000000000000000 0x0000000000000000
0x410ee0 <ctx+704>: 0x402c333333333333 0x402e333333333333
0x410ef0 <ctx+720>: 0x0000000000000000 0x0000000000000000
0x410f00 <ctx+736>: 0x0000000000000000 0x0000000000000000
0x410f10 <ctx+752>: 0x0000000000000000 0x0000000000000000
0x410f20 <ctx+768>: 0x0000000000000000 0x0000000000000000
fixed:
(gdb) x/40xg ctx.uc_mcontext.__reserved
0x410d70 <ctx+464>: 0x0000021046508001 0x0000000000000000
0x410d80 <ctx+480>: 0x0000000000000000 0x0000000000000000
0x410d90 <ctx+496>: 0x0000000000000000 0x0000000000000000
0x410da0 <ctx+512>: 0x0000000000000000 0x0000000000000000
0x410db0 <ctx+528>: 0x0000000000000000 0x0000000000000000
0x410dc0 <ctx+544>: 0x0000000000000000 0x0000000000000000
0x410dd0 <ctx+560>: 0x0000000000000000 0x0000000000000000
0x410de0 <ctx+576>: 0x0000000000000000 0x0000000000000000
0x410df0 <ctx+592>: 0x0000000000000000 0x0000000000000000
0x410e00 <ctx+608>: 0x4020333333333333 0x0000000000000000
0x410e10 <ctx+624>: 0x4022333333333333 0x0000000000000000
0x410e20 <ctx+640>: 0x4024333333333333 0x0000000000000000
0x410e30 <ctx+656>: 0x4026333333333333 0x0000000000000000
0x410e40 <ctx+672>: 0x4028333333333333 0x0000000000000000
0x410e50 <ctx+688>: 0x402a333333333333 0x0000000000000000
0x410e60 <ctx+704>: 0x402c333333333333 0x0000000000000000
0x410e70 <ctx+720>: 0x402e333333333333 0x0000000000000000
0x410e80 <ctx+736>: 0x0000000000000000 0x0000000000000000
0x410e90 <ctx+752>: 0x0000000000000000 0x0000000000000000
0x410ea0 <ctx+768>: 0x0000000000000000 0x0000000000000000
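the values in the dumps above can be decoded with a few lines of C.
this is only a reading aid with hypothetical helper names; it assumes a
little-endian view of the first 64bit word, as in the gdb output.

```c
#include <stdint.h>
#include <string.h>

/* Low 32 bits of the first word: the magic number.  */
uint32_t
fpsimd_magic (uint64_t first_word)
{
  return (uint32_t) (first_word & 0xffffffffu);
}

/* High 32 bits of the first word: the context size in bytes.  */
uint32_t
fpsimd_size (uint64_t first_word)
{
  return (uint32_t) (first_word >> 32);
}

/* Raw bit pattern of a double, for comparison with the dump.  */
uint64_t
double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}
```

the size 0x210 matches the described layout: 8 bytes of header, 8 bytes
for fpsr and fpcr, and 32 registers of 16 bytes each. the pattern
0x4020333333333333 is d8=8.1.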
2015-07-06 Szabolcs Nagy <szabolcs.nagy@arm.com>
* sysdeps/unix/sysv/linux/aarch64/getcontext.S (__getcontext): Use q
registers instead of d ones so the layout is kernel abi compatible.
* sysdeps/unix/sysv/linux/aarch64/setcontext.S (__setcontext): Likewise.
* sysdeps/unix/sysv/linux/aarch64/swapcontext.S (__swapcontext):
Likewise.
In the ldbl-128 implementation of expm1l, when expm1l's result should
underflow to 0 (argument minus the least subnormal, in some rounding
modes), it can be a zero of the wrong sign. This patch fixes this in
the same way previously used for the x86 / x86_64 versions.
Tested for mips64.
[BZ #18619]
* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Force underflow
and return argument in case of subnormal argument.
This patch combines BUSY_WAIT_NOP and atomic_delay into a new
atomic_spin_nop function and adjusts all clients. The new function is
put into atomic.h because what is best done in a spin loop is
architecture-specific, and atomics must be used for spinning. The
function name is meant to tell users that this has no effect on
synchronization semantics but is a performance aid for spinning.
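A spin loop using such a hint might look like the sketch below. The
names spin_nop and spin_until_nonzero are hypothetical, the loop uses
C11 atomics rather than glibc's internal atomic.h, and the x86 "pause"
instruction is just one example of an architecture-specific hint.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an architecture-specific spin hint.  */
#if defined (__x86_64__) || defined (__i386__)
# define spin_nop() __asm__ __volatile__ ("pause")
#else
# define spin_nop() ((void) 0)
#endif

/* Spin until *FLAG becomes nonzero.  The hint changes nothing about
   synchronization; the acquire load does that work.  */
void
spin_until_nonzero (atomic_int *flag)
{
  while (atomic_load_explicit (flag, memory_order_acquire) == 0)
    spin_nop ();
}
```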
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.
Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations. However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).
Some additional complications also arose. Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow. (libm-test.inc also doesn't handle being lax about whether
the result in cases very close to the overflow threshold is infinity
or a finite value close to overflow, but that doesn't cause problems
in this case though I've seen it cause problems with random test
generation for some functions.) Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc). And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.
I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.
This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18613]
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
X_ADJ not X when adjusting exponent.
(__ieee754_gamma_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
of X_ADJ not X when adjusting exponent.
(__ieee754_gammaf_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
log of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
log of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* math/libm-test.inc (tgamma_test_data): Remove one test. Moved
to auto-libm-test-in.
(tgamma_test): Use ALL_RM_TEST.
* math/auto-libm-test-in: Add one test of tgamma. Mark some other
tests of tgamma with spurious-overflow.
* math/auto-libm-test-out: Regenerated.
* math/gen-libm-have-vector-test.sh: Do not check for START.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The ldbl-128 implementation of j1l produces spurious underflow
exceptions for some small arguments, as a result of squaring the
argument. This patch fixes it just to use a linear approximation for
sufficiently small arguments, and then to force an underflow exception
only in the cases where it is required.
Tested for mips64.
[BZ #18612]
* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): For small
arguments, just return 0.5 times the argument, with underflow
forced as needed.
* math/auto-libm-test-in: Add more tests of j1.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, j1 and jn implementations
can fail to raise the underflow exception when the internal
computation is exact although the actual function is inexact. This
patch forces the exception in a similar way to other such fixes. (The
ldbl-128 / ldbl-128ibm j1l implementation is different and doesn't
need a change for this until spurious underflows in it are fixed.)
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16559]
* sysdeps/ieee754/dbl-64/e_j1.c: Include <float.h>.
(__ieee754_j1): Force underflow exception for small results.
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
* sysdeps/ieee754/flt-32/e_j1f.c: Include <float.h>.
(__ieee754_j1f): Force underflow exception for small results.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_j1l.c: Include <float.h>.
(__ieee754_j1l): Force underflow exception for small results.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
* math/auto-libm-test-in: Add more tests of j1 and jn.
* math/auto-libm-test-out: Regenerated.
This patch updates installed glibc headers for new definitions from
Linux 4.0 and 4.1 that seem relevant to glibc headers. In addition, I
noticed that PF_IB / AF_IB, added in Linux 3.11, were missing for no
obvious reason, so added those as well.
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).
* sysdeps/unix/sysv/linux/bits/in.h (IP_CHECKSUM): New macro.
* sysdeps/unix/sysv/linux/bits/socket.h (PF_IB): Likewise.
(PF_MPLS): Likewise.
(AF_IB): Likewise.
(AF_MPLS): Likewise.
* sysdeps/unix/sysv/linux/sys/mount.h (MS_LAZYTIME): New enum
value and macro.
(MS_RMT_MASK): Include MS_LAZYTIME.
This tag allows debugging of MIPS position independent executables
and provides access to shared library information.
* elf/elf.h (DT_MIPS_RLD_MAP_REL): New macro.
(DT_MIPS_NUM): Update.
* sysdeps/mips/dl-machine.h (ELF_MACHINE_DEBUG_SETUP): Handle
DT_MIPS_RLD_MAP_REL.
Some existing jn tests, if run in non-default rounding modes, produce
errors above those accepted in glibc, which causes problems for moving
tests of jn to use ALL_RM_TEST. This patch makes jn set rounding
to-nearest internally, as was done for yn some time ago, then computes
the appropriate underflowing value for results that underflowed to
zero in to-nearest, and moves the tests to ALL_RM_TEST. It does
nothing about the general inaccuracy of Bessel function
implementations in glibc, though it should make jn more accurate on
average in non-default rounding modes through reduced error
accumulation. The recomputation of results that underflowed to zero
should as a side-effect fix some cases of bug 16559, where jn just
used an exact zero, but that is *not* the goal of this patch and other
cases of that bug remain unfixed.
(Most of the changes in the patch are reindentation to add new scopes
for SET_RESTORE_ROUND*.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16559]
[BZ #18602]
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set
round-to-nearest internally then recompute results that
underflowed to zero in the original rounding mode.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
* math/libm-test.inc (jn_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
cexp, ccos, ccosh, csin and csinh have spurious underflows in cases
where they compute sin of the smallest normal, that produces an
underflow exception (depending on which sin implementation is in use)
but the final result does not underflow. ctan and ctanh may also have
such underflows, or they may be latent (the issue there is that
e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value
above DBL_MIN, which under glibc's accuracy goals may not have an
underflow exception, but the intermediate computation of sin (DBL_MIN)
would legitimately underflow on before-rounding architectures).
This patch fixes all those functions so they use plain comparisons (>
DBL_MIN etc.) instead of comparing the result of fpclassify with
FP_SUBNORMAL (in all these cases, we already know the number being
compared is finite). Note that in the case of csin / csinf / csinl,
there is no need for fabs calls in the comparison because the real
part has already been reduced to its absolute value.
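A minimal sketch of the comparison style described above, assuming the
input is already known to be finite.  The helper name is hypothetical
and this is not the code from the patch.  Note that the strict
comparison treats DBL_MIN itself the same way as a subnormal value,
which a check of fpclassify against FP_SUBNORMAL would not do.

```c
#include <float.h>

/* Hypothetical helper, not glibc code: for a finite X, report whether
   |X| is strictly greater than the least normal double, using only
   ordinary comparisons.  */
static int
exceeds_least_normal (double x)
{
  return x > DBL_MIN || x < -DBL_MIN;
}
```

A caller would only evaluate sin (x) when this returns nonzero, and
otherwise use x itself as the approximation.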
As the patch fixes the failures that previously obstructed moving
tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST
by the patch (two functions remain yet to be converted).
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18594]
* math/s_ccosh.c (__ccosh): Compare with least normal value
instead of comparing class with FP_SUBNORMAL.
* math/s_ccoshf.c (__ccoshf): Likewise.
* math/s_ccoshl.c (__ccoshl): Likewise.
* math/s_cexp.c (__cexp): Likewise.
* math/s_cexpf.c (__cexpf): Likewise.
* math/s_cexpl.c (__cexpl): Likewise.
* math/s_csin.c (__csin): Likewise.
* math/s_csinf.c (__csinf): Likewise.
* math/s_csinh.c (__csinh): Likewise.
* math/s_csinhf.c (__csinhf): Likewise.
* math/s_csinhl.c (__csinhl): Likewise.
* math/s_csinl.c (__csinl): Likewise.
* math/s_ctan.c (__ctan): Likewise.
* math/s_ctanf.c (__ctanf): Likewise.
* math/s_ctanh.c (__ctanh): Likewise.
* math/s_ctanhf.c (__ctanhf): Likewise.
* math/s_ctanhl.c (__ctanhl): Likewise.
* math/s_ctanl.c (__ctanl): Likewise.
* math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp,
csin, csinh, ctan and ctanh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (cexp_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
csin and csinh can produce bad results when overflowing in directed
rounding modes, because a multiplication that can overflow is followed
by a possible negation. This patch fixes this by negating one of the
arguments of the multiplication before the multiplication instead of
negating the result.
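The ordering matters because, when rounding upward, a positive product
that overflows becomes +Inf and negating it gives -Inf, whereas a
negative product that overflows should round to -DBL_MAX.  A minimal
sketch of the "negate first" ordering follows; the helper name and its
arguments are hypothetical and this is not the code from the patch.

```c
/* Hypothetical helper, not glibc code: return A * B, or -(A * B) when
   NEGATE is nonzero.  The sign is applied to one factor before the
   multiplication, so any overflow is rounded with the final sign
   already in place.  */
static double
signed_product (double a, double b, int negate)
{
  if (negate)
    a = -a;
  return a * b;
}
```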
The new tests for this issue are added to auto-libm-test-in, starting
use of that file for csin and csinh. The issue was found in the
course of moving existing tests for csin and csinh (existing tests, by
being enabled in more cases than previously, showed the issue for
float and double but not for long double); that move will now be done
separately.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18593]
* math/s_csin.c (__csin): Negate before rather than after possibly
overflowing multiplication.
* math/s_csinf.c (__csinf): Likewise.
* math/s_csinh.c (__csinh): Likewise.
* math/s_csinhf.c (__csinhf): Likewise.
* math/s_csinhl.c (__csinhl): Likewise.
* math/s_csinl.c (__csinl): Likewise.
* math/auto-libm-test-in: Add some tests of csin and csinh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c.
(csinh_test_data): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
Similar to various other bugs in this area, the ldbl-128 expl
implementation does not raise the underflow exception for all
subnormal results, if the scaling down is exact although the actual
result is inexact.  This patch fixes this by forcing the exception in
this case (the tests that failed before and pass after the patch are
already in the testsuite).
Tested for mips64.
[BZ #18586]
* sysdeps/ieee754/ldbl-128/e_expl.c (__ieee754_expl): Force
underflow exception for small results.
Similar to various other bugs in this area, some sin and sincos
implementations do not raise the underflow exception for subnormal
arguments, when the result is tiny and inexact. This patch forces the
exception in a similar way to previous fixes.
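One way to force the exception can be sketched as follows.  This is an
illustration only: the function name is hypothetical, the use of a
volatile temporary is an assumption (glibc has its own internal
helpers for this), and the sketch assumes the caller only passes
arguments small enough that sin (x) rounds to x.

```c
#include <float.h>

/* Hypothetical sketch, not glibc code.  For an argument X so small
   that sin (X) rounds to X, return X, and for nonzero subnormal X
   also evaluate X * X so that the tiny, inexact product raises the
   underflow exception.  */
static double
sin_tiny_sketch (double x)
{
  if (x < DBL_MIN && x > -DBL_MIN)
    {
      /* The volatile store keeps the compiler from discarding the
         multiplication whose only purpose is its exception.  */
      volatile double force_underflow = x * x;
      (void) force_underflow;
    }
  return x;
}
```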
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16526]
[BZ #16538]
* sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>.
(__sin): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sin and sincos.
* math/auto-libm-test-out: Regenerated.
__kernel_standard_l converts long double arguments to double for use
in SVID "struct exception". This has special-case handling for when
that conversion would overflow or underflow but the original long
double function wouldn't. However, it turns out that "inexact"
exceptions can be spurious here as well, when the function is exactly
determined and __kernel_standard_l is being called for a domain error.
This patch fixes this by using feholdexcept / fesetenv to avoid
exceptions from the conversion, replacing the previous special-case
logic for overflow and underflow (this covers all functions using
__kernel_standard_l, not just those that actually need a change, since
there doesn't seem to be much point in restricting things just to the
functions that mustn't get "inexact" here).
Tested for x86_64 and x86.
[BZ #18245]
[BZ #18583]
* sysdeps/ieee754/k_standardl.c: Include <fenv.h>.
(__kernel_standard_l): Use feholdexcept and fesetenv around
conversion to double instead of special-casing overflow and
underflow.
* math/libm-test.inc (fmod_test_data): Add more tests.
(remainder_test_data): Likewise.
(sqrt_test_data): Likewise.
This fixes BZ #17403 by defining atomic_full_barrier,
atomic_read_barrier, and atomic_write_barrier on x86 and x86_64.  A full
barrier is implemented as an atomic idempotent modification of the
stack rather than with mfence, because the latter can supposedly be
somewhat slower, for example due to having to provide stronger
guarantees with respect to self-modifying code.
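A sketch of what an "atomic idempotent modification to the stack" can
look like with GCC-style inline assembly.  The function name is
hypothetical, the exact instruction chosen here (a locked OR of zero
into the word at the stack pointer) is an assumption about the
technique rather than a copy of the patch, and the fallback branch for
other architectures exists only so the sketch compiles elsewhere.

```c
/* Hypothetical sketch, not the glibc macro.  A locked read-modify-write
   that ORs zero into the top of the stack leaves memory unchanged but
   acts as a full memory barrier on x86.  */
static inline void
example_full_barrier (void)
{
#if defined __x86_64__
  __asm__ __volatile__ ("lock; orl $0, (%%rsp)" ::: "memory", "cc");
#elif defined __i386__
  __asm__ __volatile__ ("lock; orl $0, (%%esp)" ::: "memory", "cc");
#else
  /* Not x86: fall back to the compiler's generic full barrier.  */
  __sync_synchronize ();
#endif
}
```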
The csqrt implementations in glibc can cause spurious underflows in
some cases as a side-effect of the scaling for large arguments (when
underflow is correct for the square root of the argument that was
scaled down to avoid overflow, but not for the original argument).
This patch arranges to avoid the underflowing intermediate computation
(eliminating a multiplication by 0.5 in the problem cases where a
subsequent scaling by 2 would follow).
Tested for x86_64 and x86 and ulps updated accordingly (only needed
for x86).
[BZ #18371]
* math/s_csqrt.c (__csqrt): Avoid multiplication by 0.5 where
intermediate but not final result might underflow.
* math/s_csqrtf.c (__csqrtf): Likewise.
* math/s_csqrtl.c (__csqrtl): Likewise.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
The dbl-64 and flt-32 implementations of exp2 functions produce
spurious underflow exceptions. The underlying reason is the same in
both cases: the computation works as (2^a - 1)*2^b + 2^b for suitably
chosen a and b, where a has small magnitude so 2^a - 1 can be computed
with a low-degree polynomial approximation, and (2^a - 1)*2^b can
underflow even when the final result does not. This patch fixes this
by adjusting the threshold for when scaling is used to avoid
intermediate underflow so it works for any possible value of a where
the final result would not underflow.
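The intermediate underflow can be reproduced with a small sketch of
the unscaled combination step.  The function and parameter names are
hypothetical: p stands for the polynomial value 2^a - 1 and scale for
2^b.  This is not the code from the patch.

```c
#include <float.h>

/* Hypothetical sketch, not glibc code.  Compute P * SCALE + SCALE
   without any extra scaling, and store the intermediate product in
   *PRODUCT so that it can be inspected.  */
static double
combine_unscaled (double p, double scale, double *product)
{
  *product = p * scale;
  return *product + scale;
}
```

With p = 1.0 / 1024.0 and scale = DBL_MIN, the stored product is
subnormal while the returned sum is a normal number just above
DBL_MIN, which is the situation the adjusted threshold avoids.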
Tested for x86_64 and x86.
[BZ #18219]
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Reduce
threshold on absolute value of exponent for which scaling is used.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some expm1 implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
(The issue does not apply to the ldbl-* implementations or to those
for x86 / x86_64 long double. The change to
sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c is one I missed when
previously fixing bug 16354; the bug in that implementation was
previously latent, but the expm1 fixes stopped it being latent and so
required it to be fixed to avoid spurious underflows from cosh.)
Tested for x86_64 and x86.
[BZ #16353]
* sysdeps/i386/fpu/s_expm1.S (dbl_min): New object.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/i386/fpu/s_expm1f.S (flt_min): New object.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/s_expm1.c: Include <float.h>.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_expm1f.c: Include <float.h>.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c (__ieee754_cosh):
Check for small arguments before calling __expm1.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16353.
* math/auto-libm-test-out: Regenerated.
In the x86 / x86_64 implementations of expm1l, when expm1l's result
should underflow to 0 (argument minus the least subnormal, in some
rounding modes), it can be a zero of the wrong sign. This patch fixes
this by returning the argument with underflow forced in that case
(this is a 1ulp error relative to the correctly rounded result of -0,
which is OK in terms of the documented accuracy goals, whereas a
result with the wrong sign never is).
Tested for x86_64 and x86.
[BZ #18569]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force
underflow and return argument in case of subnormal argument.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of expm1.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, the x86 and x86_64
implementations of expl / exp10l can fail to produce underflow
exceptions when the unscaled result has trailing 0 bits so the scaling
down to subnormal precision is exact. This patch fixes this by
forcing the exception in the case of tiny results.
Tested for x86_64 and x86.
[BZ #16361]
* sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
tiny results.
* sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
tiny results.
* math/auto-libm-test-in: Add more tests of exp and exp10. Do not
mark underflow exceptions as possibly missing for bug 16361.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86 and mips64.
[BZ #16350]
* sysdeps/i386/fpu/s_asinh.S (__asinh): Force underflow exception
for arguments with small absolute value.
* sysdeps/i386/fpu/s_asinhf.S (__asinhf): Likewise.
* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Likewise.
* sysdeps/ieee754/dbl-64/s_asinh.c: Include <float.h>.
(__asinh): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_asinhf.c: Include <float.h>.
(__asinhf): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16350.
* math/auto-libm-test-out: Regenerated.
sysdeps/unix/sysv/linux/bits/in.h (as included in netinet/in.h, and
via that in netdb.h and arpa/inet.h) defines a series of MCAST_*
macros, both under __USE_MISC and then again unconditionally. These
are not POSIX macros, nor in any of the namespaces listed in POSIX as
reserved for this header, so should not be defined unconditionally.
This patch duly removes the unconditional definitions, leaving the
ones conditional on __USE_MISC.
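The resulting shape of the header can be sketched as below.  All
EXAMPLE_* names are hypothetical stand-ins (for __USE_MISC and for one
of the MCAST_* macros) so that the sketch can be compiled on its own;
the value 42 is illustrative.

```c
/* Hypothetical stand-in for the feature-test state; remove this line
   to model a strictly conforming compilation.  */
#define EXAMPLE_USE_MISC 1

/* The macro is defined only inside the conditional block, with no
   second unconditional definition after it.  */
#ifdef EXAMPLE_USE_MISC
# define EXAMPLE_MCAST_JOIN_GROUP 42
#endif

/* Report whether the example macro is visible to user code.  */
static int
example_mcast_visible (void)
{
#ifdef EXAMPLE_MCAST_JOIN_GROUP
  return 1;
#else
  return 0;
#endif
}
```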
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18558]
* sysdeps/unix/sysv/linux/bits/in.h (MCAST_JOIN_GROUP): Remove
unconditional definition.
(MCAST_BLOCK_SOURCE): Likewise.
(MCAST_UNBLOCK_SOURCE): Likewise.
(MCAST_LEAVE_GROUP): Likewise.
(MCAST_JOIN_SOURCE_GROUP): Likewise.
(MCAST_LEAVE_SOURCE_GROUP): Likewise.
(MCAST_MSFILTER): Likewise.
* conform/Makefile (test-xfail-XOPEN2K/arpa/inet.h/conform):
Remove variable.
(test-xfail-XOPEN2K/netdb.h/conform): Likewise.
(test-xfail-XOPEN2K/netinet/in.h/conform): Likewise.
(test-xfail-XOPEN2K8/arpa/inet.h/conform): Likewise.
(test-xfail-XOPEN2K8/netdb.h/conform): Likewise.
(test-xfail-XOPEN2K8/netinet/in.h/conform): Likewise.