nice (XPG3) calls getpriority and setpriority (in XPG4 but not XPG3,
i.e. UX-shaded in XPG4). This patch fixes this by making those
functions into weak aliases of __* functions and calling the __*
versions as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by this patch).
This completes cleaning up the unsorted linknamespace test XFAILs.
[BZ #18553]
* resource/getpriority.c (getpriority): Rename to __getpriority
and define as weak alias of __getpriority.
* resource/setpriority.c (setpriority): Rename to __setpriority
and define as weak alias of __setpriority.
* sysdeps/mach/hurd/getpriority.c (getpriority): Rename to
__getpriority and define as weak alias of __getpriority.
* sysdeps/mach/hurd/setpriority.c (setpriority): Rename to
__setpriority and define as weak alias of __setpriority.
* sysdeps/unix/syscalls.list (getpriority): Use __getpriority as
strong name.
(setpriority): Use __setpriority as strong name.
* sysdeps/unix/sysv/linux/getpriority.c (getpriority): Rename to
__getpriority and define as weak alias of __getpriority.
* include/sys/resource.h (__getpriority): Declare. Use
libc_hidden_proto.
(__setpriority): Likewise.
(getpriority): Don't use libc_hidden_proto.
(setpriority): Likewise.
* sysdeps/posix/nice.c (nice): Call __getpriority instead of
getpriority. Call __setpriority instead of setpriority.
* conform/Makefile (test-xfail-XPG3/unistd.h/linknamespace):
Remove variable.
mq_notify (in the 1996 edition of POSIX) brings in references to recv
and socket (not in POSIX until the 2001 edition). This patch fixes
this by using __recv and __socket, exporting them from libc at version
GLIBC_PRIVATE.
Tested for x86_64 and x86 (testsuite and comparison of installed
stripped shared libraries; PLT / dynamic symbol table changes render
the comparison not particularly useful for libc).
[BZ #18546]
* socket/recv.c (__recv): Use libc_hidden_def.
* socket/socket.c (__socket): Likewise.
* sysdeps/mach/hurd/recv.c (__recv): Likewise.
* sysdeps/mach/hurd/socket.c (__socket): Likewise.
* sysdeps/unix/sysv/linux/generic/recv.c (__recv): Likewise.
* sysdeps/unix/sysv/linux/recv.c (__recv): Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/socket.c (__socket): Use
libc_hidden_def.
* sysdeps/unix/sysv/linux/x86_64/recv.c (__recv): Use
libc_hidden_weak.
* include/sys/socket.h (__socket): Do not use attribute_hidden.
Use libc_hidden_proto.
(__recv): Likewise.
* socket/Versions (libc): Export __recv and __socket at version
GLIBC_PRIVATE.
* sysdeps/unix/sysv/linux/mq_notify.c (helper_thread): Call __recv
instead of recv.
(init_mq_netlink): Call __socket instead of socket.
* conform/Makefile (test-xfail-POSIX/mqueue.h/linknamespace):
Remove variable.
mq_receive calls mq_timedreceive, and mq_send calls mq_timedsend. But
mq_receive and mq_send were in POSIX by 1996, while mq_timed* were
added in the 2001 edition of POSIX. This patch fixes this by making
mq_timed* into weak aliases for __mq_timed* and calling the
__mq_timed* names.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18545]
* rt/mq_timedreceive.c (mq_timedreceive): Rename to
__mq_timedreceive and define as alias of __mq_timedreceive. Use
hidden_weak.
* rt/mq_timedsend.c (mq_timedsend): Rename to __mq_timedsend and
define as alias of __mq_timedsend. Use hidden_weak.
* sysdeps/unix/sysv/linux/syscalls.list (mq_timedsend): Use
__mq_timedsend as strong name.
(mq_timedreceive): Use __mq_timedreceive as strong name.
* include/mqueue.h (__mq_timedsend): Declare. Use hidden_proto.
(__mq_timedreceive): Likewise.
* sysdeps/unix/sysv/linux/mq_receive.c (mq_receive): Call
__mq_timedreceive instead of mq_timedreceive.
* sysdeps/unix/sysv/linux/mq_send.c (mq_send): Call __mq_timedsend
instead of mq_timedsend.
* conform/Makefile (test-xfail-UNIX98/mqueue.h/linknamespace):
Remove variable.
The syscall wrappers mechanism automatically creates hidden aliases
for syscalls with libc_hidden_def / libc_hidden_weak. The use of
libc_hidden_* has the side-effect that for syscall wrappers in
non-libc libraries those aliases are not created. In turn, this means
that three mq_* syscalls in sysdeps/unix/sysv/linux/syscalls.list list
the __GI_* names explicitly.
The use of libc_hidden_* dates back to the original introduction of
that support in
2002-08-03 Roland McGrath <roland@redhat.com>
* sysdeps/unix/make-syscalls.sh: Generate libc_hidden_def or
libc_hidden_weak for every system call symbol defined.
(predating the non-libc syscalls in question) and I see no reason for
excluding non-libc syscalls. This patch changes the code to use
hidden_def / hidden_weak (via a wrapper syscall_hidden_def in the case
where the argument is itself a macro, so that the argument gets
expanded before concatenation with __GI_), so avoiding the need to
specify the hidden aliases explicitly in this case.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch; the
mq_* symbols change from weak to strong, which is of no significance
and two of them will shortly change back to weak as part of a fix for
bug 18545).
* sysdeps/unix/make-syscalls.sh (emit_weak_aliases): Use
hidden_def and hidden_weak instead of libc_hidden_def and
libc_hidden_weak.
(top level): Refer to hidden_def in comment.
* sysdeps/unix/syscall-template.S (syscall_hidden_def): New
macro. Use it instead of libc_hidden_def.
* sysdeps/unix/sysv/linux/syscalls.list (mq_timedsend): Do not
specify __GI_* name explicitly.
(mq_timedreceive): Likewise.
(mq_setattr): Likewise.
mq_notify (present in POSIX by 1996) brings in references to
pthread_barrier_init and pthread_barrier_wait (new in the 2001 edition
of POSIX). This patch fixes this by making those functions into weak
aliases of __pthread_barrier_*, exporting the __pthread_barrier_*
names at version GLIBC_PRIVATE and using them in mq_notify.
Tested for x86_64 and x86 (testsuite, and comparison of installed
stripped shared libraries). Changes in addresses from dynamic symbol
table / PLT changes render most comparisons not particularly useful,
but when the addresses of subsequent code don't change there's no sign
of unexpected changes there. This patch does not remove any
linknamespace XFAILs because of other namespace issues remaining with
mqueue.h functions.
[BZ #18544]
* nptl/pthread_barrier_init.c (pthread_barrier_init): Rename to
__pthread_barrier_init and define as weak alias of
__pthread_barrier_init.
* sysdeps/sparc/nptl/pthread_barrier_init.c
(pthread_barrier_init): Likewise.
* nptl/pthread_barrier_wait.c (pthread_barrier_wait): Rename to
__pthread_barrier_wait and define as weak alias of
__pthread_barrier_wait.
* sysdeps/sparc/nptl/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/sparc/sparc32/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* nptl/Versions (libpthread): Export __pthread_barrier_init and
__pthread_barrier_wait at version GLIBC_PRIVATE.
* include/pthread.h (__pthread_barrier_init): Declare.
(__pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/mq_notify.c (notification_function):
Call __pthread_barrier_wait instead of pthread_barrier_wait.
(helper_thread): Likewise.
(init_mq_netlink): Call __pthread_barrier_init instead of
pthread_barrier_init.
The sem_* functions bring in references to tdelete, tfind, tsearch and
twalk. But the t* functions are XSI-shaded, while sem_* aren't. This
patch fixes this by using __t* instead, exporting those functions from
libc at version GLIBC_PRIVATE (since sem_* are in libpthread) and
using libc_hidden_* for the benefit of calls within libc.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). libpthread gets changes from
PLT reordering; addresses in libc change because of PLT / dynamic
symbol table changes.
[BZ #18536]
* misc/tsearch.c (__tsearch): Use libc_hidden_def.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* misc/Versions (libc): Add __tdelete, __tfind, __tsearch and
__twalk to GLIBC_PRIVATE.
* include/search.h (__tsearch): Use libc_hidden_proto.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* nptl/sem_close.c (sem_close): Call __twalk instead of twalk.
Call __tdelete instead of tdelete.
* nptl/sem_open.c (check_add_mapping): Call __tfind instead of
tfind. Call __tsearch instead of tsearch.
* sysdeps/sparc/sparc32/sem_open.c (check_add_mapping): Likewise.
* conform/Makefile (test-xfail-POSIX/semaphore.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/semaphore.h/linknamespace): Likewise.
Some of the cfi annotations used incorrect sign.
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Fix
cfi_adjust_cfa_offset argument.
(_dl_tlsdesc_undefweak, _dl_tlsdesc_dynamic): Likewise.
(_dl_tlsdesc_resolve_rela, _dl_tlsdesc_resolve_hold): Likewise.
Lazy TLSDESC initialization needs to be synchronized with concurrent TLS
accesses. The TLS descriptor contains a function pointer (entry) and an
argument that is accessed from the entry function. With lazy initialization
the first call to the entry function updates the entry and the argument to
their final value. A final entry function must make sure that it accesses an
initialized argument, this needs synchronization on systems with weak memory
ordering otherwise the writes of the first call can be observed out of order.
There are at least two issues with the current code:
tlsdesc.c (i386, x86_64, arm, aarch64) uses volatile memory accesses on the
write side (in the initial entry function) instead of C11 atomics.
And on systems with weak memory ordering (arm, aarch64) the read side
synchronization is missing from the final entry functions (dl-tlsdesc.S).
This patch only deals with aarch64.
* Write side:
Volatile accesses were replaced with C11 relaxed atomics, and a release
store was used for the initialization of entry so the read side can
synchronize with it.
* Read side:
TLS access generated by the compiler and an entry function code is roughly
ldr x1, [x0] // load the entry
blr x1 // call it
entryfunc:
ldr x0, [x0,#8] // load the arg
ret
Various alternatives were considered to force the ordering in the entry
function between the two loads:
(1) barrier
entryfunc:
dmb ishld
ldr x0, [x0,#8]
(2) address dependency (if the address of the second load depends on the
result of the first one the ordering is guaranteed):
entryfunc:
ldr x1,[x0]
and x1,x1,#8
orr x1,x1,#8
ldr x0,[x0,x1]
(3) load-acquire (ARMv8 instruction that is ordered before subsequent
loads and stores)
entryfunc:
ldar xzr,[x0]
ldr x0,[x0,#8]
Option (1) is the simplest but slowest (note: this runs at every TLS
access), options (2) and (3) do one extra load from [x0] (same address
loads are ordered so it happens-after the load on the call site),
option (2) clobbers x1 which is problematic because existing gcc does
not expect that, so approach (3) was chosen.
A new _dl_tlsdesc_return_lazy entry function was introduced for lazily
relocated static TLS, so non-lazy static TLS can avoid the synchronization
cost.
[BZ #18034]
* sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_return_lazy): Declare.
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Define.
(_dl_tlsdesc_undefweak): Guarantee TLSDESC entry and argument load-load
ordering using ldar.
(_dl_tlsdesc_dynamic): Likewise.
(_dl_tlsdesc_return_lazy): Likewise.
* sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_resolve_rela_fixup): Use
relaxed atomics instead of volatile and synchronize with release store.
(_dl_tlsdesc_resolve_hold_fixup): Use relaxed atomics instead of
volatile.
* elf/tlsdeschtab.h (_dl_tlsdesc_resolve_early_return_p): Likewise.
Various functions in XPG4 bring in references to getlogin_r, which is
not in XPG4; this is also a bug for some older POSIX versions which
aren't yet covered by the linknamespace tests. This patch fixes this
by making getlogin_r into a weak alias for __getlogin_r and using
__getlogin_r as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18527]
* login/getlogin_r.c (getlogin_r): Rename to __getlogin_r and
define as weak alias of __getlogin_r. Use libc_hidden_weak.
* sysdeps/mach/hurd/getlogin_r.c (getlogin_r): Likewise.
* sysdeps/unix/getlogin_r.c (getlogin_r): Likewise.
* sysdeps/unix/sysv/linux/getlogin_r.c (getlogin_r): Likewise.
* include/unistd.h (__getlogin_r): Declare. Use
libc_hidden_proto.
* posix/glob.c (glob): Call __getlogin_r instead of getlogin_r.
* conform/Makefile (test-xfail-XPG3/glob.h/linknamespace): Remove
variable.
(test-xfail-XPG3/wordexp.h/linknamespace): Likewise.
(test-xfail-XPG4/glob.h/linknamespace): Likewise.
(test-xfail-XPG4/wordexp.h/linknamespace): Likewise.
aio_* bring in references to pread, which isn't in all the standards
containing aio_* (as a reference from one library to another, this is
a bug for dynamic as well as static linking). This patch fixes this
by using __libc_pread instead, exporting that function from libc at
symbol version GLIBC_PRIVATE; the code, with conditionals that may
call either __pread64 or __libc_pread, becomes exactly analogous to
that elsewhere in the same file that may call either __pwrite64 or
__libc_pwrite.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed shared libraries). libc changes because of the PLT entry
for the newly exported __libc_pread; librt changes because of
assertion line numbers and PLT rearrangement; other stripped installed
shared libraries do not change.
[BZ #18519]
* posix/Versions (libc): Export __libc_pread at version
GLIBC_PRIVATE.
* sysdeps/pthread/aio_misc.c (handle_fildes_io): Call __libc_pread
instead of pread.
* conform/Makefile (test-xfail-POSIX/aio.h/linknamespace): Remove
variable.
Here is implementation of vectorized sin containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
* bits/libm-simd-decl-stubs.h: Added stubs for sin.
* math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
* sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
* sysdeps/x86_64/fpu/Versions: New versions added.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin_data.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin_data.h: New file.
* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test.
* sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
* NEWS: Mention addition of x86_64 vector sin.
Binutils 2.24 doesn't support some AVX512 instructions with ZMM
registers, so we need add more strict check.
* configure.ac: Added more strict check.
* configure: Regenerated.
This patch removes the vsyscall usage for x86_64 port. As indicated
by kernel code comments [1], vsyscalls are a legacy ABI and its concept
is problematic:
- It interferes with ASLR.
- It's awkward to write code that lives in kernel addresses but is
callable by userspace at fixed addresses.
- The whole concept is impossible for 32-bit compat userspace.
- UML cannot easily virtualize a vsyscall.
The VDSO is a better approach for such functionality. Tested on i686,
x86_64, and x32.
* sysdeps/unix/sysv/linux/i386/gettimeofday.c
(__gettimeofday_syscall): Remove vsyscall fallback.
* sysdeps/unix/sysv/linux/i386/time.c (__time_syscall): Likewise.
* sysdeps/unix/sysv/linux/x86/gettimeofday.c (__gettimeofday_syscall):
Add syscall fallback function.
(gettimeofday_ifunc): Use __gettimeofday_syscall as fallback mechanism
if vDSO is not present.
* sysdeps/unix/sysv/linux/x86/time.c (__time_syscall): Add syscall
fallback function.
(time_ifunc): Use __time_syscall as fallback mechanism if vDSO is not
present.
* sysdeps/unix/sysv/linux/x86_64/gettimeofday.c: Remove file.
* sysdeps/unix/sysv/linux/x86_64/time.c: Likewise.
[1] arch/x86/kernel/vsyscall_64.c
regcomp brings in references to wcscoll, which isn't in all the
standards that contain regcomp. In turn, wcscoll brings in references
to wcscmp, also not in all those standards. This patch fixes this by
making those functions into weak aliases of __wcscoll and __wcscmp and
calling those names instead as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18497]
* wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead
of wcscmp.
(wcscmp): Define as weak alias of WCSCMP.
* wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of
wcscoll.
(USE_HIDDEN_DEF): Define.
[!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of
__wcscoll. Don't use libc_hidden_weak.
* wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of
wcscmp.
* sysdeps/i386/i686/multiarch/wcscmp-c.c
[SHARED] (libc_hidden_def): Define __GI___wcscmp instead of
__GI_wcscmp.
(weak_alias): Undefine and redefine.
* sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to
__wcscmp and define as weak alias of __wcscmp.
* sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
* include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto.
(__wcscoll): Likewise.
(wcscmp): Don't use libc_hidden_proto.
(wcscoll): Likewise.
* posix/regcomp.c (build_range_exp): Call __wcscoll instead of
wcscoll.
* posix/regexec.c (check_node_accept_bytes): Likewise.
* conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove
variable.
(test-xfail-XPG4/regex.h/linknamespace): Likewise.
(test-xfail-POSIX/regex.h/linknamespace): Likewise.
pathconf uses __statvfs64, and fpathconf uses __fstatvfs64. On
systems using sysdeps/unix/sysv/linux/wordsize-64, __statvfs64 then
brings in the strong symbol statvfs, and __fstatvfs64 brings in the
strong symbol fstatvfs, which are not in all the standards that have
pathconf and fpathconf. This patch fixes this by making those symbols
into weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18507]
* sysdeps/unix/sysv/linux/fstatvfs.c (fstatvfs): Rename to
__fstatvfs and define as weak alias of __fstatvfs. Use
libc_hidden_weak.
* sysdeps/unix/sysv/linux/statvfs.c (statvs): Rename to __statvfs
and define as weak alias of __statvfs. Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/wordsize-64/fstatvfs.c (__fstatvfs64):
Define as alias of __fstatvfs, not fstatvfs.
(fstatvfs64): Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/statvfs.c (__statvfs64):
Define as alias of __statvfs, not statvfs.
(statvfs64): Likewise.
* conform/Makefile (test-xfail-POSIX/unistd.h/linknamespace):
Remove variable.
This patch consolidates the sched_getcpu implementations across all
arches (except tile, which requires its own). This patch removes
the powerpc, x86_64 and x32 specific files and change the default
linux one to use INLINE_VSYSCALL where possible (for ports that
implements it).
* math/Makefile: Added CFLAGS for new tests.
* math/test-float-vlen16.h: New file.
* math/test-float-vlen4.h: New file.
* math/test-float-vlen8.h: New file.
* math/test-double-vlen2.h: Fixed 2 argument macro and comment.
* sysdeps/x86_64/fpu/Makefile: Added new tests and variables.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
* sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen16.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen4.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-float-vlen8.c: New file.
Here is implementation of vectorized cosf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
* sysdeps/x86_64/fpu/Versions: New versions added.
* sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file.
* sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file.
* sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
* sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf.
* NEWS: Mention addition of x86_64 vector cosf.
We test vector math functions using scalar tests infrastructure with
help of special wrappers from scalar versions to vector ones. Wrapper
implemented using platform specific vector types and placed in separate
file for compilation with architecture specific options, main part of
test has no such options. With help of system of definitions unfolding
of which is drived from test code we have wrapper called in individual
testing function instead of scalar function. Also system of definitions
includes generated during make check header math/libm-have-vector-test.h
with series of conditional definitions which help to avoid build fails
for functions having no vector versions; runtime architecture check
to prevent runtime fails of test run on inappropriate hardware.
* math/Makefile: Added rules for vector tests.
* math/gen-libm-have-vector-test.sh: Added generation of wrapper
declaration under condition.
* math/test-double-vlen2.h: New file.
* math/test-double-vlen4.h: New file.
* math/test-double-vlen8.h: New file.
* math/test-vec-loop.h: Added initialization macro.
* sysdeps/x86_64/fpu/Makefile: Added variables for vector tests.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenarated.
* sysdeps/x86_64/fpu/math-tests-arch.h: New file.
* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen2.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen4.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: New file.
* sysdeps/x86_64/fpu/test-double-vlen8.c: New file.
Here is implementation of cos containing SSE, AVX, AVX2 and AVX512
versions according to Vector ABI which had been discussed in
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
Vector math library build and ABI testing enabled by default for x86_64.
* sysdeps/x86_64/fpu/Makefile: New file.
* sysdeps/x86_64/fpu/Versions: New file.
* sysdeps/x86_64/fpu/svml_d_cos_data.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos_data.h: New file.
* sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos.
* math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC.
* sysdeps/x86_64/configure.ac: Options for libmvec build.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file.
* manual/install.texi (Configuring and compiling): Document
--disable-mathvec.
* INSTALL: Regenerated.
* NEWS: Mention addition of libmvec and x86_64 vector cos.
Handle signed integer overflow correctly. Detect and reject O_APPEND.
Document drawbacks of emulation.
This does not completely address bug 15661, but improves the situation
somewhat.
Beginning with the upcoming 4.1 release, Linux on a subset of 32-bit
ARM hardware will provide fast user-space implementations of the
following system calls:
- gettimeofday
- clock_gettime
The kernel implementation depends on the ARMv7 Generic Timers
Extension to accelerate these system calls. So CPUs such as
Cortex-A15 and -A7 benefit, while Cortex-A9, -A8, and pre-v7 CPUs do
not. On systems where the VDSO does not provide any speedup, the
kernel prevents the relevant symbol lookups from succeeding.
On OMAP5 (Cortex-A15) gettimeofday latency decreases from ~350ns to
~120ns. On BeagleBone Black (Cortex-A8) it goes from ~650ns to
~660ns, which to my mind is an acceptable cost.
Verified that no new test failures are introduced on kernels with and
without the VDSO.
* sysdeps/unix/sysv/linux/arm/Makefile: (sysdep_routines):
Include dl-vdso.
* sysdeps/unix/sysv/linux/arm/init-first.c: New file:
Use VDSO routines for gettimeofday, clock_gettime if
available.
* sysdeps/unix/sysv/linux/arm/libc-vdso.h: New file:
Declare VDSO symbols.
* sysdeps/unix/sysv/linux/arm/sysdep.h:
[HAVE_GETTIMEOFDAY_VSYSCALL]: Define.
[HAVE_CLOCK_GETTIME_VSYSCALL]: Define.
* sysdeps/unix/sysv/linux/arm/Versions: Add
__vdso_clock_gettime.
This patch uses inline calls (through INLINE_SYSCALL macro) to define
the non-cancellable functions macros to avoid use of the
syscall_nocancel entrypoint.
Carlos noted in
<https://sourceware.org/ml/libc-alpha/2015-05/msg00680.html> that
various ports use potentially problematic short variables names in
their syscall macros, which could shadow variables with the same name
from containing scopes.
This patch fixes variables called err and ret in MIPS macros. (I left
result_var and _sys_result - separate variables in different macros,
which need separate names - alone.)
Tested for mips64 (all three ABIs) that installed stripped shared
libraries are unchanged by this patch.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h (INLINE_SYSCALL):
Use variable name _sc_err instead of err.
[__mips16] (INTERNAL_SYSCALL_NCS): Use variable name _sc_ret
instead of ret.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
(INLINE_SYSCALL): Use variable name _sc_err instead of err.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h
(INLINE_SYSCALL): Likewise.
Various code in glibc uses __strnlen instead of strnlen for namespace
reasons. However, __strnlen does not use libc_hidden_proto /
libc_hidden_def (as is normally done for any function defined and
called within the same library, whether or not exported from the
library and whatever namespace it is in), so the compiler does not
know that those calls are to a function within libc.
This patch uses libc_hidden_proto / libc_hidden_def with __strnlen.
On x86_64, it makes no difference to the installed stripped shared
libraries. On 32-bit x86, it causes __strnlen calls to go to the same
place as strnlen calls (the fallback strnlen implementation), rather
than through a PLT entry for the strnlen IFUNC; I'm not sure of the
logic behind when calls from within libc should use IFUNCs versus when
they should go direct to a particular function implementation, but
clearly it doesn't make sense for strnlen and __strnlen to be handled
differently in this regard.
Tested for x86_64 and x86 (testsuite, and comparison of installed
shared libraries as described above).
* string/strnlen.c [!STRNLEN] (__strnlen): Use libc_hidden_def.
* include/string.h (__strnlen): Use libc_hidden_proto.
* sysdeps/aarch64/strnlen.S (__strnlen): Use libc_hidden_def.
* sysdeps/i386/i686/multiarch/strnlen-c.c [SHARED]
(libc_hidden_def): Define __GI___strnlen as well as __GI_strnlen.
* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-power7.S
(libc_hidden_def): Undefine and redefine.
* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
[SHARED] (libc_hidden_def): Define __GI___strnlen as well as
__GI_strnlen.
* sysdeps/powerpc/powerpc32/power7/strnlen.S (__strnlen): Use
libc_hidden_def.
* sysdeps/tile/tilegx/strnlen.c (__strnlen): Likewise.
The attached patch fixes a glibc build failure with gcc 5 on powerpc64le
caused by a recent change in gcc where the compiler defines the
_ARCH_PWR6 macro when processing assembly files but doesn't invoke the
assembler in the corresponding machine mode (unless it has been
explicitly configured to target POWER 6 or later). A bug had been filed
with gcc for this (65341) but was closed as won't fix. Glibc relies on
the _ARCH_PWR6 macro in a few .S files to make use of Power ISA 2.5
instructions (specifically, the four-argument form of the mtfsf insn).
A similar problem had occurred in the past (bug 10118) but the fix that
was committed for it didn't anticipate this new problem.
At issue for INLINE_SYSCALL was that it used "err" and "val"
as variable names in a #define, so that if it was used in a context
where the "caller" was also using "err" or "val", and those
variables were passed in to INLINE_SYSCALL, we would end up
referencing the internal shadowed variables instead.
For example, "char val" in check_may_shrink_heap() in
sysdeps/unix/sysv/linux/malloc-sysdep.h was being shadowed by
the syscall return "val" in INLINE_SYSCALL, causing the "char val"
not to get updated at all, and may_shrink_heap ended up always false.
A similar fix was made to INTERNAL_VSYSCALL_CALL.
This patch removes the architecture specific gettimeofday implementation
to use the vDSO symbol and consolidate it on a common Linux one.
Similar to clock_gettime and clock_getres vDSO implementation, each port
that supports gettimeofday through vDSO should just implement INLINE_VSYSCALL
to access the symbol and define HAVE_{GETTIME,GETRES}_VSYSCAL as 1.
On 21/05/15 05:29, Siddhesh Poyarekar wrote:
> On Wed, May 20, 2015 at 06:55:02PM +0100, Szabolcs Nagy wrote:
>> i guess it's ok for consistency if i fix struct stat64
>> too to use __USE_XOPEN2K8.
>>
>> i will run some tests and come back with a patch
>
> I also think it would be appropriate to change this code in other
> architectures (microblaze and nacl IIRC) to make all of them
> consistent. It is a mechanical enough change IMO that all arch
> maintainer acks is not necessary.
>
here is the patch with consistent __USE_XOPEN2K8
ok to commit?
2015-05-21 Szabolcs Nagy <szabolcs.nagy@arm.com>
[BZ #18234]
* conform/data/sys/stat.h-data (struct stat): Add tests for st_atim,
st_mtim and st_ctim members.
* sysdeps/nacl/bits/stat.h (struct stat, struct stat64): Make
st_atim, st_ctim, st_mtim visible under __USE_XOPEN2K8 only.
* sysdeps/unix/sysv/linux/generic/bits/stat.h (struct stat,):
(struct stat64): Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/stat.h (struct stat,):
(struct stat64): Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/stat.h (struct stat,):
(struct stat64): Likewise.
This patch consolidate the Linux vDSO define and usage across all ports
that uses it. The common vDSO definitions and calling through
{INLINE/INTERNAL}_VSYSCALL macros are moved to a common header
sysdep-vdso.h and vDSO name declaration and prototype is defined
using a common macro.
Also PTR_{MANGLE,DEMANGLE} is added to ports that does not use them
for vDSO calls (aarch64, powerpc, s390, and tile) and thus it will
reflect in code changes. For ports that already implement pointer
mangling/demangling in vDSO system (i386, x32, x86_64) this patch
is mainly a code refactor.
Checked on x32, x86_64, x32, ppc64le, and aarch64.
This patch removes the socket.S implementation for all ports and replace
it by a C implementation using socketcall. For ports that implement
the syscall directly, there is no change.
The patch idea is to simplify the socket function implementation that
uses the socketcall to be based on C implemetation instead of a pseudo
assembly implementation with arch specific parts. The patch then remove
the assembly implementatation for the ports which uses socketcall
(i386, microblaze, mips, powerpc, sparc, m68k, s390 and sh).
I have cross-build GLIBC for afore-mentioned ports and tested on both
i386 and ppc32 without regressions.
The ldbl-128 and ldbl-128ibm implementations of tanl produce
uninitialized variable warnings with -Wuninitialized because of a
variable that is initialized only conditionally, then used under the
same conditions under which it is set. This patch uses DIAG_* macros
to suppress those warnings.
Tested for powerpc and mips64.
* sysdeps/ieee754/ldbl-128/k_tanl.c: Include <libc-internal.h>.
(__kernel_tanl): Ignore uninitialized warnings around use of SIGN.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Include <libc-internal.h>.
(__kernel_tanl): Ignore uninitialized warnings around use of SIGN.
The ldbl-128 and ldbl-128ibm implementations of erfcl produce
uninitialized variable warnings with -Wuninitialized because of switch
statements where in fact one of the cases will always be executed, but
the compiler does not see that these cases cover all possibilities
(and because the reasoning that it does involves inequalities on the
representation of a floating point value leading to a set of possible
values for 8.0 times that value, converted to int, it's highly
nontrivial for the compiler to see that). This patch fixes those
warnings by converting the last case in those switch statements to a
"default" case.
Tested for powerpc and mips64.
* sysdeps/ieee754/ldbl-128/s_erfl.c (__erfcl): Make case 9 in
switch statement into default case.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfcl): Likewise.
The ldbl-128 and ldbl-128ibm implementations of asinl produce
uninitialized variable warnings with -Wuninitialized because the code
for small arguments in fact always returns but the compiler cannot see
this and instead sees that a variable would be uninitialized if the
"if (huge + x > one)" conditional used to force the "inexact"
exception were false.
All the code in libm trying to force "inexact" for functions that are
not exactly defined is suspect and should be removed at some point
given that we now have a clear definition of the accuracy goals for
libm functions which, following C99/C11, does not require anything
about "inexact" for most functions (likewise, the multi-precision code
that tries to give correctly-rounded results, very slowly, for
functions for which the goals clearly do not include correct rounding,
if the faster paths are accurate enough). However, for now this patch
simply changes the code to use math_force_eval, rather than "if", to
ensure the evaluation of the inexact computation.
Tested for powerpc and mips64.
* sysdeps/ieee754/ldbl-128/e_asinl.c (__ieee754_asinl): Don't use
a conditional in forcing "inexact".
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl):
Likewise.
pathconf (sysdeps/unix/sysv/linux/pathconf.c) uses basename. But
pathconf is in POSIX back to 1990 while basename is only reserved with
external linkage in those standards including XPG functions. This
patch fixes this namespace issue in the usual way, renaming basename
to __basename and making it into a weak alias.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18444]
* string/basename.c (basename): Rename to __basename and define as
weak alias of __basename. Use libc_hidden_weak.
* include/string.h (__basename): Declare. Use libc_hidden_proto.
* sysdeps/unix/sysv/linux/pathconf.c (distinguish_extX): Call
__basename instead of basename.
* conform/Makefile (test-xfail-POSIX2008/unistd.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/unistd.h/linknamespace): Likewise.
If you remove the "override CFLAGS += -Wno-uninitialized" in
math/Makefile, you get errors from lgamma implementations of the form:
../sysdeps/ieee754/dbl-64/e_lgamma_r.c: In function '__ieee754_lgamma_r':
../sysdeps/ieee754/dbl-64/e_lgamma_r.c:297:13: error: 'nadj' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if(hx<0) r = nadj - r;
This is one of the standard kinds of false positive uninitialized
warnings: nadj is set under a certain condition, and then later used
under the same condition. This patch uses DIAG_* macros to suppress
the warning on the use of nadj. The ldbl-128 / ldbl-128ibm
implementation has a substantially different structure that avoids
this issue.
Tested for x86_64. (In fact this patch eliminates the need for that
-Wno-uninitialized on x86_64, but I want to test on more architectures
before removing it.)
* sysdeps/ieee754/dbl-64/e_lgamma_r.c: Include <libc-internal.h>.
(__ieee754_lgamma_r): Ignore uninitialized warnings around use of
NADJ.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c: Include <libc-internal.h>.
(__ieee754_lgammaf_r): Ignore uninitialized warnings around use of
NADJ.
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c: Include <libc-internal.h>.
(__ieee754_lgammal_r): Ignore uninitialized warnings around use of
NADJ.
If you remove the "override CFLAGS += -Wno-uninitialized" in
math/Makefile, one of the errors you get is:
../sysdeps/ieee754/dbl-64/mpa.c: In function '__mp_dbl.part.0':
../sysdeps/ieee754/dbl-64/mpa.c:183:5: error: 'c' may be used uninitialized in this function [-Werror=maybe-uninitialized]
c *= X[0];
The problem is that the p < 5 case initializes c if p is 1, 2, 3 or 4
but not otherwise, and in fact p is positive for all calls to this
function so the uninitialized case can't actually occur. This patch
replaces the "if (p == 4)" last case with a comment so the compiler
can see that all paths do initialize c.
Tested for x86_64.
* sysdeps/ieee754/dbl-64/mpa.c (norm): Remove if condition on
(p == 4) case.
This patch removes the specialized i386 assembly implementations for
fallocate{64}, pselect, and sync_file_range now that i386 have
support for 6 argument syscalls.
ldbl-96 remquol wrongly handles the case where the first argument is
finite and the second infinite, because the check for the second
argument being a NaN fails to disregard the explicit high mantissa bit
and so wrongly interprets an infinity as being a NaN. This patch
fixes this by masking off that bit, and improves test coverage for
both remainder and remquo (various cases were missing tests, or, as in
the case of the bug, were tested only for one of the two functions).
Tested for x86_64 and x86.
[BZ #18244]
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Ignore explicit
high mantissa bit when testing whether P is a NaN.
* math/libm-test.inc (remainder_test_data): Add more tests.
(remquo_test_data): Likewise.
The i386 implementation of atanhl, for small arguments, does a
calculation that involves computing twice the square of the argument,
resulting in spurious underflows for some arguments. This patch fixes
this by just returning the argument when its exponent is below -32,
with underflow being forced as needed for subnormal arguments.
Tested for x86 and x86_64.
[BZ #18049]
* sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): For exponents
below -32, return the argument, with underflow if subnormal.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some atanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (No change in this regard is needed
for the i386 implementation; special handling to force underflows in
these cases will only be needed there when the spurious underflows,
bug 18049, get fixed.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16352]
* sysdeps/i386/fpu/e_atanh.S (dbl_min): New object.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/i386/fpu/e_atanhf.S (flt_min): New object.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_atanh.c: Include <float.h>.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/flt-32/e_atanhf.c: Include <float.h>.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from atanh.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of tanf produces spurious underflow
exceptions for some small arguments, through computing values on the
order of x^5. This patch fixes this by adjusting the threshold for
returning x (or, as applicable, +/- 1/x) to 2**-13 (the next term in
the power series being x^3/3).
Tested for x86_64 and x86.
[BZ #18221]
* sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Use 2**-13 not
2**-28 as threshold for returning x or +/- 1/x.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of lgammaf produces spurious underflow
exceptions for some large arguments, because of calculations involving
x^-2 multiplied by small constants. This patch fixes this by
adjusting the threshold for a simpler computation to 2**26 (the error
in the simpler computation is on the order of 0.5 * log (x), for a
result on the order of x * log (x)).
Tested for x86_64 and x86.
[BZ #18220]
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Use
2**26 not 2**58 as threshold for returning x * (log (x) - 1).
* math/auto-libm-test-in: Add another test of lgamma.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of erfcf produces spurious underflow
exceptions for some arguments close to 0, because of calculations
squaring the argument and then multiplying by small constants. This
patch fixes this by adjusting the threshold for arguments for which
the result is so close to 1 that 1 - x will give the right result from
2**-56 to 2**-26. (If 1 - x * 2/sqrt(pi) were used, the errors would be
on the order of x^3 and a much larger threshold could be used.)
Tested for x86_64 and x86.
[BZ #18217]
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Use 2**-26 not 2**-56
as threshold for returning 1 - x.
* math/auto-libm-test-in: Add more tests of erfc.
* math/auto-libm-test-out: Regenerated.
The sysdeps/ieee754/flt-32 version of atanf produces spurious
underflow exceptions for some large arguments, because of computations
that compute x^-4. This patch fixes this by adjusting the threshold
for large arguments (for which +/- pi/2 can just be returned, the
correct result being roughly +/- pi/2 - 1/x) from 2^34 to 2^25.
Tested for x86_64 and x86.
[BZ #18196]
* sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Use 2^25 not 2^34 as
threshold for large arguments.
* math/auto-libm-test-in: Add another test of atan.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some log1p implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (The ldbl-128ibm implementation
doesn't currently need any change as it already generates this
exception, albeit through code that would generate spurious exceptions
in other cases; special code for this issue will only be needed there
when fixing the spurious exceptions.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16339]
* sysdeps/i386/fpu/s_log1p.S (dbl_min): New object.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/s_log1pf.S (flt_min): New object.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/dbl-64/s_log1p.c: Include <float.h>.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/s_log1pf.c: Include <float.h>.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_log1pl.c: Include <float.h>.
(__log1pl): Force underflow exception for results with small
absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from log1p.
* math/auto-libm-test-out: Regenerated.
To make a strtok faster and improve performance in general we need to do one
additional change.
A comment:
/* It doesn't make sense to send libc-internal strcspn calls through a PLT.
The speedup we get from using SSE4.2 instruction is likely eaten away
by the indirect call in the PLT. */
Does not make sense at all because nobody bothered to check it. Gap
between these implementations is quite big, when haystack is empty a
sse2 is around 40 cycles slower because it needs to populate a lookup
table and difference only increases with size. That is much bigger than
plt slowdown which is few cycles.
Even benchtest show a gap which also may be reverse by branch
misprediction but my internal benchmark shown.
simple_strspn stupid_strspn __strspn_sse42 __strspn_sse2
Length 0, alignment 0, acc len 6: 18.6562 35.2344 17.0469 61.6719
Length 6, alignment 0, acc len 6: 59.5469 72.5781 16.4219 73.625
This patch also handles strpbrk which is implemented by including a
x86_64/multiarch/strcspn.S file.
* sysdeps/x86_64/multiarch/strspn.S: Remove plt indirection.
* sysdeps/x86_64/multiarch/strcspn.S: Likewise.
Programs are supposed to be able to define the __fpu_control variable,
overriding the library's version to cause the floating-point control
word to be set to the chosen value at startup.
This is broken for mips16 for static linking because the library's
__fpu_control variable is in the same object file as the helper
functions used by fpu_control.h for mips16, so test-fpucw-ieee-static
fails to link with multiple definitions of __fpu_control.
This patch fixes this by putting the helpers in a separate file rather
than overriding fpu_control.c. Tested for mips16 that this fixes the
link failure and the ABI tests still pass.
[BZ #18397]
* sysdeps/mips/mips32/fpu/fpu_control.c: Move to ....
* sysdeps/mips/mips32/fpu/fpucw-helpers.c: ... here. Include
<fpu_control.h> instead of <math/fpu_control.c>.
* sysdeps/mips/mips32/fpu/Makefile: New file.
This patch adds more randomly-generated tests of various libm
functions that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of csqrt, lgamma, log10
and sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
There appears to be a discrepancy among the implementations
of setcontext with regards to the function called once the last
linked-to context has finished executing via setcontext.
The POSIX standard says:
~~~
If the uc_link member of the ucontext_t structure pointed to by
the ucp argument is equal to 0, then this context is the main
context, and the thread will exit when this context returns.
~~~
It says "exit" not "exit immediately" nor "exit without running
functions registered with atexit or on_exit."
Therefore the AArch64, ARM, hppa and NIOS II implementations are
wrong and no test detects it.
It is questionable if this should even be fixed or just documented
that the above 4 targets are wrong. The functions are deprecated
and nobody should be using them, but at the same time it silly to
have cross-target differences that make it hard to port old
applications from say x86_64 to AArch64.
Therefore I will ix the 4 arches, and checkin a regression
test to prevent it from changing again.
https://sourceware.org/ml/libc-alpha/2015-03/msg00720.html
This patch adds more randomly-generated tests of various libm
functions that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of acosh, atanh, cos,
csqrt, erfc, sin and sincos.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds more randomly-generated tests of various libm
functions that are observed to increase ulps on x86_64. (This process
must eventually converge, when my random test generation stops finding
inputs that increase the listed ulps, except maybe for any cases
uncovered where the errors exceed the maximum allowed 9ulp error and
so indicate actual libm bugs needing fixing.)
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of acosh, atanh, clog,
clog10, csqrt, erfc, exp2, expm1, log10, log2 and sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds more randomly-generated tests of various libm
functions that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of atan, clog, clog10,
cos, csqrt, erf, erfc, exp2, lgamma, log1p, sin, sincos, tanh and
tgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of tgamma that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of tgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of tanh that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of tanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of tan that are observed
to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of cos, sin and sincos
that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of cos, sin and sincos.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds a randomly-generated test of pow that is observed to
increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add another test of pow.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
This patch adds some randomly-generated tests of lgamma that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of lgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of log, log10, log1p and
log2 that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of log, log10, log2 and
log1p.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of exp, exp10, exp2 and
expm1 that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of exp, exp10, exp2 and
expm1.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of erf and erfc that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of erf and erfc.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of csqrt that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some further randomly-generated tests of cosh and sinh
that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of cosh and sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Since glibc is no longer built with -Winline, a special MIPS version
of waitid.c to disable -Winline is no longer needed, and this patch
removes it. Tested that glibc does indeed build with the patch
applied.
* sysdeps/unix/sysv/linux/mips/mips32/waitid.c: Remove file.
The implementation of roundl for ldbl-128 involves undefined behavior
for arguments with exponents from 31 to 47 inclusive, from the shift:
u_int64_t i = -1ULL >> (j0 - 48);
For example, on mips64, this means roundl (0xffffffffffff.8p0L)
wrongly returns its argument, which is not an integer. A condition
checking for exponents < 31 should actually be checking for exponents
< 48, and this patch makes it do so. (That condition is for whether
the bit representing 0.5 is in the high 64-bit half of the
floating-point number. The value 31 might have arisen from an
incorrect conversion of the ldbl-96 version to handle ldbl-128.)
This was originally reported as a GCC libquadmath bug
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65757>.
Tested for mips64; also tested for x86_64 and x86 to make sure the new
tests pass there.
[BZ #18346]
* sysdeps/ieee754/ldbl-128/s_roundl.c (__roundl): Handle all
exponents less than 48 as cases where high part of mantissa needs
examining to determine whether argument is integral.
* math/libm-test.inc (round_test_data): Add more tests.
This patch adds support to query cache information on s390
via sysconf() function - e.g. with _SC_LEVEL1_ICACHE_SIZE.
The attributes size, linesize and assoc can be queried
for cache level 1 - 4 via "extract cpu attribute" instruction,
which was first available with z10.
* NEWS: Mention sysconf() cache information support for s390.
* sysdeps/unix/sysv/linux/s390/sysconf.c: New File.
This does make ld.so very slightly larger (0.3%) and doesn't seem to
actually improve performance; in fact, my limited testing suggested a
slight (0.1%) performance decrease (running fork/exec of a no-op program
in a loop), but I didn't do enough testing to establish statistical
significance.
However, Roland agrees that it makes sense to switch tile to using
this path, since it's the more standard way.
This patch fix the static build for strftime, which uses __wcschr.
Current powerpc32 implementation defines the __wcschr be an alias to
__wcschr_ppc32 and current implementation misses the correct alias for
static build.
It also changes the default wcschr.c logic so a IFUNC implementation
should just define WCSCHR and undefine the required alias/internal
definitions.
According to bug 6792, errno is not set to ERANGE/EDOM
by calling log1p/log1pf/log1pl with x = -1 or x < -1.
This patch adds a wrapper which sets errno in those cases
and returns the value of the existing __log1p function.
The log1p is now an alias to the wrapper function
instead of __log1p.
The files in sysdeps are reflecting these changes.
The ia64 implementation sets errno by itself,
thus the wrapper-file is empty.
The libm-test is adjusted for log1p-tests to check errno.
[BZ #6792]
* math/w_log1p.c: New file.
* math/w_log1pf.c: Likewise.
* math/w_log1pl.c: Likewise.
* math/Makefile (libm-calls): Add w_log1p.
* math/s_log1pl.c (log1pl): Remove weak_alias.
* sysdeps/i386/fpu/s_log1p.S (log1p): Likewise.
* sysdeps/i386/fpu/s_log1pf.S (log1pf): Likewise.
* sysdeps/i386/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/x86_64/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/ieee754/dbl-64/s_log1p.c (log1p): Likewise.
[NO_LONG_DOUBLE] (log1pl): Likewise.
* sysdeps/ieee754/flt-32/s_log1pf.c (log1pf): Likewise.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/s_log1pl.c
(log1p): Remove long_double_symbol.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/w_log1pl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/w_log1pl.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1p.c: Define empty weak_alias to
remove weak_alias for corresponding log1p function.
* sysdeps/m68k/m680x0/fpu/s_log1pf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise.
* sysdeps/ia64/fpu/w_log1p.c: New file.
* sysdeps/ia64/fpu/w_log1pf.c: Likewise.
* sysdeps/ia64/fpu/w_log1pl.c: Likewise.
* math/libm-test.inc (log1p_test_data): Add errno expectations.
This patch adds some randomly-generated tests of clog and clog10 that
are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of clog and clog10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of atanh that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of atan that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of atan.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of cbrt that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of cbrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
This patch adds some randomly-generated tests of cabs that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of cabs.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The dbl-64 implementation of atan2 does computations that expect to
run in round-to-nearest mode, and in other modes the errors can
accumulate to more than the maximum accepted 9ulp. This patch makes
it use FE_TONEAREST internally, similar to other functions with such
issues. Tests that previously produced large errors are added for
atan2 and the closely related carg, clog and clog10 functions.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18210]
[BZ #18211]
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <fenv.h>.
(__ieee754_atan2): Set FE_TONEAREST mode for internal
computations.
* math/auto-libm-test-in: Add more tests of atan2, carg, clog and
clog10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The dbl-64 implementation of atan does computations that expect to run
in round-to-nearest mode, and in other modes the errors can accumulate
to more than the maximum accepted 9ulp. This patch makes it use
FE_TONEAREST internally, similar to other functions with such issues.
Tested for x86_64 and x86; no ulps updates needed.
[BZ #18197]
* sysdeps/ieee754/dbl-64/s_atan.c: Include <fenv.h>.
(atan): Set FE_TONEAREST mode for internal computations.
* math/auto-libm-test-in: Add more tests of atan.
* math/auto-libm-test-out: Regenerated.
Silvermont and Knights Landing have a modular system design with two cores
sharing an L2 cache. If more than 2 cores are detected to shared L2 cache,
it should be adjusted for Silvermont and Knights Landing.
[BZ #18185]
* sysdeps/x86_64/cacheinfo.c (init_cacheinfo): Limit threads
sharing L2 cache to 2 for Silvermont/Knights Landing.
With copy relocation, address of protected data defined in the shared
library may be external. When there is a relocation against the
protected data symbol within the shared library, we need to check if we
should skip the definition in the executable copied from the protected
data. This patch adds ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and defines
it for x86. If ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA isn't 0, do_lookup_x
will skip the data definition in the executable from copy reloc.
[BZ #17711]
* elf/dl-lookup.c (do_lookup_x): When UNDEF_MAP is NULL, which
indicates it is called from do_lookup_x on relocation against
protected data, skip the data definion in the executable from
copy reloc.
(_dl_lookup_symbol_x): Pass ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA,
instead of ELF_RTYPE_CLASS_PLT, to do_lookup_x for
EXTERN_PROTECTED_DATA relocation against STT_OBJECT symbol.
* sysdeps/generic/ldsodefs.h * (ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA):
New. Defined to 4 if DL_EXTERN_PROTECTED_DATA is defined,
otherwise to 0.
* sysdeps/i386/dl-lookupcfg.h (DL_EXTERN_PROTECTED_DATA): New.
* sysdeps/i386/dl-machine.h (elf_machine_type_class): Set class
to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA for R_386_GLOB_DAT.
* sysdeps/x86_64/dl-lookupcfg.h (DL_EXTERN_PROTECTED_DATA): New.
* sysdeps/x86_64/dl-machine.h (elf_machine_type_class): Set class
to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA for R_X86_64_GLOB_DAT.
IFUNC is difficult to correctly implement on any target needing a GOT
to support position independent code, due to the dependency on order
of dynamic relocations. ld.so should be changed to apply IFUNC
relocations last, globally, because without that it is actually
impossible to write an IFUNC resolver in C that works in all
situations. Case in point, vfork in libpthread.so is an IFUNC with
the resolver returning &__libc_vfork. (system and fork are similar.)
If another shared library, libA say, uses vfork then it is quite
possible that libpthread.so hasn't been dynamically relocated before
the unfortunate libA is dynamically relocated. In that case the GOT
entry for &__libc_vfork is still zero, so the IFUNC resolver returns
NULL. LD_BIND_NOW=1 results in libA PLT dynamic relocations being
applied using this NULL value and ld.so segfaults.
This patch hardens ld.so to not segfault on a NULL from an IFUNC
resolver. It also fixes a problem with undefined weak. If you leave
the plt entry as-is for undefined weak then if the entry is ever
called it will loop in ld.so rather than segfaulting.
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_fixup_plt):
Don't segfault if ifunc resolver returns a NULL. Do set plt to
zero for undefined weak.
(elf_machine_plt_conflict): Similarly.
This patch adds some randomly-generated tests of acosh, asinh and
atanh that are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of acosh, asinh and
atanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds a randomly-generated test of asin that is observed to
increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add another test of asin.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
In the course of the work on six-argument syscalls I noticed that the
i386 lowlevellock.h contained some unused macro definitions (already
unused before my patch). This patch removes them.
Tested for x86 that installed stripped shared libraries are unchanged
by this patch.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h (LLL_EBX_LOAD):
Remove macro.
(LLL_EBX_REG): Likewise.
(LLL_ENTER_KERNEL): Likewise.
This patch adds some randomly-generated tests of asin that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of asin.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch follows the approach outlined in
<https://sourceware.org/ml/libc-alpha/2015-03/msg00656.html> to
support six-argument syscalls from INTERNAL_SYSCALL for 32-bit x86,
making them call a function __libc_do_syscall that takes the syscall
number and three syscall arguments in the registers in which the
kernel expects them, along with a pointer to a structure containing
the other three arguments.
In turn, this allows the generic lowlevellock-futex.h to be used on
32-bit x86, so supporting lll_futex_timed_wait_bitset (and so allowing
FUTEX_CLOCK_REALTIME to be used in various cases, so fixing bug 18138
for 32-bit x86 and leaving hppa as the only architecture missing
lll_futex_timed_wait_bitset). The change to lowlevellock.h's
definition of SYS_futex is because the generic lowlevelloc-futex.h
ends up bringing in bits/syscall.h which defines SYS_futex to
__NR_futex, so resulting in redefinition errors. The revised
definition in lowlevellock.h is in line with what the x86_64 version
does.
__libc_do_syscall is only needed in libpthread at present (meaning
nothing special needs to be done to make it shared-only in most
libraries containing it, static in libc only, as on ARM).
Tested for 32-bit x86, with the glibc testsuite and with the test in
bug 18138. The failures seen
FAIL: nptl/tst-cleanupx4
FAIL: rt/tst-cpuclock2
are pre-existing.
[BZ #18138]
* sysdeps/unix/sysv/linux/i386/sysdep.h (struct
libc_do_syscall_args): New structure.
(INTERNAL_SYSCALL_MAIN_0): New macro.
(INTERNAL_SYSCALL_MAIN_1): Likewise.
(INTERNAL_SYSCALL_MAIN_2): Likewise.
(INTERNAL_SYSCALL_MAIN_3): Likewise.
(INTERNAL_SYSCALL_MAIN_4): Likewise.
(INTERNAL_SYSCALL_MAIN_5): Likewise.
(INTERNAL_SYSCALL_MAIN_6): Likewise. Call __libc_do_syscall.
(INTERNAL_SYSCALL): Define to use INTERNAL_SYSCALL_MAIN_##nr.
Replace conditional definitions by conditional definitions of ....
(INTERNAL_SYSCALL_MAIN_INLINE): ... this. New macro.
* sysdeps/unix/sysv/linux/i386/libc-do-syscall.S: New file.
* sysdeps/unix/sysv/linux/i386/Makefile [$(subdir) = nptl]
(libpthread-sysdep_routines): Add libc-do-syscall.
* sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h: Remove file.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h (SYS_futex): Define
to __NR_futex not 240.
This patch is glibc support for a PowerPC TLS optimization, inspired
by Alexandre Oliva's TLS optimization for other processors,
http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt
In essence, this optimization uses a zero module id in the tls_index
GOT entry to indicate that a TLS variable is allocated space in the
static TLS area. A special plt call linker stub for __tls_get_addr
checks for such a tls_index and if found, returns the offset
immediately. The linker communicates the fact that the special
__tls_get_addr stub is used by setting a bit in the dynamic tag
DT_PPC64_OPT/DT_PPC_OPT. glibc communicates to the linker that this
optimization is available by the presence of __tls_get_addr_opt.
tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for
tst-tls-dlinfo, which otherwise would fail since it tests that no
static tls is allocated. The ld option --no-tls-get-addr-optimize has
been available since binutils-2.20 so doesn't need a configure test.
* NEWS: Advertise TLS optimization.
* elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define.
(DT_PPC_NUM): Increment.
* elf/dynamic-link.h (HAVE_STATIC_TLS): Define.
(CHECK_STATIC_TLS): Use here.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize
TLS descriptors.
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/powerpc/dl-tls.c: New file.
* sysdeps/powerpc/Versions: Add __tls_get_addr_opt.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test.
* sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test.
Build tst-tlsmod2.so with --no-tls-get-addr-optimize.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.
This feature doesn't depend on the linker, as can be seen from the
actual test. It's a compiler feature.
* sysdeps/powerpc/powerpc64/configure.ac: Correct "linker support
for overlapping .opd entries" to "support...".
* sysdeps/powerpc/powerpc64/configure: Regenerate
This patch adds some randomly-generated tests of acos that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of acos.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of expm1 that are
observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of expm1.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
This patch adds some randomly-generated tests of cosh and sinh that
are observed to increase ulps on x86_64.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of cosh and sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The x86_64 and x86 libm-test-ulps files hadn't been regenerated from
scratch for some time, as evidenced by the presence of entries for
*_tonearest functions (those tests duplicated the
default-rounding-mode tests, and such duplicates are no longer run).
The aarch64, alpha, hppa, ia64, m68k, microblaze, powerpc, s390, sh,
sparc, tile files similarly could do with from-scratch regeneration as
evidenced by the presence of such entries. (Truncate the existing
file then run "make regen-ulps" and move the resulting file into
place.)
This patch regenerates the x86_64 and x86 files from scratch. It's
likely some of the reduced / removed ulps will need restoring because
they appear on processors or compiler versions other than the one I
tested on, but in such cases I'd like to first see if I can generate
new tests that show such ulps on the Intel processor I'm testing on,
to reduce the effects from different people using different processors
and compilers to regenerate the ulps.
* sysdeps/i386/fpu/libm-test-ulps: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
In testing for x86_64 on an AMD processor, I observed libm test
failures of the form:
testing long double (without inline functions)
Failure: Test: log2_downward (0x2.b7e151628aed4p+0)
Result:
is: 1.44269504088896356633e+00 0xb.8aa3b295c17f67600000p-3
should be: 1.44269504088896356622e+00 0xb.8aa3b295c17f67500000p-3
difference: 1.08420217248550443400e-19 0x8.00000000000000000000p-66
ulp : 1.0000
max.ulp : 0.0000
Maximal error of `log2_downward'
is : 1 ulp
accepted: 0 ulp
These issues arise because the maximum ulps when regenerating on one
processor are not the same as on another processor, so regeneration on
several processors may be needed when updating libm-test-ulps to avoid
failures for some users testing glibc - but such regeneration on
multiple processors is inconvenient. Causes can be: on x86 and, for
x86_64, for long double, variation in results of x87 instructions for
transcendental operations between processors; on x86, variation in
compiler excess precision between compiler versions and
configurations; on any processor where the compiler may contract
expressions using fused multiply-add, variation in what contraction
occurs.
Although it's hard to be sure libm-test-ulps covers all ulps that may
be seen in any configuration for the given architecture, in practice
it helps simply to add wider test coverage to make it more likely
that, when testing on one processor, the ulps seen are the biggest
that can be seen for that function on that processor, and hopefully
they are also the biggest that can be seen for that function in other
configurations for that architecture. Thus, this patch adds some
tests of log2 that increase the ulps I see on x86_64 on an Intel
processor, so that hopefully future from-scratch regenerations on that
processor will produce ulps big enough not to have errors from testing
on AMD processors. These tests were found by randomly generating
inputs and seeing what produced ulps larger than those currently in
libm-test-ulps. Of course such increases also improve the accuracy of
the empirical table of known ulps generated from libm-test-ulps files
that goes in the manual.
Tested for x86_64 and x86 and ulps updated accordingly.
* math/auto-libm-test-in: Add more tests of log2.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
extend_alloca was used to emulate VLA deallocation. The new version
also handles the res == 0 corner case more explicitly, by returning 0
instead of the (potentially undefined, but usually zero) system call
error.
In bc0cdc498 the configure check for HAVE_ASM_PPC_REL16 was removed
on the grounds that the minimum binutils supports rel16 relocs. This
is true, but not all references to HAVE_ASM_PPC_REL16 in the sources
were removed.
* config.h.in: Remove HAVE_ASM_PPC_REL16.
* sysdeps/powerpc/powerpc32/tls-macros.h: Remove HAVE_ASM_PPC_REL16
and false branch of conditional.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S:
Likewise.
* sysdeps/mach/hurd/Makefile ($(common-objpfx)errnos.d): Depend on
libc-modules.h
* sysdeps/mach/hurd/i386/trampoline.c (_hurd_setup_sighandler): Remove
unused declaration of _hurd_intr_rpc_msg_in_trap.
* mach/mach_init.c (__mach_init): Test whether HAVE_HOST_PAGE_SIZE is
defined instead of whether it is non-zero.
* sysdeps/mach/hurd/i386/intr-msg.h (INTR_MSG_TRAP): Use "+m"
input constraint instead of both input and output constraint. Use ecx
clobber instead of %ecx.
* sysdeps/mach/hurd/malloc-machine.h (mutex_init, mutex_lock,
mutex_unlock): Use a statement expression instead of an expression list.
* sysdeps/mach/hurd/setitimer.c (_hurd_itimer_thread_stack_size): Set
type to vm_size_t instead of vm_address_t.
* sysdeps/mach/hurd/fork.c (__fork): Test whether STACK_GROWTH_UP is
defined instead of whether it is non-zero.
* hurd/hurd/ioctl.h (_hurd_locked_install_cttyid): New declaration.
* sysdeps/mach/hurd/setsid.c: Include <hurd/ioctl.h>.
* sysdeps/mach/hurd/mmap.c (__mmap): Use 0 instead of NULL for
comparisons with mapaddr.
* nscd/nscd-client.h: Include <time.h>.
* sysdeps/mach/hurd/dl-sysdep.c (fmh): Pass vm_offset_t dummy
9th parameter to __vm_region instead of int.
We need to add a BND prefix before indirect branch at the end of
_dl_runtime_resolve to preserve bound registers.
[BZ #18134]
* sysdeps/x86_64/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_resolve): Add a BND prefix before indirect branch.
This patch makes soft-fp use static assertions in place of conditional
calls to abort, in places where there are checks for conditions (on
the types for which a macro is used) that the code is not prepared to
handle. The fallback definition of _FP_STATIC_ASSERT (for kernel use
only, as only relevant to compilers not supported for building glibc)
is as in misc/sys/cdefs.h.
This means that soft-fp only ever calls abort for _FP_UNREACHABLE
calls in builds with GCC versions before 4.5. Thus, there is no need
for an abort declaration or <stdlib.h> include, since the kernel code
handles defining abort as a macro itself - and so this avoids any need
for an __KERNEL__ condition on the abort declaration to avoid it
breaking with the kernel's macro definition. That is, this patch is
intended to make glibc's soft-fp code suitable for kernel use with no
kernel-local changes to the soft-fp code needed at all.
Tested for powerpc-nofpu that installed stripped shared libraries are
unchanged by the patch. One explicit <stdlib.h> include had to be
added to a file that was relying on the include from soft-fp.h.
* soft-fp/soft-fp.h (_FP_STATIC_ASSERT): New macro.
[_LIBC]: Do not include <stdlib.h>.
[!_LIBC] (abort): Remove declaration.
* soft-fp/op-2.h (_FP_MUL_MEAT_2_120_240_double): Use
_FP_STATIC_ASSERT instead of conditionally calling abort.
* soft-fp/op-common.h (_FP_FROM_INT): Likewise.
(_FP_EXTEND_CNAN): Likewise.
(FP_TRUNC): Likewise.
(__FP_CLZ): Likewise.
* sysdeps/powerpc/nofpu/flt-rounds.c: Include <stdlib.h>.
With AIX port deprecated there is no need to check/define
HAVE_ASM_GLOBAL_DOT_NAME anymore since the current minimum binutils
supported (2.22) does not emit global symbol with dot.
This patch removes all the HAVE_ASM_GLOBAL_DOT_NAME definition and
checks for powerpc64 port.
The function feupdateenv has been fixed to correctly handle FE_DFL_ENV
and FE_NOMASK_ENV.
The fesetexceptflag function has been fixed to correctly handle setting
the new flags instead of just OR-ing the existing flags.
This fixes the test-fenv-return and test-fenvinline failures on hppa.
The constraints in the inline assembly in feholdexcept and fesetenv
are incorrect. The assembly modifies the buffer pointer, but doesn't
express that in the constraints. The simple fix is to remove the
modification of the buffer pointer which is no longer required by
the existing code, and adjust the one constraint that did express
the modification of bufptr.
The change fixes test-fenv when glibc is compiled with recent gcc.
This patch fixes the inline feraiseexcept and feclearexcept macros for
powerpc by casting the input argument to integer before operation on it.
It fixes BZ#17776.
__ASSUME_PRLIMIT64 is defined in kernel-features.h for kernels 2.6.36
and later, but hppa, microblaze and sh did not add the prlimit64
syscall until 2.6.37. This patch adds corresponding undefines of
__ASSUME_PRLIMIT64 to those architectures' kernel-features.h files.
(This concludes the kernel-features.h fixes arising out of the review
- limited to macros defined in the architecture-independent
kernel-features.h file - I did in connection with the move to 2.6.32
minimum kernel version. For that subset of macros - I didn't check
any purely architecture-specific macros - I think they are now defined
for the correct kernel versions on each architecture after this
patch.)
[BZ #17779]
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Undefine.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Likewise.
The threshold in ldbl-96 atanhl for when to return the argument,
0x1p-28, is a bit too big, and that in ldbl-128ibm atanhl is much too
big (the relevant condition being x^3/3 being < 0.5ulp of x),
resulting in errors a bit above the limits of those considered
acceptable in glibc in the ldbl-96 case, and in large errors in the
ldbl-128ibm case. This patch changes those implementations to use
more appropriate thresholds and adds tests around the thresholds for
various formats.
Tested for x86_64, x86 and powerpc. x86_64 and x86 ulps updated
accordingly.
[BZ #18046]
[BZ #18047]
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl): Use
0x1p-56L as threshold for just returning the argument.
* sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Use
0x1p-32L as threshold for just returning the argument.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulp: Likewise.
We want to avoid -Wno- options in makefiles as far as possible, by
cleaning up the underlying issues if possible or failing that by using
diagnostic pragmas. This patch eliminates the use of
-Wno-write-strings for sysdeps/ieee754/k_standard.c by using casts in
the source file to cast away const; those casts are encapsulated in a
macro that also deals with the choice of strings for float / double /
long double functions (for which the logic was previously replicated
many times).
Tested for x86_64; the only change to disassembly of installed
stripped shared libraries was a line number in an assertion.
* sysdeps/ieee754/k_standard.c (CSTR): New macro.
(__kernel_standard): Use CSTR macro when setting exc.name.
* sysdeps/ieee754/Makefile [$(subdir) = math]
(CFLAGS-k_standard.c): Remove variable.
math/Makefile currently has:
# The fdlibm code generates a lot of these warnings but is otherwise clean.
override CFLAGS += -Wno-uninitialized
This is of course undesirable; warnings should be disabled as narrowly
as possible. To remove this override, we need to fix files that
generate such warnings, or put warning-disabling pragmas in them.
This patch does so for Bessel function implementations, one of the
cases that have the warnings if the override is removed. The warnings
arise because functions set pointer variables p and q only for certain
values of the function argument, then use them unconditionally. As
the static functions in question only get called for arguments that
satisfy the last condition in the if/else chain, the natural fix is to
change the last "else if" to just "else", which this patch does. (The
ldbl-128 / ldbl-128ibm implementation of these functions is
substantially different and looks like it already does use "else" in
the last case in the nearest corresponding code.)
Tested for x86_64 and x86.
* sysdeps/ieee754/dbl-64/e_j0.c (pzero): Change last case for
setting p and q from "else if" to "else".
(qzero): Likewise.
* sysdeps/ieee754/dbl-64/e_j1.c (pone): Likewise.
(qone): Likewise.
* sysdeps/ieee754/flt-32/e_j0f.c (pzerof): Likewise.
(qzerof): Likewise.
* sysdeps/ieee754/flt-32/e_j1f.c (ponef): Likewise.
(qonef): Likewise.
* sysdeps/ieee754/ldbl-96/e_j0l.c (pzero): Likewise.
(qzero): Likewise.
* sysdeps/ieee754/ldbl-96/e_j1l.c (pone): Likewise.
(qone): Likewise.
The ldbl-128 and ldbl-128ibm implementations of acosl have similar
bugs, using a threshold of 0x1p-57L to determine when they just return
pi/2. Since the result pi/2 - asinl (x) is roughly pi/2 - x for small
x, the relevant cut-off is actually x being < 0.5ulp of 1. This patch
fixes the implementations to use that cut-off and adds tests of small
acos arguments.
Tested for powerpc and mips64. Also tested for x86_64 and x86; no
ulps updates needed.
[BZ #18038]
[BZ #18039]
* sysdeps/ieee754/ldbl-128/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-113L.
* sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-106L.
* math/auto-libm-test-in: Add more tests of acos.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asin implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, powerpc and mips64.
[BZ #16351]
* sysdeps/i386/fpu/e_asin.S (dbl_min): New object.
(MO): New macro.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/e_asinf.S (flt_min): New object.
(MO): New macro.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_asin.c: Include <float.h> and <math.h>.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/e_asinf.c: Include <float.h>.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]:
Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16351.
* math/auto-libm-test-out: Regenerated.
The ldbl-128ibm implementation of logbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, logbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and (fixed) bug 18029 for
ilogbl.) This patch adds checks for that case.
Tested for powerpc.
[BZ #18030]
* sysdeps/ieee754/ldbl-128ibm/s_logbl.c (__logbl): Adjust exponent
of power of 2 down when low part has opposite sign.
* math/libm-test.inc (logb_test_data): Add more tests.
The ldbl-128ibm implementation of ilogbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, ilogbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and bug 18030 for logbl.)
This patch adds checks for that case.
Tested for powerpc.
[BZ #18029]
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl):
Adjust exponent of power of 2 down when low part has opposite
sign.
* math/libm-test.inc (ilogb_test_data): Add more tests.
This patch fixes the missing "__memcpy_ppc" symbol for memmove-ppc64
object in static builds. Since memcpy ifunc is not enabled in static
mode, the specialized symbols are not provided. The patch changed the
it to just "__memcpy" instead.
The ldbl-128ibm implementation of asinhl uses cut-offs of 0x1p28 and
0x1p-29 to determine when to use simpler formulas that avoid possible
overflow / underflow. Both those cut-offs are inappropriate for this
format, resulting in large errors. This patch changes the code to use
more appropriate cut-offs of 0x1p56 and 0x1p-56, adding tests around
the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18020]
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use 2**56 and
2**-56 not 2**28 and 2**-29 as thresholds for simpler formulas.
* math/auto-libm-test-in: Add more tests of asinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Similarly to what we did for in6_addr, we need a macro
to guard in6_pktinfo and ip6_mtuinfo too.
Cc: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
The ldbl-128ibm implementation of acoshl uses a cut-off of 0x1p28 to
determine when to use log(x) + log(2) as a formula. That cut-off is
too small for this format, resulting in large errors. This patch
changes it to a more appropriate cut-off of 0x1p56, adding tests
around the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18019]
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Use
2**56 not 2**28 as threshold for log (2x) formula.
* math/auto-libm-test-in: Add more tests of acosh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Various x86 / x86_64 versions of scalb / scalbf / scalbl produce
spurious "invalid" exceptions for (qNaN, -Inf) arguments, because this
is wrongly handled like (+/-Inf, -Inf) which *should* raise such an
exception. (In fact the NaN case of the code determining whether to
quietly return a zero or a NaN for second argument -Inf was
accidentally dead since the code had been made to return a NaN with
exception.) This patch fixes the code to do the proper test for an
infinity as distinct from a NaN.
(Since the existing code does nothing to distinguish qNaNs and sNaNs
here, this patch doesn't either. If in future we systematically
implement proper sNaN semantics following TS 18661-1:2014, there will
be lots of bugs to address - Thomas found lots of issues with his
patch <https://sourceware.org/ml/libc-ports/2013-04/msg00008.html> to
add SNaN tests (which never went in and would now require significant
reworking).)
Tested for x86_64 and x86. Committed.
[BZ #16783]
* sysdeps/i386/fpu/e_scalb.S (__ieee754_scalb): Do not handle
arguments (NaN, -Inf) the same as (+/-Inf, -Inf).
* sysdeps/i386/fpu/e_scalbf.S (__ieee754_scalbf): Likewise.
* sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
* sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
* math/libm-test.inc (scalb_test_data): Add more tests.
Both open and openat load their last argument 'mode' lazily, using
va_arg() only if O_CREAT is found in oflag. This is wrong, mode is also
necessary if O_TMPFILE is in oflag.
By chance on x86_64, the problem wasn't evident when using O_TMPFILE
with open, as the 3rd argument of open, even when not loaded with
va_arg, is left untouched in RDX, where the syscall expects it.
However, openat was not so lucky, and O_TMPFILE couldn't be used: mode
is the 4th argument, in RCX, but the syscall expects its 4th argument in
a different register than the glibc wrapper, in R10.
Introduce a macro __OPEN_NEEDS_MODE (oflag) to test if either O_CREAT or
O_TMPFILE is set in oflag.
Tested on Linux x86_64.
[BZ #17523]
* io/fcntl.h (__OPEN_NEEDS_MODE): New macro.
* io/bits/fcntl2.h (open): Use it.
(openat): Likewise.
* io/open.c (__libc_open): Likewise.
* io/open64.c (__libc_open64): Likewise.
* io/open64_2.c (__open64_2): Likewise.
* io/open_2.c (__open_2): Likewise.
* io/openat.c (__openat): Likewise.
* io/openat64.c (__openat64): Likewise.
* io/openat64_2.c (__openat64_2): Likewise.
* io/openat_2.c (__openat_2): Likewise.
* sysdeps/mach/hurd/open.c (__libc_open): Likewise.
* sysdeps/mach/hurd/openat.c (__openat): Likewise.
* sysdeps/posix/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/dl-openat64.c (openat64): Likewise.
* sysdeps/unix/sysv/linux/generic/open.c (__libc_open): Likewise.
(__open_nocancel): Likewise.
* sysdeps/unix/sysv/linux/generic/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/openat.c (__OPENAT): Likewise.
* sysdeps/unix/sysv/linux/s390/lowlevellock.h: Include
<sysdeps/nptl/lowlevellock.h> and remove macros and
functions that are now defined there.
(SYS_futex): Remove.
(lll_compare_and_swap): Remove.
* sysdeps/s390/bits/atomic.h (atomic_exchange_acq): Define.
This patch fixes bug 15319, missing underflows from atan / atan2 when
the result of atan is very close to its small argument (or that of
atan2 is very close to the ratio of its arguments, which may be an
exact division).
The usual approach of doing an underflowing computation if the
computed result is subnormal is followed. For 32-bit x86, there are
extra complications: the inline __ieee754_atan2 in bits/mathinline.h
needs to be disabled for float and double because other libm functions
using it generally rely on getting proper underflow exceptions from
it, while the out-of-line functions have to remove excess range and
precision from the underflowing result so as to return an exact 0 in
the case where errno should be set for underflow to 0. (The failures
I saw without that are similar to those Carlos reported for other
functions, where I haven't seen a response to
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>
confirming if my diagnosis is correct. Arguably all libm functions
with float and double returns should remove excess range and
precision, but that's a separate matter.)
The x86_64 long double case reported in a comment in bug 15319 is not
a bug (it's an argument of LDBL_MIN, and x86_64 is an after-rounding
architecture so the correct IEEE result is not to raise underflow in
the given rounding mode, in addition to treating the result as an
exact LDBL_MIN being within the newly clarified documentation of
accuracy goals). I'm presuming that the fpatan instruction can be
trusted to raise appropriate exceptions when the (long double) result
underflows (after rounding) and so no changes are needed for x86 /
x86_64 long double functions here; empirically this is the case for
the cases covered in the testsuite, on my system.
Tested for x86_64, x86, powerpc and mips64. Only 32-bit x86 needs
ulps updates (for the changes to inlines meaning some functions no
longer get excess precision from their __ieee754_atan2* calls).
[BZ #15319]
* sysdeps/i386/fpu/e_atan2.S (dbl_min): New object.
(MO): New macro.
(__ieee754_atan2): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/e_atan2f.S (flt_min): New object.
(MO): New macro.
(__ieee754_atan2f): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/s_atan.S (dbl_min): New object.
(MO): New macro.
(__atan): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/i386/fpu/s_atanf.S (flt_min): New object.
(MO): New macro.
(__atanf): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <float.h> and
<math.h>.
(__ieee754_atan2): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/s_atan.c: Include <float.h> and
<math_private.h>.
(atan): Force underflow exception for results with small absolute
value.
* sysdeps/ieee754/flt-32/s_atanf.c: Include <float.h>.
(__atanf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_atanl.c: Include <float.h> and
<math.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Include <float.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/x86/fpu/bits/mathinline.h
[!__SSE2_MATH__ && !__x86_64__ && __LIBC_INTERNAL_MATH_INLINES]
(__ieee754_atan2): Only define inline for long double.
* sysdeps/x86_64/fpu/multiarch/e_atan2.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 15319. Add more tests of atan2.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (casin_test_data): Do not mark underflow
exceptions as possibly missing for bug 15319.
(casinh_test_data): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
posix_spawn (a standard POSIX function) brings in a use of getrlimit64
(not a standard POSIX function). This patch fixes this by using
__getrlimit64 and making getrlimit64 a weak alias.
This is more complicated than some such changes because of files that
define getrlimit64 in their own way using symbol versioning after
including the main sysdeps/unix/sysv/linux/getrlimit64.c with a
getrlimit macro defined. There are various existing patterns for such
cases in glibc; the one I've used here is that a getrlimit64 macro
disables the weak_alias / libc_hidden_weak calls, leaving it to the
including file to define the getrlimit64 name in whatever way is
appropriate.
Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by this patch.
[BZ #17991]
* include/sys/resource.h (__getrlimit64): Declare. Use
libc_hidden_proto.
* resource/getrlimit64.c (getrlimit64): Rename to __getrlimit64
and define as weak alias of __getrlimit64. Use libc_hidden_weak.
* sysdeps/posix/spawni.c (__spawni): Call __getrlimit64 instead of
getrlimit64.
* sysdeps/unix/sysv/linux/getrlimit64.c (getrlimit64): Rename to
__getrlimit64.
[!getrlimit64] (getrlimit64): Define as weak alias of
__getrlimit64. Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/i386/getrlimit64.c (getrlimit64): Define
using __getrlimit64 not __new_getrlimit64.
(__GI_getrlimit64): Likewise.
* sysdeps/unix/sysv/linux/mips/getrlimit64.c (getrlimit64):
Likewise.
(__GI_getrlimit64): Likewise.
(__old_getrlimit64): Use __getrlimit64 not __new_getrlimit64.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/syscalls.list
(getrlimit): Add __getrlimit64 alias.
* sysdeps/unix/sysv/linux/wordsize-64/syscalls.list (getrlimit):
Likewise.
* conform/Makefile (test-xfail-XOPEN2K/spawn.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/spawn.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/spawn.h/linknamespace): Likewise.
ia64 seems to use the same implementation of low-level locks as the
generic Linux lowlevellock.h. The futex syscalls are somewhat
different, but Roland thought it shouldn't matter. Note that the futex
calls are on the slow path always (except for PI mutexes).
Removing the custom low-level lock implementation will make further
refactoring easier, for example adding proper error checking to futex
operations.
Various remquo implementations produce a zero remainder with the wrong
sign (a zero remainder should always have the sign of the first
argument, as specified in IEEE 754) in round-downward mode, resulting
from the sign of 0 - 0. This patch checks for zero results and fixes
their sign accordingly.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17987]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Ensure sign of
zero result does not depend on the sign resulting from
subtraction.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
Various remquo implementations, when computing the last three bits of
the quotient, have spurious overflows when 4 times the second argument
to remquo overflows. These overflows can in turn cause bad results in
rounding modes where that overflow results in a finite value. This
patch adds tests to avoid the problem multiplications in cases where
they would overflow, similar to those that control an earlier
multiplication by 8.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17978]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Do not form
products 4 * y and 2 * y where those would overflow.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
I see an error
../sysdeps/mips/memcpy.S:209:68: error: "_ABIO64" is not defined [-Werror=undef]
#if defined(_MIPS_SIM) && ((_MIPS_SIM == _ABIO32) || (_MIPS_SIM == _ABIO64))
^
cc1: some warnings being treated as errors
in MIPS builds. This patch arranges for _ABIO64 to be defined with
the same value as GCC uses when building for O64 (the ABI itself isn't
supported by glibc, but defining the macro seems the simplest way of
avoiding the error in code that may be shared with other C libraries).
* sysdeps/mips/sgidefs.h [!_ABIO64] (_ABIO64): New macro.
I see an error
../sysdeps/mips/strcmp.S:25:7: error: "_COMPILING_NEWLIB" is not defined [-Werror=undef]
#elif _COMPILING_NEWLIB
^
cc1: some warnings being treated as errors
in MIPS builds. (This is with GCC 4.9; it's possible that the DR#412
change in GCC 5 - see
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60570> - means that
-Wundef diagnostics no longer occur for #elif conditions where a
previous group's condition was true, just as with other errors there.)
This patch duly adjusts the conditionals to test whether
_COMPILING_NEWLIB is defined.
* sysdeps/mips/memcpy.S [_COMPILING_NEWLIB]: Change condition to
[defined _COMPILING_NEWLIB].
* sysdeps/mips/memset.S [_COMPILING_NEWLIB]: Likewise.
* sysdeps/mips/strcmp.S [_COMPILING_NEWLIB]: Likewise.
I see an error
In file included from ../sysdeps/mips/include/sys/asm.h:20:0,
from ../sysdeps/mips/start.S:39:
../sysdeps/mips/sys/asm.h:421:5: error: "__mips_isa_rev" is not defined [-Werror=undef]
#if __mips_isa_rev < 6
^
cc1: some warnings being treated as errors
in MIPS builds. As sys/asm.h is an installed header, it seems better
to test for !defined __mips_isa_rev here, instead of defining it to 0
as done in sysdeps/unix/mips/sysdep.h, to avoid perturbing any code
outside glibc that tests whether __mips_isa_rev is defined; this patch
does so.
* sysdeps/mips/sys/asm.h [__mips_isa_rev < 6]: Change condition to
[!defined __mips_isa_rev || __mips_isa_rev < 6].
Remove IA64 PAGE_SIZE related macros as PAGE_SIZE is not defined.
Also remove macros that are only used for BFD's trad-core support
which is not relavant for IA64 according to the thread starting
here:
https://sourceware.org/ml/libc-ports/2013-11/msg00028.html
This patch is neither built nor tested but is equivalent to a MIPS
patch for the same fix.
The dbl-64/wordsize-64 remquo implementation follows similar logic to
various other implementations, but where that logic computes some
absolute values, it wrongly uses a previously computed bit-pattern for
the absolute value of the first argument, where actually it needs the
absolute value of the first argument mod 8 times the second. This
patch fixes it to compute the correct absolute value.
The integer quotient result of remquo is only specified mod 8
(including its sign); architecture-specific versions may well vary in
what results they give for higher bits of that result (and indeed bug
17569 gives an example correct result from __builtin_remquo giving 9
for that result, where the particular glibc implementation used in
that bug report would give 1 after this fix). Thus, this patch adapts
the tests of remquo to test that result only mod 8, to allow for such
variation when tests with higher quotient are included.
Tested for x86_64 and x86.
[BZ #17569]
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Compute absolute value of x as modified by fmod, not original
value of x.
* math/libm-test.inc (RUN_TEST_ffI_f1): Rename to
RUN_TEST_ffI_f1_mod8. Check extra return value mod 8.
(RUN_TEST_LOOP_ffI_f1): Rename to RUN_TEST_LOOP_ffI_f1_mod8. Call
RUN_TEST_ffI_f1_mod8.
(remquo_test_data): Add more tests.
Similarly to sqrt in
<https://sourceware.org/ml/libc-alpha/2015-02/msg00353.html>, the
powerpc sqrtf implementation for when _ARCH_PPCSQ is not defined also
relies on a * b + c being contracted into a fused multiply-add.
Although this contraction is not explicitly disabled for e_sqrtf.c, it
still seems appropriate to make the file explicit about its
requirements by using __builtin_fmaf; this patch does so.
Furthermore, it turns out that doing so fixes the observed inaccuracy
and missing exceptions (that is, that without explicit __builtin_fmaf
usage, it was not being compiled as intended).
Tested for powerpc32 (hard float).
[BZ #17967]
* sysdeps/powerpc/fpu/e_sqrtf.c (__slow_ieee754_sqrtf): Use
__builtin_fmaf instead of relying on contraction of a * b + c.
As Adhemerval noted in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00451.html>, the
powerpc sqrt implementation for when _ARCH_PPCSQ is not defined is
inaccurate in some cases.
The problem is that this code relies on fused multiply-add, and relies
on the compiler contracting a * b + c to get a fused operation. But
sysdeps/ieee754/dbl-64/Makefile disables contraction for e_sqrt.c,
because the implementation in that directory relies on *not* having
contracted operations.
While it would be possible to arrange makefiles so that an earlier
sysdeps directory can disable the setting in
sysdeps/ieee754/dbl-64/Makefile, it seems a lot cleaner to make the
dependence on fused operations explicit in the .c file. GCC 4.6
introduced support for __builtin_fma on powerpc and other
architectures with such instructions, so we can rely on that; this
patch duly makes the code use __builtin_fma for all such fused
operations.
Tested for powerpc32 (hard float).
2015-02-12 Joseph Myers <joseph@codesourcery.com>
[BZ #17964]
* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
__builtin_fma instead of relying on contraction of a * b + c.
This patch fixes the remaining part of bug 16560, spurious underflows
from exp2 of arguments close to 0 (when the result is close to 1, so
should not underflow), by just using 1+x instead of a more complicated
calculation when the argument is sufficiently small.
Tested for x86_64, x86 and mips64.
[BZ #16560]
* math/e_exp2l.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine
and redefine.
(__ieee754_exp2l): Do not multiply small fractional parts by
M_LN2l.
* sysdeps/i386/fpu/e_exp2l.S (__ieee754_exp2l): Just add 1 to
small argument.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* sysdeps/x86_64/fpu/e_exp2l.S (__ieee754_exp2l): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
This patch optimizes strncpy for power7 for unaligned source or
destination address. The source or destination address is aligned
to doubleword and data is shifted based on the alignment and
added with the previous loaded data to be written as a doubleword.
For each load, cmpb instruction is used for faster null check.
The new optimization shows 10 to 70% of performance improvement
for longer string though it does not show big difference on string
size less than 16 due to additional checks.Hence this new algorithm
is restricted to string greater than 16.
This patch makes sincos set errno to EDOM when passed an infinity,
similarly to sin and cos.
Tested for x86_64, x86, powerpc and mips64. I don't know if the
architecture-specific implementations for ia64 and m68k might need
corresponding fixes.
2015-02-11 Joseph Myers <joseph@codesourcery.com>
[BZ #15467]
* sysdeps/ieee754/dbl-64/s_sincos.c: Include <errno.h>.
(__sincos): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/flt-32/s_sincosf.c: Include <errno.h>.
(SINCOSF_FUNC): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128ibm/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-96/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* math/libm-test.inc (sincos_test_data): Test errno setting.
As noted in
<https://sourceware.org/ml/libc-alpha/2014-10/msg00369.html>, soft-fp
sysdeps subdirectories (and more generally, subdirectories where
sysdeps/foo/Implies contains foo/bar) are unnecessary and should be
eliminated. This patch does so for MIPS.
Tested for MIPS64 (all three ABIs, soft-float) that installed stripped
shared libraries are unchanged by this patch.
* sysdeps/mips/soft-fp/sfp-machine.h: Move to ....
* sysdeps/mips/mips32/sfp-machine.h: ... here.
* sysdeps/mips/mips64/soft-fp/Makefile: Move to ....
* sysdeps/mips/mips64/Makefile: ... here.
* sysdeps/mips/mips64/soft-fp/e_sqrtl.c: Move to ....
* sysdeps/mips/mips64/e_sqrtl.c: ... here.
* sysdeps/mips/mips64/soft-fp/sfp-machine.h: Move to ....
* sysdeps/mips/mips64/sfp-machine.h: ... here.
* sysdeps/mips/mips32/Implies: Remove mips/soft-fp.
* sysdeps/mips/mips64/n32/Implies: Remove mips/mips64/soft-fp.
* sysdeps/mips/mips64/n64/Implies: Likewise.
Current minimum binutils supported (2.22) has ".machine altivec" support
as default, so there is no need to add a configure check for such
functionality. This patches removes the configure checks for it.
This patch cleanup some multiarch code related to memmmove
optimization. Initial IFUNC support added specialized wordcopy
symbols which turned in local IFUNC calls used by memmove default
implementation. The patch removes the internal IFUNC for wordcopy
symbols and uses local branches in the memmmove optimization instead.
This patch cleanup some multiarch code related to memmmove
optimization. Initial IFUNC support added specialized wordcopy
symbols which turned in local IFUNC calls used by memmove default
implementation.
This change by removing then and used the optimized memmove instead
for supported chips.
This patch simplify the default bcopy symbol for powerpc64 by just using
memmove instead of implementing using the default bcopy. Since the
symbol is deprecated, it trades speed by code size.
This reverts part of the previous commit to refactor pthread.h.
The refactoring must be done by having pthread.h include arch
bits headers, not the other way around. Then hppa provides the
arch bits header. For now we synchronzie again with pthread.h
and include the entire contents in the hppa copy.
Update all translations.
Update contributions in the manual.
Update installation notes with information about newest working tools.
Reconfigure using exactly autoconf 2.69.
Regenerate INSTALL.
(1) Fix warnings.
This is a bulk update to fix all the warnings that were causing
build failures with -Werror on hppa.
The most egregious problems are in dl-fptr.c which needs to be
entirely rewritten, thus I've used -Wno-error for that.
(2) Fix conformance errors.
The sysdep.c file had __syscall_error and syscall in one file
which caused conformance issues by including syscall when
__syscall_error was linked to. The fix is obviously to split
the file and use syscall.c to implement syscall.
* sysdeps/sparc/sparc32/bits/atomic.h
(__sparc32_atomic_do_unlock24): Put the memory barrier before the
unlock not after it.
(__v9_compare_and_exchange_val_32_acq): Use unions to avoid getting
volatile register usage warnings from the compiler.
memcpy with unaligned 256-bit AVX register loads/stores are slow on older
processorsl like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load
and sets it only when AVX2 is available.
[BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load):
New.
(index_AVX_Fast_Unaligned_Load): Likewise.
(HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the
bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace
HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
This is because of alignment issues in the sem_t support.
tilegx32 does in fact support 64-bit atomics and we will need
to revisit this after the 2.21 freeze.