Commit Graph

12536 Commits

Author SHA1 Message Date
Florian Weimer
7854ebf8ed Linux: Use in-tree copy of SO_ constants for !__USE_MISC [BZ #24532]
The kernel changes for a 64-bit time_t on 32-bit architectures
resulted in <asm/socket.h> indirectly including <linux/posix_types.h>.
The latter is not namespace-clean for the POSIX version of
<sys/socket.h>.

This issue has persisted across several Linux releases, so this commit
creates our own copy of the SO_* definitions for !__USE_MISC mode.

The new test socket/tst-socket-consts ensures that the copy is
consistent with the kernel definitions (which vary across
architectures).  The test is tricky to get right because CPPFLAGS
includes include/libc-symbols.h, which in turn defines _GNU_SOURCE
unconditionally.

Tested with build-many-glibcs.py.  I verified that a discrepancy in
the definitions actually results in a failure of the
socket/tst-socket-consts test.
2019-07-24 10:59:34 +02:00
Florian Weimer
1f7097d09c Linux: Update syscall-names.list to Linux 5.2
This adds the system call names fsconfig, fsmount, fsopen, fspick,
move_mount, open_tree.

Tested with build-many-glibcs.py.
2019-07-19 08:53:04 +02:00
Adhemerval Zanella
2ab9ad5735 nptl: Add POSIX-proposed _clock functions to hppa pthread.h
The pthread _clock functions that were recently added to nptl need to be
declared in hppa's pthread.h too. After this change, the function
declaration part of sysdeps/nptl/pthread.h and
sysdeps/unix/sysv/linux/hppa/pthread.h are identical.

	* sysdeps/unix/sysv/linux/hppa/pthread.h: Add declarations of
	functions recently added to sysdeps/nptl/pthread.h:
	pthread_mutex_clocklock, pthread_rwlock_clockrdlock,
	pthread_rwlock_clockwrlock and pthread_cond_clockwait.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-18 11:24:36 -03:00
Mike Crowe
4a8f6d3155 nptl: Remove unnecessary forwarding of pthread_cond_clockwait from libc
In afe4de7d28, I added forwarding functions
from libc to libpthread for __pthread_cond_clockwait and
pthread_cond_clockwait to mirror those for pthread_cond_timedwait. These
are unnecessary[1], since these functions aren't (yet) being called from
within libc itself. Let's remove them.

      * nptl/forward.c: Remove unnecessary __pthread_cond_clockwait and
	pthread_cond_clockwait forwarding functions.  There are no internal
	users, so it is unnecessary to expose these functions in libc.so.
	* sysdeps/nptl/pthread-functions.h (pthread_functions): Remove
	unnecessary ptr___pthread_cond_clockwait member.
	* nptl/nptl-init.c (pthread_functions): Remove assignment of
	removed member.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

[1] https://sourceware.org/ml/libc-alpha/2017-10/msg00082.html
2019-07-18 11:24:33 -03:00
Mike Crowe
1ff1373b33 nptl: Remove futex_supports_exact_relative_timeouts
The only implementation of futex_supports_exact_relative_timeouts always
returns true. Let's remove it and all its callers.

	* nptl/pthread_cond_wait.c: (__pthread_cond_clockwait): Remove code
	that is only useful if futex_supports_exact_relative_timeouts ()
	returns false.
	* nptl/pthread_condattr_setclock.c: (pthread_condattr_setclock):
	Likewise.
	* sysdeps/nptl/futex-internal.h: Remove comment about relative
	timeouts potentially being imprecise since it's no longer true.
	Remove declaration of futex_supports_exact_relative_timeouts.
	* sysdeps/unix/sysv/linux/futex-internal.h: Remove implementation
	of futex_supports_exact_relative_timeouts.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:25 +00:00
Mike Crowe
9d20e22e46 nptl: Add POSIX-proposed pthread_mutex_clocklock
Add POSIX-proposed pthread_mutex_clocklock function that works like
pthread_mutex_timedlock but takes a clockid parameter to measure the
abstime parameter against.

	* sysdeps/nptl/pthread.h: Add pthread_mutex_clocklock.
	* nptl/DESIGN-systemtap-probes.txt: Likewise.
	* nptl/pthread_mutex_timedlock.c
	(__pthread_mutex_clocklock_common): Rename from
	__pthread_mutex_timedlock and add clockid parameter. Pass this
	parameter to lll_clocklock and lll_clocklock_elision in place of
	CLOCK_REALTIME. (__pthread_mutex_clocklock): New function to add
	LIBC_PROBE and validate clockid parameter before calling
	__pthread_mutex_clocklock_common. (__pthread_mutex_timedlock): New
	implementation to add LIBC_PROBE and calls
	__pthread_mutex_clocklock_common passing CLOCK_REALTIME as the
	clockid.
	* nptl/Makefile: Add tst-mutex11.c.
	* nptl/tst-abstime.c (th): Add tests for pthread_mutex_clocklock.
	* nptl/tst-mutex11.c: New tests for passing invalid and unsupported
	clockid parameters to pthread_mutex_clocklock.
	* nptl/tst-mutex5.c (do_test_clock): Rename from do_test and take
	clockid parameter to indicate which clock to be used. Call
	pthread_mutex_timedlock or pthread_mutex_clocklock as appropriate.
	(do_test): Call do_test_clock to separately test
	pthread_mutex_timedlock, pthread_mutex_clocklock(CLOCK_REALTIME)
	and pthread_mutex_clocklock(CLOCK_MONOTONIC).
	* nptl/tst-mutex9.c: Likewise.
	* nptl/Versions (GLIBC_2.30): Add pthread_mutex_clocklock.
	* sysdeps/unix/sysv/linux/aarch64/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/alpha/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/arm/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/csky/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/hppa/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/i386/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/ia64/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/nios2/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sh/libpthread.abilist (GLIBC_2.30):
	Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libpthread.abilist
	(GLIBC_2.30): Likewise.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:25 +00:00
Mike Crowe
59213094c8 nptl: Rename lll_timedlock to lll_clocklock and add clockid parameter
Rename lll_timedlock to lll_clocklock and add clockid
parameter to indicate the clock that the abstime parameter should
be measured against in preparation for adding
pthread_mutex_clocklock.

The name change mirrors the naming for the exposed pthread functions:

 timed => absolute timeout measured against CLOCK_REALTIME (or clock
          specified by attribute in the case of pthread_cond_timedwait.)

 clock => absolute timeout measured against clock specified in preceding
          parameter.

	* sysdeps/nptl/lowlevellock.h (lll_clocklock): Rename from
	lll_timedlock and add clockid parameter. (__lll_clocklock): Rename
	from __lll_timedlock and add clockid parameter.
	* sysdeps/unix/sysv/linux/sparc/lowlevellock.h (lll_clocklock):
	Likewise.
	* nptl/lll_timedlock_wait.c (__lll_clocklock_wait): Rename from
	__lll_timedlock_wait and add clockid parameter. Use __clock_gettime
	rather than __gettimeofday so that clockid can be used. This means
	that conversion from struct timeval is no longer required.
	* sysdeps/sparc/sparc32/lowlevellock.c (lll_clocklock_wait):
	Likewise.
	* sysdeps/sparc/sparc32/lll_timedlock_wait.c: Update comment to
	refer to __lll_clocklock_wait rather than __lll_timedlock_wait.
	* nptl/pthread_mutex_timedlock.c (lll_clocklock_elision): Rename
	from lll_timedlock_elision, add clockid parameter and use
	meaningful names for other parameters. (__pthread_mutex_timedlock):
	Pass CLOCK_REALTIME where necessary to lll_clocklock and
	lll_clocklock_elision.
	* sysdeps/unix/sysv/linux/powerpc/lowlevellock.h
	(lll_clocklock_elision): Rename from lll_timedlock_elision and add
	clockid parameter. (__lll_clocklock_elision): Rename from
	__lll_timedlock_elision and add clockid parameter.
	* sysdeps/unix/sysv/linux/s390/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/x86/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-timed.c
	(__lll_lock_elision): Call __lll_clocklock_elision rather than
	__lll_timedlock_elision. (EXTRAARG): Add clockid parameter.
	(LLL_LOCK): Likewise.
	* sysdeps/unix/sysv/linux/s390/elision-timed.c: Likewise.
	* sysdeps/unix/sysv/linux/x86/elision-timed.c: Likewise.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:25 +00:00
Mike Crowe
e996fa72a9 nptl: Add POSIX-proposed pthread_rwlock_clockrdlock & pthread_rwlock_clockwrlock
Add:
 int pthread_rwlock_clockrdlock (pthread_rwlock_t *rwlock,
                                 clockid_t clockid,
                                 const struct timespec *abstime)
and:
 int pthread_rwlock_clockwrlock (pthread_rwlock_t *rwlock,
                                 clockid_t clockid,
                                 const struct timespec *abstime)

which behave like pthread_rwlock_timedrdlock and
pthread_rwlock_timedwrlock respectively, except they always measure
abstime against the supplied clockid. The functions currently support
CLOCK_REALTIME and CLOCK_MONOTONIC and return EINVAL if any other
clock is specified.

	* sysdeps/nptl/pthread.h: Add pthread_rwlock_clockrdlock and
	pthread_wrlock_clockwrlock.
	* nptl/Makefile: Build pthread_rwlock_clockrdlock.c and
	pthread_rwlock_clockwrlock.c.
	* nptl/pthread_rwlock_clockrdlock.c: Implement
	pthread_rwlock_clockrdlock.
	* nptl/pthread_rwlock_clockwrlock.c: Implement
	pthread_rwlock_clockwrlock.
	* nptl/pthread_rwlock_common.c (__pthread_rwlock_rdlock_full): Add
	clockid parameter and verify that it indicates a supported clock on
	entry so that we fail even if it doesn't end up being used. Pass
	that clock on to futex_abstimed_wait when necessary.
	(__pthread_rwlock_wrlock_full): Likewise.
	* nptl/pthread_rwlock_rdlock.c: (__pthread_rwlock_rdlock): Pass
	CLOCK_REALTIME to __pthread_rwlock_rdlock_full even though it won't
	be used because there's no timeout.
	* nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock): Pass
	CLOCK_REALTIME to __pthread_rwlock_wrlock_full even though it won't
	be used because there is no timeout.
	* nptl/pthread_rwlock_timedrdlock.c (pthread_rwlock_timedrdlock):
	Pass CLOCK_REALTIME to __pthread_rwlock_rdlock_full since abstime
	uses that clock.
	* nptl/pthread_rwlock_timedwrlock.c (pthread_rwlock_timedwrlock):
	Pass CLOCK_REALTIME to __pthread_rwlock_wrlock_full since abstime
	uses that clock.
	* sysdeps/unix/sysv/linux/aarch64/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/alpha/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/arm/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/csky/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/hppa/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/i386/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/ia64/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/nios2/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sh/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* nptl/tst-abstime.c (th): Add pthread_rwlock_clockrdlock and
	pthread_rwlock_clockwrlock timeout tests to match the existing
	pthread_rwlock_timedrdloock and pthread_rwlock_timedwrlock tests.
	* nptl/tst-rwlock14.c (do_test): Likewise.
	* nptl/tst-rwlock6.c Invent verbose_printf macro, and use for
	ancillary output throughout. (tf): Accept thread_args structure so
	that rwlock, a clockid and function name can be passed to the
	thread. (do_test_clock): Rename from do_test. Accept clockid
	parameter to specify test clock. Use the magic clockid value of
	CLOCK_USE_TIMEDLOCK to indicate that pthread_rwlock_timedrdlock and
	pthread_rwlock_timedwrlock should be tested, otherwise pass the
	specified clockid to pthread_rwlock_clockrdlock and
	pthread_rwlock_clockwrlock. Use xpthread_create and xpthread_join.
	(do_test): Call do_test_clock to test each clockid in turn.
	* nptl/tst-rwlock7.c: Likewise.
	* nptl/tst-rwlock9.c (writer_thread, reader_thread): Accept
	thread_args structure so that the (now int) thread number, the
	clockid and the function name can be passed to the thread.
	(do_test_clock): Renamed from do_test. Pass the necessary
	thread_args when creating the reader and writer threads. Use
	xpthread_create and xpthread_join.
	(do_test): Call do_test_clock to test each clockid in turn.
	* manual/threads.texi: Add documentation for
	pthread_rwlock_clockrdlock and pthread_rwlock_clockwrclock.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:24 +00:00
Mike Crowe
afe4de7d28 nptl: Add POSIX-proposed pthread_cond_clockwait
Add:

 int pthread_cond_clockwait (pthread_cond_t *cond,
                             pthread_mutex_t *mutex,
                             clockid_t clockid,
                             const struct timespec *abstime)

which behaves just like pthread_cond_timedwait except it always measures
abstime against the supplied clockid. Currently supports CLOCK_REALTIME
and
CLOCK_MONOTONIC and returns EINVAL if any other clock is specified.

Includes feedback from many others. This function was originally
proposed[1] as pthread_cond_timedwaitonclock_np, but The Austin Group
preferred the new name.

	* nptl/Makefile: Add tst-cond26 and tst-cond27
	* nptl/Versions (GLIBC_2.30): Add pthread_cond_clockwait
	* sysdeps/nptl/pthread.h: Likewise
	* nptl/forward.c: Add __pthread_cond_clockwait
	* nptl/forward.c: Likewise
	* nptl/pthreadP.h: Likewise
	* sysdeps/nptl/pthread-functions.h: Likewise
	* nptl/pthread_cond_wait.c (__pthread_cond_wait_common): Add
	clockid parameter and comment describing why we don't need to
	check
	its value. Use that value when calling
	futex_abstimed_wait_cancelable rather than reading the clock
	from
	the flags. (__pthread_cond_wait): Pass unused clockid parameter.
	(__pthread_cond_timedwait): Read clock from flags and pass it to
	__pthread_cond_wait_common. (__pthread_cond_clockwait): Add new
	function with weak alias from pthread_cond_clockwait.
	* sysdeps/mach/hurd/i386/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libpthread.abilist
	* (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/alpha/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/arm/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/csky/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/hppa/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/i386/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/ia64/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/nios2/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sh/libpthread.abilist (GLIBC_2.30):
	* Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* nptl/tst-cond11.c (run_test): Support testing
	pthread_cond_clockwait too by using a special magic
	CLOCK_USE_ATTR_CLOCK value to determine whether to call
	pthread_cond_timedwait or pthread_cond_clockwait. (do_test):
	Pass
	CLOCK_USE_ATTR_CLOCK for existing tests, and add new tests using
	all combinations of CLOCK_MONOTONIC and CLOCK_REALTIME.
	* ntpl/tst-cond26.c: New test for passing unsupported and
	* invalid
	clocks to pthread_cond_clockwait.
	* nptl/tst-cond27.c: Add test similar to tst-cond5.c, but using
	struct timespec and pthread_cond_clockwait.
	* manual/threads.texi: Document pthread_cond_clockwait. The
	* comment
	was provided by Carlos O'Donell.

[1] https://sourceware.org/ml/libc-alpha/2015-07/msg00193.html

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:24 +00:00
Mike Crowe
6615f77978 nptl: Add POSIX-proposed sem_clockwait
Add:

 int sem_clockwait (sem_t *sem, clockid_t clock, const struct timespec
*abstime)

which behaves just like sem_timedwait, but measures abstime against the
specified clock. Currently supports CLOCK_REALTIME and CLOCK_MONOTONIC
and sets errno == EINVAL if any other clock is specified.

	* nptl/sem_waitcommon.c (do_futex_wait, __new_sem_wait_slow): Add
	clockid parameters to indicate the clock which abstime should be
	measured against.
	* nptl/sem_timedwait.c (sem_timedwait), nptl/sem_wait.c
	(__new_sem_wait): Pass CLOCK_REALTIME as clockid to
	__new_sem_wait_slow.
	* nptl/sem_clockwait.c: New file to implement sem_clockwait based
	on sem_timedwait.c.
	* nptl/Makefile: Add sem_clockwait.c source file. Add CFLAGS for
	sem_clockwait.c to match those used for sem_timedwait.c.
	* sysdeps/pthread/semaphore.h: Add sem_clockwait.
	* nptl/Versions (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/alpha/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/arm/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/csky/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/hppa/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/i386/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/ia64/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/nios2/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sh/libpthread.abilist (GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libpthread.abilist
	(GLIBC_2.30): Likewise.
	* nptl/tst-sem17.c: Add new test for passing invalid clock to
	sem_clockwait.
	* nptl/tst-sem13.c, nptl/tst-sem5.c: Modify existing sem_timedwait
	tests to also test sem_clockwait.
	* manual/threads.texi: Document sem_clockwait.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:23 +00:00
Mike Crowe
99d01ffcc3 nptl: Add clockid parameter to futex timed wait calls
In preparation for adding POSIX clockwait variants of timedwait functions,
add a clockid_t parameter to futex_abstimed_wait functions and pass
CLOCK_REALTIME from all callers for the time being.

Replace lll_futex_timed_wait_bitset with lll_futex_clock_wait_bitset
which takes a clockid_t parameter rather than the magic clockbit.

	* sysdeps/nptl/lowlevellock-futex.h,
	sysdeps/unix/sysv/linux/lowlevellock-futex.h: Replace
	lll_futex_timed_wait_bitset with lll_futex_clock_wait_bitset that
	takes a clockid rather than a special clockbit.
	* sysdeps/nptl/lowlevellock-futex.h: Add
	lll_futex_supported_clockid so that client functions can check
	whether their clockid parameter is valid even if they don't
	ultimately end up calling lll_futex_clock_wait_bitset.
	* sysdeps/nptl/futex-internal.h,
	sysdeps/unix/sysv/linux/futex-internal.h
	(futex_abstimed_wait, futex_abstimed_wait_cancelable): Add
	clockid_t parameter to indicate which clock the absolute time
	passed should be measured against. Pass that clockid onto
	lll_futex_clock_wait_bitset. Add invalid clock as reason for
	returning -EINVAL.
	* sysdeps/nptl/futex-internal.h,
	sysdeps/unix/sysv/linux/futex-internal.h: Introduce
	futex_abstimed_supported_clockid so that client functions can check
	whether their clockid parameter is valid even if they don't
	ultimately end up calling futex_abstimed_wait.
	* nptl/pthread_cond_wait.c (__pthread_cond_wait_common): Remove
	code to calculate relative timeout for
	__PTHREAD_COND_CLOCK_MONOTONIC_MASK and just pass CLOCK_MONOTONIC
	or CLOCK_REALTIME as required to futex_abstimed_wait_cancelable.
	* nptl/pthread_rwlock_common (__pthread_rwlock_rdlock_full)
	(__pthread_wrlock_full), nptl/sem_waitcommon (do_futex_wait): Pass
	additional CLOCK_REALTIME to futex_abstimed_wait_cancelable.
	* nptl/pthread_mutex_timedlock.c (__pthread_mutex_timedlock):
	Switch to lll_futex_clock_wait_bitset and pass CLOCK_REALTIME

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-07-12 13:36:23 +00:00
Adhemerval Zanella
a008c76b56 posix: Fix large mmap64 offset for mips64n32 (BZ#24699)
The fix for BZ#21270 (commit 158d5fa0e1) added a mask to avoid offset larger
than 1^44 to be used along __NR_mmap2.  However mips64n32 users __NR_mmap,
as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such
x32, defines off_t being equal to off64_t).  This leads to use the same
mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum
offset it can use with mmap64.

This patch fixes by setting the high mask only for __NR_mmap2 usage. The
posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The
patch also change the test to check for an arch-specific header that defines
the maximum supported offset.

Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset
on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.

	[BZ #24699]
	* posix/tst-mmap-offset.c: Mention BZ #24699.
	(do_test_bz21270): Rename to do_test_large_offset and use
	mmap64_maximum_offset to check for maximum expected offset value.
	* sysdeps/generic/mmap_info.h: New file.
	* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
	* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
	__NR_mmap2 is used.
2019-07-10 16:52:50 -03:00
Szabolcs Nagy
30ba037546 aarch64: simplify the DT_AARCH64_VARIANT_PCS handling code
Remove unnecessary variant_pcs field: the dynamic tag can be checked
directly.

	* sysdeps/aarch64/dl-machine.h (elf_machine_runtime_setup): Remove the
	DT_AARCH64_VARIANT_PCS check.
	(elf_machine_lazy_rel): Use l_info[DT_AARCH64 (VARIANT_PCS)].
	* sysdeps/aarch64/linkmap.h (struct link_map_machine): Remove
	variant_pcs.
2019-07-10 15:28:00 +01:00
Paul A. Clarke
b5232c9f9e [powerpc] fenv_libc.h: protect use of __builtin_cpu_supports
Using __builtin_cpu_supports() requires support in GCC and Glibc.
My recent patch to fenv_libc.h added an unprotected use of
__builtin_cpu_supports().  Compilation of Glibc itself will fail
with a sufficiently new GCC and sufficiently old Glibc:

../sysdeps/powerpc/fpu/fegetexcept.c: In function ‘__fegetexcept’:
../sysdeps/powerpc/fpu/fenv_libc.h:52:20: error: builtin ‘__builtin_cpu_supports’ needs GLIBC (2.23 and newer) that exports hardware capability bits [-Werror]

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Fixes 3db85a9814.
2019-07-09 13:09:35 -05:00
Adhemerval Zanella
6ea21bfe43 powerpc: refactor logb{f,l}
The power7 logb implementation does not show a performance gain on
ISA 2.07+ chips with faster floating-point to GRP instructions
(currently POWER8 and POWER9).

This patch moves the POWER7 implementation to generic one and enables
it for POWER7.  It also add some cleanup to use inline floating-point
number instead of define them using static const.

The performance difference is for POWER9:

  - Without patch:
  "logb": {
   "subnormal": {
    "duration": 4.99202e+09,
    "iterations": 8.83662e+08,
    "max": 75.194,
    "min": 5.501,
    "mean": 5.64925
   },
   "normal": {
    "duration": 4.97063e+09,
    "iterations": 9.97094e+08,
    "max": 46.489,
    "min": 4.956,
    "mean": 4.98512
   }
  }

  - With patch:
  "logb": {
   "subnormal": {
    "duration": 4.97226e+09,
    "iterations": 9.92036e+08,
    "max": 77.209,
    "min": 4.892,
    "mean": 5.01218
   },
   "normal": {
    "duration": 4.96192e+09,
    "iterations": 1.07545e+09,
    "max": 12.361,
    "min": 4.593,
    "mean": 4.61382
   }
  }

The ifunc implementation is also enabled only for powerpc64.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/power7/fpu/s_logb.c: Move to ...
	* sysdeps/powerpc/fpu/s_logb.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ...
	* sysdeps/powerpc/fpu/s_logbf.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ...
	* sysdeps/powerpc/fpu/s_logbl.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_log* objects.
	(CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c,
	CFLAGS-s_logb-power7.c): New fule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:22:22 -03:00
Adhemerval Zanella
105f2ed368 math: Use wordsize-64 version for s_logb
- The resulting binary difference on 32 bits architecture is
    minimum.  On i686-linux-gnu (with architecture optimization
    routine removed) there is no different using logb benchtests

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Move to ...
	* sysdeps/ieee754/dbl-64/s_logb.c: ... here.  Add work around for
	powerpc32 integer 0 converting to -0.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:22:22 -03:00
Adhemerval Zanella
931c616eed powerpc: Refactor modf{f}
The modf{f} optimization is not an optimization for ISA 2.07+.  This
patch move the IFUNC for powerpc64 only, move the power5+ to generic
location, and include the generic implementation for ISA 2.07+.

The performance changes are based on modf benchtests:

  * POWER9 - ppc64
  "modf": {
   "": {
    "duration": 4.97057e+09,
    "iterations": 1.00688e+09,
    "max": 28.76,
    "min": 4.912,
    "mean": 4.9366
   }
  }
  * POWER9 - power5+
  "modf": {
   "": {
    "duration": 4.98291e+09,
    "iterations": 9.32818e+08,
    "max": 15.058,
    "min": 5.107,
    "mean": 5.34178
   }
  }

  * POWER8 - ppc64
   "modf": {
   "": {
    "duration": 5.05329e+09,
    "iterations": 8.38814e+08,
    "max": 518.051,
    "min": 5.79,
    "mean": 6.02433
   }
  }
  * POWER8 - power5+
  "modf": {
   "": {
    "duration": 5.05573e+09,
    "iterations": 8.35254e+08,
    "max": 63.141,
    "min": 5.873,
    "mean": 6.05293
   }
  }

  * POWER7 - ppc64
  "modf": {
   "": {
    "duration": 4.89818e+09,
    "iterations": 1.08408e+09,
    "max": 57.556,
    "min": 3.953,
    "mean": 4.51827
   }
  }
  * POWER7 - power5+
  "modf": {
   "": {
    "duration": 4.83789e+09,
    "iterations": 1.33409e+09,
    "max": 46.608,
    "min": 2.224,
    "mean": 3.62636
   }
  }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ...
	* sysdeps/powerpc/fpu/s_modf.c: ... here.  Add ISA 2.07 optimization.
	* sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ...
	* sysdeps/powerpc/fpu/s_modff.c: ... here.  Add ISA 2.07 optimization.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c:
	Adjust include.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls,
	sysdep_routines): Add s_modf* objects.
	(CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c,
	CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Movo
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: Move
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:22:22 -03:00
Adhemerval Zanella
69461d9896 powerpc: hypot refactor and optimization
The powerpc hypot is slight optimized by:

  - Commit 8df4e219e4, both isnan and isinf are always inlined and thus
    the check TEST_INF_NAN does not make sense anymore.  The generic
    check for POWER7 should be faster on all powerpc configuration.

  - The redundant check 'y > two60factor && (x / y) > two60' is removed.

Both changes leads to unrequired ifunc especialization for power7 and
thus they are removed.  Finally The code is also cleanup a bit by inlining
the constants floating points.

The performance changes using the hypot benchtests are:

  - POWER9 without patch:
    "hypot": {
     "overflow": {
      "duration": 4.98585e+09,
      "iterations": 4.84932e+08,
      "max": 46.551,
      "min": 10.229,
      "mean": 10.2815
     },
     "higher_two500": {
      "duration": 5.00192e+09,
      "iterations": 4.24843e+08,
      "max": 33.319,
      "min": 11.606,
      "mean": 11.7736
     },
     "subnormal": {
      "duration": 5.0075e+09,
      "iterations": 4.06792e+08,
      "max": 22.178,
      "min": 12.15,
      "mean": 12.3097
     },
     "less_two500": {
      "duration": 5.00685e+09,
      "iterations": 4.08772e+08,
      "max": 22.784,
      "min": 12.052,
      "mean": 12.2485
     },
     "default": {
      "duration": 5.06002e+09,
      "iterations": 4.09894e+08,
      "max": 20.648,
      "min": 11.874,
      "mean": 12.3447
     }
    }

  - POWER9 with patch:
    "hypot": {
     "overflow": {
      "duration": 4.91848e+09,
      "iterations": 7.28039e+08,
      "max": 47.958,
      "min": 6.436,
      "mean": 6.75579
     },
     "higher_two500": {
      "duration": 4.9359e+09,
      "iterations": 6.63376e+08,
      "max": 20.783,
      "min": 7.321,
      "mean": 7.44057
     },
     "subnormal": {
      "duration": 4.9479e+09,
      "iterations": 6.19772e+08,
      "max": 18.856,
      "min": 7.817,
      "mean": 7.98341
     },
     "less_two500": {
      "duration": 4.94275e+09,
      "iterations": 6.3889e+08,
      "max": 17.452,
      "min": 7.597,
      "mean": 7.73647
     },
     "default": {
      "duration": 5.03645e+09,
      "iterations": 5.70718e+08,
      "max": 18.904,
      "min": 8.55,
      "mean": 8.82476
     }
    }

  - POWER7 without patch
    "hypot": {
     "overflow": {
      "duration": 4.86637e+09,
      "iterations": 6.43196e+08,
      "max": 53.958,
      "min": 7.328,
      "mean": 7.56592
     },
     "higher_two500": {
      "duration": 4.99842e+09,
      "iterations": 3.11012e+08,
      "max": 78.227,
      "min": 15.696,
      "mean": 16.0715
     },
     "subnormal": {
      "duration": 4.99841e+09,
      "iterations": 3.08935e+08,
      "max": 51.392,
      "min": 15.983,
      "mean": 16.1795
     },
     "less_two500": {
      "duration": 5.00108e+09,
      "iterations": 2.99464e+08,
      "max": 73.247,
      "min": 16.416,
      "mean": 16.7001
     },
     "default": {
      "duration": 5.04645e+09,
      "iterations": 3.52608e+08,
      "max": 70.073,
      "min": 13.38,
      "mean": 14.3118
     }
    }

  - POWER7 with patch
    "hypot": {
     "overflow": {
      "duration": 4.80785e+09,
      "iterations": 8.00001e+08,
      "max": 66.262,
      "min": 5.888,
      "mean": 6.00981
     },
     "higher_two500": {
      "duration": 4.9859e+09,
      "iterations": 3.39449e+08,
      "max": 5148.44,
      "min": 14.539,
      "mean": 14.6882
     },
     "subnormal": {
      "duration": 4.9905e+09,
      "iterations": 3.28874e+08,
      "max": 64.905,
      "min": 14.971,
      "mean": 15.1745
     },
     "less_two500": {
      "duration": 4.99494e+09,
      "iterations": 3.19755e+08,
      "max": 103.696,
      "min": 14.972,
      "mean": 15.6211
     },
     "default": {
      "duration": 5.03951e+09,
      "iterations": 4.02502e+08,
      "max": 61.008,
      "min": 12.368,
      "mean": 12.5205
     }
    }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/e_hypot.c (two60, two500, two600, two1022,
	twoM500, twoM600, two60factor, pdnum): Remove.
	(TEST_INFO_NAN, GET_TW0_HIGH_WORD): Remove macro.
	(__ieee754_hypot): Replace static variables with inline definition,
	remove ununsed branches.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove e_hypot-* objects.
	(CFLAGS-e_hypot-power7.c, CFLAGS-e_hypotf-power7.c): Remove rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot-power7.c: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf-power7.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:21:15 -03:00
Vincent Chen
97274b1846 dl-vdso: Add LINUX_4 HASH CODE to support nds32 vdso mechanism
* sysdeps/unix/sysv/linux/dl-vdso.h: Add LINUX_4
	HASH code to support nds32 vdso mechanism.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2019-07-08 15:57:01 -03:00
Andreas Schwab
484b7af3cc riscv: restore ABI compatibility (bug 24484)
The contents of the dynamic section are part of the ABI, thus
DL_RO_DYN_SECTION cannot be changed.
2019-07-04 14:55:58 +02:00
Szabolcs Nagy
2b8a3c86e7 aarch64: new ifunc resolver ABI
Passing a second argument to the ifunc resolver allows accessing
AT_HWCAP2 values from the resolver. AArch64 will start using AT_HWCAP2
on linux because for ilp32 to remain compatible with lp64 ABI no more
than 32bit hwcap flags can be in AT_HWCAP which is already used up.

Currently the relocation ordering logic does not guarantee that ifunc
resolvers can call libc apis or access libc objects, so only the
resolver arguments and runtime environment dependent instructions can
be used to do the dispatch (this affects ifunc resolvers outside of
the libc).

Since ifunc resolver is target specific and only supposed to be
called by the dynamic linker, the call ABI can be changed in a
backward compatible way:

Old call ABI passed hwcap as uint64_t, new abi sets the
_IFUNC_ARG_HWCAP flag in the hwcap and passes a second argument
that's a pointer to an extendible struct. A resolver has to check
the _IFUNC_ARG_HWCAP flag before accessing the second argument.

The new sys/ifunc.h installed header has the definitions for the
new ABI, everything is in the implementation reserved namespace.

An alternative approach is to try to support extern calls from ifunc
resolvers such as getauxval, but that seems non-trivial
https://sourceware.org/ml/libc-alpha/2017-01/msg00468.html

	* sysdeps/aarch64/Makefile: Install sys/ifunc.h and add tests.
	* sysdeps/aarch64/dl-irel.h (elf_ifunc_invoke): Update to new ABI.
	* sysdeps/aarch64/sys/ifunc.h: New file.
	* sysdeps/aarch64/tst-ifunc-arg-1.c: New file.
	* sysdeps/aarch64/tst-ifunc-arg-2.c: New file.
2019-07-04 11:13:32 +01:00
Florian Weimer
41d6f74e6c nptl: Remove vfork IFUNC-based forwarder from libpthread [BZ #20188]
With commit f0b2132b35 ("ld.so:
Support moving versioned symbols between sonames [BZ #24741]"), the
dynamic linker will find the definition of vfork in libc and binds
a vfork reference to that symbol, even if the soname in the version
reference says that the symbol should be located in libpthread.

As a result, the forwarder (whether it's IFUNC-based or a duplicate
of the libc implementation) is no longer necessary.

On older architectures, a placeholder symbol is required, to make sure
that the GLIBC_2.1.2 symbol version does not go away, or is turned in
to a weak symbol definition by the link editor.  (The symbol version
needs to preserved so that the symbol coverage check in
elf/dl-version.c does not fail for old binaries.)

mips32 is an outlier: It defined __vfork@@GLIBC_2.2, but the
baseline is GLIBC_2.0.  Since there are other @@GLIBC_2.2 symbols,
the placeholder symbol is not needed there.
2019-07-02 16:51:13 +02:00
H.J. Lu
d0093c5cef Call _dl_open_check after relocation [BZ #24259]
This is a workaround for [BZ #20839] which doesn't remove the NODELETE
object when _dl_open_check throws an exception.  Move it after relocation
in dl_open_worker to avoid leaving the NODELETE object mapped without
relocation.

	[BZ #24259]
	* elf/dl-open.c (dl_open_worker): Call _dl_open_check after
	relocation.
	* sysdeps/x86/Makefile (tests): Add tst-cet-legacy-5a,
	tst-cet-legacy-5b, tst-cet-legacy-6a and tst-cet-legacy-6b.
	(modules-names): Add tst-cet-legacy-mod-5a, tst-cet-legacy-mod-5b,
	tst-cet-legacy-mod-5c, tst-cet-legacy-mod-6a, tst-cet-legacy-mod-6b
	and tst-cet-legacy-mod-6c.
	(CFLAGS-tst-cet-legacy-5a.c): New.
	(CFLAGS-tst-cet-legacy-5b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5a.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5c.c): Likewise.
	(CFLAGS-tst-cet-legacy-6a.c): Likewise.
	(CFLAGS-tst-cet-legacy-6b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6a.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6c.c): Likewise.
	($(objpfx)tst-cet-legacy-5a): Likewise.
	($(objpfx)tst-cet-legacy-5a.out): Likewise.
	($(objpfx)tst-cet-legacy-mod-5a.so): Likewise.
	($(objpfx)tst-cet-legacy-mod-5b.so): Likewise.
	($(objpfx)tst-cet-legacy-5b): Likewise.
	($(objpfx)tst-cet-legacy-5b.out): Likewise.
	(tst-cet-legacy-5b-ENV): Likewise.
	($(objpfx)tst-cet-legacy-6a): Likewise.
	($(objpfx)tst-cet-legacy-6a.out): Likewise.
	($(objpfx)tst-cet-legacy-mod-6a.so): Likewise.
	($(objpfx)tst-cet-legacy-mod-6b.so): Likewise.
	($(objpfx)tst-cet-legacy-6b): Likewise.
	($(objpfx)tst-cet-legacy-6b.out): Likewise.
	(tst-cet-legacy-6b-ENV): Likewise.
	* sysdeps/x86/tst-cet-legacy-5.c: New file.
	* sysdeps/x86/tst-cet-legacy-5a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-5b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5c.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6c.c: Likewise.
2019-07-01 12:23:22 -07:00
Paul A. Clarke
3db85a9814 powerpc: Use faster means to access FPSCR when possible in some cases
Using 'mffs' instruction to read the Floating Point Status Control Register
(FPSCR) can force a processor flush in some cases, with undesirable
performance impact.  If the values of the bits in the FPSCR which force the
flush are not needed, an instruction that is new to POWER9 (ISA version 3.0),
'mffsl' can be used instead.

Cases included:  get_rounding_mode, fegetround, fegetmode, fegetexcept.

	* sysdeps/powerpc/bits/fenvinline.h (__fegetround): Use
	__fegetround_ISA300() or __fegetround_ISA2() as appropriate.
	(__fegetround_ISA300) New.
	(__fegetround_ISA2) New.
	* sysdeps/powerpc/fpu_control.h (IS_ISA300): New.
	(_FPU_MFFS): Move implementation...
	(_FPU_GETCW): Here.
	(_FPU_MFFSL): Move implementation....
	(_FPU_GET_RC_ISA300): Here. New.
	(_FPU_GET_RC): Use _FPU_GET_RC_ISA300() or _FPU_GETCW() as appropriate.
	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status_ISA300): New.
	(fegetenv_status): New.
	* sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status()
	instead of fegetenv_register().
	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Likewise.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-06-30 08:40:44 -03:00
Florian Weimer
507f55c05f Linux: Use mmap instead of malloc in dirent/tst-getdents64
malloc dirties the entire allocated memory region due to M_PERTURB
in the test harness.
2019-06-28 14:05:02 +02:00
Tobias Klauser
589787f889 Replace PREPARE_VERSION macro with inline function
* sysdeps/unix/sysv/linux/dl-vdso.h (PREPARE_VERSION): Remove macro.
	(prepare_version_base): New helper inline function.
	(prepare_version): New macro replacing PREPARE_VERSION.
	(PREPARE_VERSION_KNOWN): Use prepare_version instead of PREPARE_VERSION.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-06-28 08:46:51 -03:00
Florian Weimer
5a659ccc0e io: Remove copy_file_range emulation [BZ #24744]
The kernel is evolving this interface (e.g., removal of the
restriction on cross-device copies), and keeping up with that
is difficult.  Applications which need the function should
run kernels which support the system call instead of relying on
the imperfect glibc emulation.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-06-28 09:39:21 +02:00
Florian Weimer
a620bd7935 Linux: Adjust gedents64 buffer size to int range [BZ #24740]
The kernel interface uses type unsigned int, but there is an
internal conversion to int, so INT_MAX is the correct limit.
Part of the buffer will always be unused, but this is not a
problem.  Such huge buffers do not occur in practice anyway.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-06-27 15:08:40 +02:00
H.J. Lu
d039da1c00 x86: Add sysdeps/x86/dl-lookupcfg.h
Since sysdeps/i386/dl-lookupcfg.h and sysdeps/x86_64/dl-lookupcfg.h are
identical, we can replace them with sysdeps/x86/dl-lookupcfg.h.

	* sysdeps/i386/dl-lookupcfg.h: Moved to ...
	* sysdeps/x86/dl-lookupcfg.h: Here.
	* sysdeps/x86_64/dl-lookupcfg.h: Removed.
2019-06-26 15:07:28 -07:00
Adhemerval Zanella
aa32f5bf0c powerpc: Use generic e_expf
Generic implementation is faster on both power8 and power9:

POWER9:
- sysdeps/ieee754/flt-32/e_expf.c
  "expf": {
   "workload-spec2017.wrf": {
    "duration": 5.1236e+09,
    "iterations": 7.53344e+08,
    "reciprocal-throughput": 5.9436,
    "latency": 7.65869,
    "max-throughput": 1.68248e+08,
    "min-throughput": 1.30571e+08
   }
  }

- sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S
  "expf": {
   "workload-spec2017.wrf": {
    "duration": 5.14429e+09,
    "iterations": 5.29248e+08,
    "reciprocal-throughput": 8.05372,
    "latency": 11.3863,
    "max-throughput": 1.24166e+08,
    "min-throughput": 8.78249e+07
   }
  }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove e_expf-power8 and expf-ppc64.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-power8.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-06-26 14:33:58 -03:00
Adhemerval Zanella
9d5d214e86 powerpc: Refactor powerpc32 lround/lroundf/llround/llroundf
This patches consolidates all the powerpc llround{f} implementations on
the generic sysdeps/powerpc/powerpc32/fpu/s_llround{f}.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/fpu/Makefile
	[$(subdir) == math] (CFLAGS-s_lround.c): New rule.
	* sysdeps/powerpc/powerpc32/fpu/s_llround.c (__llround): Add power5+
	and fctidz optimization.
	* sysdeps/powerpc/powerpc32/fpu/s_lround.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_lround.c: New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(CFLAGS-s_llround-power6.c, CFLAGS-s_llround-power5+.c,
	CFLAGS-s_llround-ppc32.c, CFLAGS-s_lround-ppc32.c,
	CFLAGS-s_lround-power5+.c): New rule.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c:
	New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-06-26 14:32:45 -03:00
Vincent Chen
a63b96fbdd Linux: Add nds32 specific syscalls to syscall-names.list
The nds32 creates two specific syscalls, udftrap and fp_udfiex_crtl, in
kernel v5.0 and v5.2, respectively. Add these two syscalls to
syscall-names.list.
2019-06-26 13:08:54 +02:00
Stefan Liebler
c89e669a70 S390: Regenerate ULPs.
The update is needed for builds with -O3 and -march>=z13.

ChangeLog:

	* sysdeps/s390/fpu/libm-test-ulps: Regenerated.
2019-06-25 15:14:17 +02:00
Tobias Klauser
85c748f9ff Add missing VDSO_{NAME,HASH}_* macros and use them for PREPARE_VERSION_KNOWN
Define all currently used Linux versions used for
PREPARE_VERSION{,_KNOWN} in sysdeps/unix/sysv/linux/dl-vdso.h and use
them instead of duplicating the versions and precomputed hashes across
architecture specific files.

	* sysdeps/unix/sysv/linux/aarch64/gettimeofday.c (INIT_ARCH): Use
	PREPARE_VERSION_KNOWN.
	* sysdeps/unix/sysv/linux/aarch64/init-first.c: Likewise.
	* sysdeps/unix/sysv/linux/dl-vdso.h (VDSO_NAME_LINUX_2_6_39): New
	define.
	(VDSO_HASH_LINUX_2_6_39): Likewise.
	(VDSO_NAME_LINUX_4_9): Likewise.
	(VDSO_HASH_LINUX_4_9): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/gettimeofday.c (INIT_ARCH): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/init-first.c
	(_libc_vdso_platform_setup): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/time.c (INIT_ARCH): Likewise.
	* sysdeps/unix/sysv/linux/s390/init-first.c (_libc_vdso_platform_setup):
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/init-first.c (__vdso_platform_setup):
	Likewise.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-06-21 10:43:22 -03:00
Paul A. Clarke
49bc41b642 [powerpc] add 'volatile' to asm
Add 'volatile' keyword to a few asm statements, to force the compiler
to generate the instructions therein.

Some instances were implicitly volatile, but adding keyword for consistency.

2019-06-19  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fenv_libc.h (relax_fenv_state): Add 'volatile'.
	* sysdeps/powerpc/fpu/fpu_control.h (__FPU_MFFS): Likewise.
	(__FPU_MFFSL): Likewise.
	(_FPU_SETCW): Likewise.
2019-06-19 20:20:02 -05:00
Stan Shebs
335c1007bf powerpc: Fix static-linked version of __ppc_get_timebase_freq [BZ #24640]
__ppc_get_timebase_freq() always return 0 when using static linked
glibc.

This is a minimal example.c to reproduce:

    /******************************/
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/platform/ppc.h>

    int main() {
        uint64_t freq = __ppc_get_timebase_freq();
        printf("Time Base frequency = %"PRIu64" Hz\n", freq);
        if (freq == 0)
            return -1;
        return 0;
    }
    /******************************/

Compile command: gcc -static example.c

This bug has been reproduced, fixed and tested on all powerpc platforms
(ppc32, ppc64 and ppc64le).

The underlying code of __ppc_get_timebase_freq uses __get_timebase_freq
that has a different implementation for shared and static version of
glibc.  In the static version, there is an incorrect sense in the if
check for the fd returned when opening /proc/cpuinfo.

This solution is mostly a cherry-pick from:

  commit 4791e4f773d060c1a37b27aac5b03cdfa9327afc
  Author: Stan Shebs <stanshebs@google.com>
  Date:   Fri May 17 12:25:19 2019 -0700
  Subject: Fix sense of a test in the static-linking version of ppc get_clockfreq

That is in branch glibc/google/grte/v5-2.27/master and was mentioned for
inclusion on master here:

  https://www.sourceware.org/ml/libc-alpha/2019-05/msg00409.html

Adapted from original fix for get_clockfreq. That code was moved to
get_timebase_freq.

Also added a static-build testcase for __ppc_get_timebase_freq since the
underlying function has different implementations for shared and static
build.

	[BZ #24640]
	* sysdeps/unix/sysv/linux/powerpc/get_timebase_freq.c
	[!SHARED] (__get_timebase_freq): Fix sense of a test in the
	static-linking version.
	* sysdeps/unix/sysv/linux/powerpc/Makefile
	(tests-static): Add test-gettimebasefreq-static.
	(tests): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/test-gettimebasefreq-static.c:
	New file.
2019-06-19 19:11:06 -03:00
Adhemerval Zanella
112a0ae18b m68k: Remove vDSO support
Although defined in initial TLS/NPTL ABI for m68k and ColdFire [1], kernel
support was never pushed upstream.  This patch removes the unused m68k
vDSO support.

Checked with a build against m68k and m68k-coldfire and some basic
tests on ARAnyM.

	* sysdeps/unix/sysv/linux/m68k/Makefile (sysdep_routines,
	sysdep-rtld-routines): Remove rules.
	* sysdeps/unix/sysv/linux/m68k/Versions (libc) [GLIBC_PRIVATE]:
	Remove __vdso_atomic_cmpxchg_32 and __vdso_atomic_barrier.
	(ld) [GLIBC_PRIVATE]: __rtld___vdso_read_tp,
	__rtld___vdso_atomic_cmpxchg_32, and __rtld___vdso_atomic_barrier.
	* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h
	(atomic_compare_and_exchange_val_acq, atomic_full_barrier): Remove
	vDSO path for SHARED.
	* sysdeps/unix/sysv/linux/m68k/init-first.c: Remove file.
	* sysdeps/unix/sysv/linux/m68k/libc-m68k-vdso.c: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m68k-helpers.S: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m68k-vdso.c: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m68k-vdso.h: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m68k-helpers.c: New file.

[1] https://lists.debian.org/debian-68k/2007/11/msg00071.html
2019-06-17 09:29:01 -03:00
Adhemerval Zanella
dee07df1a4 powerpc: Refactor powerpc64 lround/lroundf/llround/llroundf
This patches consolidates all the powerpc {l}lround{f} implementations
on the generic sysdeps/powerpc/fpu/s_{l}lround{f}.c.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llround-power8, s_llround-power6x,
	s_llround-power5+, s_llround-ppc64, and s_llroundf-ppc64.
	(CFLAGS-s_llround-power8.c, CFLAGS-s_llround-power6x.c,
	CFLAGS-s_llround-power5+.c): New rule.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power5+.c:
	New file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power6x.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power8.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lround.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lround.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/Makefile
	[$(subdir) == math] (CFLAGS-s_llround.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_llround-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power5+.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power6x.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power8.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lround.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lroundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llroundf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:28:21 -03:00
Adhemerval Zanella
2166283fcc powerpc: Refactor powerpc32 lrint/lrintf/llrint/llrintf
This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/powerpc32/fpu/s_llrint{f}.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_lrintf.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Move to ...
	* sysdeps/powerpc/fpu/s_lrintf.c: ... here.
	* sysdeps/powerpc/powerpc32/fpu/Makefile
	[$(subdir) == math] (CFLAGS-s_lrint.c): New rule.
	* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Add power4
	optimization.
	* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.c: New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(CFLAGS-s_llrintf-power6.c, CFLAGS-s_llrintf-ppc32.c,
	CFLAGS-s_llrint-power6.c, CFLAGS-s_llrint-ppc32.c,
	CFLAGS-s_lrint-ppc32.c): New rule.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.c:
	New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.c:
	Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:27:02 -03:00
Adhemerval Zanella
78049de0a9 powerpc: refactor powerpc64 lrint/lrintf/llrint/llrintf
This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/fpu/s_llrint{f}.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llrint-power8, s_llrint-power6x, and
	s_llrint-ppc64.
	(CFLAGS-s_llrint-power8.c, CFLAGS-s_llrint-power6x.c): New rule.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power6x.c: New
	file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power8.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lrint.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrintf.c: ... here.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/Makefile: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_llrint-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrint.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:26:21 -03:00
Florian Weimer
48c3c12389 Linux: Fix __glibc_has_include use for <sys/stat.h> and statx
The identifier linux is used as a predefined macro, so the actually
used path is 1/stat.h or 1/stat64.h.  Using the quote-based version
triggers a file lookup for /usr/include/bits/linux/stat.h (or whatever
directory is used to store bits/statx.h), but since bits/ is pretty
much reserved by glibc, this appears to be acceptable.

This is related to GCC PR 80005: incorrect macro expansion of the
argument of __has_include.

Suggested by Zack Weinberg.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2019-06-14 16:28:41 +02:00
Joseph Myers
cf27468602 Add IPV6_ROUTER_ALERT_ISOLATE from Linux 5.1 to bits/in.h.
This patch adds the new constant IPV6_ROUTER_ALERT_ISOLATE from Linux
5.1 to sysdeps/unix/sysv/linux/bits/in.h.

Tested for x86_64.

	* sysdeps/unix/sysv/linux/bits/in.h (IPV6_ROUTER_ALERT_ISOLATE):
	New macro.
2019-06-13 16:08:20 +00:00
Joseph Myers
a26e2e9fea Allow memset local PLT reference for powerpc soft-float.
Some recent change on GCC mainline resulted in the localplt test
failing for powerpc soft-float (not sure exactly when, as the failure
appeared when there were other build test failures as well;
<https://sourceware.org/ml/libc-testresults/2019-q2/msg00261.html>
shows it remaining when other failures went away).  The problem is a
call to memset that GCC now generates in the libgcc long double code.

Since memset is documented as a function GCC may always implicitly
generate calls to, it seems reasonable to allow that local PLT
reference (just like those for libgcc functions that GCC implicitly
generates calls to and that are also exported from libc.so), which
this patch does.

Tested for powerpc soft-float with build-many-glibcs.py.

	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/localplt.data:
	Allow memset in libc.so.
2019-06-13 12:21:50 +00:00
Szabolcs Nagy
82bc69c012 aarch64: handle STO_AARCH64_VARIANT_PCS
Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.

Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls.  Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.

The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially.  In this
patch such symbols are always resolved at load time, not lazily.

So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).

This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.

	* sysdeps/aarch64/dl-dtprocnum.h: New file.
	* sysdeps/aarch64/dl-machine.h (DT_AARCH64): Define.
	(elf_machine_runtime_setup): Handle DT_AARCH64_VARIANT_PCS.
	(elf_machine_lazy_rel): Check STO_AARCH64_VARIANT_PCS and bind such
	symbols at load time.
	* sysdeps/aarch64/linkmap.h (struct link_map_machine): Add variant_pcs.
2019-06-13 09:45:00 +01:00
Adhemerval Zanella
1192696069 powerpc: Remove optimized finite
The powerpc finite optimization do not show much gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc64-linux-gnu-power7 (--with-cpu=power7):

    - Generic sysdeps/ieee754 implementation:
       "isfinite": {
        "": {
         "duration": 5.0082e+09,
         "iterations": 2.45299e+09,
         "max": 43.824,
         "min": 2.008,
         "mean": 2.04167
        },
        "INF": {
         "duration": 4.66554e+09,
         "iterations": 2.28288e+09,
         "max": 35.73,
         "min": 2.008,
         "mean": 2.04371
        },
        "NAN": {
         "duration": 4.66274e+09,
         "iterations": 2.28716e+09,
         "max": 34.161,
         "min": 2.009,
         "mean": 2.03866
        }
       }

    - power7 optimized one:
       "isfinite": {
        "": {
         "duration": 4.99111e+09,
         "iterations": 2.65566e+09,
         "max": 25.015,
         "min": 1.716,
         "mean": 1.87942
        },
        "INF": {
         "duration": 4.6783e+09,
         "iterations": 2.0999e+09,
         "max": 35.264,
         "min": 1.868,
         "mean": 2.22787
        },
        "NAN": {
         "duration": 4.67915e+09,
         "iterations": 2.08678e+09,
         "max": 38.099,
         "min": 1.869,
         "mean": 2.24228
        }
       }

     So it basically optimizes marginally for normal numbers while
     increasing the latency for other kind of FP.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_finite*
	objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
	Remove s_finite* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
a72186761b math: Use wordsize-64 version for finite
- math.h will use compiler builtin for gcc 4.4 when built without
    -fsignaling-nans and the builtin is expanded inline for all
    support architectures.  As an example, there is no intra finite
    call on libm for the architecture I checked, x86, arm, aarch64,
    and powerpc.

  - The resulting binary difference on 32 bits architecture is minimum
    for the non hotspot symbol.

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ...
	* sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
6427a6ac8c powerpc: Remove optimized isinf
The powerpc isinf optimizations onyl adds complexity:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input pattern and branch
    implementation for INF and denormal that does:

    return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000)

    Although it does show slight better latency than generic algorithm
    (as below), it is only for power7 and requires it to override it
    for power8.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf*
	objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
	Remove s_isinf* and s_isinf* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
a8c590f789 math: Use wordsize-64 version for isinf
- math.h will use compiler builtin for gcc 4.4 when built without
    -fsignaling-nans and the builtin is expanded inline for all
    support architectures.  As an example, there is no intra isinf
    call on libm for the architecture I checked, x86, arm, aarch64,
    and powerpc.

  - The resulting binary difference on 32 bits architecture is minimum
    for the non hotspot symbol.

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

        * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ...
        * sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
2666f96390 powerpc: Remove optimized isnan
The powerpc isnan optimizations are not really a gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power5, power6, and power6x are just micro-optimization to
    improve the Load-Hit-Store hazards from floating-point to general
    register transfer, and current GCC already has support to minimize
    it by inserting either extra nops or group dispatch instructions.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc-linux-gnu-power4 (which uses the hp-timing support on
    benchtests):

    - Generic sysdeps/ieee754 implementation:
      "isnan": {
       "": {
        "duration": 4.98415e+09,
        "iterations": 2.34516e+09,
        "max": 45.925,
        "min": 2.052,
        "mean": 2.12529
       },
       "INF": {
        "duration": 4.74057e+09,
        "iterations": 1.69761e+09,
        "max": 91.01,
        "min": 2.052,
        "mean": 2.79249
       },
       "NAN": {
        "duration": 4.74071e+09,
        "iterations": 1.68768e+09,
        "max": 282.343,
        "min": 2.052,
        "mean": 2.809
       }
      }

    - power7 optimized one:
    $ ./testrun.sh benchtests/bench-isnan
      "isnan": {
       "": {
        "duration": 4.96842e+09,
        "iterations": 2.56297e+09,
        "max": 50.048,
        "min": 1.872,
        "mean": 1.93854
       },
       "INF": {
        "duration": 4.76648e+09,
        "iterations": 1.54213e+09,
        "max": 373.408,
        "min": 2.661,
        "mean": 3.09084
       },
       "NAN": {
        "duration": 4.76845e+09,
        "iterations": 1.54515e+09,
        "max": 51.016,
        "min": 2.736,
        "mean": 3.08607
       }
      }

    So it basically optimizes marginally for normal numbers while
    increasing the latency for other kind of FP.

  - The generic implementation requires getting the floating point
    status, disable the invalid operation bit, and restore the
    floating-point status.  Each operation is costly and requires
    flushing the FP pipeline.

    Using the same scenarion for the previous analysis:

      "isnan": {
       "": {
        "duration": 5.08284e+09,
        "iterations": 6.2898e+08,
        "max": 41.844,
        "min": 8.057,
        "mean": 8.08108
       },
       "INF": {
        "duration": 4.97904e+09,
        "iterations": 6.16176e+08,
        "max": 39.661,
        "min": 8.057,
        "mean": 8.08055
       },
       "NAN": {
        "duration": 4.98695e+09,
        "iterations": 5.95866e+08,
        "max": 29.728,
        "min": 8.345,
        "mean": 8.36925
       }
      }

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_isnan.c: Remove file.
	* sysdeps/powerpc/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and
	s_isnanf-* objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S:
	Remove file
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls):
	Remove s_isnan-* and s_isnanf-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:36 -03:00
Adhemerval Zanella
197dbda1a1 math: Use wordsize-64 version for isnan
- math.h will use compiler builtin for gcc 4.4 when built without
    -fsignaling-nans and the builtin is expanded inline for all
    support architectures.  As an example, there is no intra isnan
    call on libm for the architecture I checked, x86, arm, aarch64,
    and powerpc.

  - The resulting binary difference on 32 bits architecture is minimum
    for the non hotspot symbol.

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Move to ...
	* sysdeps/ieee754/dbl-64/s_isnan.c: ... here and format code.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:18 -03:00