12942 Commits

Author SHA1 Message Date
Samuel Thibault
e622ce98c5 htl: Avoid check-installed-headers looking at inlines 2020-02-09 16:42:36 +00:00
Samuel Thibault
865bf71109 htl: Do not put spin_lock inlines in public headers
They were not getting used anyway.
Also do not make libsupport use them, it would make tests using it have
to be made to link against libmachuser for gsync_wait.
2020-02-09 16:36:37 +00:00
Samuel Thibault
cca76b6db2 pthread: Move basic tests from nptl to sysdeps/pthread
So they can be checked with htl too.
2020-02-09 16:12:53 +00:00
Samuel Thibault
19a64d9f6e htl: Fix calling pthread_exit in the child of a fork
We need to reset the threads counter, otherwise pthread_exit() would not
call exit(0).
2020-02-09 17:01:06 +01:00
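The behavior that 19a64d9f6e restores can be sketched from user code as follows. This is a minimal illustration, not the htl test itself: the names `gate`, `helper`, and `child_pthread_exit_status` are hypothetical, and the sketch assumes a POSIX system where calling pthread_exit() in a freshly forked child is tolerated. The parent keeps a second thread alive across fork(), so the threads counter is greater than one at the time of the fork; the child then calls pthread_exit() from its only thread and is expected to terminate as if by exit(0).

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Held by the parent so that the helper thread stays alive across fork.  */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

static void *
helper (void *arg)
{
  pthread_mutex_lock (&gate);
  pthread_mutex_unlock (&gate);
  return arg;
}

/* Fork while a second thread exists, let the child call pthread_exit,
   and return the child's exit status (or -1 on any failure).  */
int
child_pthread_exit_status (void)
{
  pthread_t thr;
  int status = -1;
  int result = -1;

  pthread_mutex_lock (&gate);
  if (pthread_create (&thr, NULL, helper, NULL) != 0)
    {
      pthread_mutex_unlock (&gate);
      return -1;
    }

  pid_t pid = fork ();
  if (pid == 0)
    /* Only thread in the child: the process should exit with status 0.  */
    pthread_exit (NULL);

  if (pid > 0 && waitpid (pid, &status, 0) == pid && WIFEXITED (status))
    result = WEXITSTATUS (status);

  /* Release the helper thread and reap it.  */
  pthread_mutex_unlock (&gate);
  pthread_join (thr, NULL);
  return result;
}
```

On older glibc versions this may need to be linked with -pthread.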
Florian Weimer
3430ed09d3 x86: Remove <bits/select.h> and use the generic version
Particularly on CPUs without ERMS, the string instructions are slow,
so it is unclear whether this architecture-specific implementation is
in fact an optimization.
2020-02-09 14:02:27 +01:00
Samuel Thibault
b05de10400 C11 threads: Move implementation to sysdeps/pthread
so it gets shared by nptl and htl. Also add htl versions of thrd_current and
thrd_yield.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-09 13:56:48 +01:00
Samuel Thibault
6cefe985b8 htl: Add C11 threads types definitions 2020-02-09 13:56:48 +01:00
Samuel Thibault
e5ad057068 nptl: Move nptl-specific types to separate header
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-09 13:56:48 +01:00
Samuel Thibault
f827f0e473 htl: Make __PTHREAD_ONCE_INIT more flexible
by moving its (struct __pthread_once) cast into PTHREAD_ONCE_INIT.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-09 13:56:48 +01:00
Samuel Thibault
0c0361235c htl: Add support for C11 threads behavior
Essentially properly calling the thread function which returns an int
instead of a void*.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-09 13:56:48 +01:00
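The difference 0c0361235c refers to is visible in the C11 interface itself: the start routine passed to thrd_create returns an int, and thrd_join hands that int back, whereas a pthread start routine returns a void *. A minimal sketch using only the standard <threads.h> calls (the names `worker` and `run_worker` are hypothetical):

```c
#include <threads.h>

/* C11 start routine: returns int, not void *.  */
static int
worker (void *arg)
{
  return *(int *) arg + 1;
}

/* Run worker in a new thread and return its int result, or -1 on error.  */
int
run_worker (int input)
{
  thrd_t thr;
  int res = -1;

  if (thrd_create (&thr, worker, &input) != thrd_success)
    return -1;
  if (thrd_join (thr, &res) != thrd_success)
    return -1;
  return res;
}
```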
Samuel Thibault
95669bbf2c htl: Add missing internal functions declarations 2020-02-09 13:56:48 +01:00
Samuel Thibault
e775f443bd htl: Rename _pthread_mutex_init/destroy to __pthread_mutex_init/destroy 2020-02-09 13:56:45 +01:00
Samuel Thibault
0093df204a htl: Move internal mutex/rwlock symbols to GLIBC_PRIVATE
Their prototypes have never been made public, and they are not used outside
libc (checked on the whole Debian archive)
2020-02-09 13:06:35 +01:00
Florian Weimer
f6233ab412 Linux: Add io/tst-o_path-locks test
The O_PATH-based fchmodat emulation will rely on the fact that closing
an O_PATH descriptor never releases POSIX advisory locks, so this
commit adds a test case for this behavior.
2020-02-09 11:51:08 +01:00
Samuel Thibault
cc79354ecc htl: Remove duplicate files
The generic versions have the same content.
2020-02-09 01:27:53 +01:00
Samuel Thibault
a99155555c htl: Remove unused files
These have never been used.
2020-02-09 01:27:38 +01:00
Wilco Dijkstra
814309f0c2 Remove a comment claiming that sin/cos round correctly. 2020-02-07 17:15:37 +00:00
Lukasz Majewski
d2e3b697da y2038: linux: Provide __settimeofday64 implementation
This patch provides new __settimeofday64 explicit 64 bit function for setting
64 bit time in the kernel (by internally calling __clock_settime64).
Moreover, a 32 bit version - __settimeofday has been refactored to internally
use __settimeofday64.

The __settimeofday is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversion of struct
timeval to 64 bit struct __timespec64.

Internally the settimeofday uses __settimeofday64. This patch is necessary
for having architectures with __WORDSIZE == 32 Y2038 safe.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
  https://github.com/lmajewski/meta-y2038 and run tests:
  https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without
to test proper usage of both __settimeofday64 and __settimeofday.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-07 17:55:08 +01:00
Lukasz Majewski
ebc2368121 y2038: alpha: Rename valid_timeval64_to_timeval to valid_timeval_to_timeval32
The name 'valid_timeval64_to_timeval' suggests conversion of struct
__timeval64 to struct timeval (as in ./include/time.h).

As on alpha the struct timeval supports 64 bit time, it seems more
readable to emphasize struct timeval32 in the conversion function name.

Hence the helper function naming change to 'valid_timeval_to_timeval32'.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-07 17:55:08 +01:00
Lukasz Majewski
cde52c2557 y2038: alpha: Rename valid_timeval_to_timeval64 to valid_timeval32_to_timeval
Without this patch the naming convention for functions to convert
struct timeval32 to struct timeval (which supports 64 bit time on Alpha) was
a bit misleading. The name 'valid_timeval_to_timeval64' suggests conversion
of struct timeval to struct __timeval64 (as in ./include/time.h).

As on alpha the struct timeval supports 64 bit time, it seems more readable
to emphasize struct timeval32 in the conversion function name.

Hence the helper function naming change to 'valid_timeval32_to_timeval'.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-07 17:55:08 +01:00
Lukasz Majewski
3fced064f2 y2038: Define __suseconds64_t type to be used with struct __timeval64
The __suseconds64_t type is supposed to be the 64 bit type across all
architectures.

It would be mostly used internally in the glibc - however, when passed to
Linux kernel (very unlikely), if necessary, it shall be converted to 32
bit type (i.e. __suseconds_t)

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-07 17:55:07 +01:00
Joseph Myers
449db0fa3e Update kernel version to 5.5 in tst-mman-consts.py.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.5.  (There are no new constants covered by this test in 5.5 that
need any other header changes.)

Tested with build-many-glibcs.py.
2020-02-07 13:55:29 +00:00
Joseph Myers
5828bc4523 Update syscall lists for Linux 5.5.
Linux 5.5 has no new syscalls to add to syscall-names.list, but it
does newly enable the clone3 syscall for AArch64.  This patch updates
the kernel version listed in syscall-names.list and regenerates the
AArch64 arch-syscall.h.

Tested with build-many-glibcs.py.
2020-02-07 13:54:58 +00:00
Lukasz Majewski
f1c314d275 y2038: linux: Provide __timespec_get64 implementation
This patch provides new instance of Linux specific timespec_get.c file placed
in ./sysdeps/unix/sysv/linux/.

When compared to this file version from ./time directory, it provides
__timespec_get64 explicit 64 bit function for getting 64 bit time in the
struct __timespec64 (for compilation using C11 standard).
Moreover, a 32 bit version - __timespec_get internally uses
__timespec_get64.

The __timespec_get is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary conversion to 32 bit struct
timespec.

Internally the timespec_get uses __clock_gettime64. This patch is necessary
for having architectures with __WORDSIZE == 32 Y2038 safe.

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
  https://github.com/lmajewski/meta-y2038 and run tests:
  https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without
to test proper usage of both __timespec_get64 and __timespec_get.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-05 00:10:16 +01:00
Andreas Schwab
6befb33f31 rt: avoid PLT setup in timer_[sg]ettime
The functions __timer_gettime64 and __timer_settime64 live in librt, not
libc.  Use proper hidden aliases so that the callers do not need to set up
the PLT register.

Fixes commits cae1635a70 ("y2038: linux: Provide __timer_settime64
implementation") and 562cdc19c7 ("y2038: linux: Provide __timer_gettime64
implementation").
2020-02-03 12:16:09 +01:00
Lukasz Majewski
b112f53e9d y2038: linux: Provide __sched_rr_get_interval64 implementation
This patch replaces auto generated wrapper (as described in
sysdeps/unix/sysv/linux/syscalls.list) for sched_rr_get_interval with one which
adds extra support for reading 64 bit time values on machines with
__TIMESIZE != 64.
There is no functional change for architectures already supporting 64 bit
time ABI.

The sched_rr_get_interval declaration in ./include/sched.h is not followed by
corresponding libc_hidden_proto(), so it has been assumed that newly introduced
syscall wrapper doesn't require libc_hidden_def() (which has been added by
template used with auto generation script).

Moreover, the code for building sched_rr_gi.c file is already placed in
./posix/Makefiles, so there was no need to add it elsewhere.

Performed tests and validation are the same as for timer_gettime() conversion
(sysdeps/unix/sysv/linux/timer_gettime.c).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-02 11:23:50 +01:00
Lukasz Majewski
eae2243272 y2038: linux: Provide __timerfd_settime64 implementation
This patch replaces auto generated wrapper (as described in
sysdeps/unix/sysv/linux/syscalls.list) for timerfd_settime with one which
adds extra support for reading and writing from Linux kernel 64 bit time
values on machines with __TIMESIZE != 64.
There is no functional change for archs already supporting 64 bit time ABI.

This patch is conceptually identical to timer_settime conversion already
done in sysdeps/unix/sysv/linux/timer_settime.c.
Please refer to corresponding commit message for detailed description of
introduced functions and the testing procedure.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

---
Changes for v4:
- Update date from 2019 to 2020

Changes for v3:
- Add missing libc_hidden_def()

Changes for v2:
- Remove "Contributed by" from the file header
- Remove early check for (fd < 0) in __timerfd_settime64 as the fd
  correctness check is already done in Linux kernel
- Add single descriptive comment line to provide concise explanation
  of the code
2020-02-02 11:23:23 +01:00
Lukasz Majewski
0f6e6b9764 y2038: linux: Provide __timerfd_gettime64 implementation
This patch replaces auto generated wrapper (as described in
sysdeps/unix/sysv/linux/syscalls.list) for timerfd_gettime with one which
adds extra support for reading 64 bit time values on machines with
__TIMESIZE != 64.
There is no functional change for architectures already supporting 64 bit
time ABI.

This patch is conceptually identical to timer_gettime conversion already
done in sysdeps/unix/sysv/linux/timer_gettime.c.
Please refer to corresponding commit message for detailed description of
introduced functions and the testing procedure.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

---
Changes for v4:
- Update date from 2019 to 2020

Changes for v3:
- Add missing libc_hidden_def()

Changes for v2:
- Remove "Contributed by" from the file header
- Remove early check for (fd < 0) in __timerfd_gettime64 as the fd
  correctness check is already done in Linux kernel
- Add single descriptive comment line to provide concise explanation
  of the code
2020-02-02 11:23:23 +01:00
H.J. Lu
bbfc0f0f8e i386: Remove _exit.S
The generic implementation is sufficient since __NR_exit_group is always
supported and i386 does define ABORT_INSTRUCTION.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:56 -08:00
H.J. Lu
0455f251f4 i386: Use ENTRY/END in assembly codes
Use ENTRY and END in assembly codes so that ENDBR32 will be added at
function entries when CET is enabled.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
H.J. Lu
825b58f3fb i386-mcount.S: Add _CET_ENDBR to _mcount and __fentry__
Since _mcount and __fentry__ don't use ENTRY, we need to add _CET_ENDBR
by hand.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
H.J. Lu
4031d7484a i386/sub_n.S: Add a missing _CET_ENDBR to indirect jump target
Add a missing _CET_ENDBR to indirect jump target in sysdeps/i386/sub_n.S.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
H.J. Lu
15eab1e3e8 i386: Don't unnecessarily save and restore EAX, ECX and EDX [BZ# 25262]
On i386, since EAX, ECX and EDX are caller-saved, there is no need
to save and restore EAX, ECX and EDX in getcontext, setcontext and
swapcontext.  They just need to clear EAX on success.  The extra
scratch registers are needed to enable CET.

Tested on i386.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-02-01 05:44:55 -08:00
H.J. Lu
635d6fae03 x86: Don't make 2 calls to dlerror () in a row
We shouldn't make 2 calls to dlerror () in a row since the first call
will clear the error.  We should just use the return value from the
first call.

Tested on Linux/x86-64.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2020-02-01 05:43:34 -08:00
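The rule stated in 635d6fae03 can be checked with a small sketch. It assumes that the library path below does not exist (the path and the function name `dlerror_cleared_after_first_call` are hypothetical); the first dlerror () call then reports the dlopen failure and the second call returns NULL because the first one cleared the error state.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if the first dlerror () call reports the failure and the
   second call returns NULL, 0 otherwise, and -1 if the (hypothetical)
   library unexpectedly loads.  */
int
dlerror_cleared_after_first_call (void)
{
  void *handle = dlopen ("/nonexistent/libhypothetical.so", RTLD_NOW);
  if (handle != NULL)
    {
      dlclose (handle);
      return -1;
    }

  /* Keep the result of the first call; it is the only one that
     carries the message.  */
  const char *first = dlerror ();
  const char *second = dlerror ();

  return first != NULL && second == NULL;
}
```

On glibc versions where the dl* functions still live in libdl, this needs to be linked with -ldl.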
Florian Weimer
9baa46aa7b nptl: Avoid using PTHREAD_MUTEX_DEFAULT in macro definition [BZ #25271]
Commit 1c3f9acf1f ("nptl: Add struct_mutex.h")
replaced a zero constant with the identifier PTHREAD_MUTEX_DEFAULT
in the macro PTHREAD_MUTEX_INITIALIZER.  However, that constant
is not available in ISO C11 mode:

In file included from /usr/include/bits/thread-shared-types.h:74,
                 from /usr/include/bits/pthreadtypes.h:23,
                 from /usr/include/pthread.h:26,
                 from bug25271.c:1:
bug25271.c:3:21: error: ‘PTHREAD_MUTEX_DEFAULT’ undeclared here (not in a function)
    3 | pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~

This commit changes the constant to the equivalent
PTHREAD_MUTEX_TIMED_NP, which is in the POSIX extension namespace
and thus always available.
2020-01-30 15:54:49 +01:00
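A reproducer along the lines of the bug25271.c quoted in 9baa46aa7b might look like the sketch below. The function `lock_and_unlock` is a hypothetical addition so that the initializer is actually used; the assumption is that the error shown in the commit message appears when such a file is compiled in strict ISO C11 mode (for example with -std=c11) against an affected glibc.

```c
#include <pthread.h>

/* The static initializer must expand to something that is visible
   even when only the ISO C11 namespace is enabled.  */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 if the statically initialized mutex can be locked and
   unlocked, or the first pthread error code otherwise.  */
int
lock_and_unlock (void)
{
  int err = pthread_mutex_lock (&m);
  if (err != 0)
    return err;
  return pthread_mutex_unlock (&m);
}
```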
Andreas Schwab
d937694059 Fix array overflow in backtrace on PowerPC (bug 25423)
When unwinding through a signal frame the backtrace function on PowerPC
didn't check array bounds when storing the frame address.  Fixes commit
d400dcac5e ("PowerPC: fix backtrace to handle signal trampolines").
2020-01-21 15:26:57 +01:00
Florian Weimer
8b222fa387 getaddrinfo: Fix resource leak after strdup failure in gethosts [BZ #25425]
Filip Ochnik spotted that one of the error jumps in gethosts fails to
call __resolv_context_put to release the resolver context.

Fixes commit 352f4ff9a2 ("resolv:
Introduce struct resolv_context [BZ #21668]") and commit
964263bb8d ("getaddrinfo: Release
resolver context on error in gethosts [BZ #21885]").

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-20 18:37:13 +01:00
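The class of bug fixed in 8b222fa387 can be sketched generically. This is not the gethosts code: `ctx_get`, `ctx_put`, `live_contexts`, and `lookup_copy` are hypothetical stand-ins for the resolver context functions. The point is only that every error jump taken after the context is acquired, including the one after a failed strdup, has to pass through the release call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted resolver context.  */
struct ctx { int unused; };
static struct ctx the_ctx;
static int live_contexts;

static struct ctx *
ctx_get (void)
{
  ++live_contexts;
  return &the_ctx;
}

static void
ctx_put (struct ctx *c)
{
  (void) c;
  --live_contexts;
}

/* Copy NAME into *OUT while holding a context.  Returns 0 on success
   and -1 on allocation failure; the context is released either way.  */
int
lookup_copy (const char *name, char **out)
{
  struct ctx *c = ctx_get ();
  int result = -1;

  char *copy = strdup (name);
  if (copy == NULL)
    goto out;			/* The error path still reaches ctx_put.  */

  *out = copy;
  result = 0;

 out:
  ctx_put (c);
  return result;
}
```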
Matheus Castanho
9f8b135f76 Fix maybe-uninitialized error on powerpc
The build has been failing on powerpc64le-linux-gnu with GCC 10
due to a maybe-uninitialized error:

../sysdeps/ieee754/dbl-64/mpa.c:875:6: error: ‘w.e’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
  875 |   EY -= EX;
      |      ^~

The warning is thrown because when __inv is called by __dvd *y is not
initialized and if t == 0 before calling __dbl_mp, EY will stay
uninitialized, as the function does not touch it in this case.

However, since t will be set to 1/t before calling __dbl_mp, t == 0 will
never happen, so we can instruct the compiler to ignore this case, which
suppresses the warning.

Tested on powerpc64le.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-01-17 21:02:13 -03:00
Andreas Schwab
be5c5315b9 powerpc32: Fix syntax error in __GLRO macro 2020-01-18 00:43:12 +01:00
Lucas A. M. Magalhaes
70ba28f7ab Fix tst-pkey.c pkey_alloc return checks and manual
This test was failing on some powerpc systems as it was not checking
for an ENOSPC return.

As stated in the Linux man-pages, and as can be observed in the
implementation at mm/mprotect.c in the Linux kernel source, the syscall
pkey_alloc can return EINVAL or ENOSPC.  ENOSPC will indicate either that
all keys are in use or that the kernel does not support pkeys.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.net.br>
2020-01-17 09:05:03 -03:00
Tulio Magno Quites Machado Filho
18363b4f01 powerpc: Move cache line size to rtld_global_ro
GCC 10.0 enabled -fno-common by default and this started to point out that
__cache_line_size had been implemented in 2 different places: loader and
libc.

In order to avoid this duplication, the libc variable has been removed
and the loader variable is moved to rtld_global_ro.

File sysdeps/unix/sysv/linux/powerpc/dl-auxv.h has been added in order
to reuse code for both static and dynamic linking scenarios.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-01-17 09:05:03 -03:00
Tulio Magno Quites Machado Filho
c908ae0492 powerpc: Initialize rtld_global_ro for static dlopen [BZ #20802]
Initialize dl_auxv, dl_hwcap and dl_hwcap2 in rtld_global_ro for DSOs
that have been statically dlopen'ed.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-01-17 09:05:03 -03:00
Samuel Thibault
a8f0fc4e5f htl: Add internal version of __pthread_mutex_timedlock
The C11 threads implementation will need it.
2020-01-13 20:41:07 +01:00
Samuel Thibault
ae793cc20d htl: Avoid exposing unixoid functions
C11 threads should not expose them.
2020-01-13 01:38:33 +01:00
Samuel Thibault
196e62cbe4 htl: Add type sizes in bits/pthreadtypes-arch.h and check them 2020-01-13 01:24:43 +01:00
Samuel Thibault
e404be33fe htl: Add internal versions of functions used by C11 threads
The C11 threads implementation needs to call pthread_join and
pthread_key_delete without exposing them.
2020-01-13 00:48:47 +01:00
Zack Weinberg
4988e26b94 MIPS: Fix circular definition of __LDBL_MANT_DIG__ in ieee754.h
In commit aa706e13f4,
sysdeps/mips/ieee754/ieee754.h was changed to use GCC’s predefined
macro __LDBL_MANT_DIG__, instead of including <float.h> and using
LDBL_MANT_DIG (and therefore polluting the user namespace with all of
the macros defined in float.h).  In order to support compilers that
don’t provide __LDBL_MANT_DIG__, there is a fallback #if block which
was supposed to include <float.h> and then define __LDBL_MANT_DIG__ to
LDBL_MANT_DIG.  However, at some point during the development of the
patch, a typo was introduced, causing the fallback block to define
__LDBL_MANT_DIG__ to expand to __LDBL_MANT_DIG__.

Correct this typo.
2020-01-08 14:28:23 -05:00
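The intended shape of the fallback block described in 4988e26b94 can be sketched as follows. This is an illustration based on the commit message, not the text of sysdeps/mips/ieee754/ieee754.h, and the function `long_double_mantissa_bits` is hypothetical. With a compiler that predefines __LDBL_MANT_DIG__ (GCC does), the fallback is skipped entirely.

```c
/* Fallback for compilers that do not predefine __LDBL_MANT_DIG__.
   The macro has to expand to LDBL_MANT_DIG from <float.h>; defining
   it in terms of itself would leave it circular.  */
#ifndef __LDBL_MANT_DIG__
# include <float.h>
# define __LDBL_MANT_DIG__ LDBL_MANT_DIG
#endif

/* Number of mantissa digits of long double, as seen by this file.  */
int
long_double_mantissa_bits (void)
{
  return __LDBL_MANT_DIG__;
}
```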
Samuel Thibault
cbce69e70d hurd: Fix message reception for timer_thread
Without a proper size, we get MACH_RCV_TOO_LARGE instead of MACH_MSG_SUCCESS.

* sysdeps/mach/hurd/setitimer.c (timer_thread): Add return_code_type
field to received message, and set the receive size in __mach_msg call.
2020-01-05 18:09:13 +01:00
Samuel Thibault
25c084e0a7 htl: Add __errno_location and __h_errno_location
As explained on
https://sourceware.org/ml/libc-alpha/2020-01/msg00049.html
the presence of __errno_location in libpthread.so on GNU/Linux makes
libpthread get linked in for libstdc++. This aligns on that behavior, to
avoid issues that only GNU/Hurd would get.
2020-01-04 19:37:53 +01:00
Samuel Thibault
50a78baa8e htl: Move pthread_atfork to libc_nonshared.a
This follows bd60ce8652 ('nptl: Move pthread_atfork to libc_nonshared.a')
with the same rationale: there is no non-libpthread equivalent to be used
for making linking against libpthread optional.

libpthread_nonshared.a is unused after this, so remove it from the
build.

There is no ABI impact because pthread_atfork was implemented using
__register_atfork in libc even before this change.

pthread_atfork has to be a weak alias because pthread_* names are not
reserved in libc.
2020-01-04 18:55:47 +01:00
Samuel Thibault
af7be496c9 htl: Use dso_handle.h 2020-01-04 17:25:08 +01:00
Adhemerval Zanella
92b963699a linux: Optimize fallback 32-bit clock_getres
This patch avoids probing the __NR_clock_gettime64 syscall each time
__clock_gettime64 is issued on a kernel without 64 bit time support.
Once ENOSYS is obtained, only the 32-bit clock_gettime is used.

The following snippet:

  clock_getres (CLOCK_REALTIME, &(struct timespec) { 0 });
  clock_getres (CLOCK_MONOTONIC, &(struct timespec) { 0 });
  clock_getres (CLOCK_BOOTTIME, &(struct timespec) { 0 });
  clock_getres (20, &(struct timespec) { 0 });

On a kernel without 64 bit time support issues the syscalls:

  syscall_0x196(0, 0xffb83330, [...]) = -1 ENOSYS (Function not implemented)
  clock_getres(CLOCK_REALTIME, {tv_sec=0, tv_nsec=1}) = 0
  clock_getres(CLOCK_MONOTONIC, {tv_sec=0, tv_nsec=1}) = 0
  clock_getres(CLOCK_BOOTTIME, {tv_sec=0, tv_nsec=1}) = 0

Checked on i686-linux-gnu on 4.15 kernel.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
0dc1a378b1 linux: Add support for clock_getres64 vDSO
No architecture currently defines the vDSO symbol.  On architectures
with 64-bit time_t the HAVE_CLOCK_GETRES_VSYSCALL is renamed to
HAVE_CLOCK_GETRES64_VSYSCALL, which simplifies the clock_gettime code.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
cdae973b6a linux: Enable vDSO clock_gettime64 for mips
It was added on Linux 5.4 (commit 1f66c45db3302).

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
93e4db49b4 linux: Enable vDSO clock_gettime64 for arm
It was added on Linux 5.5 (commit 74d06efb9c2f9).

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
2d77a44751 linux: Enable vDSO clock_gettime64 for i386
It was added on Linux 5.3 (commit 22ca962288c0a).

Checked on i686-linux-gnu with 5.3.0 kernel.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
a9091a1244 linux: Optimize fallback 32-bit clock_gettime
This patch avoids probing the __NR_clock_gettime64 syscall each time
__clock_gettime64 is issued on a kernel without 64 bit time support.
Once ENOSYS is obtained, only the 32-bit clock_gettime is used.

The following snippet:

  clock_gettime (CLOCK_REALTIME, &(struct timespec) { 0 });
  clock_gettime (CLOCK_MONOTONIC, &(struct timespec) { 0 });
  clock_gettime (CLOCK_BOOTTIME, &(struct timespec) { 0 });
  clock_gettime (20, &(struct timespec) { 0 });

On a kernel without 64 bit time support and with vDSO support results
in the following syscalls:

  syscall_0x193(0, 0xff87ba30, [...]) = -1 ENOSYS (Function not implemented)
  clock_gettime(CLOCK_BOOTTIME, {tv_sec=927082, tv_nsec=474382032}) = 0
  clock_gettime(0x14 /* CLOCK_??? */, 0xff87b9f8) = -1 EINVAL (Invalid argument)

While on a kernel without vDSO support:

  syscall_0x193(0, 0xbec95550, 0xb6ed2000, 0x1, 0xbec95550, 0) = -1 (errno 38)
  clock_gettime(CLOCK_REALTIME, {tv_sec=1576615930, tv_nsec=638250162}) = 0
  clock_gettime(CLOCK_MONOTONIC, {tv_sec=1665478, tv_nsec=638779620}) = 0
  clock_gettime(CLOCK_BOOTTIME, {tv_sec=1675418, tv_nsec=292932704}) = 0
  clock_gettime(0x14 /* CLOCK_??? */, 0xbec95530) = -1 EINVAL (Invalid argument)

Checked on i686-linux-gnu on 4.15 kernel and on a 5.3 kernel.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
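The "probe once, then remember ENOSYS" idea from a9091a1244 can be sketched in isolation. This is not the glibc implementation: `struct time_probe`, `get_time_cached`, and the two fake back ends are hypothetical, and the fake functions only simulate a kernel that lacks the 64 bit call. The sketch shows that after the first ENOSYS the 64 bit entry point is no longer tried.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical back end: stores a time value, returns 0 or -1/errno.  */
typedef int (*time_call) (long long *out);

struct time_probe
{
  bool time64_missing;	/* Set once the 64 bit call returned ENOSYS.  */
  int probes;		/* How often the 64 bit call was attempted.  */
};

/* Simulates a kernel without the 64 bit syscall.  */
static int
fake_time64_enosys (long long *out)
{
  (void) out;
  errno = ENOSYS;
  return -1;
}

/* Simulates the 32 bit syscall; 32 is just a marker value.  */
static int
fake_time32 (long long *out)
{
  *out = 32;
  return 0;
}

/* Try CALL64 unless it is already known to be missing, then fall
   back to CALL32.  */
int
get_time_cached (struct time_probe *p, time_call call64, time_call call32,
		 long long *out)
{
  if (!p->time64_missing)
    {
      p->probes++;
      int r = call64 (out);
      if (r == 0 || errno != ENOSYS)
	return r;
      p->time64_missing = true;
    }
  return call32 (out);
}
```

Calling `get_time_cached` twice with the fake back ends attempts the 64 bit call only on the first invocation, which corresponds to the single `syscall_0x193` line in the traces above.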
Adhemerval Zanella
ff500a623d linux: Add support for clock_gettime64 vDSO
No architecture currently defines the vDSO symbol.  On architectures
with 64-bit time_t the HAVE_CLOCK_GETTIME_VSYSCALL is renamed to
HAVE_CLOCK_GETTIME64_VSYSCALL, which simplifies the clock_gettime code.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
1bdda52fe9 elf: Move vDSO setup to rtld (BZ#24967)
This patch moves the vDSO setup from libc to loader code, just after
the vDSO link_map setup.  For static case the initialization
is moved to _dl_non_dynamic_init instead.

Instead of using the mangled pointer, the vDSO data is set as
attribute_relro (on _rtld_global_ro for shared or _dl_vdso_* for
static).  It is read-only even with partial relro.

It fixes BZ#24967 now that the vDSO pointer is set up earlier than
malloc interposition is called.

Also, vDSO calls should not be a problem for static dlopen as
indicated by BZ#20802.  The vDSO pointer would be zero-initialized
and the syscall will be issued instead.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, powerpc64-linux-gnu,
powerpc-linux-gnu, s390x-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu.  I also ran some tests on mips.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:07 -03:00
Adhemerval Zanella
e760874ee3 linux: Consolidate time implementation
The IFUNC bypass to vDSO is used when USE_IFUNC_TIME is set.
Currently powerpc and x86 define it.  Otherwise the generic
implementation is used, which calls clock_gettime.

Checked on powerpc64le-linux-gnu, powerpc64-linux-gnu,
powerpc-linux-gnu-power4, x86_64-linux-gnu, and i686-linux-gnu.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:22:04 -03:00
Adhemerval Zanella
c701bcc6f4 linux: Consolidate Linux gettimeofday
The IFUNC bypass to vDSO is used when USE_IFUNC_GETTIMEOFDAY is set.
Currently aarch64, powerpc*, and x86 define it.  Otherwise the
generic implementation is used, which calls clock_gettime.

Checked on aarch64-linux-gnu, powerpc64le-linux-gnu,
powerpc64-linux-gnu, powerpc-linux-gnu-power4, x86_64-linux-gnu,
and i686-linux-gnu.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 11:21:50 -03:00
Adhemerval Zanella
7bcaf77574 linux: Update mips vDSO symbols
The clock_getres is a new implementation added on Linux 5.4
(abed3d826f2f).

Checked with a build against mips-linux-gnu and mips64-linux-gnu.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:05 -03:00
Adhemerval Zanella
eca6aec6a3 linux: Update x86 vDSO symbols
Add the missing time and clock_getres vDSO symbol names on x86.
For time, the iFUNC already uses expected name so it affects only
the static build.

The clock_getres is a new implementation added on Linux 5.3
(f66501dc53e72).

Checked on x86-linux-gnu and i686-linux-gnu.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:05 -03:00
Adhemerval Zanella
2822aaf4f7 Remove vDSO support from make-syscall.sh
The auto-generated vDSO call shows some issues:

  - It requires syncing the auto-generated C file with the current glibc
    implementation;
  - It still uses symbol redirection hacks where libc-symbols.h
    provides macros that use compiler builtins
    (libc_ifunc_redirected for instance);
  - It does not handle all required compiler handling
    (inhibit_stack_protector on iFUNC resolver);
  - No architecture uses it.

Checked with a build against all major ABIs.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:05 -03:00
Adhemerval Zanella
bc36727be9 x86: Make x32 use x86 time implementation
This is the only use of auto-generation syscall which uses a vDSO
plus IFUNC and the current x86 generic implementation already covers
the expected semantic.

Checked on x86_64-linux-gnu-x32.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:05 -03:00
Adhemerval Zanella
d0def09ff6 linux: Fix vDSO macros build with time64 interfaces
As indicated on libc-help [1] the ec138c67cb commit broke 32-bit
builds when configured with --enable-kernel=5.1 or higher.  The
scenario 10 from [2] might also occur in this configuration and
INLINE_VSYSCALL will try to use the vDSO symbol and
HAVE_CLOCK_GETTIME64_VSYSCALL does not set HAVE_VSYSCALL prior to its
usage.

Also, there is no easy way to just enable the code to use one
vDSO symbol since the macro INLINE_VSYSCALL is redefined if
HAVE_VSYSCALL is set.

Instead of adding more pre-processor handling and making the code
even more convoluted, this patch removes the requirement of defining
HAVE_VSYSCALL before including sysdep-vdso.h to enable vDSO usage.

The INLINE_VSYSCALL is now expected to be issued inside a
HAVE_*_VSYSCALL check, since it will try to use the internal vDSO
pointers.

Both clock_getres and clock_gettime vDSO code for time64_t were
removed since there is no vDSO setup code for the symbol (an
architecture can not set HAVE_CLOCK_GETTIME64_VSYSCALL).

Checked on i686-linux-gnu (default and with --enable-kernel=5.1),
x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu.
I also checked against a build to mips64-linux-gnu and
sparc64-linux-gnu.

[1] https://sourceware.org/ml/libc-help/2019-12/msg00014.html
[2] https://sourceware.org/ml/libc-alpha/2019-12/msg00142.html

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:05 -03:00
Adhemerval Zanella
b03688bfbb Linux: Fix clock_nanosleep time64 check
The result of INTERNAL_SYSCALL_CANCEL should be checked with
macros INTERNAL_SYSCALL_ERROR_P and INTERNAL_SYSCALL_ERRNO instead
of comparing the result directly.

Checked on powerpc-linux-gnu.
2020-01-03 10:02:05 -03:00
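Why b03688bfbb asks for the macros rather than a direct comparison can be illustrated with the raw return convention of one architecture. The sketch below assumes the x86-64 Linux convention, where the kernel reports failure as a negative errno value in the range [-4095, -1]; the helper names `raw_result_is_error` and `raw_result_errno` are hypothetical and are not the glibc macros. Other architectures may signal errors differently, which is presumably why the check should go through INTERNAL_SYSCALL_ERROR_P and INTERNAL_SYSCALL_ERRNO.

```c
#include <errno.h>

/* Assumed x86-64 Linux convention: a raw syscall result in
   [-4095, -1] is a negated errno value, anything else is success.  */
static inline int
raw_result_is_error (long result)
{
  return (unsigned long) result >= (unsigned long) -4095L;
}

/* Only meaningful when raw_result_is_error (result) is true.  */
static inline int
raw_result_errno (long result)
{
  return (int) -result;
}
```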
Wilco Dijkstra
220622dde5 Add libm_alias_finite for _finite symbols
This patch adds a new macro, libm_alias_finite, to define all _finite
symbols.  It sets each _finite symbol as a compat symbol based on its first
version (obtained from the definition in the build-generated first-versions.h).

The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redefinition,
in sysdeps/ieee754/float128/float128_private.h.

Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).

Passes buildmanyglibc.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:04 -03:00
Florian Weimer
0933a4678c Linux: Remove pread/pread64, pwrite/pwrite64 kludges from <sysdep.h>
Since the switch away from auto-generated wrappers for these system
calls, the kludge is already included in the C source file of the
system call wrapper.
2020-01-02 10:18:37 +01:00
Florian Weimer
a1bd5f8673 Linux: Use system call tables during build
Use <arch-syscall.h> instead of <asm/unistd.h> to obtain the system
call numbers.  A few direct includes of <asm/unistd.h> need to be
removed (if the system call numbers are already provided indirectly
by <sysdep.h>) or replaced with <sys/syscall.h>.

Current Linux headers for alpha define the required system call names,
so most of the _NR_* hacks are no longer needed.  For the 32-bit arm
architecture, eliminate the INTERNAL_SYSCALL_ARM macro, now that we
have regular system call names for cacheflush and set_tls.  There are
more such cleanup opportunities for other architectures, but these
cleanups are required to avoid macro redefinition errors during the
build.

For ia64, it is desirable to use <asm/break.h> directly to obtain
the break number for system calls (which is not a system call number
itself).  This requires replacing __BREAK_SYSCALL with
__IA64_BREAK_SYSCALL because the former is defined as an alias in
<asm/unistd.h>, but not in <asm/break.h>.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-02 10:18:23 +01:00
Florian Weimer
4cf0d22305 Linux: Add tables with system call numbers
The new tables are currently only used for consistency checks
with the installed kernel headers and the architecture-independent
system call names table.  They are based on Linux 5.4.

The goal is to use these architecture-specific tables to ensure
that system call wrappers are available irrespective of the version
of the installed kernel headers.

The tables are formatted in the form of C header files so that they
can be used directly in an #include directive, without external
preprocessing.  (External preprocessing of a plain table file
would introduce cross-subdirectory dependency issues.)  However,
the intent is that they can still be treated as tables and can be
processed by simple tools.
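
As a rough illustration of the "table as C header" idea above, the
sketch below uses a tiny hypothetical table fragment and compares it
against the installed kernel headers, in the spirit of the consistency
checks mentioned.  The TABLE_NR_* names and count_table_mismatches are
invented for this example, the three values are the x86_64 numbers, and
this is not the file format or the check that glibc actually ships.

```c
#include <assert.h>
#include <sys/syscall.h>

/* Hypothetical table fragment: one "#define NAME NUMBER" per line, so
   it can be #included directly or parsed by a simple line-based tool.
   The values below are the x86_64 system call numbers.  */
#define TABLE_NR_read 0
#define TABLE_NR_write 1
#define TABLE_NR_close 3

/* Return how many table entries disagree with the installed kernel
   headers.  The comparison is only made on x86_64 (LP64), because the
   numbers above are specific to that ABI; elsewhere it returns 0.  */
static int
count_table_mismatches (void)
{
  int mismatches = 0;
#if defined __x86_64__ && !defined __ILP32__
  mismatches += (TABLE_NR_read != __NR_read);
  mismatches += (TABLE_NR_write != __NR_write);
  mismatches += (TABLE_NR_close != __NR_close);
#endif
  return mismatches;
}
```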

The irregular system call names on 32-bit arm add a complication.
The <fixup-asm-unistd.h> header is introduced to work around that,
and the system calls are listed under regular names in the
<arch-syscall.h> file.

A make target, update-syscalls-list, is added to patch the glibc
sources with data from the current kernel headers.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-02 10:18:10 +01:00
Joseph Myers
5f72f9800b Update copyright dates not handled by scripts/update-copyrights.
I've updated copyright dates in glibc for 2020.  This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.  As well as the usual annual
updates, mainly dates in --version output (minus libc.texinfo which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a fix to
sysdeps/unix/sysv/linux/powerpc/bits/termios-c_lflag.h where a typo in
the copyright notice meant it failed to be updated automatically.

Please remember to include 2020 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
2020-01-01 00:21:22 +00:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Adhemerval Zanella
09153638cf alpha: Set wait4 as cancellation entrypoint
Both wait and waitpid are implemented on top of wait4.  This fixes
nptl/tst-cancel{x}{4,5,7}.

Checked on alpha-linux-gnu.
2019-12-30 11:05:28 -03:00
Jeremie Koenig
653d74f12a hurd: Global signal disposition
This adds _hurd_sigstate_set_global_rcv used by libpthread to enable
POSIX-conforming behavior of signals on a per-thread basis.

This also provides a sigstate destructor _hurd_sigstate_delete, and a
global process signal state, which needs to be locked and checked when
global disposition is enabled, hence the addition of the
_hurd_sigstate_lock, _hurd_sigstate_actions, _hurd_sigstate_pending, and
_hurd_sigstate_unlock helpers.

This also updates all the glibc code accordingly.

This also drops support for get_int(INIT_SIGMASK), which did not make sense
any more since we do not have a single signal thread any more.

During fork/spawn, this also reinitializes the child global sigstate's
lock. That cures an issue that would very rarely cause a deadlock in the
child in fork, when it tries to unlock ss' critical section lock at the
end of fork.  This will typically (always?) be observed in /bin/sh, which
is not surprising as that is the foremost caller of fork.

To reproduce an intermediate state, add an endless loop if
_hurd_global_sigstate is locked after __proc_dostop (cast through
volatile); that is, while still being in the fork's parent process.

When that triggers (use the libtool testsuite), the signal thread has
already locked ss (which is _hurd_global_sigstate), and is stuck at
hurdsig.c:685 in post_signal, trying to lock _hurd_siglock (which the
main thread already has locked and keeps locked until after
__task_create).  This is the case that ss->thread == MACH_PORT_NULL, that
is, a global signal.  In the main thread, between __proc_dostop and
__task_create is the __thread_abort call on the signal thread which would
abort any current kernel operation (but leave ss locked).  Later in fork,
in the parent, when _hurd_siglock is unlocked in fork, the parent's
signal thread can proceed and will eventually unlock the global sigstate.
In the child, _hurd_siglock will likewise be unlocked, but the global
sigstate never will be, as the child's signal thread has been configured
to restart execution from _hurd_msgport_receive.  Thus, when the child
tries to unlock ss' critical section lock at the end of fork, it first
has to lock the global sigstate and will spin trying to lock it, which
can never be successful, and we get our deadlock.

Options seem to be:

  * Move the locking of _hurd_siglock earlier in post_signal -- but that
    may generally impact performance, if this locking isn't generally
    needed anyway?

    On the other hand, would it actually make sense to wait here until we
    are not any longer in a critical section (which is meant to disable
    signal delivery anyway (but not for preempted signals?))?

  * Clear the global sigstate in the fork's child with the rationale that
    we're anyway restarting the signal thread from a clean state.  This
    has now been implemented.

Why has this problem not been observed before Jérémie's patches?  (Or has
it?  Perhaps even more rarely?)  In _S_msg_sig_post, the signal is now
posted to a *global receiver thread*, whereas previously it was posted to
the *designated signal-receiving thread*.  The latter one was in a
critical section in fork, so didn't try to handle the signal until after
leaving the critical section?  (Not completely analyzed and verified.)

Another question is what the signal is that is being received
during/around the time __proc_dostop executes.
2019-12-29 18:32:49 +01:00
Samuel Thibault
eb87a46c56 hurd sendmsg: Fix warning on calling CMSG_*HDR 2019-12-29 17:49:41 +01:00
Thomas Schwinge
a678c13b8f hurd: Add getcontext, makecontext, setcontext, swapcontext
Adapted from the Linux x86 functions.

Not thoroughly tested, but manual testing as well as glibc tests look fine, and
manual -lpthread testing also looks fine (within the given bounds for a new
stack to be used with makecontext).

This has also been in use in Debian since 2013.
2019-12-29 16:54:08 +01:00
Emilio Pozuelo Monfort
344e755248 hurd: Support sending file descriptors over Unix sockets 2019-12-29 16:34:20 +01:00
Gabriel F. T. Gomes
9ae967bf45 ldbl-128ibm-compat: Do not mix -mabi=*longdouble and -mlong-double-128
Some compiler versions, e.g. GCC 7, complain when -mlong-double-128 is
used together with -mabi=ibmlongdouble or -mabi=ieeelongdouble,
producing the following error message:

  cc1: error: ‘-mabi=ibmlongdouble’ requires ‘-mlong-double-128’

This patch removes -mlong-double-128 from the compilation lines that
explicitly request -mabi=*longdouble.

Tested for powerpc64le.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-12-27 15:02:10 -03:00
Tulio Magno Quites Machado Filho
5d73c96f64 ldbl-128ibm-compat: Compiler flags for stdio functions
Some of the files that provide stdio.h and wchar.h functions have a
filename prefixed with 'io', such as 'iovsprintf.c'.  On platforms that
imply ldbl-128ibm-compat, these files must be compiled with the flag
-mabi=ibmlongdouble.  This patch adds this flag to their compilation.

Notice that this is not required for the other files that provide
similar functions, because filenames that are not prefixed with 'io'
have ldbl-128ibm-compat counterparts in the Makefile, which already adds
-mabi=ibmlongdouble to them.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-12-27 15:02:10 -03:00
Tulio Magno Quites Machado Filho
1ef9b6e0bf Do not redirect calls to __GI_* symbols, when redirecting to *ieee128
On platforms where long double has IEEE binary128 format as a third
option (initially, only powerpc64le), many exported functions are
redirected to their __*ieee128 equivalents.  This redirection is
provided by installed headers such as stdio-ldbl.h, and is supposed to
work correctly with user code.

However, during the build of glibc, similar redirections are employed,
in internal headers, such as include/stdio.h, in order to avoid extra
PLT entries.  These redirections conflict with the redirections to
__*ieee128, and must be avoided during the build.  This patch protects
the second redirections with a test for __LONG_DOUBLE_USES_FLOAT128, a
new macro that is defined to 1 when functions that deal with long double
typed values reuse the _Float128 implementation (this is currently only
true for powerpc64le).

Tested for powerpc64le, x86_64, and with build-many-glibcs.py.

Co-authored-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2019-12-27 15:02:10 -03:00
Xuelei Zhang
863d775c48 aarch64: add default memcpy version for kunpeng920
Checked on aarch64-linux-gnu.
2019-12-27 11:59:37 -03:00
Xuelei Zhang
10df95cdaf aarch64: ifunc rename for kunpeng
Rename the ifunc for kunpeng to kunpeng920, and update the corresponding
function files, including the IS_KUNPENG920 check.

Checked on aarch64-linux-gnu.
2019-12-27 11:59:51 -03:00
Xuelei Zhang
64297d49b3 aarch64: Modify error-shown comments for strcpy
Checked on aarch64-linux-gnu.
2019-12-27 11:59:37 -03:00
Adhemerval Zanella
dc86199477 linux: Consolidate sigprocmask
All architectures now use the Linux generic implementation, which
uses __NR_rt_sigprocmask.

Checked on x86_64-linux-gnu, sparc64-linux-gnu, ia64-linux-gnu,
s390x-linux-gnu, and alpha-linux-gnu.
2019-12-27 11:18:23 -03:00
Adhemerval Zanella
58bd592536 Fix return code for __libc_signal_* functions
The functions do not fail regardless of the argument value.  Also, for
Linux the return value is not correct on some platforms due to the missing
usage of INTERNAL_SYSCALL_ERROR_P / INTERNAL_SYSCALL_ERRNO macros.

Checked on x86_64-linux-gnu, i686-linux-gnu, and sparc64-linux-gnu.
2019-12-27 11:18:23 -03:00
Adhemerval Zanella
11519fd0c9 nptl: Remove duplicate internal __SIZEOF_PTHREAD_MUTEX_T (BZ#25241)
Checked on x86_64-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu-x32.
2019-12-26 17:04:50 -03:00
Gabriel F. T. Gomes
f8cd102081 Avoid compat symbols for totalorder in powerpc64le IEEE long double
On powerpc64le, the libm_alias_float128_other_r_ldbl macro is
used to create an alias between totalorderf128 and __totalorderlieee128,
as well as between the totalordermagf128 and __totalordermaglieee128.

However, the totalorder* and totalordermag* functions changed their
parameter type since commit ID 42760d7646 and got compat symbols for
their old versions.  With this change, the aforementioned macro would
create two conflicting aliases for __totalorderlieee128 and
__totalordermaglieee128.

This patch avoids the creation of the alias between the IEEE long double
symbols (__totalorderl*ieee128) and the compat symbols, because the IEEE
long double functions have never been exported and thus don't need such
compat symbols.

Tested for powerpc64le.

Reviewed-by: Joseph Myers <joseph@codesourcery.com>
2019-12-23 16:32:20 -03:00
Gabriel F. T. Gomes
3021e78178 ldbl-128ibm-compat: Add *cvt functions
This patch adds IEEE long double versions of q*cvt* functions for
powerpc64le.  Unlike all other long double to/from string conversion
functions, these do not rely on internal functions that can take
floating-point numbers with different formats and act on them
accordingly.  Instead, the related files are rebuilt with the
-mabi=ieeelongdouble compiler flag set.

Having -mabi=ieeelongdouble passed to the compiler causes the object
files to be marked with a .gnu_attribute that is incompatible with the
.gnu_attribute in files built with -mabi=ibmlongdouble (the default).
The difference causes error messages similar to the following:

  ld: libc_pic.a(s_isinfl.os) uses IBM long double,
      libc_pic.a(ieee128-qefgcvt_r.os) uses IEEE long double.
  collect2: error: ld returned 1 exit status
  make[2]: *** [../Makerules:649: libc_pic.os] Error 1

Although this warning is useful in other situations, the library
actually needs to have functions with different long double formats, so
.gnu_attribute generation is explicitly disabled for these files with
the use of -mno-gnu-attribute.

Tested for powerpc64le on the branch that actually enables the
sysdeps/ieee754/ldbl-128ibm-compat for powerpc64le.

Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
2019-12-23 16:32:20 -03:00
Xuelei Zhang
525de033a9 aarch64: Optimized memset for Kunpeng processor.
Due to a branch prediction issue on the Kunpeng processor, we found that
memset_generic performs poorly for medium sizes, so we restructured the
logic and unrolled the loop in set_long by 4 times to solve the problem.
Even sizes below 1K benefit.

Another change is that DZ_ZVA does not seem to work when setting zero, so
we discarded it and used set_long to set zero instead.  Fewer branches
and predictions also slightly improve the zero case.
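
To make the "expanded the loop by 4 times" idea concrete, here is a
plain C sketch of a fill loop unrolled four ways.  It only illustrates
the general technique: the function name memset_unrolled4 and the
16-byte chunk size are assumptions made for this example, and the
actual change is written in AArch64 assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: fill N bytes at DST with the byte C, storing four
   16-byte chunks per loop iteration so that the loop branch is taken
   once per 64 bytes instead of once per 16 bytes.  */
static void *
memset_unrolled4 (void *dst, int c, size_t n)
{
  unsigned char *p = dst;
  unsigned char chunk[16];

  /* Prepare one chunk holding the fill byte.  */
  for (size_t i = 0; i < sizeof chunk; i++)
    chunk[i] = (unsigned char) c;

  /* Main loop, unrolled four times.  */
  while (n >= 64)
    {
      memcpy (p, chunk, 16);
      memcpy (p + 16, chunk, 16);
      memcpy (p + 32, chunk, 16);
      memcpy (p + 48, chunk, 16);
      p += 64;
      n -= 64;
    }

  /* Tail: fewer than 64 bytes remain.  */
  while (n > 0)
    {
      *p++ = (unsigned char) c;
      n--;
    }
  return dst;
}
```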

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2019-12-19 16:31:04 -03:00
Xuelei Zhang
c2150769d0 aarch64: Optimized strlen for strlen_asimd
Optimize the strlen implementation by using vector operations and
loop unrolling in main loop.  Compared to __strlen_generic, it reduces
latency of cases in bench-strlen by 7%~18% when the length of src
is greater than 128 bytes, with gains throughout the benchmark.

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2019-12-19 16:31:04 -03:00
Xuelei Zhang
0db8e7b366 aarch64: Add Huawei Kunpeng to tunable cpu list
The Kunpeng processor is a 64-bit Arm-compatible CPU released by Huawei,
and we have already signed a copyright assignment with the FSF.

This patch adds it to the cpu list, and the related macro for IFUNC.

Checked on aarch64-linux-gnu.

Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
2019-12-19 16:31:04 -03:00
Xuelei Zhang
a7611806d5 aarch64: Optimized implementation of memrchr
Considering the excellent performance of memchr.S on glibc 2.30, the
same algorithm is used to find chrin. Compared to memrchr.c, this
method with memrchr.S achieves an average performance improvement
of 58% based on benchtest and its extension cases.

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2019-12-19 16:31:04 -03:00
Xuelei Zhang
2911cb68ed aarch64: Optimized implementation of strnlen
Optimize the strnlen implementation by using vector operations and
loop unrolling in main loop. Compared to aarch64/strnlen.S, it
reduces latency of cases in bench-strnlen by 11%~24% when the length
of src is greater than 64 bytes, with gains throughout the benchmark.

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2019-12-19 16:31:04 -03:00
Xuelei Zhang
0237b61526 aarch64: Optimized implementation of strcpy
Optimize the strcpy implementation by using vector loads and operations
in main loop.  Compared to aarch64/strcpy.S, it reduces latency of cases
in bench-strlen by 5%~18% when the length of src is greater than 64
bytes, with gains throughout the benchmark.

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2019-12-19 16:31:04 -03:00
Xuelei Zhang
233efd433d aarch64: Optimized implementation of memcmp
The loop body is expanded from a 16-byte comparison to a 64-byte
comparison, and the ldp addressing mode is changed from post-index
to base plus offset.  As a result, comparisons are around 18% faster
overall for sizes above 128 bytes.

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2019-12-19 16:31:04 -03:00
Adhemerval Zanella
442d9c9c67 Consolidate wait3 implementations
The generic one calls wait4.

Checked on x86_64-linux-gnu.
2019-12-19 16:11:09 -03:00
Adhemerval Zanella
848791557b Implement waitpid in terms of wait4
This also consolidates all waitpid implementations.

Checked on x86_64-linux-gnu.
2019-12-19 16:11:09 -03:00
Adhemerval Zanella
9b2cf9482a linux: Use waitid on wait4 if __NR_wait4 is not defined
If the wait4 syscall is not available (such as on y2038 safe 32-bit
systems) waitid should be used instead.  However, prior to Linux 5.4
waitid is not a full superset of other wait syscalls, since it
does not include support for waiting for the current process group.

It is possible to emulate wait4 by issuing an extra syscall to get
the current process group, but it is inherently racy: after the current
process group is received and before it is passed to waitid a signal
could arrive causing the current process group to change.

So waitid is used when wait4 is not defined only if the build is
configured with a minimum kernel of 5.4+.  A new assume macro,
__ASSUME_WAITID_PID0_P_PGID, is added and an error is issued if waitid
can not be implemented by either __NR_wait4 or
__NR_waitid && __ASSUME_WAITID_PID0_P_PGID.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Co-authored-by: Alistair Francis <alistair.francis@wdc.com>
2019-12-19 16:11:09 -03:00
Adhemerval Zanella
c5cbdacb8a Implement wait in terms of waitpid
The POSIX implementation is used as default and both the BSD and Linux
versions are removed.  It simplifies the implementation for
architectures that do not provide either __NR_waitpid or
__NR_wait4.
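
The relationship can be sketched in a few lines of C: wait behaves like
waitpid called for any child with no options.  The names my_wait and
demo_wait are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for wait: wait for any child, no options.  */
static pid_t
my_wait (int *status)
{
  return waitpid ((pid_t) -1, status, 0);
}

/* Fork a child that exits with status 7 and reap it with my_wait.
   Returns the child's exit status, or -1 on any failure.  */
static int
demo_wait (void)
{
  pid_t child = fork ();
  if (child < 0)
    return -1;
  if (child == 0)
    _exit (7);

  int status;
  pid_t reaped = my_wait (&status);
  if (reaped != child || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```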

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
2019-12-19 16:11:09 -03:00