Commit Graph

372 Commits

Author SHA1 Message Date
Adhemerval Zanella
320768a664 linux: Re-flow and sort multiline Makefile definitions 2023-04-20 10:40:54 -03:00
Adhemerval Zanella Netto
33237fe83d Remove --enable-tunables configure option
And make always supported.  The configure option was added on glibc 2.25
and some features require it (such as hwcap mask, huge pages support, and
lock elisition tuning).  It also simplifies the build permutations.

Changes from v1:
 * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs
   more discussion.
 * Cleanup more code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-03-29 14:33:06 -03:00
Adhemerval Zanella Netto
2053c11331 linux: Add clone3 CLONE_CLEAR_SIGHAND optimization to posix_spawn
The clone3 flag resets all signal handlers of the child not set to
SIG_IGN to SIG_DFL.  It allows to skip most of the sigaction calls
to setup child signal handling, where previously a posix_spawn
had to issue 2 times NSIG sigaction calls (one to obtain the current
disposition and another to set either SIG_DFL or SIG_IGN).

With POSIX_SPAWN_SETSIGDEF the child will setup the signal for the case
where the disposition is SIG_IGN.

The code must handle the fallback where clone3 is not available. This is
done by splitting __clone_internal_fallback from __clone_internal.

Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-02-01 08:42:11 -03:00
Florian Weimer
c1c0dea388 Linux: Remove epoll_create, inotify_init from syscalls.list
Their presence causes stub warnings to be created on architectures
which do not implement them.

Fixes commit d1d23b1342 ("Lninux: consolidate
epoll_create implementation") and commit 842128f160
("Linux: consolidate inotify_init implementation").

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-12-19 13:28:14 +01:00
Florian Weimer
9a5b1d84fb Linux: Reflow and sort some Makefile variables
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-12-19 13:28:07 +01:00
Adhemerval Zanella
774058d729 linux: Fix sys/mount.h usage with kernel headers
Now that kernel exports linux/mount.h and includes it on linux/fs.h,
its definitions might clash with glibc exports sys/mount.h.  To avoid
the need to rearrange the Linux header to be always after glibc one,
the glibc sys/mount.h is changed to:

  1. Undefine the macros also used as enum constants.  This covers prior
     inclusion of <linux/mount.h> (for instance MS_RDONLY).

  2. Include <linux/mount.h> based on the usual __has_include check
     (needs to use __has_include ("linux/mount.h") to paper over GCC
     bugs.

  3. Define enum fsconfig_command only if FSOPEN_CLOEXEC is not defined.
     (FSOPEN_CLOEXEC should be a very close proxy.)

  4. Define struct mount_attr if MOUNT_ATTR_SIZE_VER0 is not defined.
     (Added in the same commit on the Linux side.)

This patch also adds some tests to check if including linux/fs.h and
linux/mount.h after and before sys/mount.h does work.

Checked on x86_64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-08-12 09:15:28 -03:00
Adhemerval Zanella
36676f5e5d Remove ldd libc4 support
The older libc versions are obsolete for over twenty years now.
2022-08-04 10:03:45 -03:00
Florian Weimer
0c5605989f Linux: dirent/tst-readdir64-compat needs to use TEST_COMPAT (bug 27654)
The hppa port starts libc at GLIBC_2.2, but has earlier symbol
versions in other shared objects.  This means that the compat
symbol for readdir64 is not actually present in libc even though
have-GLIBC_2.1.3 is defined as yes at the make level.

Fixes commit 15e50e6c96 ("Linux:
dirent/tst-readdir64-compat can be a regular test") by mostly
reverting it.
2022-07-25 11:39:03 +02:00
Adhemerval Zanella
c3b02b6567 linux: Add tst-mount to check for Linux new mount API
The new mount API was added on Linux 5.2 with six new syscalls:
fsopen, fsconfig, fsmount, move_mount, fspick, and open_tree.

The new test verifies minimal functionality along with error paths
for specific arguments and their corner cases.

Checked on x86_64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-07-05 10:08:48 -03:00
Adhemerval Zanella
6c0eedd97e linux: Add fsopen
It was added on Linux 5.2 (24dcb3d90a1f67fe08c68a004af37df059d74005)
to start the process of preparing to create a superblock that will
then be mountable, using an fd as a context handle.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-06-24 16:03:15 -03:00
Adhemerval Zanella
1002f1af1c linux: Add process_mrelease
Added in Linux 5.15 (884a7e5964e06ed93c7771c0d7cf19c09a8946f1), the new
syscalls allows a caller to free the memory of a dying target process.

Checked on x86_64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-06-02 15:43:28 -03:00
Adhemerval Zanella
d19ee3473d linux: Add process_madvise
It was added on Linux 5.10 (ecb8ac8b1f146915aa6b96449b66dd48984caacc)
with the same functionality as madvise but using a pidfd of the target
process.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-06-02 15:43:28 -03:00
Adhemerval Zanella
d2a1ec2097 linux: Add tst-pidfd.c
To check for the pidfd functions pidfd_open, pidfd_getfd, pid_send_signal,
and waitid with P_PIDFD.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2022-05-17 10:36:59 -03:00
Adhemerval Zanella
97f5d19c45 linux: Add pidfd_open
This was added on Linux 5.3 (32fcb426ec001cb6d5a4a195091a8486ea77e2df)
as a way to retrieve a pid file descriptors for process that has not
been created CLONE_PIDFD (by usual fork/clone).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2022-05-17 10:32:28 -03:00
Szabolcs Nagy
9faf5262c7 linux: Add a getauxval test [BZ #23293]
This is for bug 23293 and it relies on the glibc test system running
tests via explicit ld.so invokation by default.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-05-17 10:14:03 +01:00
H.J. Lu
1fe00d3eb6 build: Properly generate .d dependency files [BZ #28922]
1. Also generate .d dependency files for $(tests-container) and
$(tests-printers).
2. elf: Add tst-auditmod17.os to extra-test-objs.
3. iconv: Add tst-gconv-init-failure-mod.os to extra-test-objs.
4. malloc: Rename extra-tests-objs to extra-test-objs.
5. linux: Add tst-sysconf-iov_max-uapi.o to extra-test-objs.
6. x86_64: Add tst-x86_64mod-1.o, tst-platformmod-2.o, test-libmvec.o,
test-libmvec-avx.o, test-libmvec-avx2.o and test-libmvec-avx512f.o to
extra-test-objs.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-02-25 10:35:45 -08:00
Adhemerval Zanella
948ce73b31 Linux: Only generate 64 bit timestamps for 64 bit time_t recvmsg/recvmmsg
The timestamps created by __convert_scm_timestamps only make sense for
64 bit time_t programs, 32 bit time_t programs will ignore 64 bit time_t
timestamps since SO_TIMESTAMP will be defined to old values (either by
glibc or kernel headers).

Worse, if the buffer is not suffice MSG_CTRUNC is set to indicate it
(which breaks some programs [1]).

This patch makes only 64 bit time_t recvmsg and recvmmsg to call
__convert_scm_timestamps.  Also, the assumption to called it is changed
from __ASSUME_TIME64_SYSCALLS to __TIMESIZE != 64 since the setsockopt
might be called by libraries built without __TIME_BITS=64.  The
MSG_CTRUNC is only set for the 64 bit symbols, it should happen only
if 64 bit time_t programs run older kernels.

Checked on x86_64-linux-gnu and i686-linux-gnu.

[1] https://github.com/systemd/systemd/pull/20567

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-01-28 18:18:27 -03:00
Adhemerval Zanella
8fba672472 linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349, BZ#28350)
The __convert_scm_timestamps only updates the control message last
pointer for SOL_SOCKET type, so if the message control buffer contains
multiple ancillary message types the converted timestamp one might
overwrite a valid message.

The test checks if the extra ancillary space is correctly handled
by recvmsg/recvmmsg, where if there is no extra space for the 64-bit
time_t converted message the control buffer should be marked with
MSG_TRUNC.  It also check if recvmsg/recvmmsg handle correctly multiple
ancillary data.

Checked on x86_64-linux and on i686-linux-gnu on both 5.11 and
4.15 kernel.

Co-authored-by: Fabian Vogt <fvogt@suse.de>

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-01-28 17:46:44 -03:00
Siddhesh Poyarekar
23e0e8f5f1 getcwd: Set errno to ERANGE for size == 1 (CVE-2021-3999)
No valid path returned by getcwd would fit into 1 byte, so reject the
size early and return NULL with errno set to ERANGE.  This change is
prompted by CVE-2021-3999, which describes a single byte buffer
underflow and overflow when all of the following conditions are met:

- The buffer size (i.e. the second argument of getcwd) is 1 byte
- The current working directory is too long
- '/' is also mounted on the current working directory

Sequence of events:

- In sysdeps/unix/sysv/linux/getcwd.c, the syscall returns ENAMETOOLONG
  because the linux kernel checks for name length before it checks
  buffer size

- The code falls back to the generic getcwd in sysdeps/posix

- In the generic func, the buf[0] is set to '\0' on line 250

- this while loop on line 262 is bypassed:

    while (!(thisdev == rootdev && thisino == rootino))

  since the rootfs (/) is bind mounted onto the directory and the flow
  goes on to line 449, where it puts a '/' in the byte before the
  buffer.

- Finally on line 458, it moves 2 bytes (the underflowed byte and the
  '\0') to the buf[0] and buf[1], resulting in a 1 byte buffer overflow.

- buf is returned on line 469 and errno is not set.

This resolves BZ #28769.

Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Signed-off-by: Qualys Security Advisory <qsa@qualys.com>
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-01-24 11:00:17 +05:30
Adhemerval Zanella
5f3a7ebc35 Linux: Add epoll_pwait2 (BZ #27359)
It is similar to epoll_wait, with the difference the timeout has
nanosecond resoluting by using struct timespec instead of int.

Although Linux interface only provides 64 bit time_t support, old
32 bit interface is also provided (so keep in sync with current
practice and to no force opt-in on 64 bit time_t).

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-01-17 14:34:54 -03:00
Adhemerval Zanella
572e0c8554 Revert "linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349, BZ #28350)"
This reverts commit 21e0f45c7d.
2022-01-12 10:35:06 -03:00
Adhemerval Zanella
21e0f45c7d linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349, BZ #28350)
The __convert_scm_timestamps() only updates the control message last
pointer for SOL_SOCKET type, so if the message control buffer contains
multiple ancillary message types the converted timestamp one might
overwrite a valid message.

The test check if the extra ancillary space is correctly handled
by recvmsg/recvmmsg, where if there is no extra space for the 64-bit
time_t converted message the control buffer should be marked with
MSG_TRUNC.  It also check if recvmsg/recvmmsg handle correctly multiple
ancillary data.

Checked on x86_64-linux and on i686-linux-gnu on both 5.11 and
4.15 kernel.

Co-authored-by: Fabian Vogt <fvogt@suse.de>
2022-01-12 10:30:10 -03:00
Florian Weimer
c901c3e764 nptl: Add public rseq symbols and <sys/rseq.h>
The relationship between the thread pointer and the rseq area
is made explicit.  The constant offset can be used by JIT compilers
to optimize rseq access (e.g., for really fast sched_getcpu).

Extensibility is provided through __rseq_size and __rseq_flags.
(In the future, the kernel could request a different rseq size
via the auxiliary vector.)

Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-12-09 09:49:32 +01:00
Florian Weimer
e3e589829d nptl: Add glibc.pthread.rseq tunable to control rseq registration
This tunable allows applications to register the rseq area instead
of glibc.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-09 09:49:32 +01:00
Florian Weimer
95e114a091 nptl: Add rseq registration
The rseq area is placed directly into struct pthread.  rseq
registration failure is not treated as an error, so it is possible
that threads run with inconsistent registration status.

<sys/rseq.h> is not yet installed as a public header.

Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-09 09:49:32 +01:00
Adhemerval Zanella
5b3e31e312 linux: Implement mremap in C
Variadic function calls in syscalls.list does not work for all ABIs
(for instance where the argument are passed on the stack instead of
registers) and might have underlying issues depending of the variadic
type (for instance if a 64-bit argument is used).

Checked on x86_64-linux-gnu.
2021-11-30 13:13:03 -03:00
Adhemerval Zanella
83008fa495 linux: Add prlimit64 C implementation
The LFS prlimit64 requires a arch-specific implementation in
syscalls.list.  Instead add a generic one that handles the
required symbol alias for __RLIM_T_MATCHES_RLIM64_T.

HPPA is the only outlier which requires a different default
symbol.

Checked on x86_64-linux-gnu and with build for the affected ABIs.
2021-11-30 13:13:03 -03:00
Adhemerval Zanella
d150181d73 linux: Add fanotify_mark C implementation
Passing 64-bit arguments on syscalls.list is tricky: it requires
to reimplement the expected kernel abi in each architecture.  This
is way to better to represent in C code where we already have
macros for this (SYSCALL_LL64).

Checked on x86_64-linux-gnu.
2021-11-25 09:56:57 -03:00
Adhemerval Zanella
456b3c08b6 io: Refactor close_range and closefrom
Now that Hurd implementis both close_range and closefrom (f2c996597d),
we can make close_range() a base ABI, and make the default closefrom()
implementation on top of close_range().

The generic closefrom() implementation based on __getdtablesize() is
moved to generic close_range().  On Linux it will be overriden by
the auto-generation syscall while on Hurd it will be a system specific
implementation.

The closefrom() now calls close_range() and __closefrom_fallback().
Since on Hurd close_range() does not fail, __closefrom_fallback() is an
empty static inline function set by__ASSUME_CLOSE_RANGE.

The __ASSUME_CLOSE_RANGE also allows optimize Linux
__closefrom_fallback() implementation when --enable-kernel=5.9 or
higher is used.

Finally the Linux specific tst-close_range.c is moved to io and
enabled as default.  The Linuxism and CLOSE_RANGE_UNSHARE are
guarded so it can be built for Hurd (I have not actually test it).

Checked on x86_64-linux-gnu, i686-linux-gnu, and with a i686-gnu
build.
2021-11-24 09:09:37 -03:00
Florian Weimer
8b2c706a9d socket: Add time64 alias for sendmmsg
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-07-21 11:58:16 +02:00
Florian Weimer
b39ffab860 Linux: Add time64 alias for prctl
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-07-21 11:58:16 +02:00
H.J. Lu
84d40d702f Add static tests for __clone_internal
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-07-14 06:55:04 -07:00
H.J. Lu
d8ea0d0168 Add an internal wrapper for clone, clone2 and clone3
The clone3 system call (since Linux 5.3) provides a superset of the
functionality of clone and clone2.  It also provides a number of API
improvements, including the ability to specify the size of the child's
stack area which can be used by kernel to compute the shadow stack size
when allocating the shadow stack.  Add:

extern int __clone_internal (struct clone_args *__cl_args,
			     int (*__func) (void *__arg), void *__arg);

to provide an abstract interface for clone, clone2 and clone3.

1. Simplify stack management for thread creation by passing both stack
base and size to create_thread.
2. Consolidate clone vs clone2 differences into a single file.
3. Call __clone3 if HAVE_CLONE3_WAPPER is defined.  If __clone3 returns
-1 with ENOSYS, fall back to clone or clone2.
4. Use only __clone_internal to clone a thread.  Since the stack size
argument for create_thread is now unconditional, always pass stack size
to create_thread.
5. Enable the public clone3 wrapper in the future after it has been
added to all targets.

NB: Sandbox will return ENOSYS on clone3 in both Chromium:

The following revision refers to this bug:
  218438259d

commit 218438259dd795456f0a48f67cbe5b4e520db88b
Author: Matthew Denton <mpdenton@chromium.org>
Date: Thu Jun 03 20:06:13 2021

Linux sandbox: return ENOSYS for clone3

Because clone3 uses a pointer argument rather than a flags argument, we
cannot examine the contents with seccomp, which is essential to
preventing sandboxed processes from starting other processes. So, we
won't be able to support clone3 in Chromium. This CL modifies the
BPF policy to return ENOSYS for clone3 so glibc always uses the fallback
to clone.

Bug: 1213452
Change-Id: I7c7c585a319e0264eac5b1ebee1a45be2d782303
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2936184
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Commit-Queue: Matthew Denton <mpdenton@chromium.org>
Cr-Commit-Position: refs/heads/master@{#888980}

[modify] https://crrev.com/218438259dd795456f0a48f67cbe5b4e520db88b/sandbox/linux/seccomp-bpf-helpers/baseline_policy.cc

and Firefox:

https://hg.mozilla.org/integration/autoland/rev/ecb4011a0c76

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-07-14 06:33:58 -07:00
Adhemerval Zanella
72e84d1db2 Linux: Use 32-bit vDSO for clock_gettime, gettimeofday, time (BZ# 28071)
The previous approach defeats the vDSO optimization on older kernels
because a failing clock_gettime64 system call is performed on every
function call.  It also results in a clobbered errno value, exposing
an OpenJDK bug (JDK-8270244).

This patch fixes by open-code INLINE_VSYSCALL macro and replace all
INLINE_SYSCALL_CALL with INTERNAL_SYSCALL_CALLS.  Now for
__clock_gettime64x, the 64-bit vDSO is used and the 32-bit vDSO is
tried before falling back to 64-bit syscalls.

The previous code preferred 64-bit syscall for the case where the kernel
provides 64-bit time_t syscalls *and* also a 32-bit vDSO (in this case
the *64-bit* syscall should be preferable over the vDSO).  All
architectures that provides 32-bit vDSO (i386, mips, powerpc, s390)
modulo sparc; but I am not sure if some kernels versions do provide
only 32-bit vDSO while still providing 64-bit time_t syscall.
Regardless, for such cases the 64-bit time_t syscall is used if the
vDSO returns overflowed 32-bit time_t.

Tested on i686-linux-gnu (with a time64 and non-time64 kernel),
x86_64-linux-gnu.  Built with build-many-glibcs.py.

Co-authored-by: Florian Weimer <fweimer@redhat.com>
2021-07-12 17:37:56 -03:00
Florian Weimer
aaacde11f2 Reduce <limits.h> pollution due to dynamic PTHREAD_STACK_MIN
<limits.h> used to be a header file with no declarations.
GCC's libgomp includes it in a #pragma GCC visibility hidden block.
Including <unistd.h> from <limits.h> (indirectly) declares everything
in <unistd.h> with hidden visibility, resulting in linker failures.

This commit avoids C declarations in assembler mode and only declares
__sysconf in <limits.h> (and not the entire contents of <unistd.h>).
The __sysconf symbol is already part of the ABI.  PTHREAD_STACK_MIN
is no longer defined for __USE_DYNAMIC_STACK_SIZE && __ASSEMBLER__
because there is no possible definition.

Additionally, PTHREAD_STACK_MIN is now defined by <pthread.h> for
__USE_MISC because this is what developers expect based on the macro
name.  It also helps to avoid libgomp linker failures in GCC because
libgomp includes <pthread.h> before its visibility hacks.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-07-12 18:43:32 +02:00
H.J. Lu
5d98a7dae9 Define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN)
The constant PTHREAD_STACK_MIN may be too small for some processors.
Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE.  When
_DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).

Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
provide a constant target specific PTHREAD_STACK_MIN value.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-07-09 15:10:35 -07:00
Adhemerval Zanella
607449506f io: Add closefrom [BZ #10353]
The function closes all open file descriptors greater than or equal to
input argument.  Negative values are clamped to 0, i.e, it will close
all file descriptors.

As indicated by the bug report, this is a common symbol provided by
different systems (Solaris, OpenBSD, NetBSD, FreeBSD) and, although
its has inherent issues with not taking in consideration internal libc
file descriptors (such as syslog), this is also a common feature used
in multiple projects [1][2][3][4][5].

The Linux fallback implementation iterates over /proc and close all
file descriptors sequentially.  Although it was raised the questioning
whether getdents on /proc/self/fd might return disjointed entries
when file descriptor are closed; it does not seems the case on my
testing on multiple kernel (v4.18, v5.4, v5.9) and the same strategy
is used on different projects [1][2][3][5].

Also, the interface is set a fail-safe meaning that a failure in the
fallback results in a process abort.

Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.

[1] 5238e95759/src/basic/fd-util.c (L217)
[2] ddf4b77e11/src/lxc/start.c (L236)
[3] 9e4f2f3a6b/Modules/_posixsubprocess.c (L220)
[4] 5f47c0613e/src/libstd/sys/unix/process2.rs (L303-L308)
[5] https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L82
2021-07-08 14:08:14 -03:00
Adhemerval Zanella
286286283e linux: Add close_range
It was added on Linux 5.9 (278a5fbaed89) with CLOSE_RANGE_CLOEXEC
added on 5.11 (582f1fb6b721f).  Although FreeBSD has added the same
syscall, this only adds the symbol on Linux ports.  This syscall is
required to provided a fail-safe way to implement the closefrom
symbol (BZ #10353).

Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
2021-07-08 14:08:13 -03:00
Florian Weimer
30639e79d3 Linux: Cleanups after librt move
librt.so is no longer installed for PTHREAD_IN_LIBC, and tests
are not linked against it.  $(librt) is introduced globally for
shared tests that need to be linked for both PTHREAD_IN_LIBC
and !PTHREAD_IN_LIBC.

GLIBC_PRIVATE symbols that were needed during the transition are
removed again.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-06-28 09:51:01 +02:00
Adhemerval Zanella
dafab287b4 linux: Only use 64-bit syscall if required for sigtimedwait
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one.  The 64-bit usage should
be rare since the timeout is a relative one.

Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2021-06-22 12:09:52 -03:00
Adhemerval Zanella
2c0982eb93 linux: Only use 64-bit syscall if required for timerfd_settime
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one.  The 64-bit usage should
be rare since the timeout is a relative one.

Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2021-06-22 12:09:52 -03:00
Adhemerval Zanella
9465c3a9fb linux: Remove time64-support
It breaks the usage case of live migration like CRIU or similar
and most usages can be optimized away by either building glibc with
a minimum 5.1 kernel or by using the 32-bit syscall for the common
case.

Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2021-06-22 12:09:52 -03:00
Adhemerval Zanella
ecf2661281 linux: Only use 64-bit syscall if required for ppoll
For !__ASSUME_TIME64_SYSCALLS there is no need to issue a 64-bit syscall
if the provided timeout fits in a 32-bit one.  The 64-bit usage should
be rare since the timeout is a relative one.  This also avoids the need
to use supports_time64() (which breaks the usage case of live migration
like CRIU or similar).

Checked on i686-linux-gnu on a 4.15 kernel and on a 5.11 kernel
(with and without --enable-kernel=5.1) and on x86_64-linux-gnu.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2021-06-22 12:09:52 -03:00
Adhemerval Zanella
088d3291ef y2038: Add test coverage
It is enabled through a new rule, tests-y2038, which is built only
when the ABI supports the comapt 64-bit time_t (defined by the
header time64-compat.h, which also enables the creation of the
symbol Version for Linux).  It means the tests are not built
for ABI which already provide default 64-bit time_t.

The new rule already adds the required LFS and 64-bit time_t
compiler flags.

The current coverage is:

  * libc:
    - adjtime                       tst-adjtime-time64
    - adjtimex                      tst-adjtimex-time64
    - clock_adjtime                 tst-clock_adjtime-time64
    - clock_getres                  tst-clock-time64, tst-cpuclock1-time64
    - clock_gettime                 tst-clock-time64, tst-clock2-time64,
				    tst-cpuclock1-time64
    - clock_nanosleep               tst-clock_nanosleep-time64,
				    tst-cpuclock1-time64
    - clock_settime                 tst-clock2-time64
    - cnd_timedwait                 tst-cnd-timedwait-time64
    - ctime                         tst-ctime-time64
    - ctime_r                       tst-ctime-time64
    - difftime                      tst-difftime-time64
    - fstat                         tst-stat-time64
    - fstatat                       tst-stat-time64
    - futimens                      tst-futimens-time64
    - futimes                       tst-futimes-time64
    - futimesat                     tst-futimesat-time64
    - fts_*                         tst-fts-time64
    - getitimer                     tst-itimer-timer64
    - getrusage
    - gettimeofday                  tst-clock_nanosleep-time64
    - glob / globfree               tst-gnuglob64-time64
    - gmtime                        tst-gmtime-time64
    - gmtime_r                      tst-gmtime-time64
    - lstat                         tst-stat-time64
    - localtime                     tst-y2039-time64
    - localtime_t                   tst-y2039-time64
    - lutimes                       tst-lutimes-time64
    - mktime                        tst-mktime4-time64
    - mq_timedreceive               tst-mqueue{1248}-time64
    - mq_timedsend                  tst-mqueue{1248}-time64
    - msgctl                        test-sysvmsg-time64
    - mtx_timedlock                 tst-mtx-timedlock-time64
    - nanosleep                     tst-cpuclock{12}-time64,
				    tst-mqueue8-time64, tst-clock-time64
    - nftw / ftw                    ftwtest-time64
    - ntp_adjtime                   tst-ntp_adjtime-time64
    - ntp_gettime                   tst-ntp_gettime-time64
    - ntp_gettimex                  tst-ntp_gettimex-time64
    - ppoll                         tst-ppoll-time64
    - pselect                       tst-pselect-time64
    - pthread_clockjoin_np          tst-join14-time64
    - pthread_cond_clockwait        tst-cond11-time64
    - pthread_cond_timedwait        tst-abstime-time64
    - pthread_mutex_clocklock       tst-abstime-time64
    - pthread_mutex_timedlock       tst-abstime-time64
    - pthread_rwlock_clockrdlock    tst-abstime-time64, tst-rwlock14-time64
    - pthread_rwlock_clockwrlock    tst-abstime-time64, tst-rwlock14-time64
    - pthread_rwlock_timedrdlock    tst-abstime-time64, tst-rwlock14-time64
    - pthread_rwlock_timedwrlock    tst-abstime-time64, tst-rwlock14-time64
    - pthread_timedjoin_np          tst-join14-time64
    - recvmmsg                      tst-cancel4_2-time64
    - sched_rr_get_interval         tst-sched_rr_get_interval-time64
    - select                        tst-select-time64
    - sem_clockwait                 tst-sem5-time64
    - sem_timedwait                 tst-sem5-time64
    - semctl                        test-sysvsem-time64
    - semtimedop                    test-sysvsem-time64
    - setitimer                     tst-mqueue2-time64, tst-itimer-timer64
    - settimeofday                  tst-settimeofday-time64
    - shmctl                        test-sysvshm-time64
    - sigtimedwait                  tst-sigtimedwait-time64
    - stat                          tst-stat-time64
    - thrd_sleep                    tst-thrd-sleep-time64
    - time                          tst-mqueue{1248}-time64
    - timegm                        tst-timegm-time64
    - timer_gettime                 tst-timer4-time64
    - timer_settime                 tst-timer4-time64
    - timerfd_gettime               tst-timerfd-time64
    - timerfd_settime               tst-timerfd-time64
    - timespec_get                  tst-timespec_get-time64
    - timespec_getres               tst-timespec_getres-time64
    - utime                         tst-utime-time64
    - utimensat                     tst-utimensat-time64
    - utimes                        tst-utimes-time64
    - wait3                         tst-wait3-time64
    - wait4                         tst-wait4-time64

  * librt:
    - aio_suspend                   tst-aio6-time64
    - mq_timedreceive               tst-mqueue{1248}-time64
    - mq_timedsend                  tst-mqueue{1248}-time64
    - timer_gettime                 tst-timer4-time64
    - timer_settime                 tst-timer4-time64

  * libanl:
    - gai_suspend

Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Adhemerval Zanella
47f24c21ee y2038: Add support for 64-bit time on legacy ABIs
A new build flag, _TIME_BITS, enables the usage of the newer 64-bit
time symbols for legacy ABI (where 32-bit time_t is default).  The 64
bit time support is only enabled if LFS (_FILE_OFFSET_BITS=64) is
also used.

Different than LFS support, the y2038 symbols are added only for the
required ABIs (armhf, csky, hppa, i386, m68k, microblaze, mips32,
mips64-n32, nios2, powerpc32, sparc32, s390-32, and sh).  The ABIs with
64-bit time support are unchanged, both for symbol and types
redirection.

On Linux the full 64-bit time support requires a minimum of kernel
version v5.1.  Otherwise, the 32-bit fallbacks are used and might
results in error with overflow return code (EOVERFLOW).

The i686-gnu does not yet support 64-bit time.

This patch exports following rediretions to support 64-bit time:

  * libc:
    adjtime
    adjtimex
    clock_adjtime
    clock_getres
    clock_gettime
    clock_nanosleep
    clock_settime
    cnd_timedwait
    ctime
    ctime_r
    difftime
    fstat
    fstatat
    futimens
    futimes
    futimesat
    getitimer
    getrusage
    gettimeofday
    gmtime
    gmtime_r
    localtime
    localtime_r
    lstat_time
    lutimes
    mktime
    msgctl
    mtx_timedlock
    nanosleep
    nanosleep
    ntp_gettime
    ntp_gettimex
    ppoll
    pselec
    pselect
    pthread_clockjoin_np
    pthread_cond_clockwait
    pthread_cond_timedwait
    pthread_mutex_clocklock
    pthread_mutex_timedlock
    pthread_rwlock_clockrdlock
    pthread_rwlock_clockwrlock
    pthread_rwlock_timedrdlock
    pthread_rwlock_timedwrlock
    pthread_timedjoin_np
    recvmmsg
    sched_rr_get_interval
    select
    sem_clockwait
    semctl
    semtimedop
    sem_timedwait
    setitimer
    settimeofday
    shmctl
    sigtimedwait
    stat
    thrd_sleep
    time
    timegm
    timerfd_gettime
    timerfd_settime
    timespec_get
    utime
    utimensat
    utimes
    utimes
    wait3
    wait4

  * librt:
    aio_suspend
    mq_timedreceive
    mq_timedsend
    timer_gettime
    timer_settime

  * libanl:
    gai_suspend

Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Adhemerval Zanella
7194337c3e y2038: Use a common definition for shmid_ds
Instead of replicate the same definitions from struct_shmid64_ds.h
on the multiple struct_shmid_ds.h, use a common header which is included
when required (struct_shmid64_ds_helper.h).

The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit semctl implementation.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Adhemerval Zanella
f98beb65f5 y2038: Use a common definition for semid_ds
Instead of replicate the same definitions from struct_semid64_ds.h
on the multiple struct_semid_ds.h, use a common header which is included
when required (struct_semid64_ds_helper.h).

The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit semctl implementation.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Lukasz Majewski
b997083e3d y2038: Use a common definition for msqid_ds
Instead of replicate the same definitions from struct_msqid64_ds.h
on the multiple struct_msqid_ds.h, use a common header which is included
when required (struct_msqid64_ds_helper.h).

The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit stat implementations.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Lukasz Majewski
4e8521333b y2038: Use a common definition for stat
Instead of replicate the same definitions from struct_stat_time64.h
on the multiple struct_stat.h, use a common header which is included
when required (struct_stat_time64_helper.h).  The 64-bit time support
is added only for LFS support.

The __USE_TIME_BITS64 is not defined internally yet, although the
internal header is used when building the 64-bit stat implementations.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Adhemerval Zanella
13c51549e2 linux: Add fallback for 64-bit time_t SO_TIMESTAMP{NS}
The recvmsg handling is more complicated because it requires check the
returned kernel control message and make some convertions.  For
!__ASSUME_TIME64_SYSCALLS it converts the first 32-bit time SO_TIMESTAMP
or SO_TIMESTAMPNS and appends it to the control buffer if has extra
space or returns MSG_CTRUNC otherwise.  The 32-bit time field is kept
as-is.

Calls with __TIMESIZE=32 will see the converted 64-bit time control
messages as spurious control message of unknown type.  Calls with
__TIMESIZE=64 running on pre-time64 kernels will see the original
message as a spurious control ones of unknown typ while running on
kernel with native 64-bit time support will only see the time64 version
of the control message.

Checked on x86_64-linux-gnu and i686-linux-gnu (on 5.4 and on 4.15
kernel).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:06 -03:00