glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-03 02:11:08 +00:00

Author	SHA1	Message	Date
Adhemerval Zanella	4e16d89866	linux: Make fdopendir fail with O_PATH (BZ 30373) It is not strictly required by POSIX, since O_PATH is a Linux extension, but it is QoI to fail early instead of at readdir. Also the check is free, since fdopendir already checks if the file descriptor is opened for read. Checked on x86_64-linux-gnu.	2023-11-30 13:37:04 -03:00
Adhemerval Zanella	78ed8bdf4f	linux: Add PR_SET_VMA_ANON_NAME support Linux 5.17 added support to naming anonymous virtual memory areas through the prctl syscall. The __set_vma_name is a wrapper to avoid optimizing the prctl call if the kernel does not support it. If the kernel does not support PR_SET_VMA_ANON_NAME, prctl returns EINVAL. And it also returns the same error for an invalid argument. Since it is an internal-only API, it assumes well-formatted input: aligned START, with (START, START+LEN) being a valid memory range, and NAME with a limit of 80 characters without an invalid one ("\\`$[]"). Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-07 10:27:20 -03:00
Adhemerval Zanella Netto	e7190fc73d	linux: Add pidfd_getpid This interface allows to obtain the associated process ID from the process file descriptor. It is done by parsing the procps fdinfo information. Its prototype is: pid_t pidfd_getpid (int fd) It returns the associated pid or -1 in case of an error and sets the errno accordingly. The possible errno values are those from open, read, and close (used on procps parsing), along with: - EBADF if the FD is negative, does not have a PID associated, or if the fdinfo fields contain a value larger than pid_t. - EREMOTE if the PID is in a separate namespace. - ESRCH if the process is already terminated. Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid support), Linux 5.4 (full support), and Linux 6.2. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2023-09-05 13:08:59 -03:00
Adhemerval Zanella Netto	0d6f9f6265	posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349) Returning a pidfd allows a process to keep a race-free handle for a child process; otherwise, the caller will need to either use pidfd_open (which still might be subject to TOCTOU) or keep the old racy interface based on pid_t. To correctly use pidfd_spawn, the kernel must support not only returning the pidfd with clone/clone3 but also waitid (P_PIDFD) (added on Linux 5.4). If the kernel does not support the waitid, pidfd_spawn returns ENOSYS. It avoids the need for racy workarounds, such as reading the procfs fdinfo to get the pid to use along with other wait interfaces. These interfaces are similar to posix_spawn and posix_spawnp, with the only difference being that they return a process file descriptor (int) instead of a process ID (pid_t). Their prototypes are: int pidfd_spawn (int *restrict pidfd, const char *restrict file, const posix_spawn_file_actions_t *restrict facts, const posix_spawnattr_t *restrict attrp, char *const argv[restrict], char *const envp[restrict]) int pidfd_spawnp (int *restrict pidfd, const char *restrict path, const posix_spawn_file_actions_t *restrict facts, const posix_spawnattr_t *restrict attrp, char *const argv[restrict_arr], char *const envp[restrict_arr]); A new symbol is used instead of a posix_spawn extension to avoid possible issues with language bindings that might track the return argument lifetime. Although on Linux pid_t and int are interchangeable, POSIX only states that pid_t should be a signed integer. Both symbols reuse the posix_spawn posix_spawn_file_actions_t and posix_spawnattr_t, to avoid rehashing the posix_spawn API or adding a new one. It also means that both interfaces support the same attribute and file actions, and a new flag or file action on posix_spawn is also added automatically for pidfd_spawn. Also, using posix_spawn plumbing allows the reusing of most of the current testing with some changes: - waitid is used instead of waitpid since it is a more generic interface. - tst-posix_spawn-setsid.c is adapted to take into consideration that the caller can check for session id directly. The test now spawns itself and writes the session id as a file instead. - tst-spawn3.c needs to know where pidfd_spawn is used so it keeps an extra file description unused. Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid support), Linux 5.4 (full support), and Linux 6.2. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2023-09-05 13:08:59 -03:00
Adhemerval Zanella Netto	ce2bfb8569	linux: Add posix_spawnattr_{get, set}cgroup_np (BZ 26371) These functions allow to posix_spawn and posix_spawnp to use CLONE_INTO_CGROUP with clone3, allowing the child process to be created in a different cgroup version 2. These are GNU extensions that are available only for Linux, and also only for the architectures that implement clone3 wrapper (HAVE_CLONE3_WRAPPER). To create a process on a different cgroupv2, one can use the: posix_spawnattr_t attr; posix_spawnattr_init (&attr); posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP); posix_spawnattr_setcgroup_np (&attr, cgroup); posix_spawn (...) Similar to other posix_spawn flags, POSIX_SPAWN_SETCGROUP control whether the cgroup file descriptor will be used or not with clone3. There is no fallback if either clone3 does not support the flag or if the architecture does not provide the clone3 wrapper, in this case posix_spawn returns EOPNOTSUPP. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2023-09-05 13:08:48 -03:00
Frédéric Bérat	20c894d21e	Exclude routines from fortification Since the _FORTIFY_SOURCE feature uses some routines of Glibc, they need to be excluded from the fortification. On top of that: - some tests explicitly verify that some level of fortification works appropriately, we therefore shouldn't modify the level set for them. - some objects need to be built with optimization disabled, which prevents _FORTIFY_SOURCE from being used for them. Assembler files that implement architecture specific versions of the fortified routines were not excluded from _FORTIFY_SOURCE as there is no C header included that would impact their behavior. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-07-05 16:59:48 +02:00
Adhemerval Zanella	a9fed5ea81	linux: Split tst-ttyname The tst-ttyname-direct.c checks the ttyname with procfs mounted in bind mode (MS_BIND|MS_REC), while tst-ttyname-namespace.c checks with procfs mount with MS_NOSUID|MS_NOEXEC|MS_NODEV in a new namespace. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-06-28 09:18:23 -03:00
Carlos O'Donell	dccee96e6d	linux: Reformat Makefile. Reflow Makefile. Sort using scripts/sort-makefile-lines.py. No code generation changes observed in binary artifacts. No regressions on x86_64 and i686.	2023-05-16 07:19:31 -04:00
Samuel Thibault	e2b3d7f485	hurd 64bit: Fix struct msqid_ds and shmid_ds fields The standards want msg_lspid/msg_lrpid/shm_cpid/shm_lpid to be pid_t, see BZ 23083 and 23085. We can leave them __rpc_pid_t on i386 for ABI compatibility, but avoid hitting the issue on 64bit.	2023-05-01 15:07:51 +02:00
Samuel Thibault	e3a3616dbf	hurd 64bit: Fix ipc_perm fields types The standards want uid/cuid to be uid_t, gid/cgid to be gid_t and mode to be mode_t, see BZ 23082. We can leave them short ints on i386 for ABI compatibility, but avoid hitting the issue on 64bit. bits/ipc.h ends up being exactly the same in sysdeps/gnu/ and sysdeps/unix/sysv/linux/, so remove the latter.	2023-05-01 15:05:09 +02:00
H.J. Lu	a443bd3fb2	__check_pf: Add a cancellation cleanup handler [BZ #20975 ] There are reports for hang in __check_pf: https://github.com/JoeDog/siege/issues/4 It is reproducible only under specific configurations: 1. Large number of cores (>= 64) and large number of threads (> 3X of the number of cores) with long lived socket connection. 2. Low power (frequency) mode. 3. Power management is enabled. While holding lock, __check_pf calls make_request which calls __sendto and __recvmsg. Since __sendto and __recvmsg are cancellation points, lock held by __check_pf won't be released and can cause deadlock when thread cancellation happens in __sendto or __recvmsg. Add a cancellation cleanup handler for __check_pf to unlock the lock when cancelled by another thread. This fixes BZ #20975 and the siege hang issue.	2023-04-28 13:38:38 -07:00
Adhemerval Zanella	320768a664	linux: Re-flow and sort multiline Makefile definitions	2023-04-20 10:40:54 -03:00
Adhemerval Zanella Netto	33237fe83d	Remove --enable-tunables configure option And make it always supported. The configure option was added on glibc 2.25 and some features require it (such as hwcap mask, huge pages support, and lock elision tuning). It also simplifies the build permutations. Changes from v1: * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs more discussion. * Cleanup more code. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-03-29 14:33:06 -03:00
Adhemerval Zanella Netto	2053c11331	linux: Add clone3 CLONE_CLEAR_SIGHAND optimization to posix_spawn The clone3 flag resets all signal handlers of the child not set to SIG_IGN to SIG_DFL. It allows to skip most of the sigaction calls to setup child signal handling, where previously a posix_spawn had to issue 2 times NSIG sigaction calls (one to obtain the current disposition and another to set either SIG_DFL or SIG_IGN). With POSIX_SPAWN_SETSIGDEF the child will setup the signal for the case where the disposition is SIG_IGN. The code must handle the fallback where clone3 is not available. This is done by splitting __clone_internal_fallback from __clone_internal. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-02-01 08:42:11 -03:00
Florian Weimer	c1c0dea388	Linux: Remove epoll_create, inotify_init from syscalls.list Their presence causes stub warnings to be created on architectures which do not implement them. Fixes commit `d1d23b1342` ("Linux: consolidate epoll_create implementation") and commit `842128f160` ("Linux: consolidate inotify_init implementation"). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-12-19 13:28:14 +01:00
Florian Weimer	9a5b1d84fb	Linux: Reflow and sort some Makefile variables Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-12-19 13:28:07 +01:00
Adhemerval Zanella	774058d729	linux: Fix sys/mount.h usage with kernel headers Now that kernel exports linux/mount.h and includes it on linux/fs.h, its definitions might clash with glibc exports sys/mount.h. To avoid the need to rearrange the Linux header to be always after glibc one, the glibc sys/mount.h is changed to: 1. Undefine the macros also used as enum constants. This covers prior inclusion of <linux/mount.h> (for instance MS_RDONLY). 2. Include <linux/mount.h> based on the usual __has_include check (needs to use __has_include ("linux/mount.h") to paper over GCC bugs. 3. Define enum fsconfig_command only if FSOPEN_CLOEXEC is not defined. (FSOPEN_CLOEXEC should be a very close proxy.) 4. Define struct mount_attr if MOUNT_ATTR_SIZE_VER0 is not defined. (Added in the same commit on the Linux side.) This patch also adds some tests to check if including linux/fs.h and linux/mount.h after and before sys/mount.h does work. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-12 09:15:28 -03:00
Adhemerval Zanella	36676f5e5d	Remove ldd libc4 support The older libc versions are obsolete for over twenty years now.	2022-08-04 10:03:45 -03:00
Florian Weimer	0c5605989f	Linux: dirent/tst-readdir64-compat needs to use TEST_COMPAT (bug 27654) The hppa port starts libc at GLIBC_2.2, but has earlier symbol versions in other shared objects. This means that the compat symbol for readdir64 is not actually present in libc even though have-GLIBC_2.1.3 is defined as yes at the make level. Fixes commit `15e50e6c96` ("Linux: dirent/tst-readdir64-compat can be a regular test") by mostly reverting it.	2022-07-25 11:39:03 +02:00
Adhemerval Zanella	c3b02b6567	linux: Add tst-mount to check for Linux new mount API The new mount API was added on Linux 5.2 with six new syscalls: fsopen, fsconfig, fsmount, move_mount, fspick, and open_tree. The new test verifies minimal functionality along with error paths for specific arguments and their corner cases. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 10:08:48 -03:00
Adhemerval Zanella	6c0eedd97e	linux: Add fsopen It was added on Linux 5.2 (24dcb3d90a1f67fe08c68a004af37df059d74005) to start the process of preparing to create a superblock that will then be mountable, using an fd as a context handle. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-24 16:03:15 -03:00
Adhemerval Zanella	1002f1af1c	linux: Add process_mrelease Added in Linux 5.15 (884a7e5964e06ed93c7771c0d7cf19c09a8946f1), the new syscall allows a caller to free the memory of a dying target process. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-02 15:43:28 -03:00
Adhemerval Zanella	d19ee3473d	linux: Add process_madvise It was added on Linux 5.10 (ecb8ac8b1f146915aa6b96449b66dd48984caacc) with the same functionality as madvise but using a pidfd of the target process. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-06-02 15:43:28 -03:00
Adhemerval Zanella	d2a1ec2097	linux: Add tst-pidfd.c To check for the pidfd functions pidfd_open, pidfd_getfd, pidfd_send_signal, and waitid with P_PIDFD. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:36:59 -03:00
Adhemerval Zanella	97f5d19c45	linux: Add pidfd_open This was added on Linux 5.3 (32fcb426ec001cb6d5a4a195091a8486ea77e2df) as a way to retrieve a pid file descriptor for a process that has not been created with CLONE_PIDFD (by the usual fork/clone). Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-05-17 10:32:28 -03:00
Szabolcs Nagy	9faf5262c7	linux: Add a getauxval test [BZ #23293 ] This is for bug 23293 and it relies on the glibc test system running tests via explicit ld.so invocation by default. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-17 10:14:03 +01:00
H.J. Lu	1fe00d3eb6	build: Properly generate .d dependency files [BZ #28922 ] 1. Also generate .d dependency files for $(tests-container) and $(tests-printers). 2. elf: Add tst-auditmod17.os to extra-test-objs. 3. iconv: Add tst-gconv-init-failure-mod.os to extra-test-objs. 4. malloc: Rename extra-tests-objs to extra-test-objs. 5. linux: Add tst-sysconf-iov_max-uapi.o to extra-test-objs. 6. x86_64: Add tst-x86_64mod-1.o, tst-platformmod-2.o, test-libmvec.o, test-libmvec-avx.o, test-libmvec-avx2.o and test-libmvec-avx512f.o to extra-test-objs. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-02-25 10:35:45 -08:00
Adhemerval Zanella	948ce73b31	Linux: Only generate 64 bit timestamps for 64 bit time_t recvmsg/recvmmsg The timestamps created by __convert_scm_timestamps only make sense for 64 bit time_t programs, 32 bit time_t programs will ignore 64 bit time_t timestamps since SO_TIMESTAMP will be defined to old values (either by glibc or kernel headers). Worse, if the buffer is not suffice MSG_CTRUNC is set to indicate it (which breaks some programs [1]). This patch makes only 64 bit time_t recvmsg and recvmmsg to call __convert_scm_timestamps. Also, the assumption to called it is changed from __ASSUME_TIME64_SYSCALLS to __TIMESIZE != 64 since the setsockopt might be called by libraries built without __TIME_BITS=64. The MSG_CTRUNC is only set for the 64 bit symbols, it should happen only if 64 bit time_t programs run older kernels. Checked on x86_64-linux-gnu and i686-linux-gnu. [1] https://github.com/systemd/systemd/pull/20567 Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-01-28 18:18:27 -03:00
Adhemerval Zanella	8fba672472	linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349 , BZ#28350) The __convert_scm_timestamps only updates the control message last pointer for SOL_SOCKET type, so if the message control buffer contains multiple ancillary message types the converted timestamp one might overwrite a valid message. The test checks if the extra ancillary space is correctly handled by recvmsg/recvmmsg, where if there is no extra space for the 64-bit time_t converted message the control buffer should be marked with MSG_TRUNC. It also checks whether recvmsg/recvmmsg correctly handle multiple ancillary data. Checked on x86_64-linux and on i686-linux-gnu on both 5.11 and 4.15 kernel. Co-authored-by: Fabian Vogt <fvogt@suse.de> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-01-28 17:46:44 -03:00
Siddhesh Poyarekar	23e0e8f5f1	getcwd: Set errno to ERANGE for size == 1 (CVE-2021-3999) No valid path returned by getcwd would fit into 1 byte, so reject the size early and return NULL with errno set to ERANGE. This change is prompted by CVE-2021-3999, which describes a single byte buffer underflow and overflow when all of the following conditions are met: - The buffer size (i.e. the second argument of getcwd) is 1 byte - The current working directory is too long - '/' is also mounted on the current working directory Sequence of events: - In sysdeps/unix/sysv/linux/getcwd.c, the syscall returns ENAMETOOLONG because the linux kernel checks for name length before it checks buffer size - The code falls back to the generic getcwd in sysdeps/posix - In the generic func, the buf[0] is set to '\0' on line 250 - this while loop on line 262 is bypassed: while (!(thisdev == rootdev && thisino == rootino)) since the rootfs (/) is bind mounted onto the directory and the flow goes on to line 449, where it puts a '/' in the byte before the buffer. - Finally on line 458, it moves 2 bytes (the underflowed byte and the '\0') to the buf[0] and buf[1], resulting in a 1 byte buffer overflow. - buf is returned on line 469 and errno is not set. This resolves BZ #28769. Reviewed-by: Andreas Schwab <schwab@linux-m68k.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Qualys Security Advisory <qsa@qualys.com> Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-01-24 11:00:17 +05:30
Adhemerval Zanella	5f3a7ebc35	Linux: Add epoll_pwait2 (BZ #27359 ) It is similar to epoll_wait, with the difference that the timeout has nanosecond resolution by using struct timespec instead of int. Although the Linux interface only provides 64 bit time_t support, the old 32 bit interface is also provided (to keep in sync with current practice and to not force opt-in on 64 bit time_t). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-01-17 14:34:54 -03:00
Adhemerval Zanella	572e0c8554	Revert "linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349 , BZ #28350 )" This reverts commit `21e0f45c7d`.	2022-01-12 10:35:06 -03:00
Adhemerval Zanella	21e0f45c7d	linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349 , BZ #28350 ) The __convert_scm_timestamps() only updates the control message last pointer for SOL_SOCKET type, so if the message control buffer contains multiple ancillary message types the converted timestamp one might overwrite a valid message. The test checks if the extra ancillary space is correctly handled by recvmsg/recvmmsg, where if there is no extra space for the 64-bit time_t converted message the control buffer should be marked with MSG_TRUNC. It also checks whether recvmsg/recvmmsg correctly handle multiple ancillary data. Checked on x86_64-linux and on i686-linux-gnu on both 5.11 and 4.15 kernel. Co-authored-by: Fabian Vogt <fvogt@suse.de>	2022-01-12 10:30:10 -03:00
Florian Weimer	c901c3e764	nptl: Add public rseq symbols and <sys/rseq.h> The relationship between the thread pointer and the rseq area is made explicit. The constant offset can be used by JIT compilers to optimize rseq access (e.g., for really fast sched_getcpu). Extensibility is provided through __rseq_size and __rseq_flags. (In the future, the kernel could request a different rseq size via the auxiliary vector.) Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2021-12-09 09:49:32 +01:00
Florian Weimer	e3e589829d	nptl: Add glibc.pthread.rseq tunable to control rseq registration This tunable allows applications to register the rseq area instead of glibc. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2021-12-09 09:49:32 +01:00
Florian Weimer	95e114a091	nptl: Add rseq registration The rseq area is placed directly into struct pthread. rseq registration failure is not treated as an error, so it is possible that threads run with inconsistent registration status. <sys/rseq.h> is not yet installed as a public header. Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2021-12-09 09:49:32 +01:00
Adhemerval Zanella	5b3e31e312	linux: Implement mremap in C Variadic function calls in syscalls.list do not work for all ABIs (for instance where the arguments are passed on the stack instead of registers) and might have underlying issues depending on the variadic type (for instance if a 64-bit argument is used). Checked on x86_64-linux-gnu.	2021-11-30 13:13:03 -03:00
Adhemerval Zanella	83008fa495	linux: Add prlimit64 C implementation The LFS prlimit64 requires an arch-specific implementation in syscalls.list. Instead add a generic one that handles the required symbol alias for __RLIM_T_MATCHES_RLIM64_T. HPPA is the only outlier which requires a different default symbol. Checked on x86_64-linux-gnu and with build for the affected ABIs.	2021-11-30 13:13:03 -03:00
Adhemerval Zanella	d150181d73	linux: Add fanotify_mark C implementation Passing 64-bit arguments on syscalls.list is tricky: it requires reimplementing the expected kernel ABI in each architecture. This is much better represented in C code, where we already have macros for this (SYSCALL_LL64). Checked on x86_64-linux-gnu.	2021-11-25 09:56:57 -03:00
Adhemerval Zanella	456b3c08b6	io: Refactor close_range and closefrom Now that Hurd implements both close_range and closefrom (`f2c996597d`), we can make close_range() a base ABI, and make the default closefrom() implementation on top of close_range(). The generic closefrom() implementation based on __getdtablesize() is moved to generic close_range(). On Linux it will be overridden by the auto-generation syscall while on Hurd it will be a system specific implementation. The closefrom() now calls close_range() and __closefrom_fallback(). Since on Hurd close_range() does not fail, __closefrom_fallback() is an empty static inline function set by __ASSUME_CLOSE_RANGE. The __ASSUME_CLOSE_RANGE also allows optimizing the Linux __closefrom_fallback() implementation when --enable-kernel=5.9 or higher is used. Finally the Linux specific tst-close_range.c is moved to io and enabled as default. The Linuxism and CLOSE_RANGE_UNSHARE are guarded so it can be built for Hurd (I have not actually tested it). Checked on x86_64-linux-gnu, i686-linux-gnu, and with a i686-gnu build.	2021-11-24 09:09:37 -03:00
Florian Weimer	8b2c706a9d	socket: Add time64 alias for sendmmsg Reviewed-by: Lukasz Majewski <lukma@denx.de> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-07-21 11:58:16 +02:00
Florian Weimer	b39ffab860	Linux: Add time64 alias for prctl Reviewed-by: Lukasz Majewski <lukma@denx.de> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-07-21 11:58:16 +02:00
H.J. Lu	84d40d702f	Add static tests for __clone_internal Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-07-14 06:55:04 -07:00
H.J. Lu	d8ea0d0168	Add an internal wrapper for clone, clone2 and clone3 The clone3 system call (since Linux 5.3) provides a superset of the functionality of clone and clone2. It also provides a number of API improvements, including the ability to specify the size of the child's stack area which can be used by kernel to compute the shadow stack size when allocating the shadow stack. Add: extern int __clone_internal (struct clone_args *__cl_args, int (*__func) (void *__arg), void *__arg); to provide an abstract interface for clone, clone2 and clone3. 1. Simplify stack management for thread creation by passing both stack base and size to create_thread. 2. Consolidate clone vs clone2 differences into a single file. 3. Call __clone3 if HAVE_CLONE3_WRAPPER is defined. If __clone3 returns -1 with ENOSYS, fall back to clone or clone2. 4. Use only __clone_internal to clone a thread. Since the stack size argument for create_thread is now unconditional, always pass stack size to create_thread. 5. Enable the public clone3 wrapper in the future after it has been added to all targets. NB: Sandbox will return ENOSYS on clone3 in both Chromium: The following revision refers to this bug: `218438259d` commit 218438259dd795456f0a48f67cbe5b4e520db88b Author: Matthew Denton <mpdenton@chromium.org> Date: Thu Jun 03 20:06:13 2021 Linux sandbox: return ENOSYS for clone3 Because clone3 uses a pointer argument rather than a flags argument, we cannot examine the contents with seccomp, which is essential to preventing sandboxed processes from starting other processes. So, we won't be able to support clone3 in Chromium. This CL modifies the BPF policy to return ENOSYS for clone3 so glibc always uses the fallback to clone. Bug: 1213452 Change-Id: I7c7c585a319e0264eac5b1ebee1a45be2d782303 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2936184 Reviewed-by: Robert Sesek <rsesek@chromium.org> Commit-Queue: Matthew Denton <mpdenton@chromium.org> Cr-Commit-Position: refs/heads/master@{#888980} [modify] https://crrev.com/218438259dd795456f0a48f67cbe5b4e520db88b/sandbox/linux/seccomp-bpf-helpers/baseline_policy.cc and Firefox: https://hg.mozilla.org/integration/autoland/rev/ecb4011a0c76 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-07-14 06:33:58 -07:00
Adhemerval Zanella	72e84d1db2	Linux: Use 32-bit vDSO for clock_gettime, gettimeofday, time (BZ# 28071) The previous approach defeats the vDSO optimization on older kernels because a failing clock_gettime64 system call is performed on every function call. It also results in a clobbered errno value, exposing an OpenJDK bug (JDK-8270244). This patch fixes by open-code INLINE_VSYSCALL macro and replace all INLINE_SYSCALL_CALL with INTERNAL_SYSCALL_CALLS. Now for __clock_gettime64x, the 64-bit vDSO is used and the 32-bit vDSO is tried before falling back to 64-bit syscalls. The previous code preferred 64-bit syscall for the case where the kernel provides 64-bit time_t syscalls and also a 32-bit vDSO (in this case the 64-bit syscall should be preferable over the vDSO). All architectures that provides 32-bit vDSO (i386, mips, powerpc, s390) modulo sparc; but I am not sure if some kernels versions do provide only 32-bit vDSO while still providing 64-bit time_t syscall. Regardless, for such cases the 64-bit time_t syscall is used if the vDSO returns overflowed 32-bit time_t. Tested on i686-linux-gnu (with a time64 and non-time64 kernel), x86_64-linux-gnu. Built with build-many-glibcs.py. Co-authored-by: Florian Weimer <fweimer@redhat.com>	2021-07-12 17:37:56 -03:00
Florian Weimer	aaacde11f2	Reduce <limits.h> pollution due to dynamic PTHREAD_STACK_MIN <limits.h> used to be a header file with no declarations. GCC's libgomp includes it in a #pragma GCC visibility hidden block. Including <unistd.h> from <limits.h> (indirectly) declares everything in <unistd.h> with hidden visibility, resulting in linker failures. This commit avoids C declarations in assembler mode and only declares __sysconf in <limits.h> (and not the entire contents of <unistd.h>). The __sysconf symbol is already part of the ABI. PTHREAD_STACK_MIN is no longer defined for __USE_DYNAMIC_STACK_SIZE && __ASSEMBLER__ because there is no possible definition. Additionally, PTHREAD_STACK_MIN is now defined by <pthread.h> for __USE_MISC because this is what developers expect based on the macro name. It also helps to avoid libgomp linker failures in GCC because libgomp includes <pthread.h> before its visibility hacks. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-07-12 18:43:32 +02:00
H.J. Lu	5d98a7dae9	Define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) The constant PTHREAD_STACK_MIN may be too small for some processors. Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE. When _DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)). Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to provide a constant target specific PTHREAD_STACK_MIN value. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-07-09 15:10:35 -07:00
Adhemerval Zanella	607449506f	io: Add closefrom [BZ #10353 ] The function closes all open file descriptors greater than or equal to the input argument. Negative values are clamped to 0, i.e., it will close all file descriptors. As indicated by the bug report, this is a common symbol provided by different systems (Solaris, OpenBSD, NetBSD, FreeBSD) and, although it has inherent issues with not taking into consideration internal libc file descriptors (such as syslog), this is also a common feature used in multiple projects [1][2][3][4][5]. The Linux fallback implementation iterates over /proc and closes all file descriptors sequentially. Although the question was raised whether getdents on /proc/self/fd might return disjointed entries when file descriptors are closed, it does not seem to be the case in my testing on multiple kernels (v4.18, v5.4, v5.9) and the same strategy is used on different projects [1][2][3][5]. Also, the interface is set as fail-safe, meaning that a failure in the fallback results in a process abort. Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15. [1] `5238e95759/src/basic/fd-util.c (L217)` [2] `ddf4b77e11/src/lxc/start.c (L236)` [3] `9e4f2f3a6b/Modules/_posixsubprocess.c (L220)` [4] `5f47c0613e/src/libstd/sys/unix/process2.rs (L303-L308)` [5] https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L82	2021-07-08 14:08:14 -03:00
Adhemerval Zanella	286286283e	linux: Add close_range It was added on Linux 5.9 (278a5fbaed89) with CLOSE_RANGE_CLOEXEC added on 5.11 (582f1fb6b721f). Although FreeBSD has added the same syscall, this only adds the symbol on Linux ports. This syscall is required to provide a fail-safe way to implement the closefrom symbol (BZ #10353). Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.	2021-07-08 14:08:13 -03:00
Florian Weimer	30639e79d3	Linux: Cleanups after librt move librt.so is no longer installed for PTHREAD_IN_LIBC, and tests are not linked against it. $(librt) is introduced globally for shared tests that need to be linked for both PTHREAD_IN_LIBC and !PTHREAD_IN_LIBC. GLIBC_PRIVATE symbols that were needed during the transition are removed again. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-06-28 09:51:01 +02:00
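
The getcwd behavior described in commit 23e0e8f5f1 above (size == 1 is rejected with ERANGE) can be observed from user code. The following is a minimal sketch, not taken from the glibc sources or test suite: the helper name getcwd_one_byte_errno is hypothetical, and the sketch assumes the current working directory is reachable and not so long that ENAMETOOLONG would be reported instead.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc): call getcwd with a one-byte
   buffer and report the resulting errno, or 0 if the call unexpectedly
   succeeded.  No valid absolute path fits in one byte, since even "/"
   needs two bytes including the terminating NUL.  */
int
getcwd_one_byte_errno (void)
{
  char buf[1];

  errno = 0;
  if (getcwd (buf, sizeof buf) != NULL)
    return 0;
  return errno;
}
```

A caller would typically compare the result against ERANGE and retry with a larger buffer; the commit message states that the size is rejected early, before the generic fallback path is reached.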

383 Commits