glibc/nptl/cancellation.c

111 lines
4.2 KiB
C
Raw Permalink Normal View History

/* Copyright (C) 2002-2024 Free Software Foundation, Inc.
Initial revision 2002-11-26 Ulrich Drepper <drepper@redhat.com> * allocatestack.c (queue_stack): Don't remove stack from list here. Do it in the caller. Correct condition to prematurely terminate loop to free stacks. (__deallocate_stack): Remove stack from list here. 2002-11-26 Ulrich Drepper <drepper@redhat.com> * Makefile (tests): Add tst-stack1. * tst-stack1.c: New file. * allocatestack.c (allocate_stack): Initialize the TCB on a user provided stack. * pthread_attr_getstack.c: Return bottom of the thread area. 2002-11-25 Ulrich Drepper <drepper@redhat.com> * Makefile (libpthread-routines): Add pt-allocrtsig and pthread_kill_other_threads. * pt-allocrtsig.c: New file. * pthread_kill_other_threads.c: New file. * sysdeps/unix/sysv/linux/allocrtsig.c: Add additional aliases for all three functions. * sysdeps/unix/sysv/linux/Makefile (sysdep_routines): Remove allocrtsig. * sysdeps/unix/sysv/linux/Versions (libc:GLIBC_PRIVATE): Export __libc_current_sigrtmin_private, __libc_current_sigrtmax_private, and __libc_allocate_rtsig_private. * Versions (libpthread): Export pthread_kill_other_threads_np, __libc_current_sigrtmin, and __libc_current_sigrtmax. 2002-11-24 Ulrich Drepper <drepper@redhat.com> * allocatestack.c (allocate_stack): stackaddr in attribute points to the end of the stack. Adjust computations. When mprotect call fails dequeue stack and free it. * pthread_attr_setstack.c: Store top of the stack in stackaddr attribute. * pthread_getattr_np.c: Likewise. * descr.h (IS_DETACHED): Add some more parenthesis to prevent surprises. 2002-11-23 Ulrich Drepper <drepper@redhat.com> * sysdeps/pthread/pthread.h (pthread_self): __THROW must come before attribute definitions. Patch by Luca Barbieri <ldb@ldb.ods.org>. 2002-11-22 Ulrich Drepper <drepper@redhat.com> * pthread_getspecific.c: Optimize access to first 2nd-level array. * pthread_setspecific.c: Likewise. 2002-11-21 Ulrich Drepper <drepper@redhat.com> * sysdeps/unix/sysv/linux/i386/createthread.c: Remove CLONE_ flags definitions. Get them from the official place. * sysdeps/unix/sysv/linux/i386/fork.c: Likewise. * sysdeps/unix/sysv/linux/i386/createthread.c: Update CLONE_* flags. Use new CLONE_ flags in clone() calls. * sysdeps/unix/sysv/linux/fork.c: Use ARCH_FORK to actually fork. * sysdeps/unix/sysv/linux/i386/fork.c: New file. * Versions: Add pthread_* functions for libc. * forward.c: New file. * sysdeps/pthread/Makefile (libpthread-sysdeps_routines): Add errno-loc. * herrno.c: New file. * res.c: New file. * Makefile (libpthread-routines): Remove sem_post, sem_wait, sem_trywait, and sem_timedwait. Add herrno and res. * sem_init.c: Don't initialize lock and waiters members. * sem_open.c: Likewise. * sem_post.c: Removed. * sem_wait.c: Removed. * sem_trywait.c: Removed. * sem_timedwait.c: Removed. * sysdeps/unix/sysv/linux/i386/i486/lowlevelsem.S: Complete rewrite. Includes full implementations of sem_post, sem_wait, sem_trywait, and sem_timedwait. * sysdeps/unix/sysv/linux/i386/lowlevelsem.h (lll_sem_post): Adjust for new implementation. * sysdeps/unix/sysv/linux/internaltypes.h (struct sem): Remove lock and waiters fields. * tst-sem3.c: Improve error message. * tst-signal3.c: Likewise. * init.c (__pthread_initialize_minimal): Use set_tid_address syscall to tell the kernel about the termination futex and to initialize tid member. Don't initialize main_thread. * descr.h (struct pthread): Remove main_thread member. * cancelllation.c (__do_cancel): Remove code handling main thread. The main thread is not special anymore. * allocatestack.c (__reclaim_stacks): Mark stacks as unused. Add size of the stacks to stack_cache_actsize. * pt-readv.c: Add missing "defined". * pt-sigwait.c: Likewise. * pt-writev.c: Likewise. 2002-11-09 Ulrich Drepper <drepper@redhat.com> * Versions: Export __connect from libpthread. Patch by Luca Barbieri <ldb@ldb.ods.org>. * Makefile (libpthread-routines): Add pt-raise. * sysdeps/unix/sysv/linux/raise.c: New file. * sysdeps/unix/sysv/linux/pt-raise.c: New file. * sysdeps/generic/pt-raise.c: New file. * pthread_cond_init.c: Initialize all data elements of the condvar structure. Patch by Luca Barbieri <ldb@ldb.ods.org>. * pthread_attr_init.c: Actually implement 2.0 compatibility version. * pthread_create.c: Likewise. * Makefile (tests): Add tst-key1, tst-key2, tst-key3. * tst-key1.c: New file. * tst-key2.c: New file. * tst-key3.c: New file. * Versions: Export pthread_detach for version GLIBC_2.0. Reported by Saurabh Desai <sdesai@austin.ibm.com>. 2002-11-08 Ulrich Drepper <drepper@redhat.com> * pthread_key_create.c: Terminate search after an unused key was found. Patch by Luca Barbieri <ldb@ldb.ods.org>. * sysdeps/unix/sysv/linux/i386/pthread_once.S: Return zero. Patch by Luca Barbieri <ldb@ldb.ods.org>. 2002-10-10 Ulrich Drepper <drepper@redhat.com> * sysdeps/unix/sysv/linux/i386/i486/lowlevelsem.S: Use slow generic dynamic lookup for errno in PIC. * allocatestack.c (get_cached_stack): Rearrange code slightly to release the stack lock as soon as possible. Call _dl_allocate_tls_init for TCB from the cache to re-initialize the static TLS block. (allocate_stack): Call _dl_allocate_tls_init for user-provided stack. * cancellation.c: Renamed from cancelation.c. * Makefile: Adjust accordingly. * pthreadP.h (CANCELLATION_P): Renamed from CANCELATION_P. * cleanup_defer.c: Use CANCELLATION_P. * pthread_testcancel.c: Likewise. * descr.h: Fix spelling in comments. * init.c: Likewise. * pthread_getattr_np.c: Likewise. * pthread_getschedparam.c: Likewise. * pthread_setschedparam.c: Likewise. * Versions: Likewise. * pt-pselect.c: New file. * Makefile (libpthread-routines): Add pt-pselect. * Versions: Add pselect. * tst-cancel4.c: New file. * Makefile (tests): Add tst-cancel4. 2002-10-09 Ulrich Drepper <drepper@redhat.com> * pthread_mutex_lock.c: Always record lock ownership. * pthread_mutex_timedlock.c: Likewise. * pthread_mutex_trylock.c: Likewise. * pt-readv.c: New file. * pt-writev.c: New file. * pt-creat.c: New file. * pt-msgrcv.c: New file. * pt-msgsnd.c: New file. * pt-poll.c: New file. * pt-select.c: New file. * pt-sigpause.c: New file. * pt-sigsuspend.c: New file. * pt-sigwait.c: New file. * pt-sigwaitinfo.c: New file. * pt-waitid.c: New file. * Makefile (libpthread-routines): Add pt-readv, pt-writev, pt-creat, pt-msgrcv, pt-msgsnd, pt-poll, pt-select, pt-sigpause, pt-sigsuspend, pt-sigwait, pt-sigwaitinfo, and pt-waitid. * Versions: Add all the new functions. * tst-exit1.c: New file. * Makefile (tests): Add tst-exit1. * sem_timedwait.c: Minor optimization for more optimal fastpath. 2002-10-08 Ulrich Drepper <drepper@redhat.com> * pt-fcntl.c: Only enable asynchronous cancellation for F_SETLKW. * pthread_join.c: Enable asynchronous cancellation around lll_wait_tid call. pthread_join is an official cancellation point. * pthread_timedjoin.c: Likewise. * pthread_cond_wait.c: Revert order in which internal lock are dropped and the condvar's mutex are retrieved. * pthread_cond_timedwait.c: Likewise. Reported by dice@saros.East.Sun.COM. 2002-10-07 Ulrich Drepper <drepper@redhat.com> * pthreadP.h: Cut out all type definitions and move them... * sysdeps/unix/sysv/linux/internaltypes.h: ...here. New file. * pthreadP.h: Include <internaltypes.h>. * sysdeps/unix/sysv/linux/i386/lowlevelsem.h (lll_sem_post): Little performance tweaks. * sem_trywait.c: Shuffle #includes around to get right order. * sem_timedwait.c: Likewise. * sem_post.c: Likewise. * sem_wait.c: Likewise. * nptl 0.3 released. * Makefile (tests): Add tst-signal3. * tst-signal3.c: New file. 2002-10-05 Ulrich Drepper <drepper@redhat.com> * sysdeps/unix/sysv/linux/i386/lowlevelsem.h: Tell the compiler that the asms modify the sem object. (__lll_sem_timedwait): Now takes struct sem* as first parameter. * sysdeps/unix/sysv/linux/i386/bits/semaphore.h (sem_t): Don't expose the actual members. * pthreadP.h (struct sem): New type. Actual semaphore type. * semaphoreP.h: Include pthreadP.h. * sem_getvalue.c: Adjust to sem_t change. * sem_init.c: Likewise. * sem_open.c: Likewise. * sem_post.c: Likewise. * sem_timedwait.c: Likewise. * sem_trywait.c: Likewise. * sem_wait.c: Likewise. 2002-10-04 Ulrich Drepper <drepper@redhat.com> * Makefile (tests): Add tst-basic2, tst-exec1, tst-exec3, tst-exec3. * tst-basic2.c: New file. * tst-exec1.c: New file. * tst-exec2.c: New file. * tst-exec3.c: New file. * tst-fork1.c: Remove extra */. * nptl 0.2 released. The API for IA-32 is complete.
2002-11-26 22:50:54 +00:00
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 05:40:42 +00:00
<https://www.gnu.org/licenses/>. */
Initial revision 2002-11-26 Ulrich Drepper <drepper@redhat.com> * allocatestack.c (queue_stack): Don't remove stack from list here. Do it in the caller. Correct condition to prematurely terminate loop to free stacks. (__deallocate_stack): Remove stack from list here. 2002-11-26 Ulrich Drepper <drepper@redhat.com> * Makefile (tests): Add tst-stack1. * tst-stack1.c: New file. * allocatestack.c (allocate_stack): Initialize the TCB on a user provided stack. * pthread_attr_getstack.c: Return bottom of the thread area. 2002-11-25 Ulrich Drepper <drepper@redhat.com> * Makefile (libpthread-routines): Add pt-allocrtsig and pthread_kill_other_threads. * pt-allocrtsig.c: New file. * pthread_kill_other_threads.c: New file. * sysdeps/unix/sysv/linux/allocrtsig.c: Add additional aliases for all three functions. * sysdeps/unix/sysv/linux/Makefile (sysdep_routines): Remove allocrtsig. * sysdeps/unix/sysv/linux/Versions (libc:GLIBC_PRIVATE): Export __libc_current_sigrtmin_private, __libc_current_sigrtmax_private, and __libc_allocate_rtsig_private. * Versions (libpthread): Export pthread_kill_other_threads_np, __libc_current_sigrtmin, and __libc_current_sigrtmax. 2002-11-24 Ulrich Drepper <drepper@redhat.com> * allocatestack.c (allocate_stack): stackaddr in attribute points to the end of the stack. Adjust computations. When mprotect call fails dequeue stack and free it. * pthread_attr_setstack.c: Store top of the stack in stackaddr attribute. * pthread_getattr_np.c: Likewise. * descr.h (IS_DETACHED): Add some more parenthesis to prevent surprises. 2002-11-23 Ulrich Drepper <drepper@redhat.com> * sysdeps/pthread/pthread.h (pthread_self): __THROW must come before attribute definitions. Patch by Luca Barbieri <ldb@ldb.ods.org>. 2002-11-22 Ulrich Drepper <drepper@redhat.com> * pthread_getspecific.c: Optimize access to first 2nd-level array. * pthread_setspecific.c: Likewise. 2002-11-21 Ulrich Drepper <drepper@redhat.com> * sysdeps/unix/sysv/linux/i386/createthread.c: Remove CLONE_ flags definitions. Get them from the official place. * sysdeps/unix/sysv/linux/i386/fork.c: Likewise. * sysdeps/unix/sysv/linux/i386/createthread.c: Update CLONE_* flags. Use new CLONE_ flags in clone() calls. * sysdeps/unix/sysv/linux/fork.c: Use ARCH_FORK to actually fork. * sysdeps/unix/sysv/linux/i386/fork.c: New file. * Versions: Add pthread_* functions for libc. * forward.c: New file. * sysdeps/pthread/Makefile (libpthread-sysdeps_routines): Add errno-loc. * herrno.c: New file. * res.c: New file. * Makefile (libpthread-routines): Remove sem_post, sem_wait, sem_trywait, and sem_timedwait. Add herrno and res. * sem_init.c: Don't initialize lock and waiters members. * sem_open.c: Likewise. * sem_post.c: Removed. * sem_wait.c: Removed. * sem_trywait.c: Removed. * sem_timedwait.c: Removed. * sysdeps/unix/sysv/linux/i386/i486/lowlevelsem.S: Complete rewrite. Includes full implementations of sem_post, sem_wait, sem_trywait, and sem_timedwait. * sysdeps/unix/sysv/linux/i386/lowlevelsem.h (lll_sem_post): Adjust for new implementation. * sysdeps/unix/sysv/linux/internaltypes.h (struct sem): Remove lock and waiters fields. * tst-sem3.c: Improve error message. * tst-signal3.c: Likewise. * init.c (__pthread_initialize_minimal): Use set_tid_address syscall to tell the kernel about the termination futex and to initialize tid member. Don't initialize main_thread. * descr.h (struct pthread): Remove main_thread member. * cancelllation.c (__do_cancel): Remove code handling main thread. The main thread is not special anymore. * allocatestack.c (__reclaim_stacks): Mark stacks as unused. Add size of the stacks to stack_cache_actsize. * pt-readv.c: Add missing "defined". * pt-sigwait.c: Likewise. * pt-writev.c: Likewise. 2002-11-09 Ulrich Drepper <drepper@redhat.com> * Versions: Export __connect from libpthread. Patch by Luca Barbieri <ldb@ldb.ods.org>. * Makefile (libpthread-routines): Add pt-raise. * sysdeps/unix/sysv/linux/raise.c: New file. * sysdeps/unix/sysv/linux/pt-raise.c: New file. * sysdeps/generic/pt-raise.c: New file. * pthread_cond_init.c: Initialize all data elements of the condvar structure. Patch by Luca Barbieri <ldb@ldb.ods.org>. * pthread_attr_init.c: Actually implement 2.0 compatibility version. * pthread_create.c: Likewise. * Makefile (tests): Add tst-key1, tst-key2, tst-key3. * tst-key1.c: New file. * tst-key2.c: New file. * tst-key3.c: New file. * Versions: Export pthread_detach for version GLIBC_2.0. Reported by Saurabh Desai <sdesai@austin.ibm.com>. 2002-11-08 Ulrich Drepper <drepper@redhat.com> * pthread_key_create.c: Terminate search after an unused key was found. Patch by Luca Barbieri <ldb@ldb.ods.org>. * sysdeps/unix/sysv/linux/i386/pthread_once.S: Return zero. Patch by Luca Barbieri <ldb@ldb.ods.org>. 2002-10-10 Ulrich Drepper <drepper@redhat.com> * sysdeps/unix/sysv/linux/i386/i486/lowlevelsem.S: Use slow generic dynamic lookup for errno in PIC. * allocatestack.c (get_cached_stack): Rearrange code slightly to release the stack lock as soon as possible. Call _dl_allocate_tls_init for TCB from the cache to re-initialize the static TLS block. (allocate_stack): Call _dl_allocate_tls_init for user-provided stack. * cancellation.c: Renamed from cancelation.c. * Makefile: Adjust accordingly. * pthreadP.h (CANCELLATION_P): Renamed from CANCELATION_P. * cleanup_defer.c: Use CANCELLATION_P. * pthread_testcancel.c: Likewise. * descr.h: Fix spelling in comments. * init.c: Likewise. * pthread_getattr_np.c: Likewise. * pthread_getschedparam.c: Likewise. * pthread_setschedparam.c: Likewise. * Versions: Likewise. * pt-pselect.c: New file. * Makefile (libpthread-routines): Add pt-pselect. * Versions: Add pselect. * tst-cancel4.c: New file. * Makefile (tests): Add tst-cancel4. 2002-10-09 Ulrich Drepper <drepper@redhat.com> * pthread_mutex_lock.c: Always record lock ownership. * pthread_mutex_timedlock.c: Likewise. * pthread_mutex_trylock.c: Likewise. * pt-readv.c: New file. * pt-writev.c: New file. * pt-creat.c: New file. * pt-msgrcv.c: New file. * pt-msgsnd.c: New file. * pt-poll.c: New file. * pt-select.c: New file. * pt-sigpause.c: New file. * pt-sigsuspend.c: New file. * pt-sigwait.c: New file. * pt-sigwaitinfo.c: New file. * pt-waitid.c: New file. * Makefile (libpthread-routines): Add pt-readv, pt-writev, pt-creat, pt-msgrcv, pt-msgsnd, pt-poll, pt-select, pt-sigpause, pt-sigsuspend, pt-sigwait, pt-sigwaitinfo, and pt-waitid. * Versions: Add all the new functions. * tst-exit1.c: New file. * Makefile (tests): Add tst-exit1. * sem_timedwait.c: Minor optimization for more optimal fastpath. 2002-10-08 Ulrich Drepper <drepper@redhat.com> * pt-fcntl.c: Only enable asynchronous cancellation for F_SETLKW. * pthread_join.c: Enable asynchronous cancellation around lll_wait_tid call. pthread_join is an official cancellation point. * pthread_timedjoin.c: Likewise. * pthread_cond_wait.c: Revert order in which internal lock are dropped and the condvar's mutex are retrieved. * pthread_cond_timedwait.c: Likewise. Reported by dice@saros.East.Sun.COM. 2002-10-07 Ulrich Drepper <drepper@redhat.com> * pthreadP.h: Cut out all type definitions and move them... * sysdeps/unix/sysv/linux/internaltypes.h: ...here. New file. * pthreadP.h: Include <internaltypes.h>. * sysdeps/unix/sysv/linux/i386/lowlevelsem.h (lll_sem_post): Little performance tweaks. * sem_trywait.c: Shuffle #includes around to get right order. * sem_timedwait.c: Likewise. * sem_post.c: Likewise. * sem_wait.c: Likewise. * nptl 0.3 released. * Makefile (tests): Add tst-signal3. * tst-signal3.c: New file. 2002-10-05 Ulrich Drepper <drepper@redhat.com> * sysdeps/unix/sysv/linux/i386/lowlevelsem.h: Tell the compiler that the asms modify the sem object. (__lll_sem_timedwait): Now takes struct sem* as first parameter. * sysdeps/unix/sysv/linux/i386/bits/semaphore.h (sem_t): Don't expose the actual members. * pthreadP.h (struct sem): New type. Actual semaphore type. * semaphoreP.h: Include pthreadP.h. * sem_getvalue.c: Adjust to sem_t change. * sem_init.c: Likewise. * sem_open.c: Likewise. * sem_post.c: Likewise. * sem_timedwait.c: Likewise. * sem_trywait.c: Likewise. * sem_wait.c: Likewise. 2002-10-04 Ulrich Drepper <drepper@redhat.com> * Makefile (tests): Add tst-basic2, tst-exec1, tst-exec3, tst-exec3. * tst-basic2.c: New file. * tst-exec1.c: New file. * tst-exec2.c: New file. * tst-exec3.c: New file. * tst-fork1.c: Remove extra */. * nptl 0.2 released. The API for IA-32 is complete.
2002-11-26 22:50:54 +00:00
#include <setjmp.h>
#include <stdlib.h>
#include "pthreadP.h"
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* Called by the INTERNAL_SYSCALL_CANCEL macro, check for cancellation and
returns the syscall value or its negative error code. */
long int
__internal_syscall_cancel (__syscall_arg_t a1, __syscall_arg_t a2,
__syscall_arg_t a3, __syscall_arg_t a4,
__syscall_arg_t a5, __syscall_arg_t a6,
__SYSCALL_CANCEL7_ARG_DEF
__syscall_arg_t nr)
{
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
long int result;
struct pthread *pd = THREAD_SELF;
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* If cancellation is not enabled, call the syscall directly and also
for thread terminatation to avoid call __syscall_do_cancel while
executing cleanup handlers. */
int ch = atomic_load_relaxed (&pd->cancelhandling);
if (SINGLE_THREAD_P || !cancel_enabled (ch) || cancel_exiting (ch))
2022-04-06 15:24:42 +00:00
{
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6
__SYSCALL_CANCEL7_ARCH_ARG7);
if (INTERNAL_SYSCALL_ERROR_P (result))
return -INTERNAL_SYSCALL_ERRNO (result);
return result;
}
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* Call the arch-specific entry points that contains the globals markers
to be checked by SIGCANCEL handler. */
result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
a6 __SYSCALL_CANCEL7_ARCH_ARG7);
2022-04-06 15:24:42 +00:00
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* If the cancellable syscall was interrupted by SIGCANCEL and it has no
side-effect, cancel the thread if cancellation is enabled. */
ch = atomic_load_relaxed (&pd->cancelhandling);
/* The behaviour here assumes that EINTR is returned only if there are no
visible side effects. POSIX Issue 7 has not yet provided any stronger
language for close, and in theory the close syscall could return EINTR
and leave the file descriptor open (conforming and leaks). It expects
that no such kernel is used with glibc. */
if (result == -EINTR && cancel_enabled_and_canceled (ch))
__syscall_do_cancel ();
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
return result;
}
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* Called by the SYSCALL_CANCEL macro, check for cancellation and return the
syscall expected success value (usually 0) or, in case of failure, -1 and
sets errno to syscall return value. */
long int
__syscall_cancel (__syscall_arg_t a1, __syscall_arg_t a2,
__syscall_arg_t a3, __syscall_arg_t a4,
__syscall_arg_t a5, __syscall_arg_t a6,
__SYSCALL_CANCEL7_ARG_DEF __syscall_arg_t nr)
{
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
int r = __internal_syscall_cancel (a1, a2, a3, a4, a5, a6,
__SYSCALL_CANCEL7_ARG nr);
return __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (r))
? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (r))
: r;
}
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* Called by __syscall_cancel_arch or function above start the thread
cancellation. */
_Noreturn void
__syscall_do_cancel (void)
{
struct pthread *self = THREAD_SELF;
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
/* Disable thread cancellation to avoid cancellable entrypoints calling
__syscall_do_cancel recursively. We atomic load relaxed to check the
state of cancelhandling, there is no particular ordering requirement
between the syscall call and the other thread setting our cancelhandling
with a atomic store acquire.
POSIX Issue 7 notes that the cancellation occurs asynchronously on the
target thread, that implies there is no ordering requirements. It does
not need a MO release store here. */
2022-04-06 15:24:42 +00:00
int oldval = atomic_load_relaxed (&self->cancelhandling);
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
while (1)
2022-04-06 15:24:42 +00:00
{
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
int newval = oldval | CANCELSTATE_BITMASK;
if (oldval == newval)
break;
if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
&oldval, newval))
break;
2022-04-06 15:24:42 +00:00
}
nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happen during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way to program handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler call functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [*... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...*] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So The proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscalls has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set on correct expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. Rewrite SIGCANCEL asynchronous handler to check for both canceling type and if current IP from signal handler falls between the global markers and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed in LKML [1] on how kernel could help userland to accomplish it, but afaik discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending of the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid add a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-25 19:17:44 +00:00
__do_cancel (PTHREAD_CANCELED);
}