The current racy approach is to enable asynchronous cancellation
before making the syscall and restore the previous cancellation
type once the syscall returns, and check if cancellation has happen
during the cancellation entrypoint.
As described in BZ#12683, this approach shows 2 problems:
1. Cancellation can act after the syscall has returned from the
kernel, but before userspace saves the return value. It might
result in a resource leak if the syscall allocated a resource or a
side effect (partial read/write), and there is no way to program
handle it with cancellation handlers.
2. If a signal is handled while the thread is blocked at a cancellable
syscall, the entire signal handler runs with asynchronous
cancellation enabled. This can lead to issues if the signal
handler call functions which are async-signal-safe but not
async-cancel-safe.
For the cancellation to work correctly, there are 5 points at which the
cancellation signal could arrive:
[ ... )[ ... )[ syscall ]( ...
1 2 3 4 5
1. Before initial testcancel, e.g. [*... testcancel)
2. Between testcancel and syscall start, e.g. [testcancel...syscall start)
3. While syscall is blocked and no side effects have yet taken
place, e.g. [ syscall ]
4. Same as 3 but with side-effects having occurred (e.g. a partial
read or write).
5. After syscall end e.g. (syscall end...*]
And libc wants to act on cancellation in cases 1, 2, and 3 but not
in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually
happen in the next cancellable entrypoint without any further external
event.
The proposed solution for each case is:
1. Do a conditional branch based on whether the thread has received
a cancellation request;
2. It can be caught by the signal handler determining that the saved
program counter (from the ucontext_t) is in some address range
beginning just before the "testcancel" and ending with the
syscall instruction.
3. SIGCANCEL can be caught by the signal handler and determine that
the saved program counter (from the ucontext_t) is in the address
range beginning just before "testcancel" and ending with the first
uninterruptable (via a signal) syscall instruction that enters the
kernel.
4. In this case, except for certain syscalls that ALWAYS fail with
EINTR even for non-interrupting signals, the kernel will reset
the program counter to point at the syscall instruction during
signal handling, so that the syscall is restarted when the signal
handler returns. So, from the signal handler's standpoint, this
looks the same as case 2, and thus it's taken care of.
5. For syscalls with side-effects, the kernel cannot restart the
syscall; when it's interrupted by a signal, the kernel must cause
the syscall to return with whatever partial result is obtained
(e.g. partial read or write).
6. The saved program counter points just after the syscall
instruction, so the signal handler won't act on cancellation.
This is similar to 4. since the program counter is past the syscall
instruction.
So The proposed fixes are:
1. Remove the enable_asynccancel/disable_asynccancel function usage in
cancellable syscall definition and instead make them call a common
symbol that will check if cancellation is enabled (__syscall_cancel
at nptl/cancellation.c), call the arch-specific cancellable
entry-point (__syscall_cancel_arch), and cancel the thread when
required.
2. Provide an arch-specific generic system call wrapper function
that contains global markers. These markers will be used in
SIGCANCEL signal handler to check if the interruption has been
called in a valid syscall and if the syscalls has side-effects.
A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
is provided. However, the markers may not be set on correct
expected places depending on how INTERNAL_SYSCALL_NCS is
implemented by the architecture. It is expected that all
architectures add an arch-specific implementation.
3. Rewrite SIGCANCEL asynchronous handler to check for both canceling
type and if current IP from signal handler falls between the global
markers and act accordingly.
4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to
use the appropriate cancelable syscalls.
5. Adjust 'lowlevellock-futex.h' arch-specific implementations to
provide cancelable futex calls.
Some architectures require specific support on syscall handling:
* On i386 the syscall cancel bridge needs to use the old int80
instruction because the optimized vDSO symbol the resulting PC value
for an interrupted syscall points to an address outside the expected
markers in __syscall_cancel_arch. It has been discussed in LKML [1]
on how kernel could help userland to accomplish it, but afaik
discussion has stalled.
Also, sysenter should not be used directly by libc since its calling
convention is set by the kernel depending of the underlying x86 chip
(check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60).
* mips o32 is the only kABI that requires 7 argument syscall, and to
avoid add a requirement on all architectures to support it, mips
support is added with extra internal defines.
Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu,
powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and
x86_64-linux-gnu.
[1] https://lkml.org/lkml/2016/3/8/1105
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Some Linux interfaces never restart after being interrupted by a signal
handler, regardless of the use of SA_RESTART [1]. It means that for
pthread cancellation, if the target thread disables cancellation with
pthread_setcancelstate and calls such interfaces (like poll or select),
it should not see spurious EINTR failures due the internal SIGCANCEL.
However recent changes made pthread_cancel to always sent the internal
signal, regardless of the target thread cancellation status or type.
To fix it, the previous semantic is restored, where the cancel signal
is only sent if the target thread has cancelation enabled in
asynchronous mode.
The cancel state and cancel type is moved back to cancelhandling
and atomic operation are used to synchronize between threads. The
patch essentially revert the following commits:
8c1c0aae20 nptl: Move cancel type out of cancelhandling
2b51742531 nptl: Move cancel state out of cancelhandling
26cfbb7162 nptl: Remove CANCELING_BITMASK
However I changed the atomic operation to follow the internal C11
semantic and removed the MACRO usage, it simplifies a bit the
resulting code (and removes another usage of the old atomic macros).
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
and powerpc64-linux-gnu.
[1] https://man7.org/linux/man-pages/man7/signal.7.html
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dchttps://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Now that the thread cancellation type is not accessed concurrently
anymore, it is possible to move it out the cancelhandling.
By removing the cancel state out of the internal thread cancel handling
state there is no need to check if cancelled bit was set in CAS
operation.
It allows simplifing the cancellation wrappers and the
CANCEL_CANCELED_AND_ASYNCHRONOUS is removed.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Now that thread cancellation state is not accessed concurrently anymore,
it is possible to move it out the 'cancelhandling'.
The code is also simplified: CANCELLATION_P is replaced with a
internal pthread_testcancel call and the CANCELSTATE_BIT{MASK} is
removed.
With this behavior pthread_setcancelstate does not require to act on
cancellation if cancel type is asynchronous (is already handled either
by pthread_setcanceltype or by the signal handler).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
The CANCELING_BITMASK is used as an optimization to avoid sending
the signal when pthread_cancel is called in a concurrent manner.
This requires then to put both the cancellation state and type on a
shared state (cancelhandling), since 'pthread_cancel' checks whether
cancellation is enabled and asynchrnous to either cancel itself of
sending the signal.
It also requires handle the CANCELING_BITMASK on
__pthread_disable_asynccancel, however this incurs in the same issues
described on BZ#12683: the cancellation is acted upon even *after*
syscall returns with user visible side-effects.
This patch removes this optimization and simplifies the pthread
cancellation implementation: pthread_cancel now first checks if
cancellation is already pending and if not always, sends a signal
if the target is not itself. The SIGCANCEL handler is also simpified
since there is not need to setup a CAS loop.
It also allows to move both the cancellation state and mode out of
'cancelhadling' (it is done in subsequent patches).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt. This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.
The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros. They call the
__pthread_* functions unconditionally now. The macros are still
needed because shared code uses them; Hurd has different definitions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Document in comments that __pthread_enable_asynccancel and
__pthread_disable_asynccancel must be AS-safe in general with
the exception of the act of cancellation.
This adds new functions for futex operations, starting with wait,
abstimed_wait, reltimed_wait, wake. They add documentation and error
checking according to the current draft of the Linux kernel futex manpage.
Waiting with absolute or relative timeouts is split into separate functions.
This allows for removing a few cases of code duplication in pthreads code,
which uses absolute timeouts; also, it allows us to put platform-specific
code to go from an absolute to a relative timeout into the platform-specific
futex abstractions..
Futex operations that can be canceled are also split out into separate
functions suffixed by "_cancelable".
There are separate versions for both Linux and NaCl; while they currently
differ only slightly, my expectation is that the separate versions of
lowlevellock-futex.h will eventually be merged into futex-internal.h
when we get to move the lll_ functions over to the new futex API.
The bits tested to decide when to delay the return when switching
off async cancel mode were wrong. Fix that. Also close a race
condition in pthread_cancel where the bit indicating the cancellation
is unconditionally set even if the cancel type might have changed.
When disabling async cancellation we cannot return from the function
call if the thread is canceled. This happens when the cancel bits
have been set before async cancel is disabled but the signal hasn't
been sent/received yet. Delay for as long as necessary since
otherwise the signal might be received in an unsafe context.
2002-12-13 Ulrich Drepper <drepper@redhat.com>
* misc/syslog.c (log_cleanup): Don't use parameter in
__libc_lock_unlock call, use syslog_lock directly. Adjust callers to
pass NULL instead of a pointer to syslog_lock.
2002-11-26 Ulrich Drepper <drepper@redhat.com>
* allocatestack.c (queue_stack): Don't remove stack from list here.
Do it in the caller. Correct condition to prematurely terminate
loop to free stacks.
(__deallocate_stack): Remove stack from list here.
2002-11-26 Ulrich Drepper <drepper@redhat.com>
* Makefile (tests): Add tst-stack1.
* tst-stack1.c: New file.
* allocatestack.c (allocate_stack): Initialize the TCB on a user
provided stack.
* pthread_attr_getstack.c: Return bottom of the thread area.
2002-11-25 Ulrich Drepper <drepper@redhat.com>
* Makefile (libpthread-routines): Add pt-allocrtsig and
pthread_kill_other_threads.
* pt-allocrtsig.c: New file.
* pthread_kill_other_threads.c: New file.
* sysdeps/unix/sysv/linux/allocrtsig.c: Add additional aliases for
all three functions.
* sysdeps/unix/sysv/linux/Makefile (sysdep_routines): Remove
allocrtsig.
* sysdeps/unix/sysv/linux/Versions (libc:GLIBC_PRIVATE): Export
__libc_current_sigrtmin_private, __libc_current_sigrtmax_private,
and __libc_allocate_rtsig_private.
* Versions (libpthread): Export pthread_kill_other_threads_np,
__libc_current_sigrtmin, and __libc_current_sigrtmax.
2002-11-24 Ulrich Drepper <drepper@redhat.com>
* allocatestack.c (allocate_stack): stackaddr in attribute points to
the end of the stack. Adjust computations.
When mprotect call fails dequeue stack and free it.
* pthread_attr_setstack.c: Store top of the stack in stackaddr
attribute.
* pthread_getattr_np.c: Likewise.
* descr.h (IS_DETACHED): Add some more parenthesis to prevent
surprises.
2002-11-23 Ulrich Drepper <drepper@redhat.com>
* sysdeps/pthread/pthread.h (pthread_self): __THROW must come before
attribute definitions. Patch by Luca Barbieri <ldb@ldb.ods.org>.
2002-11-22 Ulrich Drepper <drepper@redhat.com>
* pthread_getspecific.c: Optimize access to first 2nd-level array.
* pthread_setspecific.c: Likewise.
2002-11-21 Ulrich Drepper <drepper@redhat.com>
* sysdeps/unix/sysv/linux/i386/createthread.c: Remove CLONE_ flags
definitions. Get them from the official place.
* sysdeps/unix/sysv/linux/i386/fork.c: Likewise.
* sysdeps/unix/sysv/linux/i386/createthread.c: Update CLONE_* flags.
Use new CLONE_ flags in clone() calls.
* sysdeps/unix/sysv/linux/fork.c: Use ARCH_FORK to actually fork.
* sysdeps/unix/sysv/linux/i386/fork.c: New file.
* Versions: Add pthread_* functions for libc.
* forward.c: New file.
* sysdeps/pthread/Makefile (libpthread-sysdeps_routines): Add
errno-loc.
* herrno.c: New file.
* res.c: New file.
* Makefile (libpthread-routines): Remove sem_post, sem_wait,
sem_trywait, and sem_timedwait. Add herrno and res.
* sem_init.c: Don't initialize lock and waiters members.
* sem_open.c: Likewise.
* sem_post.c: Removed.
* sem_wait.c: Removed.
* sem_trywait.c: Removed.
* sem_timedwait.c: Removed.
* sysdeps/unix/sysv/linux/i386/i486/lowlevelsem.S: Complete rewrite.
Includes full implementations of sem_post, sem_wait, sem_trywait,
and sem_timedwait.
* sysdeps/unix/sysv/linux/i386/lowlevelsem.h (lll_sem_post): Adjust
for new implementation.
* sysdeps/unix/sysv/linux/internaltypes.h (struct sem): Remove lock
and waiters fields.
* tst-sem3.c: Improve error message.
* tst-signal3.c: Likewise.
* init.c (__pthread_initialize_minimal): Use set_tid_address syscall
to tell the kernel about the termination futex and to initialize tid
member. Don't initialize main_thread.
* descr.h (struct pthread): Remove main_thread member.
* cancelllation.c (__do_cancel): Remove code handling main thread.
The main thread is not special anymore.
* allocatestack.c (__reclaim_stacks): Mark stacks as unused. Add
size of the stacks to stack_cache_actsize.
* pt-readv.c: Add missing "defined".
* pt-sigwait.c: Likewise.
* pt-writev.c: Likewise.
2002-11-09 Ulrich Drepper <drepper@redhat.com>
* Versions: Export __connect from libpthread.
Patch by Luca Barbieri <ldb@ldb.ods.org>.
* Makefile (libpthread-routines): Add pt-raise.
* sysdeps/unix/sysv/linux/raise.c: New file.
* sysdeps/unix/sysv/linux/pt-raise.c: New file.
* sysdeps/generic/pt-raise.c: New file.
* pthread_cond_init.c: Initialize all data elements of the condvar
structure. Patch by Luca Barbieri <ldb@ldb.ods.org>.
* pthread_attr_init.c: Actually implement 2.0 compatibility version.
* pthread_create.c: Likewise.
* Makefile (tests): Add tst-key1, tst-key2, tst-key3.
* tst-key1.c: New file.
* tst-key2.c: New file.
* tst-key3.c: New file.
* Versions: Export pthread_detach for version GLIBC_2.0.
Reported by Saurabh Desai <sdesai@austin.ibm.com>.
2002-11-08 Ulrich Drepper <drepper@redhat.com>
* pthread_key_create.c: Terminate search after an unused key was found.
Patch by Luca Barbieri <ldb@ldb.ods.org>.
* sysdeps/unix/sysv/linux/i386/pthread_once.S: Return zero.
Patch by Luca Barbieri <ldb@ldb.ods.org>.
2002-10-10 Ulrich Drepper <drepper@redhat.com>
* sysdeps/unix/sysv/linux/i386/i486/lowlevelsem.S: Use slow generic
dynamic lookup for errno in PIC.
* allocatestack.c (get_cached_stack): Rearrange code slightly to
release the stack lock as soon as possible.
Call _dl_allocate_tls_init for TCB from the cache to re-initialize
the static TLS block.
(allocate_stack): Call _dl_allocate_tls_init for user-provided stack.
* cancellation.c: Renamed from cancelation.c.
* Makefile: Adjust accordingly.
* pthreadP.h (CANCELLATION_P): Renamed from CANCELATION_P.
* cleanup_defer.c: Use CANCELLATION_P.
* pthread_testcancel.c: Likewise.
* descr.h: Fix spelling in comments.
* init.c: Likewise.
* pthread_getattr_np.c: Likewise.
* pthread_getschedparam.c: Likewise.
* pthread_setschedparam.c: Likewise.
* Versions: Likewise.
* pt-pselect.c: New file.
* Makefile (libpthread-routines): Add pt-pselect.
* Versions: Add pselect.
* tst-cancel4.c: New file.
* Makefile (tests): Add tst-cancel4.
2002-10-09 Ulrich Drepper <drepper@redhat.com>
* pthread_mutex_lock.c: Always record lock ownership.
* pthread_mutex_timedlock.c: Likewise.
* pthread_mutex_trylock.c: Likewise.
* pt-readv.c: New file.
* pt-writev.c: New file.
* pt-creat.c: New file.
* pt-msgrcv.c: New file.
* pt-msgsnd.c: New file.
* pt-poll.c: New file.
* pt-select.c: New file.
* pt-sigpause.c: New file.
* pt-sigsuspend.c: New file.
* pt-sigwait.c: New file.
* pt-sigwaitinfo.c: New file.
* pt-waitid.c: New file.
* Makefile (libpthread-routines): Add pt-readv, pt-writev, pt-creat,
pt-msgrcv, pt-msgsnd, pt-poll, pt-select, pt-sigpause, pt-sigsuspend,
pt-sigwait, pt-sigwaitinfo, and pt-waitid.
* Versions: Add all the new functions.
* tst-exit1.c: New file.
* Makefile (tests): Add tst-exit1.
* sem_timedwait.c: Minor optimization for more optimal fastpath.
2002-10-08 Ulrich Drepper <drepper@redhat.com>
* pt-fcntl.c: Only enable asynchronous cancellation for F_SETLKW.
* pthread_join.c: Enable asynchronous cancellation around lll_wait_tid
call. pthread_join is an official cancellation point.
* pthread_timedjoin.c: Likewise.
* pthread_cond_wait.c: Revert order in which internal lock are dropped
and the condvar's mutex are retrieved.
* pthread_cond_timedwait.c: Likewise.
Reported by dice@saros.East.Sun.COM.
2002-10-07 Ulrich Drepper <drepper@redhat.com>
* pthreadP.h: Cut out all type definitions and move them...
* sysdeps/unix/sysv/linux/internaltypes.h: ...here. New file.
* pthreadP.h: Include <internaltypes.h>.
* sysdeps/unix/sysv/linux/i386/lowlevelsem.h (lll_sem_post): Little
performance tweaks.
* sem_trywait.c: Shuffle #includes around to get right order.
* sem_timedwait.c: Likewise.
* sem_post.c: Likewise.
* sem_wait.c: Likewise.
* nptl 0.3 released.
* Makefile (tests): Add tst-signal3.
* tst-signal3.c: New file.
2002-10-05 Ulrich Drepper <drepper@redhat.com>
* sysdeps/unix/sysv/linux/i386/lowlevelsem.h: Tell the compiler that
the asms modify the sem object.
(__lll_sem_timedwait): Now takes struct sem* as first parameter.
* sysdeps/unix/sysv/linux/i386/bits/semaphore.h (sem_t): Don't expose
the actual members.
* pthreadP.h (struct sem): New type. Actual semaphore type.
* semaphoreP.h: Include pthreadP.h.
* sem_getvalue.c: Adjust to sem_t change.
* sem_init.c: Likewise.
* sem_open.c: Likewise.
* sem_post.c: Likewise.
* sem_timedwait.c: Likewise.
* sem_trywait.c: Likewise.
* sem_wait.c: Likewise.
2002-10-04 Ulrich Drepper <drepper@redhat.com>
* Makefile (tests): Add tst-basic2, tst-exec1, tst-exec3, tst-exec3.
* tst-basic2.c: New file.
* tst-exec1.c: New file.
* tst-exec2.c: New file.
* tst-exec3.c: New file.
* tst-fork1.c: Remove extra */.
* nptl 0.2 released. The API for IA-32 is complete.