Some Linux interfaces never restart after being interrupted by a signal
handler, regardless of the use of SA_RESTART [1]. It means that for
pthread cancellation, if the target thread disables cancellation with
pthread_setcancelstate and calls such interfaces (like poll or select),
it should not see spurious EINTR failures due the internal SIGCANCEL.
However recent changes made pthread_cancel to always sent the internal
signal, regardless of the target thread cancellation status or type.
To fix it, the previous semantic is restored, where the cancel signal
is only sent if the target thread has cancelation enabled in
asynchronous mode.
The cancel state and cancel type is moved back to cancelhandling
and atomic operation are used to synchronize between threads. The
patch essentially revert the following commits:
8c1c0aae20 nptl: Move cancel type out of cancelhandling
2b51742531 nptl: Move cancel state out of cancelhandling
26cfbb7162 nptl: Remove CANCELING_BITMASK
However I changed the atomic operation to follow the internal C11
semantic and removed the MACRO usage, it simplifies a bit the
resulting code (and removes another usage of the old atomic macros).
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
and powerpc64-linux-gnu.
[1] https://man7.org/linux/man-pages/man7/signal.7.html
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
Now that thread cancellation state is not accessed concurrently anymore,
it is possible to move it out the 'cancelhandling'.
The code is also simplified: CANCELLATION_P is replaced with a
internal pthread_testcancel call and the CANCELSTATE_BIT{MASK} is
removed.
With this behavior pthread_setcancelstate does not require to act on
cancellation if cancel type is asynchronous (is already handled either
by pthread_setcanceltype or by the signal handler).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
The CANCELING_BITMASK is used as an optimization to avoid sending
the signal when pthread_cancel is called in a concurrent manner.
This requires then to put both the cancellation state and type on a
shared state (cancelhandling), since 'pthread_cancel' checks whether
cancellation is enabled and asynchrnous to either cancel itself of
sending the signal.
It also requires handle the CANCELING_BITMASK on
__pthread_disable_asynccancel, however this incurs in the same issues
described on BZ#12683: the cancellation is acted upon even *after*
syscall returns with user visible side-effects.
This patch removes this optimization and simplifies the pthread
cancellation implementation: pthread_cancel now first checks if
cancellation is already pending and if not always, sends a signal
if the target is not itself. The SIGCANCEL handler is also simpified
since there is not need to setup a CAS loop.
It also allows to move both the cancellation state and mode out of
'cancelhadling' (it is done in subsequent patches).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
It can be replaced with a __futex_abstimed_wait_cancelable64 call,
with the advantage that there is no need to further clock adjustments
to create a absolute timeout. It allows to remove the now ununsed
futex_timed_wait_cancel64 internal function.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
The pthread_clockjoin_np and pthread_timedjoin_np have been converted to
support 64 bit time.
This change introduces new futex_timed_wait_cancel64 function in
./sysdeps/nptl/futex-internal.h, which uses futex_time64 where possible
and tries to replace low-level preprocessor macros from
lowlevellock-futex.h
The pthread_{timed|clock}join_np only accept absolute time. Moreover,
there is no need to check for NULL passed as *abstime pointer as
clockwait_tid() always passes struct __timespec64.
For systems with __TIMESIZE != 64 && __WORDSIZE == 32:
- Conversions between 64 bit time to 32 bit are necessary
- Redirection to __pthread_{clock|timed}join_np64 will provide support
for 64 bit time
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Above tests were performed with Y2038 redirection applied as well as without
to test the proper usage of both __pthread_{timed|clock}join_np64 and
__pthread_{timed|clock}join_np.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Introduce pthread_clockjoin_np as a version of pthread_timedjoin_np that
accepts a clockid_t parameter to indicate which clock the timeout should be
measured against. This mirrors the recently-added POSIX-proposed "clock"
wait functions.
Checked on x86_64-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since gettimeofday will shortly be implemented in terms of
clock_gettime on all platforms, internal code should use clock_gettime
directly; in addition to removing a layer of indirection, this will
allow us to remove the PLT-bypass gunk for gettimeofday. (We can't
quite do that yet, but it'll be coming later in this patch series.)
In many cases, the changed code does fewer conversions.
The changed code always assumes __clock_gettime (CLOCK_REALTIME)
cannot fail. Most of the call sites were assuming gettimeofday could
not fail, but a few places were checking for errors. POSIX says
clock_gettime can only fail if the clock constant is invalid or
unsupported, and CLOCK_REALTIME is the one and only clock constant
that's required to be supported. For consistency I grepped the entire
source tree for any other places that checked for errors from
__clock_gettime (CLOCK_REALTIME), found one, and changed it too.
(For the record, POSIX also says gettimeofday can never fail.)
(It would be nice if we could declare that GNU systems will always
support CLOCK_MONOTONIC as well as CLOCK_REALTIME; there are several
places where we are using CLOCK_REALTIME where _MONOTONIC would be
more appropriate, and/or trying to use _MONOTONIC and then falling
back to _REALTIME. But the Hurd doesn't support CLOCK_MONOTONIC yet,
and it looks like adding it would involve substantial changes to
gnumach's internals and API. Oh well.)
A few Hurd-specific files were changed to use __host_get_time instead
of __clock_gettime, as this seemed tidier. We also assume this cannot
fail. Skimming the code in gnumach leads me to believe the only way
it could fail is if __mach_host_self also failed, and our
Hurd-specific code consistently assumes that can't happen, so I'm
going with that.
With the exception of support/support_test_main.c, test cases are not
modified, mainly because I didn't want to have to figure out which
test cases were testing gettimeofday specifically.
The definition of GETTIME in sysdeps/generic/memusage.h had a typo and
was not reading tv_sec at all. I fixed this. It appears nobody has been
generating malloc traces on a machine that doesn't have a superseding
definition.
There are a whole bunch of places where the code could be simplified
by factoring out timespec subtraction and/or comparison logic, but I
want to keep this patch as mechanical as possible.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
powerpc64-linux-gnu, powerpc-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Lukasz Majewski <lukma@denx.de>
The valid_nanoseconds () static inline function has been introduced to
check if nanoseconds value is in the correct range - greater or equal to
zero and less than 1000000000.
The explicit #include <time.h> has been added to files where it was
missing.
The __syscall_slong_t type for ns has been used to avoid issues on x32.
Tested with:
- scripts/build-many-glibcs.py
- make PARALLELMFLAGS="-j12" && make PARALLELMFLAGS="-j12" xcheck on x86_64
After commit f1ac745583 ("arm: Use "nr"
constraint for Systemtap probes [BZ #24164]"), we load pd->result into
a register in the probe below:
/* Free the TCB. */
__free_tcb (pd);
}
else
pd->joinid = NULL;
LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result);
However, at this point, the thread descriptor has been freed. If the
thread stack does not fit into the thread stack cache, the memory will
have been unmapped, and the program will crash in the probe.
Patch ce7eb0e903 ("nptl: Cleanup cancellation macros") changed the
join sequence for internal common __pthread_timedjoin_ex to use the
new macro lll_wait_tid. The idea was this macro would issue the
cancellable futex operation depending whether the timeout is used or
not. However if a timeout is used, __lll_timedwait_tid is called and
it is not a cancellable entrypoint.
This patch fixes it by simplifying the code in various ways:
- Instead of adding the cancellation handling on __lll_timedwait_tid,
it moves the generic implementation to pthread_join_common.c (called
now timedwait_tid with some fixes to use the correct type for pid).
- The llvm_wait_tid macro is removed, along with its replication on
x86_64, i686, and sparc arch-specific lowlevellock.h.
- sparc32 __lll_timedwait_tid is also removed, since the code is similar
to generic one.
- x86_64 and i386 provides arch-specific __lll_timedwait_tid which is
also removed since they are similar in functionality to generic C code
and there is no indication it is better than compiler generated code.
New tests, tst-join8 and tst-join9, are provided to check if
pthread_timedjoin_np acts as a cancellation point.
Checked on x86_64-linux-gnu, i686-linux-gnu, sparcv9-linux-gnu, and
aarch64-linux-gnu.
[BZ #24215]
* nptl/Makefile (lpthread-routines): Remove lll_timedwait_tid.
(tests): Add tst-join8 tst-join9.
* nptl/lll_timedwait_tid.c: Remove file.
* sysdeps/sparc/sparc32/lll_timedwait_tid.c: Likewise.
* sysdeps/unix/sysv/linux/i386/lll_timedwait_tid.c: Likewise.
* sysdeps/sysv/linux/x86_64/lll_timedwait_tid.c: Likewise.
* nptl/pthread_join_common.c (timedwait_tid): New function.
(__pthread_timedjoin_ex): Act as cancellation entrypoint is block
is set.
* nptl/tst-join5.c (thread_join): New function.
(tf1, tf2, do_test): Use libsupport and add pthread_timedjoin_np
check.
* nptl/tst-join8.c: New file.
* nptl/tst-join9.c: Likewise.
* sysdeps/nptl/lowlevellock-futex.h (lll_futex_wait_cancel,
lll_futex_timed_wait_cancel): Add generic macros.
* sysdeps/nptl/lowlevellock.h (__lll_timedwait_tid, lll_wait_tid):
Remove definitions.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h: Likewise.
* sysdeps/sparc/sparc32/lowlevellock.c (__lll_timedwait_tid):
Remove function.
* sysdeps/unix/sysv/linux/i386/lowlevellock.S (__lll_timedwait_tid):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/lowlevellock-futex.h
(lll_futex_timed_wait_cancel): New macro.
This patch wraps all uses of *_{enable,disable}_asynccancel and
and *_CANCEL_{ASYNC,RESET} in either already provided macros
(lll_futex_timed_wait_cancel) or creates new ones if the
functionality is not provided (SYSCALL_CANCEL_NCS, lll_futex_wait_cancel,
and lll_futex_timed_wait_cancel).
Also for some generic implementations, the direct call of the macros
are removed since the underlying symbols are suppose to provide
cancellation support.
This is a priliminary patch intended to simplify the work required
for BZ#12683 fix. It is a refactor change, no semantic changes are
expected.
Checked on x86_64-linux-gnu and i686-linux-gnu.
* nptl/pthread_join_common.c (__pthread_timedjoin_ex): Use
lll_wait_tid with timeout.
* nptl/sem_wait.c (__old_sem_wait): Use lll_futex_wait_cancel.
* sysdeps/nptl/aio_misc.h (AIO_MISC_WAIT): Use
futex_reltimed_wait_cancelable for cancelabla mode.
* sysdeps/nptl/gai_misc.h (GAI_MISC_WAIT): Likewise.
* sysdeps/posix/open64.c (__libc_open64): Do not call cancelation
macros.
* sysdeps/posix/sigwait.c (__sigwait): Likewise.
* sysdeps/posix/waitid.c (__sigwait): Likewise.
* sysdeps/unix/sysdep.h (__SYSCALL_CANCEL_CALL,
SYSCALL_CANCEL_NCS): New macro.
* sysdeps/nptl/lowlevellock.h (lll_wait_tid): Add timeout argument.
(lll_timedwait_tid): Remove macro.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h (lll_wait_tid):
Likewise.
(lll_timedwait_tid): Likewise.
* sysdeps/unix/sysv/linux/sparc/lowlevellock.h (lll_wait_tid):
Likewise.
(lll_timedwait_tid): Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (lll_wait_tid):
Likewise.
(lll_timedwait_tid): Likewise.
* sysdeps/unix/sysv/linux/clock_nanosleep.c (__clock_nanosleep):
Use INTERNAL_SYSCALL_CANCEL.
* sysdeps/unix/sysv/linux/futex-internal.h
(futex_reltimed_wait_cancelable): Use LIBC_CANCEL_{ASYNC,RESET}
instead of __pthread_{enable,disable}_asynccancel.
* sysdeps/unix/sysv/linux/lowlevellock-futex.h
(lll_futex_wait_cancel): New macro.
This patch consolidates the pthread_join and gnu extensions to avoid
code duplication. The function pthread_join, pthread_tryjoin_np, and
pthread_timedjoin_np are now based on pthread_timedjoin_ex.
It also fixes some inconsistencies on ESRCH, EINVAL, EDEADLK handling
(where each implementation differs from each other) and also on
clenup handler (which now always use a CAS).
Checked on i686-linux-gnu and x86_64-linux-gnu.
* nptl/pthreadP.h (__pthread_timedjoin_np): Define.
* nptl/pthread_join.c (pthread_join): Use __pthread_timedjoin_np.
* nptl/pthread_tryjoin.c (pthread_tryjoin): Likewise.
* nptl/pthread_timedjoin.c (cleanup): Use CAS on argument setting.
(pthread_timedjoin_np): Define internal symbol and common code from
pthread_join.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h (__lll_timedwait_tid):
Remove superflous checks.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_timedwait_tid):
Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>