If the test fails due some unexpected failure after the children
creation, either in the signal handler by calling abort or in the main
loop; the created children might not be killed properly.
This patches fixes it by:
* Avoid aborting in the signal handler by setting a flag that
an error has occured and add a check in the main loop.
* Add a atexit handler to handle kill child processes.
Checked on x86_64-linux-gnu.
This synchronization method has a lower overhead and makes
it more likely that the signal arrives during one of the critical
functions.
Also test for fork deadlocks explicitly.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This patch increases timeouts on some tests I've observed timing out.
elf/tst-tls13 and iconvdata/tst-loading both dynamically load many
objects and so are slow when testing over NFS. They had timeouts set
from before the default changed from 2 to 20 seconds; this patch
removes those old settings, so effectively increasing the timeout to
20 seconds (from 3 and 10 seconds respectively).
malloc/tst-malloc-thread-fail.c and malloc/tst-mallocfork2.c are slow
on slow systems and so I set a fairly arbitrary 100 second timeout,
which seems to suffice on the system where I saw them timing out.
nss/tst-cancel-getpwuid_r.c and nss/tst-nss-getpwent.c are slow on
systems with a large passwd file; I set timeouts that empirically
worked for me. (It seems tst-cancel-getpwuid_r.c is hitting the
100000 getpwuid_r call limit in my testing, with each call taking a
bit over 0.007 seconds, so 700 seconds for the test.)
* elf/tst-tls13.c (TIMEOUT): Remove.
* iconvdata/tst-loading.c (TIMEOUT): Likewise.
* malloc/tst-malloc-thread-fail.c (TIMEOUT): Increase to 100.
* malloc/tst-mallocfork2.c (TIMEOUT): Define to 100.
* nss/tst-cancel-getpwuid_r.c (TIMEOUT): Define to 900.
* nss/tst-nss-getpwent.c (TIMEOUT): Define to 300.
The first SIGUSR1 signal could arrive when sigusr1_sender_pid
was still 0. As a result, kill would send SIGSTOP to the
entire process group. This would cause the test to hang before
printing any output.
This commit also adds a sched_yield to the signal source, so that
it does not flood the parent process with signals it has never a
chance to handle.
Even with these changes, tst-mallocfork2 still fails reliably
after the fix in commit commit 56290d6e76
(Increase fork signal safety for single-threaded processes) is
backed out.
This provides a band-aid and addresses the scenario where fork is
called from a signal handler while the process is in the malloc
subsystem (or has acquired the libio list lock). It does not
address the general issue of async-signal-safety of fork;
multi-threaded processes are not covered, and some glibc
subsystems have fork handlers which are not async-signal-safe.