glibc/nptl/tst-join7mod.c


Use IE model for static variables in libc.so, libpthread.so and rtld

The recently introduced TLS variables in the thread-local destructor implementation (__cxa_thread_atexit_impl) used the default GD access model, resulting in a call to __tls_get_addr. This causes a deadlock with recent changes to the way TLS is initialized: because DTV allocations are delayed, the thread has to take the global rtld lock to safely update the TLS offset, even though it already knows the offset to the variable inside its TLS block. This causes deadlocks when a thread is instantiated and joined inside a destructor of a dlopen'd DSO.

The correct long-term fix is to somehow not take the lock, but that will need a much deeper change set to alter the way in which the big rtld lock is used. Instead, this patch just eliminates the call to __tls_get_addr for the thread-local variables inside libc.so, libpthread.so and rtld by building all of their units with -ftls-model=initial-exec.

There were concerns that the static storage for TLS is limited and hence we should not be using it. Additionally, dynamically loaded modules may result in libc.so looking for this static storage pretty late in static binaries. Both concerns are valid when using TLSDESC, since that is where one may attempt to allocate a TLS block from static storage even for variables that are not IE. They are not very strong arguments for the traditional TLS model though, since it assumes that the static storage would be used sparingly and definitely not by default. Hence, for now this would only theoretically affect ARM architectures.

The impact is hence limited to statically linked binaries that dlopen modules that in turn load libc.so, and only on ARM hardware. It seems like a small enough impact to justify fixing the larger problem that currently affects everything everywhere.

This still does not solve the original problem completely. That is, it is still possible to deadlock on the big rtld lock with a small tweak to the test case attached to this patch. That problem is however not a regression in 2.22 and hence could be tackled as a separate project. The test case is picked up as is from Alex's patch.

This change has been tested to verify that it does not cause any issues on x86_64.

ChangeLog:

[BZ #18457]
* nptl/Makefile (tests): New test case tst-join7.
(modules-names): New test case module tst-join7mod.
* nptl/tst-join7.c: New file.
* nptl/tst-join7mod.c: New file.
* Makeconfig (tls-model): Pass -ftls-model=initial-exec for all translation units in libc.so, libpthread.so and rtld.
2015-07-24 13:43:38 +00:00
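The two access models named in the commit message can be compared on a single translation unit. The following is a minimal sketch, assuming GCC or Clang on an ELF target; the names counter_dyn, counter_ie and bump_counters are hypothetical and do not come from glibc. The per-variable tls_model attribute has the same effect on one variable that -ftls-model=initial-exec has on a whole translation unit.

```c
/* Built with -fPIC into a shared object, a TLS variable without an
   explicit model typically gets one of the dynamic models (global- or
   local-dynamic), whose accesses may go through __tls_get_addr.  */
static __thread int counter_dyn;

/* Hypothetical per-variable override: with initial-exec the address is
   a fixed offset from the thread pointer, so the access does not call
   into the dynamic linker.  */
static __thread int counter_ie __attribute__ ((tls_model ("initial-exec")));

/* Increment both thread-local counters and return their sum.  */
int
bump_counters (void)
{
  counter_dyn++;
  counter_ie++;
  return counter_dyn + counter_ie;
}
```

Compiling this with -fPIC -S and inspecting the assembly is one way to see whether a call to __tls_get_addr is emitted for each variable; the exact output depends on the target and the compiler options.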
/* Verify that TLS access in a separate thread in a dlopened library
   does not deadlock - the module.
   Copyright (C) 2015-2020 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <https://www.gnu.org/licenses/>.  */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <atomic.h>

static pthread_t th;
static int running = 1;

/* Thread body: keep printing until the destructor clears RUNNING.  */
static void *
test_run (void *p)
{
  while (atomic_load_relaxed (&running))
    printf ("Test running\n");
  printf ("Test finished\n");
  return NULL;
}

/* Runs when the module is loaded: start the worker thread.  */
static void __attribute__ ((constructor))
do_init (void)
{
  int ret = pthread_create (&th, NULL, test_run, NULL);

  if (ret != 0)
    {
      printf ("failed to create thread: %s (%d)\n", strerror (ret), ret);
      exit (1);
    }
}

/* Runs when the module is unloaded: stop the worker thread and join it.  */
static void __attribute__ ((destructor))
do_end (void)
{
  atomic_store_relaxed (&running, 0);
  int ret = pthread_join (th, NULL);

  if (ret != 0)
    {
      printf ("pthread_join: %s(%d)\n", strerror (ret), ret);
      exit (1);
    }

  printf ("Thread joined\n");
}
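
The module above only does something when a program loads and unloads it. The following is a minimal sketch of such a driver, not the contents of nptl/tst-join7.c; the function name load_and_unload is hypothetical, and the module path is left to the caller. It relies only on the POSIX dlopen, dlclose and dlerror calls.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load the shared object at PATH and unload it again.  Loading runs the
   module's constructors; unloading is expected to run its destructors,
   provided this was the last reference to the module.
   Returns 0 on success and -1 on failure.  */
int
load_and_unload (const char *path)
{
  void *handle = dlopen (path, RTLD_LAZY);

  if (handle == NULL)
    {
      fprintf (stderr, "dlopen: %s\n", dlerror ());
      return -1;
    }

  if (dlclose (handle) != 0)
    {
      fprintf (stderr, "dlclose: %s\n", dlerror ());
      return -1;
    }

  return 0;
}
```

Called with the path of the built module, this would start the worker thread in do_init during dlopen and join it in do_end during dlclose, which is the sequence the commit message describes as deadlocking before the change. On older glibc versions the program may need to be linked with -ldl.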