As a result, is not necessary to specify __attribute__ ((nocommon))
on individual definitions.
GCC 10 defaults to -fno-common on all architectures except ARC,
but this change is compatible with older GCC versions and ARC, too.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This allows distributions to strip debugging information from
libc.so.6 without impacting the debugging experience.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The testcase provided on BZ#19366 may update __nptl_nthreads in a wrong
order, triggering an early process exit because the thread decrement
the value twice.
The issue is once the thread exits without acting on cancellation,
it decreaments '__nptl_nthreads' and then atomically set
'cancelhandling' with EXITING_BIT (thus preventing further cancellation
handler to act). The issue happens if a SIGCANCEL is received between
checking '__ntpl_nthreads' and setting EXITING_BIT. To avoid it, the
'__nptl_nthreads' decrement is moved after EXITING_BIT.
It does fully follow the POSIX XSH 2.9.5 Thread Cancellation under
the heading Thread Cancellation Cleanup Handlers that states that
when a cancellation request is acted upon, or when a thread calls
pthread_exit(), the thread first disables cancellation by setting its
cancelability state to PTHREAD_CANCEL_DISABLE and its cancelability type
to PTHREAD_CANCEL_DEFERRED. The issue is '__pthread_enable_asynccancel'
explicit enabled assynchrnous cancellation, so an interrupted syscall
within the cancellation cleanup handlers might see an invalid cancelling
type (a possible fix might be possible with my proposed solution to
BZ#12683).
Trying to come up with a test is quite hard since it requires to
mimic the timing issue described below, however I see that the
bug report reproducer does not early exit anymore.
Checked on x86_64-linux-gnu.
Now that cancellation is not used anymore to handle thread setup
creation failure, the sighandle can be installed only when
pthread_cancel is actually used.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
To setup either the thread scheduling parameters or affinity,
pthread_create enforce synchronization on created thread to wait until
its parent either release PD ownership or send a cancellation signal if
a failure occurs.
However, cancelling the thread does not deallocate the newly created
stack since cancellation expects that a pthread_join to deallocate any
allocated thread resouces (threads stack or TLS).
This patch changes on how the thread resource is deallocate in case of
failure to be synchronous, where the creating thread will signal the
created thread to exit early so it could be joined. The creating thread
will be reponsible for the resource cleanup before returning to the
caller.
To signal the creating thread that a failure has occured, an unused
'struct pthread' member, parent_cancelhandling_unsed, now indicates
whether the setup has failed so creating thread can proper exit.
This strategy also simplifies by not using thread cancellation and
thus not running libgcc_so load in the signal handler (which is
avoided in thread cancellation since 'pthread_cancel' is the one
responsible to dlopen libgcc_s). Another advantage is since the
early exit is move to first step at thread creation, the signal
mask is not already set and thus it can not act on change ID setxid
handler.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
The 'create_thread' function is moved to pthread_create.c. It removes
the START_THREAD_DEFN and START_THREAD_SELF macros and make the
lock usage more clear (no need to cross-reference multiple files).
No functional change.
The signal is sent to all threads, some of which may have switched
to very small stacks. If they have also installed an alternate
signal stack, SA_ONSTACK makes this work. The Go runtime needs this:
runtime: C.setuid/C.setgid smashes Go stack
<https://github.com/golang/go/issues/9400>
Doing this for SIGCANCEL is less obviously beneficial and needs further
testing.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The symbols were moved using scripts/move-symbol-to-libc.py.
The libpthread placeholder symbols need some changes because some
symbol versions have gone away completely. But
__errno_location@@GLIBC_2.0 still exists, so the GLIBC_2.0 version
is still there.
The internal __pthread_create symbol now points to the correct
function, so the sysdeps/nptl/thrd_create.c override is no longer
necessary.
There was an issue how the hidden alias of pthread_getattr_default_np
was defined, so this commit cleans up that aspects and removes the
GLIBC_PRIVATE export altogether.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
No abilist updates here because it is a GLIBC_PRIVATE symbol.
It's also necessary to move nptl_version into pthread_create, so
that it still ends up in static binaries.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Install signal handlers and unblock signals before pthread_create
creates the first thread.
create_thread in sysdeps/unix/sysv/linux/createthread.c can send
SIGCANCEL to the current thread, so the SIGCANCEL handler is currently
needed even if pthread_cancel is never called. (The way timer_create
uses SIGCANCEL does not need a signal handler; both SIG_DFL and SIG_IGN
dispositions should work.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This replaces the FREE_P macro with the __nptl_stack_in_use inline
function. stack_list_del is renamed to __nptl_stack_list_del,
stack_list_add to __nptl_stack_list_add, __deallocate_stack to
__nptl_deallocate_stack, free_stacks to __nptl_free_stacks.
It is convenient to move __libpthread_freeres into libc at the
same time. This removes the temporary __default_pthread_attr_freeres
export and restores full freeres coverage for __default_pthread_attr.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This removes the DEBUGGING_P macro and the __pthread_debug variable.
The __find_in_stack_list function is now unused and deleted as well.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt. This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.
The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros. They call the
__pthread_* functions unconditionally now. The macros are still
needed because shared code uses them; Hurd has different definitions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This initalization should only happen once for the main thread's TCB.
At present, auditors can achieve this by not linking against
libpthread. If libpthread becomes part of libc, doing this
initialization in libc would happen for every audit namespace,
or too late (if it happens from the main libc only). That's why
moving this code into ld.so seems the right thing to do, right after
the TCB initialization.
For !__ASSUME_SET_ROBUST_LIST ports, this also moves the symbol
__set_robust_list_avail into ld.so, as __nptl_set_robust_list_avail.
It also turned into a proper boolean flag.
Inline the __pthread_initialize_pids function because it seems no
longer useful as a separate function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
libthread_db is loaded once GDB encounters libpthread, and at this
point, ld.so may not have been processed by GDB yet. As a result,
_rtld_global cannot be accessed by regular means from libthread_db.
To make this work until GDB can be fixed, acess _rtld_global through
a pointer stored in libpthread.
The new test does not reproduce bug 27744 with
--disable-hardcoded-path-in-tests, but is still a valid smoke test.
With --enable-hardcoded-path-in-tests, it is necessary to avoid
add-symbol-file because this can tickle a GDB bug.
Fixes commit 1daccf403b ("nptl: Move
stack list variables into _rtld_global").
Tested-by: Emil Velikov <emil.velikov@collabora.com>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Building glibc with GCC 11 fails with (among other warnings) spurious
-Wstringop-overflow warnings from calls to setjmp and longjmp with a
pointer to a pthread_unwind_buf that is smaller than jmp_buf. As
discussed in bug 26647, the warning in libc-start.c is a false
positive, because setjmp and longjmp do not access anything (the
signal mask) beyond the common prefix of the two structures, so this
patch disables the warning for that call to setjmp, as well as for two
calls in NPTL code that produce the same warning and look like false
positives for the same reason.
Tested with build-many-glibcs.py for arm-linux-gnueabi, where this
allows the build to get further.
Reviewed-by: DJ Delorie <dj@redhat.com>
The kernel ABI is not finalized, and there are now various proposals
to change the size of struct rseq, which would make the glibc ABI
dependent on the version of the kernels used for building glibc.
This is of course not acceptable.
This reverts commit 48699da1c4 ("elf:
Support at least 32-byte alignment in static dlopen"), commit
8f4632deb3 ("Linux: rseq registration
tests"), commit 6e29cb3f61 ("Linux: Use
rseq in sched_getcpu if available"), and commit
0c76fc3c2b ("Linux: Perform rseq
registration at C startup and thread creation"), resolving the conflicts
introduced by the ARC port and the TLS static surplus changes.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The variable is placed in libc.so, and it can be true only in
an outer libc, not libcs loaded via dlmopen or static dlopen.
Since thread creation from inner namespaces does not work,
pthread_create can update __libc_single_threaded directly.
Using __libc_early_init and its initial flag, implementation of this
variable is very straightforward. A future version may reset the flag
during fork (but not in an inner namespace), or after joining all
threads except one.
Reviewed-by: DJ Delorie <dj@redhat.com>
Register rseq TLS for each thread (including main), and unregister for
each thread (excluding main). "rseq" stands for Restartable Sequences.
See the rseq(2) man page proposed here:
https://lkml.org/lkml/2018/9/19/647
Those are based on glibc master branch commit 3ee1e0ec5c.
The rseq system call was merged into Linux 4.18.
The TLS_STATIC_SURPLUS define is increased to leave additional room for
dlopen'd initial-exec TLS, which keeps elf/tst-auditmany working.
The increase (76 bytes) is larger than 32 bytes because it has not been
increased in quite a while. The cost in terms of additional TLS storage
is quite significant, but it will also obscure some initial-exec-related
dlopen failures.
User provided stack should not be released nor madvised at
thread exit because it's owned by the user.
If the memory is shared or file based then MADV_DONTNEED
can have unwanted effects. With memory tagging on aarch64
linux the tags are dropped and thus it may invalidate
pointers.
Tested on aarch64-linux-gnu with MTE, it fixes
FAIL: nptl/tst-stack3
FAIL: nptl/tst-stack3-mem
This introduces the function __pthread_attr_extension to allocate the
extension space, which is freed by pthread_attr_destroy.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
union pthread_attr_transparent has always the correct size, even if
pthread_attr_t has padding that is not present in struct pthread_attr.
This should not result in an observable behavioral change. The
existing code appears to have been correct, but it was brittle because
it was not clear which functions were allowed to write to an entire
pthread_attr_t argument (e.g., by copying it).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
There is a race between __nptl_setxid and exiting detached thread, which
causes a deadlock on stack_cache_lock. The deadlock happens in this
state:
T1: setgroups -> __nptl_setxid (holding stack_cache_lock, waiting on cmdp->cntr == 0)
T2 (detached, exiting): start_thread -> __deallocate_stack (waiting on stack_cache_lock)
more threads waiting on stack_cache_lock in pthread_create
For non-detached threads, start_thread waits for its own setxid handler to
finish before exiting. Do this for detached threads as well.
New threads inherit the signal mask from the current thread. This
means that signal handlers can run on the newly created thread
immediately after the kernel has created the userspace thread, even
before glibc has initialized the TCB. Consequently, new threads can
observe uninitialized ctype data, among other things.
To address this, block all signals before starting the thread, and
pass the original signal mask to the start routine wrapper. On the
new thread, first perform all thread initialization, and then unblock
signals.
The cost of doing this is two rt_sigprocmask system calls on the old
thread, and one rt_sigprocmask system call on the new thread. (If
there was a way to clone a new thread with a signals disabled, this
could be brought down to one system call each.) The thread descriptor
increases in size, too, and sigset_t is fairly large. This increase
could be brought down by reusing space the in the descriptor which is
not needed before running user code, or by switching to an internal
sigset_t definition which only covers the signals supported by the
kernel definition. (Part of the thread descriptor size increase is
already offset by reduced stack usage in the thread start wrapper
routine after this commit.)
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Due to the built-in tables, __NR_set_robust_list is always defined
(although it may not be available at run time).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
With all Linux ABIs using the expected Linux kABI to indicate
syscalls errors, the INTERNAL_SYSCALL_DECL is an empty declaration
on all ports.
This patch removes the 'err' argument on INTERNAL_SYSCALL* macro
and remove the INTERNAL_SYSCALL_DECL usage.
Checked with a build against all affected ABIs.
All nptl targets have these signal definitions nowadays. This
changes also replaces the nptl-generic version of pthread_sigmask
with the Linux version.
Tested on x86_64-linux-gnu and i686-linux-gnu. Built with
build-many-glibcs.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch removes CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID support
from clock_gettime and clock_settime generic implementation. For Linux, kernel
already provides supports through the syscall and Hurd HTL lacks
__pthread_clock_gettime and __pthread_clock_settime internal implementation.
As described in clock_gettime man-page [1] on 'Historical note for SMP
system', implementing CLOCK_{THREAD,PROCESS}_CPUTIME_ID with timer registers
is error-prone and susceptible to timing and accurary issues that the libc
can not deal without kernel support.
This allows removes unused code which, however, still incur in some runtime
overhead in thread creation (the struct pthread cpuclock_offset
initialization).
If hurd eventually wants to support them it should either either implement as
a kernel facility (or something related due its architecture) or in system
specific implementation.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked on a i686-gnu build.
* nptl/Makefile (libpthread-routines): Remove pthread_clock_gettime and
pthread_clock_settime.
* nptl/pthreadP.h (__find_thread_by_id): Remove prototype.
* elf/dl-support.c [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset): Remove.
(_dl_non_dynamic_init): Remove _dl_cpuclock_offset setting.
* elf/rtld.c (_dl_start_final): Likewise.
* nptl/allocatestack.c (__find_thread_by_id): Remove function.
* sysdeps/generic/ldsodefs.h [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset):
Remove.
* sysdeps/mach/hurd/dl-sysdep.c [!HP_TIMING_NOAVAIL]
(_dl_cpuclock_offset): Remove.
* nptl/descr.h (struct pthread): Rename cpuclock_offset to
cpuclock_offset_ununsed.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal): Remove
cpuclock_offset set.
* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* nptl/pthread_clock_gettime.c: Remove file.
* nptl/pthread_clock_settime.c: Likewise.
* sysdeps/unix/clock_gettime.c (hp_timing_gettime): Remove function.
[HP_TIMING_AVAIL] (realtime_gettime): Remove CLOCK_THREAD_CPUTIME_ID
and CLOCK_PROCESS_CPUTIME_ID support.
* sysdeps/unix/clock_settime.c (hp_timing_gettime): Likewise.
[HP_TIMING_AVAIL] (realtime_gettime): Likewise.
* sysdeps/posix/clock_getres.c (hp_timing_getres): Likewise.
[HP_TIMING_AVAIL] (__clock_getres): Likewise.
* sysdeps/unix/clock_nanosleep.c (CPUCLOCK_P, INVALID_CLOCK_P):
Likewise.
(__clock_nanosleep): Remove CPUCLOCK_P and INVALID_CLOCK_P usage.
[1] http://man7.org/linux/man-pages/man2/clock_gettime.2.html
This patch adds the thrd_* definitions from C11 threads (ISO/IEC 9899:2011),
more specifically thrd_create, thrd_curent, rhd_detach, thrd_equal,
thrd_exit, thrd_join, thrd_sleep, thrd_yield, and required types.
Mostly of the definitions are composed based on POSIX conterparts, such as
thrd_t (using pthread_t). For thrd_* function internally direct
POSIX pthread call are used with the exceptions:
1. thrd_start uses pthread_create internal implementation, but changes
how to actually calls the start routine. This is due the difference
in signature between POSIX and C11, where former return a 'void *'
and latter 'int'.
To avoid calling convention issues due 'void *' to int cast, routines
from C11 threads are started slight different than default pthread one.
Explicit cast to expected return are used internally on pthread_create
and the result is stored back to void also with an explicit cast.
2. thrd_sleep uses nanosleep internal direct syscall to avoid clobbering
errno and to handle expected standard return codes. It is a
cancellation entrypoint to be consistent with both thrd_join and
cnd_{timed}wait.
3. thrd_yield also uses internal direct syscall to avoid errno clobbering.
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
[BZ #14092]
* conform/Makefile (conformtest-headers-ISO11): Add threads.h.
(linknamespace-libs-ISO11): Add libpthread.a.
* conform/data/threads.h-data: New file: add C11 thrd_* types and
functions.
* include/stdc-predef.h (__STDC_NO_THREADS__): Remove definition.
* nptl/Makefile (headers): Add threads.h.
(libpthread-routines): Add new C11 thread thrd_create, thrd_current,
thrd_detach, thrd_equal, thrd_exit, thrd_join, thrd_sleep, and
thrd_yield.
* nptl/Versions (libpthread) [GLIBC_2.28]): Add new C11 thread
thrd_create, thrd_current, thrd_detach, thrd_equal, thrd_exit,
thrd_join, thrd_sleep, and thrd_yield symbols.
* nptl/descr.h (struct pthread): Add c11 field.
* nptl/pthreadP.h (ATTR_C11_THREAD): New define.
* nptl/pthread_create.c (START_THREAD_DEFN): Call C11 thread start
routine with expected function prototype.
(__pthread_create_2_1): Add C11 threads check based on attribute
value.
* sysdeps/unix/sysdep.h (INTERNAL_SYSCALL_CANCEL): New macro.
* nptl/thrd_create.c: New file.
* nptl/thrd_current.c: Likewise.
* nptl/thrd_detach.c: Likewise.
* nptl/thrd_equal.c: Likewise.
* nptl/thrd_exit.c: Likewise.
* nptl/thrd_join.c: Likewise.
* nptl/thrd_priv.h: Likewise.
* nptl/thrd_sleep.c: Likewise.
* nptl/thrd_yield.c: Likewise.
* include/threads.h: Likewise.
feature_1 has X86_FEATURE_1_IBT and X86_FEATURE_1_SHSTK bits for CET
run-time control.
CET_ENABLED, IBT_ENABLED and SHSTK_ENABLED are defined to 1 or 0 to
indicate that if CET, IBT and SHSTK are enabled.
<tls-setup.h> is added to set up thread-local data.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #22563]
* nptl/pthread_create.c: Include <tls-setup.h>.
(__pthread_create_2_1): Call tls_setup_tcbhead.
* sysdeps/generic/tls-setup.h: New file.
* sysdeps/x86/nptl/tls-setup.h: Likewise.
* sysdeps/i386/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New.
* sysdeps/x86_64/nptl/tcb-offsets.sym (FEATURE_1_OFFSET):
Likewise.
* sysdeps/i386/nptl/tls.h (tcbhead_t): Rename __glibc_reserved1
to feature_1.
* sysdeps/x86_64/nptl/tls.h (tcbhead_t): Likewise.
* sysdeps/x86/sysdep.h (X86_FEATURE_1_IBT): New.
(X86_FEATURE_1_SHSTK): Likewise.
(CET_ENABLED): Likewise.
(IBT_ENABLED): Likewise.
(SHSTK_ENABLED): Likewise.
The pad array in struct pthread_unwind_buf is used by setjmp to save
shadow stack register. We assert that size of struct pthread_unwind_buf
is no less than offset of shadow stack pointer + shadow stack pointer
size.
Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
these with thread cancellation, call setjmp, but never return after
__libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
__libc_longjmp on x86, doesn't need to restore shadow stack register.
__libc_longjmp, which is a private interface for thread cancellation
implementation in libpthread, is changed to call __longjmp_cancel,
instead of __longjmp. __longjmp_cancel is a new internal function
in libc, which is similar to __longjmp, but doesn't restore shadow
stack register.
The compatibility longjmp and siglongjmp in libpthread.so are changed
to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
restore shadow stack register.
Tested with build-many-glibcs.py.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
handlers after setjmp.
* setjmp/longjmp.c (__libc_longjmp): Don't define alias if
defined.
* sysdeps/unix/sysv/linux/x86/setjmpP.h: Include
<libc-pointer-arith.h>.
(_JUMP_BUF_SIGSET_BITS_PER_WORD): New.
(_JUMP_BUF_SIGSET_NSIG): Changed to 96.
(_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and
_JUMP_BUF_SIGSET_BITS_PER_WORD.
* sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
* sysdeps/x86/__longjmp_cancel.S: New file.
* sysdeps/x86/longjmp.c: Likewise.
* sysdeps/x86/nptl/pt-longjmp.c: Likewise.
This patch adds two new internal defines to set the internal
pthread_mutex_t layout required by the supported ABIS:
1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND which control whether to define
__nusers fields before or after __kind. The preferred value for
is 0 for new ports and it sets __nusers before __kind.
2. __PTHREAD_MUTEX_USE_UNION which control whether internal __spins and
__list members will be place inside an union for linuxthreads
compatibility. The preferred value is 0 for ports and it sets
to not use an union to define both fields.
It fixes the wrong offsets value for __kind value on x86_64-linux-gnu-x32.
Checked with a make check run-built-tests=no on all afected ABIs.
[BZ #22298]
* nptl/allocatestack.c (allocate_stack): Check if
__PTHREAD_MUTEX_HAVE_PREV is non-zero, instead if
__PTHREAD_MUTEX_HAVE_PREV is defined.
* nptl/descr.h (pthread): Likewise.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal):
Likewise.
* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise.
* sysdeps/nptl/bits/thread-shared-types.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New
defines.
(__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead
of __WORDSIZE for internal layout.
(__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead
of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION
instead of __WORDSIZE whether to use an union for __spins and __list
fields.
(__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION
case.
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New
defines.
* sysdeps/alpha/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/hppa/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/ia64/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/s390/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/sparc/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/tile/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/x86/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch fixes ia64 failures on thread exit by madvise the required
area taking in consideration its disjoing stacks
(NEED_SEPARATE_REGISTER_STACK). Also the snippet that setup the
madvise call to advertise kernel the area won't be used anymore in
near future is reallocated in allocatestack.c (for consistency to
put all stack management function in one place).
Checked on x86_64-linux-gnu and i686-linux-gnu for sanity (since
it is not expected code changes for architecture that do not
define NEED_SEPARATE_REGISTER_STACK) and also got a report that
it fixes ia64-linux-gnu failures from Sergei Trofimovich
<slyfox@gentoo.org>.
[BZ #21672]
* nptl/allocatestack.c [_STACK_GROWS_DOWN] (setup_stack_prot):
Set to use !NEED_SEPARATE_REGISTER_STACK as well.
(advise_stack_range): New function.
* nptl/pthread_create.c (START_THREAD_DEFN): Move logic to mark
stack non required to advise_stack_range at allocatestack.c
Locking overhead can be significant in some stdio operations
that are common in single threaded applications.
This patch adds the _IO_FLAGS2_NEED_LOCK flag to indicate if
an _IO_FILE object needs to be locked and some of the stdio
functions just jump to their _unlocked variant when not. The
flag is set on all _IO_FILE objects when the first thread is
created. A new GLIBC_PRIVATE libc symbol, _IO_enable_locks,
was added to do this from libpthread.
The optimization can be applied to more stdio functions,
currently it is only applied to single flag check or single
non-wide-char standard operations. The flag should probably
be never set for files with _IO_USER_LOCK, but that's just a
further optimization, not a correctness requirement.
The optimization is valid in a single thread because stdio
operations are non-as-safe (so lock state is not observable
from a signal handler) and stdio locks are recursive (so lock
state is not observable via deadlock). The optimization is not
valid if a thread may be created while an stdio lock is taken
and thus it should be disabled if any user code may run during
an stdio operation (interposed malloc, printf hooks, etc).
This makes the optimization more complicated for some stdio
operations (e.g. printf), but those are bigger and thus less
important to optimize so this patch does not try to do that.
* libio/libio.h (_IO_FLAGS2_NEED_LOCK, _IO_need_lock): Define.
* libio/libioP.h (_IO_enable_locks): Declare.
* libio/Versions (_IO_enable_locks): New symbol.
* libio/genops.c (_IO_enable_locks): Define.
(_IO_old_init): Initialize flags2.
* libio/feof.c.c (_IO_feof): Avoid locking when not needed.
* libio/ferror.c (_IO_ferror): Likewise.
* libio/fputc.c (fputc): Likewise.
* libio/putc.c (_IO_putc): Likewise.
* libio/getc.c (_IO_getc): Likewise.
* libio/getchar.c (getchar): Likewise.
* libio/ioungetc.c (_IO_ungetc): Likewise.
* nptl/pthread_create.c (__pthread_create_2_1): Enable stdio locks.
* libio/iofopncook.c (_IO_fopencookie): Enable locking for the file.
* sysdeps/pthread/flockfile.c (__flockfile): Likewise.
This patch removes CALL_THREAD_FCT macro usage and its defition for
x86. For 32 bits it usage is only for force 16 stack alignment,
however stack is already explicit aligned in clone syscall. For
64 bits and x32 it just a function call and there is no need to
code it with inline assembly.
Checked on i686-linux-gnu, x86_64-linux-gnu, and x86_64-linux-gnu-x32.
* nptl/pthread_create.c (START_THREAD_DEFN): Remove
CALL_THREAD_FCT macro usage.
* sysdeps/i386/nptl/tls.h (CALL_THREAD_FCT): Remove definition.
* sysdeps/x86_64/nptl/tls.h (CALL_THREAD_FCT): Likewise.
* sysdeps/x86_64/32/nptl/tls.h: Remove file.