glibc/sysdeps/unix/sysv/linux/x86/elision-timed.c

27 lines
1.1 KiB
C

Add the low-level infrastructure for pthreads lock elision with TSX

Lock elision using TSX is a technique to improve lock scaling: it lets critical sections run in parallel, using hardware support for transactional execution in 4th generation Intel Core CPUs. See http://www.intel.com/software/tsx for more information.

This patch implements a simple adaptive lock elision algorithm based on RTM and enables elision for the pthread mutexes and rwlocks. The algorithm tracks whether a mutex elides successfully and stops eliding for some time when it does not (a simplified, illustrative sketch of this scheme appears right after this commit message). When the CPU supports RTM the elision path is tried automatically; otherwise elision is disabled. The adaptation algorithm and its tuning are still preliminary. The code adds some checks to the lock fast paths; micro-benchmarks show little to no difference without RTM.

This patch implements the low-level "lll_" code for lock elision; follow-on patches hook it into the pthread implementation.

Changes with the RTM mutexes:
-----------------------------
Lock elision in pthreads is generally compatible with existing programs. There are some obscure exceptions, which are expected to be uncommon; see the manual for more details.

- A broken program that unlocks a free lock will crash. There are ways around this with some trade-offs (more code in the hot paths); I am still undecided on which approach to take and will wait for testing reports.
- pthread_mutex_destroy of a locked mutex returns 0 instead of EBUSY.
- A similar situation arises with trylock used outside the mutex, "knowing" that the mutex must be held due to some other condition; in that case an assertion failure cannot be recovered. This situation is usually an existing bug in the program.
- The same applies to the rwlocks. Some of the return values change (for example, there is no EDEADLK for an elided lock unless it aborts; but when elided it also never deadlocks).
- Timing changes, so broken programs that make assumptions about specific timing may expose already existing latent problems. Note that such programs break in other situations too (loaded system, new faster hardware, compiler optimizations, etc.).
- Programs with non-recursive mutexes that take them recursively in a thread, and which would always deadlock without elision, may not always see a deadlock. The deadlock only happens on an early or delayed abort (which typically happens at some point). This applies only to mutexes not explicitly set to PTHREAD_MUTEX_NORMAL or PTHREAD_MUTEX_ADAPTIVE_NP; PTHREAD_MUTEX_NORMAL mutexes do not elide. The elision default can be set at configure time.

This patch implements the basic infrastructure for elision.
2012-11-10 08:51:26 +00:00
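
The adaptive scheme described above can be illustrated with a short, self-contained sketch. This is not the glibc implementation: the names try_elide_lock, elide_unlock and SKIP_LOCK_BUSY, and the back-off policy, are illustrative assumptions; the real algorithm and its tuning live in elision-lock.c and elision-conf.c. The sketch needs a CPU with RTM and compilation with -mrtm.

/* Illustrative sketch of adaptive RTM lock elision (not glibc code).  */
#include <immintrin.h>

#define SKIP_LOCK_BUSY 3  /* Skip elision this many times after an abort.  */

/* Try to elide the lock protected by *FUTEX.  Returns 1 if the caller
   now runs inside a transaction, 0 if it must take the real lock.  */
static int
try_elide_lock (int *futex, short *adapt_count)
{
  if (*adapt_count > 0)
    {
      /* A recent attempt aborted; skip elision for a while.  The racy,
         non-atomic update is tolerated because this is only a heuristic.  */
      (*adapt_count)--;
      return 0;
    }

  unsigned int status = _xbegin ();
  if (status == _XBEGIN_STARTED)
    {
      if (*futex == 0)
        return 1;  /* Lock looks free: run the critical section elided.  */
      /* Lock is busy: abort the transaction and fall back.  */
      _xabort (0xff);
    }

  /* The transaction aborted or could not start: back off from eliding.  */
  *adapt_count = SKIP_LOCK_BUSY;
  return 0;
}

/* Matching unlock side: commit the transaction if the lock was elided,
   otherwise release the real lock through the normal futex path.  */
static void
elide_unlock (int *futex)
{
  if (*futex == 0)
    _xend ();
  /* else: normal unlock of *futex goes here.  */
}
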
/* elision-timed.c: Lock elision timed lock.
Copyright (C) 2013-2020 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */
#include <time.h>
#include <elision-conf.h>
#include "lowlevellock.h"
#define __lll_lock_elision __lll_clocklock_elision
#define EXTRAARG clockid_t clockid, const struct timespec *t,
#undef LLL_LOCK
#define LLL_LOCK(a, b) lll_clocklock (a, clockid, t, b)
#include "elision-lock.c"