Add LLL_MUTEX_READ_LOCK [BZ #28537]

The CAS instruction is expensive.  From the x86 CPU's point of view,
getting a cache line for writing is more expensive than reading.  See
Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

A full compare-and-swap grabs the cache line in exclusive state and
causes excessive cache line bouncing.

Add LLL_MUTEX_READ_LOCK to do an atomic load first and skip the CAS in
the spinlock loop when the compare would fail, reducing cache line
bouncing on contended locks.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
H.J. Lu 2021-11-02 18:33:07 -07:00
parent 49302b8fdf
commit d672a98a1a


@@ -64,6 +64,11 @@ lll_mutex_lock_optimized (pthread_mutex_t *mutex)
 # define PTHREAD_MUTEX_VERSIONS 1
 #endif
+#ifndef LLL_MUTEX_READ_LOCK
+# define LLL_MUTEX_READ_LOCK(mutex) \
+  atomic_load_relaxed (&(mutex)->__data.__lock)
+#endif
 static int __pthread_mutex_lock_full (pthread_mutex_t *mutex)
      __attribute_noinline__;
@@ -141,6 +146,8 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
 		  break;
 		}
 	      atomic_spin_nop ();
+	      if (LLL_MUTEX_READ_LOCK (mutex) != 0)
+		continue;
 	    }
 	  while (LLL_MUTEX_TRYLOCK (mutex) != 0);