glibc/sysdeps/x86_64
Wangyang Guo ea69248445 nptl: Add backoff mechanism to spinlock loop
When multiple threads are waiting for a lock at the same time, once the
lock owner releases the lock, the waiters all see the lock become available
and all try to take it, which may cause an expensive CAS storm.

Binary exponential backoff with random jitter is introduced. As the number
of try-lock attempts increases, it becomes more likely that a larger number
of threads are competing for the adaptive mutex lock, so the wait time is
increased exponentially. A random jitter is also added to avoid synchronized
try-lock attempts from other threads.
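
The backoff arithmetic described above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the names `sketch_spin_count`, `sketch_next_backoff` and the cap `SKETCH_MAX_BACKOFF` are hypothetical, and the random bits are assumed to be supplied by the caller.

```c
#include <assert.h>

/* Hypothetical cap on the backoff value; the value is an assumption,
   not taken from the commit.  */
enum { SKETCH_MAX_BACKOFF = 64 };

/* Spin iterations for this round: the current backoff plus a jitter
   drawn from [0, backoff - 1].  BACKOFF must be a power of two so that
   the mask (backoff - 1) selects the low bits of RANDOM_BITS.  */
unsigned int
sketch_spin_count (unsigned int backoff, unsigned int random_bits)
{
  assert (backoff != 0 && (backoff & (backoff - 1)) == 0);
  return backoff + (random_bits & (backoff - 1));
}

/* Double the backoff for the next round until it reaches the cap.  */
unsigned int
sketch_next_backoff (unsigned int backoff)
{
  return backoff < SKETCH_MAX_BACKOFF ? backoff << 1 : backoff;
}
```

A waiter would start with a backoff of 1, spin for `sketch_spin_count` iterations, retry the lock, and call `sketch_next_backoff` after each failed attempt. Because each thread uses different random bits, waiters that started together tend to drift apart instead of retrying in lockstep.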

v2: Remove read-check before try-lock for performance.

v3:
1. Restore read-check since it works well on some platforms.
2. Make backoff arch dependent, and enable it for x86_64.
3. Limit max backoff to reduce latency in large critical sections.
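
The read-check mentioned in item 1 is commonly implemented as a plain load before the compare-and-swap, so that waiters do not issue a CAS while the lock is visibly held. The following is a sketch of that idea using C11 atomics, assuming a lock word where 0 means free and 1 means held. The type `sketch_lock` and both functions are hypothetical and are not glibc's internal macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock word: 0 = free, 1 = held.  */
typedef struct { atomic_int word; } sketch_lock;

/* Read the lock word first; attempt the CAS only if the lock looks
   free.  Returns true if the lock was acquired.  */
bool
sketch_trylock_with_read_check (sketch_lock *l)
{
  assert (l != NULL);
  if (atomic_load_explicit (&l->word, memory_order_relaxed) != 0)
    return false;		/* Looks held: skip the CAS.  */
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (&l->word, &expected, 1,
						  memory_order_acquire,
						  memory_order_relaxed);
}

/* Release the lock.  */
void
sketch_unlock (sketch_lock *l)
{
  assert (l != NULL);
  atomic_store_explicit (&l->word, 0, memory_order_release);
}
```

The load keeps the cache line in a shared state while the lock is held, whereas a failed CAS still requests exclusive ownership of the line. Whether the extra load pays off depends on the platform, which is consistent with the v2 and v3 notes above going back and forth on it.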

v4: Fix strict-prototypes error in sysdeps/nptl/pthread_mutex_backoff.h

v5: Commit log updated for regression in large critical section.

Result of pthread-mutex-locks bench

Test Platform: Xeon 8280L (2 socket, 112 CPUs in total)
First Row: number of threads
First Col: critical section length
Values: backoff vs upstream, time based, lower is better

non-critical-length: 1
	1	2	4	8	16	32	64	112	140
0	0.99	0.58	0.52	0.49	0.43	0.44	0.46	0.52	0.54
1	0.98	0.43	0.56	0.50	0.44	0.45	0.50	0.56	0.57
2	0.99	0.41	0.57	0.51	0.45	0.47	0.48	0.60	0.61
4	0.99	0.45	0.59	0.53	0.48	0.49	0.52	0.64	0.65
8	1.00	0.66	0.71	0.63	0.56	0.59	0.66	0.72	0.71
16	0.97	0.78	0.91	0.73	0.67	0.70	0.79	0.80	0.80
32	0.95	1.17	0.98	0.87	0.82	0.86	0.89	0.90	0.90
64	0.96	0.95	1.01	1.01	0.98	1.00	1.03	0.99	0.99
128	0.99	1.01	1.01	1.17	1.08	1.12	1.02	0.97	1.02

non-critical-length: 32
	1	2	4	8	16	32	64	112	140
0	1.03	0.97	0.75	0.65	0.58	0.58	0.56	0.70	0.70
1	0.94	0.95	0.76	0.65	0.58	0.58	0.61	0.71	0.72
2	0.97	0.96	0.77	0.66	0.58	0.59	0.62	0.74	0.74
4	0.99	0.96	0.78	0.66	0.60	0.61	0.66	0.76	0.77
8	0.99	0.99	0.84	0.70	0.64	0.66	0.71	0.80	0.80
16	0.98	0.97	0.95	0.76	0.70	0.73	0.81	0.85	0.84
32	1.04	1.12	1.04	0.89	0.82	0.86	0.93	0.91	0.91
64	0.99	1.15	1.07	1.00	0.99	1.01	1.05	0.99	0.99
128	1.00	1.21	1.20	1.22	1.25	1.31	1.12	1.10	0.99

non-critical-length: 128
	1	2	4	8	16	32	64	112	140
0	1.02	1.00	0.99	0.67	0.61	0.61	0.61	0.74	0.73
1	0.95	0.99	1.00	0.68	0.61	0.60	0.60	0.74	0.74
2	1.00	1.04	1.00	0.68	0.59	0.61	0.65	0.76	0.76
4	1.00	0.96	0.98	0.70	0.63	0.63	0.67	0.78	0.77
8	1.01	1.02	0.89	0.73	0.65	0.67	0.71	0.81	0.80
16	0.99	0.96	0.96	0.79	0.71	0.73	0.80	0.84	0.84
32	0.99	0.95	1.05	0.89	0.84	0.85	0.94	0.92	0.91
64	1.00	0.99	1.16	1.04	1.00	1.02	1.06	0.99	0.99
128	1.00	1.06	0.98	1.14	1.39	1.26	1.08	1.02	0.98

There is a regression for large critical sections. But adaptive mutex is
aimed at "quick" locks. Small critical sections are more common when
users choose to use an adaptive pthread_mutex.

Signed-off-by: Wangyang Guo <wangyang.guo@intel.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 8162147872)
2022-09-28 07:34:53 -07:00
64 Move architecture shlib-versions files to Linux-specific directories. 2014-07-17 14:31:12 +00:00
fpu x86-64: Optimize load of all bits set into ZMM register [BZ #28252] 2022-04-26 18:18:15 -07:00
multiarch x86: Add missing IS_IN (libc) check to strncmp-sse4_2.S 2022-07-18 20:45:21 -07:00
nptl nptl: Add backoff mechanism to spinlock loop 2022-09-28 07:34:53 -07:00
x32 mcheck: Align struct hdr to MALLOC_ALIGNMENT bytes [BZ #28068] 2021-07-12 18:13:32 -07:00
____longjmp_chk.S
__longjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
_mcount.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
abort-instr.h
add_n.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
addmul_1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
configure x86_64: Remove unneeded static PIE check for undefined weak diagnostic 2021-07-08 14:26:22 -07:00
configure.ac x86_64: Remove unneeded static PIE check for undefined weak diagnostic 2021-07-08 14:26:22 -07:00
crti.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
crtn.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-hwcaps-subdirs.c <sys/platform/x86.h>: Remove the C preprocessor magic 2021-01-21 05:58:17 -08:00
dl-irel.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-machine.h x86-64: Ignore r_addend for R_X86_64_GLOB_DAT/R_X86_64_JUMP_SLOT 2022-07-18 20:45:20 -07:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-runtime.h elf: Add _dl_audit_pltexit 2022-04-08 14:18:12 -04:00
dl-tls.c elf: Use relaxed atomics for racy accesses [BZ #19329] 2021-05-11 17:16:37 +01:00
dl-tls.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-tlsdesc.h x86_64: Remove lazy tlsdesc relocation related code 2021-04-15 09:47:47 +01:00
dl-tlsdesc.S x86_64: Remove lazy tlsdesc relocation related code 2021-04-15 09:47:47 +01:00
dl-trampoline.h elf: Add _dl_audit_pltexit 2022-04-08 14:18:12 -04:00
dl-trampoline.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ffs.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ffsll.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
htonl.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
Implies Remove dbl-64/wordsize-64 (part 2) 2021-01-07 15:26:26 +00:00
isa.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
l10nflist.c
link-defines.sym Replace __int128 with __int128_t 2014-05-30 10:50:21 -07:00
locale-defines.sym
localplt.data mtrace: Wean away from malloc hooks 2021-07-22 18:38:06 +05:30
lshift.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
machine-gmon.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
Makefile Add a generic malloc test for MALLOC_ALIGNMENT 2021-07-09 06:39:30 -07:00
memchr.S x86: Fix overflow bug with wmemchr-sse2 and wmemchr-avx2 [BZ #27974] 2021-06-23 14:13:03 -04:00
memcmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memmove_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove.S x86: Optimize memmove-vec-unaligned-erms.S 2022-04-26 18:18:16 -07:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mempcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memrchr.S x86: Optimize memrchr-sse2.S 2022-07-18 20:45:21 -07:00
memset_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset.S x86_64: Remove bzero optimization 2022-07-18 20:45:20 -07:00
memusage.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mp_clz_tab.c
mul_1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
preconfigure rename configure.in to configure.ac 2013-10-30 17:32:08 +10:00
preconfigure.ac rename configure.in to configure.ac 2013-10-30 17:32:08 +10:00
rawmemchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
rshift.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
rtld-offsets.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
setjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
stackguard-macros.h
stackinfo.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
start.S Reduce the statically linked startup code [BZ #23323] 2021-02-25 12:13:02 +01:00
stpcpy.S
strcasecmp_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasecmp_l.S
strcasecmp.S
strcat.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchrnul.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp.S x86: Optimize str{n}casecmp TOLOWER logic in strcmp.S 2022-05-16 18:54:17 -07:00
strcpy.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen.S x86-64: Move strlen.S to multiarch/strlen-vec.S 2021-06-23 10:24:35 -07:00
strncase_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncase_l.S
strncase.S
strncmp.S
strnlen.S
strrchr.S x86: Optimize {str|wcs}rchr-sse2 2022-05-16 18:55:37 -07:00
sub_n.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
submul_1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
sysdep.h x86: ZERO_UPPER_VEC_REGISTERS_RETURN_XTEST expect no transactions 2022-07-18 20:45:21 -07:00
tls_get_addr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tls-macros.h
tlsdesc.c elf: Remove lazy tlsdesc relocation related code 2021-04-21 14:35:53 +01:00
tlsdesc.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tst-audit3.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit4.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit5.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit6.c Modify several tests to use test-skeleton.c 2015-07-15 15:10:23 +05:30
tst-audit7.c
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit10.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-auditmod3a.c
tst-auditmod3b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod4a.c
tst-auditmod4b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod5a.c
tst-auditmod5b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6a.c
tst-auditmod6b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6c.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod7a.c
tst-auditmod7b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-auditmod10b.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx512-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx512.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx512mod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avxmod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-glibc-hwcaps.c <sys/platform/x86.h>: Remove the C preprocessor magic 2021-01-21 05:58:17 -08:00
tst-platform-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-platformmod-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-platformmod-2.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quad1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quad1pie.c
tst-quad2.c
tst-quad2pie.c
tst-quadmod1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quadmod1pie.S
tst-quadmod2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quadmod2pie.S
tst-rsi-strlen.c x86-64: Test strlen and wcslen with 0 in the RSI register [BZ #28064] 2021-07-08 18:55:40 -04:00
tst-rsi-wcslen.c x86-64: Test strlen and wcslen with 0 in the RSI register [BZ #28064] 2021-07-08 18:55:40 -04:00
tst-split-dynreloc.c Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-split-dynreloc.lds Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-sse.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-ssemod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-x86_64-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-x86_64mod-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-x86-64-tls-1.c x86_64: Correct THREAD_SETMEM/THREAD_SETMEM_NC for movq [BZ #27591] 2021-04-01 07:00:22 -07:00
Versions Move __fentry__ version definition to sysdeps/{i386,x86_64} 2018-08-10 09:07:44 +02:00
wcschr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcscmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcslen.S x86: Small improvements for wcslen 2022-05-16 18:55:09 -07:00
wcsrchr.S x86: Optimize {str|wcs}rchr-sse2 2022-05-16 18:55:37 -07:00
wmemset_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemset.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wordcopy.c X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00