The elision interfaces are closely aligned between the targets that
implement them, so declare them in the generic <lowlevellock.h>
file.
Empty .c stubs are provided, so that fewer makefile updates
under sysdeps are needed. Also simplify initialization via
__libc_early_init.
The symbols __lll_clocklock_elision, __lll_lock_elision,
__lll_trylock_elision, __lll_unlock_elision, __pthread_force_elision
move into libc. For the time being, non-hidden references are used
from libpthread to access them, but once that part of libpthread
is moved into libc, hidden symbols will be used again. (Hidden
references seem desirable to reduce the likelihood of transactions
aborts.)
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
This patch adjusts s390 specific lock elision code after review
of the following patches:
-S390: Use own tbegin macro instead of __builtin_tbegin.
(8bfc4a2ab4)
-S390: Use new __libc_tbegin_retry macro in elision-lock.c.
(53c5c3d5ac)
-S390: Optimize lock-elision by decrementing adapt_count at unlock.
(dd037fb3df)
The futex value is not tested before starting a transaction,
__glibc_likely is used instead of __builtin_expect and comments
are adjusted.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/htm.h: Adjust comments.
* sysdeps/unix/sysv/linux/s390/elision-unlock.c: Likewise.
* sysdeps/unix/sysv/linux/s390/elision-lock.c: Adjust comments.
(__lll_lock_elision): Do not test futex before starting a
transaction. Use __glibc_likely instead of __builtin_expect.
* sysdeps/unix/sysv/linux/s390/elision-trylock.c: Adjust comments.
(__lll_trylock_elision): Do not test futex before starting a
transaction. Use __glibc_likely instead of __builtin_expect.
This patch decrements the adapt_count while unlocking the futex
instead of before aquiring the futex as it is done on power, too.
Furthermore a transaction is only started if the futex is currently free.
This check is done after starting the transaction, too.
If the futex is not free and the transaction nesting depth is one,
we can simply end the started transaction instead of aborting it.
The implementation of this check was faulty as it always ended the
started transaction. By using the fallback path, the the outermost
transaction was aborted. Now the outermost transaction is aborted
directly.
This patch also adds some commentary and aligns the code in
elision-trylock.c to the code in elision-lock.c as possible.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/lowlevellock.h
(__lll_unlock_elision, lll_unlock_elision): Add adapt_count argument.
* sysdeps/unix/sysv/linux/s390/elision-lock.c:
(__lll_lock_elision): Decrement adapt_count while unlocking
instead of before locking.
* sysdeps/unix/sysv/linux/s390/elision-trylock.c
(__lll_trylock_elision): Likewise.
* sysdeps/unix/sysv/linux/s390/elision-unlock.c:
(__lll_unlock_elision): Likewise.
This patch defines __libc_tbegin, __libc_tend, __libc_tabort and
__libc_tx_nesting_depth in htm.h which replaces the direct usage of
equivalent gcc builtins.
We have to use an own inline assembly instead of __builtin_tbegin,
as tbegin has to filter program interruptions which can't be done with
the builtin. Before this change, e.g. a segmentation fault within a
transaction, leads to a coredump where the instruction pointer points
behind the tbegin instruction instead of real failing one.
Now the transaction aborts and the code should be reexecuted by the
fallback path without transactions. The segmentation fault will
produce a coredump with the real failing instruction.
The fpc is not saved before starting the transaction. If e.g. the
rounging mode is changed and the transaction is aborting afterwards,
the builtin will not restore the fpc. This is now done with the
__libc_tbegin macro.
Now the call saved fprs have to be saved / restored in the
__libc_tbegin macro. Using the gcc builtin had forced the saving /
restoring of fprs at begin / end of e.g. __lll_lock_elision function.
The new macro saves these fprs before tbegin instruction and only
restores them on a transaction abort. Restoring is not needed on
a successfully started transaction.
The used inline assembly does not clobber the fprs / vrs!
Clobbering the latter ones would force the compiler to save / restore
the call saved fprs as those overlap with the vrs, but they only
need to be restored if the transaction fails. Thus the user of the
tbegin macros has to compile the file / function with -msoft-float.
It prevents gcc from using fprs / vrs.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/Makefile (elision-CFLAGS):
Add -msoft-float.
* sysdeps/unix/sysv/linux/s390/htm.h: New File.
* sysdeps/unix/sysv/linux/s390/elision-lock.c:
Use __libc_t* transaction macros instead of __builtin_t*.
* sysdeps/unix/sysv/linux/s390/elision-trylock.c: Likewise.
* sysdeps/unix/sysv/linux/s390/elision-unlock.c: Likewise.
This uses atomic operations to access lock elision metadata that is accessed
concurrently (ie, adapt_count fields). The size of the data is less than a
word but accessed only with atomic loads and stores.
See also x86 commit ca6e601a9d:
"Use C11-like atomics instead of plain memory accesses in x86 lock elision."
ChangeLog:
* sysdeps/unix/sysv/linux/s390/elision-lock.c
(__lll_lock_elision): Use atomics to load / store adapt_count.
* sysdeps/unix/sysv/linux/s390/elision-trylock.c
(__lll_trylock_elision): Likewise.