The elision interfaces are closely aligned between the targets that
implement them, so declare them in the generic <lowlevellock.h>
file.
Empty .c stubs are provided, so that fewer makefile updates
under sysdeps are needed. Also simplify initialization via
__libc_early_init.
The symbols __lll_clocklock_elision, __lll_lock_elision,
__lll_trylock_elision, __lll_unlock_elision, __pthread_force_elision
move into libc. For the time being, non-hidden references are used
from libpthread to access them, but once that part of libpthread
is moved into libc, hidden symbols will be used again. (Hidden
references seem desirable to reduce the likelihood of transactions
aborts.)
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
GCC 9 dropped support for the SPE extensions to PowerPC, which means
powerpc*-*-*gnuspe* configurations are no longer buildable with that
compiler. This ISA extension was peculiar to the “e500” line of
embedded PowerPC chips, which, as far as I can tell, are no longer
being manufactured, so I think we should follow suit.
This patch was developed by grepping for “e500”, “__SPE__”, and
“__NO_FPRS__”, and may not eliminate every vestige of SPE support.
Most uses of __NO_FPRS__ are left alone, as they are relevant to
normal embedded PowerPC with soft-float.
* sysdeps/powerpc/preconfigure: Error out on powerpc-*-*gnuspe*
host type.
* scripts/build-many-glibcs.py: Remove powerpc-*-linux-gnuspe
and powerpc-*-linux-gnuspe-e500v1 from list of build configurations.
* sysdeps/powerpc/powerpc32/e500: Recursively delete.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/e500: Recursively delete.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/context-e500.h:
Delete.
* sysdeps/powerpc/fpu_control.h: Remove SPE variant.
Issue an #error if used with a compiler in SPE-float mode.
* sysdeps/powerpc/powerpc32/__longjmp_common.S
* sysdeps/powerpc/powerpc32/setjmp_common.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/getcontext-common.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/getcontext.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/setcontext.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/swapcontext.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S:
Remove code to preserve SPE register state.
* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
Remove __SPE__ ifndefs.
Some SPE opcodes clashes with some recent PowerISA opcodes and
until recently gas did not complain about it. However binutils
recently changed it and now VLE configured gas does not support to
assembler some instruction that might class with VLE (HTM for
instance). It also does not help that glibc build hardware lock
elision support as default (regardless of assembler support).
Although runtime will not actually enables TLE on SPE hardware
(since kernel will not advertise it), I see little advantage on
adding HTM support on SPE built glibc. SPE uses an incompatible
ABI which does not allow share the same build with default
powerpc and HTM code slows down SPE without any benefict.
This patch fixes it by only building HTM when SPE configuration
is not used.
Checked with a powerpc-linux-gnuspe build. I also did some sniff
tests on a e500 hardware without any issue.
[BZ #22926]
* sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION_IMPL): Define
empty for __SPE__.
* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-lock.c (__lll_lock_elision):
Do not build hardware transactional code for __SPE__.
* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
(__lll_trylock_elision): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
(__lll_unlock_elision): Likewise.
The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.
Work around a GCC behavior with hardware transactional memory built-ins.
GCC doesn't treat the PowerPC transactional built-ins as compiler
barriers, moving instructions past the transaction boundaries and
altering their atomicity.
With TLE enabled, the adapt count variable update incurs
an 8% overhead before entering the critical section of an
elided mutex.
Instead, if it is done right after leaving the critical
section, this serialization can be avoided.
This alters the existing behavior of __lll_trylock_elision
as it will only decrement the adapt_count if it successfully
acquires the lock.
* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
(__lll_lock_elision): Remove adapt_count decrement...
* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
(__lll_trylock_elision): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
(__lll_unlock_elision): ... to here. And utilize
new adapt_count parameter.
* sysdeps/unix/sysv/linux/powerpc/lowlevellock.h
(__lll_unlock_elision): Update to include adapt_count
parameter.
(lll_unlock_elision): Pass pointer to adapt_count
variable.
This patch adds support for lock elision using ISA 2.07 hardware
transactional memory instructions for pthread_mutex primitives.
Similar to s390 version, the for elision logic defined in
'force-elision.h' is only enabled if ENABLE_LOCK_ELISION is defined.
Also, the lock elision code should be able to be built even with
a compiler that does not provide HTM support with builtins.
However I have noted the performance is sub-optimal due scheduling
pressures.