glibc/sysdeps/powerpc/sysdep.h
Adhemerval Zanella f0458cf4f9 powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC
Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
instead it suspend and resume it when leaving the kernel.  The
side-effects of the syscall will always remain visible, even if the
transaction is aborted.  This is an issue when transaction is used along
with futex syscall, on pthread_cond_wait for instance, where the futex
call might succeed but the transaction is rolled back leading the
pthread_cond object in an inconsistent state.

Glibc used to prevent it by always aborting a transaction before issuing
a syscall.  Linux 4.2 also decided to abort active transaction in
syscalls which makes the glibc workaround superfluous.  Worse, glibc
transaction abortion leads to a performance issue on recent kernels
where the HTM state is saved/restore lazily (v4.9).  By aborting a
transaction on every syscalls, regardless whether a transaction has being
initiated before, GLIBS makes the kernel always save/restore HTM state
(it can not even lazily disable it after a certain number of syscall
iterations).

Because of this shortcoming, Transactional Lock Elision is just enabled
when it has been explicitly set (either by tunables of by a configure
switch) and if kernel aborts HTM transactions on syscalls
(PPC_FEATURE2_HTM_NOSC).  It is reported that using simple benchmark [1],
the context-switch is about 5% faster by not issuing a tabort in every
syscall in newer kernels.

Checked on powerpc64le-linux-gnu with 4.4.0 kernel (Ubuntu 16.04).

	* NEWS: Add note about new TLE support on powerpc64le.
	* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
	* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
	__ununsed1.
	(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
	(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
	* sysdeps/powerpc/powerpc32/sysdep.h,
	sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
	ABORT_TRANSACTION): Remove macros.
	* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
	__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
	sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
	sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
	usage.
	* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.

Reported-by: Breno Leitão <leitao@debian.org>
2018-09-21 10:18:03 -07:00

168 lines
3.4 KiB
C

/* Copyright (C) 1999-2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
/*
* Powerpc Feature masks for the Aux Vector Hardware Capabilities (AT_HWCAP).
* This entry is copied to _dl_hwcap or rtld_global._dl_hwcap during startup.
*/
#define _SYSDEPS_SYSDEP_H 1
#include <bits/hwcap.h>
#define PPC_FEATURE_970 (PPC_FEATURE_POWER4 + PPC_FEATURE_HAS_ALTIVEC)
#ifdef __ASSEMBLER__
/* Symbolic names for the registers. The only portable way to write asm
code is to use number but this produces really unreadable code.
Therefore these symbolic names. */
/* Integer registers. */
#define r0 0
#define r1 1
#define r2 2
#define r3 3
#define r4 4
#define r5 5
#define r6 6
#define r7 7
#define r8 8
#define r9 9
#define r10 10
#define r11 11
#define r12 12
#define r13 13
#define r14 14
#define r15 15
#define r16 16
#define r17 17
#define r18 18
#define r19 19
#define r20 20
#define r21 21
#define r22 22
#define r23 23
#define r24 24
#define r25 25
#define r26 26
#define r27 27
#define r28 28
#define r29 29
#define r30 30
#define r31 31
/* Floating-point registers. */
#define fp0 0
#define fp1 1
#define fp2 2
#define fp3 3
#define fp4 4
#define fp5 5
#define fp6 6
#define fp7 7
#define fp8 8
#define fp9 9
#define fp10 10
#define fp11 11
#define fp12 12
#define fp13 13
#define fp14 14
#define fp15 15
#define fp16 16
#define fp17 17
#define fp18 18
#define fp19 19
#define fp20 20
#define fp21 21
#define fp22 22
#define fp23 23
#define fp24 24
#define fp25 25
#define fp26 26
#define fp27 27
#define fp28 28
#define fp29 29
#define fp30 30
#define fp31 31
/* Condition code registers. */
#define cr0 0
#define cr1 1
#define cr2 2
#define cr3 3
#define cr4 4
#define cr5 5
#define cr6 6
#define cr7 7
/* Vector registers. */
#define v0 0
#define v1 1
#define v2 2
#define v3 3
#define v4 4
#define v5 5
#define v6 6
#define v7 7
#define v8 8
#define v9 9
#define v10 10
#define v11 11
#define v12 12
#define v13 13
#define v14 14
#define v15 15
#define v16 16
#define v17 17
#define v18 18
#define v19 19
#define v20 20
#define v21 21
#define v22 22
#define v23 23
#define v24 24
#define v25 25
#define v26 26
#define v27 27
#define v28 28
#define v29 29
#define v30 30
#define v31 31
#define VRSAVE 256
/* The 32-bit words of a 64-bit dword are at these offsets in memory. */
#if defined __LITTLE_ENDIAN__ || defined _LITTLE_ENDIAN
# define LOWORD 0
# define HIWORD 4
#else
# define LOWORD 4
# define HIWORD 0
#endif
/* The high 16-bit word of a 64-bit dword is at this offset in memory. */
#if defined __LITTLE_ENDIAN__ || defined _LITTLE_ENDIAN
# define HISHORT 6
#else
# define HISHORT 0
#endif
/* This seems to always be the case on PPC. */
#define ALIGNARG(log2) log2
#define ASM_SIZE_DIRECTIVE(name) .size name,.-name
#endif /* __ASSEMBLER__ */