Commit Graph

26172 Commits

Author SHA1 Message Date
Andi Kleen
1cdbe57948 Add the low level infrastructure for pthreads lock elision with TSX
Lock elision using TSX is a technique to optimize lock scaling.
It allows locked regions to run in parallel using hardware support for
a transactional execution mode in 4th generation Intel Core CPUs.
See http://www.intel.com/software/tsx for more information.

This patch implements a simple adaptive lock elision algorithm based
on RTM. It enables elision for the pthread mutexes and rwlocks.
The algorithm keeps track of whether a mutex elides successfully,
and stops eliding for some time when it does not.

When the CPU supports RTM, the elision path is tried automatically;
otherwise elision is disabled.

The adaptation algorithm and its tuning are currently preliminary.
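
The adaptive behaviour described above can be sketched roughly as follows.
This is a minimal illustration, not the code from the patch: the names
elide_state, SKIP_AFTER_ABORT, try_elide_fn and adaptive_lock are all
hypothetical, the skip count is an arbitrary value, and the hardware
transaction is replaced by a caller-supplied callback so the logic can be
exercised on a machine without RTM.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mutex state: how many upcoming lock attempts
   should skip elision.  */
struct elide_state
{
  int skip_count;
};

/* Hypothetical tuning value, not taken from the patch.  */
enum { SKIP_AFTER_ABORT = 3 };

/* Caller-supplied stand-in for starting a hardware transaction.
   Returns true if the critical section can run elided.  */
typedef bool (*try_elide_fn) (void *arg);

/* Returns true if the lock was elided, false if the caller has to
   fall back to taking the real lock.  */
static bool
adaptive_lock (struct elide_state *st, try_elide_fn try_elide, void *arg)
{
  if (st->skip_count > 0)
    {
      /* Elision failed recently: do not try it on this attempt.  */
      st->skip_count--;
      return false;
    }

  if (try_elide (arg))
    return true;

  /* Elision did not succeed: stop eliding for a while.  */
  st->skip_count = SKIP_AFTER_ABORT;
  return false;
}

/* Stand-ins used only to exercise the logic above.  */
static bool
elide_always_succeeds (void *arg)
{
  (void) arg;
  return true;
}

static bool
elide_always_aborts (void *arg)
{
  (void) arg;
  return false;
}
```

With these stand-ins, one failed attempt makes the next SKIP_AFTER_ABORT
calls go straight to the fallback path, after which elision is tried again.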

The code adds some checks to the lock fast paths. Micro-benchmarks
show little to no difference without RTM.

This patch implements the low level "lll_" code for lock elision.
Follow-on patches hook this into the pthread implementation.

Changes with the RTM mutexes:
-----------------------------
Lock elision in pthreads is generally compatible with existing programs.
There are some obscure exceptions, which are expected to be uncommon.
See the manual for more details.

- A broken program that unlocks a free lock will crash.
  There are ways around this with some tradeoffs (more code in hot paths).
  I'm still undecided on what approach to take here; have to wait for testing reports.
- pthread_mutex_destroy of a locked mutex will not return EBUSY but 0.
- There's also a similar situation with trylock outside the mutex,
  "knowing" that the mutex must be held due to some other condition.
  In this case an assert failure cannot be recovered. This situation is
  usually an existing bug in the program.
- The same applies to the rwlocks. Some of the return values change
  (for example there is no EDEADLK for an elided lock, unless it aborts.
   However, when elided it will also never deadlock, of course).
- Timing changes, so broken programs that make assumptions about specific timing
  may expose already existing latent problems.  Note that these broken programs will
  break in other situations too (loaded system, new faster hardware, compiler
  optimizations etc.)
- Programs with non-recursive mutexes that take them recursively in a thread and
  which would always deadlock without elision may not always see a deadlock.
  The deadlock will only happen on an early or delayed abort (which typically
  happens at some point).
  This only happens for mutexes not explicitly set to PTHREAD_MUTEX_NORMAL
  or PTHREAD_MUTEX_ADAPTIVE_NP.  PTHREAD_MUTEX_NORMAL mutexes do not elide.

The elision default can be set at configure time.

This patch implements the basic infrastructure for elision.
2013-07-02 08:46:54 -07:00
H.J. Lu
1c81621c5b Enable static 32-bit SSE4.2 strcasecmp/strncasecmp 2013-07-02 08:06:04 -07:00
Joseph Myers
77f01ab5d1 Implement fma in soft-fp. 2013-07-02 14:55:32 +00:00
Will Newton
1413c693d3 ARM: Pass dl_hwcap to IFUNC resolver functions. 2013-07-02 13:01:21 +00:00
Joseph Myers
c53e2f0a56 Support no-FPU ColdFire in sysdeps/m68k/dl-trampoline.S and refactor code. 2013-06-30 21:36:59 +00:00
Chris Metcalf
8145005c31 tile: switch to using <fenv.h> fallback functions
Now that the fallback functions match the desired semantics for tile
functions, just switch to using them.
2013-06-30 11:50:43 -04:00
Joseph Myers
e7521973aa Add more NEWS items for 2.18. 2013-06-28 22:53:57 +00:00
Liubov Dmitrieva
6308fd9a46 Skip SSE4.2 versions on Intel Silvermont
SSE2/SSSE3 versions are faster than SSE4.2 versions on Intel Silvermont.
2013-06-28 15:31:40 -07:00
Ryan S. Arnold
89cd956937 PowerPC: Define AT_HWCAP2 bits and AT_HWCAP2 handling for POWER8. 2013-06-28 16:52:49 -05:00
Ryan S. Arnold
1ae8bfe07c Add GLRO(dl_hwcap2) for new AT_HWCAP2 auxv_t a_type. 2013-06-28 16:50:48 -05:00
Joseph Myers
8fbec01098 Consistently use page_shift in sysdeps/unix/sysv/linux/mmap64.c. 2013-06-28 21:45:11 +00:00
Pierre Ynard
0432680e8c Test for mprotect failure in dl-load.c (bug 12492). 2013-06-28 21:43:42 +00:00
Nathan Froyd
ce61a2ad2e Mark packed structure element used with atomic operation aligned. 2013-06-28 21:42:19 +00:00
Joseph Myers
ef65da39e6 Fix sysdeps/m68k/fpu_control.h preprocessor indentation. 2013-06-28 20:30:43 +00:00
Nathan Sidwell
0cad7ea248 Support no-FPU ColdFire in sysdeps/m68k/fpu_control.h. 2013-06-28 20:28:25 +00:00
Maciej W. Rozycki
3d0f5d0c7a Add a dlopen/getpagesize static executable test. 2013-06-28 17:43:07 +01:00
Maciej W. Rozycki
f91f1c0fb8 [BZ #15022] Correct global-scope dlopen issues in static executables.
This change creates a link map in static executables to serve as the
global search list for dlopen.  It fixes a problem with the inability
to access the global symbol object and a crash on an attempt to map a
DSO into the global scope.  Some code that has become dead after the
addition of this link map is removed too and test cases are provided.
2013-06-28 16:22:20 +01:00
Marcus Shawcroft
ed0257f7d3 [AArch64] Adjust elf_machine_dynamic to find _DYNAMIC via _GLOBAL_OFFSET_TABLE_ 2013-06-28 11:27:26 +01:00
Marcus Shawcroft
03ea4d9b69 [AArch64] Simplify getcontext pstate initialization. 2013-06-28 11:23:58 +01:00
Maciej W. Rozycki
fe114d2064 _dl_static_init: Remove nested locking.
This function is now called from dl_open_worker with the GL(dl_load_lock)
lock held and no longer needs local protection.  GL(dl_load_lock) also
correctly protects _dl_lookup_symbol_x called here that relies on the
caller to have serialized access to the data structures it uses.
2013-06-27 11:49:44 +01:00
Joseph Myers
cbe7d24bb4 Require GCC 4.4 or later to build glibc. 2013-06-26 23:10:48 +00:00
H.J. Lu
bb5bb87cd2 Add a test for BZ #15674 2013-06-26 15:23:08 -07:00
H.J. Lu
fc74328c1f Mention BZ #15674 2013-06-26 12:31:51 -07:00
Liubov Dmitrieva
11b8a0e1d7 Fix buffers overrun in x86_64 memcmp-ssse3.S 2013-06-26 12:31:51 -07:00
Maciej W. Rozycki
b003710377 [BZ #15022] Avoid repeated calls to DL_STATIC_INIT for the same module. 2013-06-26 19:14:29 +01:00
Ryan S. Arnold
c18c701d03 Add AT_HWCAP2 as a new auxv_t a_type to elf.h. 2013-06-26 08:50:20 -05:00
Mike Frysinger
89756a8cdb drop NEWS mention
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-06-25 15:20:13 -04:00
Richard Henderson
1d17fa5f8e Fix missing libc-internal.h include.
* locale/programs/locarchive.c: Include <libc-internal.h>
2013-06-25 11:21:20 -07:00
Joseph Myers
8fcb833a2b Update texinfo.tex. 2013-06-25 17:21:48 +00:00
Andreas Schwab
5ccb431120 m68k: fix bad use of register alias in cfi insn 2013-06-25 19:03:46 +02:00
Richard Henderson
385fd0d524 [BZ #15666] alpha: Add __sqrt*_finite definitions
With compatibility for ev6 and non-ev6 builds, as the non-ev6 did
manage to get definitions emitted for the float and double functions.
2013-06-24 18:12:24 -07:00
Mike Frysinger
17db6e8d6b [BZ #10283] localedef: align fixed maps to SHMLBA
Many Linux arches require fixed mmaps to be aligned higher than pagesize,
so use the SHMLBA define as it represents this quantity exactly.

This fixes spurious errors seen on those arches like:
cannot map archive header: Invalid argument
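
As an illustration of the alignment requirement, here is a minimal sketch
of rounding a candidate address up to a multiple of SHMLBA before using it
for a fixed mapping. It is not the localedef code; the function name
align_up_to_shmlba is hypothetical, and only SHMLBA from <sys/shm.h> is
assumed to exist.

```c
#include <stdint.h>
#include <sys/shm.h>   /* SHMLBA */

/* Round ADDR up to the next multiple of SHMLBA.  Division is used
   instead of a bit mask so that nothing is assumed about SHMLBA
   being a power of two.  */
static uintptr_t
align_up_to_shmlba (uintptr_t addr)
{
  uintptr_t lba = (uintptr_t) SHMLBA;
  return (addr + lba - 1) / lba * lba;
}
```

On architectures where SHMLBA equals the page size this gives the same
result as page alignment; where it is larger, the stricter alignment is
applied.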

URL: http://sourceware.org/bugzilla/show_bug.cgi?id=10283
Reported-by: CHIKAMA Masaki <masaki.chikama@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-06-24 20:26:58 -04:00
Mike Frysinger
d605071ebf libc-internal.h: add ALIGN helper macros
Rather than open coding the masks, add helper macros to do the magic.
This makes code easier to read.
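
A rough sketch of what such helpers can look like, next to the open-coded
mask they replace. The macro names and definitions here are assumptions
made for illustration and may differ from what the commit adds; the size
argument must be a power of two.

```c
#include <stdint.h>

/* Assumed helper definitions, for illustration only.
   SIZE must be a power of two.  */
#define ALIGN_DOWN(base, size) ((base) & -((uintptr_t) (size)))
#define ALIGN_UP(base, size)   ALIGN_DOWN ((base) + (size) - 1, (size))

/* The kind of open-coded masking that the helpers are meant to
   replace.  */
static uintptr_t
align_up_open_coded (uintptr_t addr, uintptr_t size)
{
  return (addr + size - 1) & ~(size - 1);
}
```

For example, ALIGN_UP ((uintptr_t) 5000, 4096) and
align_up_open_coded (5000, 4096) both evaluate to 8192.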

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-06-24 20:25:15 -04:00
Vladimir Nikulichev
e1f0b2cfa1 BZ #12310: pthread_exit in static app. segfaults
Static applications that call pthread_exit on the main
thread segfault. This is because after a thread terminates
__libc_start_main decrements __nptl_nthreads which is only
defined in pthread_create. Therefore the right solution is
to add a requirement to pthread_create from pthread_exit.

~~~
nptl/

2013-06-24  Vladimir Nikulichev  <v.nikulichev@gmail.com>

	[BZ #12310]
	* pthread_exit.c: Add reference to pthread_create.
2013-06-24 17:12:30 -04:00
Ryan S. Arnold
2f063a6e84 PowerPC: Enable POWER8 platform sans hwcap bits. 2013-06-24 15:33:32 -05:00
Siddhesh Poyarekar
a74ca98fdd Regenerate INSTALL file 2013-06-24 21:46:42 +05:30
Siddhesh Poyarekar
a31ee4b3a5 Fix typo in comment 2013-06-24 18:07:37 +05:30
Richard Henderson
09d91fde6b alpha: Update libm-test-ulps 2013-06-23 11:05:56 -07:00
Joseph Myers
e781d7c58f Include <string.h> in nptl/pthread_setattr_default_np.c. 2013-06-22 19:32:50 +00:00
Joseph Myers
d8412221e6 Include <string.h> in sysdeps/unix/sysv/linux/libc_fatal.c. 2013-06-22 19:30:10 +00:00
Joseph Myers
695c378f81 Fix soft-fp shadowing between __FP_FRAC_ADD_3 and _FP_MUL_MEAT_2_wide_3mul (bug 15667). 2013-06-22 19:27:41 +00:00
Maciej W. Rozycki
d1d5471579 Remove dead DL_DST_REQ_STATIC code. 2013-06-22 00:39:42 +01:00
Kaz Kojima
638faeb6fe Add sh4 implementation of fegetexceptflag (bug 15655). 2013-06-22 07:46:45 +09:00
Joseph Myers
8fdda7afb8 Fix bad shift in soft-fp (bug 7006). 2013-06-21 19:00:43 +00:00
Maciej W. Rozycki
f3bc5e5a3e dlfcn/Makefile: Avoid repeated $(*-ENV) definitions. 2013-06-21 18:13:39 +01:00
Kaz Kojima
be09e8c9ec Add sh4 implementation of fegetexceptflag. 2013-06-21 18:07:31 +09:00
Adhemerval Zanella
85c2e6110c Fix loop construction to functions calls
Check whether the compiler has the option -fno-tree-loop-distribute-patterns
to inhibit loop transformation to library calls, and use it on the default
memset and memmove implementations to avoid recursive calls.
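
To show the problem being avoided: a plain byte-filling loop such as the
one below can be recognized by GCC and replaced with a call to memset,
which would recurse if the loop is itself the memset implementation. The
commit applies the compiler option when building the affected files; this
sketch uses a per-function GCC attribute instead, purely for illustration.
The names NO_LOOP_TO_LIBCALL and simple_memset are hypothetical.

```c
#include <stddef.h>

/* GCC-specific: ask the optimizer not to turn loops in the marked
   function into library calls.  For other compilers the macro
   expands to nothing in this sketch.  */
#if defined (__GNUC__) && !defined (__clang__)
# define NO_LOOP_TO_LIBCALL \
  __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define NO_LOOP_TO_LIBCALL
#endif

/* Hypothetical byte-at-a-time fill, standing in for a default
   memset implementation.  */
NO_LOOP_TO_LIBCALL void *
simple_memset (void *dst, int c, size_t n)
{
  unsigned char *p = dst;

  while (n-- > 0)
    *p++ = (unsigned char) c;
  return dst;
}
```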
2013-06-20 19:42:05 -05:00
Joseph Myers
b8c792af85 Allow fesetround failures in math/test-misc.c if ROUNDING_TESTS fails. 2013-06-20 19:11:34 +00:00
Joseph Myers
c91e082525 Avoid spurious failures from <fenv.h> fallback functions (bug 15654). 2013-06-20 19:10:44 +00:00
Roland McGrath
bfcacbdec0 Use rtld-CPPFLAGS in rtld-%.os rules for generated sources. 2013-06-18 16:29:25 -07:00