Commit Graph

13 Commits

Author SHA1 Message Date
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Szabolcs Nagy
659ca26736 aarch64: optimize _dl_tlsdesc_dynamic fast path
Remove some load/store instructions from the dynamic tlsdesc resolver
fast path.  This gives around 20% faster tls access in dlopened shared
libraries (assuming glibc ran out of static tls space).

	* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Optimize.
2017-11-03 14:50:55 +00:00
Szabolcs Nagy
91c5a366d8 aarch64: Remove barriers from TLS descriptor functions
Remove ldar synchronization and most lazy TLSDESC initialization
related code.

	* sysdeps/aarch64/dl-machine.h (elf_machine_runtime_setup): Remove
	DT_TLSDESC_GOT initialization.
	* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Remove.
	(_dl_tlsdesc_resolve_rela): Likewise.
	(_dl_tlsdesc_resolve_hold): Likewise.
	(_dl_tlsdesc_undefweak): Remove ldar.
	(_dl_tlsdesc_dynamic): Likewise.
	* sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_return_lazy): Remove.
	(_dl_tlsdesc_resolve_rela): Likewise.
	(_dl_tlsdesc_resolve_hold): Likewise.
	* sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_resolve_rela_fixup): Remove.
	(_dl_tlsdesc_resolve_hold_fixup): Likewise.
	(_dl_tlsdesc_resolve_rela): Likewise.
	(_dl_tlsdesc_resolve_hold): Likewise.
2017-11-03 14:43:32 +00:00
Steve Ellcey
5a706f649d aarch64: Use PTR_REG macro to fix ILP32 bug and make code consistent
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_dynamic):
	Use PTR_REG macro in cmp instruction.
2017-08-22 16:22:05 -07:00
Szabolcs Nagy
e535139e82 [AArch64] Add more cfi annotations to tlsdesc entry points
Backtrace through _dl_tlsdesc_resolve_rela was broken because the offset
of x30 from cfa was not in the debug info.

Add enough annotation so backtracing from the dynamic linker through
tlsdesc entry points works and the debugger shows registers correctly.
2017-06-21 15:04:37 +01:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Florian Weimer
67aae64512 aarch64: Use explicit offsets in _dl_tlsdesc_dynamic
Commit 389d1f1b23 (“Partial ILP32
support for aarch64”) broke dynamic TLS support because a load
offset changed:

 0000000000000030 <_dl_tlsdesc_dynamic>:
   30:  a9bc7bfd        stp     x29, x30, [sp,#-64]!
   34:  910003fd        mov     x29, sp
   38:  a9020be1        stp     x1, x2, [sp,#32]
   3c:  a90313e3        stp     x3, x4, [sp,#48]
   40:  d53bd044        mrs     x4, tpidr_el0
   44:  c8dffc1f        ldar    xzr, [x0]
   48:  f9400401        ldr     x1, [x0,#8]
   4c:  f9400080        ldr     x0, [x4]
   50:  f9400823        ldr     x3, [x1,#16]
   54:  f9400002        ldr     x2, [x0]
   58:  eb02007f        cmp     x3, x2
   5c:  540001a8        b.hi    90 <_dl_tlsdesc_dynamic+0x60>
   60:  f9400022        ldr     x2, [x1]
   64:  8b021000        add     x0, x0, x2, lsl #4
   68:  f9400000        ldr     x0, [x0]
   6c:  b100041f        cmn     x0, #0x1
   70:  54000100        b.eq    90 <_dl_tlsdesc_dynamic+0x60>
-  74:  f9400421        ldr     x1, [x1,#8]
+  74:  f9400821        ldr     x1, [x1,#16]
   78:  8b010000        add     x0, x0, x1
…

This commit introduces explicit struct offsets, generated
from the C headers, fixing the regression.
2016-12-02 16:52:57 +01:00
Steve Ellcey
389d1f1b23 Partial ILP32 support for aarch64.
* sysdeps/aarch64/crti.S: Add include of sysdep.h.
	(call_weak_fn): Use PTR_REG to get correct reg name in ILP32.
	* sysdeps/aarch64/dl-irel.h: Add include of sysdep.h.
	(elf_irela): Use AARCH64_R macro to get correct relocation in ILP32.
	* sysdeps/aarch64/dl-machine.h: Add include of sysdep.h.
	(elf_machine_load_address, RTLD_START, RTLD_START_1, RTLD_START,
	elf_machine_type_class, ELF_MACHINE_JMP_SLOT, elf_machine_rela,
	elf_machine_lazy_rel): Add ifdef's for ILP32 support.
	* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return,
	_dl_tlsdesc_return_lazy, _dl_tlsdesc_dynamic,
	_dl_tlsdesc_resolve_hold): Extend pointers in ILP32, use PTR_REG
	to get correct reg name for ILP32.
	* sysdeps/aarch64/dl-trampoline.S (ip01): New Macro.
	(RELA_SIZE): New Macro.
	(_dl_runtime_resolve, _dl_runtime_profile): Use new macros and PTR_REG
	to support ILP32.
	* sysdeps/aarch64/jmpbuf-unwind.h (_JMPBUF_CFA_UNWINDS_ADJ): Add
	cast for ILP32 mode.
	* sysdeps/aarch64/memcmp.S (memcmp): Extend arg pointers for ILP32 mode.
	* sysdeps/aarch64/memcpy.S (memmove, memcpy): Ditto.
	* sysdeps/aarch64/memset.S (__memset): Ditto.
	* sysdeps/aarch64/strchr.S (strchr): Ditto.
	* sysdeps/aarch64/strchrnul.S (__strchrnul): Ditto.
	* sysdeps/aarch64/strcmp.S (strcmp): Ditto.
	* sysdeps/aarch64/strcpy.S (strcpy): Ditto.
	* sysdeps/aarch64/strlen.S (__strlen): Ditto.
	* sysdeps/aarch64/strncmp.S (strncmp): Ditto.
	* sysdeps/aarch64/strnlen.S (strnlen): Ditto.
	* sysdeps/aarch64/strrchr.S (strrchr): Ditto.
	* sysdeps/unix/sysv/linux/aarch64/clone.S: Ditto.
	* sysdeps/unix/sysv/linux/aarch64/setcontext.S (__setcontext): Ditto.
	* sysdeps/unix/sysv/linux/aarch64/swapcontext.S (__swapcontext): Ditto.
	* sysdeps/aarch64/__longjmp.S (__longjmp): Extend pointers in ILP32,
	change PTR_MANGLE call to use register numbers instead of names.
	* sysdeps/unix/sysv/linux/aarch64/getcontext.S (__getcontext): Ditto.
	* sysdeps/aarch64/setjmp.S (__sigsetjmp): Extend arg pointers for
	ILP32 mode, change PTR_MANGLE calls to use register numbers.
	* sysdeps/aarch64/start.S (_start): Ditto.
	* sysdeps/aarch64/nptl/bits/pthreadtypes.h
	(__PTHREAD_RWLOCK_INT_FLAGS_SHARED): New define.
	(__SIZEOF_PTHREAD_ATTR_T, __SIZEOF_PTHREAD_MUTEX_T,
	__SIZEOF_PTHREAD_MUTEXATTR_T, __SIZEOF_PTHREAD_COND_T,
	__SIZEOF_PTHREAD_COND_COMPAT_T, __SIZEOF_PTHREAD_CONDATTR_T,
	__SIZEOF_PTHREAD_RWLOCK_T, __SIZEOF_PTHREAD_RWLOCKATTR_T,
	__SIZEOF_PTHREAD_BARRIER_T, __SIZEOF_PTHREAD_BARRIERATTR_T):
	Make defined values dependent on __ILP32__.
	* sysdeps/aarch64/nptl/bits/semaphore.h (__SIZEOF_SEM_T): Change define.
	(sem_t): Change __align type.
	* sysdeps/aarch64/sysdep.h (AARCH64_R, PTR_REG, PTR_LOG_SIZE, DELOUSE,
	PTR_SIZE): New Macros.
	(LDST_PCREL, LDST_GLOBAL) Update to use PTR_REG.
	* sysdeps/unix/sysv/linux/aarch64/bits/fcntl.h (O_LARGEFILE):
	Set when in ILP32 mode.
	(F_GETLK64, F_SETLK64, F_SETLKW64): Only set in LP64 mode.
	* sysdeps/unix/sysv/linux/aarch64/dl-cache.h (DL_CACHE_DEFAULT_ID):
	Set elf flags for ILP32.
	(add_system_dir): Set ILP32 library directories.
	* sysdeps/unix/sysv/linux/aarch64/init-first.c
	(_libc_vdso_platform_setup): Set minimum kernel version for ILP32.
	* sysdeps/unix/sysv/linux/aarch64/ldconfig.h
	(SYSDEP_KNOWN_INTERPRETER_NAMES): Add ILP32 names.
	* sysdeps/unix/sysv/linux/aarch64/sigcontextinfo.h (GET_PC, SET_PC):
	New Macros.
	* sysdeps/unix/sysv/linux/aarch64/sysdep.h: Handle ILP32 pointers.
2016-11-28 09:01:23 -08:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Szabolcs Nagy
c71c89e5c7 [AArch64] Fix cfi_adjust_cfa_offset usage in dl-tlsdesc.S
Some of the cfi annotations used incorrect sign.

	* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Fix
	cfi_adjust_cfa_offset argument.
	(_dl_tlsdesc_undefweak, _dl_tlsdesc_dynamic): Likewise.
	(_dl_tlsdesc_resolve_rela, _dl_tlsdesc_resolve_hold): Likewise.
2015-06-17 12:44:53 +01:00
Szabolcs Nagy
08325735c2 [BZ 18034][AArch64] Lazy TLSDESC relocation data race fix
Lazy TLSDESC initialization needs to be synchronized with concurrent TLS
accesses.  The TLS descriptor contains a function pointer (entry) and an
argument that is accessed from the entry function.  With lazy initialization
the first call to the entry function updates the entry and the argument to
their final value.  A final entry function must make sure that it accesses an
initialized argument, this needs synchronization on systems with weak memory
ordering otherwise the writes of the first call can be observed out of order.

There are at least two issues with the current code:

tlsdesc.c (i386, x86_64, arm, aarch64) uses volatile memory accesses on the
write side (in the initial entry function) instead of C11 atomics.

And on systems with weak memory ordering (arm, aarch64) the read side
synchronization is missing from the final entry functions (dl-tlsdesc.S).

This patch only deals with aarch64.

* Write side:

Volatile accesses were replaced with C11 relaxed atomics, and a release
store was used for the initialization of entry so the read side can
synchronize with it.

* Read side:

TLS access generated by the compiler and an entry function code is roughly

  ldr x1, [x0]    // load the entry
  blr x1          // call it

entryfunc:
  ldr x0, [x0,#8] // load the arg
  ret

Various alternatives were considered to force the ordering in the entry
function between the two loads:

(1) barrier

entryfunc:
  dmb ishld
  ldr x0, [x0,#8]

(2) address dependency (if the address of the second load depends on the
result of the first one the ordering is guaranteed):

entryfunc:
  ldr x1,[x0]
  and x1,x1,#8
  orr x1,x1,#8
  ldr x0,[x0,x1]

(3) load-acquire (ARMv8 instruction that is ordered before subsequent
loads and stores)

entryfunc:
  ldar xzr,[x0]
  ldr x0,[x0,#8]

Option (1) is the simplest but slowest (note: this runs at every TLS
access), options (2) and (3) do one extra load from [x0] (same address
loads are ordered so it happens-after the load on the call site),
option (2) clobbers x1 which is problematic because existing gcc does
not expect that, so approach (3) was chosen.

A new _dl_tlsdesc_return_lazy entry function was introduced for lazily
relocated static TLS, so non-lazy static TLS can avoid the synchronization
cost.

	[BZ #18034]
	* sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_return_lazy): Declare.
	* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Define.
	(_dl_tlsdesc_undefweak): Guarantee TLSDESC entry and argument load-load
	ordering using ldar.
	(_dl_tlsdesc_dynamic): Likewise.
	(_dl_tlsdesc_return_lazy): Likewise.
	* sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_resolve_rela_fixup): Use
	relaxed atomics instead of volatile and synchronize with release store.
	(_dl_tlsdesc_resolve_hold_fixup): Use relaxed atomics instead of
	volatile.
	* elf/tlsdeschtab.h (_dl_tlsdesc_resolve_early_return_p): Likewise.
2015-06-17 12:41:01 +01:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Marcus Shawcroft
75eff3fe90 Relocate AArch64 from ports to libc.
This patch moves the AArch64 port to the main sysdeps hierarchy.  The
move is essentially:

  git mv ports/sysdeps/aarch64 sysdeps/aarch64
  git mv ports/sysdeps/unix/sysv/linux/aarch64 sysdeps/unix/sysv/linux/aarch64

The README is updated and I've updated ChangeLog.aarch64 along the
lines of the ARM move.  The AArch64 build has been tested to confirm
that there were no changes in objdump -dr output or the shared
objects.
2014-02-11 11:36:00 +00:00