Merge roland/nptl-hppa to master, update and test for hppa-linux-gnu.
This commit squashes and commits the work done by Roland McGrath on
roland/nptl-hppa to migrate hppa to the new non-addon NPTL. Some
additional tweaks were required for tcb-offsets.sym to work correctly
along with clone.S (unique to hppa).
Some types of relocations technically need to be signed rather than
unsigned: in particular ones that are used with moveli or movei,
or for jump and branch. This is almost never a problem. Jump and
branch opcodes are pretty much uniformly resolved by the static linker
(unless you omit -fpic for a shared library, which is not recommended).
The moveli and movei opcodes that need to be sign-extended generally
are for positive displacements, like the construction of the address of
main() from _start(). However, tst-pie1 ends up with main below _start
(in a different module) and the test failed due to signedness issues in
relocation handling.
This commit treats the value as signed when shifting (to preserve the
high bit) and also sign-extends the value generated from the updated
bundle when comparing with the desired bundle, which we do to make sure
no overflow occurred. As a result, the tst-pie1 test now passes.
generic HAVE_RM_CTX implementation which is used for ppc/e500 as well
has introduced calls to fegetenv which should be resolved internally
with in libm
Signed-off-by: Khem Raj <raj.khem@gmail.com>
* sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (fegetenv): Add
libm_hidden_ver.
If e.g. a signal is being received while we are running fork(), the signal
thread may be having our SS lock when we make the space copy, and thus in the
child we can not take the SS lock any more.
* sysdeps/mach/hurd/fork.c (__fork): Lock SS->lock around __proc_dostop call.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
TLS_INIT_TP in sysdeps/i386/nptl/tls.h uses some hand written asm to
generate a set_thread_area that might result in exchanging ebx and esp
around the syscall causing introspection tools like valgrind to loose
track of the user stack. Just use INTERNAL_SYSCALL which makes sure
esp isn't changed arbitrarily.
Before the patch the code would generate:
mov $0xf3,%eax
movl $0xfffff,0x8(%esp)
movl $0x51,0xc(%esp)
xchg %esp,%ebx
int $0x80
xchg %esp,%ebx
Using INTERNAL_SYSCALL instead will generate:
movl $0xfffff,0x8(%esp)
movl $0x51,0xc(%esp)
xchg %ecx,%ebx
mov $0xf3,%eax
int $0x80
xchg %ecx,%ebx
Thanks to Florian Weimer for analysing why the original code generated
the bogus esp usage:
_segdescr.desc happens to be at the top of the stack, so its address
is in %esp. The asm statement says that %3 is an input, so its value
will not change, and GCC can use %esp as the input register for the
expression &_segdescr.desc. But the constraints do not fully describe
the asm statement because the %3 register is actually modified, albeit
only temporarily.
[BZ #17319]
* sysdeps/i386/nptl/tls.h (TLS_INIT_TP): Use INTERNAL_SYSCALL
to call set_thread_area instead of hand written asm.
(__NR_set_thread_area): Removed define.
(TLS_FLAG_WRITABLE): Likewise.
(__ASSUME_SET_THREAD_AREA): Remove check.
(TLS_EBX_ARG): Remove define.
(TLS_LOAD_EBX): Likewise.
In my powerpc32 testing I've observed misc/test-gettimebasefreq
failing.
This is a glibc build (soft-float, though that's not relevant here)
without any --with-cpu and without any special configuration of the
default CPU for GCC either. In particular, it's one not using
sysdeps/powerpc/powerpc32/power4/hp-timing.h (although in fact the
processor I'm using for testing is POWER4-based), so hp_timing_t is
32-bit not 64-bit. But the VDSO call being used by
INTERNAL_VSYSCALL_NO_SYSCALL_FALLBACK is generating a 64-bit result
(high part in r3, low part in r4). The code extracting that result,
however, expects a result of the type hp_timing_t as passed to
INTERNAL_VSYSCALL_NO_SYSCALL_FALLBACK, meaning that only r3 (= 0) is
used and the value in r4 is ignored. This patch fixes this by always
using uint64_t as the type in INTERNAL_VSYSCALL_NO_SYSCALL_FALLBACK -
reflecting the actual ABI (unconditional in the kernel) of that VDSO
call. This is the minimal change for this issue - no check for
overflow, no change of the type of the timebase_freq variable or the
return type of __get_clockfreq to something other than hp_timing_t
(such a change would simply move the implicit conversions to the over
callers of that function), no change to hp_timing_t itself.
Tested for powerpc32 soft float.
[BZ #17263]
* sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c: Include
<stdint.h>.
(__get_clockfreq): Use uint64_t instead of hp_timing_t in
INTERNAL_VSYSCALL_NO_SYSCALL_FALLBACK call.
Since:
commit 409e00bd69
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Wed Jan 29 07:51:41 2014 -0800
Disable x87 inline functions for SSE2 math
When i386 and x86-64 mathinline.h was merged into a single mathinline.h,
"gcc -m32" enables x87 inline functions on x86-64 even when -mfpmath=sse
and SSE2 is enabled. It is a regression on x86-64. We should check
__SSE2_MATH__ instead of __x86_64__ when disabling x87 inline functions.
gcc-3.2 is unable to correctly compile x86_64 routines for llrint
since it gets redefined. This is because gcc 3.2 does not set
__SSE2_MATH__ for x86_64, thus exposing the duplicate definition.
The correct fix ought to be to check for both __SSE2_MATH__ and
__x86_64__ and enable those bits only when neither are defined.
Tested fix with the reproducer for
409e00bd69 as well as with gcc-3.2.
The compiler doesn't know that the cpuid asm statement in intel_check_word
will trash RBX. We are lucky that it doesn't cause any problems since
RBX is also used by compiler for other purposes so that RBX is saved and
restored. This patch replaces it with __cpuid_count.
[BZ #17259]
* sysdeps/x86_64/cacheinfo.c (intel_check_word): Replace cpuid
asm statement with __cpuid_count.
On powerpc, floating-point environment macros are defined as pointers
to constants in the library that contain the bit-patterns of the
desired environment, instead of being magic constants cast to pointer
type.
For soft-float, the bit-patterns used for fenv_t are not laid out the
same as for hard-float. (e500 has a third layout used; that's not an
ABI issue because these values are only meaningful within a single
process, all of whose glibc libraries must come from the same build of
glibc.) While the __fe_dfl_env value for soft-float was appropriate
for the soft-float fenv_t representation, the other two constants had
the same bit-patterns as for hard-float. Those bit patterns had the
effect of having exceptions already raised, causing
math/test-fenv-return to fail; this patch fixes the patterns used.
(__fe_nonieee_env also had exceptions unmasked, though they should be
masked to match hard-float semantics. Since there is no separate
non-IEEE mode for soft-float, it's most appropriate for
__fe_nonieee_env to be the same as __fe_dfl_env; this patch makes it
an alias.)
Tested for powerpc-nofpu.
[BZ #17261]
* sysdeps/powerpc/nofpu/fenv_const.c (__fe_enabled_env): Change
value to 0.
(__fe_nonieee_env): Define as an alias for __fe_dfl_env.
2014-08-12 Bernard Ogden <bernie.ogden@linaro.org>
[BZ #16892]
* sysdeps/nptl/lowlevellock.h (__lll_timedlock): Use
atomic_compare_and_exchange_bool_acq rather than atomic_exchange_acq.
further optimization. libc_feholdsetround_aarch64_ctx now only needs to
read the FPCR in the typical case, avoiding a redundant FPSR read.
Performance results show a good improvement (5-10% on sin()) on cores with
expensive FPCR/FPSR instructions.
This patch fixes the incorrect guard by __USE_MISC of struct winsize and
struct termio in powerpc termios header. Current states leads to build
failures if the program defines _XOPEN_SOURCE, but not _DEFAULT_SOURCE
or either _BSD_SOURCE or _SVID_SOURCE. Without any definition,
__USE_MISC will not be defined and neither the struct definitions.
This patch copies the default Linux ioctl-types.h by adjusting only the
character control field (c_cc) size in struct termio.
Use the SSI_IEEE_RAISE_EXCEPTION function as from feraiseexcept,
instead of __ieee_get+set_fp_status. Always raise the FP exceptions
from float-to-integer conversion.
Remove lowlevellock.h in favour of the generic implementation. The
generic implementation was tested natively and introduces no
regressions.
ChangeLog:
2014-08-04 Will Newton <will.newton@linaro.org>
* sysdeps/unix/sysv/linux/aarch64/lowlevellock.h: Remove
file.
The previous set of not-cancel.h headers (prior to the commit
2fbdf5339a) did not require the
arch to define nocancel entry points, so ia64 never did.
However, after the various files were merged, it became a hard
requirement for arches which mean ia64 failed to build.
Here we add dedicated entry points. It'd be nice to merge
with the existing stubs like other arches do, but the ia64
asm does not lend itself to interleaving of functions. If
someone has a suggestion on merging these, that'd be great,
but at least now we build & pass tests again.
This patch fixes the remaining ONE_DIRECTION warnings for s390 specific conversions.
It defines ONE_DIRECTION to 0 like the patch from Steve Ellcey:
https://www.sourceware.org/ml/libc-alpha/2014-05/msg00039.html
Changelog:
* sysdeps/s390/s390-64/utf16-utf32-z9.c
(ONE_DIRECTION): Define.
* sysdeps/s390/s390-64/utf8-utf16-z9.c
(ONE_DIRECTION): Define.
* sysdeps/s390/s390-64/utf8-utf32-z9.c
(ONE_DIRECTION): Define.
In this patch we take advantage of HSW memory bandwidth, manage to
reduce miss branch prediction by avoiding using branch instructions and
force destination to be aligned with avx instruction.
The CPU2006 403.gcc benchmark indicates this patch improves performance
from 2% to 10%.
Open file description locks have been merged into the Linux kernel for
v3.15. Add the appropriate command-value definitions and an update to
the manual that describes their usage.
This is a change to the dynamic linker to add prelinker support for the
R_ARM_TLS_DESC relocation. Two cases can be considered here, the usual
one where lazy binding is in use and the less frequent one, where
immediate binding is requested via the use of the DF_BIND_NOW dynamic
flag (e.g. by using the GNU linker's "-z now" option).
This change only handles the first case. In this scenario the prelinker
does what the dynamic linker would do, that is it preinitialises
R_ARM_TLS_DESC relocations with a pointer to the lazy specialization as
provided with the DT_TLSDESC_PLT dynamic tag. A conflict is
additionally created and in the conflict resolution path the dynamic
linker complements the work by initialising the object's pointer as
indicated by the DT_TLSDESC_GOT dynamic tag to the linker's internal
lazy specialization worker function and also providing the associated
link map in the second entry of the GOT. This step is required, because
if prelinking is successful at the run time, then the dynamic linker's
elf_machine_runtime_setup() function isn't called that would normally do
so.
The second case remains unresolved, because support for that scenario
has not been implemented in the prelinker. In this case the lazy
specialization is unavailable and the DT_TLSDESC_PLT dynamic tag is not
present.
The prelinker could assume the common case of static specialization and
resolve the relocation, but that would require the exposure of dynamic
linker's specialization worker function. Furthermore the dynamic linker
would have to handle the relocation in the conflict resolution path and
see if the dynamic specialization should be used instead. This however
would require access to data structures currently not made available to
the conflict resolution path and therefore a redesign of this part of
the dynamic linker.
Alternatively the prelinker could defer all processing to the dynamic
linker's conflict resolution path, but that would require similar access
to the said data structures.
Therefore the prelinker issues an error instead and the dynamic linker
has assertions to check that DT_TLSDESC_PLT and DT_TLSDESC_GOT are in
use in its conflict resolution path.
This change resolves all TLS failures in the prelinker testsuite, as
noted in the bug report, as well as the small test case provided there.
Unfortunately we don't seem to have any hooks to factor in the prelinker
(if present on a system) to testing, so at this time this fix has to
rely on using the prelinker test suite and enabling TLS descriptors
there for coverage.
[BZ #17078]
* sysdeps/arm/dl-machine.h (elf_machine_rela)
[RESOLVE_CONFLICT_FIND_MAP]: Handle R_ARM_TLS_DESC relocation.
(elf_machine_lazy_rel): Handle prelinked R_ARM_TLS_DESC entries.
This patch splits s390 out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/s390/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__s390__]
(__ASSUME_SOCKETCALL): Do not define.
This patch splits sh out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/sh/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__sh__]
(__ASSUME_SOCKETCALL): Do not define.
(__ASSUME_ST_INO_64_BIT): Define unconditionally.
[__LINUX_KERNEL_VERSION >= 0x020625 && __sh__]
(__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020625 && __sh__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __sh__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__sh__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
This patch splits powerpc out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/powerpc/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__powerpc__]
(__ASSUME_SOCKETCALL): Do not define.
(__ASSUME_IPC64): Define unconditionally.
[__LINUX_KERNEL_VERSION >= 0x020625 && __powerpc__]
(__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020625 && __powerpc__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __powerpc__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__powerpc__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL):
Likewise.
This patch splits sparc out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/sparc/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__sparc__]
(__ASSUME_SOCKETCALL): Do not define.
(__ASSUME_SET_ROBUST_LIST): Define unconditionally.
(__ASSUME_FUTEX_LOCK_PI): Likewise.
[__sparc__] (__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__sparc__] (__ASSUME_ACCEPT4_SYSCALL_WITH_SOCKETCALL): Likewise.
(__ASSUME_REQUEUE_PI): Define unconditionally.
[__LINUX_KERNEL_VERSION >= 0x020621 && __sparc__]
(__ASSUME_RECVMMSG_SYSCALL): Do not define.
[__sparc__] (__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __sparc__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__sparc__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
This patch splits i386 out of the main Linux kernel-features.h.
Tested x86 that there are no changes to disassembly of installed
shared libraries.
* sysdeps/unix/sysv/linux/i386/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__i386__]
(__ASSUME_SOCKETCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020621 && __i386__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__i386__] (__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __i386__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__i386__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.