Commit Graph

10425 Commits

Author SHA1 Message Date
Joseph Myers
c23805a95d Fix i386 asinhl (sNaN) (bug 20218).
The i386 version of asinhl returns sNaN (without raising any
exceptions) for sNaN input.  This patch fixes it to add non-finite
arguments to themselves, so that "invalid" is raised and qNaN
returned.
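
As a rough illustration of the technique (plain C, not the i386 assembly
touched by this patch), adding a non-finite argument to itself can be
sketched as below.  The name quiet_nonfinite is hypothetical, and double is
used instead of long double to keep the sketch portable.

```c
#include <math.h>

/* Hypothetical helper, not glibc code.  For NaN or infinite input,
   return x + x.  Under IEEE 754 arithmetic, adding a signaling NaN to
   itself raises "invalid" and produces a quiet NaN, while Inf + Inf
   keeps the same infinity.  */
double
quiet_nonfinite (double x)
{
  if (!isfinite (x))
    return x + x;
  /* Finite inputs are passed through unchanged in this sketch; the
     real function goes on to compute asinh for them.  */
  return x;
}
```

The addition is an ordinary arithmetic operation, so no explicit test for
signaling versus quiet NaNs is needed.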

Tested for x86_64 and x86.

	[BZ #20218]
	* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Add non-finite argument
	to itself.
	* math/libm-test.inc (asinh_test_data): Add sNaN tests.
2016-06-07 22:54:58 +00:00
H.J. Lu
91655fc307 Check FMA after COMMON_CPUID_INDEX_80000001
Since the FMA4 bit is in COMMON_CPUID_INDEX_80000001 and FMA4 requires
AVX, determine if FMA4 is usable after COMMON_CPUID_INDEX_80000001 is
available and if AVX is usable.

	[BZ #20195]
	* sysdeps/x86/cpu-features.c (get_common_indeces): Move FMA4
	check to ...
	(init_cpu_features): Here.
2016-06-07 08:00:40 -07:00
Carlos O'Donell
c9bd40daae Bug 20214: Fix linux/in6.h and netinet/in.h sync.
In: https://sourceware.org/glibc/wiki/Synchronizing_Headers
we explain how we synchronize our headers with Linux kernel
headers.

In order to synchronize with the Linux linux/in6.h and
linux/ipv6.h headers we checked for their guard macros and
then defined __USE_KERNEL_IPV6_DEFS and conditionalized code
on this macro.

In upstream kernel 56c176c9 the _UAPI prefix was stripped and
this broke our synchronized headers again. We now need to check
for _LINUX_IN6_H and _IPV6_H, and keep checking the old versions
of the header guard checks for maximum backwards compatibility
with older Linux headers (the history is actually a bit muddled
here and it appears the upstream Linus kernel broke this 10 months
*before* our fix was ever applied to glibc, but without glibc
testing we didn't notice and distro kernels have their own
testing to fix this).

This patch fixes synchronization with linux/in6.h and
with netinet/in.h.
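
The guard check described above can be sketched roughly as follows.  This
is an illustration, not the glibc header: DEMO_USE_KERNEL_IPV6_DEFS and
demo_kernel_ipv6_defs_in_use are hypothetical names, and the two
_UAPI-prefixed guard names are an assumption about what "the old versions"
of the guards look like.

```c
/* Hypothetical sketch.  Check both the newer guard names mentioned in
   the commit message and the assumed older _UAPI-prefixed ones, so the
   check works with old and new kernel headers.  */
#if defined _UAPI_LINUX_IN6_H || defined _UAPI_IPV6_H \
    || defined _LINUX_IN6_H || defined _IPV6_H
# define DEMO_USE_KERNEL_IPV6_DEFS 1
#else
# define DEMO_USE_KERNEL_IPV6_DEFS 0
#endif

/* Report whether a kernel IPv6 header was included before this point.  */
int
demo_kernel_ipv6_defs_in_use (void)
{
  return DEMO_USE_KERNEL_IPV6_DEFS;
}
```

The result depends on include order: the check only sees kernel headers
that were included before it.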
2016-06-07 04:46:37 -04:00
Carlos O'Donell
47dd3543d3 Bug 20198: quick_exit should not call destructors.
C++11 18.5.12 says "Objects shall not be destroyed as a
result of calling quick_exit." In C11 quick_exit is silent
about thread object destruction. Therefore to make glibc
C++ compliant we do not call any thread local destructors.
A new regression test verifies the fix.

I will note that C++11 18.5.3 makes it clear that C++
defines additional requirements for _Exit() to prevent it
from executing destructors.

Given that the point of _Exit() is to terminate the process
immediately it makes sense that C and C++ should line up
and avoid calling destructors.

No failures. New regtest passes.
2016-06-06 21:40:25 -04:00
H.J. Lu
3f61232ab3 Fix a typo in comments in memmove-vec-unaligned-erms.S
* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Fix
	a typo in comments.
2016-06-06 16:03:21 -07:00
Joseph Myers
3d8b06bc61 Fix dbl-64 asin (sNaN) (bug 20213).
The dbl-64 version of asin returns sNaN for sNaN arguments.  This
patch fixes it to add NaN arguments to themselves so that qNaN is
returned in this case.

Tested for x86_64 and x86.

	[BZ #20213]
	* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Add NaN
	argument to itself.
	* math/libm-test.inc (asin_test_data): Add sNaN tests.
2016-06-06 22:21:11 +00:00
Adhemerval Zanella
af5fdf5a35 Consolidate pwritev/pwritev64 implementations
This patch consolidates all the pwritev{64} implementations for Linux
into a single one (sysdeps/unix/sysv/linux/pwritev{64}.c).  It also removes the
syscall from the auto-generation using assembly macros.

It is based on the previous pwrite/pwrite64 consolidation patch.  The new macro
SYSCALL_LL{64} is used to handle the offset argument, and an alias is created
for __ASSUME_OFF_DIFF_OFF64 in case of pread64.
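
The commit message does not show what SYSCALL_LL{64} expands to.  As a
sketch of the general idea only, assuming a 32-bit ABI that passes a
64-bit offset as two 32-bit halves, the split could look like this
(demo_offset_pair and demo_split_offset are hypothetical names):

```c
#include <stdint.h>

/* Hypothetical illustration, not the glibc macro: the two 32-bit halves
   of a 64-bit file offset.  */
struct demo_offset_pair
{
  uint32_t high;
  uint32_t low;
};

/* Split OFFSET into its high and low 32-bit words.  The order in which
   the halves are passed to the kernel depends on the ABI and is not
   modelled here.  */
struct demo_offset_pair
demo_split_offset (int64_t offset)
{
  struct demo_offset_pair pair;
  uint64_t bits = (uint64_t) offset;
  pair.high = (uint32_t) (bits >> 32);
  pair.low = (uint32_t) (bits & 0xffffffffu);
  return pair;
}
```

On a 64-bit ABI no split is needed, since the offset fits in a single
register.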

Checked on x86_64, i386, aarch64, and powerpc64le.

	* misc/Makefile (CFLAGS-pwritev.c): New variable: add cancellation
	required flags.
	(CFLAGS-pwritev64.c): Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev.c: Remove file.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev64.c: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/pwritev64.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/pwritev.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/pwritev64.: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (pwritev): Remove
	syscall from auto-generation.
	* sysdeps/unix/sysv/linux/pwritev.c: Rewrite implementation.
	[WORDSIZE == 64] (pwritev64): Remove macro.
	[!PWRITEV] (PWRITEV): Likewise.
	[!PWRITEV] (PWRITEV_REPLACEMENT): Likewise.
	[!PWRITEV] (PWRITE): Likewise.
	[!PWRITEV] (OFF_T): Likewise.
	[!__ASSUME_PWRITEV] (PWRITEV_REPLACEMENT): Likewise.
	(LO_HI_LONG): Remove macro.
	[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pwritev): Add function.
	* sysdeps/unix/sysv/linux/pwritev64.c: Rewrite implementation.
	(PWRITEV): Remove macro.
	(PWRITEV_REPLACEMENTE): Likewise.
	(PWRITE): Likewise.
	(OFF_T): Likewise.
	(pwritev64): New function.
	* nptl/tst-cancel4.c (tf_writev): Add test.
2016-06-06 19:12:36 -03:00
Adhemerval Zanella
4e77815173 Consolidate preadv/preadv64 implementation
This patch consolidates all the preadv{64} implementations for Linux
into a single one (sysdeps/unix/sysv/linux/preadv{64}.c).  It also removes the
syscall from the auto-generation using assembly macros.

It is based on the previous pread/pread64 consolidation patch.  The new macro
SYSCALL_LL{64} is used to handle the offset argument, and an alias is created
for __ASSUME_OFF_DIFF_OFF64 in case of pread64.

Checked on x86_64, i386, aarch64, and powerpc64le.

	* misc/Makefile (CFLAGS-preadv.c): New variable: add cancellation
	required flags.
	(CFLAGS-preadv64.c): Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/preadv.c: Remove file.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/preadv64.c: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/preadv64.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/preadv.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/preadv64.: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (preadv): Remove
	syscall from auto-generation.
	* sysdeps/unix/sysv/linux/preadv.c: Rewrite implementation.
	[WORDSIZE == 64] (preadv64): Remove macro.
	[!PREADV] (PREADV): Likewise.
	[!PREADV] (PREADV_REPLACEMENT): Likewise.
	[!PREADV] (PREAD): Likewise.
	[!PREADV] (OFF_T): Likewise.
	[!__ASSUME_PREADV] (PREADV_REPLACEMENT): Likewise.
	(LO_HI_LONG): Remove macro.
	[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (preadv): Add function.
	* sysdeps/unix/sysv/linux/preadv64.c: Rewrite implementation.
	(PREADV): Remove macro.
	(PREADV_REPLACEMENTE): Likewise.
	(PREAD): Likewise.
	(OFF_T): Likewise.
	(preadv64): New function.
	* nptl/tst-cancel4.c (tf_preadv): Add test.
2016-06-06 19:12:36 -03:00
Joseph Myers
af0cfbaf1d Fix dbl-64 acos (sNaN) (bug 20212).
The dbl-64 version of acos returns sNaN for sNaN arguments.  This
patch fixes it to add NaN arguments to themselves so that qNaN is
returned in this case.

Tested for x86_64 and x86.

	[BZ #20212]
	* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_acos): Add NaN
	argument to itself.
	* math/libm-test.inc (acos_test_data): Add sNaN tests.
2016-06-06 22:10:11 +00:00
Tulio Magno Quites Machado Filho
c24480ce3b powerpc: Fix --disable-multi-arch build on POWER8
Add missing symbols of stpncpy and strcasestr when multi-arch is
disabled.
Fix memset call from strncpy/stpncpy when multi-arch is disabled.
2016-06-06 16:03:29 -03:00
Joseph Myers
8cbd1453ec Fix x86/x86_64 nextafterl incrementing negative subnormals (bug 20205).
The x86 / x86_64 implementation of nextafterl (also used for
nexttowardl) produces incorrect results (NaNs) when negative
subnormals, the low 32 bits of whose mantissa are zero, are
incremented towards zero.  This patch fixes this by disabling the
logic to decrement the exponent in that case.

Tested for x86_64 and x86.

	[BZ #20205]
	* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Do not adjust
	exponent when incrementing negative subnormal with low mantissa
	word zero.
	* math/libm-test.inc (nextafter_test_data) [TEST_COND_intel96]:
	Add another test.
2016-06-03 21:30:12 +00:00
Carlos O'Donell
1c1e7fb658 Fix macro API for __USE_KERNEL_IPV6_DEFS.
The use of __USE_KERNEL_IPV6_DEFS with ifndef is bad
practice per: https://sourceware.org/glibc/wiki/Wundef.
This change moves it to use 'if' and always define the
macro.

Please note that this is not the only problem with this
code. I have a series of fixes after this one to resolve
breakage with this code and add regression tests for it
via compile-only source testing (to be discussed in another
thread).

Unfortunately __USE_KERNEL_XATTR_DEFS is set by the kernel
and not glibc, and uses 'define', so we can't fix that yet.
2016-06-02 23:52:06 -04:00
Samuel Thibault
600c13bf72 hurd: disable ifunc for now
* sysdeps/mach/hurd/configure.ac (libc_cv_ld_gnu_indirect_function):
	Set to no.
	* sysdeps/mach/hurd/configure: Refresh.
2016-05-30 22:13:47 +02:00
Adhemerval Zanella
3e040a2d5f posix: Call _exit in failure case for posix_spawn{p} (BZ#20178)
This patch calls _exit instead of exit in the failure case for the spawned
child in the Linux posix_spawn{p} implementation.
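
The difference matters because exit runs atexit handlers and flushes
stdio buffers inherited from the parent, while _exit terminates the child
directly.  A minimal sketch of the pattern, using plain fork and execv
rather than the actual spawni.c code, might look like this.  The helper
name demo_spawn_and_wait and the status value 127 are assumptions made for
the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not glibc code.  Fork, try to execute PATH, and
   return the child's exit status, or -1 on error.  */
int
demo_spawn_and_wait (const char *path, char *const argv[])
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      execv (path, argv);
      /* Only reached if execv failed.  Use _exit, not exit, so the child
         does not run the parent's atexit handlers or flush its stdio
         buffers a second time.  */
      _exit (127);
    }
  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```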

Tested on x86_64.

	[BZ #20178]
	* sysdeps/unix/sysv/linux/spawni.c (__spawni_child): Call _exit
	on failure instead of exit.
2016-05-30 10:56:01 -03:00
Samuel Thibault
3904414a30 hurd: fix _hurd_self_sigstate reference from ____longjmp_chk
* sysdeps/mach/hurd/i386/____longjmp_chk.S (____longjmp_chk) [PIC]:
	  Use PLT entry for calling _hurd_self_sigstate.
2016-05-30 01:24:09 +02:00
H.J. Lu
d6af2388f7 Count number of logical processors sharing L2 cache
For Intel processors, when there are both L2 and L3 caches, SMT level
type should be used to count number of available logical processors
sharing L2 cache.  If there is only L2 cache, core level type should
be used to count number of available logical processors sharing L2
cache.  Number of available logical processors sharing L2 cache should
be used for non-inclusive L2 and L3 caches.

	* sysdeps/x86/cacheinfo.c (init_cacheinfo): Count number of
	available logical processors with SMT level type sharing L2
	cache for Intel processors.
2016-05-27 15:16:51 -07:00
Joseph Myers
f6ef0657e4 Fix powerpc64 ceil, rint etc. on sNaN input (bug 20160).
The powerpc64 versions of ceil, floor, round, trunc, rint, nearbyint
and their float versions return sNaN for sNaN input when they should
return qNaN.  This patch fixes them to add a NaN argument to itself to
quiet sNaNs before returning.

Tested for powerpc64.

	[BZ #20160]
	* sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Add NaN
	argument to itself before returning the result.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint):
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rint.S (__rint): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S (__rintf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
2016-05-27 17:47:54 +00:00
Joseph Myers
debf7618f6 Fix powerpc32 ceil, rint etc. on sNaN input (bug 20160).
The powerpc32 versions of ceil, floor, round, trunc, rint, nearbyint
and their float versions return sNaN for sNaN input when they should
return qNaN.  This patch fixes them to add a NaN argument to itself to
quiet sNaNs before returning.  The powerpc64 versions, which have the
same bug, will be addressed separately.

Tested for powerpc32.

	[BZ #20160]
	* sysdeps/powerpc/powerpc32/fpu/s_ceil.S (__ceil): Add NaN
	argument to itself before returning the result.
	* sysdeps/powerpc/powerpc32/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_floor.S (__floor): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_floorf.S (__floorf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S (__nearbyint):
	Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_rint.S (__rint): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_rintf.S (__rintf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_round.S (__round): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_roundf.S (__roundf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_trunc.S (__trunc): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_truncf.S (__truncf): Likewise.
2016-05-27 17:31:21 +00:00
Joseph Myers
24e9ae1bc2 Avoid "invalid" exceptions from powerpc fabsl (sNaN) (bug 20157).
The powerpc implementations of fabsl for ldbl-128ibm (both powerpc32
and powerpc64) wrongly raise the "invalid" exception for sNaN
arguments.  fabs functions should be quiet for all inputs including
signaling NaNs.  The problem is the use of a comparison instruction
fcmpu to determine if the high part of the argument is negative and so
the low part needs to be negated; such instructions raise "invalid"
for sNaNs.

There is a pure integer implementation of fabsl in
sysdeps/ieee754/ldbl-128ibm/s_fabsl.c.  However, it's not necessary to
use it to avoid such exceptions.  The fsel instruction does not raise
exceptions for sNaNs, and can be used in place of the original
comparison.  (Note that if the high part is zero or a NaN, it does not
matter whether the low part is negated; the choice of whether the low
part of a zero is +0 or -0 does not affect the value, and the low part
of a NaN does not affect the value / payload either.)
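
For illustration, the integer approach mentioned above can be sketched in
C for a value stored as a high and a low double.  This is not the code in
s_fabsl.c or the assembly in this patch; demo_ibm128 and demo_fabs_ibm128
are hypothetical names, and the sketch assumes double is a 64-bit IEEE
binary64 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an ldbl-128ibm value: the sum of two doubles.  */
struct demo_ibm128
{
  double hi;
  double lo;
};

/* Absolute value using only integer operations, so no floating-point
   exception can be raised even for signaling NaNs.  If the high part is
   negative, both parts are negated: the sign of the low part is flipped
   and the sign of the high part is cleared.  */
struct demo_ibm128
demo_fabs_ibm128 (struct demo_ibm128 x)
{
  const uint64_t sign = UINT64_C (1) << 63;
  uint64_t hi_bits, lo_bits;

  memcpy (&hi_bits, &x.hi, sizeof hi_bits);
  memcpy (&lo_bits, &x.lo, sizeof lo_bits);

  lo_bits ^= hi_bits & sign;   /* Flip low sign only if high is negative.  */
  hi_bits &= ~sign;            /* Clear the sign of the high part.  */

  memcpy (&x.hi, &hi_bits, sizeof hi_bits);
  memcpy (&x.lo, &lo_bits, sizeof lo_bits);
  return x;
}
```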

The condition in GCC for fsel to be available is TARGET_PPC_GFXOPT,
corresponding to the _ARCH_PPCGR predefined macro.  fsel is available
on all 64-bit processors supported by GCC.  A few 32-bit processors
supported by GCC do not have TARGET_PPC_GFXOPT despite having hard
float support.  To support those processors, integer code (similar to
that in copysignl) is included for the !_ARCH_PPCGR case for
powerpc32.

Tested for powerpc32 (configurations with and without _ARCH_PPCGR) and
powerpc64.

	[BZ #20157]
	* sysdeps/powerpc/powerpc32/fpu/s_fabsl.S (__fabsl): Use fsel to
	determine whether to negate low half if [_ARCH_PPCGR], and integer
	comparison otherwise.
	* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S (__fabsl): Use fsel to
	determine whether to negate low half.
2016-05-27 15:29:31 +00:00
Joseph Myers
bba1419589 Fix ldbl-128ibm ceill, rintl etc. for sNaN arguments (bug 20156).
The ldbl-128ibm implementations of ceill, floorl, roundl, truncl,
rintl and nearbyintl wrongly return an sNaN when given an sNaN
argument.  This patch fixes them to add such an argument to itself to
turn it into a quiet NaN.  (The code structure means this "else" case
applies to any argument which is zero or not finite; it's OK to do
this in all such cases.)

Tested for powerpc.

	[BZ #20156]
	* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (__ceill): Add high part
	to itself when zero or not finite.
	* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (__floorl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c (__rintl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_roundl.c (__roundl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.
2016-05-27 13:59:24 +00:00
Joseph Myers
98c9c9d9ca Fix ldbl-128ibm sqrtl (sNaN) (bug 20153).
The ldbl-128ibm implementation of sqrtl wrongly returns an sNaN for
signaling NaN arguments.  This patch fixes it to quiet its argument,
using the same x * x + x return for infinities and NaNs as the dbl-64
implementation uses to ensure that +Inf maps to +Inf while -Inf and
NaN map to NaN.
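
The behaviour of the x * x + x expression can be checked with a small C
sketch.  The function name demo_sqrt_special is hypothetical, and double is
used here instead of long double.

```c
#include <math.h>

/* Hypothetical helper, not glibc code: the value returned for infinite
   or NaN input.
     +Inf: Inf * Inf + Inf   = +Inf
     -Inf: Inf + (-Inf)      = NaN (and "invalid" is raised)
     NaN:  NaN * NaN + NaN   = quiet NaN  */
double
demo_sqrt_special (double x)
{
  return x * x + x;
}
```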

Tested for powerpc.

	[BZ #20153]
	* sysdeps/ieee754/ldbl-128ibm/e_sqrtl.c (__ieee754_sqrtl): Return
	x * x + x for infinities and NaNs.
2016-05-26 22:58:36 +00:00
Joseph Myers
d73e7bdb3a Fix ldbl-128 j0l, j1l, y0l, y1l for sNaN argument (bug 20151).
The ldbl-128 implementations of j0l, j1l, y0l, y1l (also used for
ldbl-128ibm) return an sNaN argument unchanged.  This patch fixes them
to add a NaN argument to itself to quiet it before return.

Tested for mips64.

	[BZ #20151]
	* sysdeps/ieee754/ldbl-128/e_j0l.c (__ieee754_j0l): Add NaN
	argument to itself before returning result.
	(__ieee754_y0l): Likewise.
	* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Likewise.
	(__ieee754_y1l).
2016-05-26 20:55:03 +00:00
Adhemerval Zanella
2f0dc39029 network: Fix missing bits from {recv,send}{m}msg standard compliance
This patch fixes wrong/missing bits from the Fix {recv,send}{m}msg
standard compliance (BZ#16919) patches:

  * nptl/Makefile sets CFLAGS-oldrecvfrom.c, but there's no such file as
    oldrecvfrom.c.  It should be oldsendmsg.c as defined by ChangeLog.

  * sysdeps/unix/sysv/linux/hppa/Versions and
    sysdeps/unix/sysv/linux/i386/Versions list a symbol recvms instead of
    recvmsg at version GLIBC_2.24.

	* nptl/Makefile (CFLAGS-oldrecvfrom.c): Remove rule.
	(CFLAGS-oldsendmsg.c): Add rule.
	* sysdeps/unix/sysv/linux/hppa/Versions [libc] (GLIBC_2.24):
	Correct recvmsg symbol name.
	* sysdeps/unix/sysv/linux/i386/Versions [libc] (GLIBC_2.24):
	Likewise.
2016-05-26 11:11:33 -03:00
Adhemerval Zanella
222c2d7f43 network: recvmmsg and sendmmsg standard compliance (BZ#16919)
POSIX specifies msghdr::msg_iovlen and msghdr::msg_controllen to be of
type int and socklen_t respectively, however Linux implements both
as size_t.  So for 64-bit architectures, where size_t is
larger than socklen_t, both sendmmsg and recvmmsg need to adjust the
mmsghdr::msg_hdr internal fields before issuing the syscall itself.

This patch fixes it by operating on the padding if that is the case.
For recvmmsg, the most straightforward case, zero padding the fields
is sufficient.  However, for sendmmsg, where adjusting the buffer is out
of the contract (since it may point to read-only data), the function
is rewritten to use sendmsg instead (which, as of the previous patch,
allocates a temporary msghdr to operate on).

Also for 64-bit ports that require it, a new recvmmsg and sendmmsg
compat version is created (which uses size_t for both cmsghdr::cmsg_len
and internal

Tested on x86_64, i686, aarch64, armhf, and powerpc64le.

	* sysdeps/unix/sysv/linux/Makefile
	[$(subdir) = socket] (sysdep_routines): Add oldrecvmmsg and
	oldsendmmsg.
	* sysdeps/unix/sysv/linux/aarch64/libc.abilist: Add recvmmsg and
	sendmmsg.
	* sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
	* sysdeps/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/oldrecvmmsg.c: New file.
	* sysdeps/unix/sysv/linux/oldsendmmsg.c: Likewise.
	* sysdeps/unix/sysv/linux/recvmmsg.c (__recvmmsg): Adjust msghdr
	iovlen and controllen fields to adjust to POSIX specification.
	* sysdeps/unix/sysv/linux/sendmmsg.c (__sendmmsg): Likewise.
2016-05-25 17:39:07 -03:00
Adhemerval Zanella
af7f7c7ec8 network: recvmsg and sendmsg standard compliance (BZ#16919)
POSIX specifies msghdr::msg_iovlen and msghdr::msg_controllen to be of
type int and socklen_t respectively.  However, Linux defines both as
size_t, so 64-bit architectures require some adjustments to make the
functions standards compliant.

This patch fixes it by creating a temporary header and zeroing the pad
fields for 64-bit architectures, where the size of size_t exceeds the size of
int.
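
A simplified sketch of the temporary-header idea follows.  It uses a
stand-in structure rather than the real struct msghdr; demo_msghdr, its
field names, and demo_prepare_for_syscall are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a POSIX-style header on a 64-bit target:
   each narrow length field is followed by explicit padding, so the
   length plus its padding fills the slot where the kernel expects a
   size_t.  */
struct demo_msghdr
{
  void *iov;
  int iovlen;
  int pad1;
  void *control;
  unsigned int controllen;
  int pad2;
  int flags;
};

/* Return a temporary copy with the pad fields zeroed.  The caller's
   header is never written to, since it may live in read-only memory.  */
struct demo_msghdr
demo_prepare_for_syscall (const struct demo_msghdr *user)
{
  struct demo_msghdr tmp = *user;
  tmp.pad1 = 0;
  tmp.pad2 = 0;
  return tmp;
}
```

Whether the padding belongs before or after each length field depends on
endianness, which this sketch does not model.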

Also the new recvmsg and sendmsg implementation is only added on libc,
with libpthread only containing a compat symbol.

Tested on x86_64, i686, aarch64, armhf, and powerpc64le.

	* conform/data/sys/socket.h-data (msghdr.msg_iovlen): Remove xfail-
	and change to correct expected type.
	(msghdr.msg_controllen): Likewise.
	(cmsghdr.cmsg_len): Likewise.
	* sysdeps/unix/sysv/linux/bits/socket.h (msghdr.msg_iovlen): Fix
	expected POSIX assumption about the size.
	(msghdr.msg_controllen): Likewise.
	(msghdr.__glibc_reserved1): Likewise.
	(msghdr.__glibc_reserved2): Likewise.
	(cmsghdr.cmsg_len): Likewise.
	(cmsghdr.__glibc_reserved1): Likewise.
	* nptl/Makefile (libpthread-routines): Remove ptw-recvmsg and ptw-sendmsg.
	Add ptw-oldrecvmsg and ptw-oldsendmsg.
	(CFLAGS-sendmsg.c): Remove rule.
	(CFLAGS-recvmsg.c): Likewise.
	(CFLAGS-oldsendmsg.c): Add rule.
	(CFLAGS-oldrecvmsg.c): Likewise.
	* sysdeps/unix/sysv/linux/alpha/Versions [libc] (GLIBC_2.24): Add
	recvmsg and sendmsg.
	* sysdeps/unix/sysv/linux/aarch64/Version [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/arm/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/hppa/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/i386/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/ia64/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/m68k/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/nios2/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions [libc]
	(GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/sh/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/sparc/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/tile/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions [libc]
	(GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/Makefile
	[$(subdir) = socket)] (sysdep_headers): Add oldrecvmsg and oldsendmsg.
	(CFLAGS-sendmsg.c): Add rule.
	(CFLAGS-recvmsg.c): Likewise.
	(CFLAGS-oldsendmsg.c): Likewise.
	(CFLAGS-oldrecvmsg.c): Likewise.
	* sysdeps/unix/sysv/linux/check_native.c (__check_native): Fix msghdr
	initialization.
	* sysdeps/unix/sysv/linux/check_pf.c (make_request): Likewise.
	* sysdeps/unix/sysv/linux/ifaddrs.c (__netlink_request): Likewise.
	* sysdeps/unix/sysv/linux/oldrecvmsg.c: New file.
	* sysdeps/unix/sysv/linux/oldsendmsg.c: Likewise.
	* sysdeps/unix/sysv/linux/recvmsg.c (__libc_recvmsg): Adjust msghdr
	iovlen and controllen fields to adjust to POSIX specification.
	* sysdeps/unix/sysv/linux/sendmsg.c (__libc_sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libc.abilist: New version and
	added recvmsg and sendmsg.
	* sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist: Likewise.
	Likewise.
	Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist: Likewise.
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
2016-05-25 17:39:01 -03:00
Adhemerval Zanella
abf29edd4a Adjust kernel-features.h defaults for recvmsg and sendmsg
This patch removes the auto-generation for the recvmsg and sendmsg syscalls
and adjusts kernel-features.h for all architectures supported on
Linux.  This patch follows the idea of 'Adjust kernel-features.h defaults
for socket syscalls.' (35ade9f11b) by defining
__ASSUME_SENDMSG_SYSCALL and __ASSUME_RECVMSG_SYSCALL as supported by
default and undefining them for the architectures that do not support them
directly.

The main rationale is to make it easier to add a code wrapper over the syscall
to fix BZ#16919 (recvmsg standard compliance).

Tested on x86_64, i686, aarch64, armhf, and powerpc64le.

	* sysdeps/unix/sysv/linux/alpha/syscalls.list (recvmsg): Remove
	from auto-generation.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/arm/syscalls.list (recvmsg): Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/generic/syscalls.list (recvmsg): Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/hppa/syscalls.list (recvmsg): Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/ia64/syscalls.list (recvmsg): Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/mips/syscalls.list (recvmsg): Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/syscalls.list (recvmsg):
	Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/syscalls.list (recvmsg): Likewise.
	(sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/i386/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x040300] (__ASSUME_SENDMSG_SYSCALL):
	Remove.
	[__LINUX_KERNEL_VERSION >= 0x040300] (__ASSUME_RECVMSG_SYSCALL):
	Likewise.
	[__LINUX_KERNEL_VERSION < 0x040300] (__ASSUME_SENDMSG_SYSCALL):
	Undefine.
	[__LINUX_KERNEL_VERSION < 0x040300] (__ASSUME_RECVMSG_SYSCALL):
	Likewise.
	* sysdeps/unix/sysv/linux/kernel-features.h
	(__ASSUME_SENDMSG_SYSCALL): Define.
	(__ASSUME_RECVMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/m68k/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x040300] (__ASSUME_SENDMSG_SYSCALL):
	Remove.
	[__LINUX_KERNEL_VERSION >= 0x040300] (__ASSUME_RECVMSG_SYSCALL):
	Likewise.
	[__LINUX_KERNEL_VERSION < 0x040300] (__ASSUME_SENDMSG_SYSCALL):
	Undefine.
	[__LINUX_KERNEL_VERSION < 0x040300] (__ASSUME_RECVMSG_SYSCALL):
	Likewise.
	* sysdeps/unix/sysv/linux/s390/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x040300] (__ASSUME_SENDMSG_SYSCALL):
	Remove.
	[__LINUX_KERNEL_VERSION >= 0x040300] (__ASSUME_RECVMSG_SYSCALL):
	Likewise.
	[__LINUX_KERNEL_VERSION < 0x040300] (__ASSUME_SENDMSG_SYSCALL):
	Undefine.
	[__LINUX_KERNEL_VERSION < 0x040300] (__ASSUME_RECVMSG_SYSCALL):
	Likewise.
	* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
	(__ASSUME_SENDMSG_SYSCALL): Undefine.
	(__ASSUME_RECVMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/kernel-features.h
	(__ASSUME_SENDMSG_SYSCALL): Likewise.
	(__ASSUME_RECVMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/sh/kernel-features.h
	(__ASSUME_SENDMSG_SYSCALL): Likewise.
	(__ASSUME_RECVMSG_SYSCALL): Likewise.
2016-05-25 17:27:57 -03:00
Joseph Myers
b4d80349bb Do not raise "inexact" from powerpc64 ceil, floor, trunc (bug 15479).
Continuing fixes for ceil, floor and trunc functions not to raise the
"inexact" exception, this patch fixes the versions used on older
powerpc64 processors.  As was done with the round implementations some
time ago, the save of floating-point state is moved after the first
floating-point operation on the input to ensure that any "invalid"
exception from signaling NaN input is included in the saved state, and
then the whole state gets restored rather than just the rounding mode.

This has no effect on configurations using the power5+ code, since
such processors can do these operations with a single instruction (and
those instructions do not set "inexact", so are correct for TS 18661-1
semantics).

Tested for powerpc64.

	[BZ #15479]
	* sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Move save of
	floating-point state after first floating-point operation on
	input.  Restore full floating-point state instead of just rounding
	mode.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
2016-05-25 17:42:22 +00:00
Joseph Myers
1f921a93e4 Do not raise "inexact" from powerpc32 ceil, floor, trunc (bug 15479).
Continuing fixes for ceil, floor and trunc functions not to raise the
"inexact" exception, this patch fixes the versions used on older
powerpc32 processors.  As was done with the round implementations some
time ago, the save of floating-point state is moved after the first
floating-point operation on the input to ensure that any "invalid"
exception from signaling NaN input is included in the saved state, and
then the whole state gets restored rather than just the rounding mode.

This has no effect on configurations using the power5+ code, since
such processors can do these operations with a single instruction (and
those instructions do not set "inexact", so are correct for TS 18661-1
semantics).

Tested for powerpc32.

	[BZ #15479]
	* sysdeps/powerpc/powerpc32/fpu/s_ceil.S (__ceil): Move save of
	floating-point state after first floating-point operation on
	input.  Restore full floating-point state instead of just rounding
	mode.
	* sysdeps/powerpc/powerpc32/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_floor.S (__floor): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_floorf.S (__floorf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_trunc.S (__trunc): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_truncf.S (__truncf): Likewise.
2016-05-25 16:53:23 +00:00
Stefan Liebler
a42a95c431 S390: Fix utf32 to utf16 handling of low surrogates (disable cu42).
According to the latest Unicode standard, a conversion from/to UTF-xx has
to report an error if the character value is in range of an utf16 surrogate
(0xd800..0xdfff). See https://sourceware.org/ml/libc-help/2015-12/msg00015.html.

Thus the cu42 instruction, which converts from utf32 to utf16,  has to be
disabled because it does not report an error in case of a value in range of
a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c,
vector variant is adjusted to handle the value in range of an utf16 low
surrogate correctly.
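
The range checks involved can be sketched in C as follows.  The function
names are hypothetical; the ranges are the ones quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not the s390 converter code.  */

/* True if C lies in the UTF-16 surrogate range 0xd800..0xdfff, which a
   UTF-32 converter has to reject.  */
bool
demo_is_utf16_surrogate (uint32_t c)
{
  return c >= 0xd800 && c <= 0xdfff;
}

/* True if C lies in the low surrogate range 0xdc00..0xdfff, the part of
   the range for which cu42 is said not to report an error.  */
bool
demo_is_utf16_low_surrogate (uint32_t c)
{
  return c >= 0xdc00 && c <= 0xdfff;
}
```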

ChangeLog:

	* sysdeps/s390/utf16-utf32-z9.c: Disable cu42 instruction and report
	an error in case of a value in range of an utf16 low surrogate.
2016-05-25 17:18:06 +02:00
Stefan Liebler
52f8a48e24 S390: Fix utf32 to utf8 handling of low surrogates (disable cu41).
According to the latest Unicode standard, a conversion from/to UTF-xx has
to report an error if the character value is in range of an utf16 surrogate
(0xd800..0xdfff). See https://sourceware.org/ml/libc-help/2015-12/msg00015.html.

Thus the cu41 instruction, which converts from utf32 to utf8,  has to be
disabled because it does not report an error in case of a value in range of
a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c,
vector variant is adjusted to handle the value in range of an utf16 low
surrogate correctly.

ChangeLog:

	* sysdeps/s390/utf8-utf32-z9.c: Disable cu41 instruction and report
	an error in case of a value in range of an utf16 low surrogate.
2016-05-25 17:18:05 +02:00
Stefan Liebler
ee518b7070 S390: Use s390-64 specific iconv-modules on s390-32, too.
This patch reworks the existing s390 64bit specific iconv modules in order
to use them on s390 31bit, too.

Thus the parts for subdirectory iconvdata in sysdeps/s390/s390-64/Makefile
were moved to sysdeps/s390/Makefile so that they apply on 31bit, too.
All those modules are moved from sysdeps/s390/s390-64 directory to sysdeps/s390.

The iso-8859-1 to/from cp037 module was adjusted, to use brct (branch relative
on count) instruction on 31bit s390 instead of brctg, because the brctg is a
zarch instruction and is not available on a 31bit kernel.

The utf modules are using zarch instructions, thus the directive machinemode
zarch_nohighgprs was added to the inline assemblies to omit the high-gprs flag
in the shared libraries. Otherwise they can't be loaded on a 31bit kernel.
The ifunc resolvers were adjusted in order to call the etf3eh or vector variants
only if zarch instructions are available (64bit kernel in 31bit compat-mode).
Furthermore some variable types were changed. E.g. unsigned long long would be
a register pair on s390 31bit, but we want only one single register.
For variables of type size_t the register contents have to be enlarged from a
32bit to a 64bit value on 31bit, because the inline assemblies uses 64bit values
in such cases.

ChangeLog:

	* sysdeps/s390/s390-64/Makefile (iconvdata-subdirectory):
	Move to ...
	* sysdeps/s390/Makefile: ... here.
	* sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c: Move to ...
	* sysdeps/s390/iso-8859-1_cp037_z900.c: ... here.
	(BRANCH_ON_COUNT): New define.
	(TR_LOOP): Use BRANCH_ON_COUNT instead of brctg.
	* sysdeps/s390/s390-64/utf16-utf32-z9.c: Move to ...
	* sysdeps/s390/utf16-utf32-z9.c: ... here and adjust to
	run on s390-32, too.
	* sysdeps/s390/s390-64/utf8-utf16-z9.c: Move to ...
	* sysdeps/s390/utf8-utf16-z9.c: ... here and adjust to
	run on s390-32, too.
	* sysdeps/s390/s390-64/utf8-utf32-z9.c: Move to ...
	* sysdeps/s390/utf8-utf32-z9.c: ... here and adjust to
	run on s390-32, too.
2016-05-25 17:18:05 +02:00
Stefan Liebler
6896776c3c S390: Optimize utf16-utf32 module.
This patch reworks the s390 specific module to convert between utf16 and utf32.
Now ifunc is used to choose either the c or etf3eh (with convert utf
instruction) variants at runtime.
Furthermore a new vector variant for z13 is introduced which will be built
and chosen if vector support is available at build / runtime.
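
The runtime selection can be pictured with a plain function pointer.  Real ifunc resolution is done by the dynamic linker; the sketch below only mimics the choice, and every name in it (convert_fn, resolve_convert, the two capability flags) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef const char *(*convert_fn) (void);

/* Stand-ins for the three loop variants; each only reports its name.  */
static const char *convert_c (void)      { return "c"; }
static const char *convert_etf3eh (void) { return "etf3eh"; }
static const char *convert_vector (void) { return "vector"; }

/* Hypothetical resolver: prefer the vector variant, then etf3eh, and
   fall back to the c variant.  The flags stand in for the hardware
   capability bits a real resolver would inspect.  */
convert_fn
resolve_convert (bool have_etf3eh, bool have_vector)
{
  if (have_vector)
    return convert_vector;
  if (have_etf3eh)
    return convert_etf3eh;
  return convert_c;
}
```

The resolver runs once; afterwards all calls go through the returned pointer.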

In case of converting utf 32 to utf16, the vector variant optimizes input of
2byte utf16 characters. The convert utf instruction is used if an utf16
surrogate is found.

For the other direction utf16 to utf32, the cu24 instruction can't be
re-enabled, because it does not report an error, if the input-stream consists of
a single low surrogate utf16 char (e.g. 0xdc00). This applies to the newest z13,
too. Thus there is only the c or the new vector variant, which can handle utf16
surrogate characters.

This patch also fixes some whitespace errors. Furthermore, the etf3eh variant
now handles the "UTF-xx//IGNORE" case. Before, it ignored the ignore-case and
always stopped at an error.

ChangeLog:

	* sysdeps/s390/s390-64/utf16-utf32-z9.c: Use ifunc to select c,
	etf3eh or new vector loop-variant.
2016-05-25 17:18:05 +02:00
Stefan Liebler
5bd11b1909 S390: Optimize utf8-utf16 module.
This patch reworks the s390 specific module to convert between utf8 and utf16.
Now ifunc is used to choose either the c or etf3eh (with convert utf instruction)
variants at runtime. Furthermore a new vector variant for z13 is introduced
which will be built and chosen if vector support is available at build / runtime.

In case of converting utf 8 to utf16, the vector variant optimizes input of
1byte utf8 characters. The convert utf instruction is used if a multibyte utf8
character is found.

For the other direction utf16 to utf8, the cu21 instruction can't be re-enabled,
because it does not report an error, if the input-stream consists of a single
low surrogate utf16 char (e.g. 0xdc00). This applies to the newest z13, too.
Thus there is only the c or the new vector variant, which can handle 1..4 byte
utf8 characters.

The c variant from utf16 to utf8 has been fixed. If a high surrogate was at the
end of the input-buffer, then errno was set to EINVAL and the input-pointer
pointed just after the high surrogate. Now it points to the beginning of the
high surrogate.
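
The fixed behaviour can be sketched as a single decoder step.  This is not the module's code; utf16_next and its error convention are assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical decoder step: read one character from [*in, end).
   Returns 0 and advances *in on success.  Returns EINVAL if the input
   ends right after a high surrogate and EILSEQ for an unpaired
   surrogate; in both error cases *in is left at the start of the
   offending unit.  */
int
utf16_next (const uint16_t **in, const uint16_t *end, uint32_t *cp)
{
  const uint16_t *p = *in;
  assert (p < end);             /* caller must supply at least one unit */

  uint16_t u = p[0];
  if (u < 0xd800 || u > 0xdfff)
    {
      *cp = u;
      *in = p + 1;
      return 0;
    }
  if (u >= 0xdc00)
    return EILSEQ;              /* lone low surrogate */
  if (p + 1 == end)
    return EINVAL;              /* incomplete: *in still points at the high surrogate */

  uint16_t lo = p[1];
  if (lo < 0xdc00 || lo > 0xdfff)
    return EILSEQ;              /* high surrogate without a low surrogate */
  *cp = 0x10000 + (((uint32_t) (u - 0xd800) << 10) | (uint32_t) (lo - 0xdc00));
  *in = p + 2;
  return 0;
}
```

Leaving the pointer at the high surrogate lets the caller retry once more input is available.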

This patch also fixes some whitespace errors. The c variant from utf8 to utf16
now checks that tail-bytes start with 0b10... and that the value is not in
range of an utf16 surrogate.

Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case now.
Before they ignored the ignore-case and always stopped at an error.

ChangeLog:

	* sysdeps/s390/s390-64/utf8-utf16-z9.c: Use ifunc to select c,
	etf3eh or new vector loop-variant.
2016-05-25 17:18:05 +02:00
Stefan Liebler
421c5278d8 S390: Optimize utf8-utf32 module.
This patch reworks the s390 specific module to convert between utf8 and utf32.
Now ifunc is used to choose either the c or etf3eh (with convert utf
instruction) variants at runtime.
Furthermore a new vector variant for z13 is introduced which will be built
and chosen if vector support is available at build / runtime.
The vector variants optimize input of 1byte utf8 characters. The convert utf
instruction is used if a multibyte utf8 character is found.

This patch also fixes some whitespace errors. The c variants are rejecting
UTF-16 surrogates and values above 0x10ffff now.
Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case now.
Before they ignored the ignore-case and always stopped at an error.

ChangeLog:

	* sysdeps/s390/s390-64/utf8-utf32-z9.c: Use ifunc to select c, etf3eh
	or new vector loop-variant.
2016-05-25 17:18:05 +02:00
Stefan Liebler
81c6380887 S390: Optimize iso-8859-1 to ibm037 iconv-module.
This patch reworks the s390 specific module which used the z900
translate one to one instruction. Now the g5 translate instruction is used,
because it outperforms the troo instruction.

ChangeLog:

	* sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c (TROO_LOOP):
	Rename to TR_LOOP and usage of tr instead of troo instruction.
2016-05-25 17:18:05 +02:00
Stefan Liebler
3b704e26b3 S390: Optimize builtin iconv-modules.
This patch introduces a s390 specific gconv_simple.c file which provides
optimized versions for z13 with vector instructions, which will be chosen at
runtime via ifunc.
The optimized conversions can convert between internal and ascii, ucs4, ucs4le,
ucs2, ucs2le.
If the build-environment lacks vector support, then iconv/gconv_simple.c
is used without any change. Otherwise iconvdata/gconv_simple.c is used to create
conversion loop routines without vector instructions as fallback, if vector
instructions aren't available at runtime.

ChangeLog:

	* sysdeps/s390/multiarch/gconv_simple.c: New File.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add gconv_simple.
2016-05-25 17:18:04 +02:00
Stefan Liebler
4690dab084 S390: Optimize 8bit-generic iconv modules.
This patch introduces a s390 specific 8bit-generic.c file which provides an
optimized version for z13 with translate-/vector-instructions, which will be
chosen at runtime via ifunc.
If the build-environment lacks vector support, then iconvdata/8bit-generic.c
is used without any change. Otherwise iconvdata/8bit-generic.c is used to create
conversion loop routines without vector instructions as fallback, if vector
instructions aren't available at runtime.

The vector routines can only be used with charsets where the maximum UCS4 value
fits in 1 byte size. Then the hardware translate-instruction is used
to translate between up to 256 generic characters and "1 byte UCS4"
characters at once. The vector instructions are used to convert between
the "1 byte UCS4" and UCS4.
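
A scalar sketch of the two steps, assuming a 256-entry to_ucs1 table as generated by gen-8bit.sh.  The function name is hypothetical, and the real routines process many characters per instruction rather than one per loop iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1 stands in for the hardware translate instruction: each input
   byte is mapped through the 256-entry table.  Step 2 stands in for
   the vector instructions: each "1 byte UCS4" value is widened to a
   full 32-bit UCS4 value.  */
void
convert_8bit_to_ucs4 (const uint8_t to_ucs1[256], const uint8_t *in,
                      size_t n, uint32_t *out)
{
  for (size_t i = 0; i < n; i++)
    {
      uint8_t ucs1 = to_ucs1[in[i]];    /* step 1: translate */
      out[i] = (uint32_t) ucs1;         /* step 2: widen */
    }
}
```

This only works when every character of the charset maps to a UCS4 value below 0x100, which is the restriction stated above.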

The gen-8bit.sh script in sysdeps/s390/multiarch generates the conversion
table to_ucs1. Therefore an override define generate-8bit-table is added to
sysdeps/s390/multiarch/Makefile; the define is originally in
iconvdata/Makefile. This version calls the gen-8bit.sh in the iconvdata folder
and the s390 one.

ChangeLog:

	* sysdeps/s390/multiarch/8bit-generic.c: New File.
	* sysdeps/s390/multiarch/gen-8bit.sh: New File.
	* sysdeps/s390/multiarch/Makefile (generate-8bit-table):
	New override define.
	* sysdeps/s390/multiarch/iconv/skeleton.c: Likewise.
2016-05-25 17:18:04 +02:00
Stefan Liebler
9b7f05599a S390: Configure check for vector support in gcc.
The S390 specific test checks if the gcc has support for vector registers
by compiling an inline assembly which clobbers vector registers.
On success the macro HAVE_S390_VX_GCC_SUPPORT is defined.
This macro can be used to determine if e.g. clobbering vector registers
is allowed or not.

ChangeLog:

	* config.h.in (HAVE_S390_VX_GCC_SUPPORT): New macro undefine.
	* sysdeps/s390/configure.ac: Add test for S390 vector register
	support in gcc.
	* sysdeps/s390/configure: Regenerated.
2016-05-25 17:18:04 +02:00
Stefan Liebler
c70e9913d2 S390: Get rid of make warning: overriding recipe for target gconv-modules.
This patch introduces a way to provide an architecture dependent gconv-modules
file. Before this patch, the gconv-modules file was normally installed from
src-dir/iconvdata/gconv-modules. The S390 Makefile had overridden the
installation recipe (with a make warning) in order to install the
gconv-module-s390 file from build-dir.
The iconvdata/Makefile provides another recipe, which copies the gconv-modules
file from src to build dir, where it is used by the testcases.
Thus the testcases do not use the currently built s390-modules.

This patch uses build-dir/iconvdata/gconv-modules for installation, which
is generated by concatenating src-dir/iconvdata/gconv-modules and the
architecture specific one. The latter one can be specified by setting the variable
sysdeps-gconv-modules in sysdeps/.../Makefile.

The architecture specific gconv-modules file is emitted before the common one
because these modules aren't used in all possible conversions. E.g. the converting
from INTERNAL to UTF-16 used the common UTF-16.so module instead of UTF16_UTF32_Z9.so.

This way, the s390-Makefile does not need to override the recipe for gconv-modules
and no warning is emitted anymore.
Since we no longer support empty objpfx the conditional test in iconvdata/Makefile
is removed.

ChangeLog:

	* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules):
	Install file from $(objpfx)gconv-modules.
	($(objpfx)gconv-modules): Concatenate architecture specific file
	in variable sysdeps-gconv-modules and gconv-modules in src dir.
	* sysdeps/s390/gconv-modules: New file.
	* sysdeps/s390/s390-64/Makefile: ($(inst_gconvdir)/gconv-modules):
	Deleted.
	($(objpfx)gconv-modules-s390): Deleted.
	(sysdeps-gconv-modules): New variable.
2016-05-25 17:18:04 +02:00
Joseph Myers
5ff81530dd Do not raise "inexact" from x86_64 SSE4.1 ceil, floor (bug 15479).
Continuing fixes for ceil and floor functions not to raise the
"inexact" exception, this patch fixes the x86_64 SSE4.1 versions.  The
roundss / roundsd instructions take an immediate operand that
determines the rounding mode and whether to raise "inexact"; this just
needs bit 3 set to disable "inexact", which this patch does.
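
The immediate can be worked out as below.  The low two bits select the rounding mode in the instruction's encoding; the enum and function names are hypothetical, chosen for the example.

```c
#include <assert.h>

/* Rounding-mode field (bits 1:0) of the roundss / roundsd immediate.  */
enum round_mode
{
  ROUND_NEAREST = 0,
  ROUND_DOWN = 1,               /* floor */
  ROUND_UP = 2,                 /* ceil */
  ROUND_TOWARD_ZERO = 3         /* trunc */
};

/* Bit 3 of the immediate suppresses the "inexact" exception.  */
#define ROUND_SUPPRESS_INEXACT 0x8u

unsigned int
round_immediate (enum round_mode mode)
{
  return (unsigned int) mode | ROUND_SUPPRESS_INEXACT;
}
```

With bit 3 set, floor uses operand 9, ceil uses operand 10 and trunc would use operand 11.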

Remark: we don't have an SSE4.1 version of trunc / truncf (using this
instruction with operand 11); I'd expect one to make sense, but of
course it should be benchmarked against the existing C code.  I'll
file a bug in Bugzilla for the lack of such a version.

Tested for x86_64.

	[BZ #15479]
	* sysdeps/x86_64/fpu/multiarch/s_ceil.S (__ceil_sse41): Set bit 3
	of immediate operand to rounding instruction.
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.S (__ceilf_sse41):
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floor.S (__floor_sse41):
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf_sse41):
	Likewise.
2016-05-24 21:11:18 +00:00
Joseph Myers
078d1cf8ac Do not raise "inexact" from generic round (bug 15479).
C99 and C11 allow but do not require ceil, floor, round and trunc to
raise the "inexact" exception for noninteger arguments.  TS 18661-1
requires that this exception not be raised by these functions.  This
aligns them with general IEEE semantics, where "inexact" is only
raised if the final step of rounding the infinite-precision result to
the result type is inexact; for these functions, the
infinite-precision integer result is always representable in the
result type, so "inexact" should never be raised.

The generic implementations of ceil, floor and round functions contain
code to force "inexact" to be raised.  This patch removes it for round
functions to align them with TS 18661-1 in this regard.  The tests
*are* updated by this patch; there are fewer architecture-specific
versions than for ceil and floor, and I fixed the powerpc ones some
time ago.  If any others still have the issue, as shown by tests for
round failing with spurious exceptions, they can be fixed separately
by architecture maintainers or others.

Tested for x86_64, x86 and mips64.

	[BZ #15479]
	* sysdeps/ieee754/dbl-64/s_round.c (huge): Remove variable.
	(__round): Do not force "inexact" exception.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_round.c (huge): Remove
	variable.
	(__round): Do not force "inexact" exception.
	* sysdeps/ieee754/flt-32/s_roundf.c (huge): Remove variable.
	(__roundf): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-128/s_roundl.c (huge): Remove variable.
	(__roundl): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-96/s_roundl.c (huge): Remove variable.
	(__roundl): Do not force "inexact" exception.
	* math/libm-test.inc (round_test_data): Do not allow spurious
	"inexact" exceptions.
2016-05-24 17:46:55 +00:00
Joseph Myers
876c5bd30c Do not raise "inexact" from generic floor (bug 15479).
C99 and C11 allow but do not require ceil, floor, round and trunc to
raise the "inexact" exception for noninteger arguments.  TS 18661-1
requires that this exception not be raised by these functions.  This
aligns them with general IEEE semantics, where "inexact" is only
raised if the final step of rounding the infinite-precision result to
the result type is inexact; for these functions, the
infinite-precision integer result is always representable in the
result type, so "inexact" should never be raised.

The generic implementations of ceil, floor and round functions contain
code to force "inexact" to be raised.  This patch removes it for floor
functions to align them with TS 18661-1 in this regard.  Note that
some architecture-specific versions may still raise "inexact", so the
tests are not updated and the bug is not yet fixed.
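
As an illustration of a floor that cannot raise "inexact" for finite input, here is a bit-level sketch for float.  It is written for this note and is not the code in s_floorf.c; floorf_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Finite inputs are handled with integer operations only, so no
   floating-point exception can be raised for them.  */
float
floorf_sketch (float x)
{
  uint32_t bits;
  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &x, sizeof bits);

  int exponent = (int) ((bits >> 23) & 0xff) - 127;  /* unbiased exponent */
  int negative = (int) (bits >> 31);

  if (exponent == 128)
    return x + x;               /* Inf or NaN; a signaling NaN raises "invalid" */
  if (exponent >= 23)
    return x;                   /* no fractional bits left */
  if (exponent < 0)
    {
      if ((bits & 0x7fffffffu) == 0)
        return x;               /* +0 or -0 */
      return negative ? -1.0f : 0.0f;
    }

  uint32_t frac_mask = 0x007fffffu >> exponent;
  if ((bits & frac_mask) == 0)
    return x;                   /* already an integer */
  if (negative)
    bits += 0x00800000u >> exponent;  /* step one integer toward -Inf */
  bits &= ~frac_mask;           /* clear the fractional bits */
  memcpy (&x, &bits, sizeof x);
  return x;
}
```

There is no statement of the form "huge + x" in this sketch, which is the kind of expression that forces "inexact" to be raised.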

Tested for x86_64, x86 and mips64.

	[BZ #15479]
	* sysdeps/ieee754/dbl-64/s_floor.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__floor): Do not force "inexact" exception.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Do not mention
	"inexact" exception in comment.
	(huge): Remove variable.
	(__floor): Do not force "inexact" exception.
	* sysdeps/ieee754/flt-32/s_floorf.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__floorf): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-128/s_floorl.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__floorl): Do not force "inexact" exception.
2016-05-24 17:44:46 +00:00
Joseph Myers
ac2cc6f021 Do not raise "inexact" from generic ceil (bug 15479).
C99 and C11 allow but do not require ceil, floor, round and trunc to
raise the "inexact" exception for noninteger arguments.  TS 18661-1
requires that this exception not be raised by these functions.  This
aligns them with general IEEE semantics, where "inexact" is only
raised if the final step of rounding the infinite-precision result to
the result type is inexact; for these functions, the
infinite-precision integer result is always representable in the
result type, so "inexact" should never be raised.

The generic implementations of ceil, floor and round functions contain
code to force "inexact" to be raised.  This patch removes it for ceil
functions to align them with TS 18661-1 in this regard.  Note that
some architecture-specific versions may still raise "inexact", so the
tests are not updated and the bug is not yet fixed.

Tested for x86_64, x86 and mips64.

	[BZ #15479]
	* sysdeps/ieee754/dbl-64/s_ceil.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__ceil): Do not force "inexact" exception.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Do not mention
	"inexact" exception in comment.
	(huge): Remove variable.
	(__ceil): Do not force "inexact" exception.
	* sysdeps/ieee754/flt-32/s_ceilf.c (huge): Remove variable.
	(__ceilf): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-128/s_ceill.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__ceill): Do not force "inexact" exception.
2016-05-24 17:42:10 +00:00
H.J. Lu
6901def689 Avoid an extra branch to PLT for -z now
When --enable-bind-now is used to configure glibc build, we can avoid
an extra branch to the PLT entry by using indirect branch via the GOT
slot instead, which is similar to the first instruction in the PLT
entry.  Changes in the shared library sizes in text sections:

Shared library    Before (bytes)   After (bytes)
libm.so             1060813          1060797
libmvec.so           160881           160805
libpthread.so         94992            94984
librt.so              25064            25048

	* config.h.in (BIND_NOW): New.
	* configure.ac (BIND_NOW): New.  Defined for --enable-bind-now.
	* configure: Regenerated.
	* sysdeps/x86_64/sysdep.h (JUMPTARGET)[BIND_NOW]: Defined to
	indirect branch via the GOT slot.
2016-05-24 08:44:23 -07:00
Stefan Liebler
4c01126896 S390: Implement mempcpy with help of memcpy. [BZ #19765]
There exist optimized memcpy functions on s390, but no optimized mempcpy.
This patch adds mempcpy entry points in memcpy.S files, which
use the memcpy implementation. Now mempcpy itself is also an IFUNC function
as memcpy is and the variants are listed in ifunc-impl-list.c.

The s390 string.h does not define _HAVE_STRING_ARCH_mempcpy.
Instead the mempcpy macro in string/string.h inlines memcpy() + n.
If n is constant and small enough, GCC emits instructions like mvi or mvc
and avoids the function call to memcpy.
If n is not constant, then memcpy is called and n is added afterwards.
If _HAVE_STRING_ARCH_mempcpy would be defined, mempcpy would be called in
every case.
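
The "memcpy() + n" form amounts to the following.  mempcpy_sketch is a hypothetical name used so that the example does not clash with the real mempcpy.

```c
#include <assert.h>
#include <string.h>

/* Copy n bytes and return a pointer just past the last byte written.
   memcpy returns dst, so the result is dst + n.  */
void *
mempcpy_sketch (void *dst, const void *src, size_t n)
{
  return (char *) memcpy (dst, src, n) + n;
}
```

With a small constant n the compiler can expand the memcpy call inline and only the pointer addition remains.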

According to PR70140 "Inefficient expansion of __builtin_mempcpy"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70140) GCC should handle a
call to mempcpy in the same way as memcpy. Then either the mempcpy macro
in string/string.h has to be removed or _HAVE_STRING_ARCH_mempcpy has to
be defined for S390.

ChangeLog:

	[BZ #19765]
	* sysdeps/s390/mempcpy.S: New File.
	* sysdeps/s390/multiarch/mempcpy.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add mempcpy.
	* sysdeps/s390/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
	Add mempcpy variants.
	* sysdeps/s390/s390-32/memcpy.S: Add mempcpy entry point.
	(memcpy): Adjust to be usable from mempcpy entry point.
	(__memcpy_mvcle): Likewise.
	* sysdeps/s390/s390-64/memcpy.S: Likewise.
	* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Add entry points
	____mempcpy_z196, ____mempcpy_z10 and add __GI_ symbols for mempcpy.
	(__memcpy_z196): Adjust to be usable from mempcpy entry point.
	(__memcpy_z10): Likewise.
	* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Likewise.
2016-05-24 10:39:13 +02:00
Stefan Liebler
7165583255 S390: Do not call memcpy, memcmp, memset within libc.so via ifunc-plt.
On s390, the memcpy, memcmp, memset functions are IFUNC symbols,
which are created with s390_libc_ifunc-macro.
This macro creates a __GI_ symbol which is set to the
ifunced symbol. Thus calls within libc.so to e.g. memcpy
result in a call to *ABS*+0x954c0@plt stub and afterwards
to the resolved memcpy-ifunc-variant.

This patch sets the __GI_ symbol to the default-ifunc-variant
to avoid the plt call. The __GI_ symbols are now created at the
default variant of ifunced function.

ChangeLog:

	* sysdeps/s390/multiarch/ifunc-resolve.h (s390_libc_ifunc):
	Remove __GI_ symbol.
	* sysdeps/s390/s390-32/multiarch/memcmp-s390.S: Add __GI_memcmp symbol.
	* sysdeps/s390/s390-64/multiarch/memcmp-s390x.S: Likewise.
	* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Add __GI_memcpy symbol.
	* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Likewise.
	* sysdeps/s390/s390-32/multiarch/memset-s390.S: Add __GI_memset symbol.
	* sysdeps/s390/s390-64/multiarch/memset-s390x.S: Likewise.
2016-05-24 10:39:13 +02:00
Stefan Liebler
074b0f27d9 S390: Use 64bit instruction to check for copies of > 1MB with mvcle.
The __memcpy_default variant on s390 64bit calculates the number of
256byte blocks in a 64bit register and checks, if they exceed 1MB
to jump to mvcle. Otherwise a mvc-loop is used. The compare-instruction
only checks a 32bit value.
This patch uses a 64bit compare.
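
Why the width of the compare matters can be shown in C.  This is a sketch, not a translation of the assembly: the threshold of 4096 blocks (1MB / 256) and the function names are assumptions, and unsigned truncation is used in place of the signed 32bit compare to keep the C well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCKS_PER_MB 4096u     /* 1MB / 256 bytes; assumed threshold */

/* Looks only at the low 32 bits of the block count.  */
bool
exceeds_1mb_32bit (uint64_t len)
{
  uint64_t blocks = len >> 8;
  return (uint32_t) blocks > BLOCKS_PER_MB;
}

/* Looks at the whole 64bit block count.  */
bool
exceeds_1mb_64bit (uint64_t len)
{
  uint64_t blocks = len >> 8;
  return blocks > BLOCKS_PER_MB;
}
```

For a length of 2^40 + 256 the block count is 2^32 + 1, so the 32bit check sees 1 and wrongly stays on the mvc-loop path.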

ChangeLog:

	* sysdeps/s390/s390-64/memcpy.S (memcpy):
	Use cghi instead of chi to compare 64bit value.
2016-05-24 10:39:13 +02:00
Stefan Liebler
04bb21ac93 S390: Use mvcle for copies > 1MB on 32bit with default memcpy variant.
If more than 255 bytes should be copied, the algorithm jumps away.
Before this patch, it jumps to the mvc-loop (.L_G5_12).
Now it jumps first to the "> 1MB" check, which jumps away to
__memcpy_mvcle. Otherwise, the mvc-loop (.L_G5_12) copies the bytes.

ChangeLog:

	* sysdeps/s390/s390-32/memcpy.S (memcpy):
	Jump to 1MB check before executing mvc-loop.
2016-05-24 10:39:13 +02:00
Florian Weimer
3375cfafa7 Make padding in struct sockaddr_storage explicit [BZ #20111]
This avoids aliasing issues with GCC 6 in -fno-strict-aliasing
mode.  (With implicit padding, not all data is copied.)

This change makes it explicit that struct sockaddr_storage is
only 126 bytes large on m68k (unlike elsewhere, where we end up
with the requested 128 bytes).  The new test case makes sure that
this does not happen on other architectures.
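
One way to spell out padding explicitly is sketched below.  This is a hypothetical stand-in, not necessarily the layout used by the patch; the struct, its member names and STORAGE_SIZE are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define STORAGE_SIZE 128

/* Every byte between the family field and the alignment member is a
   named array element, so the compiler has no implicit padding to
   skip when the struct is copied.  */
struct storage_sketch
{
  unsigned short family;
  char padding[STORAGE_SIZE - sizeof (unsigned short)
               - sizeof (unsigned long)];
  unsigned long align;          /* forces the desired alignment */
};
```

On common ABIs sizeof (struct storage_sketch) is 128; a test can assert that, in the spirit of the new test case.
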
2016-05-23 19:43:09 +02:00
Joseph Myers
f9b437d5ef Update sysdeps/unix/sysv/linux/bits/socket.h for Linux 4.6.
This patch updates sysdeps/unix/sysv/linux/bits/socket.h for new
constants added in Linux 4.6.  AF_KCM / PF_KCM are added.  SOL_KCM is
new, and I added a lot of SOL_* values postdating the last one present
in the header, since I saw no apparent reason for the set in glibc to
stop at SOL_IRDA.  MSG_BATCH is added; Linux also has
MSG_SENDPAGE_NOTLAST which is not in glibc, but given the comment
starts "sendpage() internal" I presume it's correct for it not to be
in glibc.

(Note that this is a case where the Linux kernel header with userspace
relevant values is *not* a uapi header but include/linux/socket.h - I
don't know why, but at least this header, as well as uapi headers,
needs reviewing for glibc-relevant changes each release.)

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).

	* sysdeps/unix/sysv/linux/bits/socket.h (PF_KCM): New macro.
	(PF_MAX): Update value.
	(AF_KCM): New macro.
	(SOL_NETBEUI): Likewise.
	(SOL_LLC): Likewise.
	(SOL_DCCP): Likewise.
	(SOL_NETLINK): Likewise.
	(SOL_TIPC): Likewise.
	(SOL_RXRPC): Likewise.
	(SOL_PPPOL2TP): Likewise.
	(SOL_BLUETOOTH): Likewise.
	(SOL_PNPIPE): Likewise.
	(SOL_RDS): Likewise.
	(SOL_IUCV): Likewise.
	(SOL_CAIF): Likewise.
	(SOL_ALG): Likewise.
	(SOL_NFC): Likewise.
	(SOL_KCM): Likewise.
	(MSG_BATCH): New enum value and macro.
2016-05-23 13:27:37 +00:00
H.J. Lu
b7598b1b85 Remove special L2 cache case for Knights Landing
L2 cache is shared by 2 cores on Knights Landing, which has 4 threads
per core:

https://en.wikipedia.org/wiki/Xeon_Phi#Knights_Landing

So L2 cache is shared by 8 threads on Knights Landing as reported by
CPUID.  We should remove special L2 cache case for Knights Landing.

	[BZ #18185]
	* sysdeps/x86/cacheinfo.c (init_cacheinfo): Don't limit threads
	sharing L2 cache to 2 for Knights Landing.
2016-05-20 14:42:00 -07:00
Joseph Myers
ffe9aaf2b9 Implement proper fmal for ldbl-128ibm (bug 13304).
ldbl-128ibm had an implementation of fmal that just did (x * y) + z in
most cases, with no attempt at actually being a fused operation.

This patch replaces it with a genuine fused operation.  It is not
necessarily correctly rounding, but should produce a result at least
as accurate as the long double arithmetic operations in libgcc, which
I think is all that can reasonably be expected for such a non-IEEE
format where arithmetic is approximate rather than rounded according
to any particular rule for determining the exact result.  Like the
libgcc arithmetic, it may produce spurious overflow and underflow
results, and it falls back to the libgcc multiplication in the case of
(finite, finite, zero).

This concludes the fixes for bug 13304; any subsequently found fma
issues should go in separate Bugzilla bugs.  Various other pieces of
bug 13304 were fixed in past releases over the past several years.

Tested for powerpc.

	[BZ #13304]
	* sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Include <fenv.h>,
	<float.h>, <math_private.h> and <stdlib.h>.
	(add_split): New function.
	(mul_split): Likewise.
	(ext_val): New typedef.
	(store_ext_val): New function.
	(mul_ext_val): New function.
	(compare): New function.
	(add_split_ext): New function.
	(__fmal): After checking for Inf, NaN and zero, compute result as
	an exact sum of scaled double values in round-to-nearest before
	adding those up and adjusting for other rounding modes.
	* math/auto-libm-test-in: Remove xfail-rounding:ldbl-128ibm from
	tests of fma.
	* math/auto-libm-test-out: Regenerated.
2016-05-19 20:10:56 +00:00
H.J. Lu
de71e0421b Correct Intel processor level type mask from CPUID
Intel CPUID with EAX == 11 returns:

ECX Bits 07 - 00: Level number. Same value in ECX input.
    Bits 15 - 08: Level type.
    ^^^^^^^^^^^^^^^^^^^^^^^^ This is level type.
    Bits 31 - 16: Reserved.

Intel processor level type mask should be 0xff00, not 0xff0.
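
Extracting both fields from ECX then looks like this.  The function names are hypothetical and the code is a sketch, not the code in cacheinfo.c.

```c
#include <assert.h>
#include <stdint.h>

/* ECX layout for CPUID with EAX == 11, as quoted above.  */
#define LEVEL_NUMBER_MASK 0x00ffu       /* bits 07 - 00 */
#define LEVEL_TYPE_MASK   0xff00u       /* bits 15 - 08 */

unsigned int
cpuid11_level_number (uint32_t ecx)
{
  return ecx & LEVEL_NUMBER_MASK;
}

unsigned int
cpuid11_level_type (uint32_t ecx)
{
  return (ecx & LEVEL_TYPE_MASK) >> 8;
}
```

A mask of 0xff0 would mix the upper half of the level number with only the lower half of the level type.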

	[BZ #20119]
	* sysdeps/x86/cacheinfo.c (init_cacheinfo): Correct Intel
	processor level type mask for CPUID with EAX == 11.
2016-05-19 10:02:36 -07:00
H.J. Lu
7c08d791ee Check the HTT bit before counting logical threads
Skip counting logical threads for Intel processors if the HTT bit is 0
which indicates there is only a single logical processor.
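
The check itself is a single bit test.  HTT is bit 28 of EDX for CPUID with EAX == 1; the macro and function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* HTT flag: bit 28 of EDX returned by CPUID with EAX == 1.  */
#define CPUID1_EDX_HTT (UINT32_C (1) << 28)

/* Logical threads only need counting when the HTT bit is set.  */
bool
should_count_logical_threads (uint32_t cpuid1_edx)
{
  return (cpuid1_edx & CPUID1_EDX_HTT) != 0;
}
```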

	* sysdeps/x86/cacheinfo.c (init_cacheinfo): Skip counting
	logical threads if the HTT bit is 0.
	* sysdeps/x86/cpu-features.h (bit_cpu_HTT): New.
	(index_cpu_HTT): Likewise.
	(reg_HTT): Likewise.
2016-05-19 09:09:00 -07:00
H.J. Lu
eb2c88c7c8 Remove alignments on jump targets in memset
X86-64 memset-vec-unaligned-erms.S aligns many jump targets, which
increases code size but does not necessarily improve performance.  The
memset benchtest data of align vs no align on various Intel and AMD
processors

https://sourceware.org/bugzilla/attachment.cgi?id=9277

shows that aligning jump targets isn't necessary.

	[BZ #20115]
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S (__memset):
	Remove alignments on jump targets.
2016-05-19 08:49:55 -07:00
H.J. Lu
16cd2b35c2 Don't call internal _Unwind_Resume via PLT
There is no need to call the internal function, _Unwind_Resume, which
is defined in unwind-forcedunwind.c, via PLT.

	* sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
	(__condvar_cleanup2): Remove JUMPTARGET from  _Unwind_Resume
	call.
	* sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
	(__condvar_cleanup1): Likewise.
2016-05-18 13:43:26 -07:00
H.J. Lu
d29261db22 Don't call internal __pthread_unwind via PLT
Add PTHREAD_UNWIND to replace JUMPTARGET(__pthread_unwind) and define
it to __GI___pthread_unwind within libpthread.

	* sysdeps/unix/sysv/linux/x86_64/cancellation.S (PTHREAD_UNWIND):
	New
	(__pthread_unwind): Renamed to ...
	(PTHREAD_UNWIND): This.
	(__pthread_enable_asynccancel): Replace
	JUMPTARGET(__pthread_unwind) with PTHREAD_UNWIND.
2016-05-18 13:41:55 -07:00
Joseph Myers
48526672b6 Add CLONE_NEWCGROUP from Linux 4.6 to bits/sched.h.
This patch adds CLONE_NEWCGROUP, new in Linux 4.6, to
sysdeps/unix/sysv/linux/bits/sched.h.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).

	* sysdeps/unix/sysv/linux/bits/sched.h [__USE_GNU]
	(CLONE_NEWCGROUP): New macro.
2016-05-18 17:46:52 +00:00
Joseph Myers
2a1aa52824 Add Q_GETNEXTQUOTA from Linux 4.6 to sys/quota.h.
This patch adds Q_GETNEXTQUOTA, new in Linux 4.6, to
sysdeps/unix/sysv/linux/sys/quota.h.

Tested for x86_64 and x86 (testsuite, and that installed shared
libraries are unchanged by the patch).

	* sysdeps/unix/sysv/linux/sys/quota.h [_LINUX_QUOTA_VERSION >= 2]
	(Q_GETNEXTQUOTA): New macro.
2016-05-18 13:15:11 +00:00
H.J. Lu
4facca0b0e Call init_cpu_features only if SHARED is defined
In a static executable, since init_cpu_features is called early from
__libc_start_main, there is no need to call it again in dl_platform_init.

	[BZ #20072]
	* sysdeps/i386/dl-machine.h (dl_platform_init): Call
	init_cpu_features only if SHARED is defined.
	* sysdeps/x86_64/dl-machine.h (dl_platform_init): Likewise.
2016-05-13 08:29:33 -07:00
H.J. Lu
9e4ec3e816 Support non-inclusive caches on Intel processors
	* sysdeps/x86/cacheinfo.c (init_cacheinfo): Check and support
	non-inclusive caches on Intel processors.
2016-05-13 07:18:35 -07:00
Wilco Dijkstra
a8c5a2a952 This is an optimized memset for AArch64. Memset is split into 4 main cases:
small sets of up to 16 bytes, medium of 16..96 bytes which are fully unrolled.
Large memsets of more than 96 bytes align the destination and use an unrolled
loop processing 64 bytes per iteration.  Memsets of zero of more than 256 use
the dc zva instruction, and there are faster versions for the common ZVA sizes
64 or 128.  STP of Q registers is used to reduce codesize without loss of
performance.
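
The size classes can be summarised as a small classifier.  This is a sketch of the description only, not of memset.S: the enum and function names are hypothetical, and treating exactly 16 bytes as "small" is an assumption, since the text gives 16 for both ranges.

```c
#include <assert.h>
#include <stddef.h>

enum memset_path
{
  PATH_SMALL,                   /* up to 16 bytes */
  PATH_MEDIUM,                  /* up to 96 bytes, fully unrolled */
  PATH_LARGE,                   /* aligned loop, 64 bytes per iteration */
  PATH_ZVA                      /* zeroing more than 256 bytes with dc zva */
};

enum memset_path
classify_memset (size_t n, int value)
{
  if (n <= 16)
    return PATH_SMALL;
  if (n <= 96)
    return PATH_MEDIUM;
  if ((unsigned char) value == 0 && n > 256)
    return PATH_ZVA;
  return PATH_LARGE;
}
```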

The speedup on test-memset is 1% on Cortex-A57 and 8% on Cortex-A53.

	* sysdeps/aarch64/memset.S (__memset):
	Rewrite of optimized memset.
2016-05-12 16:44:53 +01:00
Florian Weimer
56290d6e76 Increase fork signal safety for single-threaded processes [BZ #19703]
This provides a band-aid and addresses the scenario where fork is
called from a signal handler while the process is in the malloc
subsystem (or has acquired the libio list lock).  It does not
address the general issue of async-signal-safety of fork;
multi-threaded processes are not covered, and some glibc
subsystems have fork handlers which are not async-signal-safe.
2016-05-12 15:26:55 +02:00
Florian Weimer
cd065b6843 getaddrinfo: Convert from extend_alloca to struct scratch_buffer
2016-05-12 14:07:56 +02:00
Stefan Liebler
c64a10e544 S390: Use fPIC to avoid R_390_GOT12 relocation in gcrt1.o.
If glibc is built with -march=z900 | -march=z990,
the startup file gcrt1.o (used if you link with gcc -pg)
contains R_390_GOT12 | R_390_GOT20 relocations.
Thus, an entry in the GOT can be addressed relative to the GOT pointer
with a 12 | 20 bit displacement value.
The startup files should not contain R_390_GOT12,
R_390_GOT20 relocations, but R_390_GOTENT ones.

This patch removes the overrides of pic-ccflag and
the default pic-ccflag = -fPIC in Makeconfig
is used instead to get the R_390_GOTENT relocations in gcrt1.o.

ChangeLog:

	* sysdeps/s390/s390-32/Makefile (pic-ccflag): Remove.
	* sysdeps/s390/s390-64/Makefile: Likewise.
2016-05-11 15:51:25 +02:00
H.J. Lu
2a1f15b1a9 Remove x86 ifunc-defines.sym and rtld-global-offsets.sym
Merge x86 ifunc-defines.sym with x86 cpu-features-offsets.sym.  Remove
x86 ifunc-defines.sym and rtld-global-offsets.sym.  No code changes on
i686 and x86-64.

	* sysdeps/i386/i686/multiarch/Makefile (gen-as-const-headers):
	Remove ifunc-defines.sym.
	* sysdeps/x86_64/multiarch/Makefile (gen-as-const-headers):
	Likewise.
	* sysdeps/i386/i686/multiarch/ifunc-defines.sym: Removed.
	* sysdeps/x86/rtld-global-offsets.sym: Likewise.
	* sysdeps/x86_64/multiarch/ifunc-defines.sym: Likewise.
	* sysdeps/x86/Makefile (gen-as-const-headers): Remove
	rtld-global-offsets.sym.
	* sysdeps/x86_64/multiarch/ifunc-defines.sym: Merged with ...
	* sysdeps/x86/cpu-features-offsets.sym: This.
	* sysdeps/x86/cpu-features.h: Include <cpu-features-offsets.h>
	instead of <ifunc-defines.h> and <rtld-global-offsets.h>.
2016-05-11 05:51:39 -07:00
Florian Weimer
8db2cf163e getaddrinfo: Restore RES_USE_INET6 flag on error path [BZ #19994] 2016-05-10 10:09:24 +02:00
Stefan Liebler
b91a333ecb S390: Add support for vdso getcpu symbol.
This patch adds support for symbol __kernel_getcpu in vDSO,
which is available with kernel 4.5.
Now sched_getcpu is using this symbol if available in mapped vDSO
by defining macro HAVE_GETCPU_VSYSCALL. If not available at runtime,
the former syscall is used.
2016-05-09 11:05:45 +02:00
H.J. Lu
a9558b49b3 Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86
Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86.  No code changes on x86
and x86_64.

	* sysdeps/i386/cacheinfo.c: Include <sysdeps/x86/cacheinfo.c>
	instead of <sysdeps/x86_64/cacheinfo.c>.
	* sysdeps/x86_64/cacheinfo.c: Moved to ...
	* sysdeps/x86/cacheinfo.c: Here.
2016-05-08 08:49:18 -07:00
Samuel Thibault
04794f3e7e Revert "aio: fix newp->running data race"
This reverts commit fd67a9cf7b.
2016-05-04 15:52:30 +02:00
Samuel Thibault
fd67a9cf7b aio: fix newp->running data race
* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
	`running` field of `newp` when a thread was started to process it,
	since that thread will not take `__aio_requests_mutex`, and the field
	already has the proper value actually.
2016-05-04 15:14:29 +02:00
Gabriel F. T. Gomes
eb3b8a4924 powerpc: Fix operand prefixes
The file sysdeps/powerpc/sysdeps.h defines aliases for condition register
operands.  E.g.: 'cr7' means condition register 7.  On the one hand, this
increases readability, as it makes it easier for readers to know whether the
operand is a condition register, a general purpose register or an immediate.
On the other hand, this permits that condition registers be written as if they
were general purpose, and vice-versa, thus reducing the readability of the
code.

This commit removes some of these unintentional misuses.

The changes have no effect on the final code.  Checked with objdump.
2016-05-04 09:14:52 -03:00
Florian Weimer
5171f3079f CVE-2016-1234: glob: Do not copy d_name field of struct dirent [BZ #19779]
Instead, we store the data we need from the return value of
readdir in an object of the new type struct readdir_result.
This type is independent of the layout of struct dirent.
2016-05-04 12:09:35 +02:00
Paul E. Murphy
cbc06bc486 powerpc: Add missing insn in swapcontext [BZ #20004]
A missing instruction was discovered in the compat version of
swapcontext while running the GCC test suite.
2016-05-03 10:45:51 -05:00
Adhemerval Zanella
230528c467 powerpc: Fix clone CLONE_VM compare
This patch fixes the clone CLONE_VM change from 0cb313f (BZ#19957)
where the commit changed the register that contains the save flags
argument to compare with (from r28 to r29).  This patch changes
back to the correct register.

Tested on powerpc32 (thanks to Tulio Magno Quites Machado Filho).

	* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone): Fix
	flags CLONE_VM compare.
2016-05-02 17:44:00 -03:00
Andreas Schwab
8a9ea3ccc5 m68k: use large PIC model for gcrt1.o 2016-04-30 18:51:43 +02:00
Andreas Schwab
4816d802ff m68k: avoid local labels in symbol table 2016-04-30 18:50:39 +02:00
Adhemerval Zanella
0cb313f7cb Fix clone (CLONE_VM) pid/tid reset (BZ#19957)
As discussed in libc-alpha [1] current clone with CLONE_VM (without
CLONE_THREAD set) will reset the pthread pid/tid fields to -1.  The
issue is since memory is shared between the parent and child it will
clobber parent's cached pid/tid leading to internal inconsistencies
if the value is not restored.

And even if it is restored, it may lead to race conditions: between
set and restore, a thread might invoke a pthread function that validates
the pthread with INVALID_TD_P/INVALID_NOT_TERMINATED_TD_P and thus get
wrong results.

As stated in BZ19957, previous reports of this behaviour were closed
as WONTFIX because using clone outside glibc is tricky: glibc
requires a consistent internal pthread state, while using clone
directly may not provide it.  However, since posix_spawn now uses
clone (CLONE_VM) to fix various issues related to the previous vfork
usage, this issue requires fixing.

The vfork implementation also does something similar, but instead
it negates and restores only the *pid* field and functions that
might access its value know how to handle such a case (getpid, raise
and the pthread ones that use the INVALID_TD_P/INVALID_NOT_TERMINATED_TD_P
macros, which check only the *tid* field).  Also vfork does not call
__clone directly, instead calling either __NR_vfork or __NR_clone
directly.

So this patch removes this clone behavior by avoiding setting
the pthread pid/tid field for CLONE_VM. There is no need to
check for CLONE_THREAD, since the minimum supported kernel on all
architectures requires that CLONE_THREAD be used together with CLONE_VM;
otherwise clone returns EINVAL.

Instead of current approach of:

   int clone(int (*fn)(void *), void *child_stack, int flags, ...)
      [...]
      if (flags & CLONE_THREAD)
        goto do_syscall;
      pid_t new_value;
      if (flags & CLONE_VM)
        new_value = -1;
      else
        new_value = getpid ();
      THREAD_SETMEM (THREAD_SELF, pid, new_value);
      THREAD_SETMEM (THREAD_SELF, tid, new_value);

    do_syscall:
      [...]

The new approach uses:

   int clone(int (*fn)(void *), void *child_stack, int flags, ...)
      [...]
      if (flags & CLONE_VM)
        goto do_syscall;
      pid_t new_value = getpid ();
      THREAD_SETMEM (THREAD_SELF, pid, new_value);
      THREAD_SETMEM (THREAD_SELF, tid, new_value);

    do_syscall:
      [...]

It also removes the Linux tst-getpid2.c test, which expects the previous
behavior, and adds another clone test in its place.

Tested on x86_64, i686, x32, powerpc64le, aarch64, armhf, s390, and
s390x. I also did limited check on mips32 and sparc64 (using the new
added test).

I also got reviews for m68k, hppa, and tile, so I presume the patch
works on these architectures.

The fixes for alpha, microblaze, sh, ia64, and nios2 have not been
tested.

[1] https://sourceware.org/ml/libc-alpha/2016-04/msg00307.html

	* sysdeps/unix/sysv/linux/Makefile [$(subdir) == nptl] (test): Remove
	tst-getpid2.
	(test): Add tst-clone2.
	* sysdeps/unix/sysv/linux/tst-clone2.c: New file.
	* sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Do not change
	pid/tid fields for CLONE_VM.
	* sysdeps/unix/sysv/linux/arm/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/i386/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/mips/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise.
	* sysdeps/unix/sysv/linux/tst-getpid2.c: Remove file.
2016-04-29 18:19:30 -03:00
Gabriel F. T. Gomes
72c11b353e powerpc: Zero pad using memset in strncpy/stpncpy
Call __memset_power8 to pad, with zeros, the remaining bytes in the
dest string on __strncpy_power8 and __stpncpy_power8.  This improves
performance when n is larger than the input string, giving ~30% gain for
larger strings without impacting much shorter strings.
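
A portable C sketch of the same idea follows: copy the string bytes, then zero the remainder with a single memset call instead of a byte loop. The function name `strncpy_with_memset_pad` is hypothetical, and the actual change is in POWER8 assembly calling __memset_power8.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative strncpy: copy at most N bytes of SRC, then pad the rest
   of DEST with zeros using one memset call.  */
char *
strncpy_with_memset_pad (char *dest, const char *src, size_t n)
{
  /* Length of SRC, capped at N.  */
  const char *end = memchr (src, '\0', n);
  size_t len = end != NULL ? (size_t) (end - src) : n;

  memcpy (dest, src, len);
  memset (dest + len, 0, n - len);   /* zero padding in one call */
  return dest;
}
```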
2016-04-29 10:05:33 -03:00
Florian Weimer
4ab2ab03d4 CVE-2016-3706: getaddrinfo: stack overflow in hostent conversion [BZ #20010]
When converting a struct hostent response to struct gaih_addrtuple, the
gethosts macro (which is called from gaih_inet) used alloca, without
malloc fallback for large responses.  This commit changes this code to
use calloc unconditionally.

This commit also consolidated a second hostent-to-gaih_addrtuple
conversion loop (in gaih_inet) to use the new conversion function.
2016-04-29 10:35:34 +02:00
Stefan Liebler
b06549a5e6 Add missing iucv related defines.
This patch adds the missing SOL_IUCV socket level definition
and socket options SO_IPRMDATA_MSG, SO_MSGLIMIT, SO_MSGSIZE
which can be used with get/setsockopt().
SCM_IUCV_TRGCLS is needed to send/receive ancillary data with send/recvmsg().

The defines are copied from kernel-source:
include/net/iucv/af_iucv.h
include/linux/socket.h
2016-04-27 09:08:29 +02:00
Paul E. Murphy
8f1b841e45 powerpc: Add optimized strcspn for P8
A few minor adjustments to the P8 strspn give us
an almost equally optimized P8 strcspn.
2016-04-25 09:11:02 -05:00
Chung-Lin Tang
a5507dfa60 Fix stdlib/tst-makecontext regression for Nios II 2016-04-25 00:08:17 -07:00
Rajalakshmi Srinivasaraghavan
e413b14e18 powerpc: strcasestr optimization for power8
This patch optimizes the strcasestr function for power >= 8 systems.  The
optimized version compares 16 bytes at a time using vector instructions, with
an average improvement of ~40%.  This patch is tested on powerpc64 and powerpc64le.
2016-04-22 19:23:13 +05:30
Samuel Thibault
6f8222a1c5 Fix gprof timing
* sysdeps/mach/hurd/profil.c (__profile_frequency): Return tick
	frequency instead of tick length in us.
2016-04-19 23:27:27 +02:00
Samuel Thibault
593285ac15 hurd: fix profiling short-living processes
* sysdeps/mach/hurd/profil.c (update_waiter): Initialize
	profil_reply_port.
	(profile_waiter): Do not initialize profil_reply_port.
2016-04-19 00:54:24 +02:00
Carlos Eduardo Seo
1b045ee53e powerpc: Optimization for strlen for POWER8.
This implementation takes advantage of vectorization to improve performance of
the loop over the current strlen implementation for POWER7.
2016-04-15 17:19:19 -03:00
H.J. Lu
2e2d9796da Detect Intel Goldmont and Airmont processors
Updated from the model numbers of Goldmont and Airmont processors in
Intel64 And IA-32 Processor Architectures Software Developer's Manual
Volume 3 Revision 058.

	* sysdeps/x86/cpu-features.c (init_cpu_features): Detect Intel
	Goldmont and Airmont processors.
2016-04-15 05:23:06 -07:00
Adhemerval Zanella
41e77f36d4 Fix pread consolidation on ports that require argument alignment
This patch fixes the __ALIGNMENT_{ARG,COUNT} definition for ports that
define __ASSUME_ALIGNED_REGISTER_PAIRS by including the kernel-features.h
(where it is defined when applicable).

This was shown on arm with failing cases:

FAIL: debug/tst-chk1
FAIL: debug/tst-chk2
FAIL: debug/tst-chk3
FAIL: debug/tst-chk4
FAIL: debug/tst-chk5
FAIL: debug/tst-chk6
FAIL: debug/tst-lfschk1
FAIL: debug/tst-lfschk2
FAIL: debug/tst-lfschk3
FAIL: debug/tst-lfschk4
FAIL: debug/tst-lfschk5
FAIL: debug/tst-lfschk6
FAIL: posix/tst-preadwrite
FAIL: posix/tst-preadwrite64

The patch fixes it.  Tested on armhf.

	* sysdeps/unix/sysv/linux/sysdep.h: Include kernel-features.h.
2016-04-14 16:49:40 -03:00
Florian Weimer
ae9e94e744 malloc: Remove unused definitions of thread_atfork, thread_atfork_static 2016-04-14 09:17:36 +02:00
Florian Weimer
29d794863c malloc: Run fork handler as late as possible [BZ #19431]
Previously, a thread M invoking fork would acquire locks in this order:

  (M1) malloc arena locks (in the registered fork handler)
  (M2) libio list lock

A thread F invoking fflush (NULL) would acquire locks in this order:

  (F1) libio list lock
  (F2) individual _IO_FILE locks

A thread G running getdelim would use this order:

  (G1) _IO_FILE lock
  (G2) malloc arena lock

After executing (M1), (F1), (G1), none of the threads can make progress.

This commit changes the fork lock order to:

  (M'1) libio list lock
  (M'2) malloc arena locks

It explicitly encodes the lock order in the implementations of fork,
and does not rely on the registration order, thus avoiding the deadlock.
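
The new ordering can be sketched with three pthread mutexes. The mutexes and function names below are hypothetical stand-ins for the libio list lock, an _IO_FILE lock, and a malloc arena lock; the point is only that every path now acquires locks in the same global order (list, then file, then arena), so no cycle can form.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the three lock classes in the message.  */
pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;    /* rank 1 */
pthread_mutex_t file_lock = PTHREAD_MUTEX_INITIALIZER;    /* rank 2 */
pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;   /* rank 3 */

/* Take FIRST, then SECOND, and release in reverse order.
   Returns 0 on success or the pthread error code.  */
int
with_two_locks (pthread_mutex_t *first, pthread_mutex_t *second)
{
  int err = pthread_mutex_lock (first);
  if (err != 0)
    return err;
  err = pthread_mutex_lock (second);
  if (err == 0)
    {
      /* The critical section would go here.  */
      pthread_mutex_unlock (second);
    }
  pthread_mutex_unlock (first);
  return err;
}

/* Thread M after the change: list lock (M'1), then arena lock (M'2).  */
int fork_path (void) { return with_two_locks (&list_lock, &arena_lock); }

/* Thread F: list lock (F1), then file lock (F2).  */
int flush_all_path (void) { return with_two_locks (&list_lock, &file_lock); }

/* Thread G: file lock (G1), then arena lock (G2).  */
int getdelim_path (void) { return with_two_locks (&file_lock, &arena_lock); }
```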
2016-04-14 09:17:02 +02:00
Florian Weimer
b49ab5f450 Remove union wait [BZ #19613]
The overloading approach in the W* macros was incompatible with
integer expressions of a type different from int.  Applications
using union wait and these macros will have to migrate to the
POSIX-specified int status type.
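
For code that still needs migrating, a minimal sketch of the POSIX-style usage is shown below. The helper name `child_exit_code` is hypothetical; `waitpid`, `WIFEXITED` and `WEXITSTATUS` are the standard POSIX interfaces.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with CODE and return the exit code seen by
   the parent, or -1 on failure.  */
int
child_exit_code (int code)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (code);

  int status;   /* plain int, not union wait */
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```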
2016-04-14 08:54:57 +02:00
Andreas Schwab
b4bcb3aec6 Register extra test objects
This makes sure that the extra test objects are compiled with the correct
MODULE_NAME and dependencies are tracked.
2016-04-13 17:07:13 +02:00
H.J. Lu
a057f5f8cd X86-64: Use non-temporal store in memcpy on large data
The large memcpy micro benchmark in glibc shows that there is a
regression with large data on Haswell machines.  Using non-temporal stores
in memcpy on large data can improve performance significantly.  This
patch adds a threshold for using non-temporal stores, which is 6 times the
shared cache size.  When size is above the threshold, non temporal
store will be used, but avoid non-temporal store if there is overlap
between destination and source since destination may be in cache when
source is loaded.
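
The decision described above can be sketched in C as follows. The helper names are hypothetical and the real logic lives in the assembly files listed below; only the factor of 6 and the "above threshold and no overlap" rule are taken from the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Threshold is 6 times the shared cache size, per the message.  */
size_t
non_temporal_threshold (size_t shared_cache_size)
{
  return shared_cache_size * 6;
}

/* Return nonzero when a copy of SIZE bytes should use non-temporal
   stores: SIZE is above THRESHOLD and [dst, dst+size) does not overlap
   [src, src+size).  */
int
use_non_temporal_store (const void *dst, const void *src, size_t size,
                        size_t threshold)
{
  uintptr_t d = (uintptr_t) dst;
  uintptr_t s = (uintptr_t) src;
  /* The ranges overlap when the distance between them is below SIZE.  */
  int overlap = d < s ? (s - d < size) : (d - s < size);
  return size > threshold && !overlap;
}
```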

For size below 8 vector register width, we load all data into registers
and store them together.  Only forward and backward loops, which move 4
vector registers at a time, are used to support overlapping addresses.
For forward loop, we load the last 4 vector register width of data and
the first vector register width of data into vector registers before the
loop and store them after the loop.  For backward loop, we load the first
4 vector register width of data and the last vector register width of
data into vector registers before the loop and store them after the loop.

	[BZ #19928]
	* sysdeps/x86_64/cacheinfo.c (__x86_shared_non_temporal_threshold):
	New.
	(init_cacheinfo): Set __x86_shared_non_temporal_threshold to 6
	times of shared cache size.
	* sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S
	(VMOVNT): New.
	* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S
	(VMOVNT): Likewise.
	* sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S
	(VMOVNT): Likewise.
	(VMOVU): Changed to movups for smaller code sizes.
	(VMOVA): Changed to movaps for smaller code sizes.
	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Update
	comments.
	(PREFETCH): New.
	(PREFETCH_SIZE): Likewise.
	(PREFETCHED_LOAD_SIZE): Likewise.
	(PREFETCH_ONE_SET): Likewise.
	Rewrite to use forward and backward loops, which move 4 vector
	registers at a time, to support overlapping addresses and use
	non temporal store if size is above the threshold and there is
	no overlap between destination and source.
2016-04-12 08:10:47 -07:00
Matthew Fortune
b39d84adff VDSO support for MIPS
This patch adds support for using the implementations of gettimeofday()
and clock_gettime() provided by the kernel in the VDSO. The VDSO will
always provide clock_gettime() as CLOCK_{REALTIME,MONOTONIC}_COARSE can
be implemented regardless of platform. CLOCK_{REALTIME,MONOTONIC}, along
with gettimeofday(), are only implemented on platforms which make use of
either the CP0 count or GIC as their clocksource. On other platforms,
the VDSO does not provide the __vdso_gettimeofday symbol, as it is
never useful.

The VDSO functions return ENOSYS when they encounter an unsupported
request, in which case glibc should fall back to the standard syscall.
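
A minimal sketch of that fallback pattern is shown below. The function-pointer type, the stub functions, and the convention that the vDSO entry returns a negated errno value (-ENOSYS) are assumptions made for illustration; they are not the glibc macros touched by this patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical signature shared by the vDSO entry and the syscall path.  */
typedef int (*time_fn) (long *out);

/* Try the vDSO entry first; fall back to the syscall path when there is
   no vDSO symbol or when it reports an unsupported request.  */
int
call_with_fallback (time_fn vdso, time_fn syscall_path, long *out)
{
  if (vdso != NULL)
    {
      int ret = vdso (out);
      if (ret != -ENOSYS)   /* assumed error convention */
        return ret;
    }
  return syscall_path (out);
}

/* Stubs for demonstration only.  */
int stub_vdso_ok (long *out) { *out = 7; return 0; }
int stub_vdso_unsupported (long *out) { (void) out; return -ENOSYS; }
int stub_syscall (long *out) { *out = 42; return 0; }
```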

Tested with upstream kernel 4.5 and QEMU emulating Malta.

./vdsotest gettimeofday bench
gettimeofday: syscall: 1021 nsec/call
gettimeofday:    libc: 262 nsec/call
gettimeofday:    vdso: 174 nsec/call

	* sysdeps/unix/sysv/linux/mips/Makefile (sysdep_routines):
	Include dl-vdso.
	* sysdeps/unix/sysv/linux/mips/Versions: Add
	__vdso_clock_gettime.
	* sysdeps/unix/sysv/linux/mips/init-first.c: New file.
	* sysdeps/unix/sysv/linux/mips/libc-vdso.h: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h:
	(INTERNAL_VSYSCALL_CALL): Define to be compatible with MIPS
	definitions of INTERNAL_SYSCALL_{ERROR_P,ERRNO}.
	(HAVE_CLOCK_GETTIME_VSYSCALL): Define.
	(HAVE_GETTIMEOFDAY_VSYSCALL): Define.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h: Likewise.
2016-04-12 11:05:13 +01:00
Adhemerval Zanella
071af4769f Consolidate pwrite/pwrite64 implementations
This patch consolidates all the pwrite/pwrite64 implementations for Linux
into one (sysdeps/unix/sysv/linux/pwrite{64}.c).  It also removes the
syscall from the auto-generation using assembly macros.

For the pwrite{64} offset argument placement, the new SYSCALL_LL{64} macro
is used.  pwrite ports that do not define __NR_pwrite use
__NR_pwrite64, and pwrite64 ports that do not define __NR_pwrite64
use __NR_pwrite for the syscall.

Checked on x86_64, x32, i386, aarch64, and ppc64le.

	* sysdeps/unix/sysv/linux/arm/pwrite.c: Remove file.
	* sysdeps/unix/sysv/linux/arm/pwrite64.c: Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/pwrite.c: Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/pwrite64.c: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/pwrite.c: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/pwrite64.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/pwrite64.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/syscalls.list (pwrite): Remove
	syscalls generation.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
	[__NR_pwrite64] (__NR_write): Remove define.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
	[__NR_pwrite64] (__NR_write): Remove define.
	* sysdeps/unix/sysv/linux/pwrite.c [__NR_pwrite64] (__NR_pwrite):
	Remove define.
	(__libc_pwrite): Use SYSCALL_LL macro on offset argument.
	* sysdeps/unix/sysv/linux/pwrite64.c [__NR_pwrite64] (__NR_pwrite):
	Remove define.
	(__libc_pwrite64): Use SYSCALL_LL64 macro on offset argument.
	* sysdeps/unix/sysv/linux/sh/pwrite.c: Rewrite using default
	Linux implementation as base.
	* sysdeps/unix/sysv/linux/sh/pwrite64.c: Likewise.
	* sysdeps/unix/sysv/linux/mips/pwrite.c: Likewise.
	* sysdeps/unix/sysv/linux/mips/pwrite64.c: Likewise.
2016-04-11 10:08:01 -03:00
Adhemerval Zanella
77a4fbd536 Consolidate pread/pread64 implementations
This patch consolidates all the pread/pread64 implementations for Linux
into one (sysdeps/unix/sysv/linux/pread.c).  It also removes the
syscall from the auto-generation using assembly macros.

For the pread{64} offset argument placement, the new SYSCALL_LL{64} macro
is used.  pread ports that do not define __NR_pread use
__NR_pread64, and pread64 ports that do not define __NR_pread64
use __NR_pread for the syscall.

Checked on x86_64, x32, i386, aarch64, and ppc64le.

	* sysdeps/unix/sysv/linux/arm/pread.c: Remove file.
	* sysdeps/unix/sysv/linux/arm/pread64.c: Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/pread.c: Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/pread64.c: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/pread.c: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/pread64.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/pread64.c: Likewise.
	* sysdeps/unix/sysv/linux/wordsize-64/syscalls.list (pread): Remove
	syscall generation.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
	[__NR_pread64] (__NR_pread): Remove define.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h:
	[__NR_pread64] (__NR_pread): Likewise.
	* sysdeps/unix/sysv/linux/pread.c [__NR_pread64] (__NR_pread): Remove
	define.
	(__libc_pread): Use SYSCALL_LL macro on offset argument.
	* sysdeps/unix/sysv/linux/pread64.c [__NR_pread64] (__NR_pread):
	Remove define.
	(__libc_pread64): Use SYSCALL_LL64 macro on offset argument.
	* sysdeps/unix/sysv/linux/sh/pread.c: Rewrite using default
	Linux implementation as base.
	* sysdeps/unix/sysv/linux/sh/pread64.c: Likewise.
	* sysdeps/unix/sysv/linux/mips/pread.c: Likewise.
	* sysdeps/unix/sysv/linux/mips/pread64.c: Likewise.
2016-04-11 10:08:01 -03:00
Adhemerval Zanella
eeddfa91cb Consolidate off_t/off64_t syscall argument passing
This patch adds three new macros (SYSCALL_LL, SYSCALL_LL64, and
__ASSUME_WORDSIZE64_ILP32) to use along with off_t and off64_t argument
syscalls.  The rationale for this change is:

1. Remove multiple implementations for the same syscall for different
   architectures (for instance, pread has 6 different implementations).

2. Also remove the requirement to use syscall wrappers for cancellable
   entrypoints.

The macros should be used along with __ALIGNMENT_ARG to follow ABI constraints
on architectures where it applies.  For instance, pread can be rewritten as:

  return SYSCALL_CANCEL (pread, fd, buf, count,
                         __ALIGNMENT_ARG SYSCALL_LL (offset));

Another macro, SYSCALL_LL64, is provided for off64_t.  The macro
__ASSUME_WORDSIZE64_ILP32 is defined by an ABI to indicate that it uses
64-bit registers even though the ABI is ILP32 (for instance x32 and mips64-n32).
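
To illustrate why such macros are needed: on a 32-bit ABI a 64-bit offset does not fit in one register and has to be passed as two 32-bit halves. The sketch below shows only that split and its inverse. The `reg_pair` type and function names are hypothetical, and the sketch does not model the register order (which depends on endianness) or the alignment padding handled by __ALIGNMENT_ARG.

```c
#include <stdint.h>

/* Hypothetical pair of 32-bit argument registers.  */
struct reg_pair
{
  uint32_t high;
  uint32_t low;
};

/* Split a 64-bit offset into two 32-bit halves.  */
struct reg_pair
split_offset (int64_t offset)
{
  struct reg_pair p;
  p.high = (uint32_t) ((uint64_t) offset >> 32);
  p.low = (uint32_t) ((uint64_t) offset & 0xffffffffu);
  return p;
}

/* Reassemble the offset, as the receiving side would.  */
int64_t
join_offset (struct reg_pair p)
{
  return (int64_t) (((uint64_t) p.high << 32) | p.low);
}
```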

The changes themselves are not currently used in any implementation, so no
code change is expected.

	* sysdeps/unix/sysv/linux/generic/sysdep.h (__ALIGNMENT_ARG): Move
	definition.
	(__ALIGNMENT_COUNT): Likewise.
	* sysdeps/unix/sysv/linux/sysdep.h (__ALIGNMENT_ARG): To here.
	(__ALIGNMENT_COUNT): Likewise.
	(SYSCALL_LL): New define.
	(SYSCALL_LL64): Likewise.
	* sysdeps/unix/sysv/linux/mips/kernel-features.h:
	[_MIPS_SIM == _ABIO32] (__ASSUME_WORDSIZE64_ILP32): Define.
	* sysdeps/unix/sysv/linux/x86_64/kernel-features.h:
	[ILP32] (__ASSUME_WORDSIZE64_ILP32): Likewise.
2016-04-11 10:07:53 -03:00
Adhemerval Zanella
482b2f87a8 Define __ASSUME_ALIGNED_REGISTER_PAIRS for missing ports
This patch defines __ASSUME_ALIGNED_REGISTER_PAIRS for the missing
ports that require 64-bit value (e.g., long long) to be aligned to
an even register pair in argument passing.

No code change is expected, tested with builds for powerpc32,
mips-o32, and armhf.

	* sysdeps/unix/sysv/linux/arm/kernel-features.h
	(__ASSUME_ALIGNED_REGISTER_PAIRS): Define.
	* sysdeps/unix/sysv/linux/mips/kernel-features.h
	[_MIPS_SIM == _ABIO32] (__ASSUME_ALIGNED_REGISTER_PAIRS): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/kernel-features.h
	[!__powerpc64__] (__ASSUME_ALIGNED_REGISTER_PAIRS): Likewise.
2016-04-11 09:15:11 -03:00
Samuel Thibault
e1ef505659 Fix build with HAVE_AUX_VECTOR
* sysdeps/unix/sysv/linux/ldsodefs.h (HAVE_AUX_VECTOR): Define before
	including <ldsodefs.h>.
	* sysdeps/nacl/ldsodefs.h (HAVE_AUX_VECTOR): Likewise.
2016-04-11 10:27:25 +02:00
Samuel Thibault
0cdc5e930a Fix crash on getauxval call without HAVE_AUX_VECTOR
* sysdeps/generic/ldsodefs.h (struct rtld_global_ro)
	[!HAVE_AUX_VECTOR]: Do not define _dl_auxv field.
	* misc/getauxval.c (__getauxval) [!HAVE_AUX_VECTOR]: Do not go through
	GLRO(dl_auxv) list.
2016-04-10 23:58:43 +02:00
Khem Raj
1a5d01e79e When disabling SSE, make sure -fpmath is not set to use SSE either
This fixes errors when we inject sse options through CFLAGS and now
that we have -Werror turned on by default this warning turns into an
error on x86:

$ gcc -m32 -march=core2 -mtune=core2 -msse3 -mfpmath=sse -x c /dev/null -S -mno-sse -mno-mmx
/dev/null:1:0: warning: SSE instruction set disabled, using 387 arithmetics

Whereas:

$ gcc -m32 -march=core2 -mtune=core2 -msse3 -mfpmath=sse -x c /dev/null -S -mno-sse -mno-mmx -mfpmath=387

Generates no warnings.
2016-04-09 22:14:24 -04:00
Mike Frysinger
b2d4456b33 configure: fix test == usage
POSIX defines the = operator, but not ==.  Fix the few places where we
incorrectly used ==.
2016-04-09 20:05:13 -04:00
Joseph Myers
fb3227b95c Fix termios.h XCASE namespace (bug 19925).
bits/termios.h (various versions under sysdeps/unix/sysv/linux)
defines XCASE if defined __USE_MISC || defined __USE_XOPEN.  This
macro was removed in the 2001 edition of POSIX, and is not otherwise
reserved, so should not be defined for 2001 and later versions of
POSIX.  This patch fixes the conditions accordingly (leaving the macro
defined for __USE_MISC, so still in the default namespace).

Tested for x86_64 and x86 (testsuite, and that installed shared
libraries are unchanged by the patch).

	[BZ #19925]
	* sysdeps/unix/sysv/linux/alpha/bits/termios.h (XCASE): Do not
	define if [!__USE_MISC && __USE_XOPEN2K].
	* sysdeps/unix/sysv/linux/bits/termios.h (XCASE): Likewise.
	* sysdeps/unix/sysv/linux/mips/bits/termios.h (XCASE): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/bits/termios.h (XCASE):
	Likewise.
	* sysdeps/unix/sysv/linux/sparc/bits/termios.h (XCASE): Likewise.
	* conform/Makefile (test-xfail-XOPEN2K/termios.h/conform): Remove
	variable.
	(test-xfail-XOPEN2K8/termios.h/conform): Likewise.
2016-04-08 18:16:09 +00:00
Paul E. Murphy
25dba0ad05 powerpc: Add optimized P8 strspn
This utilizes vectors and bitmasks.  For small needle, large
haystack, the performance improvement is up to 8x.  For short
strings (0-4B), the cost of computing the bitmask dominates,
and is a tad slower.
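
The bitmask idea can be sketched in scalar C: build a 256-bit set of accepted bytes once, then test each input byte against it. This is an illustration with a hypothetical function name; the actual P8 implementation uses vector instructions rather than a scalar loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar sketch of a bitmask-based strspn.  */
size_t
strspn_bitmask (const char *s, const char *accept)
{
  /* 256-bit set: bit C is set when byte C appears in ACCEPT.  */
  uint64_t mask[4] = { 0, 0, 0, 0 };
  for (const unsigned char *a = (const unsigned char *) accept; *a != 0; a++)
    mask[*a >> 6] |= (uint64_t) 1 << (*a & 63);

  /* Scan while the current byte is in the set.  The bit for NUL is
     never set, so the loop stops at the end of S.  */
  const unsigned char *p = (const unsigned char *) s;
  while ((mask[*p >> 6] >> (*p & 63)) & 1)
    p++;
  return (size_t) (p - (const unsigned char *) s);
}
```

Building the mask is a fixed cost per call, which is consistent with the note above that very short strings see no gain.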
2016-04-07 15:51:28 -05:00
H.J. Lu
a7d1c51482 X86-64: Prepare memmove-vec-unaligned-erms.S
Prepare memmove-vec-unaligned-erms.S to make the SSE2 version as the
default memcpy, mempcpy and memmove.

	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S
	(MEMCPY_SYMBOL): New.
	(MEMPCPY_SYMBOL): Likewise.
	(MEMMOVE_CHK_SYMBOL): Likewise.
	Replace MEMMOVE_SYMBOL with MEMMOVE_CHK_SYMBOL on __mempcpy_chk
	symbols.  Replace MEMMOVE_SYMBOL with MEMPCPY_SYMBOL on
	__mempcpy symbols.  Provide alias for __memcpy_chk in libc.a.
	Provide alias for memcpy in libc.a and ld.so.
2016-04-06 10:19:16 -07:00
H.J. Lu
4af1bb06c5 X86-64: Prepare memset-vec-unaligned-erms.S
Prepare memset-vec-unaligned-erms.S to make the SSE2 version as the
default memset.

	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
	(MEMSET_CHK_SYMBOL): New.  Define if not defined.
	(__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH.
	Disabled for now.
	Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk
	symbols.  Properly check USE_MULTIARCH on __memset symbols.
2016-04-06 09:10:35 -07:00
H.J. Lu
ec0cac9a1f Force 32-bit displacement in memset-vec-unaligned-erms.S
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Force
	32-bit displacement to avoid long nop between instructions.
2016-04-05 05:21:19 -07:00
H.J. Lu
696ac77484 Add a comment in memset-sse2-unaligned-erms.S
* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Add
	a comment on VMOVU and VMOVA.
2016-04-05 05:19:18 -07:00
H.J. Lu
5cd7af016d Don't put SSE2/AVX/AVX512 memmove/memset in ld.so
Since memmove and memset in ld.so don't use IFUNC, don't put SSE2, AVX
and AVX512 memmove and memset in ld.so.

	* sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: Skip
	if not in libc.
	* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
2016-04-03 14:35:38 -07:00
H.J. Lu
ea2785e96f Fix memmove-vec-unaligned-erms.S
__mempcpy_erms and __memmove_erms can't be placed between __memmove_chk
and __memmove, since that breaks __memmove_chk.

Don't check source == destination first since it is less common.

	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
	(__mempcpy_erms, __memmove_erms): Moved before __mempcpy_chk
	with unaligned_erms.
	(__memmove_erms): Skip if source == destination.
	(__memmove_unaligned_erms): Don't check source == destination
	first.
2016-04-03 12:38:25 -07:00
H.J. Lu
27d3ce1467 Remove Fast_Copy_Backward from Intel Core processors
Intel Core i3, i5 and i7 processors have fast unaligned copy and
copy backward is ignored.  Remove Fast_Copy_Backward from Intel Core
processors to avoid confusion.

	* sysdeps/x86/cpu-features.c (init_cpu_features): Don't set
	bit_arch_Fast_Copy_Backward for Intel Core processors.
2016-04-01 15:09:14 -07:00
Adhemerval Zanella
528ffb3a04 Remove powerpc64 strspn, strcspn, and strpbrk implementation
This patch removes the powerpc64 optimized strspn, strcspn, and
strpbrk assembly implementation now that the default C one
implements the same strategy.  On internal glibc benchtests, the
current implementation shows similar performance with -O2.

Tested on powerpc64le (POWER8).

	* sysdeps/powerpc/powerpc64/strcspn.S: Remove file.
	* sysdeps/powerpc/powerpc64/strpbrk.S: Remove file.
	* sysdeps/powerpc/powerpc64/strspn.S: Remove file.
2016-04-01 10:44:45 -03:00
Wilco Dijkstra
d3496c9f4f Improve generic strcspn performance
Improve strcspn performance using a much faster algorithm.  It is kept simple
so it works well on most targets.  It is generally at least 10 times faster
than the existing implementation on bench-strcspn on a few AArch64
implementations, and for some tests 100 times as fast (repeatedly calling
strchr on a small string is extremely slow...).

In fact the string/bits/string2.h inlines no longer make sense, as GCC
already uses strlen if reject is an empty string, strchrnul is 5 times as
fast as __strcspn_c1, while __strcspn_c2 and __strcspn_c3 are slower than
the strcspn main loop for large strings (though reject length 2-4 could be
special cased in the future to gain even more performance).
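
The message does not spell out the algorithm. One common way to avoid calling strchr for every input byte is a 256-entry lookup table, sketched below under that assumption; the function name `strcspn_table` is hypothetical and this is not presented as the committed code.

```c
#include <stddef.h>
#include <string.h>

/* Table-driven strcspn sketch: one table build, then one lookup per
   input byte.  */
size_t
strcspn_table (const char *s, const char *reject)
{
  unsigned char table[256];
  memset (table, 0, sizeof table);

  table[0] = 1;   /* NUL always ends the scan */
  for (const unsigned char *r = (const unsigned char *) reject; *r != 0; r++)
    table[*r] = 1;

  const unsigned char *p = (const unsigned char *) s;
  while (!table[*p])
    p++;
  return (size_t) (p - (const unsigned char *) s);
}
```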

Tested on x86_64, i686, and aarch64.

	* string/Version (libc): Add GLIBC_2.24.
	* string/strcspn.c (strcspn): Rewrite function.
	* string/bits/string2.h (strcspn): Use __builtin_strcspn.
	(__strcspn_c1): Remove inline function.
	(__strcspn_c2): Likewise.
	(__strcspn_c3): Likewise.
	* string/string-inline.c
	[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c1): Add
	compatibility symbol.
	[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c2):
	Likewise.
	[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c3):
	Likewise.
	* sysdeps/i386/string-inlines.c: Include generic string-inlines.c.
2016-04-01 10:44:40 -03:00
Stefan Liebler
d8a012c5c9 S390: Use ahi instead of aghi in 32bit _dl_runtime_resolve.
This patch uses ahi instead of aghi in 32bit _dl_runtime_resolve
to adjust the stack pointer. This is no functional change,
but a cosmetic one.

ChangeLog:

	* sysdeps/s390/s390-32/dl-trampoline.h (_dl_runtime_resolve):
	Use ahi instead of aghi to adjust stack pointer.
2016-04-01 10:42:54 +02:00
Paul E. Murphy
37a4c70bd4 Increase internal precision of ldbl-128ibm decimal printf [BZ #19853]
When the signs differ, the precision of the conversion sometimes
drops below 106 bits.  This strategy is identical to the
hexadecimal variant.

I've refactored tst-sprintf3 to enable testing a value with more
than 30 significant digits in order to demonstrate this failure
and its solution.

Additionally, this implicitly fixes a typo in the shift
quantities when subtracting from the high mantissa to compute
the difference.
2016-03-31 12:14:33 -05:00
H.J. Lu
830566307f Add x86-64 memset with unaligned store and rep stosb
Implement x86-64 memset with unaligned store and rep stosb.  Support
16-byte, 32-byte and 64-byte vector register sizes.  A single file
provides 2 implementations of memset, one with rep stosb and the other
without rep stosb.  They share the same codes when size is between 2
times of vector register size and REP_STOSB_THRESHOLD which defaults
to 2KB.

Key features:

1. Use overlapping store to avoid branch.
2. For size <= 4 times of vector register size, fully unroll the loop.
3. For size > 4 times of vector register size, store 4 times of vector
register size at a time.
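
Feature 1 can be illustrated with scalar 8-byte stores standing in for vector registers. For any count from 8 to 16, one store at the start and one store ending at the last byte cover the whole range without branching on the exact size; the two stores simply overlap in the middle. The function name is hypothetical and the real code uses VEC_SIZE-wide stores in assembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill COUNT bytes at DST with VALUE, for 8 <= COUNT <= 16, using two
   possibly overlapping 8-byte stores and no size-dependent branch.  */
void
set_8_to_16 (unsigned char *dst, unsigned char value, size_t count)
{
  /* Replicate VALUE into all 8 bytes of the pattern.  */
  uint64_t pattern = UINT64_C (0x0101010101010101) * value;

  memcpy (dst, &pattern, 8);               /* first 8 bytes */
  memcpy (dst + count - 8, &pattern, 8);   /* last 8 bytes, may overlap */
}
```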

	[BZ #19881]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and
	memset-avx512-unaligned-erms.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned,
	__memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned,
	__memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned,
	__memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned,
	__memset_sse2_unaligned_erms, __memset_erms,
	__memset_avx2_unaligned, __memset_avx2_unaligned_erms,
	__memset_avx512_unaligned_erms and __memset_avx512_unaligned.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New
	file.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:
	Likewise.
2016-03-31 10:06:07 -07:00
H.J. Lu
88b57b8ed4 Add x86-64 memmove with unaligned load/store and rep movsb
Implement x86-64 memmove with unaligned load/store and rep movsb.
Support 16-byte, 32-byte and 64-byte vector register sizes.  When
size <= 8 times of vector register size, there is no check for
address overlap between source and destination.  Since overhead for
overlap check is small when size > 8 times of vector register size,
memcpy is an alias of memmove.

A single file provides 2 implementations of memmove, one with rep movsb
and the other without rep movsb.  They share the same code when size is
between 2 times the vector register size and REP_MOVSB_THRESHOLD, which
is 2KB for 16-byte vector register size and scaled up for larger vector
register sizes.

Key features:

1. Use overlapping load and store to avoid branch.
2. For size <= 8 times of vector register size, load all sources into
registers and store them together.
3. If there is no address overlap between source and destination, copy
from both ends with 4 times of vector register size at a time.
4. If address of destination > address of source, backward copy 8 times
of vector register size at a time.
5. Otherwise, forward copy 8 times of vector register size at a time.
6. Use rep movsb only for forward copy.  Avoid slow backward rep movsb
by falling back to backward copy 8 times of vector register size at a
time.
7. Skip when address of destination == address of source.
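
Items 1 and 2 explain why small sizes need no overlap check: every load
finishes before the first store.  A minimal C sketch of that ordering,
assuming a hypothetical VEC_SIZE of 16 and using memcpy into local
buffers as a stand-in for vector loads and stores (the patch itself is
written in assembly):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_SIZE 16  /* hypothetical stand-in for one vector register */

/* Move n bytes, VEC_SIZE <= n <= 2 * VEC_SIZE; buffers may overlap.  */
void *
memmove_vec_to_2x_vec (void *dst, const void *src, size_t n)
{
  unsigned char head[VEC_SIZE], tail[VEC_SIZE];
  const unsigned char *s = src;
  unsigned char *d = dst;

  assert (n >= VEC_SIZE && n <= 2 * VEC_SIZE);

  /* Both loads complete before either store, so overlap is harmless.  */
  memcpy (head, s, VEC_SIZE);
  memcpy (tail, s + n - VEC_SIZE, VEC_SIZE);
  memcpy (d, head, VEC_SIZE);
  memcpy (d + n - VEC_SIZE, tail, VEC_SIZE);
  return dst;
}
```

The head and tail blocks overlap each other whenever n is below
2 * VEC_SIZE, so one code path covers the whole range.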

	[BZ #19776]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memmove-sse2-unaligned-erms, memmove-avx-unaligned-erms and
	memmove-avx512-unaligned-erms.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Test
	__memmove_chk_avx512_unaligned_2,
	__memmove_chk_avx512_unaligned_erms,
	__memmove_chk_avx_unaligned_2, __memmove_chk_avx_unaligned_erms,
	__memmove_chk_sse2_unaligned_2,
	__memmove_chk_sse2_unaligned_erms, __memmove_avx_unaligned_2,
	__memmove_avx_unaligned_erms, __memmove_avx512_unaligned_2,
	__memmove_avx512_unaligned_erms, __memmove_erms,
	__memmove_sse2_unaligned_2, __memmove_sse2_unaligned_erms,
	__memcpy_chk_avx512_unaligned_2,
	__memcpy_chk_avx512_unaligned_erms,
	__memcpy_chk_avx_unaligned_2, __memcpy_chk_avx_unaligned_erms,
	__memcpy_chk_sse2_unaligned_2, __memcpy_chk_sse2_unaligned_erms,
	__memcpy_avx_unaligned_2, __memcpy_avx_unaligned_erms,
	__memcpy_avx512_unaligned_2, __memcpy_avx512_unaligned_erms,
	__memcpy_sse2_unaligned_2, __memcpy_sse2_unaligned_erms,
	__memcpy_erms, __mempcpy_chk_avx512_unaligned_2,
	__mempcpy_chk_avx512_unaligned_erms,
	__mempcpy_chk_avx_unaligned_2, __mempcpy_chk_avx_unaligned_erms,
	__mempcpy_chk_sse2_unaligned_2, __mempcpy_chk_sse2_unaligned_erms,
	__mempcpy_avx512_unaligned_2, __mempcpy_avx512_unaligned_erms,
	__mempcpy_avx_unaligned_2, __mempcpy_avx_unaligned_erms,
	__mempcpy_sse2_unaligned_2, __mempcpy_sse2_unaligned_erms and
	__mempcpy_erms.
	* sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: New
	file.
	* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
	Likewise.
2016-03-31 10:04:40 -07:00
Stefan Liebler
5cdd1989d1 S390: Extend structs La_s390_regs / La_s390_retval with vector-registers.
Starting with z13, vector registers can also occur as argument registers.
Thus the passed input/output register structs for
la_s390_[32|64]_gnu_plt[enter|exit] functions should reflect those new
registers. This patch extends these structs La_s390_regs and La_s390_retval
and adjusts _dl_runtime_profile() to handle those fields in case of
running on a z13 machine.

ChangeLog:

	* sysdeps/s390/bits/link.h: (La_s390_vr) New typedef.
	(La_s390_32_regs): Append vector register lr_v24-lr_v31.
	(La_s390_64_regs): Likewise.
	(La_s390_32_retval): Append vector register lrv_v24.
	(La_s390_64_retval): Likewise.
	* sysdeps/s390/s390-32/dl-trampoline.h (_dl_runtime_profile):
	Handle extended structs La_s390_32_regs and La_s390_32_retval.
	* sysdeps/s390/s390-64/dl-trampoline.h (_dl_runtime_profile):
	Handle extended structs La_s390_64_regs and La_s390_64_retval.
2016-03-31 17:37:16 +02:00
Stefan Liebler
4603c51ef7 S390: Save and restore fprs/vrs while resolving symbols.
On s390, no fpr/vrs were saved while resolving a symbol
via _dl_runtime_resolve/_dl_runtime_profile.

According to the abi, the fpr-arguments are defined as call clobbered.
In leaf-functions, gcc 4.9 and newer can use fprs for saving/restoring gprs
instead of saving them to the stack.
If gcc does this in one of the resolver-functions, then the floating point
arguments of a library-function are invalid for the first library-function-call.
Thus, this patch saves/restores the fprs around the resolving code.

The same could occur for vector registers. Furthermore an ifunc-resolver
could also clobber the vector/floating point argument registers.
Thus this patch provides the further variants _dl_runtime_resolve_vx/
_dl_runtime_profile_vx, which are used if the kernel reports that
we run on a machine with vector registers.

Furthermore, if _dl_runtime_profile calls _dl_call_pltexit,
the pointers to inregs-/outregs-structs were set up incorrectly.
Now they point to the correct location in the stack-frame.
Before branching back to the caller, the return values are now
restored instead of containing the return values of the
_dl_call_pltexit() call.
On s390-32, an endless loop occurs if _dl_call_pltexit() should be called.
Now, this code-path branches to this function instead of just after the
preceding basr-instruction.

ChangeLog:

	* sysdeps/s390/s390-32/dl-trampoline.S: Include dl-trampoline.h twice
	to create a non-vector/vector version for _dl_runtime_resolve and
	_dl_runtime_profile. Move implementation to ...
	* sysdeps/s390/s390-32/dl-trampoline.h: ... here.
	(_dl_runtime_resolve) Save and restore fpr/vrs.
	(_dl_runtime_profile) Save and restore vrs and fix some issues
	if _dl_call_pltexit is called.
	* sysdeps/s390/s390-32/dl-machine.h (elf_machine_runtime_setup):
	Choose the correct resolver function if running on a machine with vx.
	* sysdeps/s390/s390-64/dl-trampoline.S: Include dl-trampoline.h twice
	to create a non-vector/vector version for _dl_runtime_resolve and
	_dl_runtime_profile. Move implementation to ...
	* sysdeps/s390/s390-64/dl-trampoline.h: ... here.
	(_dl_runtime_resolve) Save and restore fpr/vrs.
	(_dl_runtime_profile) Save and restore vrs and fix some issues
	* sysdeps/s390/s390-64/dl-machine.h: (elf_machine_runtime_setup):
	Choose the correct resolver function if running on a machine with vx.
2016-03-31 17:37:16 +02:00
Joseph Myers
258ec8abc1 [microblaze] Remove __ASSUME_FUTIMESAT.
MicroBlaze has a special version of futimesat.c because it gained the
futimesat syscall later than other non-asm-generic architectures.  Now
the minimum kernel is recent enough that this syscall can always be
assumed to be present for MicroBlaze, so this patch removes the
special version and the __ASSUME_FUTIMESAT macro, resulting in the
sysdeps/unix/sysv/linux/futimesat.c version being used.

Untested.

	* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
	(__ASSUME_FUTIMESAT): Remove macro.
	* sysdeps/unix/sysv/linux/microblaze/futimesat.c: Remove file.
2016-03-29 22:13:36 +00:00
H.J. Lu
0791f91dff Initial Enhanced REP MOVSB/STOSB (ERMS) support
The newer Intel processors support Enhanced REP MOVSB/STOSB (ERMS) which
has a feature bit in CPUID.  This patch adds the Enhanced REP MOVSB/STOSB
(ERMS) bit to x86 cpu-features.
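
For reference, ERMS is reported in CPUID leaf 7, subleaf 0, EBX bit 9.
The sketch below reads that bit from user code with the compiler's
<cpuid.h> helper (GCC and Clang).  It is not the glibc-internal
cpu-features code, and the function name is hypothetical.

```c
#include <assert.h>
#if defined (__x86_64__) || defined (__i386__)
# include <cpuid.h>
#endif

/* Return 1 if CPUID reports ERMS, 0 if it does not, and -1 if the
   answer is unavailable (non-x86 build or leaf 7 unsupported).  */
int
cpu_has_erms (void)
{
#if defined (__x86_64__) || defined (__i386__)
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return -1;
  return (ebx >> 9) & 1;  /* CPUID.(EAX=7,ECX=0):EBX bit 9 */
#else
  return -1;
#endif
}
```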

	* sysdeps/x86/cpu-features.h (bit_cpu_ERMS): New.
	(index_cpu_ERMS): Likewise.
	(reg_ERMS): Likewise.
2016-03-28 19:23:31 -07:00
Aurelien Jarno
9ff9351d02 Synchronize <sys/personality.h> with kernel headers
<sys/personality.h> is out of sync with kernel headers, missing the
UNAME26, FDPIC_FUNCPTRS and PER_LINUX_FDPIC entries. Fix that.

Changelog:
	* sysdeps/unix/sysv/linux/sys/personality.h (UNAME26, FDPIC_FUNCPTRS,
	PER_LINUX_FDPIC): Add.
2016-03-28 22:42:52 +02:00
H.J. Lu
064f01b10b Make __memcpy_avx512_no_vzeroupper an alias
Since x86-64 memcpy-avx512-no-vzeroupper.S implements memmove, make
__memcpy_avx512_no_vzeroupper an alias of __memmove_avx512_no_vzeroupper
to reduce code size of libc.so.
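
The aliasing idea can be illustrated in C with the GCC/Clang alias
attribute on ELF targets.  This is an assumption-laden sketch: the patch
itself aliases symbols in assembly, and demo_memmove and demo_memcpy are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One body that tolerates overlap, so it can serve both entry points.  */
void *
demo_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  assert (dst != NULL && src != NULL);
  if (d < s)
    for (size_t i = 0; i < n; i++)   /* forward copy */
      d[i] = s[i];
  else if (d > s)
    for (size_t i = n; i > 0; i--)   /* backward copy */
      d[i - 1] = s[i - 1];
  return dst;
}

/* Second symbol, same code: no separate memcpy body is emitted.  */
void *demo_memcpy (void *dst, const void *src, size_t n)
  __attribute__ ((alias ("demo_memmove")));
```

Because memmove accepts every input that memcpy accepts, exporting the
same code under both names is safe and saves one function body.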

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
	memcpy-avx512-no-vzeroupper.
	* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Renamed
	to ...
	* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: This.
	(MEMCPY): Don't define.
	(MEMCPY_CHK): Likewise.
	(MEMPCPY): Likewise.
	(MEMPCPY_CHK): Likewise.
	(MEMPCPY_CHK): Renamed to ...
	(__mempcpy_chk_avx512_no_vzeroupper): This.
	(MEMPCPY_CHK): Renamed to ...
	(__mempcpy_chk_avx512_no_vzeroupper): This.
	(MEMCPY_CHK): Renamed to ...
	(__memmove_chk_avx512_no_vzeroupper): This.
	(MEMCPY): Renamed to ...
	(__memmove_avx512_no_vzeroupper): This.
	(__memcpy_avx512_no_vzeroupper): New alias.
	(__memcpy_chk_avx512_no_vzeroupper): Likewise.
2016-03-28 13:16:22 -07:00
H.J. Lu
c365e615f7 Implement x86-64 multiarch mempcpy in memcpy
Implement x86-64 multiarch mempcpy in memcpy to share most of code.  It
reduces code size of libc.so.
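
memcpy and mempcpy differ only in the return value: memcpy returns the
destination, mempcpy returns the byte just past the copied data.  A
rough C picture of sharing one body between them, with hypothetical
names (the patch does this in assembly with extra entry points):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shared body: copy n bytes, then return either dst or dst + n.  */
static void *
copy_common (void *dst, const void *src, size_t n, int return_end)
{
  assert (dst != NULL && src != NULL);
  memcpy (dst, src, n);
  return return_end ? (void *) ((unsigned char *) dst + n) : dst;
}

void *
sketch_memcpy (void *dst, const void *src, size_t n)
{
  return copy_common (dst, src, n, 0);
}

void *
sketch_mempcpy (void *dst, const void *src, size_t n)
{
  return copy_common (dst, src, n, 1);
}
```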

	[BZ #18858]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
	mempcpy-ssse3, mempcpy-ssse3-back, mempcpy-avx-unaligned
	and mempcpy-avx512-no-vzeroupper.
	* sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S (MEMPCPY_CHK):
	New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S
	(MEMPCPY_CHK): New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S (MEMPCPY_CHK): New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/memcpy-ssse3.S (MEMPCPY_CHK): New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/mempcpy-avx-unaligned.S: Removed.
	* sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S:
	Likewise.
	* sysdeps/x86_64/multiarch/mempcpy-ssse3-back.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy-ssse3.S: Likewise.
2016-03-28 13:13:51 -07:00
H.J. Lu
e41b395523 [x86] Add a feature bit: Fast_Unaligned_Copy
On AMD processors, memcpy optimized with unaligned SSE load is
slower than memcpy optimized with aligned SSSE3 while other string
functions are faster with unaligned SSE load.  A feature bit,
Fast_Unaligned_Copy, is added to select memcpy optimized with
unaligned SSE load.

	[BZ #19583]
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set
	Fast_Unaligned_Copy with Fast_Unaligned_Load for Intel
	processors.  Set Fast_Copy_Backward for AMD Excavator
	processors.
	* sysdeps/x86/cpu-features.h (bit_arch_Fast_Unaligned_Copy):
	New.
	(index_arch_Fast_Unaligned_Copy): Likewise.
	* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check
	Fast_Unaligned_Copy instead of Fast_Unaligned_Load.
2016-03-28 04:40:03 -07:00
Florian Weimer
f327f5b47b tst-audit10: Fix compilation on compilers without bit_AVX512F [BZ #19860]
	[BZ #19860]
	* sysdeps/x86_64/tst-audit10.c (avx512_enabled): Always return
	zero if the compiler does not provide the AVX512F bit.
2016-03-25 11:11:42 +01:00
Joseph Myers
c898991d8b Fix x86_64 / x86 powl inaccuracy for integer exponents (bug 19848).
Bug 19848 reports cases where powl on x86 / x86_64 has error
accumulation, for small integer exponents, larger than permitted by
glibc's accuracy goals, at least in some rounding modes.  This patch
further restricts the exponent range for which the
small-integer-exponent logic is used to limit the possible error
accumulation.
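
To see where the error comes from, consider a square-and-multiply
shortcut for small integer exponents.  Every multiplication rounds once,
so a larger exponent means more accumulated rounding error.  The sketch
below is not the x87 assembly from e_powl.S.  The function name is
hypothetical, and treating the limit as "magnitude below 4" is an
assumed reading of "compare integer exponent against 4".

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_INT_LIMIT 4  /* assumed reading of the new threshold */

/* Store x**n in *out and return 1 if n is small enough for the
   shortcut; return 0 if the caller should use the general path.  */
int
pow_small_int_sketch (long double x, int n, long double *out)
{
  unsigned int k = n < 0 ? 0u - (unsigned int) n : (unsigned int) n;
  long double result = 1.0L;

  assert (out != NULL);
  if (k >= SMALL_INT_LIMIT)
    return 0;

  /* Square-and-multiply: each multiplication rounds once.  */
  for (long double base = x; k != 0; k >>= 1, base *= base)
    if (k & 1)
      result *= base;

  *out = n < 0 ? 1.0L / result : result;
  return 1;
}
```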

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #19848]
	* sysdeps/i386/fpu/e_powl.S (p3): Rename to p2 and change value
	from 8 to 4.
	(__ieee754_powl): Compare integer exponent against 4 not 8.
	* sysdeps/x86_64/fpu/e_powl.S (p3): Rename to p2 and change value
	from 8 to 4.
	(__ieee754_powl): Compare integer exponent against 4 not 8.
	* math/auto-libm-test-in: Add more tests of pow.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2016-03-24 01:32:52 +00:00
Aurelien Jarno
7e1ff08c26 Assume __NR_utimensat is always defined
With the 2.6.32 minimum kernel on x86 and 3.2 on other architectures,
__NR_utimensat is always defined.

Changelog:
	* sysdeps/unix/sysv/linux/futimens.c (futimens) [__NR_utimensat]:
	Make code unconditional.
	[!__NR_utimensat]: Remove conditional code.
	* sysdeps/unix/sysv/linux/lutimes.c (lutimes) [__NR_utimensat]:
	Make code unconditional.
	[!__NR_utimensat]: Remove conditional code.
	* sysdeps/unix/sysv/linux/utimensat.c (utimensat) [__NR_utimensat]:
	Make code unconditional.
	[!__NR_utimensat]: Remove conditional code.
2016-03-23 23:35:08 +01:00
Aurelien Jarno
16d94f67e5 Assume __NR_openat is always defined
With the 2.6.32 minimum kernel on x86 and 3.2 on other architectures,
__NR_openat is always defined.

Changelog:
	* sysdeps/unix/sysv/linux/dl-openat64.c (openat64) [__NR_openat]:
	Make code unconditional.
2016-03-23 23:35:08 +01:00
Nick Alcock
7a25d6a84d x86, pthread_cond_*wait: Do not depend on %eax not being clobbered
The x86-specific versions of both pthread_cond_wait and
pthread_cond_timedwait have (in their fall-back-to-futex-wait slow
paths) calls to __pthread_mutex_cond_lock_adjust followed by
__pthread_mutex_unlock_usercnt, which load the parameters before the
first call but then assume that the first parameter, in %eax, will
survive unaffected.  This happens to have been true before now, but %eax
is a call-clobbered register, and this assumption is not safe: it could
change at any time, at GCC's whim, and indeed the stack-protector canary
checking code clobbers %eax while checking that the canary is
uncorrupted.

So reload %eax before calling __pthread_mutex_unlock_usercnt.  (Do this
unconditionally, even when stack-protection is not in use, because it's
the right thing to do, it's a slow path, and anything else is dicing
with death.)

	* sysdeps/unix/sysv/linux/i386/pthread_cond_timedwait.S: Reload
	call-clobbered %eax on retry path.
	* sysdeps/unix/sysv/linux/i386/pthread_cond_wait.S: Likewise.
2016-03-23 13:40:14 +01:00
H.J. Lu
3c9a4cd16c Don't set %rcx twice before "rep movsb"
* sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S (MEMCPY):
	Don't set %rcx twice before "rep movsb".
2016-03-22 08:36:16 -07:00
H.J. Lu
f781a9e961 Set index_arch_AVX_Fast_Unaligned_Load only for Intel processors
Since only Intel processors with AVX2 have fast unaligned load, we
should set index_arch_AVX_Fast_Unaligned_Load only for Intel processors.

Move AVX, AVX2, AVX512, FMA and FMA4 detection into get_common_indeces
and call get_common_indeces for other processors.

Add CPU_FEATURES_CPU_P and CPU_FEATURES_ARCH_P to avoid loading
GLRO(dl_x86_cpu_features) in cpu-features.c.

	[BZ #19583]
	* sysdeps/x86/cpu-features.c (get_common_indeces): Remove
	inline.  Check family before setting family, model and
	extended_model.  Set AVX, AVX2, AVX512, FMA and FMA4 usable
	bits here.
	(init_cpu_features): Replace HAS_CPU_FEATURE and
	HAS_ARCH_FEATURE with CPU_FEATURES_CPU_P and
	CPU_FEATURES_ARCH_P.  Set index_arch_AVX_Fast_Unaligned_Load
	for Intel processors with usable AVX2.  Call get_common_indeces
	for other processors with family == NULL.
	* sysdeps/x86/cpu-features.h (CPU_FEATURES_CPU_P): New macro.
	(CPU_FEATURES_ARCH_P): Likewise.
	(HAS_CPU_FEATURE): Use CPU_FEATURES_CPU_P.
	(HAS_ARCH_FEATURE): Use CPU_FEATURES_ARCH_P.
2016-03-22 07:47:20 -07:00
Joseph Myers
37ad347359 Remove __ASSUME_GETDENTS64_SYSCALL.
This patch removes the __ASSUME_GETDENTS64_SYSCALL macro, as its
definition is constant given the new kernel version requirements (and
was constant anyway before those requirements except for MIPS n32).

Note that the "#ifdef __NR_getdents64" conditional *is* still needed,
because MIPS n64 only has the getdents syscall (being a 64-bit ABI,
that syscall is 64-bit; the difference between the two on 64-bit
architectures is where d_type goes).  If MIPS n64 were to gain the
getdents64 syscall and we wanted to use it conditionally on the kernel
version at runtime we'd have to revert this patch, but I think that's
unlikely (and in any case, we could follow the simpler approach of
undefining __NR_getdents64 if the syscall can't be assumed, just like
we do for accept4 / recvmmsg / sendmmsg syscalls on architectures
where socketcall support came first).

Most of the getdents.c changes are reindentation.

Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.

	* sysdeps/unix/sysv/linux/kernel-features.h
	(__ASSUME_GETDENTS64_SYSCALL): Remove macro.
	* sysdeps/unix/sysv/linux/getdents.c
	[!__ASSUME_GETDENTS64_SYSCALL]: Remove conditional code.
	[!have_no_getdents64_defined]: Likewise.
	(__GETDENTS): Remove __have_no_getdents64 conditional.
2016-03-22 00:32:20 +00:00
Joseph Myers
238d60ac9b Remove __ASSUME_SIGNALFD4.
Current Linux kernel version requirements mean the signalfd4 syscall
can always be assumed to be available.  This patch removes
__ASSUME_SIGNALFD4 and associated conditionals.

Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.

	* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_SIGNALFD4):
	Remove macro.
	* sysdeps/unix/sysv/linux/signalfd.c: Do not include
	<kernel-features.h>.
	(signalfd) [__NR_signalfd4]: Make code unconditional.
	(signalfd) [!__ASSUME_SIGNALFD4]: Remove conditional code.
2016-03-21 16:30:05 +00:00
Adhemerval Zanella
67b23376fb posix: Fix posix_spawn implicit check style
This patch replaces the implicit check style added in 2a69f853c with
the conventional explicit one.

Checked on x86_64.

	* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Fix implicit checks
	style.
2016-03-21 12:12:26 -03:00
H.J. Lu
893e371b2f Use JUMPTARGET in x86-64 pthread
When PLT may be used, JUMPTARGET should be used instead of calling the
function directly.

	* sysdeps/unix/sysv/linux/x86_64/cancellation.S
	(__pthread_enable_asynccancel): Use JUMPTARGET to call
	__pthread_unwind.
	* sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
	(__condvar_cleanup2): Use JUMPTARGET to call _Unwind_Resume.
	* sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
	(__condvar_cleanup1): Likewise.
2016-03-21 06:51:05 -07:00
Adhemerval Zanella
2a69f853c0 posix: Fix posix_spawn invalid memory access
Current Linux posix_spawn does not test if the pid argument is
valid before trying to update it in the success case.  This patch fixes
it.
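
The shape of the fix can be sketched as a guarded store.  The helper
below is hypothetical and is not the code in spawni.c.  It only shows
that the pid output is optional and must be checked before it is
written.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: report the child's pid only if the caller
   passed somewhere to put it.  Returns 0, as posix_spawn does on
   success.  */
int
store_child_pid (pid_t *pid, pid_t new_pid)
{
  assert (new_pid > 0);
  if (pid != NULL)   /* a null pid argument is allowed */
    *pid = new_pid;
  return 0;
}
```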

Tested on x86_64 and i686.

	* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Fix invalid memory
	access when posix_spawn succeeds and the pid argument is null.
	* posix/tst-spawn.c (do_test): Add posix_spawn null pid argument for
	success case.
2016-03-20 18:17:52 -03:00
Samuel Thibault
0e8e593d73 hurd: Add c++-types expected result
* sysdeps/mach/hurd/i386/c++-types.data: New file.
2016-03-20 22:16:34 +01:00
Samuel Thibault
4d10ceb2b2 hurd: Allow inlining IO locks
* sysdeps/mach/hurd/libc-lock.h (_IO_lock_inexpensive): Define to 1.
2016-03-20 22:12:06 +01:00
Samuel Thibault
d2129ad457 hurd: Do not hide rtld symbols which need to be preempted
* sysdeps/generic/dl-fcntl.h: New file, adds attribute_hidden to __open
	and __fcntl.
	* sysdeps/mach/hurd/dl-fcntl.h: New file, adds attribute_hidden to
	__fcntl only.
	* include/fcntl.h [IS_IN (rtld)]: Include <dl-fcntl.h> instead of
	adding attribute_hidden to __open and __fcntl.
2016-03-20 19:51:42 +01:00
Samuel Thibault
fe43d0f464 hurd: Break errnos.d / libc-modules.h dependency loop
Generating errnos.d does not actually need libc-modules.h.

* sysdeps/mach/hurd/Makefile ($(common-objpfx)errnos.d): Strip
"-include $(common-objpfx)libc-modules.h" from CPPFLAGS, and do not
depend on libc-modules.h.
2016-03-20 16:44:44 +01:00
Joseph Myers
a64e3aadbf Remove __ASSUME_EVENTFD2, move eventfd to syscalls.list.
Given current Linux kernel version requirements, we can assume the
presence of the eventfd2 syscall.  This means that __ASSUME_EVENTFD2
can be removed, and a syscalls.list entry suffices for eventfd instead
of needing a .c file.  This patch implements those changes.

Tested for x86_64 and x86 (not that that means much, given the lack of
testsuite coverage for eventfd).

	* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_EVENTFD2):
	Remove macro.
	* sysdeps/unix/sysv/linux/eventfd.c: Remove file.
	* sysdeps/unix/sysv/linux/syscalls.list (eventfd): New syscall
	entry.
2016-03-17 19:07:39 +00:00
Joseph Myers
4674df40bb Remove __ASSUME_FALLOCATE.
Given current Linux kernel version requirements, we can always assume
the fallocate syscall to be available.  This patch removes
__ASSUME_FALLOCATE and a test for whether __NR_fallocate is defined.

Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.

	* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_FALLOCATE):
	Remove macro.
	* sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c: Do not
	include <kernel-features.h>.
	[!__ASSUME_FALLOCATE]: Remove conditional code.
	(posix_fallocate) [__NR_fallocate]: Make code unconditional.
2016-03-17 12:15:51 +00:00
H.J. Lu
86ed888255 Use JUMPTARGET in x86-64 mathvec
When PLT may be used, JUMPTARGET should be used instead of calling the
function directly.

	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S
	(_ZGVbN2v_cos_sse4): Use JUMPTARGET to call cos.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S
	(_ZGVdN4v_cos_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S
	(_ZGVdN4v_cos): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S
	(_ZGVbN2v_exp_sse4): Use JUMPTARGET to call exp.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S
	(_ZGVdN4v_exp_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S
	(_ZGVdN4v_exp): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S
	(_ZGVbN2v_log_sse4): Use JUMPTARGET to call log.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S
	(_ZGVdN4v_log_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S
	(_ZGVdN4v_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S
	(_ZGVbN2vv_pow_sse4): Use JUMPTARGET to call pow.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S
	(_ZGVdN4vv_pow_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S
	(_ZGVdN4vv_pow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S
	(_ZGVbN2v_sin_sse4): Use JUMPTARGET to call sin.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S
	(_ZGVdN4v_sin_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S
	(_ZGVdN4v_sin): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S
	(_ZGVbN2vvv_sincos_sse4): Use JUMPTARGET to call sin and cos.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S
	(_ZGVdN4vvv_sincos_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S
	(_ZGVdN4vvv_sincos): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S
	(_ZGVdN8v_cosf): Use JUMPTARGET to call cosf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S
	(_ZGVbN4v_cosf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S
	(_ZGVdN8v_cosf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S
	(_ZGVdN8v_expf): Use JUMPTARGET to call expf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S
	(_ZGVbN4v_expf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S
	(_ZGVdN8v_expf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S
	(_ZGVdN8v_logf): Use JUMPTARGET to call logf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S
	(_ZGVbN4v_logf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S
	(_ZGVdN8v_logf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S
	(_ZGVdN8vv_powf): Use JUMPTARGET to call powf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S
	(_ZGVbN4vv_powf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S
	(_ZGVdN8vv_powf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S
	(_ZGVdN8vv_powf): Use JUMPTARGET to call sinf and cosf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S
	(_ZGVbN4vvv_sincosf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S
	(_ZGVdN8vvv_sincosf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S
	(_ZGVdN8v_sinf): Use JUMPTARGET to call sinf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S
	(_ZGVbN4v_sinf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S
	(_ZGVdN8v_sinf_avx2): Likewise.
	* sysdeps/x86_64/fpu/svml_d_wrapper_impl.h (WRAPPER_IMPL_SSE2):
	Use JUMPTARGET to call callee.
	(WRAPPER_IMPL_SSE2_ff): Likewise.
	(WRAPPER_IMPL_SSE2_fFF): Likewise.
	(WRAPPER_IMPL_AVX): Likewise.
	(WRAPPER_IMPL_AVX_ff): Likewise.
	(WRAPPER_IMPL_AVX_fFF): Likewise.
	(WRAPPER_IMPL_AVX512): Likewise.
	(WRAPPER_IMPL_AVX512_ff): Likewise.
	* sysdeps/x86_64/fpu/svml_s_wrapper_impl.h (WRAPPER_IMPL_SSE2):
	Likewise.
	(WRAPPER_IMPL_SSE2_ff): Likewise.
	(WRAPPER_IMPL_SSE2_fFF): Likewise.
	(WRAPPER_IMPL_AVX): Likewise.
	(WRAPPER_IMPL_AVX_ff): Likewise.
	(WRAPPER_IMPL_AVX_fFF): Likewise.
	(WRAPPER_IMPL_AVX512): Likewise.
	(WRAPPER_IMPL_AVX512_ff): Likewise.
	(WRAPPER_IMPL_AVX512_fFF): Likewise.
2016-03-16 14:24:19 -07:00
Samuel Thibault
35fbb341f8 Fix hurd build
* sysdeps/mach/hurd/openat.c (__openat): Add missing ellipsis.
	* resolv/gai_sigqueue.c (__gai_sigqueue): Add missing internal_function
	qualifier.
	* rt/aio_sigqueue.c (__aio_sigqueue): Add missing attribute_hidden
	internal_function qualifiers.
2016-03-16 13:57:57 +01:00
Carlos O'Donell
b4f518ecfa Fix building glibc master with NDEBUG and --with-cpu.
When building on i686, x86_64, and arm, and with NDEBUG, or --with-cpu
there are various variables and functions which are unused based on
these settings.

This patch marks all such variables with __attribute__((unused)) to
avoid the compiler warnings when building with the aforementioned
options.
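
A typical case is a variable that is read only inside assert.  With
NDEBUG defined the assert expands to nothing, so the compiler sees an
unused variable and warns.  The function below is a made-up
illustration, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n lengths, checking for overflow in debug builds only.  */
size_t
sum_lengths (const size_t *len, size_t n)
{
  /* Read only by the assert below, hence unused under NDEBUG.  */
  size_t limit __attribute__ ((unused)) = (size_t) -1;
  size_t total = 0;

  for (size_t i = 0; i < n; i++)
    {
      assert (len[i] <= limit - total);
      total += len[i];
    }
  return total;
}
```

Without the attribute, compiling this with -DNDEBUG and -Wall reports
limit as unused, which breaks a build that uses -Werror.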
2016-03-15 23:23:24 -04:00
Joseph Myers
089b772f98 Remove __ASSUME_PPOLL.
With current kernel version requirements, the ppoll Linux syscall can
be assumed to be present on all architectures; this patch removes the
__ASSUME_PPOLL macro and conditionals on it and on whether __NR_ppoll
is defined.  (Note that the same can't yet be done for pselect,
because MicroBlaze only wired that up in the syscall table in 3.15.)

Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.

	* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_PPOLL):
	Remove macro.
	* sysdeps/unix/sysv/linux/ppoll.c: Do not include
	<kernel-features.h>.
	[__NR_ppoll]: Make code unconditional.
	[!__ASSUME_PPOLL]: Remove conditional code.
2016-03-15 21:11:07 +00:00
Joseph Myers
35ade9f11b Adjust kernel-features.h defaults for socket syscalls.
This patch adjusts the defaults for kernel-features.h macros relating
to availability of accept4, recvmmsg and sendmmsg.  It is not intended
to affect which macros end up getting defined in any configuration.

At present, all architectures with syscalls for those functions need
to define __ASSUME_*_SYSCALL macros; in particular, any new
architecture needs its own kernel-features.h file for that purpose,
though it may not otherwise need such a header.  Those macros are then
used together with __ASSUME_SOCKETCALL to define macros for whether
the functions in question are available.

This patch changes the defaults so that the syscalls are assumed to be
available by default with recent-enough kernels, and it is the
responsibility of architecture headers to undefine the macros if they
are unavailable in supported kernels at least as recent as the version
where the architecture-independent functionality was introduced.  The
__ASSUME_<function> macros are defaulted similarly instead of being
defined based on other macros (defining based on other macros would no
longer work because the #undefs appear after the generic header is
included), so where the syscall being unavailable means the function
is unavailable this means the architecture header has to undefine the
__ASSUME_<function> macro; this only affects __ASSUME_ACCEPT4 for
ia64, as other cases where the syscalls were added late enough to be
relevant with current kernel version requirements are all on
socketcall architectures.
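
The new convention can be sketched with preprocessor macros: the generic
header defines the macro by default, and the architecture header,
included afterwards, undefines it when the minimum supported kernel is
too old.  All DEMO_ names are hypothetical, and the 3.2.0 minimum is
only an example value.  The version packing matches the constants in the
ChangeLog below, where 0x040300 is 4.3.0.

```c
#include <assert.h>

/* Pack a kernel version as major << 16 | minor << 8 | patch.  */
#define DEMO_KERNEL_VERSION(maj, min, patch) \
  (((maj) << 16) | ((min) << 8) | (patch))

#ifndef DEMO_MIN_KERNEL
# define DEMO_MIN_KERNEL DEMO_KERNEL_VERSION (3, 2, 0)  /* example only */
#endif

/* Generic header: assume the syscall exists by default.  */
#define DEMO_ASSUME_ACCEPT4_SYSCALL 1

/* Architecture header, included afterwards: opt out if too old.  */
#if DEMO_MIN_KERNEL < DEMO_KERNEL_VERSION (4, 3, 0)
# undef DEMO_ASSUME_ACCEPT4_SYSCALL
#endif

/* Return 1 if the direct syscall may be assumed, else 0.  */
int
accept4_syscall_assumed (void)
{
#ifdef DEMO_ASSUME_ACCEPT4_SYSCALL
  return 1;
#else
  return 0;
#endif
}
```

An architecture that has always had the syscall needs no override at
all, which is why some per-architecture headers can be dropped entirely.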

As a consequence, the AArch64 and Nios II kernel-features.h header
files are removed, and others simplified.  When the minimum kernel
version becomes 4.3 or later on all architectures, the syscalls in
question can just be assumed unconditionally, permitting further
simplification.

Tested for x86_64, x86 and powerpc (that installed shared libraries
are unchanged by the patch, and testsuite for x86_64 and x86).

	* sysdeps/unix/sysv/linux/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Define unconditionally.
	(__ASSUME_ACCEPT4): Likewise.
	[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
	Define.
	[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG):
	Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG_SYSCALL):
	Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG):
	Likewise.
	* sysdeps/unix/sysv/linux/aarch64/kernel-features.h: Remove file.
	* sysdeps/unix/sysv/linux/nios2/kernel-features.h: Likewise.
	* sysdeps/unix/sysv/linux/alpha/kernel-features.h
	(__ASSUME_RECVMMSG_SYSCALL): Do not define.
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/arm/kernel-features.h
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/hppa/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/i386/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
	Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG_SYSCALL):
	Likewise.
	(__ASSUME_ACCEPT4_SYSCALL): Undefine if [__LINUX_KERNEL_VERSION <
	0x040300] instead of defining if [__LINUX_KERNEL_VERSION >=
	0x040300].
	* sysdeps/unix/sysv/linux/ia64/kernel-features.h
	(__ASSUME_RECVMMSG_SYSCALL): Do not define.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	(__ASSUME_ACCEPT4_SYSCALL): Undefine if [__LINUX_KERNEL_VERSION <
	0x030300] instead of defining if [__LINUX_KERNEL_VERSION >=
	0x030300].
	[__LINUX_KERNEL_VERSION < 0x030300] (__ASSUME_ACCEPT4): Undefine.
	* sysdeps/unix/sysv/linux/m68k/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Undefine if [__LINUX_KERNEL_VERSION <
	0x040300] instead of defining if [__LINUX_KERNEL_VERSION >=
	0x040300].
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Do not define.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Undefine if [__LINUX_KERNEL_VERSION <
	0x030300] instead of defining if [__LINUX_KERNEL_VERSION >=
	0x030300].
	* sysdeps/unix/sysv/linux/mips/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Do not define.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/s390/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Undefine if [__LINUX_KERNEL_VERSION <
	0x040300] instead of defining if [__LINUX_KERNEL_VERSION >=
	0x040300].
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/sh/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Do not define.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/sparc/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/tile/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	(__ASSUME_RECVMMSG_SYSCALL): Likewise.
	(__ASSUME_SENDMMSG_SYSCALL): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/kernel-features.h
	(__ASSUME_ACCEPT4_SYSCALL): Likewise.
	[__LINUX_KERNEL_VERSION >= 0x020621] (__ASSUME_RECVMMSG_SYSCALL):
	Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030000] (__ASSUME_SENDMMSG_SYSCALL):
	Likewise.
2016-03-15 21:09:33 +00:00
Joseph Myers
981569c74c Update glibc headers for Linux 4.5.
This patch updates the glibc headers with the defines MADV_FREE,
IPV6_HDRINCL and EPOLLEXCLUSIVE that are added in Linux 4.5.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).

	* bits/mman-linux.h [__USE_MISC] (MADV_FREE): New macro.
	* sysdeps/unix/sysv/linux/hppa/bits/mman.h [__USE_MISC]
	(MADV_FREE): Likewise.
	* sysdeps/unix/sysv/linux/bits/in.h (IPV6_HDRINCL): Likewise.
	* sysdeps/unix/sysv/linux/sys/epoll.h (enum EPOLL_EVENTS): Add
	EPOLLEXCLUSIVE.
2016-03-14 19:04:53 +00:00
Samuel Thibault
15b9738da3 Fix flag test in waitid compatibility layer
	* sysdeps/posix/waitid.c (OUR_WAITID): Test against WSTOPPED instead of
	WUNTRACED.
2016-03-13 21:44:09 +01:00
Rajalakshmi Srinivasaraghavan
869d7180dd powerpc: Rearrange cfi_offset calls
This patch rearranges cfi_offset() calls after the last store
so as to avoid extra DW_CFA_advance opcodes in unwind information.
2016-03-11 11:31:58 -03:00
H.J. Lu
6aa3e97e25 Add _arch_/_cpu_ to index_*/bit_* in x86 cpu-features.h
index_* and bit_* macros are used to access the cpuid and feature arrays of
struct cpu_features.  It is very easy to use bits and indices of the cpuid
array on the feature array, especially in assembly code.  For example,
sysdeps/i386/i686/multiarch/bcopy.S has

	HAS_CPU_FEATURE (Fast_Rep_String)

which should be

	HAS_ARCH_FEATURE (Fast_Rep_String)

We change index_* and bit_* to index_cpu_*/index_arch_* and
bit_cpu_*/bit_arch_* so that we can catch such errors at build time.
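
The build-time check described above comes from token pasting: once the macro names carry a _cpu_ or _arch_ infix, pairing a feature with the wrong accessor refers to a macro that does not exist.  A minimal sketch, assuming a simplified, hypothetical struct and illustrative bit values (the real glibc macros take only the feature name and have a different layout):

```c
/* Hypothetical, simplified stand-in for struct cpu_features.  */
struct cpu_features_sketch
{
  unsigned int cpuid[1];    /* Raw CPUID-derived bits.  */
  unsigned int feature[1];  /* Derived tuning ("arch") bits.  */
};

/* A CPUID bit: only the _cpu_ names are defined (values illustrative).  */
#define index_cpu_SSE2 0
#define bit_cpu_SSE2 (1u << 26)

/* A tuning bit: only the _arch_ names are defined (values illustrative).  */
#define index_arch_Fast_Rep_String 0
#define bit_arch_Fast_Rep_String (1u << 0)

#define HAS_CPU_FEATURE(cf, name) \
  (((cf)->cpuid[index_cpu_##name] & (bit_cpu_##name)) != 0)
#define HAS_ARCH_FEATURE(cf, name) \
  (((cf)->feature[index_arch_##name] & (bit_arch_##name)) != 0)

static int
has_sse2 (const struct cpu_features_sketch *cf)
{
  return HAS_CPU_FEATURE (cf, SSE2);
}

static int
uses_fast_rep_string (const struct cpu_features_sketch *cf)
{
  /* HAS_CPU_FEATURE (cf, Fast_Rep_String) would not compile here,
     because index_cpu_Fast_Rep_String is never defined.  */
  return HAS_ARCH_FEATURE (cf, Fast_Rep_String);
}
```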

	[BZ #19762]
	* sysdeps/unix/sysv/linux/x86_64/64/dl-librecon.h
	(EXTRA_LD_ENVVARS): Add _arch_ to index_*/bit_*.
	* sysdeps/x86/cpu-features.c (init_cpu_features): Likewise.
	* sysdeps/x86/cpu-features.h (bit_*): Renamed to ...
	(bit_arch_*): This for feature array.
	(bit_*): Renamed to ...
	(bit_cpu_*): This for cpu array.
	(index_*): Renamed to ...
	(index_arch_*): This for feature array.
	(index_*): Renamed to ...
	(index_cpu_*): This for cpu array.
	[__ASSEMBLER__] (HAS_FEATURE): Add and use field.
	[__ASSEMBLER__] (HAS_CPU_FEATURE): Pass cpu to HAS_FEATURE.
	[__ASSEMBLER__] (HAS_ARCH_FEATURE): Pass arch to HAS_FEATURE.
	[!__ASSEMBLER__] (HAS_CPU_FEATURE): Replace index_##name and
	bit_##name with index_cpu_##name and bit_cpu_##name.
	[!__ASSEMBLER__] (HAS_ARCH_FEATURE): Replace index_##name and
	bit_##name with index_arch_##name and bit_arch_##name.
2016-03-10 05:27:07 -08:00
Aurelien Jarno
f8e9c4d30c mips: terminate the FDE before the return trampoline in makecontext
In makecontext the FDE needs to be terminated before the return
trampoline; otherwise backtrace called within a context created by
makecontext yields an infinite backtrace.

This bug has been present for a long time, but stdlib/tst-makecontext did
not fail until recent commit e535ce25.  Tested on mips-linux-gnu and
mips64el-linux-gnuabi64, no regressions.

This fixes stdlib/tst-makecontext on MIPS.

Changelog:
	[BZ #19792]
	* sysdeps/unix/sysv/linux/mips/makecontext.S (__makecontext):
	Terminate FDE before return label.
2016-03-09 18:48:18 +01:00
Joseph Myers
613c92b3b5 Fix ldbl-128ibm nearbyintl in non-default rounding modes (bug 19790).
The ldbl-128ibm implementation of nearbyintl uses logic that only
works in round-to-nearest mode.  This contrasts with rintl, which
works in all rounding modes.

Now, arguably nearbyintl could simply be aliased to rintl, given that
spurious "inexact" is generally allowed for ldbl-128ibm, even for the
underlying arithmetic operations.  But given that the only point of
nearbyintl is to avoid "inexact", this patch follows the more
conservative approach of adding conditionals to the rintl
implementation to make it suitable for use to implement nearbyintl,
then builds it for nearbyintl with USE_AS_NEARBYINTL defined.  The
test test-nearbyint-except-2 shows up issues when traps on "inexact"
are enabled, which turn out to be problems with the powerpc
fenv_private.h implementation (two functions that should disable
exception traps potentially failing to do so in some cases); this
patch duly fixes that as well (I don't see any other existing cases
where this would be user-visible; there isn't much use of *_NOEX,
*hold* etc. in libm that requires exceptions to be discarded and not
trapped on).

Tested for powerpc.

	[BZ #19790]
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c [USE_AS_NEARBYINTL]
	(rintl): Define as macro.
	[USE_AS_NEARBYINTL] (__rintl): Likewise.
	(__rintl) [USE_AS_NEARBYINTL]: Use SET_RESTORE_ROUND_NOEX instead
	of fesetround.  Ensure results are evaluated before end of scope.
	* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Define
	USE_AS_NEARBYINTL and include s_rintl.c.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc):
	Disable exception traps in new environment.
	(libc_feholdsetround_ppc_ctx): Likewise.
2016-03-09 00:30:59 +00:00
Roland McGrath
3bd80c0de2 Fix tst-audit10 build when -mavx512f is not supported.
2016-03-08 12:32:59 -08:00
H.J. Lu
2b35e48c0c Define _HAVE_STRING_ARCH_mempcpy to 1 for x86
Since x86 has an optimized mempcpy and GCC can inline mempcpy on x86,
define _HAVE_STRING_ARCH_mempcpy to 1 for x86.

	[BZ #19759]
	* sysdeps/x86/bits/string.h (_HAVE_STRING_ARCH_mempcpy): New.
2016-03-08 10:57:41 -08:00
Gabriel F. T. Gomes
183a34dc4a powerpc: Remove uses of operand modifier (%s) in inline asm
The operand modifier %s on powerpc is an undocumented internal implementation
detail of GCC.  Besides that, the GCC community wants to remove it.  This patch
rewrites the expressions that use this modifier with logically equivalent
expressions that don't require it.

Explanation for the substitution:

The %s modifier takes an immediate operand and prints 32 minus that immediate.
Thus, in the previous code, the expression resulted in:

  32 - __builtin_ffs(e)

where e was guaranteed to have exactly a single bit set, by the following
expressions:

  ((e & (e-1)) == 0) : e has at most one bit set.
  (e != 0)         : e is not zero, thus it has at least one bit set.

Since we guarantee that there is exactly one bit set, the following
statement is true:

  32 - __builtin_ffs(e) == __builtin_clz(e)

Thus, we can replace __builtin_ffs with __builtin_clz and remove the %s operand
modifier.
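
The identity above can be checked directly.  A minimal sketch, assuming a 32-bit unsigned int and the GCC builtins; the helper names are hypothetical and are not taken from the patch:

```c
#include <assert.h>

/* Nonzero if E has exactly one bit set.  */
static int
has_single_bit (unsigned int e)
{
  return e != 0 && (e & (e - 1)) == 0;
}

/* The value the old code produced through the %s modifier.  */
static int
shift_via_ffs (unsigned int e)
{
  assert (has_single_bit (e));
  return 32 - __builtin_ffs ((int) e);
}

/* The replacement: count leading zeros instead.  */
static int
shift_via_clz (unsigned int e)
{
  assert (has_single_bit (e));
  return __builtin_clz (e);
}
```

For a single-bit value with bit i set, __builtin_ffs returns i + 1 and __builtin_clz returns 31 - i, so both helpers return 31 - i.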
2016-03-08 15:30:28 -03:00
Carlos Eduardo Seo
911569d02d powerpc: Fix dl-procinfo HWCAP
HWCAP-related code should have been updated when the 32 bits of HWCAP were
used.  This patch updates the code in dl-procinfo.h to loop through all
the 32 bits in HWCAP and updates _dl_powerpc_cap_flags accordingly.
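
As an illustration of walking every bit of a 32-bit capability word, here is a minimal sketch.  The function name and output format are hypothetical; the real code maps bits to the names in _dl_powerpc_cap_flags, which is not reproduced here:

```c
#include <stdint.h>

/* Store the index of every set bit of HWCAP in OUT (lowest first)
   and return how many were found.  OUT must have room for 32 ints.  */
static int
collect_hwcap_bits (uint32_t hwcap, int out[32])
{
  int count = 0;
  for (int i = 0; i < 32; i++)
    if ((hwcap & (UINT32_C (1) << i)) != 0)
      out[count++] = i;
  return count;
}
```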
2016-03-08 15:30:06 -03:00
Joseph Myers
cc4084017e Fix ldbl-128ibm remainderl equality test for zero low part (bug 19677).
The ldbl-128ibm implementation of remainderl has logic resulting in
incorrect tests for equality of the absolute values of the arguments
in the case of zero low parts.  If the low parts are both zero but
with different signs, this can wrongly cause equal arguments to be
treated as different, resulting in turn in incorrect signs of zero
result in nondefault rounding modes arising from the subtractions done
when the arguments are not equal.

This patch fixes the logic to convert -0 low parts into +0 before the
comparison (remquo already has separate logic to deal with signs of
zero results, so doesn't need such a change).  Tests are added for
remainderl and remquol similar to that for fmodl, and based on a
refactoring of it, since the bug depends on low parts which should not
be relied upon in tests not setting the representation explicitly
(although in fact the bug shows up in test-ldouble with current GCC).
Tested for powerpc.
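
The zero-low-part problem can be sketched with a pair of doubles.  This is an illustration only: the struct and helper names are hypothetical, both high parts are assumed positive (so the sign adjustment from bug 19603 is not needed), and the real code works on the ldbl-128ibm representation directly:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IBM long double: value is hi + lo.  */
struct dd
{
  double hi;
  double lo;
};

static uint64_t
bits_of (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

/* Buggy comparison: the low parts are compared bit for bit, so
   +0 and -0 low parts make equal values look different.  */
static int
abs_equal_naive (struct dd a, struct dd b)
{
  uint64_t mask = ~(UINT64_C (1) << 63);
  return (bits_of (a.hi) & mask) == (bits_of (b.hi) & mask)
         && bits_of (a.lo) == bits_of (b.lo);
}

/* Map a -0 low part to +0; leave everything else alone.  */
static uint64_t
canonical_low (uint64_t lo)
{
  return (lo << 1) == 0 ? 0 : lo;
}

/* Fixed comparison: canonicalize zero low parts first.  */
static int
abs_equal_fixed (struct dd a, struct dd b)
{
  uint64_t mask = ~(UINT64_C (1) << 63);
  return (bits_of (a.hi) & mask) == (bits_of (b.hi) & mask)
         && canonical_low (bits_of (a.lo)) == canonical_low (bits_of (b.lo));
}
```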

	[BZ #19677]
	* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c
	(__ieee754_remainderl): Put zero low parts in canonical form.
	* sysdeps/ieee754/ldbl-128ibm/test-fmodrem-ldbl-128ibm.c: New
	file.  Based on
	sysdeps/ieee754/ldbl-128ibm/test-fmodl-ldbl-128ibm.c.
	* sysdeps/ieee754/ldbl-128ibm/test-fmodl-ldbl-128ibm.c: Replace
	with wrapper round test-fmodrem-ldbl-128ibm.c.
	* sysdeps/ieee754/ldbl-128ibm/test-remainderl-ldbl-128ibm.c: New
	file.
	* sysdeps/ieee754/ldbl-128ibm/test-remquol-ldbl-128ibm.c:
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/Makefile (tests): Add
	test-remainderl-ldbl-128ibm and test-remquol-ldbl-128ibm.
2016-03-08 00:27:21 +00:00
Florian Weimer
3c0f7407ee tst-audit4, tst-audit10: Compile AVX/AVX-512 code separately [BZ #19269]
This ensures that GCC will not use unsupported instructions before
the run-time check to ensure support.
2016-03-07 16:00:25 +01:00
Adhemerval Zanella
9ff72da471 posix: New Linux posix_spawn{p} implementation
This patch implements a new posix_spawn{p} implementation for Linux.  The main
difference is that it uses the clone syscall directly with the CLONE_VM and
CLONE_VFORK flags and a directly allocated stack.  The new stack and start
function solve most of the vfork limitations (possible parent clobber due to
stack spilling).  The remaining issues are related to signal handling:

  1. No signal handlers must run in the child context, to avoid corrupting
     the parent's state.
  2. The child must synchronize with the parent to enforce stack deallocation
     and to report any execv failure.

The first one is solved by blocking all signals in the child, even NPTL-internal
ones (SIGCANCEL and SIGSETXID).  The second is handled by allocating the stack
in the parent and synchronizing using a pipe or waitpid (in case of error).
The pipe has the advantage of allowing the child to signal an exec error
(checked with the new tst-spawn2 test).

There is an inherent race condition in pipe2 usage for architectures that do not
support the syscall directly.  In such cases a pipe plus fcntl is used
instead, and it may lead to a file descriptor leak in the parent (as described
by the fcntl documentation).

The child process stack is allocated with mmap with the MAP_STACK flag, using
the default architecture stack size.  Although this is slower than using a
stack buffer from the parent, it allows some slack for the compatibility code
that runs scripts with no shebang (which may use a buffer whose size depends
on the argument list count).

Performance should be similar to the vfork path of the default POSIX
implementation and much faster than the fork path (vfork on most Linux ports
is basically clone with CLONE_VM plus CLONE_VFORK).  The only difference is the
syscalls required for the stack allocation/deallocation.
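
The clone-based approach described above can be sketched as follows.  This is not the code from spawni.c: the names are hypothetical, signal blocking is omitted, the stack is assumed to grow downwards, the 64 KiB stack size is arbitrary, and the exec error is passed back through memory shared via CLONE_VM rather than through the pipe this patch uses.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical argument block, visible to both parent and child.  */
struct spawn_args
{
  const char *file;
  char *const *argv;
  char *const *envp;
  int err;                      /* Set by the child if execve fails.  */
};

static int
spawn_child (void *p)
{
  struct spawn_args *args = p;
  execve (args->file, args->argv, args->envp);
  /* Only reached on failure.  With CLONE_VM the parent sees this store.  */
  args->err = errno;
  _exit (127);
}

/* Returns 0 and stores the child PID in *PID, or returns an errno value.  */
static int
sketch_spawn (pid_t *pid, const char *file, char *const argv[],
              char *const envp[])
{
  size_t stack_size = 64 * 1024;
  void *stack = mmap (NULL, stack_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
  if (stack == MAP_FAILED)
    return errno;

  struct spawn_args args = { file, argv, envp, 0 };
  /* CLONE_VFORK suspends the parent until the child execs or exits,
     so the stack can be unmapped as soon as clone returns.  */
  pid_t child = clone (spawn_child, (char *) stack + stack_size,
                       CLONE_VM | CLONE_VFORK | SIGCHLD, &args);
  int ret = 0;
  if (child < 0)
    ret = errno;
  else if (args.err != 0)
    {
      /* The exec failed: reap the child and report its error.  */
      waitpid (child, NULL, 0);
      ret = args.err;
    }
  else if (pid != NULL)
    *pid = child;

  munmap (stack, stack_size);
  return ret;
}
```

With this shape, a call naming a non-existent binary returns the execve error (for example ENOENT) to the caller instead of a child that silently exits with status 127.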

It fixes BZ#10354, BZ#14750, and BZ#18433.

Tested on i386, x86_64, powerpc64le, and aarch64.

	[BZ #14750]
	[BZ #10354]
	[BZ #18433]
	* include/sched.h (__clone): Add hidden prototype.
	(__clone2): Likewise.
	* include/unistd.h (__dup): Likewise.
	* posix/Makefile (tests): Add tst-spawn2.
	* posix/tst-spawn2.c: New file.
	* sysdeps/posix/dup.c (__dup): Add hidden definition.
	* sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone):
	Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/nptl-signals.h
	(____nptl_is_internal_signal): New function.
	* sysdeps/unix/sysv/linux/spawni.c: New file.
2016-03-07 11:53:47 +07:00
H.J. Lu
fee9eb6200 Group AVX512 functions in .text.avx512 section
	* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S:
	Replace .text with .text.avx512.
	* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S:
	Likewise.
2016-03-06 16:48:11 -08:00
Aurelien Jarno
8a74071cef Add placeholder libnsl.abilist and libutil.abilist files
Changelog:
	* sysdeps/generic/libnsl.abilist: New file.
	* sysdeps/generic/libutil.abilist: New file.
2016-03-07 00:49:36 +01:00
H.J. Lu
4e940b2f4b Use HAS_ARCH_FEATURE with Fast_Rep_String
HAS_ARCH_FEATURE, not HAS_CPU_FEATURE, should be used with
Fast_Rep_String.

	[BZ #19762]
	* sysdeps/i386/i686/multiarch/bcopy.S (bcopy): Use
	HAS_ARCH_FEATURE with Fast_Rep_String.
	* sysdeps/i386/i686/multiarch/bzero.S (__bzero): Likewise.
	* sysdeps/i386/i686/multiarch/memcpy.S (memcpy): Likewise.
	* sysdeps/i386/i686/multiarch/memcpy_chk.S (__memcpy_chk):
	Likewise.
	* sysdeps/i386/i686/multiarch/memmove_chk.S (__memmove_chk):
	Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy.S (__mempcpy): Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy_chk.S (__mempcpy_chk):
	Likewise.
	* sysdeps/i386/i686/multiarch/memset.S (memset): Likewise.
	* sysdeps/i386/i686/multiarch/memset_chk.S (__memset_chk):
	Likewise.
2016-03-06 08:23:51 -08:00
H.J. Lu
16b23e0363 Replace PREINIT_FUNCTION@PLT with *%rax in call
Since we have loaded address of PREINIT_FUNCTION into %rax, we can
avoid extra branch to PLT slot.

	[BZ #19745]
	* sysdeps/x86_64/crti.S (_init): Replace PREINIT_FUNCTION@PLT
	with *%rax in call.
2016-03-04 16:15:41 -08:00
H.J. Lu
21683b5a7d Replace @PLT with @GOTPCREL(%rip) in call
Since __libc_start_main is called very early, lazy binding isn't relevant
here.  Use indirect branch via GOT to avoid extra branch to PLT slot.

	[BZ #19745]
	* sysdeps/x86_64/start.S (_start): Replace __libc_start_main@PLT
	with *__libc_start_main@GOTPCREL(%rip) in call.
2016-03-04 16:15:41 -08:00
H.J. Lu
97f7112728 Add a comment in sysdeps/x86_64/Makefile
Mention recursive calls when ENTRY is used in _mcount.S.

	* sysdeps/x86_64/Makefile (sysdep_noprof): Add a comment.
2016-03-04 08:44:58 -08:00
H.J. Lu
14a1d7cc4c x86-64: Fix memcpy IFUNC selection
Check Fast_Unaligned_Load, instead of Slow_BSF, and also check for
Fast_Copy_Backward to enable __memcpy_ssse3_back.  The existing selection
order is replaced with the following selection order:

1. __memcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __memcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __memcpy_sse2 if SSSE3 isn't available.
4. __memcpy_ssse3_back if Fast_Copy_Backward bit is set.
5. __memcpy_ssse3
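
The selection order above can be written as a plain decision chain.  A minimal sketch in C, assuming a hypothetical flags struct and enum; the actual selector is written in assembly in memcpy.S and reads the bits from cpu_features:

```c
#include <stdbool.h>

/* Hypothetical view of the feature bits consulted by the selector.  */
struct memcpy_features
{
  bool avx_fast_unaligned_load;
  bool fast_unaligned_load;
  bool ssse3;
  bool fast_copy_backward;
};

enum memcpy_impl
{
  MEMCPY_AVX_UNALIGNED,
  MEMCPY_SSE2_UNALIGNED,
  MEMCPY_SSE2,
  MEMCPY_SSSE3_BACK,
  MEMCPY_SSSE3
};

/* Apply the five rules in order; the first match wins.  */
static enum memcpy_impl
select_memcpy (const struct memcpy_features *f)
{
  if (f->avx_fast_unaligned_load)
    return MEMCPY_AVX_UNALIGNED;
  if (f->fast_unaligned_load)
    return MEMCPY_SSE2_UNALIGNED;
  if (!f->ssse3)
    return MEMCPY_SSE2;
  if (f->fast_copy_backward)
    return MEMCPY_SSSE3_BACK;
  return MEMCPY_SSSE3;
}
```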

	[BZ #18880]
	* sysdeps/x86_64/multiarch/memcpy.S: Check Fast_Unaligned_Load,
	instead of Slow_BSF, and also check for Fast_Copy_Backward to
	enable __memcpy_ssse3_back.
2016-03-04 08:39:07 -08:00
H.J. Lu
a475427295 Or bit_Prefer_MAP_32BIT_EXEC in EXTRA_LD_ENVVARS
We should turn on bit_Prefer_MAP_32BIT_EXEC in EXTRA_LD_ENVVARS without
overriding other bits.
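
The difference between assigning and OR-ing the bit can be shown in a few lines.  A minimal sketch; the bit position and function names are hypothetical and do not reflect the real value of bit_Prefer_MAP_32BIT_EXEC:

```c
/* Hypothetical bit position, for illustration only.  */
#define SKETCH_PREFER_MAP_32BIT_EXEC (1u << 16)

/* Plain assignment: every bit already set in WORD is lost.  */
static unsigned int
enable_by_assignment (unsigned int word)
{
  (void) word;
  return SKETCH_PREFER_MAP_32BIT_EXEC;
}

/* OR: the new bit is added and the existing bits are kept.  */
static unsigned int
enable_by_or (unsigned int word)
{
  return word | SKETCH_PREFER_MAP_32BIT_EXEC;
}
```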

	[BZ #19758]
	* sysdeps/unix/sysv/linux/x86_64/64/dl-librecon.h
	(EXTRA_LD_ENVVARS): Or bit_Prefer_MAP_32BIT_EXEC.
2016-03-03 14:51:40 -08:00
Paul Pluzhnikov
5cdc3d9db0 2016-03-03 Paul Pluzhnikov <ppluzhnikov@google.com>
	[BZ #19490]
	* sysdeps/x86_64/_mcount.S (_mcount): Add unwind descriptor.
	(__fentry__): Likewise.
2016-03-03 09:53:49 -08:00
H.J. Lu
87a07a4376 Copy x86_64 _mcount.op from _mcount.o
No need to compile x86_64 _mcount.S with -pg.  We can just copy the
normal static object.

	* gmon/Makefile (noprof): Add $(sysdep_noprof).
	* sysdeps/x86_64/Makefile (sysdep_noprof): Add _mcount.
2016-03-03 06:56:22 -08:00
H.J. Lu
ec215346b9 Call x86-64 __mcount_internal/__sigjmp_save directly
Since __mcount_internal and __sigjmp_save are internal to x86-64 libc.so:

3532: 0000000000104530   289 FUNC    LOCAL  DEFAULT   13 __mcount_internal
3391: 0000000000034170    38 FUNC    LOCAL  DEFAULT   13 __sigjmp_save

they can be called directly without PLT.

	* sysdeps/x86_64/_mcount.S (C_LABEL(_mcount)): Call
	__mcount_internal directly.
	(C_LABEL(__fentry__)): Likewise.
	* sysdeps/x86_64/setjmp.S (__sigsetjmp): Call __sigjmp_save
	directly.
2016-03-01 16:58:07 -08:00
H.J. Lu
521266a819 Call x86-64 __setcontext directly
Since x86-64 __start_context calls the internal __setcontext:

5089: 00000000000417e0   145 FUNC    LOCAL  DEFAULT   13 __setcontext

it should call __setcontext directly.

	* sysdeps/unix/sysv/linux/x86_64/__start_context.S
	(__start_context): Call __setcontext directly.
2016-03-01 16:55:36 -08:00
Joseph Myers
ad1b6d85ba Remove kernel-features.h conditionals on pre-3.2 kernels.
This patch follows up on the increase in minimum kernel version by
removing conditionals in non-x86, non-x86_64 kernel-features.h headers
that are now constant for all supported kernel versions.
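
The hexadecimal constants in the entries below pack a kernel version as one byte each for major, minor and patch level, so 0x020621 is 2.6.33 and 0x030200 is 3.2.0.  A minimal sketch of that encoding, with hypothetical helper names:

```c
/* Pack a version the same way the __LINUX_KERNEL_VERSION
   constants are written: major, minor, patch as one byte each.  */
#define SKETCH_KERNEL_VERSION(major, minor, patch) \
  (((major) << 16) | ((minor) << 8) | (patch))

struct kernel_version
{
  int major;
  int minor;
  int patch;
};

/* Unpack a version code into its three components.  */
static struct kernel_version
decode_kernel_version (unsigned int code)
{
  struct kernel_version v;
  v.major = (int) ((code >> 16) & 0xff);
  v.minor = (int) ((code >> 8) & 0xff);
  v.patch = (int) (code & 0xff);
  return v;
}
```

Because every supported non-x86 kernel is now at least 0x030200, a condition such as `__LINUX_KERNEL_VERSION >= 0x020621` is always true there and its code can be made unconditional.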

	* sysdeps/unix/sysv/linux/alpha/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621]: Make code unconditional.
	[__LINUX_KERNEL_VERSION >= 0x030200]: Likewise.
	[__LINUX_KERNEL_VERSION < 0x020621]: Remove conditional code.
	* sysdeps/unix/sysv/linux/arm/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621]: Make code unconditional.
	[__LINUX_KERNEL_VERSION >= 0x020624]: Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030000]: Likewise.
	* sysdeps/unix/sysv/linux/hppa/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020622]: Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030100]: Likewise.
	[__LINUX_KERNEL_VERSION < 0x020625]: Remove conditional code.
	* sysdeps/unix/sysv/linux/ia64/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621]: Make code unconditional.
	[__LINUX_KERNEL_VERSION >= 0x030000]: Likewise.
	* sysdeps/unix/sysv/linux/m68k/kernel-features.h
	[__LINUX_KERNEL_VERSION < 0x030000]: Remove conditional code.
	* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621]: Make code unconditional.
	[__LINUX_KERNEL_VERSION < 0x020621]: Remove conditional code.
	[__LINUX_KERNEL_VERSION < 0x020625]: Likewise.
	* sysdeps/unix/sysv/linux/mips/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621]: Make code unconditional.
	[__LINUX_KERNEL_VERSION >= 0x030100]: Likewise.
	[_MIPS_SIM == _ABIN32 && __LINUX_KERNEL_VERSION < 0x020623]:
	Remove conditional code.
	* sysdeps/unix/sysv/linux/powerpc/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020625]: Make code unconditional.
	[__LINUX_KERNEL_VERSION >= 0x030000]: Likewise.
	* sysdeps/unix/sysv/linux/sh/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020625]: Likewise.
	[__LINUX_KERNEL_VERSION >= 0x030000]: Likewise.
	[__LINUX_KERNEL_VERSION < 0x020625]: Remove conditional code.
	* sysdeps/unix/sysv/linux/sparc/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x020621]: Make code unconditional.
	[__LINUX_KERNEL_VERSION >= 0x030000]: Likewise.
	* sysdeps/unix/sysv/linux/tile/kernel-features.h
	[__LINUX_KERNEL_VERSION >= 0x030000]: Likewise.
2016-02-26 16:17:25 +00:00
Joseph Myers
f4a2740a69 Remove linux/fanotify.h configure test.
Now that we require Linux 3.2 or later kernel headers everywhere, the
configure test for <linux/fanotify.h> is obsolete; this patch removes
it.

Tested for x86_64.

	* sysdeps/unix/sysv/linux/configure.ac (linux/fanotify.h): Do not
	test for header.
	* sysdeps/unix/sysv/linux/configure: Regenerated.
	* config.h.in (HAVE_LINUX_FANOTIFY_H): Remove #undef.
	* sysdeps/unix/sysv/linux/tst-fanotify.c [!HAVE_LINUX_FANOTIFY_H]:
	Remove conditional code.
	[HAVE_LINUX_FANOTIFY_H]: Make code unconditional.
2016-02-24 18:44:10 +00:00
Joseph Myers
5b4ecd3f95 Require Linux 3.2 except on x86 / x86_64, 3.2 headers everywhere.
In <https://sourceware.org/ml/libc-alpha/2016-01/msg00885.html> I
proposed a minimum Linux kernel version of 3.2 for glibc 2.24, since
Linux 2.6.32 has reached EOL.

In the discussion in February, some concerns were expressed about
compatibility with OpenVZ containers.  It's not clear that these are
real issues, given OpenVZ backporting kernel features and faking the
kernel version for guest software, as discussed in
<https://sourceware.org/ml/libc-alpha/2016-02/msg00278.html>.  It's
also not clear that supporting running GNU/Linux distributions from
late 2016 (at the earliest) on a kernel series from 2009 is a sensible
expectation.  However, as an interim step, this patch increases the
requirement everywhere except x86 / x86_64 (since the controversy was
only about those architectures); the special caveats and settings can
easily be removed later when we're ready to increase the requirements
on x86 / x86_64 (and if someone would like to raise the issue on LWN
as suggested in the previous discussion, that would be welcome).  3.2
kernel headers are required everywhere by this patch.

(x32 already requires 3.4 or later, so is unaffected by this patch.)

As usual for such a change, this patch only changes the configure
scripts and associated documentation.  The intent is to follow up with
removal of dead __LINUX_KERNEL_VERSION conditionals.  Each __ASSUME_*
or other macro that becomes dead can then be removed independently.

Tested for x86_64 and x86.

	* sysdeps/unix/sysv/linux/configure.ac (LIBC_LINUX_VERSION):
	Define to 3.2.0.
	(arch_minimum_kernel): Likewise.
	* sysdeps/unix/sysv/linux/configure: Regenerated.
	* sysdeps/unix/sysv/linux/i386/configure.ac (arch_minimum_kernel):
	Define to 2.6.32.
	* sysdeps/unix/sysv/linux/i386/configure: Regenerated.
	* sysdeps/unix/sysv/linux/x86_64/64/configure.ac
	(arch_minimum_kernel): Define to 2.6.32.
	* sysdeps/unix/sysv/linux/x86_64/64/configure: Regenerated.
	* README: Document Linux 3.2 requirement.
	* manual/install.texi (Linux): Document Linux 3.2 headers
	requirement.
	* INSTALL: Regenerated.
2016-02-24 17:15:12 +00:00
Roland McGrath
b2e722855b Add fts64_* to sysdeps/arm/nacl/libc.abilist
2016-02-22 15:19:56 -08:00
H.J. Lu
8d9c92017d [x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8
Due to GCC bug:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

__tls_get_addr may be called with 8-byte stack alignment.  Although
this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume
that the stack will always be aligned to 16 bytes.  Since SSE optimized
memory/string functions with aligned SSE register load and store are
used in the dynamic linker, we must set DL_RUNTIME_UNALIGNED_VEC_SIZE
to 8 so that _dl_runtime_resolve_sse will align the stack before
calling _dl_fixup:

Dump of assembler code for function _dl_runtime_resolve_sse:
   0x00007ffff7deea90 <+0>:	push   %rbx
   0x00007ffff7deea91 <+1>:	mov    %rsp,%rbx
   0x00007ffff7deea94 <+4>:	and    $0xfffffffffffffff0,%rsp
                                ^^^^^^^^^^^ Align stack to 16 bytes
   0x00007ffff7deea98 <+8>:	sub    $0x100,%rsp
   0x00007ffff7deea9f <+15>:	mov    %rax,0xc0(%rsp)
   0x00007ffff7deeaa7 <+23>:	mov    %rcx,0xc8(%rsp)
   0x00007ffff7deeaaf <+31>:	mov    %rdx,0xd0(%rsp)
   0x00007ffff7deeab7 <+39>:	mov    %rsi,0xd8(%rsp)
   0x00007ffff7deeabf <+47>:	mov    %rdi,0xe0(%rsp)
   0x00007ffff7deeac7 <+55>:	mov    %r8,0xe8(%rsp)
   0x00007ffff7deeacf <+63>:	mov    %r9,0xf0(%rsp)
   0x00007ffff7deead7 <+71>:	movaps %xmm0,(%rsp)
   0x00007ffff7deeadb <+75>:	movaps %xmm1,0x10(%rsp)
   0x00007ffff7deeae0 <+80>:	movaps %xmm2,0x20(%rsp)
   0x00007ffff7deeae5 <+85>:	movaps %xmm3,0x30(%rsp)
   0x00007ffff7deeaea <+90>:	movaps %xmm4,0x40(%rsp)
   0x00007ffff7deeaef <+95>:	movaps %xmm5,0x50(%rsp)
   0x00007ffff7deeaf4 <+100>:	movaps %xmm6,0x60(%rsp)
   0x00007ffff7deeaf9 <+105>:	movaps %xmm7,0x70(%rsp)
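
The `and $0xfffffffffffffff0,%rsp` step in the dump rounds the stack pointer down to a multiple of 16, which is what makes the later movaps stores safe.  The same arithmetic as a minimal C sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Round ADDR down to the nearest multiple of 16 by clearing
   its low four bits.  */
static uintptr_t
align_down_16 (uintptr_t addr)
{
  return addr & ~(uintptr_t) 15;
}

/* Nonzero if ADDR is already 16-byte aligned.  */
static int
is_aligned_16 (uintptr_t addr)
{
  return (addr & 15) == 0;
}
```

An 8-byte-aligned value such as 0x7ffc1238 becomes 0x7ffc1230, while an already aligned value is unchanged.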

	[BZ #19679]
	* sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
	Renamed to ...
	(DL_RUNTIME_UNALIGNED_VEC_SIZE): This.  Set to 8.
	(DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.  Updated.
	(DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
	* sysdeps/x86_64/dl-trampoline.h
	(DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
2016-02-19 15:45:09 -08:00
Joseph Myers
7b428e744b Fix ldbl-128ibm nextafterl, nexttowardl sign of zero result (bug 19678).
The ldbl-128ibm implementation of nextafterl / nexttowardl returns -0
in FE_DOWNWARD mode when taking the next value below the least
positive subnormal, when it should return +0.  This patch fixes it to
check explicitly for this case.

Tested for powerpc.

	[BZ #19678]
	* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl):
	Ensure +0.0 is returned when taking the next value below the least
	positive value.
2016-02-19 17:19:53 +00:00
Florian Weimer
59eda029a8 malloc: Remove NO_THREADS
No functional change.  It was not possible to build without
threading support before.
2016-02-19 17:07:45 +01:00
Joseph Myers
c091488e51 Fix ldbl-128ibm powl overflow handling (bug 19674).
The ldbl-128ibm implementation of powl has some problems in the case
of overflow or underflow, which are mainly visible in non-default
rounding modes.

* When overflow or underflow is detected early, the correct sign of an
  overflowing or underflowing result is not allowed for.  This is
  mostly hidden in the default rounding mode by the errno-setting
  wrappers recomputing the result (except in non-default
  error-handling modes such as -lieee), but visible in other rounding
  modes where a result that is not zero or infinity causes the
  wrappers not to do the recomputation.

* The final scaling is done before the sign is incorporated in the
  result, but should be done afterwards for correct overflowing and
  underflowing results in directed rounding modes.

This patch fixes those problems.  Tested for powerpc.

	[BZ #19674]
	* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Include
	sign in overflowing and underflowing results when overflow or
	underflow is detected early.  Include sign in result before rather
	than after scaling.
2016-02-19 01:07:40 +00:00
Joseph Myers
9120a57f48 Fix ldbl-128ibm remainderl, remquol equality tests (bug 19603).
The ldbl-128ibm implementations of remainderl and remquol have logic
resulting in incorrect tests for equality of the absolute values of
the arguments.  Equality is tested based on the integer
representations of the high and low parts, with the sign bit masked
off the high part - but when this changes the sign of the high part,
the sign of the low part needs to be changed as well, and failure to
do this means arguments are wrongly treated as equal when they are
not.

This patch fixes the logic to adjust signs of low parts as needed.
Tested for powerpc.

	[BZ #19603]
	* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c
	(__ieee754_remainderl): Adjust sign of integer version of low part
	when taking absolute value of high part.
	* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
	* math/libm-test.inc (remainder_test_data): Add another test.
	(remquo_test_data): Likewise.
2016-02-19 00:55:46 +00:00
Joseph Myers
0fed79a827 Fix ldbl-128ibm fmodl handling of equal arguments with low part zero (bug 19602).
The ldbl-128ibm implementation of fmodl has logic to detect when the
first argument has absolute value less than or equal to the second.
This logic is only correct for nonzero low parts; if the high parts
are equal and the low parts are zero, then the signs of the low parts
(which have no semantic effect on the value of the long double number)
can result in equal values being wrongly treated as unequal, and an
incorrect result being returned from fmodl.  This patch fixes this by
checking for the case of zero low parts.

Although this does show up in tests from libm-test.inc (both tests of
fmodl, and, indirectly, of remainderl / dreml), the dependence on
non-semantic zero low parts means that test shouldn't be expected to
reproduce it reliably; thus, this patch adds a standalone test that
sets up affected values using unions.

Tested for powerpc.

	[BZ #19602]
	* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Handle
	equal high parts and both low parts zero specially.
	* sysdeps/ieee754/ldbl-128ibm/test-fmodl-ldbl-128ibm.c: New test.
	* sysdeps/ieee754/ldbl-128ibm/Makefile [$(subdir) = math] (tests):
	Add test-fmodl-ldbl-128ibm.
2016-02-18 22:54:07 +00:00
Joseph Myers
e2c631384a Fix ldbl-128ibm fmodl handling of subnormal results (bug 19595).
The ldbl-128ibm implementation of fmodl has completely bogus logic for
subnormal results (in this context, that means results for which the
result is in the subnormal range for double, not results with absolute
value below LDBL_MIN), based on code used for ldbl-128 that is correct
in that case but incorrect in the ldbl-128ibm use.  This patch fixes
it to convert the mantissa into the correct form expected by
ldbl_insert_mantissa, removing the other cases of the code that were
incorrect and in one case unreachable for ldbl-128ibm.  A correct
exponent value is then passed to ldbl_insert_mantissa to reflect the
shifted result.

Tested for powerpc.

	[BZ #19595]
	* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Use
	common logic for all cases of shifting subnormal results.  Do not
	insert sign bit in shifted mantissa.  Always pass -1023 as biased
	exponent to ldbl_insert_mantissa in subnormal case.
2016-02-18 22:42:06 +00:00
Joseph Myers
b9a76339be Fix ldbl-128ibm roundl for non-default rounding modes (bug 19594).
The ldbl-128ibm implementation of roundl is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases).  This patch reimplements it along
the lines used for floorl, ceill and truncl, using __round on the high
part, and on the low part if the high part is an integer, and then
adjusting in the cases where this is incorrect.

Tested for powerpc.

	[BZ #19594]
	* sysdeps/ieee754/ldbl-128ibm/s_roundl.c (__roundl): Use __round
	on high and low parts then adjust result and use
	ldbl_canonicalize_int if needed.
2016-02-18 22:24:32 +00:00
Joseph Myers
e2310a27be Fix ldbl-128ibm truncl for non-default rounding modes (bug 19593).
The ldbl-128ibm implementation of truncl is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases).  It is also unnecessarily
complicated, rounding both high and low parts to the nearest integer
and then adjusting for the semantics of trunc, when it seems more
natural to take the truncation of the high part (__trunc optimized
inline versions can be used), and the floor or ceiling of the low part
(depending on the sign of the high part) if the high part is an
integer, as was done for floorl and ceill.  This patch makes it use
that simpler approach.

Tested for powerpc.

	[BZ #19593]
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Use __trunc
	on high part and __floor or __ceil on low part then use
	ldbl_canonicalize_int if needed.
2016-02-18 21:52:07 +00:00
Joseph Myers
8a9fa0086d Fix ldbl-128ibm ceill for non-default rounding modes (bug 19592).
The ldbl-128ibm implementation of ceill is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases).  It is also unnecessarily
complicated, rounding both high and low parts to the nearest integer
and then adjusting for the semantics of ceil, when it seems more
natural to take the ceiling of the high part (__ceil optimized inline
versions can be used), and that of the low part if the high part is an
integer, as was done for floorl.  This patch makes it use that simpler
approach.

Tested for powerpc.

	[BZ #19592]
	* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (__ceill): Use __ceil on
	high and low parts then use ldbl_canonicalize_int if needed.
2016-02-18 21:40:39 +00:00
Joseph Myers
1833769e19 Fix ldbl-128ibm floorl for non-default rounding modes (bug 17899).
The ldbl-128ibm implementation of floorl is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases going beyond the incorrect signs of
zero results noted in bug 17899).  It is also unnecessarily
complicated, rounding both high and low parts to the nearest integer
and then adjusting for the semantics of floor, when it seems more
natural to take the floor of the high part (__floor optimized inline
versions can be used), and that of the low part if the high part is an
integer.  This patch makes it use that simpler approach, with a
canonicalization that works in all rounding modes (given that the only
way the result can be noncanonical is if taking the floor of a
negative noninteger low part increased its exponent).
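
The "floor of the high part, then of the low part" idea can be sketched
roughly as follows.  This is a simplified illustration, not the glibc
source: struct dd, floor_small and dd_floor are hypothetical names,
floor_small only handles |x| < 2^63 so that the sketch needs no libm, and
the final renormalization assumes round-to-nearest, whereas the patch's
ldbl_canonicalize_int is meant to work in all rounding modes.

```c
/* A double-double value, interpreted as hi + lo.  */
struct dd { double hi, lo; };

/* Hypothetical helper: floor for |x| < 2^63, written without libm.  */
double floor_small (double x)
{
  double t = (double) (long long) x;   /* Truncates toward zero.  */
  return t > x ? t - 1.0 : t;
}

struct dd dd_floor (struct dd x)
{
  struct dd r;
  double fhi = floor_small (x.hi);
  if (fhi != x.hi)
    {
      /* The high part is not an integer, so the low part is too small
         to change the result.  */
      r.hi = fhi;
      r.lo = 0.0;
    }
  else
    {
      /* The high part is an integer: the result is hi + floor (lo).  */
      double flo = floor_small (x.lo);
      /* Renormalize so that hi holds the rounded sum (simplified).  */
      r.hi = fhi + flo;
      r.lo = flo - (r.hi - fhi);
    }
  return r;
}
```

For example, the value 3 - 0.25, stored as hi = 3.0 and lo = -0.25, has an
integer high part, so the low part decides the result and the floor is 2.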

Tested for powerpc, where over a thousand failures are removed from
test-ldouble.out (floorl problems affect many powl tests).

	[BZ #17899]
	* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h (ldbl_canonicalize_int):
	New function.
	* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (__floorl): Use __floor
	on high and low parts then use ldbl_canonicalize_int if needed.
2016-02-18 21:31:10 +00:00
H.J. Lu
16396c41de Add _STRING_INLINE_unaligned and string_private.h
As discussed in

https://sourceware.org/ml/libc-alpha/2015-10/msg00403.html

the setting of _STRING_ARCH_unaligned currently controls the external
GLIBC ABI as well as selecting the use of unaligned accesses within
GLIBC.

Since _STRING_ARCH_unaligned was recently changed for AArch64, this
would potentially break the ABI in GLIBC 2.23, so split the uses and add
_STRING_INLINE_unaligned to select the string ABI. This setting must be
fixed for each target, while _STRING_ARCH_unaligned may be changed from
release to release.  _STRING_ARCH_unaligned is used unconditionally in
glibc.  But <bits/string.h>, which defines _STRING_ARCH_unaligned, isn't
included with -Os.  Since _STRING_ARCH_unaligned is internal to glibc and
may change between glibc releases, it should be made private to glibc.
_STRING_ARCH_unaligned should be defined in the new string_private.h header
file which is included unconditionally from internal <string.h> for glibc
build.

	[BZ #19462]
	* bits/string.h (_STRING_ARCH_unaligned): Renamed to ...
	(_STRING_INLINE_unaligned): This.
	* include/string.h: Include <string_private.h>.
	* string/bits/string2.h: Replace _STRING_ARCH_unaligned with
	_STRING_INLINE_unaligned.
	* sysdeps/aarch64/bits/string.h (_STRING_ARCH_unaligned): Removed.
	(_STRING_INLINE_unaligned): New.
	* sysdeps/aarch64/string_private.h: New file.
	* sysdeps/generic/string_private.h: Likewise.
	* sysdeps/m68k/m680x0/m68020/string_private.h: Likewise.
	* sysdeps/s390/string_private.h: Likewise.
	* sysdeps/x86/string_private.h: Likewise.
	* sysdeps/m68k/m680x0/m68020/bits/string.h
	(_STRING_ARCH_unaligned): Renamed to ...
	(_STRING_INLINE_unaligned): This.
	* sysdeps/s390/bits/string.h (_STRING_ARCH_unaligned): Renamed
	to ...
	(_STRING_INLINE_unaligned): This.
	* sysdeps/sparc/bits/string.h (_STRING_ARCH_unaligned): Renamed
	to ...
	(_STRING_INLINE_unaligned): This.
	* sysdeps/x86/bits/string.h (_STRING_ARCH_unaligned): Renamed
	to ...
	(_STRING_INLINE_unaligned): This.
2016-02-18 14:55:29 -02:00
Andrew Senkevich
a5df3210a6 Use PIC relocation in ALIAS_IMPL
Since libmvec_nonshared.a may be linked into shared objects, ALIAS_IMPL
should use PIC relocation.

	[BZ #19590]
	* sysdeps/x86_64/fpu/svml_finite_alias.S (ALIAS_IMPL): Use PIC
	relocation.
2016-02-17 14:23:32 -08:00
Rajalakshmi Srinivasaraghavan
ebf1264f61 powerpc: Regenerate libm-test-ulps 2016-02-04 16:40:54 -02:00
Joseph Myers
5163b4b76f Fix MIPS mmap negative offset handling for consistency (bug 19550).
The handling of negative offsets in MIPS mmap is inconsistent with
other architectures, as shown by failure of the test
posix/tst-mmap-offset for o32 and n32.  The MIPS mmap syscall uses a
signed argument and does a signed arithmetic shift on it, whereas the
glibc semantics expected by that test are for the offset to be
considered as a large positive offset.  This patch makes MIPS
consistent with other architectures as far as possible by using the
mmap2 syscall on o32 (#including the generic implementation), and
making mmap not an alias for mmap64 for n32, with a custom
implementation for n32 that zero-extends the offset argument to 64-bit
before calling the mmap syscall.
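
The difference between the two interpretations of a negative 32-bit offset
can be illustrated with a small sketch.  The function names are
hypothetical and this is not the code from the patch; it only shows sign
extension versus zero extension of the same bit pattern.

```c
#include <stdint.h>

/* Sign extension: a negative 32-bit offset stays negative in 64 bits.  */
int64_t offset_sign_extended (int32_t off)
{
  return (int64_t) off;
}

/* Zero extension: the same bit pattern is read as a large positive
   offset, which is the behavior the test expects.  */
uint64_t offset_zero_extended (int32_t off)
{
  return (uint64_t) (uint32_t) off;
}
```

With an offset of -4096, the first function returns -4096 and the second
returns 0xfffff000.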

Tested for MIPS64 (o32, n32, n64).

	[BZ #19550]
	* sysdeps/unix/sysv/linux/mips/mips32/mmap.c: New file.
	* sysdeps/unix/sysv/linux/mips/mips64/mmap64.c: Move to ....
	* sysdeps/unix/sysv/linux/mips/mips64/n64/mmap64.c: ... here.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/mmap.c: New file.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/syscalls.list (mmap64):
	New syscall entry.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/syscalls.list (mmap):
	New syscall entry.
	* sysdeps/unix/sysv/linux/mips/mips64/syscalls.list (mmap): Remove
	syscall entry.
2016-02-01 18:20:21 +00:00
Steve Ellcey
8a71d2e27f Fix MIPS64 memcpy regression.
The MIPS memcpy optimizations at
<https://sourceware.org/ml/libc-alpha/2015-10/msg00597.html>
introduced a bug causing many string function tests to fail with
segfaults for n32 and n64:

FAIL: string/stratcliff
FAIL: string/test-bcopy
FAIL: string/test-memccpy
FAIL: string/test-memcmp
FAIL: string/test-memcpy
FAIL: string/test-memmove
FAIL: string/test-mempcpy
FAIL: string/test-stpncpy
FAIL: string/test-strncmp
FAIL: string/test-strncpy

(Some failures in other directories could also be caused by this bug.)

The problem is that after the check for whether a word of input is
left that can be copied as a word before moving to byte copies, a load
can occur in the branch delay slot, resulting in a segfault if we are
at the end of a page and the following page is unmapped.  I don't see
how this would have passed the tests as reported in the original patch
posting (different kernel configurations affecting the code setting up
unmapped pages, maybe?), since the tests in question don't appear to
have changed recently.

This patch moves a later instruction into the delay slot, as suggested
at <https://sourceware.org/ml/libc-alpha/2016-01/msg00584.html>.

Tested for n32 and n64.

2016-01-28  Steve Ellcey  <sellcey@imgtec.com>
            Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/mips/memcpy.S (MEMCPY_NAME) [USE_DOUBLE]: Avoid word
	load in branch delay slot when less than a word of input left.
2016-01-28 01:52:05 +00:00
Andreas Schwab
4fb66fac3a Remove unused variables
They are flagged by -Wunused-const-variable.
2016-01-27 09:30:16 +01:00
David S. Miller
6ef1cb957e Update localplt.data for 32-bit sparc.
* sysdeps/unix/sysv/linux/sparc/sparc32/localplt.data: Add _Q_cmp.
2016-01-26 16:16:38 -08:00
David S. Miller
82e5836613 Define __sqrtl_finite on sparc 32-bit with correct symbol version.
* sysdeps/sparc/sparc32/Versions (GLIBC_2.23): Add entry for __sqrtl_finite.
	* sysdeps/sparc/sparc32/fpu/e_sqrtl.c (__sqrtl_finite): Define instead using
	versioned_symbol.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Fix ordering of entries.
2016-01-25 16:07:15 -08:00
David S. Miller
7a18c2a0c1 Adjust sparc 32-bit __sqrtl_finite version tag.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Move
	__sqrtl_finite to GLIBC_2.23
2016-01-25 10:42:17 -08:00
Richard Henderson
89cfb554b8 Update Alpha libm-test-ulps 2016-01-25 10:43:41 -08:00
Paul E. Murphy
9200e581e5 Cleanup ppc bits/ipc.h
Ages ago (commit e9dcb08) the ipc syscalls were inlined, which
eventually removed any need for direct __ipc calls.
2016-01-25 10:35:21 -02:00
David S. Miller
c34ae92056 Fix missing __sqrtl_finite symbol in libm on sparc 32-bit.
* sysdeps/sparc/sparc32/fpu/e_sqrtl.c: New file.
	* sysdeps/sparc/sparc32/soft-fp/q_sqrt.c (__ieee754_sqrtl): Remove alias.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add __sqrtl_finite.
2016-01-24 21:14:12 -08:00
David S. Miller
a9d460a977 Update sparc ULPS.
* sysdeps/sparc/fpu/libm-test-ulps: Update.
2016-01-24 21:12:58 -08:00
Maciej W. Rozycki
d5f2798a0a MIPS: Set the required Linux kernel version to 4.5.0 for 2008 NaN
Complement the addition of the required kernel support, present upstream
as from commit 2b5e869ecfcb3112f7e1267cb0328f3ff6d49b18 ("MIPS: ELF:
Interpret the NAN2008 file header flag") and released with Linux 4.5-rc1
on Jan 24th, 2016.

	* sysdeps/unix/sysv/linux/mips/configure.ac: Set
	`arch_minimum_kernel' to 4.5.0 if 2008 NaN encoding is used.
	* sysdeps/unix/sysv/linux/mips/configure: Regenerate.
2016-01-25 00:19:27 +00:00
Paul E. Murphy
af8ea0f449 powerpc: Fix macro usage of htm builtins
Some extraneous semicolons were included in a
recent patch, which cause a build failure with
newer compilers.
2016-01-22 14:13:08 -02:00
Chung-Lin Tang
fba91f1232 Maintenance patch for nios2: update ULPS file and localplt.data changes. 2016-01-21 22:58:03 -08:00
Roland McGrath
a3140836c8 NaCl: Fix unused variable errors in lowlevellock-futex.h macros. 2016-01-20 13:57:14 -08:00
Paul Pluzhnikov
b274130206 2016-01-20 Paul Pluzhnikov <ppluzhnikov@google.com>
[BZ #19490]
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S (pthread_cond_broadcast): Use ENTRY/END
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S (pthread_cond_signal): Likewise
* sysdeps/x86_64/nptl/pthread_spin_lock.S (pthread_spin_lock): Likewise
* sysdeps/x86_64/nptl/pthread_spin_trylock.S (pthread_spin_trylock): Likewise
* sysdeps/x86_64/nptl/pthread_spin_unlock.S (pthread_spin_unlock): Likewise
2016-01-20 13:39:20 -08:00
Joseph Myers
dcb133b7a4 Fix __finitel libm compat symbol version.
The changes to restrict implementation-namespace symbol aliases such
as __finitel to compat symbols used code for __finitel in libm
analogous to that for __finitel in libc.  However, the versions for
the two symbols are actually different, GLIBC_2.0 in libc and
GLIBC_2.1 in libm.  This patch fixes the handling of the libm compat
symbol.

Tested for mips (o32), where it fixes an ABI test failure.

	* sysdeps/ieee754/dbl-64/s_finite.c
	[NO_LONG_DOUBLE && LDBL_CLASSIFY_COMPAT] (__finitel): Define
	compat symbol at version GLIBC_2_1 and use GLIBC_2_1 in
	SHLIB_COMPAT condition for libm, not GLIBC_2_0.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c
	[NO_LONG_DOUBLE && LDBL_CLASSIFY_COMPAT] (__finitel): Likewise.
2016-01-20 19:04:43 +00:00
Joseph Myers
00b85374a9 Update localplt.data for powerpc-nofpu.
Testing for powerpc-nofpu showed that localplt.data was out of date.
Two new soft-fp functions showed up in the list: __gtsf2 and
__unordsf2; this patch adds these as optional.  __signbit and
__signbitl no longer appear as local PLT entries; given the move to
__builtin_signbit* for all GCC versions supported for building glibc
(and given the use of the type-generic signbit macro within glibc),
those can safely be removed from the list, which this patch does.

Tested for powerpc-nofpu.

	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/localplt.data
	(__gtsf2): Add as optional for libc.so.
	(__unordsf2): Likewise.
	(__signbit): Remove for libc.so.
	(__signbitl): Likewise.
2016-01-20 18:19:10 +00:00
Joseph Myers
2e3d0de31f Fix ulps regeneration for *-finite tests.
On running tests after from-scratch ulps regeneration, I found that
some libm tests failed with ulps in excess of those recorded in the
from-scratch regeneration, which should never happen unless those ulps
exceed the limit on ulps that can go in libm-test-ulps files.

Failure: Test: atan2_upward (inf, -inf)
Result:
 is:          2.35619498e+00   0x1.2d97ccp+1
 should be:   2.35619450e+00   0x1.2d97c8p+1
 difference:  4.76837159e-07   0x1.000000p-21
 ulp       :  2.0000
 max.ulp   :  1.0000
Maximal error of `atan2_upward'
 is      : 2 ulp
 accepted: 1 ulp
Failure: Test: carg_upward (-inf + inf i)
Result:
 is:          2.35619498e+00   0x1.2d97ccp+1
 should be:   2.35619450e+00   0x1.2d97c8p+1
 difference:  4.76837159e-07   0x1.000000p-21
 ulp       :  2.0000
 max.ulp   :  1.0000
Maximal error of `carg_upward'
 is      : 2 ulp
 accepted: 1 ulp

The problem comes from the addition of tests for the finite-math-only
versions of libm functions.  Those tests share ulps with the default
function variants.  make regen-ulps runs the default tests before the
finite-math-only tests, concatenating the resulting ulps before
feeding them to gen-libm-test.pl to generate a new libm-test-ulps
file.  But gen-libm-test.pl always takes the last ulps value given for
any (function, type) pair.  So, if the largest ulps for a function
come from non-finite inputs, a from-scratch regeneration loses those
ulps.

This patch fixes gen-libm-test.pl, in the case where there are
multiple ulps values for a (function, type) pair - which can only
happen as part of a regeneration - to take the largest ulps value
rather than the last one.
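
The "largest rather than last" rule can be sketched as below.  The real
change is in the Perl script gen-libm-test.pl; this C fragment, with the
hypothetical names struct ulps_entry and record_ulps, only illustrates the
merge rule for one (function, type) pair.

```c
/* Hypothetical record: one ulps entry per (function, type) pair.  */
struct ulps_entry { double ulps; int seen; };

/* Keep the largest value seen so far instead of overwriting it with
   the most recent one.  */
void record_ulps (struct ulps_entry *e, double ulps)
{
  if (!e->seen || ulps > e->ulps)
    e->ulps = ulps;
  e->seen = 1;
}
```

Recording 2 ulps from the default tests and then 1 ulp from the
finite-math-only tests leaves 2 ulps in the entry.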

Tested for ARM / MIPS / powerpc-nofpu.

	* math/gen-libm-test.pl (parse_ulps): Do not reduce
	already-recorded ulps.
	* sysdeps/arm/libm-test-ulps: Regenerated.
	* sysdeps/mips/mips32/libm-test-ulps: Likewise.
	* sysdeps/mips/mips64/libm-test-ulps: Likewise.
	* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
2016-01-19 21:42:58 +00:00
Andrew Senkevich
df782dc690 Fixed build with assembler w/o AVX-512 support.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Fixed build with
    assembler not supporting AVX-512.
2016-01-19 14:34:53 +03:00
Stefan Liebler
415031f734 S390: Regenerate ULPs
I've regenerated ulps from scratch for s390/s390x.
All math testcases are passing afterwards.

ChangeLog:

	* sysdeps/s390/fpu/libm-test-ulps: Regenerated.
2016-01-19 10:02:44 +01:00
Joseph Myers
204a038e57 Regenerate MIPS libm-test-ulps.
* sysdeps/mips/mips32/libm-test-ulps: Regenerated.
	* sysdeps/mips/mips64/libm-test-ulps: Likewise.
2016-01-18 23:32:40 +00:00
Joseph Myers
844c75aa06 Regenerate powerpc-nofpu libm-test-ulps.
* sysdeps/powerpc/nofpu/libm-test-ulps: Regenerated.
2016-01-18 23:02:03 +00:00
Joseph Myers
a99236df89 Regenerate ARM libm-test-ulps.
* sysdeps/arm/libm-test-ulps: Regenerated.
2016-01-18 22:55:47 +00:00
Stefan Liebler
c4d17461e0 S/390: Do not raise inexact exception in lrint/lround. [BZ #19486]
I get some math test-failures on s390 for float/double/ldouble for
various lrint/lround functions like:
lrint (0x1p64): Exception "Inexact" set
lrint (-0x1p64): Exception "Inexact" set
lround (0x1p64): Exception "Inexact" set
lround (-0x1p64): Exception "Inexact" set
...

GCC emits "convert to fixed" instructions for casting floating point
values to integer values. These instructions raise invalid and inexact
exceptions if the floating point value exceeds the integer type ranges.

This patch enables the various FIX_DBL_LONG_CONVERT_OVERFLOW macros in
order to avoid a cast from floating point to integer type and raise the
invalid exception with feraiseexcept.
The ldbl-128 rint/round functions are now using the same logic.

ChangeLog:

	[BZ #19486]
	* sysdeps/s390/fix-fp-int-convert-overflow.h: New File.
	* sysdeps/generic/fix-fp-int-convert-overflow.h
	(FIX_LDBL_LONG_CONVERT_OVERFLOW,
	FIX_LDBL_LLONG_CONVERT_OVERFLOW): New define.
	* sysdeps/arm/fix-fp-int-convert-overflow.h: Likewise.
	* sysdeps/mips/mips32/fpu/fix-fp-int-convert-overflow.h:
	Likewise.
	* sysdeps/ieee754/ldbl-128/s_lrintl.c (__lrintl):
	Avoid conversions to long int where inexact exceptions
	could be raised.
	* sysdeps/ieee754/ldbl-128/s_lroundl.c (__lroundl):
	Likewise.
	* sysdeps/ieee754/ldbl-128/s_llrintl.c (__llrintl):
	Avoid conversions to long long int where inexact exceptions
	could be raised.
	* sysdeps/ieee754/ldbl-128/s_llroundl.c (__llroundl):
	Likewise.
2016-01-18 12:48:06 +01:00
Andrew Senkevich
214a44f394 Fixed typos in __memcpy_chk.
* sysdeps/x86_64/multiarch/memcpy_chk.S: Fixed typos.
2016-01-16 14:42:26 +03:00
Mike Frysinger
3f2c97261b sparc: mman.h: fix bad comment insertion
The MCL_ONFAULT define was inserted into the middle of a comment which
breaks the build.
2016-01-16 02:34:15 -05:00
Andrew Senkevich
72276d6e88 Added memcpy/memmove family optimized with AVX512 for KNL hardware.
Added AVX512 implementations of memcpy, mempcpy, memmove, memcpy_chk,
mempcpy_chk, memmove_chk.
They show an average improvement of more than 30% over the AVX versions on KNL
hardware (performance results in the thread
<https://sourceware.org/ml/libc-alpha/2016-01/msg00258.html>).

    * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new files.
    * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
    * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: New file.
    * sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memcpy.S: Added new IFUNC branch.
    * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove.c: Likewise.
    * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
    * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
    * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
2016-01-16 00:49:45 +03:00
Torvald Riegel
b02840bacd New pthread_barrier algorithm to fulfill barrier destruction requirements.
The previous barrier implementation did not fulfill the POSIX requirements
for when a barrier can be destroyed.  Specifically, it was possible that
threads that haven't noticed yet that their round is complete still access
the barrier's memory, and that those accesses can happen after the barrier
has been legally destroyed.
The new algorithm does not have this issue, and it avoids using a lock
internally.
2016-01-15 21:20:34 +01:00
Martin Sebor
ad37480c4b Fix build errors with -DNDEBUG.
[BZ #18755]
        * iconv/skeleton.c (FUNCTION_NAME): Suppress -Wunused-but-set-variable
        warnings.
        * sysdeps/nptl/gai_misc.h (__gai_start_notify_thread): Same.
        (__gai_create_helper_thread): Same.
        * nscd/nscd.c (do_exit): Suppress -Wunused-variable.
        * iconvdata/iso-2022-cn-ext.c (BODY): Initialize local variable
        to suppress -Wmaybe-uninitialized warnings.
2016-01-15 10:44:07 -07:00
H.J. Lu
09245377da Call math_opt_barrier inside if
Since floating-point operation may trigger floating-point exceptions,
we call math_opt_barrier inside if to prevent code motion.
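
A rough sketch of the pattern, assuming GCC or Clang extensions: the macro
opt_barrier below is a hypothetical stand-in for math_opt_barrier, and
scaled_if_small is an invented example function, not code from __fma.

```c
/* The empty asm claims to modify the value in memory, so the compiler
   has to treat the result as unknown at this point.  */
#define opt_barrier(x) \
  ({ __typeof (x) tmp_ = (x); __asm__ ("" : "+m" (tmp_)); tmp_; })

double scaled_if_small (double x, double limit)
{
  if (x < limit)
    {
      /* Barrier inside the branch: the multiplication below, which may
         raise floating-point exceptions, depends on its result and so
         is not computed ahead of the condition.  */
      x = opt_barrier (x);
      return x * 0x1p-10;
    }
  return x;
}
```

The barrier does not change the value: scaled_if_small (1024.0, 2048.0)
still yields 1.0.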

	[BZ #19465]
	* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Call math_opt_barrier
	inside if.
	* sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fma.c (__fma): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
2016-01-15 05:23:20 -08:00
Amit Pawar
d7890e6947 Set index_Fast_Unaligned_Load for Excavator family CPUs
GLIBC benchtest testcases show that SSE2_Unaligned based implementations
perform faster than SSE2 based implementations for the routines
strcmp, strcat, strncat, stpcpy, stpncpy, strcpy, strncpy and strstr.
The flag index_Fast_Unaligned_Load is now set for Excavator family
0x15h CPUs.  This makes the SSE2_Unaligned based implementations the
default for these routines.

	[BZ #19467]
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set
	index_Fast_Unaligned_Load flag for Excavator family CPUs.
2016-01-14 08:14:31 -08:00
Marcin Kościelnicki
a4b5177ca8 Add __private_ss to s390 struct tcbhead.
Preparation for gcc -fsplit-stack support (gcc bug #68191).  The new
field is basically identical to the one on x86.  Its TCB offset needs
to be constant, as it'll be hardcoded in gcc.

ChangeLog:

	* sysdeps/s390/nptl/tls.h (struct tcbhead_t): Add __private_ss field.
2016-01-14 16:48:55 +01:00
Joseph Myers
fb53a27c57 Add new header definitions from Linux 4.4 (plus older ptrace definitions).
This patch adds some new header definitions from Linux 4.4:

* MCL_ONFAULT is added to bits/mman.h / bits/mman-linux.h (this was
  already done for hppa).

* PTRACE_SECCOMP_GET_FILTER is added to sys/ptrace.h.  Along with it,
  the older PTRACE_GETSIGMASK and PTRACE_SETSIGMASK, added in Linux
  3.11 but missed at the time, are also added.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).

	* bits/mman-linux.h [!MCL_CURRENT] (MCL_ONFAULT): New macro.
	* sysdeps/unix/sysv/linux/alpha/bits/mman.h (MCL_ONFAULT):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/bits/mman.h (MCL_ONFAULT):
	Likewise.
	* sysdeps/unix/sysv/linux/sparc/bits/mman.h (MCL_ONFAULT):
	Likewise.
	* sysdeps/unix/sysv/linux/sys/ptrace.h (PTRACE_GETSIGMASK): New
	enum constant and macro.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
	* sysdeps/unix/sysv/linux/aarch64/sys/ptrace.h
	(PTRACE_GETSIGMASK): Likewise.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
	* sysdeps/unix/sysv/linux/ia64/sys/ptrace.h (PTRACE_GETSIGMASK):
	Likewise.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/sys/ptrace.h
	(PTRACE_GETSIGMASK): Likewise.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
	* sysdeps/unix/sysv/linux/s390/sys/ptrace.h (PTRACE_GETSIGMASK):
	Likewise.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sys/ptrace.h (PTRACE_GETSIGMASK):
	Likewise.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
	* sysdeps/unix/sysv/linux/tile/sys/ptrace.h (PTRACE_GETSIGMASK):
	Likewise.
	(PTRACE_SETSIGMASK): Likewise.
	(PTRACE_SECCOMP_GET_FILTER): Likewise.
2016-01-12 12:42:55 +00:00
Tulio Magno Quites Machado Filho
42bf1c8971 powerpc: Enforce compiler barriers on hardware transactions
Work around a GCC behavior with hardware transactional memory built-ins.
GCC doesn't treat the PowerPC transactional built-ins as compiler
barriers, moving instructions past the transaction boundaries and
altering their atomicity.
2016-01-08 17:47:33 -02:00
Carlos Eduardo Seo
d2de9ef7ad powerpc: Add hwcap2 bits for POWER9.
Added hwcap2 bit masks for Power ISA 3.0 and VSX IEEE binary float 128-bit
features.
2016-01-08 11:19:40 -02:00
John David Anglin
48025aa9ed hppa: fix dladdr [BZ #19415]
The attached patch fixes dladdr on hppa.

Instead of using the generic version of _dl_lookup_address, we use an
implementation more or less modeled after __canonicalize_funcptr_for_compare()
in gcc.  The function pointer is analyzed and if it points to the
trampoline used to call _dl_runtime_resolve just before the global
offset table, then we call _dl_fixup to resolve the function pointer.
Then, we return the instruction pointer from the first word of the
descriptor.

The change fixes the testcase provided in [BZ #19415] and the Debian
nss package now builds successfully.
2016-01-08 02:19:26 -05:00
Mike Frysinger
1f89b8d881 xstat: only check to see if __ASSUME_ST_INO_64_BIT is defined
We define __ASSUME_ST_INO_64_BIT by default for Linux targets, and then
undef it for alpha/sh targets.  But the code that uses it looks at its
value (as 0/1) rather than whether it's defined (like all other assume
knobs).  Change the code to check whether it's defined, fixing -Wundef build
errors for alpha/sh.
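
As a minimal sketch of the difference, with a hypothetical knob name: a
test with "#if ASSUME_FEATURE_X" depends on the macro's value and triggers
-Wundef when the macro has been undefined, while "#ifdef" only asks
whether it is defined.

```c
/* Hypothetical knob; a target that lacks the feature would #undef it.  */
#define ASSUME_FEATURE_X 1

int feature_x_assumed (void)
{
  /* Checking definedness works whether or not the knob was undefined
     for the current target, and never evaluates its value.  */
#ifdef ASSUME_FEATURE_X
  return 1;
#else
  return 0;
#endif
}
```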
2016-01-07 14:37:09 -05:00
Marko Myllynen
48d0341cdd Make shebang interpreter directives consistent 2016-01-07 04:03:21 -05:00
John David Anglin
d7f914848b hppa: fix pthread spinlock
URL: https://bugs.debian.org/725508
2016-01-06 17:26:04 -05:00
H.J. Lu
db2f6f4794 Update copyright dates committed in 2016 2016-01-06 14:03:10 -08:00
H.J. Lu
730bbab2c3 Mark internal unistd functions hidden in ld.so
Since internal unistd functions are only used internally in ld.so and
libc.so, they can be made hidden.  __close, __getcwd, __getpid,
__libc_read and __libc_write can't be hidden in ld.so on Hurd since they
will be preempted by the ones in libc.so after bootstrap.

	[BZ #19122]
	* include/unistd.h [IS_IN (rtld)]: Include <dl-unistd.h>.
	* sysdeps/generic/dl-unistd.h: New file.
	* sysdeps/mach/hurd/dl-unistd.h: Likewise.
2016-01-06 12:54:10 -08:00
H.J. Lu
38acf35697 Mark ld.so internal mmap functions hidden in ld.so
Since ld.so internal mmap functions are only used internally in ld.so,
they can be made hidden.  Don't hide __mmap on Hurd, since __mmap in
ld.so will be preempted by the one in libc.so after bootstrap.

	 [BZ #19122]
	 * include/sys/mman.h [IS_IN (rtld)]: Include <dl-mman.h>.
	 * sysdeps/generic/dl-mman.h: New file.
	 * sysdeps/mach/hurd/dl-mman.h: Likewise.
2016-01-06 11:28:56 -08:00
Anton Blanchard
0a1f1e78fb Eliminate redundant sign extensions in pow()
When looking at the code generated for pow() on ppc64 I noticed quite
a few sign extensions. Making the array indices unsigned reduces the
number of sign extensions from 24 to 7.
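
The kind of change involved can be sketched as follows.  The table and
function names are hypothetical and are not taken from the pow() sources;
whether a compiler really emits fewer instructions for the unsigned
variant depends on the target and the compiler version.

```c
/* Hypothetical lookup table.  */
const double table[4] = { 1.0, 2.0, 4.0, 8.0 };

/* With a signed int index, a 64-bit target may have to sign-extend the
   index before it can compute the address.  */
double lookup_signed (int i)
{
  return table[i];
}

/* With an unsigned index, no sign extension is needed.  */
double lookup_unsigned (unsigned int i)
{
  return table[i];
}
```

Both functions return the same values for valid indices; only the
generated code may differ.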

Tested for powerpc64le and x86_64.
2016-01-04 14:55:38 -02:00
Joseph Myers
1979f3c1ad Update copyright dates not handled by scripts/update-copyrights.
I've updated copyright dates in glibc for 2016.  This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.

	* NEWS: Update copyright dates.
	* catgets/gencat.c (print_version): Likewise.
	* csu/version.c (banner): Likewise.
	* debug/catchsegv.sh: Likewise.
	* debug/pcprofiledump.c (print_version): Likewise.
	* debug/xtrace.sh (do_version): Likewise.
	* elf/ldconfig.c (print_version): Likewise.
	* elf/ldd.bash.in: Likewise.
	* elf/pldd.c (print_version): Likewise.
	* elf/sotruss.sh: Likewise.
	* elf/sprof.c (print_version): Likewise.
	* iconv/iconv_prog.c (print_version): Likewise.
	* iconv/iconvconfig.c (print_version): Likewise.
	* locale/programs/locale.c (print_version): Likewise.
	* locale/programs/localedef.c (print_version): Likewise.
	* login/programs/pt_chown.c (print_version): Likewise.
	* malloc/memusage.sh (do_version): Likewise.
	* malloc/memusagestat.c (print_version): Likewise.
	* malloc/mtrace.pl: Likewise.
	* manual/libc.texinfo: Likewise.
	* nptl/version.c (banner): Likewise.
	* nscd/nscd.c (print_version): Likewise.
	* nss/getent.c (print_version): Likewise.
	* nss/makedb.c (print_version): Likewise.
	* posix/getconf.c (main): Likewise.
	* scripts/test-installation.pl: Likewise.
	* sysdeps/unix/sysv/linux/lddlibc4.c (main): Likewise.
2016-01-04 16:26:30 +00:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Helge Deller
d4eed61f85 hppa: Add MAP_HUGETLB and MAP_STACK defines [BZ #19285]
The attached patch adds some upstream defines like MAP_HUGETLB and MAP_STACK
in mman.h for the hppa architecture.

The existing MADV_xxK_PAGES defines were dropped upstream, because they were
originally added many years ago based on a proposed patch for the Linux kernel
which was never applied. So, this patch drops those unneeded defines.
2016-01-02 23:39:49 -05:00
Mike Frysinger
19e0751014 ia64: fpu: fix gamma definition handling [BZ #15421]
The rework in commit d709042a6e broke
building on ia64 due to compat_symbol expanding into ... in some cases.
The common files were wrapped in a BUILD_LGAMMA check, but the ia64
ones were not.  Add that logic to the ia64 files too.
2016-01-01 22:17:07 -05:00
Dmitry V. Levin
e0043e17df Fix linux personality syscall wrapper
The personality system call, starting with linux kernel commit
v2.6.29-6609-g11d06b2a1e5658f448a308aa3beb97bacd64a940, always
successfully changes the personality if requested.  The syscall
wrapper, however, still can return an error in the following cases:
- the value returned by the system call looks like an error
due to architecture limitations of 32-bit kernels;
- a personality greater than 0xffffffff is passed to the system call,
and the 64-bit kernel does not have commit
v2.6.35-rc1-372-g485d527686850d68a0e9006dd9904f19f122485e
that would truncate this value to unsigned int;
- on sparc64, the value returned by the system call looks like an error
due to sparc64 kernel sign extension bug.

The solution is three-fold:
- move generic syscalls.list personality entry to generic 64-bit
syscalls.list file;
- for each 32-bit architecture that uses negated errno semantics,
add a NOERRNO personality entry to their syscalls.list file;
- for sparc64 and 32-bit architectures that use dedicated registers
to flag syscall errors, add a wrapper around personality syscall;
if the system call return value is flagged as an error, this wrapper
returns the negated "would be errno" value, otherwise it returns
the system call return value; on sparc64, it also truncates the
personality argument to unsigned int before passing it to the kernel.
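
The third case can be modeled with a toy sketch.  It does not call the
kernel: struct raw_result, personality_result and truncate_persona are
hypothetical names, and the structure merely stands in for a syscall
return value plus a separate error flag.

```c
/* Toy model of a raw syscall result on an architecture with a
   dedicated error flag: a value plus the flag.  */
struct raw_result { long value; int error_flag; };

/* If the result was flagged as an error, the value is the "would be
   errno"; hand back its negation.  Otherwise pass the value through.  */
long personality_result (struct raw_result r)
{
  return r.error_flag ? -r.value : r.value;
}

/* sparc64-style truncation of the argument to unsigned int before it
   is passed on.  */
unsigned long long truncate_persona (unsigned long long persona)
{
  return (unsigned int) persona;
}
```

In this model a flagged result of 22 is returned as -22, an unflagged 8 is
returned unchanged, and an argument of 0x100000008 is truncated to 8.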

[BZ #19408]
* sysdeps/unix/sysv/linux/personality.c: New file.
* sysdeps/unix/sysv/linux/sparc/sparc64/personality.c: Likewise.
* sysdeps/unix/sysv/linux/tst-personality.c: Likewise.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) == misc]
(sysdep_routines): Add personality.
(tests): Add tst-personality.
* sysdeps/unix/sysv/linux/syscalls.list (personality): Move ...
* sysdeps/unix/sysv/linux/wordsize-64/syscalls.list: ... here.
* sysdeps/unix/sysv/linux/arm/syscalls.list (personality): New entry.
* sysdeps/unix/sysv/linux/hppa/syscalls.list (personality): Likewise.
* sysdeps/unix/sysv/linux/i386/syscalls.list (personality): Likewise.
* sysdeps/unix/sysv/linux/m68k/syscalls.list (personality): Likewise.
* sysdeps/unix/sysv/linux/microblaze/syscalls.list (personality):
Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/syscalls.list (personality):
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/syscalls.list (personality):
Likewise.
* sysdeps/unix/sysv/linux/sh/syscalls.list (personality): Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (personality):
Likewise.
2015-12-31 00:17:48 +00:00
Aurelien Jarno
cc42170ef6 Cleanup ARM ioperm implementation (step 2)
Since GLIBC requires a minimum 2.6.32 kernel, the sysctl (CTL_BUS,
CTL_BUS_ISA, ISA_*) is always available.  We can therefore remove the
fallback code reading /etc/arm_systype or parsing /proc/cpuinfo.

Remove fscanf from localplt.data as it is no longer called from within
GLIBC.

	* sysdeps/unix/sysv/linux/arm/ioperm.c: Do not include <string.h>.
	(PATH_ARM_SYSTYPE): Remove.
	(PATH_CPUINFO): Likewise.
	(IO_BASE_FOOTBRIDGE): Likewise.
	(IO_SHIFT_FOOTBRIDGE): Likewise.
	(struct platform): Likewise.
	(init_iosys): Remove compatibility code for 2.4 kernels.
	* sysdeps/unix/sysv/linux/arm/localplt.data: Remove fscanf.
2015-12-30 23:31:18 +01:00
John David Anglin
d51442aacd hppa: Define __NO_LONG_DOUBLE_MATH so headers are consistent with libm build [BZ #19270]
The attached patch fixes BZ #19270 and the Debian gmt package now builds
successfully.  Aside from the comment, the define of __NO_LONG_DOUBLE_MATH
is similar to that in the generic version of glibc.

Build tested on hppa-unknown-linux-gnu with no observed regressions.
2015-12-29 13:24:51 -05:00
Mike Frysinger
d46256f440 ia64: fpu: fix gammaf typo [BZ #15421]
The lgamma rewrite in commit d709042a6e
used "gammaf" in this function when it should have used "gamma".
2015-12-28 22:20:03 -05:00
Torvald Riegel
389fdf78b2 Do not violate mutex destruction requirements.
POSIX and C++11 require that a thread can destroy a mutex if no other
thread owns the mutex, is blocked on the mutex, or will try to acquire
it in the future.  After destroying the mutex, the thread can reuse or
unmap the underlying memory.  Thus, we must not access a mutex's memory
after releasing it.  Currently, we can load the private flag after
releasing the mutex; this patch fixes that.
See https://sourceware.org/bugzilla/show_bug.cgi?id=13690 for more
background.

We need to call futex_wake on the lock after releasing it, however.  This
is by design, and can lead to spurious wake-ups on unrelated futex words
(e.g., when the mutex memory is reused for another mutex).  This behavior
is documented in the glibc-internal futex API and in recent drafts of the
Linux kernel's futex documentation (see the draft_futex branch of
git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git).
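
A minimal sketch of the usage pattern this requirement protects, assuming a hypothetical reference-counted object (`struct shared_obj`, `shared_obj_new` and `shared_obj_unref` are illustrative names, not glibc interfaces).  The thread that drops the last reference destroys the mutex and frees its memory right after unlocking, which is only safe if no other thread's unlock touches the mutex after releasing it:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not a glibc type.  */
struct shared_obj {
    pthread_mutex_t lock;
    int refs;
};

/* Allocate an object that starts with `refs` references.  */
struct shared_obj *shared_obj_new(int refs)
{
    struct shared_obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    if (pthread_mutex_init(&o->lock, NULL) != 0) {
        free(o);
        return NULL;
    }
    o->refs = refs;
    return o;
}

/* Drop one reference; returns 1 if this call destroyed the object.  */
int shared_obj_unref(struct shared_obj *o)
{
    pthread_mutex_lock(&o->lock);
    assert(o->refs > 0);
    int last = (--o->refs == 0);
    pthread_mutex_unlock(&o->lock);

    if (last) {
        /* No other thread owns, waits on, or will later acquire the
           lock, so destroying it and freeing its memory is permitted
           here, even if another thread's unlock call has released the
           lock but has not yet returned.  */
        pthread_mutex_destroy(&o->lock);
        free(o);
    }
    return last;
}
```

With two references, the first `shared_obj_unref` call returns 0 and the second returns 1 after freeing the object.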
2015-12-23 18:44:53 +01:00
Carlos Eduardo Seo
c676e65939 powerpc: Export __parse_hwcap_and_convert_at_platform to libc.a.
Commit 67385a01d2 added a new feature for
powerpc, where we store HWCAP/Platform bits in the TCB.  In the dynamic
linking case, we use the versioned symbol
'__parse_hwcap_and_convert_at_platform' to verify if this feature is
available.  However, the same symbol was not exported to libc.a, making
it impossible for GCC to check for it prior to link time.
2015-12-22 15:41:19 -02:00
Carlos Eduardo Seo
b1f19b8ef1 powerpc: Add basic support for POWER9 sans hwcap.
This patch adds the minimum changes for supporting the POWER9 processor.
2015-12-22 14:45:55 -02:00
Samuel Thibault
2cf3e1aa74 Harmonize generic stdio-lock support with nptl
This fixes the build when _IO_funlockfile is a macro, fixes the build
where _IO_acquire_lock_clear_flags2 is used, and fixes unlocking on
unexpected stack unwind.

	* sysdeps/generic/stdio-lock.h [__EXCEPTIONS] (_IO_acquire_lock,
	_IO_release_lock): Use cleanup attribute on new
	_IO_acquire_lock_file variable instead of assuming that
	_IO_release_lock will be called.
	[!__EXCEPTIONS] (_IO_acquire_lock): Define to non-existing
	_IO_acquire_lock_needs_exceptions_enabled.
	(_IO_acquire_lock_clear_flags2): New macro.
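
The cleanup-attribute approach named above can be sketched as follows.  This is an illustration under stated assumptions, not the contents of stdio-lock.h: `struct toy_lock` and every function name here are hypothetical, and only the GCC `cleanup` variable attribute itself is a real feature.  The handler runs whenever the annotated variable goes out of scope, so an early return still releases the lock; running it during exception unwinding additionally requires compiling with exceptions enabled, which matches the [__EXCEPTIONS] condition in the entry.

```c
/* Hypothetical stand-in for a stream lock; not glibc's _IO_lock_t.  */
struct toy_lock {
    int held;
};

static void toy_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_release(struct toy_lock *l) { l->held = 0; }

/* A cleanup handler receives a pointer to the annotated variable.  */
static void toy_release_cleanup(struct toy_lock **lp)
{
    toy_release(*lp);
}

/* Returns the new counter value, or -1 if the counter is negative.  */
int guarded_increment(struct toy_lock *lock, int *counter)
{
    struct toy_lock *guard
        __attribute__((cleanup(toy_release_cleanup))) = lock;
    toy_acquire(guard);

    if (*counter < 0)
        return -1;      /* Early exit: the handler still releases.  */
    return ++*counter;
}
```

After either return path, `lock->held` is back to 0 without an explicit release call in the function body.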
2015-12-22 14:39:19 +01:00
Adhemerval Zanella
661a29a518 powerpc: Regenerate libm-test-ulps
* sysdeps/powerpc/fpu/libm-test-ulps: Regenerated.
2015-12-22 11:11:01 -02:00
Siddhesh Poyarekar
b300455644 Consolidate sincos computation for 2.426265 < |x| < 105414350
Like the previous change, exploit the fact that the computations for sin
and cos are identical except that they are a quadrant apart.  Also
remove csloww, csloww1 and csloww2 since they can easily be expressed
in terms of sloww, sloww1 and sloww2.
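
A rough sketch of the quadrant relationship, not the code in this commit: after reducing x to r with x = r + n*(pi/2), sin(x) is evaluated in quadrant n and cos(x) in quadrant n + 1, so one argument reduction serves both.  All names below are hypothetical, the kernels are plain Taylor polynomials, and the naive reduction is only reasonable for small |x|; glibc uses far more careful reduction and correction steps.

```c
/* Taylor kernels, adequate only for |r| <= pi/4.  */
static double kernel_sin(double r)
{
    double z = r * r;
    return r * (1.0 + z * (-1.0 / 6 + z * (1.0 / 120 + z * (-1.0 / 5040
           + z * (1.0 / 362880 + z * (-1.0 / 39916800
           + z * (1.0 / 6227020800.0)))))));
}

static double kernel_cos(double r)
{
    double z = r * r;
    return 1.0 + z * (-1.0 / 2 + z * (1.0 / 24 + z * (-1.0 / 720
           + z * (1.0 / 40320 + z * (-1.0 / 3628800
           + z * (1.0 / 479001600))))));
}

/* Value of sin(r + q*pi/2) for a reduced argument r.  */
static double eval_in_quadrant(double r, long q)
{
    switch ((unsigned long) q & 3UL) {
    case 0:  return  kernel_sin(r);
    case 1:  return  kernel_cos(r);
    case 2:  return -kernel_sin(r);
    default: return -kernel_cos(r);
    }
}

/* One shared reduction; cos is sin shifted by one quadrant.  */
void sketch_sincos(double x, double *sinx, double *cosx)
{
    static const double half_pi = 1.5707963267948966;
    long n = (long) (x / half_pi + (x >= 0 ? 0.5 : -0.5));
    double r = x - (double) n * half_pi;

    *sinx = eval_in_quadrant(r, n);
    *cosx = eval_in_quadrant(r, n + 1);
}
```

The point of the design is that `eval_in_quadrant` is the only evaluation routine: the cosine path needs no separate code, only a quadrant index one higher than the sine path.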
2015-12-21 10:43:04 +05:30
Siddhesh Poyarekar
f7953c44d5 Consolidate sin and cos code for 105414350 <|x|< 281474976710656
The sin and cos computation for this range of input is identical
except for a difference in quadrants by 1.  Exploit that fact and the
common argument reduction to reduce computations for sincos.
2015-12-21 10:41:46 +05:30