These new memcpy functions are the 32-bit version of x86_64 SSE2 unaligned
memcpy. The average memcpy performance benefit is 18% on Silvermont; other
platforms improved by about 35%. Benchmarked on Silvermont, Haswell, Ivy
Bridge, Sandy Bridge and Westmere; performance results are attached at
https://sourceware.org/ml/libc-alpha/2014-07/msg00157.html
* sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S: New file.
* sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove-sse2-unaligned.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy-sse2-unaligned.S: Likewise.
* sysdeps/i386/i686/multiarch/bcopy.S: Select the sse2_unaligned
version if bit_Fast_Unaligned_Load is set.
* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
bcopy-sse2-unaligned, memcpy-sse2-unaligned,
memmove-sse2-unaligned and mempcpy-sse2-unaligned.
* sysdeps/i386/i686/multiarch/ifunc-impl-list.c (MAX_IFUNC): Set
to 4.
(__libc_ifunc_impl_list): Test __bcopy_sse2_unaligned,
__memmove_chk_sse2_unaligned, __memmove_sse2_unaligned,
__memcpy_chk_sse2_unaligned, __memcpy_sse2_unaligned,
__mempcpy_chk_sse2_unaligned, and __mempcpy_sse2_unaligned.
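The selection step described above can be sketched in plain C. This is only
an illustration of picking a variant from a feature bit, not the actual
multiarch assembly: DEMO_FAST_UNALIGNED_LOAD and every demo_* name are
hypothetical, and the "unaligned" variant simply defers to memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature flag for this sketch; it only mirrors the role of
   bit_Fast_Unaligned_Load and is not the real definition.  */
#define DEMO_FAST_UNALIGNED_LOAD (1u << 0)

typedef void *(*demo_memcpy_fn) (void *, const void *, size_t);

/* Baseline byte-by-byte copy, standing in for the default variant.  */
static void *
demo_memcpy_generic (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n-- > 0)
    *d++ = *s++;
  return dst;
}

/* Stand-in for the sse2_unaligned variant; it simply defers to memcpy.  */
static void *
demo_memcpy_sse2_unaligned (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Pick an implementation once, based on the feature bits.  */
demo_memcpy_fn
demo_select_memcpy (unsigned int features)
{
  if (features & DEMO_FAST_UNALIGNED_LOAD)
    return demo_memcpy_sse2_unaligned;
  return demo_memcpy_generic;
}

/* Usage: whichever variant is selected must copy the same bytes.  */
void
demo_check_memcpy (void)
{
  char a[6] = { 0 };
  char b[6] = { 0 };
  demo_select_memcpy (0) (a, "hello", 6);
  demo_select_memcpy (DEMO_FAST_UNALIGNED_LOAD) (b, "hello", 6);
  assert (memcmp (a, b, sizeof a) == 0);
}
```

The point of the sketch is that the choice is made once from the feature
bits, so callers never test the CPU themselves.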
This patch adds support to generate the spec array in getconf from the
conf.list. The generated code is mostly unchanged; the only changes
are due to the change in layout of the spec and val arrays in the ELF.
The val array can also be auto-generated from posix-conf-vars.list
once the remaining macros are added to it.
* posix/posix-conf-vars.list (SPEC:XBS5): Add sysconf prefix.
* posix/confstr.c: Define NEED_SPEC_ARRAY to 0.
* posix/posix-envs.def: Likewise.
* sysdeps/posix/sysconf.c: Likewise.
* posix/getconf.c: Define NEED_SPEC_ARRAY to 1.
(specs): Remove array.
* scripts/gen-posix-conf-vars.awk: Support generation of specs
array.
This fixes the remaining -Wundef warnings. Tested on x86_64.
* posix/posix-conf-vars.list: Add _POSIX sysconf namespace.
* sysdeps/posix/sysconf.c: Include posix-conf-vars.h.
(__sysconf): Use CONF_IS_* macros.
This patch adds a file posix-conf-vars.list that is used to generate
macros to determine if a macro is defined as set, unset or not
defined. gen-posix-conf-vars.awk processes this file and generates a
header (posix-conf-vars-def.h) with these macros. A new header
posix-conf-vars.h includes this generated header and defines accessor
macros for the generated macros.
Tested on x86_64.
* posix/Makefile (before-compile): Add posix-conf-vars-def.h.
($(objpfx)posix-conf-vars-def.h): New target.
* posix/posix-conf-vars.list: New file.
* posix/posix-conf-vars.h: New file.
* posix/confstr.c: Include posix-conf-vars.h.
(confstr): Use CONF_IS_* macros.
* posix/posix-envs.def: Include posix-conf-vars.h. Use
CONF_IS_* macros.
* scripts/gen-posix-conf-vars.awk: New file.
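A minimal sketch of the set/unset/undefined idea follows. It assumes one
state macro is generated per variable plus accessor macros that compare
against it. All DEMO_* names are hypothetical and are not the contents of
posix-conf-vars-def.h or posix-conf-vars.h.

```c
#include <assert.h>

/* All names below are hypothetical and exist only for this sketch.  */
#define DEMO_STATE_UNDEFINED 1
#define DEMO_STATE_SET       2
#define DEMO_STATE_UNSET     3

/* Sample inputs: DEMO_OPT_A is set, DEMO_OPT_B is defined but unset,
   and DEMO_OPT_C is deliberately left undefined.  */
#define DEMO_OPT_A 1
#define DEMO_OPT_B (-1)

/* The kind of block a generator could emit per variable.  The #elif is
   skipped when the macro is undefined, so the undefined name is never
   evaluated in an #if expression.  */
#ifndef DEMO_OPT_A
# define DEMO_STATE_DEMO_OPT_A DEMO_STATE_UNDEFINED
#elif DEMO_OPT_A > 0
# define DEMO_STATE_DEMO_OPT_A DEMO_STATE_SET
#else
# define DEMO_STATE_DEMO_OPT_A DEMO_STATE_UNSET
#endif

#ifndef DEMO_OPT_B
# define DEMO_STATE_DEMO_OPT_B DEMO_STATE_UNDEFINED
#elif DEMO_OPT_B > 0
# define DEMO_STATE_DEMO_OPT_B DEMO_STATE_SET
#else
# define DEMO_STATE_DEMO_OPT_B DEMO_STATE_UNSET
#endif

#ifndef DEMO_OPT_C
# define DEMO_STATE_DEMO_OPT_C DEMO_STATE_UNDEFINED
#elif DEMO_OPT_C > 0
# define DEMO_STATE_DEMO_OPT_C DEMO_STATE_SET
#else
# define DEMO_STATE_DEMO_OPT_C DEMO_STATE_UNSET
#endif

/* Accessors.  The state macro always exists, so these are safe in #if.  */
#define DEMO_IS_SET(opt)       (DEMO_STATE_##opt == DEMO_STATE_SET)
#define DEMO_IS_UNSET(opt)     (DEMO_STATE_##opt == DEMO_STATE_UNSET)
#define DEMO_IS_UNDEFINED(opt) (DEMO_STATE_##opt == DEMO_STATE_UNDEFINED)

/* Counts the options that are set, using the accessors in #if.  */
int
demo_count_set_options (void)
{
  int count = 0;
#if DEMO_IS_SET (DEMO_OPT_A)
  count++;
#endif
#if DEMO_IS_SET (DEMO_OPT_B)
  count++;
#endif
#if DEMO_IS_SET (DEMO_OPT_C)
  count++;
#endif
  return count;
}

/* Usage: each sample option lands in exactly one state.  */
void
demo_check_conf_states (void)
{
  assert (DEMO_IS_SET (DEMO_OPT_A));
  assert (DEMO_IS_UNSET (DEMO_OPT_B));
  assert (DEMO_IS_UNDEFINED (DEMO_OPT_C));
}
```

Because the accessors paste the variable name instead of expanding it, they
work even when the variable itself was never defined.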
These avoid having tile generate real calls to the no-op
functions, which then causes linknamespace test failures.
It might make sense to factor all of these out into a common
header that can be shared by tile, microblaze, etc., but for
now just fix the test failures.
These definitions were added back before __ASSUME_POSIX_CPU_TIMERS
was removed. There used to be a vsyscall to clock_getres() in
maybe_syscall_settime_cpu(), but that function was removed in commit
26889eac. The presence of the vsyscall definitions means that platforms
that don't provide clock_getres as a vsyscall hit a symbol redefinition
warning in this file, becoming fatal with -Werror. Removing the
vsyscall definitions is the obvious fix.
No change to generated code on x86_64.
The HAVE_CLOCK_GETTIME_VSYSCALL symbol was defined only
conditionally, under [SHARED]. However, it turns
out this causes a preprocessor symbol redefinition warning
when building clock_gettime.o. Move the symbol definition
down to make it unconditional, like other platforms do.
The _Unwind_GetCFA() routine returns a 64-bit value,
which we interpret as a pointer. Add an intermediate
cast to long so that in ILP32 mode we don't get a warning
about casting a wrong-sized integer to a pointer.
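The intermediate cast can be sketched as follows. demo_get_cfa is a
hypothetical stand-in that returns an address as a 64-bit word, as described
above; it is not the unwinder API. The sketch assumes long has pointer
width, which holds for ILP32 and LP64 but not for every ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a routine that returns an address as a
   64-bit word, as the text describes for _Unwind_GetCFA().  */
static uint64_t
demo_get_cfa (const void *frame)
{
  return (uint64_t) (uintptr_t) frame;
}

void *
demo_cfa_as_pointer (const void *frame)
{
  uint64_t cfa = demo_get_cfa (frame);
  /* A direct (void *) cfa cast warns when pointers are 32 bits wide.
     long matches the pointer width on ILP32 and LP64, so narrowing to
     it first makes the size change explicit.  */
  return (void *) (long) cfa;
}

/* Usage: the round trip must give back the original address.  */
void
demo_check_cfa (void)
{
  int local = 0;
  assert (demo_cfa_as_pointer (&local) == (void *) &local);
}
```

Where long is narrower than a pointer, uintptr_t would be the safer
intermediate type.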
I missed this during the initial port. Some testing shows that
enabling this mode does, unsurprisingly, yield some nice speedups
on the math functions in question.
This patch makes __ASSUME_UTIMES hppa-specific, removing mentions of
the macro from architecture-independent code and code for other
architectures. (All other architectures either have the utimes
syscall in all relevant kernel versions, or use the asm-generic
interface so only have utimensat and won't get the utimes syscall.) The
approach is similar to the one used for futimesat on MicroBlaze: if
the kernel is recent enough that the utimes syscall can be assumed to
be present, use the implementation in terms of the utimes syscall, and
otherwise use the linux/generic implementation in terms of utimensat.
Tested for x86_64 that the disassembly of installed shared libraries is
unchanged by the patch. Not tested for hppa.
* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_UTIMES): Do
not define.
* sysdeps/unix/sysv/linux/utimes.c: Do not include
<kernel-features.h>.
(__utimes) [__NR_utimes]: Make code unconditional.
(__utimes) [!__ASSUME_UTIMES]: Remove conditional code.
* sysdeps/unix/sysv/linux/aarch64/kernel-features.h
(__ASSUME_UTIMES): Do not undefine.
* sysdeps/unix/sysv/linux/tile/kernel-features.h
(__ASSUME_UTIMES): Likewise.
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
(__ASSUME_UTIMES): Define for [__LINUX_KERNEL_VERSION >= 0x030e00]
instead of undefining for [__LINUX_KERNEL_VERSION < 0x030e00].
* sysdeps/unix/sysv/linux/hppa/utimes.c: New file.
[BZ #17747]
The y0/y1/yn and j0/j1/jn functions provided a strong_alias
to the "l"-suffixed variants when no long double support is
being compiled. This breaks namespace conformance when the
basename versions conform but the l-suffixed ones don't.
Fixed by making them weak aliases instead.
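As an illustration of the weak-alias form, here is a sketch using the
GCC/Clang alias attribute on an ELF target. demo_j0 and demo_j0l are
hypothetical names, not the libm sources, and the function body is a
placeholder rather than a Bessel function.

```c
#include <assert.h>

/* Hypothetical function standing in for a base-name routine.  */
double
demo_j0 (double x)
{
  return x * 0.5;
}

/* Weak alias: demo_j0l resolves to demo_j0 unless some other object
   supplies a strong definition of demo_j0l, in which case that one is
   used and no duplicate-symbol conflict arises.  */
extern __typeof (demo_j0) demo_j0l
  __attribute__ ((weak, alias ("demo_j0")));

/* Usage: with no competing definition, both names reach the same code.  */
void
demo_check_alias (void)
{
  assert (demo_j0l (4.0) == demo_j0 (4.0));
}
```

A strong alias in the same position would instead claim the l-suffixed name
outright, which is the conflict described above.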
[BZ #17746]
The __builtin_expect() truncated a uint64_t to a 32-bit long
in ILP32 mode, discarding the high 32 bits, and potentially
missing the NUL terminator that we were searching for with SIMD
operations. Explicitly compare to zero to fix the problem.
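The truncation can be reproduced on any target with the sketch below. The
uint32_t cast models what happens when a 64-bit value passes through a
32-bit long parameter. The demo_* function names are hypothetical, and
__builtin_expect is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Models the bug: only the low 32 bits of the mask reach the test, so a
   match recorded in the high half is missed.  */
int
demo_has_match_truncated (uint64_t mask)
{
  uint32_t low = (uint32_t) mask;
  return __builtin_expect (low != 0, 0);
}

/* Models the fix: compare the full 64-bit value to zero first, so the
   builtin only ever sees 0 or 1, which fits in any long.  */
int
demo_has_match_fixed (uint64_t mask)
{
  return __builtin_expect (mask != 0, 0);
}

/* Usage: a bit set only in the high half separates the two versions.  */
void
demo_check_expect (void)
{
  uint64_t high_only = (uint64_t) 0x80 << 32;
  assert (demo_has_match_truncated (high_only) == 0);
  assert (demo_has_match_fixed (high_only) == 1);
}
```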
Bug 17724 reports references to fesetround being brought in by
ldbl-128ibm rintl via references to __rintl from __kernel_standard_l.
Because all three __kernel_standard* functions are in the same file,
this gets brought in even though only the long double version
__kernel_standard_l needs __rintl, and the C90 functions use only
__kernel_standard.
This patch fixes this by splitting the three versions into separate
files; it's fine for long double functions to refer to fe* functions
directly, unless they get called by C90 double functions.
Tested for x86_64 (testsuite; the reordering of code means disassembly
of shared libraries can't usefully be compared). Tested for powerpc
that the relevant issue disappears from the linknamespace test
output.
[BZ #17724]
* sysdeps/ieee754/k_standard.c: Don't include <float.h>.
(__kernel_standard_f): Remove. Moved to k_standardf.c.
(__kernel_standard_l): Remove. Moved to k_standardl.c with
(char *) casts added.
* sysdeps/ieee754/k_standardf.c: New file.
* sysdeps/ieee754/k_standardl.c: Likewise.
* math/Makefile (libm-support): Remove k_standard.
(libm-calls): Add k_standard.
On Linux architectures using socketcall, the resolver ends up bringing
in strong symbols for bind and getsockname, which are not in
POSIX.1-1996. This causes linknamespace test failures:
FAIL: conform/POSIX/pthread.h/linknamespace
FAIL: conform/POSIX/sched.h/linknamespace
FAIL: conform/POSIX/time.h/linknamespace
These functions are defined as strong symbols with __bind and
__getsockname as weak aliases. This patch switches this to the other
way round by removing the NO_WEAK_ALIAS definitions and so letting the
default case in socket.S act; I see no reason for the existing
arrangements.
Tested for x86 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch).
[BZ #17733]
* sysdeps/unix/sysv/linux/bind.S (NO_WEAK_ALIAS): Do not define.
(__bind): Do not define as weak alias.
* sysdeps/unix/sysv/linux/getsockname.S (NO_WEAK_ALIAS): Do not
define.
(__getsockname): Do not define as weak alias.
The merge of the latest gettext code introduced changes to the yacc
parser source that are incompatible with versions of bison older
than 2.7. Add a configure check for the appropriate versions and
document the requirement in INSTALL.
ChangeLog:
2014-12-22 Will Newton <will.newton@linaro.org>
* manual/install.texi: Document that we require bison 2.7
or above.
* INSTALL: Regenerate.
* configure.ac: Use AC_CHECK_PROG_VER instead of
AC_PATH_PROG when checking for bison and check for
version 2.7 or above.
* configure: Regenerate.
__tls_get_addr/___tls_get_addr is always defined in ld.so. There is
no need to call them via PLT inside ld.so. This patch adds the hidden
__tls_get_addr/___tls_get_addr aliases and calls them directly from
_dl_tlsdesc_dynamic. There is no need to set up the EBX register in
i386 _dl_tlsdesc_dynamic when calling the hidden ___tls_get_addr.
* elf/dl-tls.c (__tls_get_addr): Provide the hidden definition
if not defined.
* sysdeps/i386/dl-tls.h (___tls_get_addr): Provide the hidden
definition.
* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
hidden ___tls_get_addr.
* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
hidden __tls_get_addr.
* sysdeps/generic/localplt.data (__tls_get_addr): Removed.
* sysdeps/unix/sysv/linux/i386/localplt.data (___tls_get_addr):
Likewise.
_dl_start_user in ld.so calls the local function _dl_init. There is no
need to go through PLT.
* sysdeps/i386/dl-machine.h (_dl_start_user): Remove @PLT
from "call _dl_init@PLT".
* sysdeps/x86_64/dl-machine.h (_dl_start_user): Likewise.
C99, C11, POSIX, and the glibc implementation do guarantee that the
pointers passed to the qsort comparison function lie within the array.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
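The qsort guarantee above can be observed with a comparison function that
records whether every pointer it receives falls inside the array being
sorted. This is a sketch with hypothetical demo_* names. It keeps its state
in file-scope variables, so it is not reentrant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bounds of the array being sorted, and the running verdict.  */
static uintptr_t demo_lo, demo_hi;
static int demo_all_inside;

static int
demo_inside (const void *p)
{
  uintptr_t addr = (uintptr_t) p;
  return addr >= demo_lo && addr < demo_hi;
}

/* Orders ints ascending and notes any pointer outside the array.  */
static int
demo_compare_ints (const void *a, const void *b)
{
  if (!demo_inside (a) || !demo_inside (b))
    demo_all_inside = 0;
  int x = *(const int *) a;
  int y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Sorts VALUES and returns 1 if every comparison saw in-array pointers.  */
int
demo_sort_and_check (int *values, size_t count)
{
  demo_lo = (uintptr_t) values;
  demo_hi = (uintptr_t) (values + count);
  demo_all_inside = 1;
  qsort (values, count, sizeof values[0], demo_compare_ints);
  return demo_all_inside;
}

/* Usage: sort a small array and confirm both the order and the verdict.  */
void
demo_check_qsort (void)
{
  int values[] = { 5, 3, 9, 1, 7 };
  assert (demo_sort_and_check (values, 5) == 1);
  assert (values[0] == 1 && values[4] == 9);
}
```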
The two_way_short_needle() routine included from str-two-way.h
is not used, so mark it as unused to avoid compiler warnings.
Calling strnlen() breaks linknamespace tests, so change it
to __strnlen().
* sysdeps/mips/sys/asm.h (PTR_ADDU): Use addu on mips32r6/mips64r6.
(PTR_ADDIU): Use addiu for mips32r6/mips64r6.
(PTR_SUBU): Use subu for mips32r6/mips64r6.
(PTR_SUBIU): Use subu for mips32r6/mips64r6 (subiu does not exist).
* sysdeps/mips/machine-gmon.h (PTR_ADDU_STRING): Use addu for
mips32r6/mips64r6.
(PTR_SUBU_STRING): Use subu for mips32r6/mips64r6.