glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-14 07:10:05 +00:00

Author	SHA1	Message	Date
Wilco Dijkstra	fe161827c5	Fix build issue with SINGLE_THREAD_P Add sysdep-cancel.h include. * malloc/malloc.c (sysdep-cancel.h): Add include. (cherry-picked `6d43de4b85`)	2017-11-28 19:15:13 +05:30
Wilco Dijkstra	e759c32364	Add single-threaded path to _int_free This patch adds single-threaded fast paths to _int_free. Bypass the explicit locking for larger allocations. * malloc/malloc.c (_int_free): Add SINGLE_THREAD_P fast paths. (cherry-picked from `a15d53e2de`)	2017-11-28 19:14:48 +05:30
Wilco Dijkstra	0e24837040	Fix deadlock in _int_free consistency check This patch fixes a deadlock in the fastbin consistency check. If we fail the fast check due to concurrent modifications to the next chunk or system_mem, we should not lock if we already have the arena lock. Simplify the check to make it obviously correct. * malloc/malloc.c (_int_free): Fix deadlock bug in consistency check. (cherry-pick `d74e6f6c0d`)	2017-11-28 19:13:49 +05:30
Florian Weimer	590b24e6e0	malloc: Resolve compilation failure in NDEBUG mode In _int_free, the locked variable is not used if NDEBUG is defined. (cherry-picked from `24cffce736`)	2017-11-28 19:12:19 +05:30
Florian Weimer	533afac929	malloc: Change top_check return type to void After commit `ec2c1fcefb`, (malloc: Abort on heap corruption, without a backtrace), the function always returns 0. (cherry-picked from `5129873a8e`)	2017-11-28 19:11:30 +05:30
Florian Weimer	675e8785dc	malloc: Remove corrupt arena flag This is no longer needed because we now abort immediately once heap corruption is detected. (cherry-picked from `a9da0bb266`)	2017-11-28 19:10:16 +05:30
Florian Weimer	ee717ed23d	malloc: Remove check_action variable [BZ #21754 ] Clean up calls to malloc_printerr and trim its argument list. This also removes a few bits of work done before calling malloc_printerr (such as unlocking operations). The tunable/environment variable still enables the lightweight additional malloc checking, but mallopt (M_CHECK_ACTION) no longer has any effect. (cherry-picked from `ac3ed168d0`)	2017-11-28 19:09:25 +05:30
Florian Weimer	8788996793	malloc: Abort on heap corruption, without a backtrace [BZ #21754 ] The stack trace printing caused deadlocks and has been itself been targeted by code execution exploits. (cherry-picked from `ec2c1fcefb`)	2017-11-28 19:07:55 +05:30
Tulio Magno Quites Machado Filho	aaa2eb83b8	powerpc: Update AT_HWCAP2 bits Linux commit ID cba6ac4869e45cc93ac5497024d1d49576e82666 reserved a new bit for a scenario where transactional memory is available, but the suspended state is disabled. * sysdeps/powerpc/bits/hwcap.h (PPC_FEATURE2_HTM_NO_SUSPEND): New macro. (cherry picked from commit `df0c40ee3a`) Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>	2017-11-24 18:30:11 -02:00
Andreas Schwab	71170eba2a	Add test for bug 21041 (cherry picked from commit `40c06a3d04`)	2017-11-21 20:09:49 +01:00
Andreas Schwab	4db8f362c1	Fix s390 version of pt-longjmp.c (cherry picked from commit `5797b410a8`)	2017-11-21 20:09:42 +01:00
Andreas Schwab	88758c4ad3	Don't use IFUNC resolver for longjmp or system in libpthread (bug 21041) Unlike the vfork forwarder and like the fork forwarder as in bug 19861, there won't be a problem when the compiler does not turn this into a tail call. (cherry picked from commit `fc5ad7024c`)	2017-11-21 20:09:34 +01:00
Rajalakshmi Srinivasaraghavan	6850e9c6ba	powerpc: Replace lxvd2x/stxvd2x with lvx/stvx in P7's memcpy/memmove POWER9 DD2.1 and earlier has an issue where some cache inhibited vector load traps to the kernel, causing a performance degradation. To handle this in memcpy and memmove, lvx/stvx is used for aligned addresses instead of lxvd2x/stxvd2x. Reference: https://patchwork.ozlabs.org/patch/814059/ * sysdeps/powerpc/powerpc64/power7/memcpy.S: Replace lxvd2x/stxvd2x with lvx/stvx. * sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `63da5cd4a0`)	2017-11-21 22:32:20 +05:30
Florian Weimer	2767ebd8bc	crypt: Adjust check-local-headers.sh for nspr4 include directory [BZ #17956 ] (cherry picked from commit `11c4f5010c`)	2017-11-18 19:27:03 +01:00
Guido Trentalancia	82b1663202	crypt: Use NSPR header files in addition to NSS header files [BZ #17956 ] When configuring and building GNU libc using the Mozilla NSS library for cryptography (--enable-nss-crypt option), also include the NSPR header files along with the Mozilla NSS library header files. Finally, when running the check-local-headers test, ignore the Mozilla NSPR library header files (used by the Mozilla NSS library). (cherry picked from commit `57b4af1955`)	2017-11-18 19:26:57 +01:00
Wilco Dijkstra	a546080d51	Fix build failure on tilepro due to unsupported atomics * malloc/malloc.c (malloc_state): Use int for have_fastchunks since not all targets support atomics on bool. (cherry-picked from `2c2245b92c`)	2017-11-16 12:23:00 +05:30
Siddhesh Poyarekar	aa5be982ea	Use relaxed atomics for malloc have_fastchunks Currently free typically uses 2 atomic operations per call. The have_fastchunks flag indicates whether there are recently freed blocks in the fastbins. This is purely an optimization to avoid calling malloc_consolidate too often and avoiding the overhead of walking all fast bins even if all are empty during a sequence of allocations. However using catomic_or to update the flag is completely unnecessary since it can be changed into a simple boolean and accessed using relaxed atomics. There is no change in multi-threaded behaviour given the flag is already approximate (it may be set when there are no blocks in any fast bins, or it may be clear when there are free blocks that could be consolidated). Performance of malloc/free improves by 27% on a simple benchmark on AArch64 (both single and multithreaded). The number of load/store exclusive instructions is reduced by 33%. Bench-malloc-thread speeds up by ~3% in all cases. * malloc/malloc.c (FASTCHUNKS_BIT): Remove. (have_fastchunks): Remove. (clear_fastchunks): Remove. (set_fastchunks): Remove. (malloc_state): Add have_fastchunks. (malloc_init_state): Use have_fastchunks. (do_check_malloc_state): Remove incorrect invariant checks. (_int_malloc): Use have_fastchunks. (_int_free): Likewise. (malloc_consolidate): Likewise. (cherry-picked from `e956075a5a`)	2017-11-16 12:21:27 +05:30
Wilco Dijkstra	ade53e0df7	Inline tcache functions The functions tcache_get and tcache_put show up in profiles as they are a critical part of the tcache code. Inline them to give tcache a 16% performance gain. Since this improves multi-threaded cases as well, it helps offset any potential performance loss due to adding single-threaded fast paths. * malloc/malloc.c (tcache_put): Inline. (tcache_get): Inline. (cherry-picked from commit `e4dd4ace56`)	2017-11-16 12:14:30 +05:30
James Clarke	77f921dac1	Fix TLS relocations against local symbols on powerpc32, sparc32 and sparc64 Normally, TLS relocations against local symbols are optimised by the linker to be absolute. However, gold does not do this, and so it is possible to end up with, for example, R_SPARC_TLS_DTPMOD64 referring to a local symbol. Since sym_map is left as null in elf_machine_rela for the special local symbol case, the relocation handling thinks it has nothing to do, and so the module gets left as 0. Havoc then ensues when the variable in question is accessed. Before this fix, the main_local_gold program would receive a SIGBUS on sparc64, and SIGSEGV on powerpc32. With this fix applied, that test now passes like the rest of them. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Assign sym_map to be map for local symbols, as TLS relocations use sym_map to determine whether the symbol is defined and to extract the TLS information. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise. (cherry picked from commit `8644588807`)	2017-11-14 21:07:52 +01:00
H.J. Lu	6a094c0ff1	x86-64: Regenerate libm-test-ulps for AVX512 mathvec tests Update libm-test-ulps for AVX512 mathvec tests by running “make regen-ulps” on Intel Xeon processor with AVX512. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. (cherry picked from commit `fcaaca412f`)	2017-11-13 13:31:18 +01:00
Adhemerval Zanella	a81c1156c1	nptl: Define __PTHREAD_MUTEX_{NUSERS_AFTER_KIND,USE_UNION} This patch adds two new internal defines to set the internal pthread_mutex_t layout required by the supported ABIS: 1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND which control whether to define __nusers fields before or after __kind. The preferred value for is 0 for new ports and it sets __nusers before __kind. 2. __PTHREAD_MUTEX_USE_UNION which control whether internal __spins and __list members will be place inside an union for linuxthreads compatibility. The preferred value is 0 for ports and it sets to not use an union to define both fields. It fixes the wrong offsets value for __kind value on x86_64-linux-gnu-x32. Checked with a make check run-built-tests=no on all afected ABIs. [BZ #22298] * nptl/allocatestack.c (allocate_stack): Check if __PTHREAD_MUTEX_HAVE_PREV is non-zero, instead if __PTHREAD_MUTEX_HAVE_PREV is defined. * nptl/descr.h (pthread): Likewise. * nptl/nptl-init.c (__pthread_initialize_minimal_internal): Likewise. * nptl/pthread_create.c (START_THREAD_DEFN): Likewise. * sysdeps/nptl/fork.c (__libc_fork): Likewise. * sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise. * sysdeps/nptl/bits/thread-shared-types.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. (__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE for internal layout. (__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE whether to use an union for __spins and __list fields. (__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION case. * sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. * sysdeps/alpha/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/arm/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/hppa/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/mips/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/s390/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sh/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sparc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/tile/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/x86/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `06be6368da`)	2017-11-07 11:03:01 -02:00
Adhemerval Zanella	5712f8db26	nptl: Add tests for internal pthread_mutex_t offsets This patch adds a new build test to check for internal fields offsets for user visible internal field. Although currently the only field which is statically initialized to a non zero value is pthread_mutex_t.__data.__kind value, the tests also check the offset of __kind, __spins, __elision (if supported), and __list internal member. A internal header (pthread-offset.h) is added to each major ABI with the reference value. Checked on x86_64-linux-gnu and with a build check for all affected ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf, hppa-linux-gnu, i686-linux-gnu, ia64-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu, mips64-linux-gnu, mips64-n32-linux-gnu, mips-linux-gnu, powerpc64le-linux-gnu, powerpc-linux-gnu, s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu, sparc64-linux-gnu, sparcv9-linux-gnu, tilegx-linux-gnu, tilegx-linux-gnu-x32, tilepro-linux-gnu, x86_64-linux-gnu, and x86_64-linux-x32). * nptl/pthreadP.h (ASSERT_PTHREAD_STRING, ASSERT_PTHREAD_INTERNAL_OFFSET): New macro. * nptl/pthread_mutex_init.c (__pthread_mutex_init): Add build time checks for internal pthread_mutex_t offsets. * sysdeps/aarch64/nptl/pthread-offsets.h (__PTHREAD_MUTEX_NUSERS_OFFSET, __PTHREAD_MUTEX_KIND_OFFSET, __PTHREAD_MUTEX_SPINS_OFFSET, __PTHREAD_MUTEX_ELISION_OFFSET, __PTHREAD_MUTEX_LIST_OFFSET): New macro. * sysdeps/alpha/nptl/pthread-offsets.h: Likewise. * sysdeps/arm/nptl/pthread-offsets.h: Likewise. * sysdeps/hppa/nptl/pthread-offsets.h: Likewise. * sysdeps/i386/nptl/pthread-offsets.h: Likewise. * sysdeps/ia64/nptl/pthread-offsets.h: Likewise. * sysdeps/m68k/nptl/pthread-offsets.h: Likewise. * sysdeps/microblaze/nptl/pthread-offsets.h: Likewise. * sysdeps/mips/nptl/pthread-offsets.h: Likewise. * sysdeps/nios2/nptl/pthread-offsets.h: Likewise. * sysdeps/powerpc/nptl/pthread-offsets.h: Likewise. * sysdeps/s390/nptl/pthread-offsets.h: Likewise. * sysdeps/sh/nptl/pthread-offsets.h: Likewise. * sysdeps/sparc/nptl/pthread-offsets.h: Likewise. * sysdeps/tile/nptl/pthread-offsets.h: Likewise. * sysdeps/x86_64/nptl/pthread-offsets.h: Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `dff91cd45e`)	2017-11-07 10:57:49 -02:00
Adhemerval Zanella	bfdb34f2f2	posix: Do not use WNOHANG in waitpid call for Linux posix_spawn As shown in some buildbot issues on aarch64 and powerpc, calling clone (VFORK) and waitpid (WNOHANG) does not guarantee the child is ready to be collected. This patch changes the call back to 0 as before `fe05e1cb6d` fix. This change can lead to the scenario 4.3 described in the commit, where the waitpid call can hang undefinitely on the call. However this is also a very unlikely and also undefinied situation where both the caller is trying to terminate a pid before posix_spawn returns and the race pid reuse is triggered. I don't see how to correct handle this specific situation within posix_spawn. Checked on x86_64-linux-gnu, aarch64-linux-gnu and powerpc64-linux-gnu. * sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of WNOHANG in waitpid call. (cherry picked from commit `aa95a2414e`)	2017-11-07 10:16:43 -02:00
Adhemerval Zanella	f8ee700e89	posix: Fix improper assert in Linux posix_spawn (BZ#22273) As noted by Florian Weimer, current Linux posix_spawn implementation can trigger an assert if the auxiliary process is terminated before actually setting the err member: 340 /* Child must set args.err to something non-negative - we rely on 341 the parent and child sharing VM. / 342 args.err = -1; [...] 362 new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size, 363 CLONE_VM \| CLONE_VFORK \| SIGCHLD, &args); 364 365 if (new_pid > 0) 366 { 367 ec = args.err; 368 assert (ec >= 0); Another possible issue is killing the child between setting the err and actually calling execve. In this case the process will not ran, but posix_spawn also will not report any error: 269 270 args->err = 0; 271 args->exec (args->file, args->argv, args->envp); As suggested by Andreas Schwab, this patch removes the faulty assert and also handles any signal that happens before fork and execve as the spawn was successful (and thus relaying the handling to the caller to figure this out). Different than Florian, I can not see why using atomics to set err would help here, essentially the code runs sequentially (due CLONE_VFORK) and I think it would not be legal the compiler evaluate ec without checking for new_pid result (thus there is no need to compiler barrier). Summarizing the possible scenarios on posix_spawn execution, we have: 1. For default case with a success execution, args.err will be 0, pid will not be collected and it will be reported to caller. 2. For default failure case, args.err will be positive and the it will be collected by the waitpid. An error will be reported to the caller. 3. For the unlikely case where the process was terminated and not collected by a caller signal handler, it will be reported as succeful execution and not be collected by posix_spawn (since args.err will be 0). The caller will need to actually handle this case. 4. For the unlikely case where the process was terminated and collected by caller we have 3 other possible scenarios: 4.1. The auxiliary process was terminated with args.err equal to 0: it will handled as 1. (so it does not matter if we hit the pid reuse race since we won't possible collect an unexpected process). 4.2. The auxiliary process was terminated after execve (due a failure in calling it) and before setting args.err to -1: it will also be handle as 1. but with the issue of not be able to report the caller a possible execve failures. 4.3. The auxiliary process was terminated after args.err is set to -1: this is the case where it will be possible to hit the pid reuse case where we will need to collected the auxiliary pid but we can not be sure if it will be expected one. I think for this case we need to actually change waitpid to use WNOHANG to avoid hanging indefinitely on the call and report an error to caller since we can't differentiate between a default failure as 2. and a possible pid reuse race issue. Checked on x86_64-linux-gnu. sysdeps/unix/sysv/linux/spawni.c (__spawnix): Handle the case where the auxiliary process is terminated by a signal before calling _exit or execve. (cherry picked from commit `fe05e1cb6d`)	2017-11-07 10:15:45 -02:00
Adhemerval Zanella	caa6857ec1	posix: Fix compat glob code on s390 and alpha This patch fixes the compat glob implementation consolidation from commit `116f1c64d` with the following changes: - Add a compat implementation on s390 to avoid the architecture to build the symbols on default linux oldglob.c by setting GLOB_NO_OLD_VERSION. - Remove the duplicate rule to build oldglob on alpha. Checked on s390-linux-gnu and alpha-linux-gnu using build-many-glibc.py. * sysdeps/unix/sysv/linux/s390/s390-32/oldglob.c: New file. * sysdeps/unix/sysv/linux/alpha/Makefile [$(subdir) = csu] (sysdep_routines): Remove rule. (cherry picked from commit `3ca622e4d6`)	2017-10-30 11:49:40 +01:00
Adhemerval Zanella	ee5bce43eb	posix: Consolidate Linux glob implementation This patch consolidates the glob implementation. The main changes are: * On Linux all implementation now uses the default one at sysdeps/unix/sysv/linux/glob{free}{64}.c with the exception of alpha (which requires specific versioning) and s390-32 (which different than other 32 bits ports it does not add a compat one symbol for 2.1 version). * The default implementation uses XSTAT_IS_XSTAT64 to define whether both glob{free} and glob{free}64 should be different implementations. For archictures that define XSTAT_IS_XSTAT64, glob{free} is an alias to glob{free}64. * Move i386 olddirent.h header to Linux default directory, since it is the only header with this name and it is shared among different architectures (and used on compat glob symbol as well). Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py for all major architectures. * sysdeps/unix/sysv/linux/arm/glob64.c: Remove file. * sysdeps/unix/sysv/linux/i386/glob64.c: Likewise. * sysdeps/unix/sysv/linux/m68k/glob64.c: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/glob64.c: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/glob64.c: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/glob64.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/glob64.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/glob.c: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise. * sysdeps/wordsize-64/glob.c: Likewise. * sysdeps/wordsize-64/glob64.c: Likewise. * sysdeps/wordsize-64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/glob.c: New file. * sysdeps/unix/sysv/linux/glob64.c: Likewise. * sysdeps/unix/sysv/linux/globfree.c: Likewise. * sysdeps/unix/sysv/linux/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/glob64.c: Likewise. * sysdeps/unix/sysv/linux/oldglob.c [SHLIB_COMPAT]: Also adds !GLOB_NO_OLD_VERSION as an extra condition. * sysdeps/unix/sysv/linux/i386/alphasort64.c: Include olddirent.h using relative path instead of absolute one. * sysdeps/unix/sysv/linux/i386/getdents64.c: Likewise. * sysdeps/unix/sysv/linux/i386/readdir64.c: Likewise. * sysdeps/unix/sysv/linux/i386/readdir64_r.c: Likewise. * sysdeps/unix/sysv/linux/i386/versionsort64.c: Likewise. * sysdeps/unix/sysv/linux/i386/olddirent.h: Move to ... * sysdeps/unix/sysv/linux//olddirent.h: ... here. (cherry picked from commit `116f1c64d8`)	2017-10-30 11:49:40 +01:00
H.J. Lu	4b692dffb9	x86-64: Don't set GLRO(dl_platform) to NULL [BZ #22299 ] Since ld.so expands $PLATFORM with GLRO(dl_platform), don't set GLRO(dl_platform) to NULL. [BZ #22299] * sysdeps/x86/cpu-features.c (init_cpu_features): Don't set GLRO(dl_platform) to NULL. * sysdeps/x86_64/Makefile (tests): Add tst-platform-1. (modules-names): Add tst-platformmod-1 and x86_64/tst-platformmod-2. (CFLAGS-tst-platform-1.c): New. (CFLAGS-tst-platformmod-1.c): Likewise. (CFLAGS-tst-platformmod-2.c): Likewise. (LDFLAGS-tst-platformmod-2.so): Likewise. ($(objpfx)tst-platform-1): Likewise. ($(objpfx)tst-platform-1.out): Likewise. (tst-platform-1-ENV): Likewise. ($(objpfx)x86_64/tst-platformmod-2.os): Likewise. * sysdeps/x86_64/tst-platform-1.c: New file. * sysdeps/x86_64/tst-platformmod-1.c: Likewise. * sysdeps/x86_64/tst-platformmod-2.c: Likewise. (cherry picked from commit `4d916f0f12`)	2017-10-26 07:36:47 -07:00
Alexey Makhalov	77eea8950c	Fix range check in do_tunable_update_val Current implementation of tunables does not set arena_max and arena_test values. Any value provided by glibc.malloc.arena_max and glibc.malloc.arena_test parameters is ignored. These tunables have minval value set to 1 (see elf/dl-tunables.list file) and undefined maxval value. In that case default value (which is 0. see scripts/gen-tunables.awk) is being used to set maxval. For instance, generated tunable_list[] entry for arena_max is: (gdb) p cur $1 = {name = 0x7ffff7df6217 "glibc.malloc.arena_max", type = {type_code = TUNABLE_TYPE_SIZE_T, min = 1, max = 0}, val = {numval = 0, strval = 0x0}, initialized = false, security_level = TUNABLE_SECLEVEL_SXID_IGNORE, env_alias = 0x7ffff7df622e "MALLOC_ARENA_MAX"} As a result, any value of glibc.malloc.arena_max is ignored by TUNABLE_SET_VAL_IF_VALID_RANGE macro __type min = (__cur)->type.min; <- initialized to 1 __type max = (__cur)->type.max; <- initialized to 0! if (min == max) <- false { min = __default_min; max = __default_max; } if ((__type) (__val) >= min && (__type) (val) <= max) <- false { (__cur)->val.numval = val; (__cur)->initialized = true; } Assigning correct min/max values at a build time fixes a problem. Plus, a bit of optimization: Setting of default min/max values for the given type at a run time might be eliminated. elf/dl-tunables.c (do_tunable_update_val): Range checking fix. * scripts/gen-tunables.awk: Set unspecified minval and/or maxval values to correct default value for given type.	2017-10-23 21:27:46 +05:30
Joseph Myers	04acd59794	Install correct bits/long-double.h for MIPS64 (bug 22322). Similar to bug 21987 for SPARC, MIPS64 wrongly installs the ldbl-128 version of bits/long-double.h, meaning incorrect results when using headers installed from a 64-bit installation for a 32-bit build. (I haven't actually seen this cause build failures before its interaction with bits/floatn.h did so - installed headers wrongly expecting _Float128 to be available in a 32-bit configuration.) This patch fixes the bug by moving the MIPS header to sysdeps/mips/ieee754, which comes before sysdeps/ieee754/ldbl-128 in the sysdeps directory ordering. (bits/floatn.h will need a similar fix - duplicating the ldbl-128 version for MIPS will suffice - for headers from a 32-bit installation to be correct for 64-bit builds.) Tested with build-many-glibcs.py (compilers build for mips64-linux-gnu, where there was previously a libstdc++ build failure as at <https://sourceware.org/ml/libc-testresults/2017-q4/msg00130.html>). [BZ #22322] * sysdeps/mips/bits/long-double.h: Move to .... * sysdeps/mips/ieee754/bits/long-double.h: ... here. (cherry picked from commit `37bb78cb8c`)	2017-10-23 15:46:38 +00:00
Gabriel F. T. Gomes	b1b8d8aa95	Add missing bug fixes to NEWS	2017-10-22 22:35:42 -02:00
Romain Naour	f8279a4b3c	Let signbit use the builtin in C++ mode with gcc < 6.x (bug 22296) When using gcc < 6.x, signbit does not use the type-generic __builtin_signbit builtin, instead it uses __MATH_TG. However, when library support for float128 is available, __MATH_TG uses __builtin_types_compatible_p, which is not available in C++ mode. On the other hand, libstdc++ undefines (in cmath) many macros from math.h, including signbit, so that it can provide its own functions. However, during its configure tests, libstdc++ just tests for the availability of the macros (it does not undefine them, nor does it provide its own functions). Finally, libstdc++ configure tests include math.h and get the definition of signbit that uses __MATH_TG (and __builtin_types_compatible_p). Since libstdc++ does not undefine the macros during its configure tests, they fail. This patch lets signbit use the builtin in C++ mode when gcc < 6.x is used. This allows the configure test in libstdc++ to work. Tested for x86_64. [BZ #22296] * math/math.h: Let signbit use the builtin in C++ mode with gcc < 6.x Cc: Gabriel F. T. Gomes <gftg@linux.vnet.ibm.com> Cc: Joseph Myers <joseph@codesourcery.com> (cherry picked from commit `386e1c26ac`)	2017-10-22 20:53:43 -02:00
H.J. Lu	f82a6fc223	x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265 ] In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector, mask and bound registers. It simplifies _dl_runtime_resolve and supports different calling conventions. ld.so code size is reduced by more than 1 KB. However, use fxsave/xsave/xsavec takes a little bit more cycles than saving and restoring vector and bound registers individually. Latency for _dl_runtime_resolve to lookup the function, foo, from one shared library plus libc.so: Before After Change Westmere (SSE)/fxsave 345 866 151% IvyBridge (AVX)/xsave 420 643 53% Haswell (AVX)/xsave 713 1252 75% Skylake (AVX+MPX)/xsavec 559 719 28% Skylake (AVX512+MPX)/xsavec 145 272 87% Ryzen (AVX)/xsavec 280 553 97% This is the worst case where portion of time spent for saving and restoring registers is bigger than majority of cases. With smaller _dl_runtime_resolve code size, overall performance impact is negligible. On IvyBridge, differences in build and test time of binutils with lazy binding GCC and binutils are noises. On Westmere, differences in bootstrap and "makc check" time of GCC 7 with lazy binding GCC and binutils are also noises. [BZ #21265] * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET): New. * sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>. (get_common_indeces): Set xsave_state_size, xsave_state_full_size and bit_arch_XSAVEC_Usable if needed. (init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow and bit_arch_Use_dl_runtime_resolve_opt. * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt): Removed. (bit_arch_Use_dl_runtime_resolve_slow): Likewise. (bit_arch_Prefer_No_AVX512): Updated. (bit_arch_MathVec_Prefer_No_AVX512): Likewise. (bit_arch_XSAVEC_Usable): New. (STATE_SAVE_OFFSET): Likewise. (STATE_SAVE_MASK): Likewise. [__ASSEMBLER__]: Include <cpu-features-offsets.h>. (cpu_features): Add xsave_state_size and xsave_state_full_size. (index_arch_Use_dl_runtime_resolve_opt): Removed. (index_arch_Use_dl_runtime_resolve_slow): Likewise. (index_arch_XSAVEC_Usable): New. * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)): Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow. * sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables is enabled. * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx, _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt, _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and _dl_runtime_resolve_xsavec. * sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE): Removed. (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT instead of VEC_SIZE. (REGISTER_SAVE_BND0): Removed. (REGISTER_SAVE_BND1): Likewise. (REGISTER_SAVE_BND3): Likewise. (REGISTER_SAVE_RAX): Always defined to 0. (VMOV): Removed. (_dl_runtime_resolve_avx): Likewise. (_dl_runtime_resolve_avx_slow): Likewise. (_dl_runtime_resolve_avx_opt): Likewise. (_dl_runtime_resolve_avx512): Likewise. (_dl_runtime_resolve_avx512_opt): Likewise. (_dl_runtime_resolve_sse): Likewise. (_dl_runtime_resolve_sse_vex): Likewise. (USE_FXSAVE): New. (_dl_runtime_resolve_fxsave): Likewise. (USE_XSAVE): Likewise. (_dl_runtime_resolve_xsave): Likewise. (USE_XSAVEC): Likewise. (_dl_runtime_resolve_xsavec): Likewise. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512): Removed. (_dl_runtime_resolve_avx512_opt): Likewise. (_dl_runtime_resolve_avx): Likewise. (_dl_runtime_resolve_avx_opt): Likewise. (_dl_runtime_resolve_sse): Likewise. (_dl_runtime_resolve_sse_vex): Likewise. (_dl_runtime_resolve_fxsave): New. (_dl_runtime_resolve_xsave): Likewise. (_dl_runtime_resolve_xsavec): Likewise. (cherry picked from commit `b52b0d793d`)	2017-10-22 07:41:00 -07:00
H.J. Lu	b2c78ae69e	x86: Add x86_64 to x86-64 HWCAP [BZ #22093 ] Before glibc 2.26, ld.so set dl_platform to "x86_64" and searched the "x86_64" subdirectory when loading a shared library. ld.so in glibc 2.26 was changed to set dl_platform to "haswell" or "xeon_phi", based on supported ISAs. This led to shared library loading failure for shared libraries placed under the "x86_64" subdirectory. This patch adds "x86_64" to x86-64 dl_hwcap so that ld.so will always search the "x86_64" subdirectory when loading a shared library. NB: We can't set x86-64 dl_platform to "x86-64" since ld.so will skip the "haswell" and "xeon_phi" subdirectories on "haswell" and "xeon_phi" machines. Tested on i686 and x86-64. [BZ #22093] * sysdeps/x86/cpu-features.c (init_cpu_features): Initialize GLRO(dl_hwcap) to HWCAP_X86_64 for x86-64. * sysdeps/x86/dl-hwcap.h (HWCAP_COUNT): Updated. (HWCAP_IMPORTANT): Likewise. (HWCAP_X86_64): New enum. (HWCAP_X86_AVX512_1): Updated. * sysdeps/x86/dl-procinfo.c (_dl_x86_hwcap_flags): Add "x86_64". * sysdeps/x86_64/Makefile (tests): Add tst-x86_64-1. (modules-names): Add x86_64/tst-x86_64mod-1. (LDFLAGS-tst-x86_64mod-1.so): New. ($(objpfx)tst-x86_64-1): Likewise. ($(objpfx)x86_64/tst-x86_64mod-1.os): Likewise. (tst-x86_64-1-clean): Likewise. * sysdeps/x86_64/tst-x86_64-1.c: New file. * sysdeps/x86_64/tst-x86_64mod-1.c: Likewise. (cherry picked from commit `45ff34638f`)	2017-10-22 04:16:39 -07:00
Florian Weimer	6182b3708b	glob: Add new test tst-glob-tilde The new test checks for memory leaks (see bug 22325) and attempts to trigger the buffer overflow in bug 22320. (cherry picked from commit `e80fc1fc98`)	2017-10-21 18:32:06 +02:00
Paul Eggert	a76376df7c	CVE-2017-15670: glob: Fix one-byte overflow [BZ #22320 ] (cherry picked from commit `c369d66e54`)	2017-10-20 19:36:54 +02:00
Adhemerval Zanella	305f4f057d	posix: Sync glob with gnulib [BZ #1062 ] This patch syncs posix/glob.c implementation with gnulib version b5ec983 (glob: simplify symlink detection). The only difference to gnulib code is * DT_UNKNOWN, DT_DIR, and DT_LNK definition in the case there were not already defined. Gnulib code which uses HAVE_STRUCT_DIRENT_D_TYPE will redefine them wrongly because GLIBC does not define HAVE_STRUCT_DIRENT_D_TYPE. Instead the patch check for each definition instead. Also, the patch requires additional globfree and globfree64 files for compatibility version on some architectures. Also the code simplification leads to not macro simplification (not need for NO_GLOB_PATTERN_P anymore). Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py for all major architectures. [BZ #1062] * posix/Makefile (routines): Add globfree, globfree64, and glob_pattern_p. * posix/flexmember.h: New file. * posix/glob_internal.h: Likewise. * posix/glob_pattern_p.c: Likewise. * posix/globfree.c: Likewise. * posix/globfree64.c: Likewise. * sysdeps/gnu/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/alpha/globfree.c: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/oldglob.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise. * sysdeps/wordsize-64/globfree.c: Likewise. * sysdeps/wordsize-64/globfree64.c: Likewise. * posix/glob.c (HAVE_CONFIG_H): Use !_LIBC instead. [NDEBUG): Remove comments. (GLOB_ONLY_P, _AMIGA, VMS): Remove define. (dirent_type): New type. Use uint_fast8_t not uint8_t, as C99 does not require uint8_t. (DT_UNKNOWN, DT_DIR, DT_LNK): New macros. (struct readdir_result): Use dirent_type. Do not define skip_entry unless it is needed; this saves a byte on platforms lacking d_ino. (readdir_result_type, readdir_result_skip_entry): New functions, replacing ... (readdir_result_might_be_symlink, readdir_result_might_be_dir): these functions, which were removed. This makes the callers easier to read. All callers changed. (D_INO_TO_RESULT): Now empty if there is no d_ino. (size_add_wrapv, glob_use_alloca): New static functions. (glob, glob_in_dir): Check for size_t overflow in several places, and fix some size_t checks that were not quite right. Remove old code using SHELL since Bash no longer uses this. (glob, prefix_array): Separate MS code better. (glob_in_dir): Remove old Amiga and VMS code. (globfree, __glob_pattern_type, __glob_pattern_p): Move to separate files. (glob_in_dir): Do not rely on undefined behavior in accessing struct members beyond their bounds. Use a flexible array member instead (link_stat): Rename from link_exists2_p and return -1/0 instead of 0/1. Caller changed. (glob): Fix memory leaks. * posix/glob64 (globfree64): Move to separate file. * sysdeps/gnu/glob64.c (NO_GLOB_PATTERN_P): Remove define. (globfree64): Remove hidden alias. * sysdeps/unix/sysv/linux/Makefile (sysdeps_routines): Add oldglob. * sysdeps/unix/sysv/linux/alpha/glob.c (__new_globfree): Move to separate file. * sysdeps/unix/sysv/linux/i386/glob64.c (NO_GLOB_PATTERN_P): Remove define. Move compat code to separate file. * sysdeps/wordsize-64/glob.c (globfree): Move definitions to separate file. (cherry picked from commit `c66c908230`)	2017-10-20 19:36:33 +02:00
H.J. Lu	c96d7a646b	i386: Hide __old_glob64 [BZ #18822 ] Hide internal __old_glob64 function to allow direct access within libc.so and libc.a without using GOT nor PLT. [BZ #18822] * sysdeps/unix/sysv/linux/i386/glob64.c (__old_glob64): Add libc_hidden_proto and libc_hidden_def. (cherry picked from commit `2585d7b839`)	2017-10-20 19:12:11 +02:00
Florian Weimer	2e78ea7a20	sysconf: Fix missing definition of UIO_MAXIOV on Linux [BZ #22321 ] After commit `37f802f864` (Remove __need_IOV_MAX and __need_FOPEN_MAX), UIO_MAXIOV is no longer supplied (indirectly) through <bits/stdio_lim.h>, so sysdeps/posix/sysconf.c no longer sees the definition. (cherry picked from commit `63b4baa44e`)	2017-10-20 04:23:26 +02:00
Florian Weimer	05155f0772	nss_files: Avoid large buffers with many host addresses [BZ #22078 ] The previous implementation had at least a quadratic space requirement in the number of host addresses and aliases. (cherry picked from commit `d8425e116c`)	2017-10-19 10:44:31 +02:00
Florian Weimer	13728f56f0	nss_files: Use struct scratch_buffer for gethostbyname [BZ #18023 ] (cherry picked from commit `78e806fd8c`)	2017-10-19 10:43:40 +02:00
Florian Weimer	5ebb81e292	nss_files: Refactor gethostbyname3 multi case into separate function This is in preparation of further cleanup work. (cherry picked from commit `8ed70de2fa`)	2017-10-19 10:43:32 +02:00
Gabriel F. T. Gomes	f725563967	Remove conditional on LDBL_MANT_DIG from e_lgammal_r.c The IEEE 754 implementation of lgammal in sysdeps/ieee754/ldbl-128/ used to be shared by IBM's implementation in sysdeps/ieee754/ldbl-128ibm/ (by an inclusion of the source file). In order for the algorithm to work for IBM's implementation, a check for LDBL_MANT_DIG was required. Since the source file is no longer shared, the requirement for the check is gone. This patch removes the conditionals. Tested for powerpc64le and s390x. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Remove conditionals on LDBL_MANT_DIG. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. (cherry picked from commit `9ac3c68218`)	2017-10-10 10:15:16 -03:00
Gabriel F. T. Gomes	d50b9bf1cc	ldbl-128ibm: Automatic replacing of _Float128 and L() The ldbl-128ibm implementation of j0l, j1l, lgammal_r, and cbrtl, as well as the tables used by expl were copied from ldbl-128. However, the original files used _Float128 for the type and L() for the literal suffix. This patch uses the following sed command to rewrite _Float128 as long double and L(x) as xL (for e_expl.c, e_j0l.c, e_j1l.c, e_lgammal_r.c, and t_expl.h): sed -i <filename> \ -e "/^#define _Float128 long double/d" \ -e "/^#define L(x) x ## L/d" \ -e "/L(/s/)/L/" \ -e "/L(/s/L(//" \ -e "s/_Float128/long double/g" For sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c, this sed command incorrectly replaces a few occurrences of L(), so the following command is used instead: sed -i sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c \ -e "/^#define _Float128 long double/d" \ -e "/^#define L(x) x ## L/d" \ -e "s/L(0\.3\{40\})/0.3333333333333333333333333333333333333333L/" \ -e "s/L(3\.7568280825958912391243e-1)/3.7568280825958912391243e-1L/" \ -e "/L(/s/)/L/" \ -e "/L(/s/L(//" \ -e "s/_Float128/long double/g" Tested for powerpc64le with patched [1] and unpatched gcc. [1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html * sysdeps/ieee754/ldbl-128ibm/e_expl.c: Remove definitions of _Float128 and L(). * sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Remove definitions of _Float128 and L(). Replace _Float128 with long double and L(x) with xL, throughout the file. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise. (cherry picked from commit `d2f0ed09f8`)	2017-10-10 10:15:16 -03:00
Gabriel F. T. Gomes	4f11fe97c3	ldbl-128ibm: Copy implementations from ldbl-128 instead of including them Some files under sysdeps/ieee754/ldbl-128ibm/ are able to reuse the implementation in sysdeps/ieee754/ldbl-128/ by defining _Float128 to long double. This relied on compiler support for _Float128 being disabled. On powerpc, such support was disabled by default, however, it got enabled by default [1] in GCC 8. This patch copies the implementations from ldbl-128 to ldbl-128ibm. The uses of _Float128 and L() are kept intact in this patch and are replaced with a script in a subsequent patch. [1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html Tested for powerpc64 and powerpc64le. * sysdeps/ieee754/ldbl-128ibm/e_expl.c: Include tables from sysdeps/ieee754/ldbl-128ibm. * sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Copy contents from the equivalent implementation in sysdeps/ieee754/ldbl-128/ instead of including it. Keep _Float128 and L() intact. These will be reviewed by a separate patch. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise. (cherry picked from commit `c5c2e667bf`)	2017-10-10 10:15:15 -03:00
Gabriel F. T. Gomes	c9163eeb6e	powerpc: Add redirection for finitef128, isinf128, and isnanf128 On powerpc64le, compiler support for float128 is not enabled by default on gcc. To enable it, the flag -mfloat128 must be passed as a command line option to the compiler. This means that only the few files that actively have -mfloat128 passed as an argument get compiler support for float128, whereas all other files don't. When -mfloat128 becomes enabled by default on powerpc [1], all the files that do not currently have compiler support for float128 enabled during their compilation, will start to have it. This will lead to build errors in s_finite.c, s_isinf.c, and s_isnan.c. The errors are due to the unintended macro expansion of __finitef128 to __redirect_finitef128 in math/bits/mathcalls-helper-functions.h. In that header, __MATHDECL_1 takes '__finite' and 'f128' as arguments and concatenates them. However, since '__finite' has been redefined in s_finite.c, the function declaration becomes __redirect_finitef128: extern int __redirect___finitef128 (_Float128 __value) __attribute__ ((__nothrow__ )) __attribute__ ((__const__)); This declaration itself is OK. The problem arises when include/math.h creates the hidden prototype ('hidden_proto (__finitef128)'), which expands to: extern __typeof (__finitef128) __finitef128 __attribute__ ((visibility ("hidden"))); Since __finitef128 is not declared, __typeof fails. This effect was already true for the 'float' and 'long double' versions and is now true for float128. Likewise for isinsff128 and isnanf128. This patch defines __finitef128 as __redirect___finitef128 in sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c, similarly to what's done for the float and long double versions of these functions, to get rid of the build error. Likewise for isinff128 and isnanf128. [1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html Tested for powerpc64 and powerpc64le. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c (__finitef128): Define to __redirect___finitef128. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c (__isinff128): Define to __redirect___isinff128. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c (__isnanf128): Define to __redirect___isnanf128. (cherry picked from commit `e010deb231`)	2017-10-10 10:15:15 -03:00
Gabriel F. T. Gomes	356a2df52a	powerpc64le: Add -mfloat128 to tst-strtod-nan-locale testcase On powerpc64le, not all files can have the flag -mfloat128 passed as an option on the compile command, since that could conflict with other flags, such as -mno-vsx. Each file that needs the flag, gets it through a CFLAGS-filename variable on sysdeps/powerpc/powerpc64le/Makefile. The test cases tst-strtod-nan-locale and tst-wcstod-nan-locale are missing this flag. Tested for powerpc64le. * sysdeps/powerpc/powerpc64le/Makefile (CFLAGS-tst-strtod-nan-locale.c): New variable. (CFLAGS-tst-wcstod-nan-locale.c): New variable. (cherry picked from commit `ffa448041b`)	2017-10-10 10:15:15 -03:00
Steve Ellcey	1f1239c389	Fix glibc.tune.cpu tunable handling * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (get_midr_from_mcpu): Use strcmp instead of tunable_is_name.	2017-10-10 15:53:55 +05:30
Siddhesh Poyarekar	290ba1089e	aarch64: Optimized implementation of memmove for Qualcomm Falkor This is an optimized memmove implementation for the Qualcomm Falkor processor core. Due to the way the falkor memcpy needs to be written, code cannot be easily shared between memmove and memcpy like in case of other aarch64 memcpy implementations due to which this routine is separate. The underlying principle is the same as that of memcpy where it tries to use registers with the same lower 4 bits for fetching the same stream, thus optimizing hardware prefetcher performance. The memcpy copy loop copies 64 bytes at a time using the same register pair since that's the way to train the hardware prefetcher on the falkor core. memmove cannot quite do that since it needs to avoid overlaps, so it does the next best thing, i.e. has a 32 byte loop with a 32 byte end (prefetch a loop ahead to account for overlapping locations) with register pairs that alias so that they hit the same prefetcher. Due to this difference in loop size, they have to currently be separate implementations but efforts are on to try and get memmove to fall back into memcpy whenever it can without simply duplicating all of the code. Performance: The routine fares around 20-25% better than the generic memmove for most medium to large sizes (i.e. > 128 bytes) for the new walking memmove benchmark (memmove-walk) with an unexplained regression between 1K and 2K. The minor regression is something worth looking into for us, but the remaining gains are significant enough that we would like this included upstream as we looking into the cause for the regression. Here is a snippet of the numbers as generated from the microbenchmark by the compare_strings script. Comparisons are against __memmove_generic: Function: memmove Variant: walk __memmove_thunderx __memmove_falkor __memmove_generic ======================================================================================================================== <snip> length=16384: 12508800.00 ( 6.09%) 11486800.00 ( 13.76%) 13319600.00 length=16400: 13614200.00 ( -0.67%) 11585000.00 ( 14.33%) 13523600.00 length=16385: 13448400.00 ( 0.10%) 11732700.00 ( 12.84%) 13461200.00 length=16399: 13594100.00 ( -0.22%) 11859600.00 ( 12.57%) 13564400.00 length=16386: 13211600.00 ( 1.13%) 11503800.00 ( 13.91%) 13362400.00 length=16398: 13218600.00 ( 2.12%) 11573200.00 ( 14.30%) 13504700.00 length=16387: 13510900.00 ( -0.37%) 11744200.00 ( 12.76%) 13461300.00 length=16397: 13603700.00 ( -0.15%) 11878200.00 ( 12.55%) 13583200.00 length=16388: 13461700.00 ( -0.13%) 11558000.00 ( 14.03%) 13444100.00 length=16396: 13517500.00 ( -0.03%) 11561300.00 ( 14.45%) 13513900.00 length=16389: 13534100.00 ( 0.17%) 11756800.00 ( 13.28%) 13556900.00 length=16395: 13585600.00 ( 0.11%) 11791800.00 ( 13.30%) 13601200.00 length=16390: 13480100.00 ( -0.13%) 11685500.00 ( 13.20%) 13462100.00 length=16394: 13529900.00 ( -0.23%) 11549800.00 ( 14.43%) 13498200.00 length=16391: 13595400.00 ( -0.26%) 11768200.00 ( 13.22%) 13560600.00 length=16393: 13567000.00 ( 0.20%) 11779700.00 ( 13.35%) 13594700.00 length=32768: 71308800.00 ( -6.53%) 50220800.00 ( 24.98%) 66939200.00 length=32784: 72100800.00 (-11.55%) 50114100.00 ( 22.47%) 64636300.00 length=32769: 71767000.00 ( -7.10%) 51238400.00 ( 23.54%) 67010000.00 length=32783: 70113700.00 (-40.95%) 51129000.00 ( -2.78%) 49744400.00 length=32770: 71367600.00 ( -6.52%) 50244700.00 ( 25.01%) 67000900.00 length=32782: 64366700.00 ( 4.71%) 50101400.00 ( 25.83%) 67545600.00 length=32771: 71440100.00 ( -6.51%) 51263900.00 ( 23.57%) 67074900.00 length=32781: 66993000.00 ( 0.34%) 51108300.00 ( 23.97%) 67220300.00 length=32772: 71443900.00 (-60.50%) 50062100.00 (-12.47%) 44512600.00 length=32780: 71759100.00 ( -6.58%) 50263200.00 ( 25.35%) 67328600.00 length=32773: 71714900.00 (-33.21%) 51076600.00 ( 5.12%) 53835400.00 length=32779: 71756900.00 ( -6.56%) 51290800.00 ( 23.83%) 67337800.00 length=32774: 59689300.00 (-34.55%) 50068400.00 (-12.86%) 44363300.00 length=32778: 71847500.00 (-18.20%) 50084100.00 ( 17.61%) 60786500.00 length=32775: 71599300.00 ( -6.54%) 51278200.00 ( 23.70%) 67204800.00 length=32777: 71862900.00 (-60.85%) 51094000.00 (-14.36%) 44677900.00 length=65536: 282848000.00 ( -6.60%) 199187000.00 ( 24.93%) 265325000.00 length=65552: 243285000.00 (-41.61%) 198512000.00 (-15.54%) 171805000.00 length=65537: 255415000.00 (-23.47%) 202499000.00 ( 2.11%) 206858000.00 length=65551: 280122000.00 (-62.95%) 203349000.00 (-18.29%) 171911000.00 length=65538: 283676000.00 (-14.46%) 198368000.00 ( 19.96%) 247848000.00 length=65550: 275566000.00 (-51.76%) 198494000.00 ( -9.31%) 181581000.00 length=65539: 283699000.00 ( -6.58%) 203453000.00 ( 23.57%) 266195000.00 length=65549: 286572000.00 ( -6.65%) 202607000.00 ( 24.60%) 268712000.00 length=65540: 283710000.00 ( -6.59%) 199161000.00 ( 25.17%) 266160000.00 length=65548: 237573000.00 ( 11.48%) 198462000.00 ( 26.06%) 268395000.00 length=65541: 284150000.00 ( -6.58%) 203273000.00 ( 23.75%) 266600000.00 length=65547: 286250000.00 ( -6.70%) 202594000.00 ( 24.48%) 268263000.00 length=65542: 284167000.00 ( -6.60%) 199122000.00 ( 25.31%) 266584000.00 length=65546: 285656000.00 ( -6.59%) 198443000.00 ( 25.95%) 268002000.00 length=65543: 284600000.00 ( -6.58%) 203247000.00 ( 23.89%) 267030000.00 length=65545: 285665000.00 ( -6.40%) 202575000.00 ( 24.55%) 268472000.00 <snip> * sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add memmove_falkor. * sysdeps/aarch64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Likewise. * sysdeps/aarch64/multiarch/memmove.c: Likewise. * sysdeps/aarch64/multiarch/memmove_falkor.S: New file.	2017-10-10 15:49:50 +05:30
Siddhesh Poyarekar	de84fc77f8	Update translations	2017-10-10 15:47:10 +05:30
Siddhesh Poyarekar	e39de9fa74	memcpy_falkor: Fix code style in comments	2017-10-10 15:45:09 +05:30

1 2 3 4 5 ...

32010 Commits