glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-10 15:20:10 +00:00

Author	SHA1	Message	Date
Florian Weimer	1daccf403b	nptl: Move stack list variables into _rtld_global Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT, formerly __wait_lookup_done) can be implemented directly in ld.so, eliminating the unprotected GL (dl_wait_lookup_done) function pointer. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-11-16 19:33:30 +01:00
Adhemerval Zanella	01bd62517c	Remove tls.h inclusion from internal errno.h The tls.h inclusion is not really required and limits possible definition on more arch specific headers. This is a cleanup to allow inline functions on sysdep.h, more specifically on i386 and ia64 which requires to access some tls definitions its own. No semantic changes expected, checked with a build against all affected ABIs.	2020-11-13 12:59:19 -03:00
Florian Weimer	0f34d426ac	x86: Remove UP macro. Define LOCK_PREFIX unconditionally. The UP macro is never defined. Also define LOCK_PREFIX unconditionally, to the same string. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-11-13 15:20:03 +01:00
Samuel Thibault	3d3316b1de	hurd: keep only required PLTs in ld.so We need NO_RTLD_HIDDEN because of the need for PLT calls in ld.so. See Roland's comment in https://sourceware.org/bugzilla/show_bug.cgi?id=15605 "in the Hurd it's crucial that calls like __mmap be the libc ones instead of the rtld-local ones after the bootstrap phase, when the dynamic linker is being used for dlopen and the like." We used to just avoid all hidden use in the rtld ; this commit switches to keeping only those that should use PLT calls, i.e. essentially those defined in sysdeps/mach/hurd/dl-sysdep.c: __assert_fail __assert_perror_fail __*stat64 _exit This fixes a few startup issues, notably the call to __tunable_get_val that is made before PLTs are set up.	2020-11-11 02:36:22 +01:00
H.J. Lu	0f09154c64	x86: Initialize CPU info via IFUNC relocation [BZ 26203] X86 CPU features in ld.so are initialized by init_cpu_features, which is invoked by DL_PLATFORM_INIT from _dl_sysdep_start. But when ld.so is loaded by static executable, DL_PLATFORM_INIT is never called. Also x86 cache info in libc.o and libc.a is initialized by a constructor which may be called too late. Since some fields in _rtld_global_ro in ld.so are initialized by dynamic relocation, we can also initialize x86 CPU features in _rtld_global_ro in ld.so and cache info in libc.so by initializing dummy function pointers in ld.so and libc.so via IFUNC relocation. Key points: 1. IFUNC is always supported, independent of --enable-multi-arch or --disable-multi-arch. Linker generates IFUNC relocations from input IFUNC objects and ld.so performs IFUNC relocations. 2. There are no IFUNC dependencies in ld.so before dynamic relocation have been performed, 3. The x86 CPU features in ld.so is initialized by DL_PLATFORM_INIT in dynamic executable and by IFUNC relocation in dlopen in static executable. 4. The x86 cache info in libc.o is initialized by IFUNC relocation. 5. In libc.a, both x86 CPU features and cache info are initialized from ARCH_INIT_CPU_FEATURES, not by IFUNC relocation, before __libc_early_init is called. Note: _dl_x86_init_cpu_features can be called more than once from DL_PLATFORM_INIT and during relocation in ld.so.	2020-10-16 16:17:53 -07:00
Szabolcs Nagy	238032ead6	aarch64: enforce >=64K guard size [BZ #26691 ] There are several compiler implementations that allow large stack allocations to jump over the guard page at the end of the stack and corrupt memory beyond that. See CVE-2017-1000364. Compilers can emit code to probe the stack such that the guard page cannot be skipped, but on aarch64 the probe interval is 64K by default instead of the minimum supported page size (4K). This patch enforces at least 64K guard on aarch64 unless the guard is disabled by setting its size to 0. For backward compatibility reasons the increased guard is not reported, so it is only observable by exhausting the address space or parsing /proc/self/maps on linux. On other targets the patch has no effect. If the stack probe interval is larger than a page size on a target then ARCH_MIN_GUARD_SIZE can be defined to get large enough stack guard on libc allocated stacks. The patch does not affect threads with user allocated stacks. Fixes bug 26691.	2020-10-02 09:57:44 +01:00
Florian Weimer	90ccfdf176	x86: Use one ldbl2mpn.c file for both i386 and x86_64	2020-09-22 17:58:39 +02:00
H.J. Lu	9620398097	x86: Install <sys/platform/x86.h> [BZ #26124 ] Install <sys/platform/x86.h> so that programmers can do #if __has_include(<sys/platform/x86.h>) #include <sys/platform/x86.h> #endif ... if (CPU_FEATURE_USABLE (SSE2)) ... if (CPU_FEATURE_USABLE (AVX2)) ... <sys/platform/x86.h> exports only: enum { COMMON_CPUID_INDEX_1 = 0, COMMON_CPUID_INDEX_7, COMMON_CPUID_INDEX_80000001, COMMON_CPUID_INDEX_D_ECX_1, COMMON_CPUID_INDEX_80000007, COMMON_CPUID_INDEX_80000008, COMMON_CPUID_INDEX_7_ECX_1, /* Keep the following line at the end. / COMMON_CPUID_INDEX_MAX }; struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; }; / Get a pointer to the CPU features structure. / extern const struct cpu_features __x86_get_cpu_features (unsigned int max) __attribute__ ((const)); Since all feature checks are done through macros, programs compiled with a newer <sys/platform/x86.h> are compatible with the older glibc binaries as long as the layout of struct cpu_features is identical. The features array can be expanded with backward binary compatibility for both .o and .so files. When COMMON_CPUID_INDEX_MAX is increased to support new processor features, __x86_get_cpu_features in the older glibc binaries returns NULL and HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false on the new processor feature. No new symbol version is neeeded. Both CPU_FEATURE_USABLE and HAS_CPU_FEATURE are provided. HAS_CPU_FEATURE can be used to identify processor features. Note: Although GCC has __builtin_cpu_supports, it only supports a subset of <sys/platform/x86.h> and it is equivalent to CPU_FEATURE_USABLE. It doesn't support HAS_CPU_FEATURE.	2020-09-11 17:20:52 -07:00
Ondřej Hošek	23af890b3f	x86-64: Fix FMA4 detection in ifunc [BZ #26534 ] A typo in commit `107e6a3c22` causes the FMA4 code path to be taken on systems that support FMA, even if they do not support FMA4. Fix this to detect FMA4.	2020-09-02 05:07:37 -07:00
Adhemerval Zanella	5ff35e9544	math: Update x86_64 ulps From new j0 test.	2020-08-08 16:43:11 -03:00
Andreas K. Hüttel	180d5a045f	Update x86-64 libm-test-ulps x86_64 Intel(R) Core(TM) i5-8265U gcc (Gentoo 10.1.0-r2 p3) 10.1.0 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-25 17:10:53 -04:00
H.J. Lu	107e6a3c22	x86: Support usable check for all CPU features Support usable check for all CPU features with the following changes: 1. Change struct cpu_features to struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; ... }; so that there is a usable bit for each cpuid bit. 2. After the cpuid bits have been initialized, copy the known bits to the usable bits. EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for CPU feature detection. 3. Clear the usable bits which require OS support. 4. If the feature is supported by OS, copy its cpuid bit to its usable bit. 5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE and CPU_FEATURE_USABLE_P to check if a feature is usable. 6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13. 7. Unset MPX feature since it has been deprecated. The results are 1. If the feature is known and doesn't requre OS support, its usable bit is copied from the cpuid bit. 2. Otherwise, its usable bit is copied from the cpuid bit only if the feature is known to supported by OS. 3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the feature can be used. 4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports the feature.	2020-07-13 06:05:16 -07:00
H.J. Lu	9016b6f389	x86: Remove the unused __x86_prefetchw Since commit `c867597bff` Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Jun 8 13:57:50 2016 -0700 X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove removed the only usage of __x86_prefetchw, we can remove the unused __x86_prefetchw.	2020-07-11 09:34:03 -07:00
H.J. Lu	3f4b61a0b8	x86: Add thresholds for "rep movsb/stosb" to tunables Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables to update thresholds for "rep movsb" and "rep stosb" at run-time. Note that the user specified threshold for "rep movsb" smaller than the minimum threshold will be ignored. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-06 11:48:42 -07:00
Adhemerval Zanella	b24381e50f	i386: Use builtin sqrtl Checked on i686-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	d19d25dd06	x86_64: Use builtin sqrt{f,l} Checked on x86_64-linux-gnu.	2020-06-22 11:09:49 -03:00
Sunil K Pandey	75870237ff	Fix avx2 strncmp offset compare condition check [BZ #25933 ] strcmp-avx2.S: In avx2 strncmp function, strings are compared in chunks of 4 vector size(i.e. 32x4=128 byte for avx2). After first 4 vector size comparison, code must check whether it already passed the given offset. This patch implement avx2 offset check condition for strncmp function, if both string compare same for first 4 vector size.	2020-06-17 07:07:38 -07:00
H.J. Lu	a35a59036e	x86_64: Use %xmmN with vpxor to clear a vector register Since "vpxor %xmmN, %xmmN, %xmmN" clears the whole vector register, use %xmmN, instead of %ymmN, with vpxor to clear a vector register.	2020-06-17 05:44:02 -07:00
Vineet Gupta	8dbb7a08ec	dl-runtime: reloc_{offset,index} now functions arch overide'able The existing macros are fragile and expect local variables with a certain name. Fix this by defining them as functions with default implementation in a new header dl-runtime.h which arches can override if need be. This came up during ARC port review, hence the need for argument pltgot in reloc_index() which is not needed by existing ports. This patch potentially only affects hppa/x86 ports, build tested for both those configs and a few more. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-06-05 13:45:46 -07:00
Florian Weimer	501bdb5dd6	Linux: Remove remnants of the getcpu cache The getcpu cache was removed from the kernel in Linux 2.6.24. glibc support from the sched_getcpu implementation was removed in commit `dd26c44403` ("Consolidate sched_getcpu").	2020-05-16 15:47:51 +02:00
H.J. Lu	55c7bcc71b	x86-64: Use RDX_LP on __x86_shared_non_temporal_threshold [BZ #25966 ] Since __x86_shared_non_temporal_threshold is defined as long int __x86_shared_non_temporal_threshold; and long int is 4 bytes for x32, use RDX_LP to compare against __x86_shared_non_temporal_threshold in assembly code.	2020-05-09 12:28:15 -07:00
Joseph Myers	dbb188dd87	Remove unused floating-point configuration from gmp-impl.h. This patch removes the IEEE_DOUBLE_BIG_ENDIAN and IEEE_DOUBLE_MIXED_ENDIAN macros from gmp-impl.h and gmp-mparam.h, and the ieee_double_extract union from gmp-impl.h. The macros were used only in defining the union, which was used nowhere in glibc. As GMP's gmp-impl.h is over 5000 lines, the file in glibc is so far from the GMP version that it doesn't seem to make sense to keep things there that are not relevant in glibc. (I expect there is plenty more in the header after this patch that is also not relevant in glibc and can be cleaned up later.) Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch.	2020-04-28 15:05:09 +00:00
Adhemerval Zanella	f721171632	Revert "x86_64: Add SSE sfp-exceptions" The __sfp_handle_exceptions is not fully correct regarding raising exceptions, since there is no direct way to raise only FP_EX_OVERFLOW nor FP_EX_UNDERFLOW for SSE mode. Both libgcc and feraiseexcept rely on x87 mode to accomplish it. This reverts commit `460ee50de0`. Checked on x86_64.	2020-04-20 14:56:05 -03:00
Adhemerval Zanella	460ee50de0	x86_64: Add SSE sfp-exceptions The exported x86_64 fenv.h functions operate on both i387 and SSE (since they should work on both float, double, and long double) while the internal libc_fe* set either SSE (float, double, and float128) or i387 (long double). The libgcc __sfp_handle_exceptions (used on float128 implementation), however, will set either SEE or i387 exception depending of the exception to raise. This broke the internal assumption of float128 where only SSE operations will be used. This patch reimplements the libgcc __sfp_handle_exceptions to use only SSE operations and sets libgcc to use it instead of its own implementation. And I think we should fix libgcc in a similar manner, since checking on config/i386/64/sfp-machine.h it already only supports SSE rounding mode and x86_64 ABI also expectes float128 to use SSE registers [1] (although it is not clear on how future implementation might implement it). Checked on x86_64-linux-gnu. [1] https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI	2020-04-17 11:42:29 -03:00
Adhemerval Zanella	17fd707f88	nptl: Remove x86_64 cancellation assembly implementations [BZ #25765 ] All cancellable syscalls are done by C implementations, so there is no no need to use a specialized implementation to optimize register usage. It fixes BZ #25765. Checked on x86_64-linux-gnu.	2020-04-03 10:47:59 -03:00
Paul Zimmermann	a9d42c09a3	math: Add inputs that yield larger errors for float type (x86_64) The corner cases included were generated using exhaustive search for all float/binary32 values on x86_64 (comparing to MPFR for correct rounding to nearest). For the j0/j1/y0 functions, only cases with ulp error <= 9 were included. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-03-31 21:48:54 -04:00
Adhemerval Zanella	1c15464ca0	math: Remove inline math tests With mathinline removal there is no need to keep building and testing inline math tests. The gen-libm-tests.py support to generate ULP_I_* is removed and all libm-test-ulps files are updated to longer have the i{float,double,ldouble} entries. The support for no-test-inline is also removed from both gen-auto-libm-tests and the auto-libm-test-out-* were regenerated. Checked on x86_64-linux-gnu and i686-linux-gnu.	2020-03-19 11:45:44 -03:00
Florian Weimer	fe49a73316	x86: Avoid single-argument _Static_assert in <tls.h> Older GCC versions do not support this extension. Fixes commit `f1bdee6179` ("x86 tls: Use _Static_assert for TLS access size assertion").	2020-02-17 11:12:03 +01:00
Samuel Thibault	f1bdee6179	x86 tls: Use _Static_assert for TLS access size assertion	2020-02-17 00:40:39 +01:00
Florian Weimer	3a0ecccb59	ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486 ] Exporting functions and relying on symbol interposition from libc.so makes the choice of implementation dependent on DT_NEEDED order, which is not what some compiler drivers expect. This commit replaces one magic mechanism (symbol interposition) with another one (preprocessor-/compiler-based redirection). This makes the hand-over from the minimal malloc to the full malloc more explicit. Removing the ABI symbols is backwards-compatible because libc.so is always in scope, and the dynamic loader will find the malloc-related symbols there since commit `f0b2132b35` ("ld.so: Support moving versioned symbols between sonames [BZ #24741]"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-02-15 11:01:23 +01:00
Adhemerval Zanella	fcb78a5505	linux: Consolidate INLINE_SYSCALL With all Linux ABIs using the expected Linux kABI to indicate syscalls errors, there is no need to replicate the INLINE_SYSCALL. The generic Linux sysdep.h includes errno.h even for !__ASSEMBLER__, which is ok now and it allows cleanup some archaic code that assume otherwise. Checked with a build against all affected ABIs.	2020-02-14 21:09:12 -03:00
Wilco Dijkstra	220622dde5	Add libm_alias_finite for _finite symbols This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:04 -03:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
Stefan Liebler	1c94bf0f0a	Always use wordsize-64 version of s_trunc.c. This patch replaces s_trunc.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:14 +01:00
Stefan Liebler	9f234eafe8	Always use wordsize-64 version of s_ceil.c. This patch replaces s_ceil.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:13 +01:00
Stefan Liebler	95b0c2c431	Always use wordsize-64 version of s_floor.c. This patch replaces s_floor.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 and sparc64 files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	ab48bdd098	Always use wordsize-64 version of s_rint.c. This patch replaces s_rint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:12 +01:00
Stefan Liebler	af123aa950	Always use wordsize-64 version of s_nearbyint.c. This patch replaces s_nearbyint.c in sysdeps/dbl-64 with the one in sysdeps/dbl-64/wordsize-64 and removes the latter one. The code is not changed except changes in code style. Also adjusted the include path in x86_64 file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-12-11 15:12:11 +01:00
Florian Weimer	4db71d2f98	elf: Do not run IFUNC resolvers for LD_DEBUG=unused [BZ #24214 ] This commit adds missing skip_ifunc checks to aarch64, arm, i386, sparc, and x86_64. A new test case ensures that IRELATIVE IFUNC resolvers do not run in various diagnostic modes of the dynamic loader. Reviewed-By: Szabolcs Nagy <szabolcs.nagy@arm.com>	2019-12-02 14:55:22 +01:00
Adhemerval Zanella	48dbce60cf	nptl: Add tests for internal pthread_rwlock_t offsets This patch new build tests to check for internal fields offsets for internal pthread_rwlock_t definition. Althoug the '__data.__flags' field layout should be preserved due static initializators, the patch also adds tests for the futexes that may be used in a shared memory (although using different libc version in such scenario is not really supported). Checked with a build against all affected ABIs. Change-Id: Iccc103d557de13d17e4a3f59a0cad2f4a640c148	2019-11-26 13:53:36 +00:00
Adhemerval Zanella	71d260c107	nptl: Cleanup mutex internal offset tests The offsets of pthread_mutex_t __data.__nusers, __data.__spins, __data.elision, __data.list are not required to be constant over the releases. Only the __data.__kind is used for static initializers. This patch also adds an additional size check for __data.__kind. Checked with a build against affected ABIs. Change-Id: I7a4e48cc91b4c4ada57e9a5d1b151fb702bfaa9f	2019-11-26 13:53:36 +00:00
Wilco Dijkstra	d0007dc53c	Remove x64 _finite tests and references Remove _finite tests and references from x86_64. Rather than calling __exp_finite, use exp directly (since it's the same entry point). x86_64 builds and passes testsuite. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2019-10-21 14:29:12 -03:00
Paul Eggert	5a82c74822	Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http\|ftp)(://(.\.)?(gnu\|fsf\|sourceware)\.org($\|[^.]\|\.[^a-z])),https\2,g s,(http\|ftp)(://(.\.)?)sources\.redhat\.com($\|[^.]\|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '.po' \ ! -name 'ChangeLog' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: * error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: * error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S	2019-09-07 02:43:31 -07:00
Wilco Dijkstra	3c05dd79d0	Use generic memset/memcpy/memmove in benchtests Use the generic C memset/memcpy/memmove in benchtests since comparing against a slow byte-oriented implementation makes no sense. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> 2019-08-29 Wilco Dijkstra <wdijkstr@arm.com> * benchtests/bench-memcpy.c (simple_memcpy): Remove. (generic_memcpy): Include generic C memcpy. * benchtests/bench-memmove.c (simple_memmove): Remove. (generic_memmove): Include generic C memmove. * benchtests/bench-memset.c (simple_memset): Remove. (generic_memset): Include generic C memset. * benchtests/bench-memset-large.c (simple_memset): Remove. (generic_memset): Include generic C memset. * benchtests/bench-memset-walk.c (simple_memset): Remove. (generic_memset): Include generic C memset. * string/memcpy.c (MEMCPY): Add defines to enable redirection. * string/memset.c (MEMSET): Likewise. * sysdeps/x86_64/memcopy.h: Remove empty file.	2019-08-30 17:21:35 +01:00
H.J. Lu	7e681561a3	x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603 ] When compiled with -O3 and AVX, GCC 8 and 9 optimize some loops in sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions, which leads to store forward stall: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579 There is no easy fix in compiler. This patch limits vector width to 128 bits to work around this issue. It improves performance of sin and cos by more than 40% on Skylake compiled with -O3 -march=skylake. Tested with GCC 7/8/9 on x86-64. [BZ #24603] * sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128 works. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New. Set to -mprefer-vector-width=128 if supported.	2019-07-24 14:48:43 -07:00
H.J. Lu	d039da1c00	x86: Add sysdeps/x86/dl-lookupcfg.h Since sysdeps/i386/dl-lookupcfg.h and sysdeps/x86_64/dl-lookupcfg.h are identical, we can replace them with sysdeps/x86/dl-lookupcfg.h. * sysdeps/i386/dl-lookupcfg.h: Moved to ... * sysdeps/x86/dl-lookupcfg.h: Here. * sysdeps/x86_64/dl-lookupcfg.h: Removed.	2019-06-26 15:07:28 -07:00
Adhemerval Zanella	81a1443941	wcsmbs: optimize wcscat This patch rewrites wcscat using wcslen and wcscpy. This is similar to the optimization done on strcat by `6e46de42fe`. The strcpy changes are mainly to add the internal alias to avoid PLT calls. Checked on x86_64-linux-gnu and a build against the affected architectures. * include/wchar.h (__wcscpy): New prototype. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c (__wcscpy): Route internal symbol to generic implementation. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy): Add internal __wcscpy alias. * sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise. * sysdeps/s390/wcscpy.c (wcscpy): Likewise. * sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise. * wcsmbs/wcscpy.c (wcscpy): Add * sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to use generic implementation. * wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.	2019-02-27 10:00:37 -03:00
Joseph Myers	32db86d558	Add fall-through comments. This patch adds fall-through comments in some cases where -Wextra produces implicit-fallthrough warnings. The patch is non-exhaustive. Apart from architecture-specific code for non-x86_64 architectures, it does not change sunrpc/xdr.c (legacy code, probably should have such changes, but left to be dealt with separately), or places that already had comments about the fall-through but not matching the form expected by -Wimplicit-fallthrough=3 (the default level with -Wextra; my inclination is to adjust those comments to match rather than downgrading to -Wimplicit-fallthrough=1 to allow any comment), or one place where I thought the implicit fallthrough was not correct and so should be handled separately as a bug fix. I think the key thing to consider in review of this patch is whether the fall-through is indeed intended and correct in each place where such a comment is added. Tested for x86_64. * elf/dl-exception.c (_dl_exception_create_format): Add fall-through comments. * elf/ldconfig.c (parse_conf_include): Likewise. * elf/rtld.c (print_statistics): Likewise. * locale/programs/charmap.c (parse_charmap): Likewise. * misc/mntent_r.c (__getmntent_r): Likewise. * posix/wordexp.c (parse_arith): Likewise. (parse_backtick): Likewise. * resolv/ns_ttl.c (ns_parse_ttl): Likewise. * sysdeps/x86/cpu-features.c (init_cpu_features): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.	2019-02-12 10:30:34 +00:00
Andreas Schwab	65f7767a91	Fix handling of collating elements in fnmatch (bug 17396, bug 16976) This fixes the same bug in fnmatch that was fixed by commit `7e2f0d2d77` for regexp matching. As a side effect it also removes the use of an unbound VLA.	2019-02-04 15:45:02 +01:00
H.J. Lu	3f635fb433	x86-64 memcmp: Use unsigned Jcc instructions on size [BZ #24155 ] Since the size argument is unsigned. we should use unsigned Jcc instructions, instead of signed, to check size. Tested on x86-64 and x32, with and without --disable-multi-arch. [BZ #24155] CVE-2019-7309 * NEWS: Updated for CVE-2019-7309. * sysdeps/x86_64/memcmp.S: Use RDX_LP for size. Clear the upper 32 bits of RDX register for x32. Use unsigned Jcc instructions, instead of signed. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2. * sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.	2019-02-04 06:31:13 -08:00

1 2 3 4 5 ...

1290 Commits