glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-09 23:00:07 +00:00

Author	SHA1	Message	Date
Paul Eggert	dff8da6b3e	Update copyright dates with scripts/update-copyrights	2024-01-01 10:53:40 -08:00
Joseph Myers	6d7e8eda9b	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
Paul Eggert	581c785bf3	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: * 912-#endif remote: * 913: remote: * 914- remote: * error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines	2022-01-01 11:40:24 -08:00
H.J. Lu	f53ffc9b90	x86: Handle _SC_LEVEL1_ICACHE_LINESIZE [BZ #27444 ] commit `2d651eb926` Author: H.J. Lu <hjl.tools@gmail.com> Date: Fri Sep 18 07:55:14 2020 -0700 x86: Move x86 processor cache info to cpu_features missed _SC_LEVEL1_ICACHE_LINESIZE. 1. Add level1_icache_linesize to struct cpu_features. 2. Initialize level1_icache_linesize by calling handle_intel, handle_zhaoxin and handle_amd with _SC_LEVEL1_ICACHE_LINESIZE. 3. Return level1_icache_linesize for _SC_LEVEL1_ICACHE_LINESIZE. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-03-15 05:43:26 -07:00
Andreas Schwab	31f6488722	Fix misplaced const Constify __x86_cacheinfo_p and __x86_cpu_features_p, not their pointer target types.	2021-01-25 15:09:02 +01:00
H.J. Lu	2d651eb926	x86: Move x86 processor cache info to cpu_features 1. Move x86 processor cache info to _dl_x86_cpu_features in ld.so. 2. Update tunable bounds with TUNABLE_SET_WITH_BOUNDS. 3. Move x86 cache info initialization to dl-cacheinfo.h and initialize x86 cache info in init_cpu_features (). 4. Put x86 cache info for libc in cacheinfo.h, which is included in libc-start.c in libc.a and is included in cacheinfo.c in libc.so. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-01-14 11:38:45 -08:00
Paul Eggert	2b778ceb40	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: * pre-commit check failed ... remote: * error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master	2021-01-02 12:17:34 -08:00
H.J. Lu	0f09154c64	x86: Initialize CPU info via IFUNC relocation [BZ 26203] X86 CPU features in ld.so are initialized by init_cpu_features, which is invoked by DL_PLATFORM_INIT from _dl_sysdep_start. But when ld.so is loaded by static executable, DL_PLATFORM_INIT is never called. Also x86 cache info in libc.o and libc.a is initialized by a constructor which may be called too late. Since some fields in _rtld_global_ro in ld.so are initialized by dynamic relocation, we can also initialize x86 CPU features in _rtld_global_ro in ld.so and cache info in libc.so by initializing dummy function pointers in ld.so and libc.so via IFUNC relocation. Key points: 1. IFUNC is always supported, independent of --enable-multi-arch or --disable-multi-arch. Linker generates IFUNC relocations from input IFUNC objects and ld.so performs IFUNC relocations. 2. There are no IFUNC dependencies in ld.so before dynamic relocation have been performed, 3. The x86 CPU features in ld.so is initialized by DL_PLATFORM_INIT in dynamic executable and by IFUNC relocation in dlopen in static executable. 4. The x86 cache info in libc.o is initialized by IFUNC relocation. 5. In libc.a, both x86 CPU features and cache info are initialized from ARCH_INIT_CPU_FEATURES, not by IFUNC relocation, before __libc_early_init is called. Note: _dl_x86_init_cpu_features can be called more than once from DL_PLATFORM_INIT and during relocation in ld.so.	2020-10-16 16:17:53 -07:00
Patrick McGehearty	d3c5702747	Reversing calculation of __x86_shared_non_temporal_threshold The __x86_shared_non_temporal_threshold determines when memcpy on x86 uses non_temporal stores to avoid pushing other data out of the last level cache. This patch proposes to revert the calculation change made by H.J. Lu's patch of June 2, 2017. H.J. Lu's patch selected a threshold suitable for a single thread getting maximum performance. It was tuned using the single threaded large memcpy micro benchmark on an 8 core processor. The last change changes the threshold from using 3/4 of one thread's share of the cache to using 3/4 of the entire cache of a multi-threaded system before switching to non-temporal stores. Multi-threaded systems with more than a few threads are server-class and typically have many active threads. If one thread consumes 3/4 of the available cache for all threads, it will cause other active threads to have data removed from the cache. Two examples show the range of the effect. John McCalpin's widely parallel Stream benchmark, which runs in parallel and fetches data sequentially, saw a 20% slowdown with this patch on an internal system test of 128 threads. This regression was discovered when comparing OL8 performance to OL7. An example that compares normal stores to non-temporal stores may be found at https://vgatherps.github.io/2018-09-02-nontemporal/. A simple test shows performance loss of 400 to 500% due to a failure to use nontemporal stores. These performance losses are most likely to occur when the system load is heaviest and good performance is critical. The tunable x86_non_temporal_threshold can be used to override the default for the knowledgable user who really wants maximum cache allocation to a single thread in a multi-threaded system. The manual entry for the tunable has been expanded to provide more information about its purpose. modified: sysdeps/x86/cacheinfo.c modified: manual/tunables.texi	2020-09-28 22:10:39 +00:00
H.J. Lu	107e6a3c22	x86: Support usable check for all CPU features Support usable check for all CPU features with the following changes: 1. Change struct cpu_features to struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; ... }; so that there is a usable bit for each cpuid bit. 2. After the cpuid bits have been initialized, copy the known bits to the usable bits. EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for CPU feature detection. 3. Clear the usable bits which require OS support. 4. If the feature is supported by OS, copy its cpuid bit to its usable bit. 5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE and CPU_FEATURE_USABLE_P to check if a feature is usable. 6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13. 7. Unset MPX feature since it has been deprecated. The results are 1. If the feature is known and doesn't requre OS support, its usable bit is copied from the cpuid bit. 2. Otherwise, its usable bit is copied from the cpuid bit only if the feature is known to supported by OS. 3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the feature can be used. 4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports the feature.	2020-07-13 06:05:16 -07:00
H.J. Lu	9016b6f389	x86: Remove the unused __x86_prefetchw Since commit `c867597bff` Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Jun 8 13:57:50 2016 -0700 X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove removed the only usage of __x86_prefetchw, we can remove the unused __x86_prefetchw.	2020-07-11 09:34:03 -07:00
H.J. Lu	3f4b61a0b8	x86: Add thresholds for "rep movsb/stosb" to tunables Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables to update thresholds for "rep movsb" and "rep stosb" at run-time. Note that the user specified threshold for "rep movsb" smaller than the minimum threshold will be ignored. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-07-06 11:48:42 -07:00
Florian Weimer	19108a3832	i386: Remove unused variable in sysdeps/x86/cacheinfo.c Commit `a98dc92dd1` ("x86: Add cache information support for Zhaoxin processors") introduced an unused variable warning in the default i686-linux-gnu build: In file included from ../sysdeps/i386/cacheinfo.c:3: ../sysdeps/x86/cacheinfo.c: In function 'init_cacheinfo': ../sysdeps/x86/cacheinfo.c:762:16: error: unused variable 'eax' [-Werror=unused-variable] 762 \| unsigned int eax; \| ^~~	2020-04-30 21:16:47 +02:00
mayshao-oc	a98dc92dd1	x86: Add cache information support for Zhaoxin processors To obtain Zhaoxin CPU cache information, add a new function handle_zhaoxin(). Add a new function get_common_cache_info() that extracts the code in init_cacheinfo() to get the value of the variable shared, threads. Add Zhaoxin branch in init_cacheinfo() for initializing variables, such as __x86_shared_cache_size.	2020-04-30 06:45:27 -07:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
Paul Eggert	5a82c74822	Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http\|ftp)(://(.\.)?(gnu\|fsf\|sourceware)\.org($\|[^.]\|\.[^a-z])),https\2,g s,(http\|ftp)(://(.\.)?)sources\.redhat\.com($\|[^.]\|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '.po' \ ! -name 'ChangeLog' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: * error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: * error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S	2019-09-07 02:43:31 -07:00
Joseph Myers	04277e02d7	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2019-01-01 00:11:28 +00:00
H.J. Lu	c22e4c2a14	x86: Extend CPUID support in struct cpu_features Extend CPUID support for all feature bits from CPUID. Add a new macro, CPU_FEATURE_USABLE, which can be used to check if a feature is usable at run-time, instead of HAS_CPU_FEATURE and HAS_ARCH_FEATURE. Add COMMON_CPUID_INDEX_D_ECX_1, COMMON_CPUID_INDEX_80000007 and COMMON_CPUID_INDEX_80000008 to check CPU feature bits in them. Tested on i686 and x86-64 as well as using build-many-glibcs.py with x86 targets. * sysdeps/x86/cacheinfo.c (intel_check_word): Updated for cpu_features_basic. (__cache_sysconf): Likewise. (init_cacheinfo): Likewise. * sysdeps/x86/cpu-features.c (get_extended_indeces): Also populate COMMON_CPUID_INDEX_80000007 and COMMON_CPUID_INDEX_80000008. (get_common_indices): Also populate COMMON_CPUID_INDEX_D_ECX_1. Use CPU_FEATURES_CPU_P (cpu_features, XSAVEC) to check if XSAVEC is available. Set the bit_arch_XXX_Usable bits. (init_cpu_features): Use _Static_assert on index_arch_Fast_Unaligned_Load. __get_cpuid_registers and __get_arch_feature. Updated for cpu_features_basic. Set stepping in cpu_features. * sysdeps/x86/cpu-features.h: (FEATURE_INDEX_1): Changed to enum. (FEATURE_INDEX_2): New. (FEATURE_INDEX_MAX): Changed to enum. (COMMON_CPUID_INDEX_D_ECX_1): New. (COMMON_CPUID_INDEX_80000007): Likewise. (COMMON_CPUID_INDEX_80000008): Likewise. (cpuid_registers): Likewise. (cpu_features_basic): Likewise. (CPU_FEATURE_USABLE): Likewise. (bit_arch_XXX_Usable): Likewise. (cpu_features): Use cpuid_registers and cpu_features_basic. (bit_arch_XXX): Reweritten. (bit_cpu_XXX): Likewise. (index_cpu_XXX): Likewise. (reg_XXX): Likewise. * sysdeps/x86/tst-get-cpu-features.c: Include <stdio.h> and <support/check.h>. (CHECK_CPU_FEATURE): New. (CHECK_CPU_FEATURE_USABLE): Likewise. (cpu_kinds): Likewise. (do_test): Print vendor, family, model and stepping. Check HAS_CPU_FEATURE and CPU_FEATURE_USABLE. (TEST_FUNCTION): Removed. Include <support/test-driver.c> instead of "../../test-skeleton.c". * sysdeps/x86_64/multiarch/sched_cpucount.c (__sched_cpucount): Check POPCNT instead of POPCOUNT. * sysdeps/x86_64/multiarch/test-multiarch.c (do_test): Likewise.	2018-12-03 05:54:56 -08:00
Joseph Myers	688903eb3e	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2018-01-01 00:32:25 +00:00
H.J. Lu	905947c304	tunables: Add IFUNC selection and cache sizes The current IFUNC selection is based on microbenchmarks in glibc. It should give the best performance for most workloads. But other choices may have better performance for a particular workload or on the hardware which wasn't available at the selection was made. The environment variable, GLIBC_TUNABLES=glibc.tune.ifunc=-xxx,yyy,-zzz...., can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature yyy and zzz, where the feature name is case-sensitive and has to match the ones in cpu-features.h. It can be used by glibc developers to override the IFUNC selection to tune for a new processor or improve performance for a particular workload. It isn't intended for normal end users. NOTE: the IFUNC selection may change over time. Please check all multiarch implementations when experimenting. Also, GLIBC_TUNABLES=glibc.tune.x86_non_temporal_threshold=NUMBER is provided to set threshold to use non temporal store to NUMBER, GLIBC_TUNABLES=glibc.tune.x86_data_cache_size=NUMBER to set data cache size, GLIBC_TUNABLES=glibc.tune.x86_shared_cache_size=NUMBER to set shared cache size. * elf/dl-tunables.list (tune): Add ifunc, x86_non_temporal_threshold, x86_data_cache_size and x86_shared_cache_size. * manual/tunables.texi: Document glibc.tune.ifunc, glibc.tune.x86_data_cache_size, glibc.tune.x86_shared_cache_size and glibc.tune.x86_non_temporal_threshold. * sysdeps/unix/sysv/linux/x86/dl-sysdep.c: New file. * sysdeps/x86/cpu-tunables.c: Likewise. * sysdeps/x86/cacheinfo.c (init_cacheinfo): Check and get data cache size, shared cache size and non temporal threshold from cpu_features. * sysdeps/x86/cpu-features.c [HAVE_TUNABLES] (TUNABLE_NAMESPACE): New. [HAVE_TUNABLES] Include <unistd.h>. [HAVE_TUNABLES] Include <elf/dl-tunables.h>. [HAVE_TUNABLES] (TUNABLE_CALLBACK (set_ifunc)): Likewise. [HAVE_TUNABLES] (init_cpu_features): Use TUNABLE_GET to set IFUNC selection, data cache size, shared cache size and non temporal threshold. * sysdeps/x86/cpu-features.h (cpu_features): Add data_cache_size, shared_cache_size and non_temporal_threshold.	2017-06-20 08:37:28 -07:00
H.J. Lu	48e7bc7a55	x86: Don't use dl_x86_cpu_features in cacheinfo.c Since cpu_features is available, use it instead of dl_x86_cpu_features. * sysdeps/x86/cacheinfo.c (intel_check_word): Accept cpu_features and use it instead of dl_x86_cpu_features. (handle_intel): Replace maxidx with cpu_features. Pass cpu_features to intel_check_word. (__cache_sysconf): Pass cpu_features to handle_intel. (init_cacheinfo): Likewise. Use cpu_features instead of dl_x86_cpu_features.	2017-06-05 16:20:11 -07:00
H.J. Lu	808fd9e6fe	x86: Update __x86_shared_non_temporal_threshold __x86_shared_non_temporal_threshold was set to 6 times of per-core shared cache size, based on the large memcpy micro benchmark in glibc on a 8-core processor. For a processor with more than 8 cores, the threshold is too low. Set __x86_shared_non_temporal_threshold to the 3/4 of the total shared cache size so that it is unchanged on 8-core processors. On processors with less than 8 cores, the threshold is lower. * sysdeps/x86/cacheinfo.c (__x86_shared_non_temporal_threshold): Set to the 3/4 of the total shared cache size.	2017-06-02 17:32:37 -07:00
H.J. Lu	9c450f6f6f	x86: Don't include cacheinfo.c in ld.so Since cacheinfo.c isn't used by ld.so, there is no need to include it in ld.so. * sysdeps/x86/cacheinfo.c: Skip if not in libc.	2017-05-24 06:33:43 -07:00
H.J. Lu	7c1d722554	x86: Use __get_cpu_features to get cpu_features Remove is_intel, is_amd and max_cpuid macros. Use __get_cpu_features to get cpu_features instead. * sysdeps/x86/cacheinfo.c (is_intel): Removed. (is_amd): Likewise. (max_cpuid): Likewise. (__cache_sysconf): Use __get_cpu_features to get cpu_features. (init_cacheinfo): Likewise.	2017-05-24 06:28:52 -07:00
Joseph Myers	bfff8b1bec	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
H.J. Lu	6a824767d8	X86: Don't assert on older Intel CPUs [BZ #20647 ] Since the maximum CPUID level of older Intel CPUs is 1, change handle_intel to return -1, instead of assert, when the maximum CPUID level is less than 2. [BZ #20647] * sysdeps/x86/cacheinfo.c (handle_intel): Return -1 if the maximum CPUID level is less than 2.	2016-10-12 08:22:52 -07:00
H.J. Lu	d6af2388f7	Count number of logical processors sharing L2 cache For Intel processors, when there are both L2 and L3 caches, SMT level type should be ued to count number of available logical processors sharing L2 cache. If there is only L2 cache, core level type should be used to count number of available logical processors sharing L2 cache. Number of available logical processors sharing L2 cache should be used for non-inclusive L2 and L3 caches. * sysdeps/x86/cacheinfo.c (init_cacheinfo): Count number of available logical processors with SMT level type sharing L2 cache for Intel processors.	2016-05-27 15:16:51 -07:00
H.J. Lu	b7598b1b85	Remove special L2 cache case for Knights Landing L2 cache is shared by 2 cores on Knights Landing, which has 4 threads per core: https://en.wikipedia.org/wiki/Xeon_Phi#Knights_Landing So L2 cache is shared by 8 threads on Knights Landing as reported by CPUID. We should remove special L2 cache case for Knights Landing. [BZ #18185] * sysdeps/x86/cacheinfo.c (init_cacheinfo): Don't limit threads sharing L2 cache to 2 for Knights Landing.	2016-05-20 14:42:00 -07:00
H.J. Lu	de71e0421b	Correct Intel processor level type mask from CPUID Intel CPUID with EAX == 11 returns: ECX Bits 07 - 00: Level number. Same value in ECX input. Bits 15 - 08: Level type. ^^^^^^^^^^^^^^^^^^^^^^^^ This is level type. Bits 31 - 16: Reserved. Intel processor level type mask should be 0xff00, not 0xff0. [BZ #20119] * sysdeps/x86/cacheinfo.c (init_cacheinfo): Correct Intel processor level type mask for CPUID with EAX == 11.	2016-05-19 10:02:36 -07:00
H.J. Lu	7c08d791ee	Check the HTT bit before counting logical threads Skip counting logical threads for Intel processors if the HTT bit is 0 which indicates there is only a single logical processor. * sysdeps/x86/cacheinfo.c (init_cacheinfo): Skip counting logical threads if the HTT bit is 0. * sysdeps/x86/cpu-features.h (bit_cpu_HTT): New. (index_cpu_HTT): Likewise. (reg_HTT): Likewise.	2016-05-19 09:09:00 -07:00
H.J. Lu	9e4ec3e816	Support non-inclusive caches on Intel processors * sysdeps/x86/cacheinfo.c (init_cacheinfo): Check and support non-inclusive caches on Intel processors.	2016-05-13 07:18:35 -07:00
H.J. Lu	a9558b49b3	Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86 Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86. No code changes on x86 and x86_64. * sysdeps/i386/cacheinfo.c: Include <sysdeps/x86/cacheinfo.c> instead of <sysdeps/x86_64/cacheinfo.c>. * sysdeps/x86_64/cacheinfo.c: Moved to ... * sysdeps/x86/cacheinfo.c: Here.	2016-05-08 08:49:18 -07:00

32 Commits