glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-26 15:00:06 +00:00

Author	SHA1	Message	Date
Sunil K Pandey	f20f980c71	x86-64: Add vector acos/acosf implementation to libmvec Implement vectorized acos/acosf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector acos/acosf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-12-22 13:03:14 -08:00
Aurelien Jarno	94058f6cde	elf: Fix tst-cpu-features-cpuinfo for KVM guests on some AMD systems [BZ #28704 ] On KVM guests running on some AMD systems, the IBRS feature is reported as a synthetic feature using the Intel feature, while the cpuinfo entry keeps the same. Handle that by first checking the presence of the Intel feature on AMD systems. Fixes bug 28704.	2021-12-17 20:20:15 +01:00
H.J. Lu	ea5814467a	x86-64: Remove LD_PREFER_MAP_32BIT_EXEC support [BZ #28656 ] Remove the LD_PREFER_MAP_32BIT_EXEC environment variable support since the first PT_LOAD segment is no longer executable due to defaulting to -z separate-code. This fixes [BZ #28656]. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2021-12-10 14:01:34 -08:00
Florian Weimer	8dbeb0561e	nptl: Add <thread_pointer.h> for defining __thread_pointer <tls.h> already contains a definition that is quite similar, but it is not consistent across architectures. Only architectures for which rseq support is added are covered. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2021-12-09 09:49:32 +01:00
H.J. Lu	ceeffe968c	x86: Don't set Prefer_No_AVX512 for processors with AVX512 and AVX-VNNI Don't set Prefer_No_AVX512 on processors with AVX512 and AVX-VNNI since they won't lower CPU frequency when ZMM load and store instructions are used.	2021-12-06 07:14:12 -08:00
Noah Goldstein	475b63702e	x86: Double size of ERMS rep_movsb_threshold in dl-cacheinfo.h No bug. This patch doubles the rep_movsb_threshold when using ERMS. Based on benchmarks the vector copy loop, especially now that it handles 4k aliasing, is better for these medium ranged. On Skylake with ERMS: Size, Align1, Align2, dst>src,(rep movsb) / (vec copy) 4096, 0, 0, 0, 0.975 4096, 0, 0, 1, 0.953 4096, 12, 0, 0, 0.969 4096, 12, 0, 1, 0.872 4096, 44, 0, 0, 0.979 4096, 44, 0, 1, 0.83 4096, 0, 12, 0, 1.006 4096, 0, 12, 1, 0.989 4096, 0, 44, 0, 0.739 4096, 0, 44, 1, 0.942 4096, 12, 12, 0, 1.009 4096, 12, 12, 1, 0.973 4096, 44, 44, 0, 0.791 4096, 44, 44, 1, 0.961 4096, 2048, 0, 0, 0.978 4096, 2048, 0, 1, 0.951 4096, 2060, 0, 0, 0.986 4096, 2060, 0, 1, 0.963 4096, 2048, 12, 0, 0.971 4096, 2048, 12, 1, 0.941 4096, 2060, 12, 0, 0.977 4096, 2060, 12, 1, 0.949 8192, 0, 0, 0, 0.85 8192, 0, 0, 1, 0.845 8192, 13, 0, 0, 0.937 8192, 13, 0, 1, 0.939 8192, 45, 0, 0, 0.932 8192, 45, 0, 1, 0.927 8192, 0, 13, 0, 0.621 8192, 0, 13, 1, 0.62 8192, 0, 45, 0, 0.53 8192, 0, 45, 1, 0.516 8192, 13, 13, 0, 0.664 8192, 13, 13, 1, 0.659 8192, 45, 45, 0, 0.593 8192, 45, 45, 1, 0.575 8192, 2048, 0, 0, 0.854 8192, 2048, 0, 1, 0.834 8192, 2061, 0, 0, 0.863 8192, 2061, 0, 1, 0.857 8192, 2048, 13, 0, 0.63 8192, 2048, 13, 1, 0.629 8192, 2061, 13, 0, 0.627 8192, 2061, 13, 1, 0.62 Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-11-06 16:18:08 -05:00
Florian Weimer	cca75bd8b5	i386: Explain why __HAVE_64B_ATOMICS has to be 0	2021-11-02 10:26:23 +01:00
H.J. Lu	14dbbf46a0	x86-64: Remove Prefer_AVX2_STRCMP Remove Prefer_AVX2_STRCMP to enable EVEX strcmp. When comparing 2 32-byte strings, EVEX strcmp has been improved to require 1 load, 1 VPTESTM, 1 VPCMP, 1 KMOVD and 1 INCL instead of 2 loads, 3 VPCMPs, 2 KORDs, 1 KMOVD and 1 TESTL while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU, 1 VPMOVMSKB and 1 TESTL. EVEX strcmp is now faster than AVX2 strcmp by up to 40% on Tiger Lake and Ice Lake.	2021-11-01 07:53:04 -07:00
Fangrui Song	bf433b849a	elf: Remove Intel MPX support (lazy PLT, ld.so profile, and LD_AUDIT) Intel MPX failed to gain wide adoption and has been deprecated for a while. GCC 9.1 removed Intel MPX support. Linux kernel removed MPX in 2019. This patch removes the support code from the dynamic loader. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-10-11 11:14:02 -07:00
Noah Goldstein	fc5bd179ef	x86: Modify ENTRY in sysdep.h so that p2align can be specified No bug. This change adds a new macro ENTRY_P2ALIGN which takes a second argument, log2 of the desired function alignment. The old ENTRY(name) macro is just ENTRY_P2ALIGN(name, 4) so this doesn't affect any existing functionality. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>	2021-10-08 11:30:52 -05:00
H.J. Lu	1bd888d0b7	Initial support for GNU_PROPERTY_1_NEEDED 1. Add GNU_PROPERTY_1_NEEDED: #define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO to indicate the needed properties by the object file. 2. Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS: #define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0) to indicate that the object file requires canonical function pointers and cannot be used with copy relocation. 3. Scan GNU_PROPERTY_1_NEEDED property and store it in l_1_needed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-10-07 10:26:08 -07:00
Joseph Myers	b26901b26e	Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor case It turns out the __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c does not cover all cases where the x86 fenv_private.h macros might manipulate one of the SSE and 387 floating-point state, while the actual fma implementation uses the other. Specifically, in the 32-bit case, with a compiler not defaulting to -mfpmath=sse, but testing on a processor with hardware FMA support, the multiarch fma function implementations will end up using SSE, while the fenv_private.h macros will use the 387 state for double. Change the conditional to use the default macros rather than the optimized ones in all cases except when the compiler inlines an fma instruction (in which case, since all those instructions are SSE instructions and -mfpmath=sse must be in effect for them to be inlined, the optimized macros will only use the SSE state and it's OK for them to only use the SSE state). Tested for x86_64 and x86. H.J. reports in <https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html> that it fixes the problems he observed.	2021-09-24 17:59:22 +00:00
Joseph Myers	4ed7a383f9	Fix ffma use of round-to-odd on x86 On 32-bit x86 with -mfpmath=sse, and on x86_64 with --disable-multi-arch, the tests of ffma and its aliases (fma narrowing from binary64 to binary32) fail. This is probably the issue reported by H.J. in <https://sourceware.org/pipermail/libc-alpha/2021-September/131277.html>. The problem is the use of fenv_private.h macros in the round-to-odd implementation. Those macros are set up to manipulate only one of the SSE and 387 floating-point state, whichever is relevant for the type indicated by the suffix on the macro name. But x86 configurations sometimes use the ldbl-96 implementation of binary64 fma (that's where --disable-multi-arch is relevant for x86_64: it causes the ldbl-96 implementation to be used, instead of an IFUNC implementation that falls back to the dbl-64 version), contrary to the expectations of those macros for functions operating on double when __SSE2_MATH__ is defined. This can be addressed by using the default versions of those macros (giving x86 its own version of s_ffma.c), as is done for the *f128 macro variants where it depends on the details of how GCC was configured when building libgcc which floating-point state is affected by _Float128 arithmetic. The issue only applies when __SSE2_MATH__ is defined, and doesn't apply when __FP_FAST_FMA is defined (because in that case, fma will be inlined by the compiler, meaning it's definitely an SSE operation; for the same reason, this is not an issue for narrowing sqrt, as hardware sqrt is always inlined in that implementation for x86), but in other cases it's safest to use the default versions of the fenv_private.h macros to ensure things work whichever fma implementation is used. Tested for x86_64 (with and without --disable-multi-arch) and x86 (with and without -mfpmath=sse).	2021-09-23 21:18:31 +00:00
Siddhesh Poyarekar	30891f35fa	Remove "Contributed by" lines We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-09-03 22:06:44 +05:30
Fangrui Song	224edada60	configure: Allow LD to be LLD 13.0.0 or above [BZ #26558 ] When using LLD (LLVM linker) as the linker, configure prints a confusing message. *** These critical programs are missing or too old: GNU ld LLD>=13.0.0 can build glibc --enable-static-pie. (8.0.0 needs one workaround for -Wl,-defsym=_begin=0. 9.0.0 works with --disable-static-pie). XFAIL two tests sysdeps/x86/tst-ifunc-isa-* which have the BZ #28154 issue (LLD follows the PowerPC port of GNU ld for ifunc by placing IRELATIVE relocations in .rela.dyn, triggering a glibc ifunc fragility). The set of dynamic symbols is the same with GNU ld and LLD, modulo unused SHN_ABS version node symbols. For comparison, gold does not support --enable-static-pie yet (--no-dynamic-linker is unsupported BZ #22221), yet has 6 failures more than LLD. gold linked libc.so has larger .dynsym differences with GNU ld and LLD (non-default version symbols are changed to default versions by a version script BZ #28196).	2021-08-31 20:23:34 -07:00
Matt Whitlock	0835c0f0ba	x86: fix Autoconf caching of instruction support checks [BZ #27991 ] The Autoconf documentation for the AC_CACHE_CHECK macro states: The commands-to-set-it must have no side effects except for setting the variable cache-id, see below. However, the tests for support of -msahf and -mmovbe were embedded in the commands-to-set-it for lib_cv_include_x86_isa_level. This had the consequence that libc_cv_have_x86_lahf_sahf and libc_cv_have_x86_movbe were not defined whenever lib_cv_include_x86_isa_level was read from cache. These variables' being undefined meant that their unquoted use in later test expressions led to the 'test' built-in's misparsing its arguments and emitting errors like "test: =: unexpected operator" or "test: =: unary operator expected", depending on the particular shell. This commit refactors the tests for LAHF/SAHF and MOVBE instruction support into their own AC_CACHE_CHECK macro invocations to obey the rule that the commands-to-set-it must have no side effects other than setting the variable named by cache-id. Signed-off-by: Matt Whitlock <sourceware@mattwhitlock.name> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-08-19 09:11:35 -03:00
H.J. Lu	91cc803d27	x86-64: Add Avoid_Short_Distance_REP_MOVSB commit `3ec5d83d2a` Author: H.J. Lu <hjl.tools@gmail.com> Date: Sat Jan 25 14:19:40 2020 -0800 x86-64: Avoid rep movsb with short distance [BZ #27130] introduced some regressions on Intel processors without Fast Short REP MOV (FSRM). Add Avoid_Short_Distance_REP_MOVSB to avoid rep movsb with short distance only on Intel processors with FSRM. bench-memmove-large on Skylake server shows that cycles of __memmove_evex_unaligned_erms improves for the following data size: before after Improvement length=4127, align1=3, align2=0: 479.38 349.25 27% length=4223, align1=9, align2=5: 405.62 333.25 18% length=8223, align1=3, align2=0: 786.12 496.38 37% length=8319, align1=9, align2=5: 727.50 501.38 31% length=16415, align1=3, align2=0: 1436.88 840.00 41% length=16511, align1=9, align2=5: 1375.50 836.38 39% length=32799, align1=3, align2=0: 2890.00 1860.12 36% length=32895, align1=9, align2=5: 2891.38 1931.88 33%	2021-07-28 13:23:57 -07:00
H.J. Lu	7c124e3714	x86: Install <bits/platform/x86.h> [BZ #27958 ] 1. Install <bits/platform/x86.h> for <sys/platform/x86.h> which includes <bits/platform/x86.h>. 2. Rename HAS_CPU_FEATURE to CPU_FEATURE_PRESENT which checks if the processor has the feature. 3. Rename CPU_FEATURE_USABLE to CPU_FEATURE_ACTIVE which checks if the feature is active. There may be other preconditions, like sufficient stack space or further setup for AMX, which must be satisfied before the feature can be used. This fixes BZ #27958. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-07-23 05:12:51 -07:00
Siddhesh Poyarekar	5b8d271571	Fix build and tests with --disable-tunables Remove unused code and declare __libc_mallopt when !IS_IN (libc) to allow the debug hook to build with --disable-tunables. Also, run tst-ifunc-isa-2* tests only when tunables are enabled since the result depends on it. Tested on x86_64. Reported-by: Matheus Castanho <msc@linux.ibm.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-07-23 13:57:56 +05:30
Adhemerval Zanella	469761eac8	elf: Fix tst-cpu-features-cpuinfo on some AMD systems (BZ #28090 ) The SSBD feature is implemented in 2 different ways on AMD processors: newer systems (Zen3) provides AMD_SSBD (function 8000_0008, EBX[24]), while older system provides AMD_VIRT_SSBD (function 8000_0008, EBX[25]). However for AMD_VIRT_SSBD, kernel shows both 'ssdb' and 'virt_ssdb' on /proc/cpuinfo; while for AMD_SSBD only 'ssdb' is provided. This now check is AMD_SSBD is set to check for 'ssbd', otherwise check if AMD_VIRT_SSDB is set to check for 'virt_ssbd'. Checked on x86_64-linux-gnu on a Ryzen 9 5900x. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-07-19 14:12:29 -03:00
H.J. Lu	ea8e465a6b	x86: Check RTM_ALWAYS_ABORT for RTM [BZ #28033 ] From https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html * Intel TSX will be disabled by default. * The processor will force abort all Restricted Transactional Memory (RTM) transactions by default. * A new CPUID bit CPUID.07H.0H.EDX[11](RTM_ALWAYS_ABORT) will be enumerated, which is set to indicate to updated software that the loaded microcode is forcing RTM abort. * On processors that enumerate support for RTM, the CPUID enumeration bits for Intel TSX (CPUID.07H.0H.EBX[11] and CPUID.07H.0H.EBX[4]) continue to be set by default after microcode update. * Workloads that were benefited from Intel TSX might experience a change in performance. * System software may use a new bit in Model-Specific Register (MSR) 0x10F TSX_FORCE_ABORT[TSX_CPUID_CLEAR] functionality to clear the Hardware Lock Elision (HLE) and RTM bits to indicate to software that Intel TSX is disabled. 1. Add RTM_ALWAYS_ABORT to CPUID features. 2. Set RTM usable only if RTM_ALWAYS_ABORT isn't set. This skips the string/tst-memchr-rtm etc. testcases on the affected processors, which always fail after a microcde update. 3. Check RTM feature, instead of usability, against /proc/cpuinfo. This fixes BZ #28033.	2021-07-01 10:47:35 -07:00
Adhemerval Zanella	e3e3eb0a2e	x86: Fix tst-cpu-features-cpuinfo on Ryzen 9 (BZ #27873 ) AMD define different flags for IRPB, IBRS, and STIPBP [1], so new x86_64_cpu are added and IBRS_IBPB is only tested for Intel. The SSDB is also defined and implemented different on AMD [2], and also a new AMD_SSDB flag is added. It should map to the cpuinfo 'ssdb' on recent AMD cpus. It fixes tst-cpu-features-cpuinfo and tst-cpu-features-cpuinfo-static on recent AMD cpus. Checked on x86_64-linux-gnu on AMD Ryzen 9 5900X. [1] https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf [2] https://bugzilla.kernel.org/show_bug.cgi?id=199889 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2021-06-24 09:57:46 -03:00
H.J. Lu	ea26ff0322	x86: Copy IBT and SHSTK usable only if CET is enabled IBT and SHSTK usable bits are copied from CPUID feature bits and later cleared if kernel doesn't support CET. Copy IBT and SHSTK usable only if CET is enabled so that they aren't set on CET capable processors with non-CET enabled glibc.	2021-06-23 17:35:47 -07:00
Florian Weimer	6f1c701026	dlfcn: Cleanups after -ldl is no longer required This commit removes the ELF constructor and internal variables from dlfcn/dlfcn.c. The file now serves the same purpose as nptl/libpthread-compat.c, so it is renamed to dlfcn/libdl-compat.c. The use of libdl-shared-only-routines ensures that libdl.a is empty. This commit adjusts the test suite not to use $(libdl). The libdl.so symbolic link is no longer installed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-06-03 09:11:45 +02:00
H.J. Lu	79aec84102	Properly check stack alignment [BZ #27901 ] 1. Replace if ((((uintptr_t) &_d) & (__alignof (double) - 1)) != 0) which may be optimized out by compiler, with int __attribute__ ((weak, noclone, noinline)) is_aligned (void *p, int align) { return (((uintptr_t) p) & (align - 1)) != 0; } 2. Add TEST_STACK_ALIGN_INIT to TEST_STACK_ALIGN. 3. Add a common TEST_STACK_ALIGN_INIT to check 16-byte stack alignment for both i386 and x86-64. 4. Update powerpc to use TEST_STACK_ALIGN_INIT. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-05-24 07:42:12 -07:00
H.J. Lu	cf2c57526b	x86: Set rep_movsb_threshold to 2112 on processors with FSRM The glibc memcpy benchmark on Intel Core i7-1065G7 (Ice Lake) showed that REP MOVSB became faster after 2112 bytes: Vector Move REP MOVSB length=2112, align1=0, align2=0: 24.20 24.40 length=2112, align1=1, align2=0: 26.07 23.13 length=2112, align1=0, align2=1: 27.18 28.13 length=2112, align1=1, align2=1: 26.23 25.16 length=2176, align1=0, align2=0: 23.18 22.52 length=2176, align1=2, align2=0: 25.45 22.52 length=2176, align1=0, align2=2: 27.14 27.82 length=2176, align1=2, align2=2: 22.73 25.56 length=2240, align1=0, align2=0: 24.62 24.25 length=2240, align1=3, align2=0: 29.77 27.15 length=2240, align1=0, align2=3: 35.55 29.93 length=2240, align1=3, align2=3: 34.49 25.15 length=2304, align1=0, align2=0: 34.75 26.64 length=2304, align1=4, align2=0: 32.09 22.63 length=2304, align1=0, align2=4: 28.43 31.24 Use REP MOVSB for data size > 2112 bytes in memcpy on processors with fast short REP MOVSB (FSRM). * sysdeps/x86/dl-cacheinfo.h (dl_init_cacheinfo): Set rep_movsb_threshold to 2112 on processors with fast short REP MOVSB (FSRM).	2021-05-03 05:08:22 -07:00
H.J. Lu	7fc9152e83	x86: tst-cpu-features-supports.c: Update AMX check Pass "amx-bf16", "amx-int8" and "amx-tile", instead of "amx_bf16", "amx_int8" and "amx_tile", to __builtin_cpu_supports for GCC 11.	2021-04-22 10:09:49 -07:00
Florian Weimer	81dfc6694c	nptl: Remove longjmp, siglongjmp from libpthread The definitions in libc are sufficient, the forwarders are no longer needed. The symbols have been moved using scripts/move-symbol-to-libc.py. s390-linux-gnu and s390x-linux-gnu need a new version placeholder to keep the GLIBC_2.19 symbol version in libpthread. Tested on i386-linux-gnu, powerpc64le-linux-gnu, s390x-linux-gnu, x86_64-linux-gnu. Built with build-many-glibcs.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-04-21 19:49:50 +02:00
Siddhesh Poyarekar	abadbef5c8	Move __isnanf128 to libc.so All of the isnan functions are in libc.so due to printf_fp, so move __isnanf128 there too for consistency. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@ascii.art.br> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2021-03-30 14:58:19 +05:30
H.J. Lu	4bd660be40	x86: Add string/memory function tests in RTM region At function exit, AVX optimized string/memory functions have VZEROUPPER which triggers RTM abort. When such functions are called inside a transactionally executing RTM region, RTM abort causes severe performance degradation. Add tests to verify that string/memory functions won't cause RTM abort in RTM region.	2021-03-29 07:40:17 -07:00
H.J. Lu	1da50d4bda	x86: Set Prefer_No_VZEROUPPER and add Prefer_AVX2_STRCMP 1. Set Prefer_No_VZEROUPPER if RTM is usable to avoid RTM abort triggered by VZEROUPPER inside a transactionally executing RTM region. 2. Since to compare 2 32-byte strings, 256-bit EVEX strcmp requires 2 loads, 3 VPCMPs and 2 KORDs while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU and 1 VPMOVMSKB, AVX2 strcmp is faster than EVEX strcmp. Add Prefer_AVX2_STRCMP to prefer AVX2 strcmp family functions.	2021-03-29 07:40:17 -07:00
H.J. Lu	27f7463675	x86: Properly disable XSAVE related features [BZ #27605 ] 1. Support GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVE. 2. Disable all features which depend on XSAVE: a. If OSXSAVE is disabled by glibc tunables. Or b. If both XSAVE and XSAVEC aren't usable.	2021-03-29 06:04:17 -07:00
Samuel Thibault	16b597807d	elf: Fix not compiling ifunc tests that need gcc ifunc support	2021-03-24 01:52:46 +01:00
Siddhesh Poyarekar	941ea10f80	Build get-cpuid-feature-leaf.c without stack-protector [BZ #27555 ] __x86_get_cpuid_feature_leaf is called during early startup, before the stack check guard is initialized and is hence not safe to build with stack-protector. Additionally, IFUNC resolvers for static tst-ifunc-isa tests get called too early for stack protector to be useful, so fix them to disable stack protector for the resolver functions. This fixes all failures seen with --enable-stack-protector=all configuration. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-03-15 20:24:45 +05:30
H.J. Lu	f53ffc9b90	x86: Handle _SC_LEVEL1_ICACHE_LINESIZE [BZ #27444 ] commit `2d651eb926` Author: H.J. Lu <hjl.tools@gmail.com> Date: Fri Sep 18 07:55:14 2020 -0700 x86: Move x86 processor cache info to cpu_features missed _SC_LEVEL1_ICACHE_LINESIZE. 1. Add level1_icache_linesize to struct cpu_features. 2. Initialize level1_icache_linesize by calling handle_intel, handle_zhaoxin and handle_amd with _SC_LEVEL1_ICACHE_LINESIZE. 3. Return level1_icache_linesize for _SC_LEVEL1_ICACHE_LINESIZE. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-03-15 05:43:26 -07:00
H.J. Lu	339bf918ea	x86: Set minimum x86-64 level marker [BZ #27318 ] Since the full ISA set used in an ELF binary is unknown to compiler, an x86-64 ISA level marker indicates the minimum, not maximum, ISA set required to run such an ELF binary. We never guarantee a library with an x86-64 ISA level v3 marker doesn't contain other ISAs beyond x86-64 ISA level v3, like AVX VNNI. We check the x86-64 ISA level marker for the minimum ISA set. Since -march=sandybridge enables only some ISAs in x86-64 ISA level v3, we should set the needed ISA marker to v2. Otherwise, libc is compiled with -march=sandybridge will fail to run on Sandy Bridge: $ ./elf/ld.so ./libc.so ./libc.so: (p) CPU ISA level is lower than required: needed: 7; got: 3 Set the minimum, instead of maximum, x86-64 ISA level marker should have no impact on the glibc-hwcaps directory assignment logic in ldconfig nor ld.so.	2021-03-06 07:49:30 -08:00
Florian Weimer	01a5746b6c	x86: Add CPU-specific diagnostics to ld.so --list-diagnostics	2021-03-02 15:01:10 +01:00
Florian Weimer	e4933c8a92	x86: Automate generation of PREFERRED_FEATURE_INDEX_1 bitfield Use a .def file to define the bitfield layout, so that it is possible to iterate over field members using the preprocessor.	2021-03-02 15:01:06 +01:00
H.J. Lu	89de9d3958	x86: Use x86/nptl/pthreaddef.h 1. Move sysdeps/i386/nptl/pthreaddef.h to sysdeps/x86/nptl/pthreaddef.h. 2. Remove sysdeps/x86_64/nptl/pthreaddef.h. Reviewed-by: DJ Delorie <dj@redhat.com>	2021-02-22 15:52:56 -08:00
Florian Weimer	feb741bb81	x86: Remove unused variables for raw cache sizes from cacheinfo.h	2021-02-22 17:36:03 +01:00
H.J. Lu	ba230b6387	<bits/platform/x86.h>: Correct x86_cpu_TBM x86_cpu_TBM should be x86_cpu_index_80000001_ecx + 21.	2021-02-22 04:31:51 -08:00
H.J. Lu	ce4a94b12e	x86: Remove the extra space between "# endif" Remove the extra space between "# endif" left over from commit `f380868f6d` Author: H.J. Lu <hjl.tools@gmail.com> Date: Thu Dec 24 15:43:34 2020 -0800 Remove _ISOMAC check from <cpu-features.h>	2021-02-12 07:50:29 -08:00
Siddhesh Poyarekar	a1b8b06a55	x86: Use SIZE_MAX instead of (long int)-1 for tunable range value The tunable types are SIZE_T, so set the ranges to the correct maximum value, i.e. SIZE_MAX.	2021-02-10 19:08:33 +05:30
Siddhesh Poyarekar	61117bfa1b	tunables: Simplify TUNABLE_SET interface The TUNABLE_SET interface took a primitive C type argument, which resulted in inconsistent type conversions internally due to incorrect dereferencing of types, especialy on 32-bit architectures. This change simplifies the TUNABLE setting logic along with the interfaces. Now all numeric tunable values are stored as signed numbers in tunable_num_t, which is intmax_t. All calls to set tunables cast the input value to its primitive type and then to tunable_num_t for storage. This relies on gcc-specific (although I suspect other compilers woul also do the same) unsigned to signed integer conversion semantics, i.e. the bit pattern is conserved. The reverse conversion is guaranteed by the standard.	2021-02-10 19:08:33 +05:30
H.J. Lu	5ab25c8875	x86: Add PTWRITE feature detection [BZ #27346 ] 1. Add CPUID_INDEX_14_ECX_0 for CPUID leaf 0x14 to detect PTWRITE feature in EBX of CPUID leaf 0x14 with ECX == 0. 2. Add PTWRITE detection to CPU feature tests. 3. Add 2 static CPU feature tests.	2021-02-07 08:01:14 -08:00
Sajan Karumanchi	6e02b3e932	x86: Adding an upper bound for Enhanced REP MOVSB. In the process of optimizing memcpy for AMD machines, we have found the vector move operations are outperforming enhanced REP MOVSB for data transfers above the L2 cache size on Zen3 architectures. To handle this use case, we are adding an upper bound parameter on enhanced REP MOVSB:'__x86_rep_movsb_stop_threshold'. As per large-bench results, we are configuring this parameter to the L2 cache size for AMD machines and applicable from Zen3 architecture supporting the ERMS feature. For architectures other than AMD, it is the computed value of non-temporal threshold parameter. Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>	2021-02-02 12:42:15 +01:00
H.J. Lu	6c57d32048	sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305 ] Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack space required in order to gurantee successful, non-nested handling of a single signal whose handler is an empty function, and _SC_SIGSTKSZ which is the suggested minimum number of bytes of stack space required for a signal stack. If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns MINSIGSTKSZ. On Linux/x86 with XSAVE, the signal frame used by kernel is composed of the following areas and laid out as: ------------------------------ \| alignment padding \| ------------------------------ \| xsave buffer \| ------------------------------ \| fsave header (32-bit only) \| ------------------------------ \| siginfo + ucontext \| ------------------------------ Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave header (32-bit only) + size of siginfo and ucontext + alignment padding. If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ are redefined as /* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). / # undef SIGSTKSZ # define SIGSTKSZ sysconf (_SC_SIGSTKSZ) / Minimum stack size for a signal handler: SIGSTKSZ. */ # undef MINSIGSTKSZ # define MINSIGSTKSZ SIGSTKSZ Compilation will fail if the source assumes constant MINSIGSTKSZ or SIGSTKSZ. The reason for not simply increasing the kernel's MINSIGSTKSZ #define (apart from the fact that it is rarely used, due to glibc's shadowing definitions) was that userspace binaries will have baked in the old value of the constant and may be making assumptions about it. For example, the type (char [MINSIGSTKSZ]) changes if this #define changes. This could be a problem if an newly built library tries to memcpy() or dump such an object defined by and old binary. Bounds-checking and the stack sizes passed to things like sigaltstack() and makecontext() could similarly go wrong.	2021-02-01 11:00:52 -08:00
H.J. Lu	04dff6fc0d	x86: Properly set usable CET feature bits [BZ #26625 ] commit `94cd37ebb2` Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Sep 16 05:27:32 2020 -0700 x86: Use HAS_CPU_FEATURE with IBT and SHSTK [BZ #26625] broke GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK since it can no longer disable IBT nor SHSTK. Handle IBT and SHSTK with: 1. Revert commit `94cd37ebb2`. 2. Clears the usable CET feature bits if kernel doesn't support CET. 3. Add GLIBC_TUNABLES tests without dlopen. 4. Add tests to verify that CPU_FEATURE_USABLE on IBT and SHSTK matches _get_ssp. 5. Update GLIBC_TUNABLES tests with dlopen to verify that CET is disabled with GLIBC_TUNABLES. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-01-29 03:58:11 -08:00
Andreas Schwab	31f6488722	Fix misplaced const Constify __x86_cacheinfo_p and __x86_cpu_features_p, not their pointer target types.	2021-01-25 15:09:02 +01:00
H.J. Lu	5f478eb0fb	x86: Properly match CPU features in /proc/cpuinfo [BZ #27222 ] Search " YYY " and " YYY\n", instead of "YYY", to avoid matching "XXXYYYZZZ" with "YYY". Update /proc/cpuinfo CPU feature names: /proc/cpuinfo glibc ------------------------------------------------ avx512vbmi AVX512_VBMI dts DS pni SSE3 tsc_deadline_timer TSC_DEADLINE	2021-01-22 10:15:46 -08:00

1 2 3 4 5 ...

360 Commits