glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-22 21:10:07 +00:00

Author	SHA1	Message	Date
Noah Goldstein	7775574ce0	x86: Use `testb` for case-locale check in str{n}casecmp-sse2 `testb` saves a bit of code size if the imm-operand can be encoded in 1 byte. Tested on x86-64.	2022-10-20 11:29:05 -07:00
Noah Goldstein	b6d02d6457	x86: Use `testb` for case-locale check in str{n}casecmp-avx2 `testb` saves a bit of code size if the imm-operand can be encoded in 1 byte. Tested on x86-64.	2022-10-20 11:29:05 -07:00
Noah Goldstein	5ce9766417	x86: Add support for VEC_SIZE == 64 in strcmp-evex.S impl Unused at the moment, but evex512 strcmp, strncmp, strcasecmp{l}, and strncasecmp{l} functions can be added by including strcmp-evex.S with "x86-evex512-vecs.h" defined. In addition save code size a bit in a few places. 1. tzcnt ... -> bsf ... 2. vpcmp{b|d} $0 ... -> vpcmpeq{b|d} This saves a touch of code size but has minimal net effect. Full check passes on x86-64.	2022-10-20 11:29:05 -07:00
Noah Goldstein	c25eb94aed	x86: Remove AVX512-VBMI2 instruction from strrchr-evex.S commit `b412213eee` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Tue Oct 18 17:44:07 2022 -0700 x86: Optimize strrchr-evex.S and implement with VMM headers Added `vpcompress{b|d}` to the page-cross logic, which is an AVX512-VBMI2 instruction. This is not supported on SKX. Since the page-cross logic is relatively cold and the benefit is minimal, revert the page-cross case back to the old logic, which is supported on SKX. Tested on x86-64.	2022-10-20 11:29:05 -07:00
Felix Riemann	a885fc2d68	sysdeps: arm: Fix preconfigure script for ARMv8/v9 targets [BZ #29698 ] The ARM preconfigure script tries to detect the capabilities of the target platform by checking the compiler's predefined architecture macros. However, if the compiler is tuning for AArch32 on ARMv8/v9 this step fails: checking for sysdeps preconfigure fragments... aarch64 alpha arc arm WARNING: arm/preconfigure: Did not find ARM architecture type; using default This is because preconfigure.ac doesn't escape the square brackets in the glob for matching compilers targeting ARMv8. Adding another pair of brackets to escape the first pair fixes this: checking for sysdeps preconfigure fragments... aarch64 alpha arc arm Found compiler is configured for something newer than v7 - using v7 Signed-off-by: Felix Riemann <felix.riemann@sma.de> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-20 11:23:05 -03:00
Adhemerval Zanella	9b5e138f2b	linux: Avoid shifting a negative signed on POSIX timer interface The current macros use pid as a signed value, which triggers a compiler warning for process and thread timers. Replace MAKE_PROCESS_CPUCLOCK with a static inline function that expects the pid as unsigned. These are similar to what Linux does internally. Checked on x86_64-linux-gnu. Reviewed-by: Arjun Shankar <arjun@redhat.com>	2022-10-20 10:19:08 -03:00
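As a rough illustration of why an unsigned pid avoids the shift warning, the sketch below models the clock-id packing in Python with explicit 32-bit wrapping. The helper names, the 3-bit shift, and the `CPUCLOCK_SCHED` value follow the convention Linux uses for CPU clock ids, but they are assumptions made for this example and not a copy of the glibc code.

```python
MASK32 = 0xFFFFFFFF
CPUCLOCK_SCHED = 2  # assumed clock-type value, stored in the low 3 bits


def to_signed32(value):
    """Reinterpret a 32-bit pattern as a signed integer (like clockid_t)."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def make_process_cpuclock(pid, clock):
    """Pack an inverted pid and a clock type into one clock id.

    The pid is treated as unsigned first, so the inversion and the left
    shift never operate on a negative value.
    """
    unsigned_pid = pid & MASK32
    packed = ((~unsigned_pid & MASK32) << 3) | clock
    return to_signed32(packed)


def cpuclock_pid(clock_id):
    """Recover the pid from a packed clock id (hypothetical inverse helper)."""
    return ~(clock_id >> 3)


clock_id = make_process_cpuclock(1234, CPUCLOCK_SCHED)
print(clock_id, cpuclock_pid(clock_id), clock_id & 7)
```

The resulting id is negative, which is how such dynamic clock ids stay distinct from the small non-negative static ones.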
Noah Goldstein	b412213eee	x86: Optimize strrchr-evex.S and implement with VMM headers Optimization is: 1. Cache latest result in "fast path" loop with `vmovdqu` instead of `kunpckdq`. This helps if there is more than one match. Code Size Changes: strrchr-evex.S : +30 bytes (Same number of cache lines) Net perf changes: Reported as geometric mean of all improvements / regressions from N=10 runs of the benchtests. Value as New Time / Old Time so < 1.0 is improvement and > 1.0 is regression. strrchr-evex.S : 0.932 (From cases with higher match frequency) Full results attached in email. Full check passes on x86-64.	2022-10-19 17:31:03 -07:00
Noah Goldstein	4af6844aa5	x86: Optimize memrchr-evex.S Optimizations are: 1. Use the fact that lzcnt(0) -> VEC_SIZE for memchr to save a branch in short string case. 2. Save several instructions in len = [VEC_SIZE, 4 * VEC_SIZE] case. 3. Use more code-size efficient instructions. - tzcnt ... -> bsf ... - vpcmpb $0 ... -> vpcmpeq ... Code Size Changes: memrchr-evex.S : -29 bytes Net perf changes: Reported as geometric mean of all improvements / regressions from N=10 runs of the benchtests. Value as New Time / Old Time so < 1.0 is improvement and > 1.0 is regression. memrchr-evex.S : 0.949 (Mostly from improvements in small strings) Full results attached in email. Full check passes on x86-64.	2022-10-19 17:31:03 -07:00
Noah Goldstein	b79f8ff26a	x86: Optimize strnlen-evex.S and implement with VMM headers Optimizations are: 1. Use the fact that bsf(0) leaves the destination unchanged to save a branch in short string case. 2. Restructure code so that small strings are given the hot path. - This is a net-zero on the benchmark suite but in general makes sense as smaller sizes are far more common. 3. Use more code-size efficient instructions. - tzcnt ... -> bsf ... - vpcmpb $0 ... -> vpcmpeq ... 4. Align labels less aggressively, especially if it doesn't save fetch blocks / causes the basic-block to span extra cache-lines. The optimizations (especially for point 2) make the strnlen and strlen code essentially incompatible so split strnlen-evex to a new file. Code Size Changes: strlen-evex.S : -23 bytes strnlen-evex.S : -167 bytes Net perf changes: Reported as geometric mean of all improvements / regressions from N=10 runs of the benchtests. Value as New Time / Old Time so < 1.0 is improvement and > 1.0 is regression. strlen-evex.S : 0.992 (No real change) strnlen-evex.S : 0.947 Full results attached in email. Full check passes on x86-64.	2022-10-19 17:31:03 -07:00
Noah Goldstein	69717709ec	x86: Shrink / minorly optimize strchr-evex and implement with VMM headers Size Optimizations: 1. Condense hot path for better cache-locality. - This has the most impact for strchrnul, where the logic for strings with len <= VEC_SIZE or with a match in the first VEC now fits entirely in the first cache line. 2. Reuse common targets in first 4x VEC and after the loop. 3. Don't align targets so aggressively if it doesn't change the number of fetch blocks it will require and put more care in avoiding the case where targets unnecessarily split cache lines. 4. Align the loop better for DSB/LSD 5. Use more code-size efficient instructions. - tzcnt ... -> bsf ... - vpcmpb $0 ... -> vpcmpeq ... 6. Align labels less aggressively, especially if it doesn't save fetch blocks / causes the basic-block to span extra cache-lines. Code Size Changes: strchr-evex.S : -63 bytes strchrnul-evex.S: -48 bytes Net perf changes: Reported as geometric mean of all improvements / regressions from N=10 runs of the benchtests. Value as New Time / Old Time so < 1.0 is improvement and > 1.0 is regression. strchr-evex.S (Fixed) : 0.971 strchr-evex.S (Rand) : 0.932 strchrnul-evex.S : 0.965 Full results attached in email. Full check passes on x86-64.	2022-10-19 17:31:03 -07:00
Noah Goldstein	330881763e	x86: Optimize memchr-evex.S and implement with VMM headers Optimizations are: 1. Use the fact that tzcnt(0) -> VEC_SIZE for memchr to save a branch in short string case. 2. Restructure code so that small strings are given the hot path. - This is a net-zero on the benchmark suite but in general makes sense as smaller sizes are far more common. 3. Use more code-size efficient instructions. - tzcnt ... -> bsf ... - vpcmpb $0 ... -> vpcmpeq ... 4. Align labels less aggressively, especially if it doesn't save fetch blocks / causes the basic-block to span extra cache-lines. The optimizations (especially for point 2) make the memchr and rawmemchr code essentially incompatible so split rawmemchr-evex to a new file. Code Size Changes: memchr-evex.S : -107 bytes rawmemchr-evex.S : -53 bytes Net perf changes: Reported as geometric mean of all improvements / regressions from N=10 runs of the benchtests. Value as New Time / Old Time so < 1.0 is improvement and > 1.0 is regression. memchr-evex.S : 0.928 rawmemchr-evex.S : 0.986 (Less targets cross cache lines) Full results attached in email. Full check passes on x86-64.	2022-10-19 17:31:03 -07:00
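The "Net perf changes" figures in the commits above are geometric means of New Time / Old Time ratios, where a value below 1.0 means the new code is faster. A minimal sketch of that calculation follows; the sample ratios are invented for illustration and are not taken from the benchtests.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-benchmark ratios (new time / old time), not real data.
sample_ratios = [0.90, 0.95, 1.02, 0.88]

score = geometric_mean(sample_ratios)
verdict = "improvement" if score < 1.0 else "no improvement"
print(f"{score:.3f} ({verdict})")
```

A geometric mean suits ratios because a 2x slowdown and a 2x speedup cancel to exactly 1.0, which an arithmetic mean would not give.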
Sunil K Pandey	451c6e5854	x86_64: Implement evex512 version of memchr, rawmemchr and wmemchr This patch implements the following evex512 versions of string functions. The evex512 versions take up to 30% fewer cycles compared to evex, depending on length and alignment. - memchr function using 512 bit vectors. - rawmemchr function using 512 bit vectors. - wmemchr function using 512 bit vectors. Code size data: memchr-evex.o 762 byte memchr-evex512.o 576 byte (-24%) rawmemchr-evex.o 461 byte rawmemchr-evex512.o 412 byte (-11%) wmemchr-evex.o 794 byte wmemchr-evex512.o 552 byte (-30%) Placeholder function, not used by any processor at the moment. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-18 13:26:33 -07:00
Florian Weimer	58548b9d68	Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources In the future, this will result in a compilation failure if the macros are unexpectedly undefined (due to header inclusion ordering or header inclusion missing altogether). Assembler sources are more difficult to convert. In many cases, they are hand-optimized for the mangling and no-mangling variants, which is why they are not converted. sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c are special: These are C sources, but most of the implementation is in assembler, so the PTR_DEMANGLE macro has to be undefined in some cases, to match the assembler style. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-18 17:04:10 +02:00
Florian Weimer	88f4b6929c	Introduce <pointer_guard.h>, extracted from <sysdep.h> This allows us to define a generic no-op version of PTR_MANGLE and PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources, avoiding an unintended loss of hardening due to missing include files or unlucky header inclusion ordering. In i386 and x86_64, we can avoid a <tls.h> dependency in the C code by using the computed constant from <tcb-offsets.h>. <sysdep.h> no longer includes these definitions, so there is no cyclic dependency anymore when computing the <tcb-offsets.h> constants. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-18 17:03:55 +02:00
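To make the pointer guard idea concrete, here is a small Python model of mangling and demangling a 64-bit pointer value. It assumes an XOR with a secret guard followed by a rotate, which resembles the x86-64 scheme; the function names, the rotation count, and the guard value are placeholders for this sketch and do not reproduce `<pointer_guard.h>`.

```python
MASK64 = (1 << 64) - 1
ROTATE = 17  # assumed rotation count for this sketch


def rol64(value, count):
    """Rotate a 64-bit value left by count bits."""
    return ((value << count) | (value >> (64 - count))) & MASK64


def ror64(value, count):
    """Rotate a 64-bit value right by count bits."""
    return ((value >> count) | (value << (64 - count))) & MASK64


def ptr_mangle(ptr, guard):
    """Hide a pointer: XOR with the guard, then rotate."""
    return rol64((ptr ^ guard) & MASK64, ROTATE)


def ptr_demangle(mangled, guard):
    """Undo ptr_mangle: rotate back, then XOR with the same guard."""
    return ror64(mangled, ROTATE) ^ guard


def ptr_mangle_noop(ptr, guard=0):
    """Model of a generic no-op variant: the pointer is stored as-is."""
    return ptr


guard = 0x5A5A5A5A12345678   # placeholder; a real guard is random per process
pointer = 0x00007F00DEADBEEF
mangled = ptr_mangle(pointer, guard)
print(hex(mangled), hex(ptr_demangle(mangled, guard)))
```

Using the macros unconditionally, with a no-op fallback where mangling is unavailable, means a missing include becomes a build error instead of silently stored raw pointers.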
Florian Weimer	246f37d6b1	x86-64: Move LP_SIZE definition to its own header This way, we can define the pointer guard macros without including <sysdep.h> on x86-64. Other architectures will not have such an inclusion dependency, and the implied header file inclusion would create a porting hazard. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-18 17:02:08 +02:00
Szabolcs Nagy	7363a9a9a0	math: Fix asin and acos invalid exception with old gcc This works around a gcc issue where it const folded inf/inf into nan, preventing the invalid exception from being signalled. (x-x)/(x-x) is more robust against optimizations and works for all out of bounds values including x==nan. The gcc issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95115 should be fixed on release branches starting from gcc-10, but it is better to change the code in case glibc is built with older gcc. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2022-10-17 08:18:52 +01:00
Noah Goldstein	be066536bd	x86: Update strlen-evex-base to use new reg/vec macros. To avoid duplicating the VMM / GPR / mask insn macros in all incoming evex512 files, use the macros defined in 'reg-macros.h' and '{vec}-macros.h' This commit does not change libc.so Tested build on x86-64	2022-10-14 21:21:58 -07:00
Noah Goldstein	47f5d51461	x86: Remove now unused vec header macros. This commit does not change libc.so Tested build on x86-64	2022-10-14 21:21:58 -07:00
Noah Goldstein	a6784653f7	x86: Update memset to use new VEC macros Replace %VEC(n) -> %VMM(n) This commit does not change libc.so Tested build on x86-64	2022-10-14 21:21:58 -07:00
Noah Goldstein	4fb7d8a938	x86: Update memmove to use new VEC macros Replace %VEC(n) -> %VMM(n) This commit does not change libc.so Tested build on x86-64	2022-10-14 21:21:58 -07:00
Noah Goldstein	3088a66ff8	x86: Update memrchr to use new VEC macros Replace %VEC(n) -> %VMM(n) This commit does not change libc.so Tested build on x86-64	2022-10-14 21:21:58 -07:00
Noah Goldstein	52ab7604db	x86: Update VEC macros to complete API for evex/evex512 impls 1) Copy so that backport will be easier. 2) Make section only define if there is not a previous definition 3) Add `VEC_lo` definition for proper reg-width but in the ymm/zmm0-15 range. 4) Add macros for accessing GPRs based on VEC_SIZE This is to make it easier to do things like: ``` vpcmpb %VEC(0), %VEC(1), %k0 kmov{d|q} %k0, %{eax|rax} test %{eax|rax} ``` It adds macros such that any GPR can get the proper width with: `V{upcase_GPR_name}` and any mask insn can get the proper width with: `{upcase_mask_insn_without_postfix}` This commit does not change libc.so Tested build on x86-64	2022-10-14 21:21:58 -07:00
Joseph Myers	3bd18aa4d1	Add AArch64 HWCAP2_EBF16 from Linux 6.0 to bits/hwcap.h Linux 6.0 adds a new AArch64 HWCAP2 bit, HWCAP2_EBF16. Add this to glibc's bits/hwcap.h. Tested with build-many-glibcs.py for aarch64-linux-gnu.	2022-10-12 14:28:14 +00:00
Adhemerval Zanella	5355f9ca7b	elf: Remove -fno-tree-loop-distribute-patterns usage on dl-support Besides the option being gcc specific, this approach is still fragile and not future proof since we do not know if this will be the only optimization option gcc will add that transforms loops to memset (or any libcall). This patch adds a new header, dl-symbol-redir-ifunc.h, that can be used to redirect the compiler generated libcalls to port the generic memset implementation if required. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-10-10 10:32:28 -03:00
Andreas Schwab	954b8f3895	Expose all MAP_ constants in <sys/mman.h> unconditionally (bug 29375) POSIX reserves the MAP_ prefix for <sys/mman.h>, so there is no need to conditionalize their definitions on feature test macros.	2022-10-10 09:30:24 +02:00
Xi Ruoyao	589eda82bb	LoongArch: Fix the condition to use PC-relative addressing in start.S A start.o compiled from start.S with -DPIC and no -DSHARED is used by both crt1.o and rcrt1.o. So the LoongArch static PIE patch unintentionally introduced PC-relative addressing for main and __libc_start_main into crt1.o. While the latest Binutils (trunk, which will be released as 2.40) supports the PC-relative relocs against an external function by creating a PLT entry, the 2.39 release branch doesn't (and won't) support this. An error is raised: "PLT stub does not represent and symbol not defined." So, we need the following changes: 1. Check if ld supports the PC-relative relocs against an external function. If it's not supported, we deem static PIE unsupported. 2. Change start.S. If static PIE is supported, use PC-relative addressing for main and __libc_start_main and rely on the linker to create PLT entries. Otherwise, restore the old behavior (using GOT to address these functions). An alternative would be adding a new "static-pie-start.S", and some custom logic into Makefile to build rcrt1.o with it. And, restore start.S to the state before static PIE change so crt1.o won't contain PC-relative relocs against external symbols. But I can't see any benefit of this alternative, so I'd just keep it simple. Tested by building glibc with the following configurations: 1. Binutils trunk + GCC trunk. Static PIE enabled. All tests passed. 2. Binutils 2.39 branch + GCC trunk. Static PIE disabled. Tests related to ifunc failed (it's a known issue). All other tests passed. 3. Binutils 2.39 branch + GCC 12 branch, cross compilation with build-many-glibcs.py from x86_64-linux-gnu. Static PIE disabled. Build succeeded.	2022-10-08 16:34:45 +08:00
Adhemerval Zanella	f9646d138f	arm: Enable USE_ATOMIC_COMPILER_BUILTINS (BZ #24774) As per other architectures. I have checked on armv8 hardware with the following configurations: arm-linux-gnueabihf (gcc built with --with-float=hard --with-cpu=arm926ej-s) armv5-linux-gnueabihf (-march=armv5te -mfpu=vfpv3) armv7-linux-gnueabihf (-march=armv7-a -mfpu=vfpv3) armv7-thumb-linux-gnueabihf (-march=armv7-a -mfpu=vfpv3 -mthumb) armv7-neon-linux-gnueabihf (-march=armv7-a -mfpu=neon) armv7-neonhard-linux-gnueabihf (-march=armv7-a -mfpu=neon -mfloat-abi=hard) Without any regression. I haven't dug into the code, but since Linux atomic-machine.h handles pre-ARMv6 and ARMv6 I expect the compiler might have some small room to optimize. The code size also improves in most of the configurations: * master text data bss dec hex filename 1727801 9720 37928 1775449 1b1759 arm-linux-gnueabihf/libc.so 1691729 9720 37928 1739377 1a8a71 arm-linux-gnueabihf-armv7-disable-multi-arch/libc.so 1725509 9720 37928 1773157 1b0e65 armv5-linux-gnueabihf/libc.so 1700757 9720 37928 1748405 1aadb5 armv6-linux-gnueabihf/libc.so 1698973 9720 37928 1746621 1aa6bd armv6t2-linux-gnueabihf/libc.so 1695481 9752 37928 1743161 1a9939 armv7-linux-gnueabihf/libc.so 1692917 9744 37928 1740589 1a8f2d armv7-neonhard-linux-gnueabihf/libc.so 1692917 9744 37928 1740589 1a8f2d armv7-neon-linux-gnueabihf/libc.so 1225353 9752 37928 1273033 136cc9 armv7-thumb-linux-gnueabihf/libc.so * patched text data bss dec hex filename 1726805 9720 37928 1774453 1b1375 arm-linux-gnueabihf/libc.so 1689321 9720 37928 1736969 1a8109 arm-linux-gnueabihf-armv7-disable-multi-arch/libc.so 1724433 9720 37928 1772081 1b0a31 armv5-linux-gnueabihf/libc.so 1698301 9720 37928 1745949 1aa41d armv6-linux-gnueabihf/libc.so 1696525 9720 37928 1744173 1a9d2d armv6t2-linux-gnueabihf/libc.so 1693009 9752 37928 1740689 1a8f91 armv7-linux-gnueabihf/libc.so 1690493 9744 37928 1738165 1a85b5 armv7-neonhard-linux-gnueabihf/libc.so 1690493 9744 37928 1738165 1a85b5 armv7-neon-linux-gnueabihf/libc.so 1223837 9752 37928 1271517 1366dd armv7-thumb-linux-gnueabihf/libc.so The idea is to eventually move all architectures to use compiler builtins. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Aurelien Jarno <aurelien@aurel32.net>	2022-10-07 16:19:20 -03:00
Javier Pello	ab40f20364	elf: Remove _dl_string_hwcap Removal of legacy hwcaps support from the dynamic loader left no users of _dl_string_hwcap. Signed-off-by: Javier Pello <devel@otheo.eu> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-06 07:59:48 -03:00
Javier Pello	4a7094119c	elf: Remove hwcap parameter from add_to_cache signature Last commit made it so that the value passed for that parameter was always 0 at its only call site. Signed-off-by: Javier Pello <devel@otheo.eu> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-10-06 07:59:48 -03:00
Javier Pello	d178c67535	x86_64: Remove platform directory library loading test This was to test loading of shared libraries from platform subdirectories, but this functionality is going away in the following commits. Signed-off-by: Javier Pello <devel@otheo.eu> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-10-06 07:59:48 -03:00
Joseph Myers	27d67e974e	Update kernel version to 6.0 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 6.0. (There are no new constants covered by these tests in 6.0 that need any other header changes.) Tested with build-many-glibcs.py.	2022-10-05 22:11:27 +00:00
Adhemerval Zanella Netto	9dc4e29f63	x86: Fix -Os build (BZ #29576) The compiler might transform __stpcpy calls (which are routed to __builtin_stpcpy as an optimization) to strcpy, and the x86_64 strcpy multiarch implementation does not build any working symbol due to ISA_SHOULD_BUILD not being evaluated for IS_IN(rtld). Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-10-05 18:04:13 -03:00
Joseph Myers	a878a1384c	Regenerate sysdeps/mach/hurd/bits/errno.h This addition to the list of source headers in sysdeps/mach/hurd/bits/errno.h appears in the source tree after build-many-glibcs.py runs, I'm guessing resulting from gnumach commit c566ad85a2d6728ebc8ec0f461a3b35df300e96e.	2022-10-05 19:21:25 +00:00
Joseph Myers	919b9bfaa9	Update syscall lists for Linux 6.0 Linux 6.0 has no new syscalls. Update the version number in syscall-names.list to reflect that it is still current for 6.0. Tested with build-many-glibcs.py.	2022-10-05 14:33:14 +00:00
Aurelien Jarno	7e8283170c	x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations The AVX2 strrchr and wcsrchr implementations use the 'blsmsk' instruction, which belongs to the BMI1 CPU feature, and the 'shrx' instruction, which belongs to the BMI2 CPU feature. Fixes: `df7e295d18` ("x86: Optimize {str|wcs}rchr-avx2") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Aurelien Jarno	3c0c78afab	x86-64: Require BMI2 and LZCNT for AVX2 memrchr implementation The AVX2 memrchr implementation uses the 'shlxl' instruction, which belongs to the BMI2 CPU feature and uses the 'lzcnt' instruction, which belongs to the LZCNT CPU feature. Fixes: `af5306a735` ("x86: Optimize memrchr-avx2.S") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Aurelien Jarno	e3e7fab7fe	x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations The AVX2 memchr, rawmemchr and wmemchr implementations use the 'bzhi' and 'sarx' instructions, which belong to the BMI2 CPU feature. Fixes: `acfd088a19` ("x86: Optimize memchr-avx2.S") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Aurelien Jarno	f31a5a884e	x86-64: Require BMI2 for AVX2 wcs(n)cmp implementations The AVX2 wcs(n)cmp implementations use the 'bzhi' instruction, which belongs to the BMI2 CPU feature. NB: They also use the 'tzcnt' BMI1 instruction, but it is executed as BSF if the CPU doesn't support TZCNT, and produces the same result for non-zero input. Partially fixes: `b77b06e0e2` ("x86: Optimize strcmp-avx2.S") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Aurelien Jarno	fc7de1d9b9	x86-64: Require BMI2 for AVX2 strncmp implementation The AVX2 strncmp implementation uses the 'bzhi' instruction, which belongs to the BMI2 CPU feature. NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF if the CPU doesn't support TZCNT, and produces the same result for non-zero input. Partially fixes: `b77b06e0e2` ("x86: Optimize strcmp-avx2.S") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Aurelien Jarno	4d64c64457	x86-64: Require BMI2 for AVX2 strcmp implementation The AVX2 strcmp implementation uses the 'bzhi' instruction, which belongs to the BMI2 CPU feature. NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF if the CPU doesn't support TZCNT, and produces the same result for non-zero input. Partially fixes: `b77b06e0e2` ("x86: Optimize strcmp-avx2.S") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Aurelien Jarno	10f79d3670	x86-64: Require BMI2 for AVX2 str(n)casecmp implementations The AVX2 str(n)casecmp implementations use the 'bzhi' instruction, which belongs to the BMI2 CPU feature. NB: They also use the 'tzcnt' BMI1 instruction, but it is executed as BSF if the CPU doesn't support TZCNT, and produces the same result for non-zero input. Partially fixes: `b77b06e0e2` ("x86: Optimize strcmp-avx2.S") Partially resolves: BZ #29611 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
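The NB in the commits above rests on TZCNT and BSF agreeing for every non-zero input and differing only at zero. The sketch below is a Python model of that relationship, not the hardware definition: `tzcnt` returns the operand width for zero, while `bsf` is modelled as leaving its destination untouched for zero, which is an assumption of this example.

```python
def tzcnt(value, width=32):
    """Count trailing zero bits; returns width when value is zero."""
    value &= (1 << width) - 1
    if value == 0:
        return width
    return (value & -value).bit_length() - 1


def bsf(value, dest, width=32):
    """Index of the lowest set bit; for zero input, dest is returned
    unchanged (modelling assumption for this sketch)."""
    value &= (1 << width) - 1
    if value == 0:
        return dest
    return (value & -value).bit_length() - 1


# The two agree for every non-zero mask, so code that only runs them on
# non-zero inputs behaves the same on CPUs without TZCNT.
agree = all(tzcnt(v) == bsf(v, dest=-1) for v in range(1, 1 << 12))
print(agree, tzcnt(0), bsf(0, dest=7))
```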
Aurelien Jarno	b80f16adbd	x86: include BMI1 and BMI2 in x86-64-v3 level The "System V Application Binary Interface AMD64 Architecture Processor Supplement" mandates the BMI1 and BMI2 CPU features for the x86-64-v3 level. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2022-10-03 23:46:11 +02:00
Noah Goldstein	653c12c7d8	x86: Cleanup pthread_spin_{try}lock.S Save a jmp on the lock path coming from an initial failure in pthread_spin_lock.S. This costs 4 bytes of code, but since the function still fits in the same number of 16-byte blocks (default function alignment) it has no effect on the total binary size of libc.so (unchanged after this commit). pthread_spin_trylock was using a CAS, which is often more expensive, when a simple xchg works. Full check passes on x86-64.	2022-10-03 14:13:49 -07:00
Adhemerval Zanella	114e299ca6	x86: Remove .tfloat usage Some compilers do not support it (such as the clang integrated assembler), and gcc does not emit it either.	2022-10-03 14:03:21 -03:00
John David Anglin	b7bd94068e	hppa: Fix initialization of dp register [BZ 29635] After upgrading glibc to Debian 2.35-1, gdb faulted on startup and dropped core in a function call in the main application. This was caused by not initializing the global dp register for the main application early enough. Restore the code to initialize dp in _dl_start_user. It was removed when code was added to initialize dp in elf_machine_runtime_setup. Signed-off-by: John David Anglin <dave.anglin@bell.net>	2022-10-01 19:49:25 +00:00
Adhemerval Zanella	609c9d0951	malloc: Do not clobber errno on __getrandom_nocancel (BZ #29624) Use INTERNAL_SYSCALL_CALL instead of INLINE_SYSCALL_CALL. This requires emulating the semantics for the hurd call (so __arc4random_buf uses the fallback). Checked on x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2022-09-30 15:25:15 -03:00
Adhemerval Zanella	13db9ee2cb	stdlib: Fix __getrandom_nocancel type and arc4random usage (BZ #29638) Using an unsigned type prevents the fallback from being used if the kernel does not support the getrandom syscall. Checked on x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2022-09-30 15:24:49 -03:00
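The type problem described above can be sketched with plain integer arithmetic: a negative error code stored in an unsigned 64-bit result never compares below zero, so a "did it fail?" check never triggers the fallback. The helper names are hypothetical and only model the conversion; they are not glibc functions.

```python
MASK64 = (1 << 64) - 1
ENOSYS = 38  # Linux errno value for "function not implemented"


def as_unsigned64(value):
    """Model storing a result in an unsigned 64-bit type."""
    return value & MASK64


def as_signed64(value):
    """Model storing a result in a signed 64-bit type."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def needs_fallback(result):
    """Hypothetical caller-side check: negative results mean failure."""
    return result < 0


raw = -ENOSYS  # what an internal syscall wrapper would hand back on failure
print(needs_fallback(as_unsigned64(raw)), needs_fallback(as_signed64(raw)))
```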
Xi Ruoyao	8b10727a9a	LoongArch: Add static PIE support If the compiler is new enough, enable static PIE support. In the static PIE version of _start (in rcrt1.o), use la.pcrel instead of la.got because in a static PIE we cannot use GOT entries until the dynamic relocations for GOT are resolved.	2022-09-30 11:51:58 +08:00
Noah Goldstein	b0969fa53a	x86: Fix wcsnlen-avx2 page cross length comparison [BZ #29591 ] Previous implementation was adjusting length (rsi) to match bytes (eax), but since there is no bound to length this can cause overflow. Fix is to just convert the byte-count (eax) to length by dividing by sizeof (wchar_t) before the comparison. Full check passes on x86-64 and build succeeds w/ and w/o multiarch.	2022-09-28 20:15:16 -07:00
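The overflow described above can be modelled with 64-bit wrapping arithmetic. This is only a sketch of the two comparison strategies, with hypothetical function names; the real code is assembly and its exact comparison differs.

```python
MASK64 = (1 << 64) - 1
WCHAR_SIZE = 4  # sizeof (wchar_t) on x86-64 Linux


def in_bounds_scaling_length(byte_count, maxlen):
    """Old approach: scale the unbounded length to bytes, which can wrap."""
    scaled = (maxlen * WCHAR_SIZE) & MASK64
    return byte_count < scaled


def in_bounds_scaling_bytes(byte_count, maxlen):
    """Fixed approach: convert the small byte count to characters instead."""
    return byte_count // WCHAR_SIZE < maxlen


huge = 1 << 62  # maxlen * 4 wraps to 0 in 64-bit arithmetic
print(in_bounds_scaling_length(16, huge), in_bounds_scaling_bytes(16, huge))
```

Dividing the byte count is safe because it is bounded by the vector and page size, whereas the caller-supplied length has no upper bound.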
Joseph Myers	3e5760fcb4	Update _FloatN header support for C++ in GCC 13 GCC 13 adds support for _FloatN and _FloatNx types in C++, so breaking the installed glibc headers that assume such support is not present. GCC mostly works around this with fixincludes, but that doesn't help for building glibc and its tests (glibc doesn't itself contain C++ code, but there's C++ code built for tests). Update glibc's bits/floatn-common.h and bits/floatn.h headers to handle the GCC 13 support directly. In general the changes match those made by fixincludes, though I think the ones in sysdeps/powerpc/bits/floatn.h, where the header tests __LDBL_MANT_DIG__ == 113 or uses #elif, wouldn't match the existing fixincludes patterns. Some places involving special C++ handling in relation to _FloatN support are not changed. There's no need to change the __HAVE_FLOATN_NOT_TYPEDEF definition (also in a form that wouldn't be matched by the fixincludes fixes) because it's only used in relation to macro definitions using features not supported for C++ (__builtin_types_compatible_p and _Generic). And there's no need to change the inline function overloads for issignaling, iszero and iscanonical in C++ because cases where types have the same format but are no longer compatible types are handled automatically by the C++ overload resolution rules. This patch also does not change the overload handling for iseqsig, and there I think changes are needed, beyond those in this patch or made by fixincludes. The way that overload is defined, via a template parameter to a structure type, requires overloads whenever the types are incompatible, even if they have the same format. So I think we need to add overloads with GCC 13 for every supported _FloatN and _FloatNx type, rather than just having one for _Float128 when it has a different ABI to long double as at present (but for older GCC, such overloads must not be defined for types that end up defined as typedefs for another type). 
Tested with build-many-glibcs.py: compilers build for aarch64-linux-gnu ia64-linux-gnu mips64-linux-gnu powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu; glibcs build for aarch64-linux-gnu ia64-linux-gnu i686-linux-gnu mips-linux-gnu mips64-linux-gnu-n32 powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu.	2022-09-28 20:10:08 +00:00
Samuel Thibault	d7f32c9958	hurd: Fix typo	2022-09-28 19:21:44 +02:00
Jörg Sonnenberger	c9226c03da	get_nscd_addresses: Fix subscript typos [BZ #29605 ] Fix the subscript on air->family, which was accidentally set to COUNT when it should have remained as I. Resolves: BZ #29605 Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-09-28 12:47:10 -04:00
Samuel Thibault	7de3f0a96c	hurd: Increase SOMAXCONN to 4096 Notably fakeroot-tcp may introduce a lot of parallel connections.	2022-09-27 23:37:42 +02:00
Wilco Dijkstra	22f4ab2d20	Use atomic_exchange_release/acquire Rename atomic_exchange_rel/acq to use atomic_exchange_release/acquire since these map to the standard C11 atomic builtins. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-26 16:58:08 +01:00
Wilco Dijkstra	4a07fbb689	Use C11 atomics instead of atomic_decrement_and_test Replace atomic_decrement_and_test with atomic_fetch_add_relaxed. These are simple counters which do not protect any shared data from concurrent accesses. Also remove the unused file cond-perf.c. Passes regress on AArch64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-23 15:59:56 +01:00
Wilco Dijkstra	d1babeb32d	Use C11 atomics instead of atomic_increment(_val) Replace atomic_increment and atomic_increment_val with atomic_fetch_add_relaxed. One case in sem_post.c uses release semantics (see comment above it). The others are simple counters and do not protect any shared data from concurrent accesses. Passes regress on AArch64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-23 15:59:56 +01:00
Alistair Francis	2e81493fa6	riscv: Remove RV32 floating point functions We don't need RV32 specific floating point functions, instead make them generic for RISC-V. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-21 14:37:43 -04:00
Alistair Francis	73e9fe43ac	riscv: Consolidate the libm-test-ulps Both RV32 and RV64 should have the same libm-test-ulps, so consolidate them into a single file. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-21 14:37:13 -04:00
Samuel Thibault	385f2ecda9	hurd: Fix SIOCADD/DELRT ioctls The hurd network stack uses struct ifrtreq rather than ortentry.	2022-09-21 19:58:44 +02:00
Samuel Thibault	b84199eb18	hurd: Drop struct rtentry and in6_rtmsg These were cargo-culted, they are not used at all in Hurd interfaces.	2022-09-21 19:58:44 +02:00
Damien Zammit	9ba0f010a6	hurd: Add _IOT_ifrtreq to <net/route.h> So that we can use struct ifrtreq in ioctls.	2022-09-21 19:58:44 +02:00
Samuel Thibault	c0c9092f75	hurd: Use IF_NAMESIZE rather than IFNAMSIZ The latter is not available without __USE_MISC.	2022-09-21 08:51:50 +02:00
Damien Zammit	ffd0b295d9	hurd: Add ifrtreq structure to net/route.h As used by the hurdish route ioctls.	2022-09-21 00:42:13 +02:00
John David Anglin	fa47e8e6df	hppa: undef __ASSUME_SET_ROBUST_LIST QEMU does not support set_robust_list. Thus, we need to enable detection of the set_robust_list system call. Signed-off-by: John David Anglin <dave.anglin@bell.net>	2022-09-20 20:14:14 +00:00
Adhemerval Zanella	85a3228744	linux: Use same type for MMAP2_PAGE_UNIT It avoids a possible compiler warning where the right side of the operator is converted from a negative value to unsigned. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-09-20 10:57:40 -03:00
Adhemerval Zanella	aeb4d2e981	m68k: Enforce 4-byte alignment on internal locks (BZ #29537) A new internal definition, __LIBC_LOCK_ALIGNMENT, is used to force the 4-byte alignment only for m68k; other architectures keep the natural alignment of the type used internally (and hppa does not require 16-byte alignment for kernel-assisted CAS). Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-09-20 10:56:54 -03:00
Florian Weimer	766b73768b	Linux: Do not skip d_ino == 0 entries in readdir, readdir64 (bug 12165) POSIX does not say this value is special. For example, old XFS file systems may still use inode number zero. Also update the comment regarding ENOENT. Linux may return ENOENT for some file systems.	2022-09-19 12:04:57 +02:00
Samuel Thibault	7ae60af75b	hurd: Factorize at/non-at functions Non-at functions can be implemented by just calling the corresponding at function with AT_FDCWD and zero at_flags. In the linkat case, the at behavior is different (O_NOLINK), so this introduces __linkat_common to pass O_NOLINK as appropriate. lstat functions can also be implemented with fstatat by adding __fstatat64_common which takes a flags parameter in addition to the at_flags parameter. In the end this factorizes chmod, chown, link, lstat64, mkdir, readlink, rename, stat64, symlink, unlink, utimes. This also makes __lstat, __lxstat64, __stat and __xstat64 directly use __fstatat64_common instead of __lstat64 or __stat64.	2022-09-17 19:58:30 +00:00
Łukasz Stelmach	22c96052ac	RISC-V: Allow long jumps to __syscall_error __syscall_error may end up farther than 1MiB away from a caller, especially when linking statically large binaries. tail allows for 4GiB jumps and is reduced to j when a linked symbol is within range. Fixes: `36960f0c76` ("RISC-V: Linux Syscall Interface") Fixes: `7f33b09c65` ("RISC-V: Linux ABI") Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>	2022-09-16 23:25:45 -04:00
Samuel Thibault	5652e12cce	hurd: Make readlink* just reopen the file used for stat `9e5c991106` ("hurd: Fix readlink() hanging on fifo") separated opening the file for the stat call from opening the file for the read call. That however opened a small window for the file to change. Better make this atomic by reopening the file with O_READ.	2022-09-15 21:53:57 +02:00
Samuel Thibault	9e5c991106	hurd: Fix readlink() hanging on fifo readlink() opens the target with O_READ to be able to read the symlink content. When the target is actually a fifo, that would hang waiting for a writer (caught in the coreutils testsuite). We thus have to first look up the target without O_READ to perform io_stat and look out for fifos, and only after checking the symlink type can we re-lookup with O_READ.	2022-09-14 18:57:44 +02:00
Wilco Dijkstra	a30e960328	Use relaxed atomics since there is no MO dependence Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with atomic_fetch_or_relaxed. Using relaxed MO is correct since the atomics are used to ensure memory is released only once. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-09-13 11:58:07 +01:00
Wilco Dijkstra	53b251c9ff	Use C11 atomics instead of atomic_add(_zero) Replace atomic_add and atomic_add_zero with atomic_fetch_add_relaxed. Reviewed-by: DJ Delorie <dj@redhat.com>	2022-09-09 14:11:23 +01:00
Andreas Schwab	3d7d5c10c8	errlist: add missing entry for EDEADLOCK (bug 29545) Some architectures (mips, powerpc and sparc) define separate values for EDEADLOCK and EDEADLK. Readd the errlist entry for EDEADLOCK for those configurations. Also use the dependency files from generating the auxiliary errlist and siglist files.	2022-09-08 11:40:24 +02:00
Joseph Myers	b8cc607f3c	Do not define static_assert or thread_local in headers for C2x C2x makes static_assert and thread_local into keywords, removing the definitions as macros in assert.h and threads.h. Thus, disable those macros in those glibc headers for C2x. The disabling is done based on a combination of language version and __GNUC_PREREQ, not based on __GLIBC_USE (ISOC2X), on the principle that users of the header (when requesting C11 or later APIs - not assert.h for C99 and older API versions) should always have the names static_assert or thread_local available after inclusion of the header, whether as a keyword or as a macro. Thus, when using a compiler without the keywords (whether an older compiler, possibly in C2x mode, or _GNU_SOURCE with any compiler but in an older language mode, for example) the macros should be defined, even when C2x APIs have been requested. The __GNUC_PREREQ conditionals here may well need updating with the versions of other compilers that gained support for these keywords in C2x mode. Tested for x86_64.	2022-09-07 18:39:28 +00:00
Florian Weimer	dbb75513f5	elf: Rename _dl_sort_maps parameter from skip to force_first The new implementation will not be able to skip an arbitrary number of objects. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-09-06 07:38:33 +02:00
Adhemerval Zanella	2fc7320668	math: x86: Use prefix for FP_INIT_ROUNDMODE Not all compilers support the inline asm prefix '%v' to emit the avx instruction if AVX is enabled. Use a prefix instead. Checked on x86_64-linux-gnu and i686-linux-gnu.	2022-09-05 10:54:41 -03:00
caiyinyu	930993921f	LoongArch: Add soft float support.	2022-09-01 09:10:08 +08:00
Adhemerval Zanella	8cd559cf5a	nptl: x86_64: Use same code for CURRENT_STACK_FRAME and stackinfo_get_sp It avoids the possible warning of uninitialized 'frame' variable when building with clang: ../sysdeps/nptl/jmp-unwind.c:27:42: error: variable 'frame' is uninitialized when used here [-Werror,-Wuninitialized] __pthread_cleanup_upto (env->__jmpbuf, CURRENT_STACK_FRAME); The resulting code is similar to CURRENT_STACK_FRAME. Checked on x86_64-linux-gnu.	2022-08-31 09:04:27 -03:00
Adhemerval Zanella	ddcf5a9170	posix: Fix macro expansion producing 'defined' has undefined behavior The NEED_CHECK_SPEC is defined as: #define NEED_CHECK_SPEC \ (!defined _XBS5_ILP32_OFF32 || !defined _XBS5_ILP32_OFFBIG \ || !defined _XBS5_LP64_OFF64 || !defined _XBS5_LPBIG_OFFBIG \ || !defined _POSIX_V6_ILP32_OFF32 || !defined _POSIX_V6_ILP32_OFFBIG \ || !defined _POSIX_V6_LP64_OFF64 || !defined _POSIX_V6_LPBIG_OFFBIG \ || !defined _POSIX_V7_ILP32_OFF32 || !defined _POSIX_V7_ILP32_OFFBIG \ || !defined _POSIX_V7_LP64_OFF64 || !defined _POSIX_V7_LPBIG_OFFBIG) Which is undefined behavior according to the C Standard (Preprocessing directives, p4). Checked on x86_64-linux-gnu.	2022-08-30 08:40:47 -03:00
Stefan Liebler	e57d8fc97b	S390: Always use svc 0 On s390x syscalls are triggered by svc instruction. One can pass the syscall number encoded in the instruction "svc 123" or by storing it in r1: lghi r1,123 svc 0 If the syscall number is encoded in the instruction, this can cause broken syscall restarts. Therefore this patch is now just passing the syscall number in r1. See also kernel-commit: "s390/signal: switch to using vdso for sigreturn and syscall restart" https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/s390/[%e2%80%a6]call.c?h=v6.0-rc1&id=df29a7440c4b5c65765c8f60396b3b13063e24e9 As information, the "svc 0" feature was introduced in kernel 2.5.62: commit b5aad611393ef2e132e3648fa4c6e56a9cfa8708	2022-08-30 10:54:46 +02:00
Xi Ruoyao	241603123c	LoongArch: Use __builtin_{fmax,fmaxf,fmin,fminf} with GCC >= 13 GCC 13 compiles these built-ins to {fmax,fmin}.{s/d} instruction, use them instead of the generic implementation. Link: https://gcc.gnu.org/r13-2085 Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2022-08-30 11:59:15 +08:00
caiyinyu	fa9e095bbe	LoongArch: Fix ptr mangling/demangling features.	2022-08-30 11:45:22 +08:00
Richard Henderson	51231c469b	Makeconfig: Set pie-ccflag to -fPIE by default [BZ# 29514] We should default to the larger code model, in order to support larger applications built with -static -pie. This should be consistent with pic-ccflag, which defaults to -fPIC. Remove the now redundant override from sysdeps/sparc/Makefile. Note that -fno-pie and -fno-PIE have the same effect. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-08-29 09:03:00 -04:00
Samuel Thibault	063f7462da	hurd: Fix vm_size_t incoherencies In gnumach, 3e1702a65fb3 ("add rpc_versions for vm types") changed the type of vm_size_t, making it always an unsigned long. This made it incompatible on x86 with size_t. Even if we may want to revert it to unsigned int, it's better to fix the types of parameters according to the .defs files.	2022-08-29 01:42:47 +02:00
Samuel Thibault	cb033e6b0c	mach: Make xpg_strerror_r set a message on error POSIX advises to have strerror_r fill a message even when we are returning an error. This makes mach's xpg_strerror_r do this, like the generic version does. Spotted by the libunistring testsuite test-strerror_r	2022-08-27 14:56:35 +02:00
Samuel Thibault	03ad444e8e	mach: Fix incoherency between perror and strerror `08d2024b41` ("string: Simplify strerror_r") inadvertently made __strerror_r print unknown error system in decimal while the original code was printing it in hexadecimal. perror was kept printing in hexadecimal in `725eeb4af1` ("string: Use tls-internal on strerror_l"), let us keep both coherent. This also fixes a duplicate ':' Spotted by the libunistring testsuite test-perror2	2022-08-27 14:36:18 +02:00
Szabolcs Nagy	06d4381dd8	csu: Change start code license to have link exception The start code can get linked into dynamic linked executables where LGPL would require shipping the source or linkable binaries when the executable is distributed. On some targets the license exception was missing in start.S (which is compiled into crt1.o and Scrt1.o which may end up linked into PDE and PIE binaries). I did not review what other code may end up in executables, just fixed the start.S license inconsistency across targets. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-08-26 09:14:53 +01:00
Florian Weimer	5ecc982412	s390: Move hwcaps/platform names out of _rtld_global_ro Changes to these arrays are often backported to stable releases, but additions to these arrays shift the offsets of the following _rtld_global_ro members, thus breaking the GLIBC_PRIVATE ABI. Obviously, this change is itself an internal ABI break, but at least it will avoid further ABI breaks going forward. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-08-25 21:33:12 +02:00
Florian Weimer	89baed0b93	Revert "Detect ld.so and libc.so version inconsistency during startup" This reverts commit `6f85dbf102`. Once this change hits the release branches, it will require relinking of all statically linked applications before static dlopen works again, for the majority of updates on release branches: The NEWS file is regularly updated with bug references, so the __libc_early_init suffix changes, and static dlopen cannot find the function anymore. While this ABI check is still technically correct (we do require rebuilding & relinking after glibc updates to keep static dlopen working), it is too drastic for stable release branches. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-08-25 18:46:43 +02:00
Florian Weimer	6f85dbf102	Detect ld.so and libc.so version inconsistency during startup The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h contribute to the version fingerprint used for detection. The fingerprint can be further refined using the --with-extra-version-id configure argument. _dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init. The new function is used to store a pointer to libc.so's __libc_early_init function in the libc_map_early_init member of the ld.so namespace structure. This function pointer can then be called directly, so the separate invocation function is no longer needed. The versioned symbol lookup needs the symbol versioning data structures, so the initialization of libc_map and libc_map_early_init is now done from _dl_check_map_versions, after this information becomes available. (_dl_map_object_from_fd does not set this up in time, so the initialization code had to be moved from there.) This means that the separate initialization code can be removed from dl_main because _dl_check_map_versions covers all maps, including the initial executable loaded by the kernel. The lookup still happens before relocation and the invocation of IFUNC resolvers, so IFUNC resolvers are protected from ABI mismatch. The __libc_early_init function pointer is not protected because so little code runs between the pointer write and the invocation (only dynamic linker code and IFUNC resolvers). Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-08-24 17:35:57 +02:00
Paul Eggert	464138e904	Merge _GL_UNUSED C23 patch from Gnulib * posix/getopt.c (_getopt_initialize): * sysdeps/posix/tempname.c (try_dir, try_nocreate): Put _GL_UNUSED before args instead of after. This makes no difference for glibc. It is needed for Gnulib when being compiled on non-GCC C23 compilers.	2022-08-23 21:58:39 -07:00
Xi Ruoyao	8995b84c45	LoongArch: Fix dl-machine.h code formatting. No functional change.	2022-08-24 10:06:41 +08:00
Samuel Thibault	af6b1cce98	hurd: Fix starting static binaries with stack protection enabled gcc introduces gs:0x14 accesses in most functions, so we need some tcbhead to be ready very early during initialization. This configures a static area which can be referenced by various protected functions, until proper TLS is set up.	2022-08-22 22:34:31 +02:00
Samuel Thibault	4565083abc	htl: Make pthread*_cond_timedwait register wref before releasing mutex Otherwise another thread could be rightly trying to destroy the condition, see e.g. tst-cond20.	2022-08-22 22:27:24 +02:00
Samuel Thibault	8bf0bc8350	htl: make __pthread_hurd_cond_timedwait_internal check mutex is held Like __pthread_cond_timedwait_internal already does.	2022-08-22 22:25:27 +02:00
Joseph Myers	4c199499d6	Add AArch64 HWCAP2_* constants from Linux 5.19 Linux 5.19 adds more HWCAP2_* values for AArch64; add these to its bits/hwcap.h header in glibc. Tested with build-many-glibcs.py for aarch64-linux-gnu.	2022-08-22 14:59:39 +00:00
Joseph Myers	a727220b37	Add AGROUP from Linux 5.19 to sys/acct.h, remove Alpha version (bug 29502) Linux 5.19 adds a new accounting flag AGROUP; add it to the enumeration in sys/acct.h. This shows up that the Alpha-specific variant of this header has a different set of constants and struct acct, which appear to be the constants and structure layout from Linux 2.0. These were changed some time between Linux 2.0 and Linux 2.2; I see no evidence of an Alpha-specific layout or set of constants, but haven't checked the detailed Linux kernel history between those versions. Rather, it looks like the Alpha-specific header was originally needed because of the use of types in the kernel structure (such as uid_t and gid_t) that had different sizes on Alpha, and when glibc was updated for changes to the structure and constants in the kernel 1998-10-02 Andreas Jaeger <aj@arthur.rhein-neckar.de> * sysdeps/unix/sysv/linux/sys/acct.h: Bring in sync with current linux 2.1 version. that simply omitted to do anything about the Alpha version. Thus, remove the Alpha version in order to get the updated definitions into use on Alpha, as I don't think the interfaces are actually different for Alpha with any kernel version supported by glibc. Tested for x86_64, and with build-many-glibcs.py for alpha-linux-gnu.	2022-08-22 14:16:57 +00:00
Florian Weimer	e7ad26ee3c	alpha: Fix generic brk system call emulation in __brk_call (bug 29490) The kernel special-cases the zero argument for alpha brk, and we can use that to restore the generic Linux error handling behavior. Fixes commit `b57ab258c1` ("Linux: Introduce __brk_call for invoking the brk system call").	2022-08-22 11:05:42 +02:00
Samuel Thibault	f7b0fc5cc6	hurd: Assume non-suid during bootstrap We do not have a hurd data block only when bootstrapping the system, in which case we don't have a notion of suid yet anyway. This is needed, otherwise init_standard_fds would check that standard file descriptors are allocated, which is meaningless during bootstrap.	2022-08-19 02:26:21 +02:00
Stefan Liebler	f465b21b06	S390: Fix werror=unused-variable in ifunc-impl-list.c. If the architecture level set is high enough, no IFUNCs are used at all and the variable i would be unused. Then the build fails with: ../sysdeps/s390/multiarch/ifunc-impl-list.c: In function ‘__libc_ifunc_impl_list’: ../sysdeps/s390/multiarch/ifunc-impl-list.c:76:10: error: unused variable ‘i’ [-Werror=unused-variable] 76 | size_t i = max; | ^ cc1: all warnings being treated as errors	2022-08-18 09:10:48 +02:00
Michael Hudson-Doyle	2b274fd8c9	Ensure calculations happen with desired rounding mode in y1lf128 math/test-float128-y1 fails on x86_64 and ppc64el with gcc 12 and -O3, because code inside a block guarded by SET_RESTORE_ROUNDL is being moved after the rounding mode has been restored. Use math_force_eval to prevent this (and insert some math_opt_barrier calls to prevent code from being moved before the rounding mode is set). Fixes #29463 Reviewed-By: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2022-08-18 12:32:18 +12:00
Florian Weimer	2955ef4b7c	Linux: Fix enum fsconfig_command detection in <sys/mount.h> The #ifdef FSOPEN_CLOEXEC check did not work because the macro was always defined in this header prior to the check, so that the <linux/mount.h> contents did not matter. Fixes commit `774058d729` ("linux: Fix sys/mount.h usage with kernel headers").	2022-08-16 12:03:28 +02:00
Samuel Thibault	a2ee8c6500	Move ip_mreqn structure from Linux to generic I.e. from sysdeps/unix/sysv/linux/bits/in.h to netinet/in.h It is following both the BSD and Linux definitions. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-15 22:43:15 +02:00
Florian Weimer	f82e05ebb2	Linux: Terminate subprocess on late failure in tst-pidfd (bug 29485) Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-08-15 16:43:59 +02:00
Adhemerval Zanella	453b88efe6	arm: Remove nested function from relocate_pc24 Checked on arm-linux-gnueabihf.	2022-08-12 09:46:22 -03:00
Adhemerval Zanella	774058d729	linux: Fix sys/mount.h usage with kernel headers Now that kernel exports linux/mount.h and includes it on linux/fs.h, its definitions might clash with glibc exports sys/mount.h. To avoid the need to rearrange the Linux header to be always after glibc one, the glibc sys/mount.h is changed to: 1. Undefine the macros also used as enum constants. This covers prior inclusion of <linux/mount.h> (for instance MS_RDONLY). 2. Include <linux/mount.h> based on the usual __has_include check (needs to use __has_include ("linux/mount.h") to paper over GCC bugs). 3. Define enum fsconfig_command only if FSOPEN_CLOEXEC is not defined. (FSOPEN_CLOEXEC should be a very close proxy.) 4. Define struct mount_attr if MOUNT_ATTR_SIZE_VER0 is not defined. (Added in the same commit on the Linux side.) This patch also adds some tests to check if including linux/fs.h and linux/mount.h after and before sys/mount.h does work. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-12 09:15:28 -03:00
Adhemerval Zanella	e1226cdc6b	linux: Use compile_c_snippet to check linux/mount.h availability Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-12 09:15:23 -03:00
Adhemerval Zanella	c68b6044bc	linux: Mimic kernel definition for BLOCK_SIZE To avoid possible warnings if the kernel header is included before sys/mount.h. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-12 09:15:21 -03:00
Adhemerval Zanella	1542019b69	linux: Use compile_c_snippet to check linux/pidfd.h availability Instead of tying to a specific kernel version. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-12 09:15:11 -03:00
caiyinyu	1c9bc1b6e5	LoongArch: Add pointer mangling support.	2022-08-12 09:30:56 +08:00
Wilco Dijkstra	12182ba18d	AArch64: Fix typo in sve configure check (BZ# 29394) Fix a typo in the SVE configure check. This fixes [BZ# 29394].	2022-08-11 17:52:00 +01:00
Wilco Dijkstra	c51c483d2b	libio: Improve performance of IO locks Improve performance of recursive IO locks by adding a fast path for the single-threaded case. To reduce the number of memory accesses for locking/unlocking, only increment the recursion counter if the lock is already taken. On Neoverse V1, a microbenchmark with many small freads improved by 2.9x. Multithreaded performance improved by 2%. Reviewed-by: Cristian Rodríguez <crrodriguez@opensuse.org>	2022-08-11 16:47:45 +01:00
Stefan Liebler	11f09947f3	tst-process_madvise: Check process_madvise-syscall support. So far this test checks if pidfd_open-syscall is supported, which was introduced with linux 5.3. The process_madvise-syscall was introduced with linux 5.10. Thus you'll get FAILs if you are running a kernel in between. This patch adds a check if the first process_madvise-syscall returns ENOSYS and in this case will fail with UNSUPPORTED. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-08-11 12:21:05 +02:00
Noah Goldstein	312ded0d63	x86: Fix `#define STRCPY` guard in strcpy-sse2.S `#ifndef STPCPY` is incorrect for checking if `STRCPY` is already defined. It doesn't end up mattering as the whole check is guarded by `#if IS_IN (libc)` but is incorrect nonetheless.	2022-08-09 17:00:03 +08:00
Adhemerval Zanella	26a3499cdb	i386: Use cmpl instead of cmp Clang cannot assemble cmp in the AT&T dialect mode.	2022-08-05 09:28:39 -03:00
Adhemerval Zanella	1ed5869c4c	i386: Use fldt instead of fld on e_logl.S Clang cannot assemble fld in the AT&T dialect mode.	2022-08-05 09:28:33 -03:00
Fangrui Song	525ca33a61	i386: Replace movzx with movzbl Similar to `6720d36b66` for x86-64. Clang cannot assemble movzx in the AT&T dialect mode. Change movzx to movzbl, which follows the AT&T dialect and is used elsewhere in the file.	2022-08-04 14:06:50 -07:00
Adhemerval Zanella	3698f5a9dd	i386: Remove RELA support Now that prelink is not supported, there is no need to keep supporting rela for non bootstrap.	2022-08-04 10:03:46 -03:00
Adhemerval Zanella	c3f5682215	arm: Remove RELA support Now that prelink is not supported, there is no need to keep supporting rela for non bootstrap.	2022-08-04 10:03:46 -03:00
Adhemerval Zanella	36676f5e5d	Remove ldd libc4 support The older libc versions are obsolete for over twenty years now.	2022-08-04 10:03:45 -03:00
Lucas A. M. Magalhaes	8ee878592c	Assume only FLAG_ELF_LIBC6 support The older libc versions are obsolete for over twenty years now. This patch removes the special flags for libc5 and libc4 and assumes that all libraries cached are libc6 compatible and use FLAG_ELF_LIBC6. Checked with a build for all affected architectures. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-08-04 09:09:48 -03:00
Adhemerval Zanella	5a57ad23ba	Remove left over LD_LIBRARY_VERSION usages The environment variable was removed by `d2db60d8d8`.	2022-08-04 09:09:48 -03:00
Florian Weimer	8fabe0e632	Linux: Remove exit system call from _exit exit only terminates the current thread, not the whole process, so it is the wrong fallback system call in this context. All supported Linux versions implement the exit_group system call anyway.	2022-08-04 06:17:50 +02:00
caiyinyu	3e83843637	LoongArch: Add vdso support for gettimeofday.	2022-08-04 09:19:36 +08:00
Joseph Myers	085030b957	Update kernel version to 5.19 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 5.19. (There are no new constants covered by these tests in 5.19, or in 5.17 or 5.18 in the case of tst-mount-consts.py that previously used version 5.16, that need any other header changes.) Tested with build-many-glibcs.py.	2022-08-03 16:31:58 +00:00
Florian Weimer	68e036f27f	nptl: Remove uses of assert_perror __pthread_sigmask cannot actually fail with valid pointer arguments (it would need a really broken seccomp filter), and we do not check for errors elsewhere. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-08-03 11:42:49 +02:00
Florian Weimer	cca9684f2d	stdio: Clean up __libc_message after unconditional abort Since commit `ec2c1fcefb` ("malloc: Abort on heap corruption, without a backtrace [BZ #21754]"), __libc_message always terminates the process. Since commit `a289ea09ea` ("Do not print backtraces on fatal glibc errors"), the backtrace facility has been removed. Therefore, remove enum __libc_message_action and the action argument of __libc_message, and mark __libc_message as _No_return. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-08-03 11:42:39 +02:00
Joseph Myers	fccadcdf5b	Update syscall lists for Linux 5.19 Linux 5.19 has no new syscalls, but enables memfd_secret in the uapi headers for RISC-V. Update the version number in syscall-names.list to reflect that it is still current for 5.19 and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.	2022-08-02 21:05:07 +00:00
Arjun Shankar	9c443ac455	socket: Check lengths before advancing pointer in CMSG_NXTHDR The inline and library functions that the CMSG_NXTHDR macro may expand to increment the pointer to the header before checking the stride of the increment against available space. Since C only allows incrementing pointers to one past the end of an array, the increment must be done after a length check. This commit fixes that and includes a regression test for CMSG_FIRSTHDR and CMSG_NXTHDR. The Linux, Hurd, and generic headers are all changed. Tested on Linux on armv7hl, i686, x86_64, aarch64, ppc64le, and s390x. [BZ #28846] Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-08-02 11:10:25 +02:00
Mark Wielaard	325ba824b0	tst-pidfd.c: UNSUPPORTED if we get EPERM on valid pidfd_getfd call pidfd_getfd can fail for a valid pidfd with errno EPERM for various reasons in a restricted environment. Use FAIL_UNSUPPORTED in that case. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-29 18:52:12 +02:00
caiyinyu	bce0218d9a	LoongArch: Add greg_t and gregset_t.	2022-07-29 09:15:21 +08:00
caiyinyu	033e76ea9c	LoongArch: Fix VDSO_HASH and VDSO_NAME.	2022-07-29 09:15:21 +08:00
Darius Rad	7c5db7931f	riscv: Update rv64 libm test ulps Generated on a Microsemi Polarfire Icicle Kit running Linux version 5.15.32. Same ULPs were also produced on QEMU 5.2.0 running Linux 5.18.0.	2022-07-27 10:50:20 -03:00
Darius Rad	5b6d8a650d	riscv: Update nofpu libm test ulps	2022-07-27 10:50:10 -03:00
Jason A. Donenfeld	eaad4f9e8f	arc4random: simplify design for better safety Rather than buffering 16 MiB of entropy in userspace (by way of chacha20), simply call getrandom() every time. This approach is doubtlessly slower, for now, but trying to prematurely optimize arc4random appears to be leading toward all sorts of nasty properties and gotchas. Instead, this patch takes a much more conservative approach. The interface is added as a basic loop wrapper around getrandom(), and then later, the kernel and libc can work together on optimizing that. This prevents numerous issues in which userspace is unaware of when it really must throw away its buffer, since we avoid buffering altogether. Future improvements may include userspace learning more from the kernel about when to do that, which might make these sorts of chacha20-based optimizations more possible. The current heuristic of 16 MiB is meaningless garbage that doesn't correspond to anything the kernel might know about. So for now, let's just do something conservative that we know is correct and won't lead to cryptographic issues for users of this function. This patch might be considered along the lines of, "optimization is the root of all evil," in that the much more complex implementation it replaces moves too fast without considering security implications, whereas the incremental approach done here is a much safer way of going about things. Once this lands, we can take our time in optimizing this properly using new interplay between the kernel and userspace. getrandom(0) is used, since that's the one that ensures the bytes returned are cryptographically secure. But on systems without it, we fall back to using /dev/urandom. This is unfortunate because it means opening a file descriptor, but there's not much of a choice.
Secondly, as part of the fallback, in order to get more or less the same properties of getrandom(0), we poll on /dev/random, and if the poll succeeds at least once, then we assume the RNG is initialized. This is a rough approximation, as the ancient "non-blocking pool" initialized after the "blocking pool", not before, and it may not port back to all ancient kernels, though it does to all kernels supported by glibc (≥3.2), so generally it's the best approximation we can do. The motivation for including arc4random, in the first place, is to have source-level compatibility with existing code. That means this patch doesn't attempt to litigate the interface itself. It does, however, choose a conservative approach for implementing it. Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Cristian Rodríguez <crrodriguez@opensuse.org> Cc: Paul Eggert <eggert@cs.ucla.edu> Cc: Mark Harris <mark.hsj@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: linux-crypto@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-07-27 08:58:27 -03:00
caiyinyu	68d61026d5	LoongArch: Hard Float Support	2022-07-26 12:35:12 -03:00
caiyinyu	3d87c89815	LoongArch: Build Infrastructure	2022-07-26 12:35:12 -03:00
caiyinyu	0d4a891a7c	LoongArch: Add ABI Lists	2022-07-26 12:35:12 -03:00
caiyinyu	f2037efbb3	LoongArch: Linux ABI	2022-07-26 12:35:12 -03:00
caiyinyu	45955fe618	LoongArch: Linux Syscall Interface	2022-07-26 12:35:12 -03:00
caiyinyu	3275882261	LoongArch: Atomic and Locking Routines	2022-07-26 12:35:12 -03:00
caiyinyu	c742795dce	LoongArch: Generic <math.h> and soft-fp Routines	2022-07-26 12:35:12 -03:00
caiyinyu	619bfc6770	LoongArch: Thread-Local Storage Support	2022-07-26 12:35:12 -03:00
caiyinyu	a133942025	LoongArch: ABI Implementation	2022-07-26 12:35:12 -03:00
Arnout Vandecappelle (Essensium/Mind)	794c27446f	struct stat is not posix conformant on microblaze with __USE_FILE_OFFSET64 Commit `a06b40cdf5` updated stat.h to use __USE_XOPEN2K8 instead of __USE_MISC to add the st_atim, st_mtim and st_ctim members to struct stat. However, for microblaze, there are two definitions of struct stat, depending on the __USE_FILE_OFFSET64 macro. The second one was not updated. Change __USE_MISC to __USE_XOPEN2K8 in the __USE_FILE_OFFSET64 version of struct stat for microblaze.	2022-07-25 11:06:49 -03:00
Florian Weimer	0c5605989f	Linux: dirent/tst-readdir64-compat needs to use TEST_COMPAT (bug 27654) The hppa port starts libc at GLIBC_2.2, but has earlier symbol versions in other shared objects. This means that the compat symbol for readdir64 is not actually present in libc even though have-GLIBC_2.1.3 is defined as yes at the make level. Fixes commit `15e50e6c96` ("Linux: dirent/tst-readdir64-compat can be a regular test") by mostly reverting it.	2022-07-25 11:39:03 +02:00
Adhemerval Zanella Netto	3b56f944c5	s390x: Add optimized chacha20 It adds vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-s390x.S. The final state register clearing is omitted. On a z15 it shows the following improvements (using formatted bench-arc4random data): GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 198.92 arc4random_buf(16) [single-thread] 244.49 arc4random_buf(32) [single-thread] 282.73 arc4random_buf(48) [single-thread] 286.64 arc4random_buf(64) [single-thread] 320.06 arc4random_buf(80) [single-thread] 297.43 arc4random_buf(96) [single-thread] 310.96 arc4random_buf(112) [single-thread] 308.10 arc4random_buf(128) [single-thread] 309.90 ----------------------------------------------- VX. MB/s ----------------------------------------------- arc4random [single-thread] 430.26 arc4random_buf(16) [single-thread] 735.14 arc4random_buf(32) [single-thread] 1029.99 arc4random_buf(48) [single-thread] 1206.76 arc4random_buf(64) [single-thread] 1311.92 arc4random_buf(80) [single-thread] 1378.74 arc4random_buf(96) [single-thread] 1445.06 arc4random_buf(112) [single-thread] 1484.32 arc4random_buf(128) [single-thread] 1517.30 ----------------------------------------------- Checked on s390x-linux-gnu.	2022-07-22 11:58:27 -03:00
Adhemerval Zanella Netto	b7060acfe8	powerpc64: Add optimized chacha20 It adds vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-ppc.c. It targets POWER8 and it is used on default for LE. On a POWER8 it shows the following improvements (using formatted bench-arc4random data): POWER8 GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 138.77 arc4random_buf(16) [single-thread] 174.36 arc4random_buf(32) [single-thread] 228.11 arc4random_buf(48) [single-thread] 252.31 arc4random_buf(64) [single-thread] 270.11 arc4random_buf(80) [single-thread] 278.97 arc4random_buf(96) [single-thread] 287.78 arc4random_buf(112) [single-thread] 291.92 arc4random_buf(128) [single-thread] 295.25 POWER8 MB/s ----------------------------------------------- arc4random [single-thread] 198.06 arc4random_buf(16) [single-thread] 278.79 arc4random_buf(32) [single-thread] 448.89 arc4random_buf(48) [single-thread] 551.09 arc4random_buf(64) [single-thread] 646.12 arc4random_buf(80) [single-thread] 698.04 arc4random_buf(96) [single-thread] 756.06 arc4random_buf(112) [single-thread] 784.12 arc4random_buf(128) [single-thread] 808.04 ----------------------------------------------- Checked on powerpc64-linux-gnu and powerpc64le-linux-gnu. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2022-07-22 11:58:27 -03:00
Adhemerval Zanella Netto	84cfc6479b	x86: Add AVX2 optimized chacha20 It adds a vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-amd64-avx2.S. It is used only if AVX2 is supported and enabled by the architecture. As for the generic implementation, the last step that XORs with the input is omitted. The final state register clearing is also omitted. On a Ryzen 9 5900X it shows the following improvements (using formatted bench-arc4random data): SSE MB/s ----------------------------------------------- arc4random [single-thread] 704.25 arc4random_buf(16) [single-thread] 1018.17 arc4random_buf(32) [single-thread] 1315.27 arc4random_buf(48) [single-thread] 1449.36 arc4random_buf(64) [single-thread] 1511.16 arc4random_buf(80) [single-thread] 1539.48 arc4random_buf(96) [single-thread] 1571.06 arc4random_buf(112) [single-thread] 1596.16 arc4random_buf(128) [single-thread] 1613.48 ----------------------------------------------- AVX2 MB/s ----------------------------------------------- arc4random [single-thread] 922.61 arc4random_buf(16) [single-thread] 1478.70 arc4random_buf(32) [single-thread] 2241.80 arc4random_buf(48) [single-thread] 2681.28 arc4random_buf(64) [single-thread] 2913.43 arc4random_buf(80) [single-thread] 3009.73 arc4random_buf(96) [single-thread] 3141.16 arc4random_buf(112) [single-thread] 3254.46 arc4random_buf(128) [single-thread] 3305.02 ----------------------------------------------- Checked on x86_64-linux-gnu.	2022-07-22 11:58:27 -03:00
Adhemerval Zanella Netto	e169aff0e9	x86: Add SSE2 optimized chacha20 It adds a vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-amd64-ssse3.S. It replaces ROTATE_SHUF_2 (which uses pshufb) with ROTATE2, thus making the original implementation SSE2. As for the generic implementation, the last step that XORs with the input is omitted. The final state register clearing is also omitted. On a Ryzen 9 5900X it shows the following improvements (using formatted bench-arc4random data): GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 443.11 arc4random_buf(16) [single-thread] 552.27 arc4random_buf(32) [single-thread] 626.86 arc4random_buf(48) [single-thread] 649.81 arc4random_buf(64) [single-thread] 663.95 arc4random_buf(80) [single-thread] 674.78 arc4random_buf(96) [single-thread] 675.17 arc4random_buf(112) [single-thread] 680.69 arc4random_buf(128) [single-thread] 683.20 ----------------------------------------------- SSE MB/s ----------------------------------------------- arc4random [single-thread] 704.25 arc4random_buf(16) [single-thread] 1018.17 arc4random_buf(32) [single-thread] 1315.27 arc4random_buf(48) [single-thread] 1449.36 arc4random_buf(64) [single-thread] 1511.16 arc4random_buf(80) [single-thread] 1539.48 arc4random_buf(96) [single-thread] 1571.06 arc4random_buf(112) [single-thread] 1596.16 arc4random_buf(128) [single-thread] 1613.48 ----------------------------------------------- Checked on x86_64-linux-gnu.	2022-07-22 11:58:27 -03:00
Adhemerval Zanella Netto	4c128c7823	aarch64: Add optimized chacha20 It adds a vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-aarch64.S. It is used as default and only little-endian is supported (BE uses generic code). As for the generic implementation, the last step that XORs with the input is omitted. The final state register clearing is also omitted. On a virtualized Linux on Apple M1 it shows the following improvements (using formatted bench-arc4random data): GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 380.89 arc4random_buf(16) [single-thread] 500.73 arc4random_buf(32) [single-thread] 552.61 arc4random_buf(48) [single-thread] 566.82 arc4random_buf(64) [single-thread] 574.01 arc4random_buf(80) [single-thread] 581.02 arc4random_buf(96) [single-thread] 591.19 arc4random_buf(112) [single-thread] 592.29 arc4random_buf(128) [single-thread] 596.43 ----------------------------------------------- OPTIMIZED MB/s ----------------------------------------------- arc4random [single-thread] 569.60 arc4random_buf(16) [single-thread] 825.78 arc4random_buf(32) [single-thread] 987.03 arc4random_buf(48) [single-thread] 1042.39 arc4random_buf(64) [single-thread] 1075.50 arc4random_buf(80) [single-thread] 1094.68 arc4random_buf(96) [single-thread] 1130.16 arc4random_buf(112) [single-thread] 1129.58 arc4random_buf(128) [single-thread] 1137.91 ----------------------------------------------- Checked on aarch64-linux-gnu.	2022-07-22 11:58:27 -03:00
Adhemerval Zanella Netto	6f4e0fcfa2	stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417 ) The implementation is based on scalar Chacha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy, and reseeds the internal state on every 16MB of consumed buffer. To improve performance and lower memory consumption the per-thread cache is allocated lazily on the first call to an arc4random function, and if the memory allocation fails getentropy or /dev/urandom is used as fallback. The cache is also cleared on thread exit iff it was initialized (so if arc4random is not called it is not touched). Although it is lock-free, arc4random is still not async-signal-safe (the per-thread state is not updated atomically). The ChaCha20 implementation is based on RFC8439 [1], omitting the final XOR of the keystream with the plaintext because the plaintext is a stream of zeros. This strategy is similar to what OpenBSD arc4random does. The arc4random_uniform is based on previous work by Florian Weimer, where the algorithm is based on Jérémie Lumbroso's paper Optimal Discrete Uniform Generation from Coin Flips, and Applications (2013) [2], who credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform random number generation (1976), for solving the general case. The main advantage of this method is that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending on the upper bound requested, it might lead to better CPU utilization. Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. Co-authored-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> [1] https://datatracker.ietf.org/doc/html/rfc8439 [2] https://arxiv.org/pdf/1304.1916.pdf	2022-07-22 11:58:27 -03:00
Michael Hudson-Doyle	1f4e90d468	linux: return UNSUPPORTED from tst-mount if entering mount namespace fails Before this the test fails if run in a chroot by a non-root user: warning: could not become root outside namespace (Operation not permitted) ../sysdeps/unix/sysv/linux/tst-mount.c:36: numeric comparison failure left: 1 (0x1); from: errno right: 19 (0x13); from: ENODEV error: ../sysdeps/unix/sysv/linux/tst-mount.c:39: not true: fd != -1 error: ../sysdeps/unix/sysv/linux/tst-mount.c:46: not true: r != -1 error: ../sysdeps/unix/sysv/linux/tst-mount.c:48: not true: r != -1 ../sysdeps/unix/sysv/linux/tst-mount.c:52: numeric comparison failure left: 1 (0x1); from: errno right: 9 (0x9); from: EBADF error: ../sysdeps/unix/sysv/linux/tst-mount.c:55: not true: mfd != -1 ../sysdeps/unix/sysv/linux/tst-mount.c:58: numeric comparison failure left: 1 (0x1); from: errno right: 2 (0x2); from: ENOENT error: ../sysdeps/unix/sysv/linux/tst-mount.c:61: not true: r != -1 ../sysdeps/unix/sysv/linux/tst-mount.c:65: numeric comparison failure left: 1 (0x1); from: errno right: 2 (0x2); from: ENOENT error: ../sysdeps/unix/sysv/linux/tst-mount.c:68: not true: pfd != -1 error: ../sysdeps/unix/sysv/linux/tst-mount.c:75: not true: fd_tree != -1 ../sysdeps/unix/sysv/linux/tst-mount.c:88: numeric comparison failure left: 1 (0x1); from: errno right: 38 (0x26); from: ENOSYS error: 12 test failures Checking that the test can enter a new mount namespace is more correct than just checking the return value of support_become_root() as the test code changes the mount namespace it runs in so running it as root on a system that does not support mount namespaces should still skip. Also change the test to remove the unnecessary fork. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-19 06:55:49 +12:00
Noah Goldstein	49889fb256	x86: Add support to build st{p\|r}{n}{cpy\|cat} with explicit ISA level 1. Add default ISA level selection in non-multiarch/rtld implementations. 2. Add ISA level build guards to different implementations. - I.e. strcpy-avx2.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (strcpy-evex.S). 3. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-16 03:07:59 -07:00
Noah Goldstein	192979ee35	x86: Add support to build wcscpy with explicit ISA level 1. Add ISA level build guards to different implementations. - wcscpy-ssse3.S is used as ISA level 2/3/4. - wcscpy-generic.c is only used at ISA level 1 and will only build if compiled with ISA level == 1. Otherwise there is no reason to include it as we will always use wcscpy-ssse3.S 2. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-16 03:07:59 -07:00
Noah Goldstein	ceabdcd130	x86: Add support to build strcmp/strlen/strchr with explicit ISA level 1. Add default ISA level selection in non-multiarch/rtld implementations. 2. Add ISA level build guards to different implementations. - I.e. strcmp-avx2.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (strcmp-evex.S). 3. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-16 03:07:59 -07:00
Stefan Liebler	779aa039fc	S390: Define SINGLE_THREAD_BY_GLOBAL only on s390x Starting with commit `e070501d12` "Replace __libc_multiple_threads with __libc_single_threaded" the testcases nptl/tst-cancel-self and nptl/tst-cancel-self-cancelstate are failing. This is fixed by only defining SINGLE_THREAD_BY_GLOBAL on s390x, but not on s390. Starting with commit `09c76a7409` "Linux: Consolidate {RTLD_}SINGLE_THREAD_P definition", SINGLE_THREAD_BY_GLOBAL was defined in sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h. Later on, commit `9a973da617` "s390: Consolidate Linux syscall definition" consolidated the sysdep.h files from the s390-32/s390-64 subdirectories. Unfortunately the macro is now always defined instead of only on s390-64. For information: TLS_MULTIPLE_THREADS_IN_TCB is also only defined for s390. See: sysdeps/s390/nptl/tls.h	2022-07-14 13:39:09 +02:00
Noah Goldstein	7c8ca17893	x86: Add missing rtm tests for strcmp family Add new tests for: strcasecmp strncasecmp strcmp wcscmp These functions all have avx2_rtm implementations so should be tested.	2022-07-13 14:55:31 -07:00
Noah Goldstein	42b014dd1b	x86: Remove unneeded rtld-wmemcmp wmemcmp isn't used by the dynamic loader so there is no need to add an RTLD stub for it. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	e19bb87c97	x86: Move wcslen SSE2 implementation to multiarch/wcslen-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	64479f11b7	x86: Move wcschr SSE2 implementation to multiarch/wcschr-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	72a48ec0f7	x86: Move strcat SSE2 implementation to multiarch/strcat-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	cd080d0741	x86: Move strchr SSE2 implementation to multiarch/strchr-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	425647458b	x86: Move strrchr SSE2 implementation to multiarch/strrchr-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	08af081ffd	x86: Move memrchr SSE2 implementation to multiarch/memrchr-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	6b9006bfb0	x86: Move strcpy SSE2 implementation to multiarch/strcpy-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	58e6cd4bcb	x86: Move strlen SSE2 implementation to multiarch/strlen-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	60a583ec60	x86: Move strcmp SSE42 implementation to multiarch/strcmp-sse4_2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	427eaa2c85	x86: Move wcscmp SSE2 implementation to multiarch/wcscmp-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	d561fbb041	x86: Move strcmp SSE2 implementation to multiarch/strcmp-sse2.S This commit doesn't affect libc.so.6, it's just housekeeping to prepare for adding explicit ISA level support. Because strcmp-sse2.S implements so many functions (more than avx2/evex/sse42), add a new file 'strcmp-naming.h' to assist in getting the correct symbol name for all the functions across multiarch/non-multiarch builds. Tested build on x86_64 and x86_32 with/without multiarch.	2022-07-13 14:55:31 -07:00
Noah Goldstein	30e57e0a21	x86: Rename STRCASECMP_NONASCII macro to STRCASECMP_L_NONASCII The previous macro name can be confusing given that both `__strcasecmp_l_nonascii` and `__strcasecmp_nonascii` are functions and we use the `_l` version.	2022-07-13 14:55:31 -07:00
Noah Goldstein	f2698954ff	x86: Remove __mmask intrinsics in strstr-avx512.c The intrinsics are not available before GCC7 and using standard operators generates code of equivalent or better quality. Removed: _cvtmask64_u64 _kshiftri_mask64 _kand_mask64 Geometric Mean of 5 Runs of Full Benchmark Suite New / Old: 0.958	2022-07-12 15:41:14 -07:00
Noah Goldstein	9c38deec96	x86: Remove generic strncat, strncpy, and stpncpy implementations These functions all have optimized versions: __strncat_sse2_unaligned, __strncpy_sse2_unaligned, and __stpncpy_sse2_unaligned which are faster than their respective generic implementations. Since the sse2 versions can run on baseline x86_64, we should use these as the baseline implementation and can remove the generic implementations. Geometric mean of N=20 runs of the entire benchmark suite on: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz (Tigerlake) __strncat_sse2_unaligned / __strncat_generic: .944 __strncpy_sse2_unaligned / __strncpy_generic: .726 __stpncpy_sse2_unaligned / __stpncpy_generic: .650 Tested build with and without multiarch and full check with multiarch.	2022-07-12 11:44:12 -07:00
Fangrui Song	c5bec9d491	i386: Remove -Wa,-mtune=i686 gas -mtune= may change NOP generating patterns but -mtune=i686 has no difference from the default by inspecting .o and .os files. Note: Clang doesn't support -Wa,-mtune=i686.	2022-07-12 11:14:32 -07:00
H.J. Lu	ec9013727d	x86-64: Remove redundant strcspn-generic/strpbrk-generic/strspn-generic Remove redundant strcspn-generic, strpbrk-generic and strspn-generic from sysdep_routines in sysdeps/x86_64/multiarch/Makefile added by commit `c69f960b01` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Sun Jul 3 21:28:07 2022 -0700 x86: Add support for building str{c\|p}{brk\|spn} with explicit ISA level since they have been added to sysdep_routines in sysdeps/x86_64/Makefile.	2022-07-08 16:06:04 -07:00
H.J. Lu	eedf7886ed	x86-64: Don't mark symbols as hidden in strcmp-XXX.S Don't mark symbols as hidden in strcmp-avx2.S, strcmp-evex.S and strcmp-sse42.S since they are marked as hidden in the IFUNC selectors.	2022-07-07 16:38:11 -07:00
Tom Honermann	8bcca1db3d	stdlib: Implement mbrtoc8, c8rtomb, and the char8_t typedef. This change provides implementations for the mbrtoc8 and c8rtomb functions adopted for C++20 via WG21 P0482R6 and for C2X via WG14 N2653. It also provides the char8_t typedef from WG14 N2653. The mbrtoc8 and c8rtomb functions are declared in uchar.h in C2X mode or when the _GNU_SOURCE macro or C++20 __cpp_char8_t feature test macro is defined. The char8_t typedef is declared in uchar.h in C2X mode or when the _GNU_SOURCE macro is defined and the C++20 __cpp_char8_t feature test macro is not defined (if __cpp_char8_t is defined, then char8_t is a builtin type). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-07-06 09:29:42 -03:00
Danila Kutenin	3c99806989	aarch64: Optimize string functions with shrn instruction We found that string functions were using AND+ADDP to find the nibble/syndrome mask, but there is an easier alternative: `SHRN dst.8b, src.8h, 4` (shift right every 2 bytes by 4 and narrow to 1 byte), which has the same latency as ADDP on all SIMD ARMv8 targets. There are also possible gaps for memcmp but that's for another patch. We see 10-20% savings for small-mid size cases (<=128) which are primary cases for general workloads.	2022-07-06 09:26:20 +01:00
Noah Goldstein	ae308947ff	x86: Add support for building {w}memcmp{eq} with explicit ISA level 1. Refactor files so that all implementations are in the multiarch directory - Moved the implementation portion of memcmp sse2 from memcmp.S to multiarch/memcmp-sse2.S - The non-multiarch file now only includes one of the implementations in the multiarch directory based on the compiled ISA level (only used for non-multiarch builds. Otherwise we go through the ifunc selector). 2. Add ISA level build guards to different implementations. - I.e. memcmp-avx2-movsb.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (memcmp-evex-movbe.S). 3. Add new multiarch/rtld-{w}memcmp{eq}.S that just include the non-multiarch {w}memcmp{eq}.S which will in turn select the best implementation based on the compiled ISA level. 4. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-05 16:42:42 -07:00
Noah Goldstein	37ecc657b2	x86: Add support for building {w}memset{_chk} with explicit ISA level 1. Refactor files so that all implementations are in the multiarch directory - Moved the implementation portion of memset sse2 from memset.S to multiarch/memset-sse2.S - The non-multiarch file now only includes one of the implementations in the multiarch directory based on the compiled ISA level (only used for non-multiarch builds. Otherwise we go through the ifunc selector). 2. Add ISA level build guards to different implementations. - I.e. memset-avx2-unaligned-erms.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (memset-evex-unaligned-erms.S). 3. Add new multiarch/rtld-memset.S that just includes the non-multiarch memset.S which will in turn select the best implementation based on the compiled ISA level. 4. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-05 16:42:42 -07:00
Noah Goldstein	b6a02c3606	x86: Add support for building {w}memmove{_chk} with explicit ISA level 1. Refactor files so that all implementations are in the multiarch directory - Moved the implementation portion of memmove sse2 from memmove.S to multiarch/memmove-sse2.S - The non-multiarch file now only includes one of the implementations in the multiarch directory based on the compiled ISA level (only used for non-multiarch builds. Otherwise we go through the ifunc selector). 2. Add ISA level build guards to different implementations. - I.e. memmove-avx2-unaligned-erms.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (memmove-evex-unaligned-erms.S). 3. Add new multiarch/rtld-memmove.S that just includes the non-multiarch memmove.S which will in turn select the best implementation based on the compiled ISA level. 4. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-05 16:42:42 -07:00
Noah Goldstein	c69f960b01	x86: Add support for building str{c\|p}{brk\|spn} with explicit ISA level The changes for these functions are different than the others because the best implementation (sse4_2) requires the generic implementation as a fallback to be built as well. Changes are: 1. Add non-multiarch functions for str{c\|p}{brk\|spn}.c to statically select the best implementation based on the configured ISA build level. 2. Add stubs for str{c\|p}{brk\|spn}-generic and varshift.c in the sysdeps/x86_64 directory so that the sse4 implementation will have all of its dependencies for the non-multiarch / rtld build when ISA level >= 2. 3. Add new multiarch/rtld-strcspn.c that just includes the non-multiarch strcspn.c which will in turn select the best implementation based on the compiled ISA level. 4. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guaranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.	2022-07-05 16:42:42 -07:00
Noah Goldstein	baeae86fb8	x86: Add comment explaining no Slow_SSE4_2 check in ifunc-sse4_2 Just for clarity's sake and so that if a future implementation is added we remember to add the check.	2022-07-05 16:42:42 -07:00
Adhemerval Zanella	e070501d12	Replace __libc_multiple_threads with __libc_single_threaded This also fixes the SINGLE_THREAD_P macro for SINGLE_THREAD_BY_GLOBAL: since the single-thread.h header inclusion is in the wrong order, the define needs to come before including sysdeps/unix/sysdep.h. The macro is now moved to a per-arch single-thread.h header. SINGLE_THREAD_P is also used in some more places. Checked on aarch64-linux-gnu and x86_64-linux-gnu.	2022-07-05 10:14:47 -03:00
Adhemerval Zanella	af1aa36c61	linux: Add mount_setattr It was added in Linux 5.12 (2a1867219c7b27f928e2545782b86daaf9ad50bd) to allow changing the properties of a mount or a mount tree using file descriptors, which the new mount API is based on. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 10:08:48 -03:00
Adhemerval Zanella	c3b02b6567	linux: Add tst-mount to check for Linux new mount API The new mount API was added on Linux 5.2 with six new syscalls: fsopen, fsconfig, fsmount, move_mount, fspick, and open_tree. The new test verifies minimal functionality along with error paths for specific arguments and their corner cases. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 10:08:48 -03:00
Adhemerval Zanella	78a408ee7b	linux: Add open_tree It was added on Linux 5.2 (a07b20004793d8926f78d63eb5980559f7813404) to return a O_PATH-opened file descriptor to an existing mountpoint. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 10:08:48 -03:00
Adhemerval Zanella	60f574e140	linux: Add fspick It was added in Linux 5.2 (cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb) and can be used to pick an existing mountpoint into a filesystem context, which can thereafter be used to reconfigure a superblock with the fsconfig syscall. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 10:08:48 -03:00
Adhemerval Zanella	7eae6a91e9	linux: Add fsconfig It was added in Linux 5.2 (ecdab150fddb42fe6a739335257949220033b782) as a way to configure a filesystem creation context and trigger actions upon it, to be used in conjunction with fsopen, fspick and fsmount. The fsconfig_command commands are currently only defined as an enum, so they can't be checked on tst-mount-consts.py with current test support. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 10:08:48 -03:00
Tejas Belagod	05844d18f7	AArch64: Reset HWCAP2_AFP bits in FPCR for default fenv The AFP feature (Alternate floating-point behavior) was added in armv8.7 and introduced new FPCR bits. Currently, HWCAP2_AFP bits (bit 0, 1, 2) in FPCR are preserved when fenv is set to default environment. This is a deviation from standard behaviour. Clear these bits when setting the fenv to default. There is no libc API to modify the new FPCR bits. Restoring those bits matters if the user changed them directly.	2022-07-05 14:01:17 +01:00
Adhemerval Zanella	8ee2c043cf	Fix hurd namespace issues for internal signal functions It was introduced by "Refactor internal-signals.h (`a1bdd81664`)". Use the internal symbols instead. Checked with a build for i686-gnu.	2022-07-04 11:10:06 -03:00
Adhemerval Zanella	a1bdd81664	Refactor internal-signals.h The main driver is to optimize the internal usage and required size when sigset_t is embedded in other data structures. On Linux, the currently supported signal set requires up to 8 bytes (16 on mips), which is lower than the user-defined sigset_t (128 bytes). A new internal type internal_sigset_t is added, along with functions to operate on it similar to the ones for sigset_t. The internal-signals.h is also refactored to remove unused functions. Besides smaller stack usage in some functions (posix_spawn, abort), it lowers the size of struct pthread by about 120 bytes (112 on mips). Checked on x86_64-linux-gnu. Reviewed-by: Arjun Shankar <arjun@redhat.com>	2022-06-30 14:56:21 -03:00
Kito Cheng	c22d2021a9	riscv: Use memcpy to handle unaligned access when fixing R_RISCV_RELATIVE Although RISC-V Linux enables the unaligned memory access handler by default, that is quite expensive in general; using memcpy is much cheaper, since it just breaks the access down into several load/store byte instructions. ARM and MIPS have similar issues: ARM: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51456 MIPS: https://gcc.gnu.org/legacy-ml/gcc-help/2005-07/msg00325.html Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-06-30 08:04:52 -07:00
Tejas Belagod	e9dd368296	AArch64: Add asymmetric faulting mode for tag violations in mem.tagging tunable The new asymmetric mode is available when HWCAP2_MTE3 is set (support is available), bit2 is set in the tunable (user request per application), and the system is configured such that the asymmetric mode is preferred over sync or async (per-cpu system-wide setting). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2022-06-30 14:01:08 +01:00
Adhemerval Zanella	71d87d85bf	linux: Fix mq_timereceive check for 32 bit fallback code (BZ 29304) On success, mq_receive() and mq_timedreceive() return the number of bytes in the received message, so it requires to check if the value is larger than 0. Checked on i686-linux-gnu.	2022-06-30 09:12:59 -03:00
Noah Goldstein	96ac447d91	x86: Add missing IS_IN (libc) check to strncmp-sse4_2.S The check was missing, so for the multiarch build rtld-strncmp-sse4_2.os was being built and exporting symbols: build/glibc/string/rtld-strncmp-sse4_2.os: 0000000000000000 T __strncmp_sse42 Introduced in: commit `11ffcacb64` Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Jun 21 12:10:50 2017 -0700 x86-64: Implement strcmp family IFUNC selectors in C	2022-06-29 19:47:52 -07:00
Noah Goldstein	0aa294fb88	x86: Add missing IS_IN (libc) check to strcspn-sse4.c The check was missing, so for the multiarch build rtld-strcspn-sse4.os was being built and exporting symbols: build/glibc/string/rtld-strcspn-sse4.os: U ___m128i_shift_right U __strcspn_generic 0000000000000000 T __strcspn_sse42 U strlen build/glibc/string/rtld-varshift.os: 0000000000000000 R ___m128i_shift_right Introduced in: commit `06e51c8f3d` Author: H.J. Lu <hongjiu.lu@intel.com> Date: Fri Jul 3 02:48:56 2009 -0700 Add SSE4.2 support for strcspn, strpbrk, and strspn on x86-64.	2022-06-29 19:47:52 -07:00
Noah Goldstein	8cfbbbcdf9	x86: Add missing IS_IN (libc) check to memmove-ssse3.S The check was missing, so for the multiarch build rtld-memmove-ssse3.os was being built and exporting symbols: >$ nm string/rtld-memmove-ssse3.os U __GI___chk_fail 0000000000000020 T __memcpy_chk_ssse3 0000000000000040 T __memcpy_ssse3 0000000000000020 T __memmove_chk_ssse3 0000000000000040 T __memmove_ssse3 0000000000000000 T __mempcpy_chk_ssse3 0000000000000010 T __mempcpy_ssse3 U __x86_shared_cache_size_half Introduced after 2.35 in: commit `26b2478322` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Thu Apr 14 11:47:40 2022 -0500 x86: Reduce code size of mem{move\|pcpy\|cpy}-ssse3	2022-06-29 19:47:52 -07:00
H.J. Lu	88070acdd0	x86-64: Properly indent X86_IFUNC_IMPL_ADD_VN arguments Properly indent X86_IFUNC_IMPL_ADD_VN arguments for memchr, rawmemchr and wmemchr. Co-authored-by: H.J. Lu <hjl.tools@gmail.com>	2022-06-29 19:47:52 -07:00
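
The riscv entry in this log (c22d2021a9) describes writing a relocated value through memcpy so that an unaligned target address does not fall back to the expensive unaligned-access handler. A minimal sketch of that idea follows. The helper names `load_unaligned`, `store_unaligned`, and `apply_relative` are hypothetical and are not taken from the glibc sources, and the "load bias plus addend" computation is an assumption about how a RELATIVE-style fixup is typically formed rather than a copy of the actual patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; these names are not glibc's.  */

/* Read a pointer-sized value from an address that may be unaligned.
   memcpy leaves the compiler free to emit accesses that are safe for
   any alignment instead of one word-sized load.  */
static uintptr_t
load_unaligned (const void *src)
{
  uintptr_t value;
  memcpy (&value, src, sizeof value);
  return value;
}

/* Write a pointer-sized value to an address that may be unaligned.  */
static void
store_unaligned (void *dst, uintptr_t value)
{
  memcpy (dst, &value, sizeof value);
}

/* Assumed shape of a RELATIVE-style fixup: the stored value is the
   load bias plus the addend, written without assuming alignment.  */
static void
apply_relative (void *reloc_addr, uintptr_t load_bias, uintptr_t addend)
{
  assert (reloc_addr != NULL);
  store_unaligned (reloc_addr, load_bias + addend);
}
```

Calling `apply_relative (buf + 1, 0x1000, 0x234)` on a byte buffer stores 0x1234 at an odd address without a direct word-sized store; compilers commonly inline such fixed-size memcpy calls, though that depends on the target and optimization flags.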

15452 Commits