glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-22 04:50:07 +00:00

Author	SHA1	Message	Date
Adhemerval Zanella	d8023eb460	arm: Regenerate ULPs From new tests added by `0797283910`.	2024-08-07 11:02:03 -03:00
Adhemerval Zanella	e2f88d8524	aarch64: Regenerate ULPs From new tests added by `0797283910`.	2024-08-07 11:02:03 -03:00
Adhemerval Zanella	428c7383da	sysdeps: Re-flow and sort multiline gnu/Makefile definitions	2024-08-07 11:02:03 -03:00
Wilco Dijkstra	3dc426b642	AArch64: Improve generic strlen Improve performance by handling another 16 bytes before entering the loop. Use ADDHN in the loop to avoid SHRN+FMOV when it terminates. Change final size computation to avoid increasing latency. On Neoverse V1 performance of the random strlen benchmark improves by 4.6%. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-07 14:58:46 +01:00
Paul Zimmermann	0797283910	added inputs giving large errors on x86_64 for new C23 functions These functions are exp10m1, exp2m1, log10p1, log2p1. Also regenerated ulps on x86_64. For each format, there are 4 values, one for each rounding mode. (For the intel96 format, there are 8 values, 4 for Intel hardware, and 4 for AMD hardware. However, regen-ulps was only run on Intel. It should be run in a separate patch on an AMD x86_64.) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-07 14:28:46 +02:00
caiyinyu	d7eca2714f	LoongArch: Update Ulps. From new tests added by `4dc22baa84`. Signed-off-by: caiyinyu <caiyinyu@loongson.cn>	2024-08-06 09:23:56 +08:00
Florian Weimer	5097cd344f	elf: Avoid re-initializing already allocated TLS in dlopen (bug 31717) The old code used l_init_called as an indicator for whether TLS initialization was complete. However, it is possible that TLS for an object is initialized, written to, and then dlopen for this object is called again, and l_init_called is not true at this point. Previously, this resulted in TLS being initialized twice, discarding any interim writes (technically introducing a use-after-free bug even). This commit introduces an explicit per-object flag, l_tls_in_slotinfo. It indicates whether _dl_add_to_slotinfo has been called for this object. This flag is used to avoid double-initialization of TLS. In update_tls_slotinfo, the first_static_tls micro-optimization is removed because preserving the initialization flag for subsequent use by the second loop for static TLS is a bit complicated, and another per-object flag does not seem to be worth it. Furthermore, the l_init_called flag is dropped from the second loop (for static TLS initialization) because l_need_tls_init on its own prevents double-initialization. The remaining l_init_called usage in resize_scopes and update_scopes is just an optimization due to the use of scope_has_map, so it is not changed in this commit. The isupper check ensures that libc.so.6 TLS is not reverted. Such a revert happens if l_need_tls_init is not cleared in _dl_allocate_tls_init for the main_thread case, now that l_init_called is not checked anymore in update_tls_slotinfo in elf/dl-open.c. Reported-by: Jonathon Anderson <janderson@rice.edu> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-08-05 18:26:52 +02:00
Florian Weimer	fe06fb313b	elf: Clarify and invert second argument of _dl_allocate_tls_init Also remove an outdated comment: _dl_allocate_tls_init is called as part of pthread_create. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-08-05 18:26:42 +02:00
Florian Weimer	7a630f7d33	x86: Tunables may incorrectly set Prefer_PMINUB_for_stringop (bug 32047) Fixes commit `5bcf6265f2` ("x86: Disable non-temporal memset on Skylake Server"). Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-08-02 18:08:14 +02:00
Florian Weimer	0df48472ff	x86: Add missing switch/case fall-through markers to init_cpu_features The commits introducing these fall-throughs intended them to happen. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-08-02 18:08:14 +02:00
Samuel Thibault	8dc3f4f8ad	hurd: Fix missing pthread_ compat symbol in libc `5476f8cd2e` ("htl: move pthread_self info libc.") and `9dfa256216` ("htl: move pthread_equal into libc") to `1dc0bc8f07` ("htl: move pthread_attr_setdetachstate into libc") moved some pthread_ symbols from libpthread.so to libc.so, but missed adding the compat version like `5476f8cd2e` ("htl: move pthread_self info libc.") did: libc already had these symbols as forwards, but versioned GLIBC_2.21, while the symbols in libpthread.so were versioned GLIBC_2.12. To fix running executables built before this, we thus have to add the GLIBC_2.12 version, otherwise execution fails with e.g. /usr/lib/i386-gnu/libglib-2.0.so: symbol lookup error: /usr/lib/i386-gnu/libglib-2.0.so: undefined symbol: pthread_attr_setinheritsched, version GLIBC_2.12	2024-08-01 23:58:51 +02:00
H.J. Lu	ff0320bec2	Add mremap tests Add tests for MREMAP_MAYMOVE and MREMAP_FIXED. On Linux, also test MREMAP_DONTUNMAP. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-01 05:06:12 -07:00
H.J. Lu	6c40cb0e9f	linux: Update the mremap C implementation [BZ #31968 ] Update the mremap C implementation to support the optional argument for MREMAP_DONTUNMAP added in Linux 5.7 since it may not always be correct to implement a variadic function as a non-variadic function on all Linux targets. Return MAP_FAILED and set errno to EINVAL for unknown flag bits. This fixes BZ #31968. Note: A test must be added when a new flag bit is introduced. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-01 05:06:12 -07:00
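The argument handling described in 6c40cb0e9f can be sketched as follows. This is a minimal illustration, not the glibc source: the function name and the SK_ macros are hypothetical stand-ins (the flag values match the Linux definitions of MREMAP_MAYMOVE, MREMAP_FIXED and MREMAP_DONTUNMAP), and instead of invoking the system call the sketch returns the address it would pass as the optional fifth argument.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-ins for MREMAP_MAYMOVE, MREMAP_FIXED,
   MREMAP_DONTUNMAP and MAP_FAILED.  */
#define SK_MREMAP_MAYMOVE   1
#define SK_MREMAP_FIXED     2
#define SK_MREMAP_DONTUNMAP 4
#define SK_MAP_FAILED       ((void *) -1)

/* Sketch of a variadic mremap wrapper.  Unknown flag bits are rejected
   with EINVAL.  The optional argument is read only when a flag that
   requires it is set.  A real wrapper would now issue the system call;
   this sketch returns the new address it would pass (NULL if none).  */
void *
mremap_args_sketch (void *addr, size_t old_len, size_t new_len,
                    int flags, ...)
{
  (void) addr;
  (void) old_len;
  (void) new_len;

  if (flags & ~(SK_MREMAP_MAYMOVE | SK_MREMAP_FIXED | SK_MREMAP_DONTUNMAP))
    {
      errno = EINVAL;
      return SK_MAP_FAILED;
    }

  void *new_addr = NULL;
  if (flags & (SK_MREMAP_FIXED | SK_MREMAP_DONTUNMAP))
    {
      va_list ap;
      va_start (ap, flags);
      new_addr = va_arg (ap, void *);
      va_end (ap);
    }
  return new_addr;
}
```

Reading the optional argument only when a flag calls for it keeps the wrapper from touching a variadic slot the caller never supplied, which is the portability concern the commit message raises.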
Adhemerval Zanella	28f8cee64a	Add F_DUPFD_QUERY from Linux 6.10 to bits/fcntl-linux.h It was added by commit c62b758bae6af16 as a way for userspace to check if two file descriptors refer to the same struct file. Checked on aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-30 08:52:52 -03:00
Adhemerval Zanella	e433cdec9b	Update kernel version to 6.10 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py, and tst-pidfd-consts.py to 6.10. There are no new constants covered by these tests in 6.10. Tested with build-many-glibcs.py. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-30 08:48:51 -03:00
Adhemerval Zanella	eb0776d4e1	Update syscall lists for Linux 6.10 Linux 6.10 changes for syscall are: * mseal for all architectures. * map_shadow_stack for x32. * Replace sync_file_range with sync_file_range2 for csky (which fixes a broken sync_file_range usage). Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-30 08:48:51 -03:00
Michael Karcher	faeaa3bc9f	Mitigation for "clone on sparc might fail with -EFAULT for no valid reason" (bz 31394) It seems the kernel can not deal with uncommitted stack space in the area intended for the register window when executing the clone() system call. So create a nested frame (proxy for the kernel frame) and flush it from the processor to memory to force committing pages to the stack before invoking the system call. Bug: https://www.mail-archive.com/debian-glibc@lists.debian.org/msg62592.html Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31394 See-also: https://lore.kernel.org/sparclinux/62f9be9d-a086-4134-9a9f-5df8822708af@mkarcher.dialup.fu-berlin.de/ Signed-off-by: Michael Karcher <sourceware-bugzilla@mkarcher.dialup.fu-berlin.de> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-07-29 23:00:39 +02:00
Julian Zhu	32328a5a14	MIPS: Regenerate ulps From new tests added by `4dc22baa84`. Signed-off-by: Julian Zhu <jz531210@gmail.com>	2024-07-27 16:55:38 +02:00
Maciej W. Rozycki	8c98195af6	nptl: Use <support/check.h> facilities in tst-setuid3 Remove local FAIL macro in favor of FAIL_EXIT1 from <support/check.h>, which provides equivalent reporting, with the name of the file and the line number of the failure site additionally included. Remove FAIL_ERR altogether and include ": %m" explicitly with the format string supplied to FAIL_EXIT1 as there seems little value to have a separate macro just for this. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-07-26 13:21:34 +01:00
Adhemerval Zanella	fe94080875	sparc: Regenerate ULPs From new tests added by `4dc22baa84`.	2024-07-25 11:06:53 -03:00
Adhemerval Zanella	65e267dcdd	i386: Regenerate ULPs From new tests added by `4dc22baa84`.	2024-07-25 10:49:06 -03:00
Adhemerval Zanella	cc84f11282	arm: Regenerate ULPs From new tests added by `4dc22baa84`.	2024-07-25 10:41:34 -03:00
Adhemerval Zanella	cfc9b07346	aarch64: Regenerate ULPs From new tests added by `4dc22baa84`.	2024-07-25 10:41:30 -03:00
Adhemerval Zanella	fa00661082	powerpc: Regenerate ULPs for soft-fp From new tests added by `4dc22baa84`.	2024-07-25 10:33:40 -03:00
jeevitha	4e40c8104f	powerpc: Update ulps for fpu Adjust the ULPs for the log2p1 implementation.	2024-07-25 10:28:47 -03:00
Khem Raj	ff03b5efe6	riscv: Update ulps Generated with make regen-ulps using gcc14 on a visionfive2 SBC. Signed-off-by: Khem Raj <raj.khem@gmail.com>	2024-07-25 10:28:44 -03:00
Stefan Liebler	22958014ab	s390x: Regenerate ULPs. Needed due to: "This patch adds larger ulp errors for the log2p1 function." commit `4dc22baa84`	2024-07-25 14:14:22 +02:00
H.J. Lu	8344c1f551	x32/cet: Support shadow stack during startup for Linux 6.10 Use RXX_LP in RTLD_START_ENABLE_X86_FEATURES. Support shadow stack during startup for Linux 6.10: commit 2883f01ec37dd8668e7222dfdb5980c86fdfe277 Author: H.J. Lu <hjl.tools@gmail.com> Date: Fri Mar 15 07:04:33 2024 -0700 x86/shstk: Enable shadow stacks for x32 1. Add shadow stack support to x32 signal. 2. Use the 64-bit map_shadow_stack syscall for x32. 3. Set up shadow stack for x32. Add the map_shadow_stack system call to <fixup-asm-unistd.h> and regenerate arch-syscall.h. Tested on Intel Tiger Lake with CET enabled x32. There are no regressions with CET enabled x86-64. There are no changes in CET enabled x86-64 _dl_start_user. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-07-25 00:17:21 -07:00
H.J. Lu	652c6cf269	x86-64: Remove sysdeps/x86_64/x32/dl-machine.h Remove sysdeps/x86_64/x32/dl-machine.h by folding x32 ARCH_LA_PLTENTER, ARCH_LA_PLTEXIT and RTLD_START into sysdeps/x86_64/dl-machine.h. There are no regressions on x86-64 nor x32. There are no changes in x86-64 _dl_start_user. On x32, _dl_start_user changes are <_dl_start_user>: mov %eax,%r12d + mov %esp,%r13d mov (%rsp),%edx mov %edx,%esi - mov %esp,%r13d and $0xfffffff0,%esp mov 0x0(%rip),%edi # <_dl_start_user+0x14> lea 0x8(%r13,%rdx,4),%ecx Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-07-25 00:17:21 -07:00
John David Anglin	431c1be28e	hppa: Update libm-test-ulps	2024-07-24 16:43:01 -04:00
Paul Zimmermann	4dc22baa84	This patch adds larger ulp errors for the log2p1 function. Changes in v2: - added larger error for long double on AMD reported by Adhemerval (https://sourceware.org/pipermail/libc-alpha/2024-June/157755.html) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-07-22 08:54:23 +02:00
Andreas K. Hüttel	ab5748118f	linux: Trivial test output fix in tst-pkey Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-19 22:57:23 +02:00
Adhemerval Zanella	6b7e2e1d61	linux: Also check pkey_get for ENOSYS on tst-pkey (BZ 31996) The powerpc pkey_get/pkey_set support was only added for 64-bit [1], and tst-pkey only checks if the support was present with pkey_alloc (which does not fail on powerpc32, at least running a 64-bit kernel). Checked on powerpc-linux-gnu. [1] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a803367bab167f5ec4fde1f0d0ec447707c29520 Reviewed-By: Andreas K. Huettel <dilfridge@gentoo.org>	2024-07-19 22:39:44 +02:00
Adhemerval Zanella	e0f7da7235	powerpc: Update soft-fp ulps Results based on regen-ulps using gcc 11.2.1 on a POWER8 machine.	2024-07-19 19:29:35 +02:00
John David Anglin	8cfa4ecff2	Fix usage of _STACK_GROWS_DOWN and _STACK_GROWS_UP defines [BZ 31989] Signed-off-by: John David Anglin <dave.anglin@bell.net> Reviewed-By: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-19 10:10:17 -04:00
H.J. Lu	66f2cd6e1a	x32: xfail elf/tst-platform-1 [BZ #22363 ] Xfail elf/tst-platform-1 on x32 since kernel passes i686 in AT_PLATFORM. See https://sourceware.org/bugzilla/show_bug.cgi?id=22363 Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>	2024-07-19 10:34:38 +02:00
Andreas K. Hüttel	910aae6e5a	Revert "LoongArch: Add cfi instructions for _dl_tlsdesc_dynamic" We're in freeze for the 2.40 release. This reverts commit `43224b1379`. Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-17 15:24:51 +02:00
Samuel Thibault	6ed76f4efc	htl: Fix __pthread_init_thread declaration and definition `0e75c4a463` ("hurd: Fix pthread_self() without libpthread") added a declaration for ___pthread_init_thread instead of __pthread_init_thread, and missed defining the external hidden symbol.	2024-07-17 15:04:25 +02:00
Samuel Thibault	0e75c4a463	hurd: Fix pthread_self() without libpthread `5476f8cd2e` ("htl: move pthread_self info libc.") moved the htl pthread_self() function from libpthread to libc, replacing the previous libc stub that just returns 0. And `53da64d1cf` ("htl: Initialize ___pthread_self early") added initialization code which is needed before being able to call pthread_self. It is currently in libpthread, and thus never called before programs can call pthread_self from libc, which then segfaults when accessing _pthread_self()->thread. This moves the initialization to libc itself, as initialized variables, so pthread_self can always be called fine.	2024-07-17 14:14:21 +02:00
mengqinggang	43224b1379	LoongArch: Add cfi instructions for _dl_tlsdesc_dynamic In _dl_tlsdesc_dynamic, there are three 'addi.d sp, sp, -size' instructions to allocate stack space for Float/LSX/LASX registers. Every 'addi.d sp, sp, -size' needs a cfi_adjust_cfa_offset because sp is used to compute the CFA. But only one 'addi.d sp, sp, -size' will be run, according to the HWCAP value, while all cfi_adjust_cfa_offset directives are applied during stack unwinding, which results in an incorrect CFA. Split _dl_tlsdesc_dynamic into _dl_tlsdesc_dynamic, _dl_tlsdesc_dynamic_lsx and _dl_tlsdesc_dynamic_lasx, so that the conflicting cfi instructions are distributed across the three functions and each cfi instruction corresponds to its stack adjustment instruction.	2024-07-17 09:32:25 +08:00
Noah Goldstein	5bcf6265f2	x86: Disable non-temporal memset on Skylake Server The original commit enabling non-temporal memset on Skylake Server had erroneous benchmarks (actually done on ICX). Further benchmarks indicate non-temporal stores may in fact be a regression on Skylake Server. This commit may be over-cautious in some cases, but should avoid any regressions for 2.40. Tested using qemu on all x86_64 cpu arch supported by both qemu + GLIBC. Reviewed-by: DJ Delorie <dj@redhat.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-07-16 17:20:18 +08:00
Flavio Cruz	2dcc908538	Add pthread_getname_np and pthread_setname_np for Hurd We use thread_get_name and thread_set_name to get and set the thread name, so nothing is stored in the thread structure since these functions are supposed to be called sparingly. One notable difference with Linux is that the thread name is up to 32 chars, whereas Linux's is 16. Also added a mach_RPC_CHECK to check for the existence of gnumach RPCs.	2024-07-16 09:21:52 +02:00
Andreas K. Hüttel	a11e15ea0a	math: Update alpha ulps Linux alphadev 6.9.8-gentoo-alpha #1 Sun Jul 7 00:45:49 EDT 2024 alpha EV68CB Titan GNU/Linux gcc (Gentoo 14.1.1_p20240622 p2) 14.1.1 20240622 GNU ld (Gentoo 2.42 p6) 2.42.0 Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-14 12:44:15 +02:00
Andreas K. Hüttel	ef7005628f	tests: XFAIL audit tests failing on all mips configurations, bug 29404 Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-12 18:49:42 +02:00
Stefan Liebler	9b76514103	s390x: Fix segfault in wcsncmp [BZ #31934 ] The z13/vector-optimized wcsncmp implementation segfaults if n=1 and there is only one character (equal on both strings) before the page end. Then it loads and compares one character and fails to check n again. The following load fails. This patch removes the extra load and compare of the first character and just starts with the loop which uses vector-load-to-block-boundary. This code-path also checks n. With this patch both tests are passing: - the simplified one mentioned in the bugzilla 31934 - the full one in Florian Weimer's patch: "manual: Document a GNU extension for strncmp/wcsncmp" (https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/): On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-07-11 15:08:57 +02:00
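The invariant behind 9b76514103 (check n before every load) can be illustrated with a portable scalar loop. This is only a reference sketch under that assumption, not the s390x vector code; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Scalar reference: the remaining count n is tested before each pair
   of characters is loaded, so no element at index >= n is ever read,
   even when the inputs are not null-terminated.  */
int
wcsncmp_sketch (const wchar_t *a, const wchar_t *b, size_t n)
{
  assert (a != NULL && b != NULL);
  for (; n > 0; a++, b++, n--)
    {
      if (*a != *b)
        return *a < *b ? -1 : 1;
      if (*a == L'\0')
        return 0;
    }
  return 0;
}
```

With n=1 and a single equal character in each buffer, the loop reads exactly one element from each input and returns 0, which is the case the bug report describes.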
Florian Weimer	2e456ccf0c	Linux: Make __rseq_size useful for feature detection (bug 31965) The __rseq_size value is now the active area of struct rseq (so 20 initially), not the full struct size including padding at the end (32 initially). Update misc/tst-rseq to print some additional diagnostics. Reviewed-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>	2024-07-09 19:33:37 +02:00
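The feature detection that 2e456ccf0c enables can be sketched as an offset check against the active size. The struct below is an illustrative copy, not the real definition: the 20-byte original area and the 32-byte padded size come from the commit message, while the node_id and mm_cid fields are assumed from the Linux UAPI extension of struct rseq. The macro name is hypothetical. Real code would use struct rseq and __rseq_size from <sys/rseq.h>; the sketch takes the active size as a parameter so that it does not depend on the installed glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only.  */
struct rseq_sketch
{
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;       /* The original active area ends here (20 bytes).  */
  uint32_t node_id;     /* Assumed extension field.  */
  uint32_t mm_cid;      /* Assumed extension field.  */
  char padding[4];
} __attribute__ ((aligned (32)));

static_assert (sizeof (struct rseq_sketch) == 32,
               "full struct size including padding");

/* A field is usable only if it lies entirely inside the active area
   reported by the runtime.  */
#define RSEQ_FIELD_ACTIVE(active_size, field)                      \
  (offsetof (struct rseq_sketch, field)                            \
   + sizeof (((struct rseq_sketch *) 0)->field)                    \
   <= (size_t) (active_size))
```

With an active size of 20, RSEQ_FIELD_ACTIVE (20, flags) holds and RSEQ_FIELD_ACTIVE (20, node_id) does not. A check against the padded size of 32 could not tell these cases apart, which is why the smaller value is the useful one for detection.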
Andreas K. Hüttel	ab6045728f	math: Update m68k ULPs This hasn't been looked at for a loong time (already guessing from the number of missing entries), and it ain't pretty. There are some 9-ulps results for float. - ZaZaZebra (qemu-system-m68k clone of PowerBook 190 system) - GCC 13.3.1 20240614 (Gentoo 13.3.1_p20240614 p17) - ld GNU ld (Gentoo 2.42 p6) 2.42.0 - Linux ZaZaZebra 4.19.0-5-m68k #1 Gentoo 4.19.37-5 (2019-06-19) m68k 68040 68040 GNU/Linux - manual build - ../glibc/configure --enable-fortify-source --prefix=/usr - Tested by Immolo (via Andreas K. Hüttel) Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-08 21:51:03 +02:00
Adhemerval Zanella	9fc639f654	elf: Make dl-rseq-symbols Linux only And avoid a Hurd build failure. Checked on x86_64-linux-gnu.	2024-07-04 10:09:07 -03:00
Michael Jeanson	2b92982e23	nptl: fix potential merge of __rseq_* relro symbols While working on a patch to add support for the extensible rseq ABI, we came across an issue where a new 'const' variable would be merged with the existing '__rseq_size' variable. We tracked this to the use of '-fmerge-all-constants' which allows the compiler to merge identical constant variables. This means that all 'const' variables in a compile unit that are of the same size and are initialized to the same value can be merged. In this specific case, on 32 bit systems 'unsigned int' and 'ptrdiff_t' are both 4 bytes and initialized to 0 which should trigger the merge. However for reasons we haven't delved into when the attribute 'section (".data.rel.ro")' is added to the mix, only variables of the same exact types are merged. As far as we know this behavior is not specified anywhere and could change with a new compiler version, hence this patch. Move the definitions of these variables into an assembler file and add hidden writable aliases for internal use. This has the added bonus of removing the asm workaround to set the values on rseq registration. Tested on Debian 12 with GCC 12.2. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-03 21:40:30 +02:00
Darius Rad	b85a23d736	riscv: Update nofpu libm test ulps Fixes 32 test failures.	2024-07-03 21:05:34 +02:00
John David Anglin	4737e6a7a3	hppa/vdso: Provide 64-bit clock_gettime() vDSO only Adhemerval noticed that the gettimeofday() and 32-bit clock_gettime() vDSO calls won't be used by glibc on hppa, so there is no need to declare them. Both syscalls will be emulated by utilizing return values of the 64-bit clock_gettime() vDSO instead. Signed-off-by: Helge Deller <deller@gmx.de> Suggested-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>	2024-07-02 16:26:32 -04:00
YunQiang Su	9d0e9c8a13	MIPSr6/math: Use builtin fma and fmaf MIPSr6 has MADDF.s/MADDF.d instructions, which are fused. In MIPS ISA, double support can be subsetted. Only FMAF is enabled for this case. * sysdeps/mips/fpu/math-use-builtins-fma.h Signed-off-by: YunQiang Su <syq@gcc.gnu.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-07-01 14:52:30 -03:00
Florian Weimer	018f0fc3b8	elf: Support recursive use of dynamic TLS in interposed malloc It turns out that quite a few applications use bundled mallocs that have been built to use global-dynamic TLS (instead of the recommended initial-exec TLS). The previous workaround from commit `afe42e935b` ("elf: Avoid some free (NULL) calls in _dl_update_slotinfo") does not fix all encountered cases unfortunately. This change avoids the TLS generation update for recursive use of TLS from a malloc that was called during a TLS update. This is possible because an interposed malloc has a fixed module ID and TLS slot. (It cannot be unloaded.) If an initially-loaded module ID is encountered in __tls_get_addr and the dynamic linker is already in the middle of a TLS update, use the outdated DTV, thus avoiding another call into malloc. It's still necessary to update the DTV to the most recent generation, to get out of the slow path, which is why the check for recursion is needed. The bookkeeping is done using a global counter instead of per-thread flag because TLS access in the dynamic linker is tricky. All this will go away once the dynamic linker stops using malloc for TLS, likely as part of a change that pre-allocates all TLS during pthread_create/dlopen. Fixes commit `d2123d6827` ("elf: Fix slow tls access after dlopen [BZ #19924]"). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-07-01 19:02:11 +02:00
MayShao-oc	9dc645cb56	x86: Set default non_temporal_threshold for Zhaoxin processors Current 'non_temporal_threshold' set to 'non_temporal_threshold_lowbound' on Zhaoxin processors without ERMS. The default 'non_temporal_threshold_lowbound' is too small for the KH-40000 and KX-7000 Zhaoxin processors, this patch updates the value to 'shared / cachesize_non_temporal_divisor'. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-30 06:26:43 -07:00
MayShao-oc	c19457aec6	x86_64: Optimize large size copy in memmove-ssse3 This patch optimizes large size copy using normal store when src > dst and overlap. Make it the same as the logic in memmove-vec-unaligned-erms.S. Currently memmove-ssse3 uses '__x86_shared_cache_size_half' as the non-temporal threshold; this patch updates that value to '__x86_shared_non_temporal_threshold'. Currently, the __x86_shared_non_temporal_threshold is cpu-specific, and different CPUs will have different values based on the related nt-benchmark results. However, in memmove-ssse3, the nontemporal threshold uses '__x86_shared_cache_size_half', which sounds unreasonable. Performance does not change drastically, although the results show overall improvements without any major regressions or gains. Results on Zhaoxin KX-7000: bench-memcpy geometric_mean(N=20) New / Original: 0.999 bench-memcpy-random geometric_mean(N=20) New / Original: 0.999 bench-memcpy-large geometric_mean(N=20) New / Original: 0.978 bench-memmove geometric_mean(N=20) New / Original: 1.000 bench-memmove-large geometric_mean(N=20) New / Original: 0.962 Results on Intel Core i5-6600K: bench-memcpy geometric_mean(N=20) New / Original: 1.001 bench-memcpy-random geometric_mean(N=20) New / Original: 0.999 bench-memcpy-large geometric_mean(N=20) New / Original: 1.001 bench-memmove geometric_mean(N=20) New / Original: 0.995 bench-memmove-large geometric_mean(N=20) New / Original: 0.936 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-30 06:26:43 -07:00
MayShao-oc	44d757eb9f	x86: Set preferred CPU features on the KH-40000 and KX-7000 Zhaoxin processors Fix code formatting under the Zhaoxin branch and add comments for different Zhaoxin models. Unaligned AVX load are slower on KH-40000 and KX-7000, so disable the AVX_Fast_Unaligned_Load. Enable Prefer_No_VZEROUPPER and Fast_Unaligned_Load features to use sse2_unaligned version of memset,strcpy and strcat. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-30 06:26:43 -07:00
Andrew Pinski	2f1f7a5f8a	Aarch64: Add new memset for Qualcomm's oryon-1 core Qualcomm's new core, oryon-1, has different characteristics for memset than the current versions of memset. For non-zero, larger sizes, using GPRs rather than the SIMD stores is ~30% faster. For even larger sizes, using the nontemporal stores is needed so as not to pollute the L1/L2 caches. For zero values, `dc zva` should be used. Since we know the size will always be 64 bytes, we don't need to figure out the size there. I started with the emag memset and added back the `dc zva` code. Changes since v1: * v3: Fix comment formatting Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-30 13:47:17 +02:00
Andrew Pinski	4dc83cac78	Aarch64: Add memcpy for qualcomm's oryon-1 core Qualcomm's new core (oryon-1) has different performance characteristics than other cores. For memcpy, it is faster to use the GPRs to do the copy for large sizes (2x faster). For even larger sizes, it is better to use the nontemporal load/store instructions so we don't pollute the L1/L2 caches. For smaller sizes, the characteristics are very similar to other cores. I used the thunderx memcpy as a starting point and expanded from there. Changes since v1: * v2: Fix ordering in Makefile. * v3: Fix comment grammar about the ldnp/stnp instructions. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-30 13:46:33 +02:00
Palmer Dabbelt	07fe71f59b	arm: Avoid UB in elf_machine_rel() This recently came up during a cleanup to remove misaligned accesses from the RISC-V port. Link: https://sourceware.org/pipermail/libc-alpha/2022-June/139961.html Suggested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Fangrui Song <maskray@google.com>	2024-06-26 12:45:43 +02:00
mengqinggang	a10b6ad471	LoongArch: Fix tst-gnu2-tls2 test case asm volatile ("movfcsr2gr $t0, $fcsr0" ::: "$t0"); asm volatile ("st.d $t0, %0" :"=m"(restore_fcsr)); generates the following instructions with the -Og flag: movfcsr2gr $t0, $zero addi.d $t0, $sp, 2047(0x7ff) addi.d $t0, $t0, 77(0x4d) st.w $t0, $t0, 0 The fcsr0 value and the address of the restore_fcsr variable both end up in the t0 register. Change to: asm volatile ("movfcsr2gr %0, $fcsr0" :"=r"(restore_fcsr)); to avoid placing the restore_fcsr address in t0. Compare float values using memcmp because float values cannot be directly compared for equality. Put LOAD_REGISTER_FCSR and SAVE_REGISTER_FCC after LOAD_REGISTER_FLOAT, because some float instructions may change the fcsr register.	2024-06-26 12:02:07 +08:00
Adhemerval Zanella	c90cfce849	posix: Fix pidfd_spawn/pidfd_spawnp leak if execve fails (BZ 31695) If the pidfd_spawn/pidfd_spawnp helper process succeeds, but execve fails for some reason (an invalid or non-existent path, a memory allocation failure, etc.), the resulting pidfd is never closed, nor returned to the caller (so it can call close). Since the process creation failed, it should be up to posix_spawn to also close the file descriptor in this case (similar to what it does to reap the process). This patch also replaces waitpid with waitid (P_PIDFD) for the pidfd case, to avoid a possible pid re-use. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-06-25 12:11:48 -03:00
Andreas K. Hüttel	d32c342425	Revert "MIPSr6/math: Use builtin fma and fmaf" Apologies, I mistakenly interpreted this to be already accepted. Reverting until v6 or later is reviewed and approved. This reverts commit `9e06e4a43b`.	2024-06-25 01:02:58 +02:00
Christoph Müllner	81c7f6193c	RISC-V: Execute a PAUSE hint in spin loops The atomic_spin_nop() macro can be used to run arch-specific code in the body of a spin loop to potentially improve efficiency. RISC-V's Zihintpause extension includes a PAUSE instruction for this use-case, which is encoded as a HINT, which means that it behaves like a NOP on systems that don't implement Zihintpause. Binutils supports Zihintpause since 2.36, so this patch uses the ".insn" directive to keep the code compatible with older toolchains. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-06-24 21:36:49 +02:00
YunQiang Su	9e06e4a43b	MIPSr6/math: Use builtin fma and fmaf MIPSr6 has MADDF.s/MADDF.d instructions, which are fused. In MIPS ISA, double support can be subsetted. Only FMAF is enabled for this case. * sysdeps/mips/fpu/math-use-builtins-fma.h Signed-off-by: YunQiang Su <syq@gcc.gnu.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2024-06-24 19:43:57 +02:00
John David Anglin	aecde502e9	hppa/vdso: Add wrappers for vDSO functions The upcoming parisc (hppa) v6.11 Linux kernel will include vDSO support for gettimeofday(), clock_gettime() and clock_gettime64() syscalls for 32- and 64-bit userspace. The patch below adds the necessary glue code for glibc. Signed-off-by: Helge Deller <deller@gmx.de> Changes in v2: - add vsyscalls for 64-bit too	2024-06-23 19:39:28 -04:00
John David Anglin	9dddb26954	Update hppa libm-test-ulps	2024-06-23 13:51:25 -04:00
John David Anglin	da61ba3f89	Update hppa libm-test-ulps	2024-06-20 19:44:04 -04:00
Julian Zhu	9f2bf0e23a	RISC-V: Update ulps For the exp10m1, exp2m1, log10p1 and log2p1 implementations. Signed-off-by: Julian Zhu <jz531210@gmail.com>	2024-06-20 23:46:32 +02:00
Julian Zhu	cb20e7c7cc	MIPS: Update ulps Update mips32/mips64 ulps for the exp10m1, exp2m1, and log10p1 implementations. Signed-off-by: Julian Zhu <jz531210@gmail.com>	2024-06-20 23:45:24 +02:00
Florian Weimer	b375e597da	i386: Update ulps This is from a -march=i686 -mtune=generic build with --disable-multi-arch, running on a Cascade Lake CPU.	2024-06-20 19:00:48 +02:00
Florian Weimer	362588f7cc	s390x: Capture grep output in static PIE check The test is not a run-time check, so update the description. Also use readelf -W for a more stable output format and fix an LC_ALL typo. This avoids garbled configure messages: checking for s390-specific static PIE requirements (runtime check)... 0x0000000000000017 (JMPREL) 0x280 yes	2024-06-20 14:34:06 +02:00
Florian Weimer	71dafdf5f1	powerpc: Update ulps Results based on POWER8 and POWER9 machines running powerpc64-linux-gnu, with and without --disable-multi-arch.	2024-06-20 12:15:31 +02:00
Florian Weimer	3cb77b7d1e	i386: Update ulps Based on a -march=x86-64-v4 -mfpmath=sse build, with and without --disable-multi-arch, running on a Zen 4 CPU. Also used different -march=x86-64-v… settings.	2024-06-20 12:15:09 +02:00
Xi Ruoyao	9405d54c62	LoongArch: Update ulps Add ulps for recently added C23 exp10m1, exp2m1, and log10p1 functions. Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2024-06-19 21:17:19 +02:00
Andreas K. Hüttel	4f1cf0c0e1	sparc: Regenerate ULPs Linux catbus 5.15.110-gentoo-r1 #1 SMP Fri Jun 9 17:53:23 PDT 2023 sparc64 sun4v UltraSparc T5 (Niagara5) GNU/Linux Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-06-19 14:58:32 +02:00
Stefan Liebler	19f6d6a480	s390x: Regenerate ULPs. Needed due to: - "Implement C23 log10p1" commit ID `55eb99e9a9` - "Implement C23 exp2m1, exp10m1" commit ID `7ec903e028`	2024-06-19 08:42:30 +02:00
mengqinggang	9a675d998e	LoongArch: Fix _dl_tlsdesc_dynamic in LSX case The HWCAP value is overwritten by the first comparison, in the LASX case, so the second comparison, for LSX, gets an incorrect result. Change to use t0 to hold the HWCAP value and t1 to hold the comparison result.	2024-06-19 10:06:41 +08:00
Adhemerval Zanella	92341e3150	arm: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Adhemerval Zanella	45f5f51b85	aarch64: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Adhemerval Zanella	52b397bafa	powerpc: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Florian Weimer	f6ea5d1291	Linux: Include <dl-symbol-redir-ifunc.h> in dl-sysdep.c The _dl_sysdep_parse_arguments function contains initialization of a large on-stack variable: dl_parse_auxv_t auxv_values = { 0, }; This uses a non-inline version of memset on powerpc64le-linux-gnu, so it must use the baseline memset.	2024-06-18 10:56:34 +02:00
Carlos Llamas	176671f604	linux: add definitions for hugetlb page size encodings A desired hugetlb page size can be encoded in the flags parameter of system calls such as mmap() and shmget(). The Linux UAPI headers have included explicit definitions for these encodings since v4.14. This patch adds these definitions that are used along with MAP_HUGETLB and SHM_HUGETLB flags as specified in the corresponding man pages. This relieves programs from having to duplicate and/or compute the encodings manually. Additionally, the filter on these definitions in tst-mman-consts.py is removed, as suggested by Florian. I then ran this test successfully, confirming the alignment with the kernel headers. PASS: misc/tst-mman-consts original exit status 0 Signed-off-by: Carlos Llamas <cmllamas@google.com> Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-06-18 10:56:34 +02:00
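As a rough illustration of the manual computation that commit `176671f604` spares programs: on Linux the requested huge page size is encoded as the base-2 logarithm of the size, shifted into the upper bits of the flags argument. The sketch below is not glibc code. The `EX_`/`ex_` names are hypothetical, and the shift of 26 is assumed to mirror `MAP_HUGE_SHIFT` in the Linux UAPI headers.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constant; assumed to mirror MAP_HUGE_SHIFT in the
   Linux UAPI headers.  */
#define EX_HUGE_SHIFT 26

/* Return log2 (page_size) << EX_HUGE_SHIFT for a power-of-two
   page size.  */
static unsigned int
ex_hugetlb_encode (size_t page_size)
{
  /* Only powers of two have a well-defined encoding.  */
  assert (page_size != 0 && (page_size & (page_size - 1)) == 0);

  unsigned int log2_size = 0;
  while ((page_size >>= 1) != 0)
    log2_size++;
  return log2_size << EX_HUGE_SHIFT;
}
```

With header definitions available, a program would presumably OR a named constant (for example the Linux header name `MAP_HUGE_2MB`) into the flags instead of calling a helper like this.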
Stefan Liebler	e260ceb4aa	elf: Remove HWCAP_IMPORTANT Remove the definitions of HWCAP_IMPORTANT after removal of LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask. There HWCAP_IMPORTANT was used as default value. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	ad0aa1f549	elf: Remove LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask Remove the environment variable LD_HWCAP_MASK and the tunable glibc.cpu.hwcap_mask as those are not used anymore in common-code after removal in elf/dl-cache.c:search_cache(). The only remaining user is sparc32 where it is used in elf_machine_matches_host(). If sparc32 does not need it anymore, we can get rid of it altogether. Otherwise we could also move LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask to be sparc32 specific. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	343439a31e	elf: Remove _DL_PLATFORMS_COUNT Remove the definitions of _DL_PLATFORMS_COUNT as those are not used anymore after removal in elf/dl-cache.c:search_cache(). Note: On x86, we can also get rid of the definitions HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	43c7c5e62d	elf: Remove _DL_FIRST_PLATFORM Remove the definitions of _DL_FIRST_PLATFORM as those were only used in the _DL_HWCAP_PLATFORM definitions and in _dl_string_platform(). Both were removed. Note: Removed on every architecture except powerpc, where _dl_string_platform() is still used. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	ed23449dac	elf: Remove _DL_HWCAP_PLATFORM Remove the definitions of _DL_HWCAP_PLATFORM as those are not used anymore after removal in elf/dl-cache.c:search_cache(). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	374c8b4483	elf: Remove platform strings in dl-procinfo.c Remove the platform strings in dl-procinfo.c where also the implementation of _dl_string_platform() was removed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	8faada8302	elf: Remove _dl_string_platform Apart from powerpc, where the returned integer is stored in the tcb, and the diagnostics output, there is no user anymore. Thus this patch removes the diagnostics output and _dl_string_platform for all other platforms. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	f14b6dfc87	x86: Remove HWCAP_START and HWCAP_COUNT Both defines are not used anymore. Those were only used for _dl_string_hwcap(), which itself was removed with commit `ab40f20364` "elf: Remove _dl_string_hwcap" Just clean up. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
YunQiang Su	eaf4fc516a	math: Update mips32/mips64 ulps for log2p1	2024-06-17 21:45:53 +02:00
Andreas K. Hüttel	98ffc1bfeb	Convert to autoconf 2.72 (vanilla release, no distribution patches) As discussed at the patch review meeting Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org> Reviewed-by: Simon Chopin <simon.chopin@canonical.com>	2024-06-17 21:15:28 +02:00
Joseph Myers	7ec903e028	Implement C23 exp2m1, exp10m1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and exp10(x)-1, like expm1). As with other such functions, these use type-generic templates that could be replaced with faster and more accurate type-specific implementations in future. Test inputs are copied from those for expm1, plus some additions close to the overflow threshold (copied from exp2 and exp10) and also some near the underflow threshold. exp2m1 has the unusual property of having an input (M_MAX_EXP) where whether the function overflows (under IEEE semantics) depends on the rounding mode. Although these could reasonably be XFAILed in the testsuite (as we do in some cases for arguments very close to a function's overflow threshold when an error of a few ulps in the implementation can result in the implementation not agreeing with an ideal one on whether overflow takes place - the testsuite isn't smart enough to handle this automatically), since these functions aren't required to be correctly rounding, I made the implementation check for and handle this case specially. The Makefile ordering expected by lint-makefiles for the new functions is a bit peculiar, but I implemented it in this patch so that the test passes; I don't know why log2 also needed moving in one Makefile variable setting when it didn't in my previous patches, but the failure showed a different place was expected for that function as well. 
The powerpc64le IFUNC setup seems not to be as self-contained as one might hope; it shouldn't be necessary to add IFUNCs for new functions such as these simply to get them building, but without setting up IFUNCs for the new functions, there were undefined references to __GI___expm1f128 (that IFUNC machinery results in no such function being defined, but doesn't stop include/math.h from doing the redirection resulting in the exp2m1f128 and exp10m1f128 implementations expecting to call it). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 16:31:49 +00:00
Joseph Myers	55eb99e9a9	Implement C23 log10p1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for base-10 logarithms). This is directly analogous to the log2p1 implementation (except that whereas log2p1 has a smaller underflow range than log1p, log10p1 has a larger underflow range). The test inputs are copied from those for log1p and log2p1, plus a few more inputs in that wider underflow range. Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:48:13 +00:00
Joseph Myers	bb014f50c4	Implement C23 logp1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch does mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are not updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:47:09 +00:00
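For orientation, the C23 function families named in commits `7ec903e028`, `55eb99e9a9`, `bb014f50c4`, and `79c52daf47` can be related to the long-standing `expm1` and `log1p` by the identities below. These are mathematical identities only; they are not a description of how the type-generic templates in glibc compute their results.

```latex
\begin{align*}
\operatorname{exp2m1}(x)  &= 2^{x} - 1    = \operatorname{expm1}(x \ln 2) \\
\operatorname{exp10m1}(x) &= 10^{x} - 1   = \operatorname{expm1}(x \ln 10) \\
\operatorname{log2p1}(x)  &= \log_{2}(1 + x)  = \frac{\operatorname{log1p}(x)}{\ln 2} \\
\operatorname{log10p1}(x) &= \log_{10}(1 + x) = \frac{\operatorname{log1p}(x)}{\ln 10} \\
\operatorname{logp1}(x)   &= \ln(1 + x)   = \operatorname{log1p}(x)
\end{align*}
```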
Noah Goldstein	5b54a33435	x86: Fix value for `x86_memset_non_temporal_threshold` when it is undesirable When we don't want to use non-temporal stores for memset, we set `x86_memset_non_temporal_threshold` to SIZE_MAX. The current code, however, was using `maximum_non_temporal_threshold` as the upper bound, which is `SIZE_MAX >> 4`, so we ended up with a value of `0`. Fix is to just use `SIZE_MAX` as the upper bound when setting the tunable. Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-06-14 17:25:05 -05:00
Andreas K. Hüttel	3953b5b88f	i686: Regenerate ulps Linux pinacolada 6.6.32-gentoo #1 SMP PREEMPT Sun Jun 9 14:18:17 CEST 2024 x86_64 Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz GenuineIntel GNU/Linux 32bit build for multilib environment Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-06-14 21:24:24 +02:00
Xi Ruoyao	97aa7b7346	LoongArch: Ensure sp 16-byte aligned for tlsdesc "ADDI sp, sp, 24" and "ADDI sp, sp, SZFCSREG" (SZFCSREG = 4) are misaligning the stack: the ABI mandates a 16-byte alignment. Fix it by changing the first one to "ADDI sp, sp, 32", and reuse the spare 4th slot for saving fcsr. Reported-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2024-06-14 10:14:54 +08:00
H.J. Lu	29807a271e	x86: Properly set x86 minimum ISA level [BZ #31883] Properly set libc_cv_have_x86_isa_level in shell for MINIMUM_X86_ISA_LEVEL defined as (__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4) Also set __X86_ISA_V2 to 1 for i386 if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 is defined. There are no changes in config.h nor in config.make on x86-64. On i386, -march=x86-64-v2 with GCC generates #define MINIMUM_X86_ISA_LEVEL 2 in config.h and have-x86-isa-level = 2 in config.make. This fixes BZ #31883. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-12 14:27:54 -07:00
Adhemerval Zanella	7edd3814b0	linux: Remove __stack_prot The __stack_prot is used by Linux to make the stack executable if a module requires it. It is also marked as RELRO, which requires changing the segment permission to RW to update it. Also, there is no need to keep track of the flags: either the stack will have the default permission of the ABI or should be changed to PROT_READ | PROT_WRITE | PROT_EXEC. The only additional flag, PROT_GROWSDOWN or PROT_GROWSUP, is Linux only and can be deduced from _STACK_GROWS_DOWN/_STACK_GROWS_UP. Also, the check_consistency function was already removed some time ago. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-06-12 15:25:54 -03:00
H.J. Lu	09bc68b0ac	x86: Properly set MINIMUM_X86_ISA_LEVEL for i386 [BZ #31867] On i386, set the default minimum ISA level to 0, not 1 (baseline which includes SSE2). There are no changes in config.h nor in config.make on x86-64. This fixes BZ #31867. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Tested-by: Ian Jordan <immoloism@gmail.com> Reviewed-by: Sam James <sam@gentoo.org> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-06-11 00:10:08 -07:00
Joe Damato	bef2a827a5	x86: Enable non-temporal memset tunable for AMD In commit `46b5e98ef6` ("x86: Add seperate non-temporal tunable for memset") a tunable threshold for enabling non-temporal memset was added, but only for Intel hardware. Since that commit, new benchmark results suggest that non-temporal memset is beneficial on AMD as well, so allow this tunable to be set for AMD. See: https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing which has been updated to include data using different strategies for large memset on AMD Zen2, Zen3, and Zen4. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-10 16:18:18 -05:00
Samuel Thibault	74f9ee3b91	hurd: Fix lsetxattr return value The manpage says that lsetxattr returns 0 on success, like setxattr.	2024-06-10 21:56:13 +02:00
Joe Damato	92c270d32c	Linux: Add epoll ioctls As of Linux kernel 6.9, some ioctls and a parameters structure have been introduced which allow user programs to control whether a particular epoll context will busy poll. Update the headers to include these for the convenience of user apps. The ioctls were added in Linux kernel 6.9 commit 18e2bf0edf4dd ("eventpoll: Add epoll ioctl for epoll_params") [1] to include/uapi/linux/eventpoll.h. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/?h=v6.9&id=18e2bf0edf4dd Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-04 12:09:15 -05:00
Szabolcs Nagy	2a9943b4a0	math: Fix exp10 undefined left shift Left shift of ki is undefined when ki<0; copy the logic from exp, which uses unsigned arithmetic, to fix it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-04 15:33:26 +01:00
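The issue named in commit `2a9943b4a0` is a general C rule: left-shifting a negative signed integer is undefined behaviour, while the same shift on an unsigned value is well defined modulo 2^N. A minimal sketch of the unsigned approach follows. The helper name `ex_shift_index` and its parameters are hypothetical, and this is not the code from the exp10 or exp implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shift a possibly negative index left.  */
static uint64_t
ex_shift_index (int64_t ki, unsigned int shift)
{
  assert (shift < 64);
  /* ki << shift would be undefined for ki < 0.  Converting to
     uint64_t first makes the shift well defined (modulo 2^64).  */
  return (uint64_t) ki << shift;
}
```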
Joseph Myers	1d441791cb	Add new AArch64 HWCAP2 definitions from Linux 6.9 to bits/hwcap.h Linux 6.9 adds 15 new HWCAP2_* values for AArch64; add them to bits/hwcap.h in glibc. Tested with build-many-glibcs.py for aarch64-linux-gnu.	2024-06-04 12:25:05 +00:00
Noah Goldstein	46b5e98ef6	x86: Add seperate non-temporal tunable for memset The tuning for non-temporal stores for memset vs memcpy is not always the same. This includes both the exact value and whether non-temporal stores are profitable at all for a given arch. This patch adds `x86_memset_non_temporal_threshold`. Currently we disable non-temporal stores for non-Intel vendors as the only benchmarks showing its benefit have been on Intel hardware. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-30 12:36:09 -05:00
Noah Goldstein	5bf0ab8057	x86: Improve large memset perf with non-temporal stores [RHEL-29312] Previously we used `rep stosb` for all medium/large memsets. This is notably worse than non-temporal stores for large (above a few MBs) memsets. See: https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing For data using different strategies for large memset on ICX and SKX. Using non-temporal stores can be up to 3x faster on ICX and 2x faster on SKX. Historically, these numbers would not have been so good because of the zero-over-zero writeback optimization that `rep stosb` is able to do. But, the zero-over-zero writeback optimization has been removed as a potential side-channel attack, so there is no longer any good reason to only rely on `rep stosb` for large memsets. On the flip side, non-temporal writes can avoid data in their RFO requests saving memory bandwidth. All of the other changes to the file are to re-organize the code-blocks to maintain "good" alignment given the new code added in the `L(stosb_local)` case. The results from running the GLIBC memset benchmarks on TGL-client for N=20 runs: Geometric Mean across the suite New / Old EXEX256: 0.979 Geometric Mean across the suite New / Old EXEX512: 0.979 Geometric Mean across the suite New / Old AVX2 : 0.986 Geometric Mean across the suite New / Old SSE2 : 0.979 Most of the cases are essentially unchanged, this is mostly to show that adding the non-temporal case didn't add any regressions to the other cases. The results on the memset-large benchmark suite on TGL-client for N=20 runs: Geometric Mean across the suite New / Old EXEX256: 0.926 Geometric Mean across the suite New / Old EXEX512: 0.925 Geometric Mean across the suite New / Old AVX2 : 0.928 Geometric Mean across the suite New / Old SSE2 : 0.924 So roughly a 7.5% speedup. 
This is lower than what we see on servers (likely because clients typically have faster single-core bandwidth so saving bandwidth on RFOs is less impactful), but still advantageous. Full test-suite passes on x86_64 w/ and w/o multiarch. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-30 12:36:09 -05:00
Xi Ruoyao	0c1d2c277a	LoongArch: Use "$fcsr0" instead of "$r0" in _FPU_{GET,SET}CW Clang inline-asm parser does not allow using "$r0" in movfcsr2gr/movgr2fcsr, so everything using _FPU_{GET,SET}CW is now failing to build with Clang on LoongArch. As we now require Binutils >= 2.41 which supports using "$fcsr0" here, use it instead of "$r0" to fix the issue. Link: https://github.com/loongson-community/discussions/issues/53#issuecomment-2081507390 Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4142b2368353 Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2024-05-28 09:17:05 +08:00
Xin Wang	e0f7f1808f	x86_64: Reformat elf_machine_rela A space is added before the left bracket of the x86_64 elf_machine_rela function, in order to harmonize with the rest of the implementation of the function and to make it easier to retrieve the function. The lines where the function definition is located have been re-indented, and its left curly bracket has been placed in the correct position. Signed-off-by: Xin Wang <yw987194828@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-27 13:46:45 -07:00
Sunil K Pandey	1b713c9a53	i386: Disable Intel Xeon Phi tests for GCC 15 and above (BZ 31782) This patch disables Intel Xeon Phi tests for GCC 15 and above. GCC 15 removed Intel Xeon Phi ISA support. commit e1a7e2c54d52d0ba374735e285b617af44841ace Author: Haochen Jiang <haochen.jiang@intel.com> Date: Mon May 20 10:43:44 2024 +0800 i386: Remove Xeon Phi ISA support Fixes BZ 31782. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-27 12:28:13 -07:00
H.J. Lu	f981bf6b9d	parse_fdinfo: Don't advance pointer twice [BZ #31798] pidfd_getpid.c has /* Ignore invalid large values. */ if (INT_MULTIPLY_WRAPV (10, n, &n) || INT_ADD_WRAPV (n, *l++ - '0', &n)) return -1; For GCC older than GCC 7, INT_ADD_WRAPV(a, b, r) is defined as _GL_INT_OP_WRAPV (a, b, r, +, _GL_INT_ADD_RANGE_OVERFLOW) and *l++ - '0' is evaluated twice. Fix BZ #31798 by moving "l++" out of the if statement. Tested with GCC 6.4 and GCC 14.1. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-05-27 06:52:45 -07:00
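The hazard behind commit `f981bf6b9d` is the classic one of passing an expression with a side effect to a macro that expands its argument more than once. The sketch below reproduces the effect with a deliberately simplified, hypothetical macro, `EX_ADD_CHECKED`. It is not the gnulib `INT_ADD_WRAPV` definition, and the function names are illustrative only.

```c
#include <assert.h>

/* Hypothetical macro that expands its second argument twice, standing
   in for the fallback definition described in the commit message.  */
#define EX_ADD_CHECKED(a, b, r) \
  (((b) > 1000) ? 1 : (*(r) = (a) + (b), 0))

/* Buggy pattern: the increment inside the argument runs twice.
   Returns how many characters of S were consumed.  */
static int
ex_consumed_buggy (const char *s, int *value)
{
  assert (s[0] != '\0');
  const char *l = s;
  int n = 0;
  (void) EX_ADD_CHECKED (n, *l++ - '0', &n);
  *value = n;
  return (int) (l - s);
}

/* Fixed pattern: advance the pointer once, outside the macro call.  */
static int
ex_consumed_fixed (const char *s, int *value)
{
  assert (s[0] != '\0');
  const char *l = s;
  int n = 0;
  int digit = *l++ - '0';
  (void) EX_ADD_CHECKED (n, digit, &n);
  *value = n;
  return (int) (l - s);
}
```

On the input `"12"`, the buggy variant consumes two characters and stores the second digit, while the fixed variant consumes one character and stores the first.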
H.J. Lu	23c60af6dc	sysdeps/ieee754/ldbl-opt/Makefile: Split and sort libnldbl-calls Put each item on a separate line and sort libnldbl-calls. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-24 10:25:40 -07:00
H.J. Lu	639c143db3	sysdeps/ieee754/ldbl-opt/Makefile: Remove test-nldbl-redirect-static Remove $(objpfx)test-nldbl-redirect-static checked in by accident. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-24 06:36:18 -07:00
H.J. Lu	acfb169b3c	sysdeps/ieee754/ldbl-opt/Makefile: Split and sort tests Put each test on a separate line and sort tests. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-24 06:31:49 -07:00
Stefan Liebler	4af49c60a1	s390x: Regenerate ULPs. Needed due to: "Implement C23 log2p1" commit ID `79c52daf47`	2024-05-24 09:53:49 +02:00
Joseph Myers	84d2762922	Update kernel version to 6.9 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py and tst-mount-consts.py to 6.9. (There are no new constants covered by these tests in 6.9 that need any other header changes; tst-pidfd-consts.py was updated separately along with adding new constants relevant to that test.) Tested with build-many-glibcs.py.	2024-05-23 14:04:48 +00:00
Adhemerval Zanella	eaa8113bf0	math: Provide missing math symbols on libc.a (BZ 31781) The libc.a for alpha, s390, and sparcv9 does not provide copysignf64x, copysignf128, frexpf64x, frexpf128, modff64x, and modff128. Checked with a static build for the affected ABIs. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Adhemerval Zanella	1664bbf238	s390: Make utmp32, utmpx32, and login32 shared only (BZ 31790) The functions that work with 'struct utmp32' and 'struct utmpx32' are only for compat symbols. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Adhemerval Zanella	18dbe27847	microblaze: Remove cacheflush from libc.a (BZ 31788) microblaze does not export it in libc.so, nor does the kernel provide the cacheflush syscall. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Adhemerval Zanella	d8ebde14fb	powerpc: Remove duplicated llrintf and llrintf32 from libm.a (BZ 31787) Both the generic and POWER6 versions provide definitions of the symbol, which are already provided by the ifunc resolver. Checked on powerpc-linux-gnu-power4. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Adhemerval Zanella	5fededd825	powerpc: Remove duplicate strchrnul and strncasecmp_l libc.a (BZ 31786) For powerpc64 the generic version provides a weak definition of strchrnul, which is already provided by the ifunc resolver. The powerpc32 version is slightly different: for the static case there is no IFUNC support. strncasecmp_l is provided by the ifunc resolver. Checked on powerpc-linux-gnu-power4 and powerpc64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Adhemerval Zanella	62eaa46739	loongarch: Remove duplicate strnlen in libc.a (BZ 31785) The generic version provides weak definitions of strnlen, which are already provided by the ifunc resolver. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Adhemerval Zanella	ef9596352b	aarch64: Remove duplicate memchr/strlen in libc.a (BZ 31777) The generic version provides weak definitions of memchr/strlen, which are already provided by the ifunc resolvers. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-23 09:36:08 -03:00
Joseph Myers	e9a37242f9	Update PIDFD_* constants for Linux 6.9 Linux 6.9 adds some more PIDFD_* constants. Add them to glibc's sys/pidfd.h, including updating comments that said FLAGS was reserved and must be 0, along with updating tst-pidfd-consts.py. Tested with build-many-glibcs.py.	2024-05-23 12:22:40 +00:00
H.J. Lu	43d41ae6d7	Don't provide XXXf128_do_not_use aliases [BZ #31757] Don't provide __nexttowardf128_do_not_use, nexttowardf128_do_not_use, finitef128_do_not_use, isinff128_do_not_use and isnanf128_do_not_use. This fixes BZ #31757. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-05-22 06:12:17 -07:00
Adhemerval Zanella	5d4999e519	math: Fix isnanf128 static build (BZ 31774) Some static implementation of float128 routines might call __isnanf128, which is not provided by the static object. Checked on x86_64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-05-21 16:53:27 -03:00
Adhemerval Zanella	1f09aae36a	math: Fix i386 and m68k exp10 on static build (BZ 31775) The commit `08ddd26814` removed the static exp10 on i386 and m68k with an empty w_exp10.c (required for the ABIs that use the new implementation). This patch fixes this by adding the required symbols on the arch-specific w_exp{f}_compat.c implementation. Checked on i686-linux-gnu and with a build for m68k-linux-gnu. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>	2024-05-21 13:44:22 -03:00
Adhemerval Zanella	0b716305df	math: Fix i386 and m68k fmod/fmodf on static build (BZ 31488) The commit `16439f419b` removed the static fmod/fmodf on i386 and m68k with an empty w_fmod.c (required for the ABIs that use the new implementation). This patch fixes this by adding the required symbols on the arch-specific w_fmod{f}_compat.c implementation. Statically building fmod fails on some ABIs (alpha, s390, sparc) because they do not export ldexpf128; this is also fixed by this patch. Checked on i686-linux-gnu and with a build for m68k-linux-gnu. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Aurelien Jarno <aurelien@aurel32.net>	2024-05-21 13:43:39 -03:00
H.J. Lu	437c94e04b	Remove the clone3 symbol from libc.a [BZ #31770] clone3 isn't exported from glibc and is hidden in libc.so. Fix BZ #31770 by removing clone3 alias. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-05-21 07:05:08 -07:00
Joe Ramsay	0fed0b250f	aarch64/fpu: Add vector variants of pow Plus a small amount of moving includes around in order to be able to remove duplicate definition of asuint64. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-05-21 14:38:49 +01:00
caiyinyu	3c1e22372d	LoongArch: Update ulps For the log2p1 implementation.	2024-05-21 12:08:25 +08:00
mengqinggang	16d47c1594	LoongArch: Fix tst-gnu2-tls2 compiler error Add -mno-lsx to tst-gnu2-tlsmod*.c if gcc supports -mno-lsx. Add escape character '\' in vector support test function.	2024-05-21 11:23:03 +08:00
H.J. Lu	8428278b5f	i386: Don't define stpncpy alias when used in IFUNC [BZ #31768] Fix BZ #31768 by not defining stpncpy alias when used in IFUNC. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-05-20 19:35:00 -07:00
Adhemerval Zanella	f83e461f10	powerpc: Update ulps For the log2p1 implementation.	2024-05-20 13:12:23 -03:00
Adhemerval Zanella	32b2aa59da	arm: Update ulps For the log2p1 implementation.	2024-05-20 13:12:23 -03:00
Adhemerval Zanella	241338bd6f	aarch64: Update ulps For the log2p1 implementation.	2024-05-20 13:12:23 -03:00
Joseph Myers	79c52daf47	Implement C23 log2p1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log2p1 functions (log2(1+x): like log1p, but for base-2 logarithms). This illustrates the intended structure of implementations of all these function families: define them initially with a type-generic template implementation. If someone wishes to add type-specific implementations, it is likely such implementations can be both faster and more accurate than the type-generic one and can then override it for types for which they are implemented (adding benchmarks would be desirable in such cases to demonstrate that a new implementation is indeed faster). The test inputs are copied from those for log1p. Note that these changes make gen-auto-libm-tests depend on MPFR 4.2 (or later). The bulk of the changes are fairly generic for any such new function. (sysdeps/powerpc/nofpu/Makefile only needs changing for those type-generic templates that use fabs.) Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-05-20 13:41:39 +00:00
Joseph Myers	cf0ca8d52e	Update syscall lists for Linux 6.9 Linux 6.9 has no new syscalls. Update the version number in syscall-names.list to reflect that it is still current for 6.9. Tested with build-many-glibcs.py.	2024-05-20 13:10:31 +00:00
H.J. Lu	7935e7a537	Rename procutils_read_file to __libc_procutils_read_file [BZ #31755] Fix BZ #31755 by renaming the internal function procutils_read_file to __libc_procutils_read_file. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-05-20 05:22:43 -07:00
H.J. Lu	4e21cb95e2	nearbyint: Don't define alias when used in IFUNC [BZ #31759] Fix BZ #31759 by not defining nearbyint aliases when used in IFUNC. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-05-20 05:21:41 -07:00
Florian Weimer	8d7b6b4cb2	socket: Use may_alias on sockaddr structs (bug 19622) This supports common coding patterns. The GCC C front end before version 7 rejects the may_alias attribute on a struct definition if it was not present in a previous forward declaration, so this attribute can only be conditionally applied. This implements the spirit of the change in Austin Group issue 1641. Suggested-by: Marek Polacek <polacek@redhat.com> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Sam James <sam@gentoo.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-05-18 09:33:19 +02:00
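The "common coding patterns" that commit `8d7b6b4cb2` refers to typically involve reading a protocol-specific address object through a pointer to a generic address type. A minimal sketch of the attribute's effect follows, assuming GCC or Clang attribute syntax. The `ex_` types are hypothetical stand-ins and are not the glibc `sockaddr` definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic address type, marked may_alias so that accesses
   through it are exempt from type-based alias analysis.  */
struct ex_sockaddr
{
  unsigned short ex_family;
  char ex_data[14];
} __attribute__ ((may_alias));

/* Hypothetical protocol-specific type with the same leading member.  */
struct ex_sockaddr_in
{
  unsigned short ex_family;
  unsigned short ex_port;
  unsigned char ex_addr[4];
  unsigned char ex_pad[8];
};

/* Read the family through the generic type, as socket code often does.  */
static unsigned short
ex_get_family (const void *addr)
{
  assert (addr != NULL);
  return ((const struct ex_sockaddr *) addr)->ex_family;
}
```

Without the attribute, a compiler is entitled to assume that an lvalue of the generic struct type does not refer to an object of the specific struct type, which is why the pattern benefits from the annotation.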
Manjunath Matti	a81cdde1cb	powerpc64: Fix by using the configure value $libc_cv_cc_submachine [BZ #31629] This patch ensures that $libc_cv_cc_submachine, which is set from "--with-cpu", overrides $CFLAGS for configure time tests. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-05-16 17:31:45 -05:00
Joe Ramsay	75207bde68	aarch64/fpu: Add vector variants of cbrt Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-05-16 14:35:06 +01:00
Joe Ramsay	157f89fa3d	aarch64/fpu: Add vector variants of hypot Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-05-16 14:34:43 +01:00
mengqinggang	1dbf2bef79	LoongArch: Add support for TLS Descriptors This is mostly based on AArch64 and RISC-V implementation. Add R_LARCH_TLS_DESC32 and R_LARCH_TLS_DESC64 relocations. For _dl_tlsdesc_dynamic function slow path, temporarily save and restore all vector registers.	2024-05-15 10:31:53 +08:00
Joe Ramsay	90a6ca8b28	aarch64: Fix AdvSIMD libmvec routines for big-endian Previously many routines used * to load from vector types stored in the data table. This is emitted as ldr, which byte-swaps the entire vector register, and causes bugs for big-endian when not all lanes contain the same value. When a vector is to be used this way, it has been replaced with an array and the load with an explicit ld1 intrinsic, which byte-swaps only within lanes. As well, many routines previously used non-standard GCC syntax for vector operations such as indexing into vectors types with [] and assembling vectors using {}. This syntax should not be mixed with ACLE, as the former does not respect endianness whereas the latter does. Such examples have been replaced with, for instance, vcombine_* and vgetq_lane* intrinsics. Helpers which only use the GCC syntax, such as the v_call helpers, do not need changing as they do not use intrinsics. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-05-14 13:10:33 +01:00
Adhemerval Zanella	ae515ba530	powerpc: Fix __fesetround_inline_nocheck on POWER9+ (BZ 31682) The `e68b1151f7` commit changed the __fesetround_inline_nocheck implementation to use mffscrni (through __fe_mffscrn) instead of mtfsfi. For generic powerpc ceil/floor/trunc, the function is supposed to disable the floating-point inexact exception enable bit; however, mffscrni does not change any exception enable bits. This patch fixes this by reverting the optimization for __fesetround_inline_nocheck. Checked on powerpc-linux-gnu. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2024-05-09 08:59:30 -03:00
Gabi Falk	dd5f891c1a	x86_64: Fix missing wcsncat function definition without multiarch (x86-64-v4) This code expects the WCSCAT preprocessor macro to be predefined in case the evex implementation of the function should be defined with a name different from __wcsncat_evex. However, when glibc is built for x86-64-v4 without multiarch support, sysdeps/x86_64/wcsncat.S defines WCSNCAT variable instead of WCSCAT to build it as wcsncat. Rename the variable to WCSNCAT, as it is actually a better naming choice for the variable in this case. Reported-by: Kenton Groombridge Link: https://bugs.gentoo.org/921945 Fixes: `64b8b6516b` ("x86: Add evex optimized functions for the wchar_t strcpy family") Signed-off-by: Gabi Falk <gabifalk@gmx.com> Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-05-08 07:37:59 -07:00
Adhemerval Zanella	1e1ad714ee	support: Add envp argument to support_capture_subprogram So tests can specify a list of environment variables. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2024-05-07 12:16:36 -03:00

16417 Commits