glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-24 22:10:13 +00:00

Author	SHA1	Message	Date
Andreas K. Hüttel	ab6045728f	math: Update m68k ULPs This hasn't been looked at for a loong time (already guessing from the number of missing entries), and it ain't pretty. There are some 9-ulps results for float. - ZaZaZebra (qemu-system-m68k clone of PowerBook 190 system) - GCC 13.3.1 20240614 (Gentoo 13.3.1_p20240614 p17) - ld GNU ld (Gentoo 2.42 p6) 2.42.0 - Linux ZaZaZebra 4.19.0-5-m68k #1 Gentoo 4.19.37-5 (2019-06-19) m68k 68040 68040 GNU/Linux - manual build - ../glibc/configure --enable-fortify-source --prefix=/usr - Tested by Immolo (via Andreas K. Hüttel) Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-08 21:51:03 +02:00
Adhemerval Zanella	9fc639f654	elf: Make dl-rseq-symbols Linux only And avoid a Hurd build failures. Checked on x86_64-linux-gnu.	2024-07-04 10:09:07 -03:00
Michael Jeanson	2b92982e23	nptl: fix potential merge of __rseq_* relro symbols While working on a patch to add support for the extensible rseq ABI, we came across an issue where a new 'const' variable would be merged with the existing '__rseq_size' variable. We tracked this to the use of '-fmerge-all-constants' which allows the compiler to merge identical constant variables. This means that all 'const' variables in a compile unit that are of the same size and are initialized to the same value can be merged. In this specific case, on 32 bit systems 'unsigned int' and 'ptrdiff_t' are both 4 bytes and initialized to 0 which should trigger the merge. However for reasons we haven't delved into when the attribute 'section (".data.rel.ro")' is added to the mix, only variables of the same exact types are merged. As far as we know this behavior is not specified anywhere and could change with a new compiler version, hence this patch. Move the definitions of these variables into an assembler file and add hidden writable aliases for internal use. This has the added bonus of removing the asm workaround to set the values on rseq registration. Tested on Debian 12 with GCC 12.2. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-03 21:40:30 +02:00
Darius Rad	b85a23d736	riscv: Update nofpu libm test ulps Fixes 32 test failures.	2024-07-03 21:05:34 +02:00
John David Anglin	4737e6a7a3	hppa/vdso: Provide 64-bit clock_gettime() vDSO only Adhemerval noticed that the gettimeofday() and 32-bit clock_gettime() vDSO calls won't be used by glibc on hppa, so there is no need to declare them. Both syscalls will be emulated by utilizing return values of the 64-bit clock_gettime() vDSO instead. Signed-off-by: Helge Deller <deller@gmx.de> Suggested-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>	2024-07-02 16:26:32 -04:00
YunQiang Su	9d0e9c8a13	MIPSr6/math: Use builtin fma and fmaf MIPSr6 has MADDF.s/MADDF.d instructions, which are fused. In MIPS ISA, double support can be subsetted. Only FMAF is enabled for this case. * sysdeps/mips/fpu/math-use-builtins-fma.h Signed-off-by: YunQiang Su <syq@gcc.gnu.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-07-01 14:52:30 -03:00
Florian Weimer	018f0fc3b8	elf: Support recursive use of dynamic TLS in interposed malloc It turns out that quite a few applications use bundled mallocs that have been built to use global-dynamic TLS (instead of the recommended initial-exec TLS). The previous workaround from commit `afe42e935b` ("elf: Avoid some free (NULL) calls in _dl_update_slotinfo") does not fix all encountered cases unfortunatelly. This change avoids the TLS generation update for recursive use of TLS from a malloc that was called during a TLS update. This is possible because an interposed malloc has a fixed module ID and TLS slot. (It cannot be unloaded.) If an initially-loaded module ID is encountered in __tls_get_addr and the dynamic linker is already in the middle of a TLS update, use the outdated DTV, thus avoiding another call into malloc. It's still necessary to update the DTV to the most recent generation, to get out of the slow path, which is why the check for recursion is needed. The bookkeeping is done using a global counter instead of per-thread flag because TLS access in the dynamic linker is tricky. All this will go away once the dynamic linker stops using malloc for TLS, likely as part of a change that pre-allocates all TLS during pthread_create/dlopen. Fixes commit `d2123d6827` ("elf: Fix slow tls access after dlopen [BZ #19924]"). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-07-01 19:02:11 +02:00
MayShao-oc	9dc645cb56	x86: Set default non_temporal_threshold for Zhaoxin processors Current 'non_temporal_threshold' set to 'non_temporal_threshold_lowbound' on Zhaoxin processors without ERMS. The default 'non_temporal_threshold_lowbound' is too small for the KH-40000 and KX-7000 Zhaoxin processors, this patch updates the value to 'shared / cachesize_non_temporal_divisor'. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-30 06:26:43 -07:00
MayShao-oc	c19457aec6	x86_64: Optimize large size copy in memmove-ssse3 This patch optimizes large size copy using normal store when src > dst and overlap. Make it the same as the logic in memmove-vec-unaligned-erms.S. Current memmove-ssse3 use '__x86_shared_cache_size_half' as the non- temporal threshold, this patch updates that value to '__x86_shared_non_temporal_threshold'. Currently, the __x86_shared_non_temporal_threshold is cpu-specific, and different CPUs will have different values based on the related nt-benchmark results. However, in memmove-ssse3, the nontemporal threshold uses '__x86_shared_cache_size_half', which sounds unreasonable. The performance is not changed drastically although shows overall improvements without any major regressions or gains. Results on Zhaoxin KX-7000: bench-memcpy geometric_mean(N=20) New / Original: 0.999 bench-memcpy-random geometric_mean(N=20) New / Original: 0.999 bench-memcpy-large geometric_mean(N=20) New / Original: 0.978 bench-memmove geometric_mean(N=20) New / Original: 1.000 bench-memmmove-large geometric_mean(N=20) New / Original: 0.962 Results on Intel Core i5-6600K: bench-memcpy geometric_mean(N=20) New / Original: 1.001 bench-memcpy-random geometric_mean(N=20) New / Original: 0.999 bench-memcpy-large geometric_mean(N=20) New / Original: 1.001 bench-memmove geometric_mean(N=20) New / Original: 0.995 bench-memmmove-large geometric_mean(N=20) New / Original: 0.936 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-30 06:26:43 -07:00
MayShao-oc	44d757eb9f	x86: Set preferred CPU features on the KH-40000 and KX-7000 Zhaoxin processors Fix code formatting under the Zhaoxin branch and add comments for different Zhaoxin models. Unaligned AVX load are slower on KH-40000 and KX-7000, so disable the AVX_Fast_Unaligned_Load. Enable Prefer_No_VZEROUPPER and Fast_Unaligned_Load features to use sse2_unaligned version of memset,strcpy and strcat. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-06-30 06:26:43 -07:00
Andrew Pinski	2f1f7a5f8a	Aarch64: Add new memset for Qualcomm's oryon-1 core Qualcom's new core, oryon-1, has a different characteristics for memset than the current versions of memset. For non-zero, larger sizes, using GPRs rather than the SIMD stores is ~30% faster. For even larger sizes, using the nontemporal stores is needed not to polute the L1/L2 caches. For zero values, using `dc zva` should be used. Since we know the size will always be 64 bytes, we don't need to figure out the size there. I started with the emag memset and added back the `dc zva` code. Changes since v1: * v3: Fix comment formating Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-30 13:47:17 +02:00
Andrew Pinski	4dc83cac78	Aarch64: Add memcpy for qualcomm's oryon-1 core Qualcomm's new core (oryon-1) has a different performance characteristic than other cores. For memcpy, it is faster to use the GPRs to do the copy for large sizes (2x faster). For even larger sizes, it is better to use the nontemporal load/store instructions so we don't pollute the L1/L2 caches. For smaller sizes, the characteristic are very similar to other cores. I used the thunderx memcpy as a starting point and expanded from there. Changes since v1: * v2: Fix ordering in Makefile. * v3: Fix comment grammar about the ldnp/stnp instructions. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-30 13:46:33 +02:00
Palmer Dabbelt	07fe71f59b	arm: Avoid UB in elf_machine_rel() This recently came up during a cleanup to remove misaligned accesses from the RISC-V port. Link: https://sourceware.org/pipermail/libc-alpha/2022-June/139961.html Suggested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Fangrui Song <maskray@google.com>	2024-06-26 12:45:43 +02:00
mengqinggang	a10b6ad471	LoongArch: Fix tst-gnu2-tls2 test case asm volatile ("movfcsr2gr $t0, $fcsr0" ::: "$t0"); asm volatile ("st.d $t0, %0" :"=m"(restore_fcsr)); generate to the following instructions with -Og flag: movfcsr2gr $t0, $zero addi.d $t0, $sp, 2047(0x7ff) addi.d $t0, $t0, 77(0x4d) st.w $t0, $t0, 0 fcsr0 register and restore_fcsr variable are both stored in t0 register. Change to: asm volatile ("movfcsr2gr %0, $fcsr0" :"=r"(restore_fcsr)); to avoid restore_fcsr address in t0. Comparing float value using memcmp because float value cannot be directly compared for equality. Put LOAD_REGISTER_FCSR and SAVE_REGISTER_FCC after LOAD_REGISTER_FLOAT. Some float instructions may change fcsr register.	2024-06-26 12:02:07 +08:00
Adhemerval Zanella	c90cfce849	posix: Fix pidfd_spawn/pidfd_spawnp leak if execve fails (BZ 31695) If the pidfd_spawn/pidfd_spawnp helper process succeeds, but evecve fails for some reason (either with an invalid/non-existent, memory allocation, etc.) the resulting pidfd is never closed, nor returned to caller (so it can call close). Since the process creation failed, it should be up to posix_spawn to also, close the file descriptor in this case (similar to what it does to reap the process). This patch also changes the waitpid with waitid (P_PIDFD) for pidfd case, to avoid a possible pid re-use. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-06-25 12:11:48 -03:00
Andreas K. Hüttel	d32c342425	Revert "MIPSr6/math: Use builtin fma and fmaf" Apologies, I mistakenly interpreted this to be already accepted. Reverting until v6 or later is reviewed and approved. This reverts commit `9e06e4a43b`.	2024-06-25 01:02:58 +02:00
Christoph Müllner	81c7f6193c	RISC-V: Execute a PAUSE hint in spin loops The atomic_spin_nop() macro can be used to run arch-specific code in the body of a spin loop to potentially improve efficiency. RISC-V's Zihintpause extension includes a PAUSE instruction for this use-case, which is encoded as a HINT, which means that it behaves like a NOP on systems that don't implement Zihintpause. Binutils supports Zihintpause since 2.36, so this patch uses the ".insn" directive to keep the code compatible with older toolchains. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-06-24 21:36:49 +02:00
YunQiang Su	9e06e4a43b	MIPSr6/math: Use builtin fma and fmaf MIPSr6 has MADDF.s/MADDF.d instructions, which are fused. In MIPS ISA, double support can be subsetted. Only FMAF is enabled for this case. * sysdeps/mips/fpu/math-use-builtins-fma.h Signed-off-by: YunQiang Su <syq@gcc.gnu.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2024-06-24 19:43:57 +02:00
John David Anglin	aecde502e9	hppa/vdso: Add wrappers for vDSO functions The upcoming parisc (hppa) v6.11 Linux kernel will include vDSO support for gettimeofday(), clock_gettime() and clock_gettime64() syscalls for 32- and 64-bit userspace. The patch below adds the necessary glue code for glibc. Signed-off-by: Helge Deller <deller@gmx.de> Changes in v2: - add vsyscalls for 64-bit too	2024-06-23 19:39:28 -04:00
John David Anglin	9dddb26954	Update hppa libm-test-ulps	2024-06-23 13:51:25 -04:00
John David Anglin	da61ba3f89	Update hppa libm-test-ulps	2024-06-20 19:44:04 -04:00
Julian Zhu	9f2bf0e23a	RISC-V: Update ulps For the exp10m1, exp2m1, log10p1 and log2p1 implementations. Signed-off-by: Julian Zhu <jz531210@gmail.com>	2024-06-20 23:46:32 +02:00
Julian Zhu	cb20e7c7cc	MIPS: Update ulps Update mips32/mips64 ulps for the exp10m1, exp2m1, and log10p1 implementations. Signed-off-by: Julian Zhu <jz531210@gmail.com>	2024-06-20 23:45:24 +02:00
Florian Weimer	b375e597da	i386: Update ulps This is from a -march=i686 -mtune=generic build with --disable-multi-arch, running on a Cascade Lake CPU.	2024-06-20 19:00:48 +02:00
Florian Weimer	362588f7cc	s390x: Capture grep output in static PIE check The test is not a run-time check, so update the description. Also use readelf -W for a more stable output format and fix an LC_ALL typo. This avoids garbled configure messages: checking for s390-specific static PIE requirements (runtime check)... 0x0000000000000017 (JMPREL) 0x280 yes	2024-06-20 14:34:06 +02:00
Florian Weimer	71dafdf5f1	powerpc: Update ulps Results based on POWER8 and POWER9 machines running powerpc64-linux-gnu, with and without --disable-multi-arch.	2024-06-20 12:15:31 +02:00
Florian Weimer	3cb77b7d1e	i386: Update ulps Based on a -march=x86-64-v4 -mfpmath=sse build, with and without --disable-multi-arch, running on a Zen 4 CPU. Also used different -march=x8i6-64-v… settings.	2024-06-20 12:15:09 +02:00
Xi Ruoyao	9405d54c62	LoongArch: Update ulps Add ulps for recently added C23 exp10m1, exp2m1, and log10p1 functions. Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2024-06-19 21:17:19 +02:00
Andreas K. Hüttel	4f1cf0c0e1	sparc: Regenerate ULPs Linux catbus 5.15.110-gentoo-r1 #1 SMP Fri Jun 9 17:53:23 PDT 2023 sparc64 sun4v UltraSparc T5 (Niagara5) GNU/Linux Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-06-19 14:58:32 +02:00
Stefan Liebler	19f6d6a480	s390x: Regenerate ULPs. Needed due to: - "Implement C23 log10p1" commit ID `55eb99e9a9` - "Implement C23 exp2m1, exp10m1" commit ID `7ec903e028`	2024-06-19 08:42:30 +02:00
mengqinggang	9a675d998e	LoongArch: Fix _dl_tlsdesc_dynamic in LSX case HWCAP value is overwritten at the first comparison of the LASX case. The second comparison at LSX get incorrect result. Change to use t0 to save HWCAP value, and use t1 to save comparison result.	2024-06-19 10:06:41 +08:00
Adhemerval Zanella	92341e3150	arm: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Adhemerval Zanella	45f5f51b85	aarch64: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Adhemerval Zanella	52b397bafa	powerpc: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Florian Weimer	f6ea5d1291	Linux: Include <dl-symbol-redir-ifunc.h> in dl-sysdep.c The _dl_sysdep_parse_arguments function contains initalization of a large on-stack variable: dl_parse_auxv_t auxv_values = { 0, }; This uses a non-inline version of memset on powerpc64le-linux-gnu, so it must use the baseline memset.	2024-06-18 10:56:34 +02:00
Carlos Llamas	176671f604	linux: add definitions for hugetlb page size encodings A desired hugetlb page size can be encoded in the flags parameter of system calls such as mmap() and shmget(). The Linux UAPI headers have included explicit definitions for these encodings since v4.14. This patch adds these definitions that are used along with MAP_HUGETLB and SHM_HUGETLB flags as specified in the corresponding man pages. This relieves programs from having to duplicate and/or compute the encodings manually. Additionally, the filter on these definitions in tst-mman-consts.py is removed, as suggested by Florian. I then ran this tests successfully, confirming the alignment with the kernel headers. PASS: misc/tst-mman-consts original exit status 0 Signed-off-by: Carlos Llamas <cmllamas@google.com> Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-06-18 10:56:34 +02:00
Stefan Liebler	e260ceb4aa	elf: Remove HWCAP_IMPORTANT Remove the definitions of HWCAP_IMPORTANT after removal of LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask. There HWCAP_IMPORTANT was used as default value. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	ad0aa1f549	elf: Remove LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask Remove the environment variable LD_HWCAP_MASK and the tunable glibc.cpu.hwcap_mask as those are not used anymore in common-code after removal in elf/dl-cache.c:search_cache(). The only remaining user is sparc32 where it is used in elf_machine_matches_host(). If sparc32 does not need it anymore, we can get rid of it at all. Otherwise we could also move LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask to be sparc32 specific. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	343439a31e	elf: Remove _DL_PLATFORMS_COUNT Remove the definitions of _DL_PLATFORMS_COUNT as those are not used anymore after removal in elf/dl-cache.c:search_cache(). Note: On x86, we can also get rid of the definitions HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	43c7c5e62d	elf: Remove _DL_FIRST_PLATFORM Remove the definitions of _DL_FIRST_PLATFORM as those were only used in the _DL_HWCAP_PLATFORM definitions and in _dl_string_platform(). Both were removed. Note: Removed on every architecture despite of powerpc, where _dl_string_platform() is still used. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	ed23449dac	elf: Remove _DL_HWCAP_PLATFORM Remove the definitions of _DL_HWCAP_PLATFORM as those are not used anymore after removal in elf/dl-cache.c:search_cache(). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	374c8b4483	elf: Remove platform strings in dl-procinfo.c Remove the platform strings in dl-procinfo.c where also the implementation of _dl_string_platform() was removed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	8faada8302	elf: Remove _dl_string_platform Despite of powerpc where the returned integer is stored in tcb, and the diagnostics output, there is no user anymore. Thus this patch removes the diagnostics output and _dl_string_platform for all other platforms. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
Stefan Liebler	f14b6dfc87	x86: Remove HWCAP_START and HWCAP_COUNT Both defines are not used anymore. Those were only used for _dl_string_hwcap(), which itself was removed with commit `ab40f20364` "elf: Remove _dl_string_hwcap" Just clean up. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-06-18 10:45:36 +02:00
YunQiang Su	eaf4fc516a	math: Update mips32/mips64 ulps for log2p1	2024-06-17 21:45:53 +02:00
Andreas K. Hüttel	98ffc1bfeb	Convert to autoconf 2.72 (vanilla release, no distribution patches) As discussed at the patch review meeting Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org> Reviewed-by: Simon Chopin <simon.chopin@canonical.com>	2024-06-17 21:15:28 +02:00
Joseph Myers	7ec903e028	Implement C23 exp2m1, exp10m1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and exp10(x)-1, like expm1). As with other such functions, these use type-generic templates that could be replaced with faster and more accurate type-specific implementations in future. Test inputs are copied from those for expm1, plus some additions close to the overflow threshold (copied from exp2 and exp10) and also some near the underflow threshold. exp2m1 has the unusual property of having an input (M_MAX_EXP) where whether the function overflows (under IEEE semantics) depends on the rounding mode. Although these could reasonably be XFAILed in the testsuite (as we do in some cases for arguments very close to a function's overflow threshold when an error of a few ulps in the implementation can result in the implementation not agreeing with an ideal one on whether overflow takes place - the testsuite isn't smart enough to handle this automatically), since these functions aren't required to be correctly rounding, I made the implementation check for and handle this case specially. The Makefile ordering expected by lint-makefiles for the new functions is a bit peculiar, but I implemented it in this patch so that the test passes; I don't know why log2 also needed moving in one Makefile variable setting when it didn't in my previous patches, but the failure showed a different place was expected for that function as well. The powerpc64le IFUNC setup seems not to be as self-contained as one might hope; it shouldn't be necessary to add IFUNCs for new functions such as these simply to get them building, but without setting up IFUNCs for the new functions, there were undefined references to __GI___expm1f128 (that IFUNC machinery results in no such function being defined, but doesn't stop include/math.h from doing the redirection resulting in the exp2m1f128 and exp10m1f128 implementations expecting to call it). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 16:31:49 +00:00
Joseph Myers	55eb99e9a9	Implement C23 log10p1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for base-10 logarithms). This is directly analogous to the log2p1 implementation (except that whereas log2p1 has a smaller underflow range than log1p, log10p1 has a larger underflow range). The test inputs are copied from those for log1p and log2p1, plus a few more inputs in that wider underflow range. Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:48:13 +00:00
Joseph Myers	bb014f50c4	Implement C23 logp1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch does mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are not updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:47:09 +00:00
Noah Goldstein	5b54a33435	x86: Fix value for `x86_memset_non_temporal_threshold` when it is undesirable When we don't want to use non-temporal stores for memset, we set `x86_memset_non_temporal_threshold` to SIZE_MAX. The current code, however, we using `maximum_non_temporal_threshold` as the upper bound which is `SIZE_MAX >> 4` so we ended up with a value of `0`. Fix is to just use `SIZE_MAX` as the upper bound for when setting the tunable. Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-06-14 17:25:05 -05:00

1 2 3 4 5 ...

16271 Commits