glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-25 14:30:06 +00:00

Author	SHA1	Message	Date
Adhemerval Zanella	86f06282cc	Update PIDFD_* constants for Linux 6.11 Linux 6.11 adds some more PIDFD_* constants for 'pidfs: allow retrieval of namespace file descriptors' (5b08bd408534bfb3a7cf5778da5b27d4e4fffe12). Tested with build-many-glibcs.py. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:51 -03:00
Adhemerval Zanella	02de16df48	Update syscall lists for Linux 6.11 Linux 6.11 changes for syscall are: * fstat/newfstatat for loongarch (it should be safe to add since `255dc1e4ed`, which undefines them). * clone3 for nios2, which only adds the entry point but defines __ARCH_BROKEN_SYS_CLONE3 (the syscall will always return ENOSYS). * uretprobe for x86_64 and x32. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:49 -03:00
Adhemerval Zanella	d17e5d5f6e	Use Linux 6.11 in build-many-glibcs.py Tested with build-many-glibcs.py (host-libraries, compilers and glibcs builds). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:47 -03:00
Joseph Myers	0e8738a48c	Fix header guard in sysdeps/mach/hurd/x86_64/vm_param.h GCC mainline produces a -Wheader-guard error building for x86_64-gnu. Fix what seems to be incorrect macro naming in the #ifndef conditional. Tested with build-many-glibcs.py for x86_64-gnu (GCC mainline). Message-ID: <fd800046-5ecb-ebd5-4df1-29d4eb3d5433@redhat.com>	2024-10-09 19:16:53 +02:00
DJ Delorie	1895a35e70	rt: more clock_nanosleep tests addendum Forgot to change the first-line description.	2024-10-08 14:30:21 -04:00
DJ Delorie	cfb35f5f7f	rt: more clock_nanosleep tests Test that clock_nanosleep rejects out of range time values. Test that clock_nanosleep actually sleeps for at least the requested time relative to the requested clock. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-10-08 14:27:55 -04:00
Adhemerval Zanella	d40ac01cbb	stdlib: Make abort/_Exit AS-safe (BZ 26275) The recursive lock used on abort does not synchronize with a new process creation (either by fork-like interfaces or posix_spawn ones), nor is it reinitialized after fork(). Also, the SIGABRT unblock before raise() shows another race condition, where a fork or posix_spawn() call by another thread, just after the recursive lock release and before the SIGABRT signal, might create programs with a non-expected signal mask. With the default option (without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for SIGABRT, where it should be SIG_IGN. To fix the AS-safety issue, raise() does not change the process signal mask, and an AS-safe lock is used if a SIGABRT handler is installed or the signal is blocked or ignored. With the signal mask change removal, there is no need to use a recursive lock. The lock is also taken on both _Fork() and posix_spawn(), so that the spawned process does not see the abort handler as SIG_DFL. A read-write lock is used to avoid serializing _Fork and posix_spawn execution. Both sigaction (SIGABRT) and abort() take the lock as writer (since both change the disposition). The fallback is also simplified: there is no need to use a loop of ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the process, the system is broken). The proposed fix changes how setjmp works on a SIGABRT handler, where glibc does not save the signal mask. So usage like the below will now always abort. static volatile int chk_fail_ok; static jmp_buf chk_fail_buf; static void handler (int sig) { if (chk_fail_ok) { chk_fail_ok = 0; longjmp (chk_fail_buf, 1); } else _exit (127); } [...] signal (SIGABRT, handler); [....] chk_fail_ok = 1; if (! setjmp (chk_fail_buf)) { // Something that can call abort, like a failed fortify function. chk_fail_ok = 0; printf ("FAIL\n"); } Such cases will need to use sigsetjmp instead. The _dl_start_profile calls sigaction through _profil, and to avoid pulling abort() into the loader the call is replaced with __libc_sigaction. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-10-08 14:40:12 -03:00
Adhemerval Zanella	55d33108c7	linux: Use GLRO(dl_vdso_time) on time The BZ#24967 fix (`1bdda52fe9`) missed the time for architectures that define USE_IFUNC_TIME. Although it is not an issue, since there is no pointer mangling, there is also no need to call dl_vdso_vsym since the vDSO setup was already done by the loader. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-10-08 13:28:21 -03:00
Adhemerval Zanella	02b195d30f	linux: Use GLRO(dl_vdso_gettimeofday) on gettimeofday The BZ#24967 fix (`1bdda52fe9`) missed the gettimeofday for architectures that define USE_IFUNC_GETTIMEOFDAY. Although it is not an issue, since there is no pointer mangling, there is also no need to call dl_vdso_vsym since the vDSO setup was already done by the loader. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-10-08 13:28:21 -03:00
Stefan Liebler	7949f552cb	S390: Don't use r11 for cu-instructions as used as frame-pointer. [BZ# 32192] Building the s390 specific iconv modules - utf16-utf32-z9.c, utf8-utf32-z9.c and utf8-utf16-z9.c - with -fno-omit-frame-pointer leads to a build error "error: %r11 cannot be used in 'asm' here" as r11 is needed as frame-pointer. The cuXY-instructions need two even-odd register pairs. Therefore the register pinning is used. This patch just uses a different register pair. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-10-08 10:13:02 +02:00
H.J. Lu	ced745bcd3	stdio-common/Makefile: Fix FAIL: lint-makefiles Fix stdio-common/Makefile: @@ -224,12 +224,12 @@ tst-freopen4 \ tst-freopen5 \ tst-freopen6 \ + tst-freopen7 \ tst-freopen64-2 \ tst-freopen64-3 \ tst-freopen64-4 \ tst-freopen64-6 \ tst-freopen64-7 \ - tst-freopen7 \ tst-fseek \ tst-fwrite \ tst-fwrite-memstrm \ Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-08 08:46:45 +08:00
Carlos O'Donell	cae9944a6c	Fix whitespace related license issues. Several copies of the licenses in files contained whitespace related problems. Two cases are addressed here, the first is two spaces after a period which appears between "PURPOSE." and "See". The other is a space after the last forward slash in the URL. Both issues are corrected and the licenses now match the official textual description of the license (and the other license in the sources). Since these whitespaces changes do not alter the paragraph structure of the license, nor create new sentences, they do not change the license.	2024-10-07 18:08:16 -04:00
Joseph Myers	42c810c2cf	Add freopen special-case tests: thread cancellation Add tests of freopen adding or removing "c" (non-cancelling I/O) from the mode string (so completing my planned tests of freopen with different features used in the mode strings). Note that it's in the nature of the uncertain time at which cancellation might act (possibly during freopen, possibly during subsequent reads) that these can leak memory or file descriptors, so these do not include leak tests. Tested for x86_64.	2024-10-07 19:44:25 +00:00
Bruno Haible	e67f8e6dbd	hurd: Add missing va_end call in fcntl implementation. [BZ #32234 ] * sysdeps/mach/hurd/fcntl.c (__libc_fcntl): Add va_end call in two code paths.	2024-10-03 20:18:29 +02:00
Andreas Schwab	a36814e145	riscv: align .preinit_array (bug 32228) The section contains an array of pointers, so it should be aligned to pointer size.	2024-10-02 13:04:30 +02:00
Adhemerval Zanella	5e8cfc5d62	linux: sparc: Fix clone for LEON/sparcv8 (BZ 31394) The sparc clone mitigation (`faeaa3bc9f`) added the use of flushw, which is not supported by LEON/sparcv8. As discussed on libc-alpha, 'ta 3' is a working alternative [1]. [1] https://sourceware.org/pipermail/libc-alpha/2024-August/158905.html Checked with a build for sparcv8-linux-gnu targeting leon. Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-10-01 10:37:21 -03:00
Adhemerval Zanella	49c3682ce1	linux: sparc: Fix syscall_cancel for LEON LEON2/LEON3 are both sparcv8, which does not support branch hints (bne,pn) nor the return instruction. Checked with a build for sparcv8-linux-gnu targeting leon. I also checked some cancellation tests with qemu-system (targeting LEON3). Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-10-01 10:37:21 -03:00
Wilco Dijkstra	44fa9c1080	math: Improve layout of expf data GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch changes the exp2f_data struct slightly so that the fields are better aligned. As a result on targets that support them, load-pair instructions accessing poly_scaled and invln2_scaled are now 16-byte aligned. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-10-01 13:39:26 +01:00
Adhemerval Zanella	4d8965f130	Disable _TIME_BITS if the compiler defaults to it Even though building glibc with 64 bit time_t flags is not supported, and the usual way is to patch the build system to avoid it; some systems do enable it by default, and it increases the requirements to build glibc in such cases (it also does not help newcomers when trying to build glibc). The conform namespace and linknamespace tests also do not expect that flag to be set by default, so disable it as well. Checked with a build/check for major ABI and some (i386, arm, mipsel, hppa) with a toolchain that has LFS flags by default. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-10-01 08:44:41 -03:00
Adhemerval Zanella	3f1932ed2e	Disable _FILE_OFFSET_BITS if the compiler defaults to it Even though building glibc with LFS flags is not supported, and the usual way is to patch the build system to avoid it [1]; some systems do enable it by default, and it increases the requirements to build glibc in such cases (it also does not help newcomers when trying to build glibc). The conform namespace and linknamespace tests also do not expect that flag to be set by default, so disable it as well. Checked with a build/check for major ABI and some (i386, arm, mipsel, hppa) with a toolchain that has LFS flags by default. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=31624 Reviewed-by: DJ Delorie <dj@redhat.com>	2024-10-01 08:44:41 -03:00
Adhemerval Zanella	127cefd84d	Do not use -Wp to disable fortify (BZ 31928) The -Wp does not work properly if the compiler is configured to enable fortify by default, since it bypasses the compiler driver (which defines the fortify flags in this case). This patch is similar to the one used on Ubuntu [1]. I checked with a build for x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, s390x-linux-gnu, and riscv64-linux-gnu with gcc-13 that enables the fortify by default. Co-authored-by: Matthias Klose <matthias.klose@canonical.com> [1] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/glibc/tree/debian/patches/ubuntu/fix-fortify-source.patch Reviewed-by: DJ Delorie <dj@redhat.com>	2024-10-01 08:44:40 -03:00
H.J. Lu	9dfea3de7f	libio: Set _vtable_offset before calling _IO_link_in [BZ #32148 ] Since _IO_vtable_offset is used to detect the old binaries, set it in _IO_old_file_init_internal before calling _IO_link_in which checks _IO_vtable_offset. Add a glibc 2.0 test with copy relocation on _IO_stderr_@GLIBC_2.0 to verify that fopen won't cause memory corruption. This fixes BZ #32148. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-10-01 07:31:25 +08:00
Tulio Magno Quites Machado Filho	97aa92263a	Add a new fwrite test that exercises buffer overflow Exercises fwrite's internal buffer when doing a file operation. The new test exercises 2 overflow behaviors: 1. Call fwrite multiple times making use of fwrite's internal buffer. The total number of bytes written is larger than fwrite's internal buffer, forcing an automatic flush. 2. Call fwrite a single time with an amount of data that is larger than fwrite's internal buffer. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-09-30 15:57:12 -03:00
Noah Goldstein	483443d321	x86/string: Fixup alignment of main loop in str{n}cmp-evex [BZ #32212 ] The loop should be aligned to 32-bytes so that it can ideally run out of the DSB. This is particularly important on Skylake-Server where deficiencies in its DSB implementation make it prone to not being able to run loops out of the DSB. For example running strcmp-evex on 200Mb string: 32-byte aligned loop: - 43,399,578,766 idq.dsb_uops not 32-byte aligned loop: - 6,060,139,704 idq.dsb_uops This results in a 25% performance degradation for the non-aligned version. The fix is to just ensure the code layout is such that the loop is aligned. (Which was previously the case but was accidentally dropped in `84e7c46df`). NB: The fix was actually 64-byte alignment. This is because 64-byte alignment generally produces more stable performance than 32-byte aligned code (cache line crosses can affect perf), so if we are going past 16-byte alignment, might as well go to 64. 64-byte alignment also matches most other functions we over-align, so it creates a common point of optimization. Times are reported as ratio of Time_With_Patch / Time_Without_Patch. Lower is better. The value reported is the geometric mean of the ratio across all tests in bench-strcmp and bench-strncmp. Note this patch is only attempting to improve the Skylake-Server strcmp for long strings. The rest of the numbers are only to test for regressions. Tigerlake Results Strings <= 512: strcmp : 1.026 strncmp: 0.949 Tigerlake Results Strings > 512: strcmp : 0.994 strncmp: 0.998 Skylake-Server Results Strings <= 512: strcmp : 0.945 strncmp: 0.943 Skylake-Server Results Strings > 512: strcmp : 0.778 strncmp: 1.000 The 2.6% regression on TGL-strcmp is due to slowdowns caused by changes in alignment of code handling small sizes (mostly in the page-cross logic). These should be safe to ignore because 1) We previously only 16-byte aligned the function so this behavior is not new and was essentially up to chance before this patch and 2) this type of alignment related regression on small sizes really only comes up in tight micro-benchmark loops and is unlikely to have any effect on real-world performance. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-09-30 07:40:40 -07:00
Florian Weimer	6948ee4edf	stdio-common: Fix memory leak in tst-freopen4* tests on UNSUPPORTED The temp_dir allocation leaks if support_can_chroot returns false.	2024-09-28 21:06:11 +02:00
Florian Weimer	b300078d97	Linux: Block signals around _Fork (bug 32215) This hides the inconsistent TCB state (missing robust mutex list) from signal handlers. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-09-28 09:44:25 +02:00
Mike FABIAN	a7b5eb821d	Update to Unicode 16.0.0 [BZ #32168 ] Unicode 16.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 16.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat). Changes in CHARMAP and WIDTH: Total added characters in newly generated CHARMAP: 5185 Total removed characters in newly generated WIDTH: 1 Total added characters in newly generated WIDTH: 170 The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA. It changed like this: UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;; UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;; EastAsianWidth.txt 15.1.0: 1171D..1171F ; N # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA EastAsianWidth.txt 16.0.0: 1171E ; N # Mc AHOM CONSONANT SIGN MEDIAL RA I.e., it changed from Mn (Mark Nonspacing) to Mc (Mark Spacing combining). So it should now have width 1 instead of 0, therefore it is OK that it was removed from WIDTH, characters not in WIDTH get width 1 by default. Nothing suspicious when browsing the list of the 170 added characters. Changes in ctype: alpha: Added 4452 characters in new ctype which were not in old ctype combining: Added 51 characters in new ctype which were not in old ctype combining_level3: Added 43 characters in new ctype which were not in old ctype graph: Added 5185 characters in new ctype which were not in old ctype lower: Added 25 characters in new ctype which were not in old ctype print: Added 5185 characters in new ctype which were not in old ctype punct: Missing 33 characters of old ctype in new ctype punct: Added 766 characters in new ctype which were not in old ctype tolower: Added 27 characters in new ctype which were not in old ctype totitle: Added 27 characters in new ctype which were not in old ctype toupper: Added 27 characters in new ctype which were not in old ctype upper: Added 27 characters in new ctype which were not in old ctype Nothing suspicious in the additions. About the 33 characters removed from `punct`: U+0363 - U+036F are identical in UnicodeData.txt. Difference in DerivedCoreProperties.txt: DerivedCoreProperties.txt 15.1.0: not there. DerivedCoreProperties.txt 16.0.0: 0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X So that’s the reason why they are added to `alpha` and removed from `punct`. Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there is a difference in DerivedCoreProperties.txt: DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4 ; Alphabetic # Mn [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4 ; Alphabetic # Mn [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS So they became `Alphabetic` and were thus added to `alpha` and removed from `punct`. Resolves: BZ #32168 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-09-27 14:43:38 +02:00
Florian Weimer	f47596fcfe	manual: Document that feof and ferror are mutually exclusive This is not completely clear from the C standard (although there is footnote number 289 in C11), but I assume that our implementation works this way. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-27 11:41:14 +02:00
Sergey Kolosov	1d72fa3cfa	stdio-common: Add new test for fdopen This commit adds fdopen test with all modes. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-26 15:33:03 +02:00
Andreas Schwab	5f62cf88c4	Fix missing randomness in __gen_tempname (bug 32214) Make sure to update the random value also if getrandom fails. Fixes: `686d542025` ("posix: Sync tempname with gnulib")	2024-09-26 11:45:44 +02:00
Pavel Kozlov	cc84cd389c	arc: Cleanup arcbe Remove the mention of arcbe ABI to avoid confusion. ARC big endian ABI is no longer supported. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-09-25 15:54:07 +01:00
Florian Weimer	4ff55d08df	arc: Remove HAVE_ARC_BE macro and disable big-endian port It is no longer needed, now that ARC is always little endian.	2024-09-25 11:25:22 +02:00
Florian Weimer	d67a7dbc84	scripts: Remove arceb-linux-gnu from build-many-glibcs.py This was discussed on the hallway track at GNU Tools Cauldron 2024. There are concerns about stability of the big-endian GCC backend, and Linux removed support for the only big-endian ARC platform in commit dd7c7ab01a04d645b7e7baa8530bfd81e31a2202 ("ARC: [plat-eznps]: Drop support for EZChip NPS platform").	2024-09-25 11:25:22 +02:00
caiyinyu	255dc1e4ed	LoongArch: Undef __NR_fstat and __NR_newfstatat. In Linux 6.11, fstat and newfstatat are added back. To avoid the messy usage of the fstat, newfstatat, and statx system calls, we will continue using statx only in glibc, maintaining consistency with previous versions of the LoongArch-specific glibc implementation. Signed-off-by: caiyinyu <caiyinyu@loongson.cn> Reviewed-by: Xi Ruoyao <xry111@xry111.site> Suggested-by: Florian Weimer <fweimer@redhat.com>	2024-09-25 10:00:42 +08:00
Joseph Myers	d14c977c65	Add tests of fread There seem to be no glibc tests specifically for the fread function. Add basic tests of that function. Tested for x86_64.	2024-09-24 14:06:22 +00:00
Florian Weimer	da29dc24d4	nptl: Prefer setresuid32 in tst-setuid2 Use the setresuid32 system call if it is available, preferring it over setresuid. If both system calls exist, setresuid is the 16-bit variant. This fixes a build failure on sparcv9-linux-gnu.	2024-09-24 13:48:11 +02:00
Florian Weimer	2abfa19072	elf: Move __rtld_malloc_init_stubs call into _dl_start_final Calling an extern function in a different translation unit before self-relocation is brittle. The compiler may load the address at an earlier point in _dl_start, before self-relocation. In _dl_start_final, the call is behind a compiler barrier, so this cannot happen.	2024-09-24 13:23:10 +02:00
Florian Weimer	9802c0f2fe	elf: Eliminate alloca in open_verify With the two-stage approach for exception handling, the name can be freed after it has been copied into the exception, but before it is raised.	2024-09-24 13:23:10 +02:00
Florian Weimer	bdaf500353	elf: Remove version assert in check_match in elf/dl-lookup.c This case is detected early in the elf/dl-version.c consistency checks. (These checks could be disabled in the future to allow the removal of symbol versioning from objects.) Commit `f0b2132b35` ("ld.so: Support moving versioned symbols between sonames [BZ #24741]") removed another call to _dl_name_match_p. The _dl_check_caller function no longer exists, and the remaining calls to _dl_name_match_p happen under the loader lock. This means that atomic accesses are no longer required for the l_libname list. This supersedes commit `395be7c218` ("elf: Fix data race in _dl_name_match_p [BZ #21349]").	2024-09-24 13:23:10 +02:00
Florian Weimer	8f6a53eab8	elf: In rtld_setup_main_map, assume ld.so has a DYNAMIC segment The way we build ld.so, it always has a dynamic segment, so checking for its absence is unnecessary.	2024-09-24 13:23:10 +02:00
Florian Weimer	7e21a65c58	misc: Enable internal use of memory protection keys This adds the necessary hidden prototypes.	2024-09-24 13:23:10 +02:00
Florian Weimer	3ef26b7087	misc: Link tst-mkstemp-fuse-parallel with $(shared-thread-library) The barrier functions require this on Hurd.	2024-09-24 13:05:52 +02:00
Florian Weimer	079ebf7624	iconv: Use $(run-program-prefix) for running iconv (bug 32197) With --enable-hardcoded-path-in-tests, $(test-program-prefix) does not redirect to the built glibc, but we need to run iconv (the program) against the built glibc even with --enable-hardcoded-path-in-tests, as it is using the ABI path for the dynamic linker (as an installed program). Use $(run-program-prefix) instead. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-09-24 12:35:40 +02:00
Joe Ramsay	16a59571e4	AArch64: Simplify rounding-multiply pattern in several AdvSIMD routines This operation can be simplified to use simpler multiply-round-convert sequence, which uses fewer instructions and constants. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:08 +01:00
Joe Ramsay	7900ac490d	AArch64: Improve codegen in users of ADVSIMD expm1f helper Rearrange operations so MOV is not necessary in reduction or around the special-case handler. Reduce memory access by using more indexed MLAs in polynomial. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Joe Ramsay	5bc100bd4b	AArch64: Improve codegen in users of AdvSIMD log1pf helper log1pf is quite register-intensive - use fewer registers for the polynomial, and make various changes to shorten dependency chains in parent routines. There is now no spilling with GCC 14. Accuracy moves around a little - comments adjusted accordingly but does not require regen-ulps. Use the helper in log1pf as well, instead of having separate implementations. The more accurate polynomial means special-casing can be simplified, and the shorter dependency chain avoids the usual dance around v0, which is otherwise difficult. There is a small duplication of vectors containing 1.0f (or 0x3f800000) - GCC is not currently able to efficiently handle values which fit in FMOV but not MOVI, and are reinterpreted to integer. There may be potential for more optimisation if this is fixed. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Joe Ramsay	a15b1394b5	AArch64: Improve codegen in SVE F32 logs Reduce MOVPRFXs by using unpredicated (non-destructive) instructions where possible. Similar to the recent change to AdvSIMD F32 logs, adjust special-case arguments and bounds to allow for more optimal register usage. For all 3 routines one MOVPRFX remains in the reduction, which cannot be avoided as immediate AND and ASR are both destructive. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Joe Ramsay	7b8c134b54	AArch64: Improve codegen in SVE expf & related routines Reduce MOV and MOVPRFX by improving special-case handling. Use inline helper to duplicate the entire computation between the special- and non-special case branches, removing the contention for z0 between x and the return value. Also rearrange some MLAs and MLSs - by making the multiplicand the destination we can avoid a MOVPRFX in several cases. Also change which constants go in the vector used for lanewise ops - the last lane is no longer wasted. Spotted that shift was incorrect in exp2f and exp10f, w.r.t. the comment that explains it. Fixed - worst-case ULP for exp2f moves around but it doesn't change significantly for either routine. Worst-case error for coshf increases due to passing x to exp rather than abs(x) - updated the comment, but does not require regen-ulps. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Florian Weimer	6f3f6c506c	Linux: readdir64_r should not skip d_ino == 0 entries (bug 32126) This is the same bug as bug 12165, but for readdir_r. The regression test covers both bug 12165 and bug 32126. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-21 19:32:34 +02:00
Florian Weimer	6aa1645f66	dirent: Add tst-rewinddir It verifies that rewinddir allows restarting the directory iteration. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-21 19:32:34 +02:00


41556 Commits