glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-30 08:40:07 +00:00

Author	SHA1	Message	Date
Adhemerval Zanella	994fec2397	math: Use erff from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance than the generic erff. The code was adapted to glibc style and to use the definition of math_config.h. Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1, gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1): latency master patched improvement x86_64 85.7363 45.1372 47.35% x86_64v2 86.6337 38.5816 55.47% x86_64v3 71.3810 34.0843 52.25% i686 190.143 97.5014 48.72% aarch64 34.9091 14.9320 57.23% power10 38.6160 8.5188 77.94% powerpc 39.7446 8.45781 78.72% reciprocal-throughput master patched improvement x86_64 35.1739 14.7603 58.04% x86_64v2 34.5976 11.2283 67.55% x86_64v3 27.3260 9.8550 63.94% i686 91.0282 30.8840 66.07% aarch64 22.5831 6.9615 69.17% power10 18.0386 3.0918 82.86% powerpc 20.7277 3.63396 82.47% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-22 10:52:27 -03:00
Adhemerval Zanella	c4c64ba5d1	math: Split s_erfF in erff and erfc So we can eventually replace each implementation. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-22 10:52:26 -03:00
Adhemerval Zanella	c5d241f06b	math: Use cbrtf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance than the generic cbrtf. The code was adapted to glibc style and to use the definition of math_config.h. Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1, gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1): latency master patched improvement x86_64 68.6348 36.8908 46.25% x86_64v2 67.3418 36.6968 45.51% x86_64v3 63.4981 32.7859 48.37% aarch64 29.3172 12.1496 58.56% power10 18.0845 8.8893 50.85% powerpc 18.0859 8.79527 51.37% reciprocal-throughput master patched improvement x86_64 36.4369 13.3565 63.34% x86_64v2 37.3611 13.1149 64.90% x86_64v3 31.6024 11.2102 64.53% aarch64 18.6866 7.3474 60.68% power10 9.4758 3.6329 61.66% powerpc 9.58896 3.90439 59.28% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-22 10:01:03 -03:00
Adhemerval Zanella	2234b08763	benchtests: Add tanf benchmark Random inputs in [-pi, pi]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-22 10:01:03 -03:00
Adhemerval Zanella	ce4122ff97	benchtests: Add lgammaf benchmark Random inputs in the range [-20.0,20.0]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-22 10:01:03 -03:00
Adhemerval Zanella	d7612d04e4	benchtests: Add erfcf benchmark It is based on binary64 erfc-inputs, with random inputs in [0,b=0x1.41bbf6p+3] where b is the smallest number such that erfcf(b) rounds to 0 (to nearest). Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-22 10:01:03 -03:00
Adhemerval Zanella	50657965da	benchtests: Add erff benchmark It is based on binary64 erf-inputs, with random inputs in [0,b=0x1.f5a888p+1] where b is the smallest number such that erff(b) rounds to 1 (to nearest). Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-22 10:01:03 -03:00
Adhemerval Zanella	53c80be8da	benchtests: Add cbrtf benchmark Based on binary64 benchtests, with random inputs in [1,8].	2024-11-22 10:01:03 -03:00
H.J. Lu	e7b5532721	elf: Handle static PIE with non-zero load address [BZ #31799] For a static PIE with non-zero load address, its PT_DYNAMIC segment entries contain the relocated values for the load address in static PIE. Since static PIE usually doesn't have PT_PHDR segment, use p_vaddr of the PT_LOAD segment with offset == 0 as the load address in static PIE and adjust the entries of PT_DYNAMIC segment in static PIE by properly setting the l_addr field for static PIE. This fixes BZ #31799. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-11-22 06:22:13 +08:00
Siddhesh Poyarekar	713d6d7e78	x86/string: Use `movsl` instead of `movsd` in strncat [BZ #32344] The previous patch missed strncat, so fix that. Resolves: BZ #32344 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2024-11-21 17:11:01 -05:00
Florian Weimer	7a61e7f557	stdlib: Make getenv thread-safe in more cases Async-signal-safety is preserved, too. In fact, getenv is fully reentrant and can be called from the malloc call in setenv (if a replacement malloc uses getenv during its initialization). This is relatively easy to implement because even before this change, setenv, unsetenv, clearenv, putenv do not deallocate the environment strings themselves as they are removed from the environment. The main changes are: * Use release stores for environment array updates, following the usual pattern for safely publishing immutable data (in this case, the environment strings). * Do not deallocate the environment array. Instead, keep older versions around and adopt an exponential resizing policy. This results in an amortized constant space leak per active environment variable, but there already is such a leak for the variable itself (and that is even length-dependent, and includes no-longer used values). * Add a seqlock-like mechanism to retry getenv if a concurrent unsetenv is observed. Without that, it is possible that getenv returns NULL for a variable that is never unset. This is visible on some AArch64 implementations with the newly added stdlib/tst-getenv-unsetenv test case. The mechanism is not a pure seqlock because it tolerates one write from unsetenv. This avoids the need for a second copy of the environ array that getenv can read from a signal handler that happens to interrupt an unsetenv call. No manual updates are included with this patch because environ usage with execve, posix_spawn, system is still not thread-safe relative to unsetenv. The new process may end up with an environment that misses entries that were never unset. This is the same issue described above for getenv. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-21 21:10:52 +01:00
Andrew Pinski	e6590f0c86	aarch64: Remove non-temporal load/stores from oryon-1's memset The hardware architects have a new recommendation not to use non-temporal load/stores for memset. This patch removes this path. I found there was no difference in the memset speed with/without non-temporal load/stores either. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-21 11:32:23 -03:00
Andrew Pinski	eb5eeb4740	aarch64: Remove non-temporal load/stores from oryon-1's memcpy The hardware architects have a new recommendation not to use non-temporal load/stores for memcpy. This patch removes this path. I found there was no difference in the memcpy speed with/without non-temporal load/stores either. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-21 11:32:17 -03:00
Sachin Monga	3051f3495c	powerpc64le: _init/_fini file changes for ROP The ROP instructions were added in ISA 3.1 (ie, Power10), however they were defined so that if executed on older cpus, they would behave as nops. This allows us to emit them on older cpus and they'd just be ignored, but if run on a Power10, then the binary would be ROP protected. Hash instructions use negative offsets so the default position of ROP pointer is FRAME_ROP_SAVE from caller's SP. Modified FRAME_MIN_SIZE_PARM to 112 for ELFv2 to reserve additional 16 bytes for ROP save slot and padding. Signed-off-by: Sachin Monga <smonga@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-11-20 16:50:34 -05:00
Samuel Thibault	c0365d3791	mman.h: Fix MAP_HASSEMPHORE typo BSD's MAP_HASSEMAPHORE is with an A. MAP_HASSEMPHORE is not used in any Debian software for instance.	2024-11-20 19:52:44 +01:00
Andreas Schwab	6e7778ecde	misc: remove extra va_end in error_tail (bug 32233) This is an addendum to commit `b7b52b9dec` ("error, error_at_line: Add missing va_end calls"), which added the va_end calls in the callers where they belong.	2024-11-20 14:05:52 +01:00
Andreas Schwab	ab545460b0	intl: avoid alloca for arbitrary sizes (bug 32380) Use malloc for the copy of the domain name and the category value, which can both be of arbitrary size.	2024-11-20 14:05:52 +01:00
Yury Khrustalev	47311cca31	manual: Add description of AArch64-specific pkey flags Describe AArch64 specific flags PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE that are available on AArch64 systems with enabled Stage 1 permission overlays feature introduced in Armv8.9 / 9.4 (FEAT_S1POE). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-20 11:30:58 +00:00
Yury Khrustalev	f4d00dd60d	AArch64: Add support for memory protection keys This patch adds support for memory protection keys on AArch64 systems with enabled Stage 1 permission overlays feature introduced in Armv8.9 / 9.4 (FEAT_S1POE) [1]. 1. Internal functions "pkey_read" and "pkey_write" to access data associated with memory protection keys. 2. Implementation of API functions "pkey_get" and "pkey_set" for the AArch64 target. 3. AArch64-specific PKEY flags for READ and EXECUTE (see below). 4. New target-specific test that checks behaviour of pkeys on AArch64 targets. 5. This patch also extends existing generic test for pkeys. 6. HWCAP constant for Permission Overlay Extension feature. To support more accurate mapping of underlying permissions to the PKEY flags, we introduce additional AArch64-specific flags. The full list of flags is: - PKEY_UNRESTRICTED: 0x0 (for completeness) - PKEY_DISABLE_ACCESS: 0x1 (existing flag) - PKEY_DISABLE_WRITE: 0x2 (existing flag) - PKEY_DISABLE_EXECUTE: 0x4 (new flag, AArch64 specific) - PKEY_DISABLE_READ: 0x8 (new flag, AArch64 specific) The problem here is that PKEY_DISABLE_ACCESS has unusual semantics as it overlaps with existing PKEY_DISABLE_WRITE and new PKEY_DISABLE_READ. For this reason mapping between permission bits RWX and "restrictions" bits awxr (a for disable access, etc) becomes complicated: - PKEY_DISABLE_ACCESS disables both R and W - PKEY_DISABLE_{WRITE,READ} disables W and R respectively - PKEY_DISABLE_EXECUTE disables X Combinations like the one below are accepted although they are redundant: - PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ | PKEY_DISABLE_WRITE Reverse mapping tries to retain backward compatibility and ORs PKEY_DISABLE_ACCESS whenever both flags PKEY_DISABLE_READ and PKEY_DISABLE_WRITE would be present. This will break code that compares pkey_get output with == instead of using bitwise operations. The latter is more correct since PKEY_* constants are essentially bit flags. It should be noted that PKEY_DISABLE_ACCESS does not prevent execution. [1] https://developer.arm.com/documentation/ddi0487/ka/ section D8.4.1.4 Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-20 11:30:58 +00:00
Andrew Pinski	e162ab2bf1	AArch64: Remove thunderx{,2} memcpy ThunderX1 and ThunderX2 have been retired for a few years now. So let's remove the thunderx{,2} specific versions of memcpy. The performance gain of them was for medium and large sizes, which the generic (aarch64) memcpy will handle just slightly worse. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-11-20 11:23:53 +00:00
Joseph Myers	d899b48a30	Fix femode_t conditionals for arc and or1k Two of the architecture bits/fenv.h headers define femode_t if __GLIBC_USE (IEC_60559_BFP_EXT), instead of the correct condition __GLIBC_USE (IEC_60559_BFP_EXT_C23) (both were added after commit `0175c9e9be`, but were probably first developed before it and then not updated to take account of its changes). This results in failures of the installed headers check for fenv.h when building with GCC 15 (defaults to -std=gnu23 - we don't yet have an installed-headers test specifically for C23 mode and don't yet require a compiler with such a mode for building glibc) together with a combination of options leaving C23 features enabled, since the declarations of functions using femode_t use the correct conditions; see <https://sourceware.org/pipermail/libc-testresults/2024q4/013163.html>. Fix the conditionals to get <fenv.h> to work correctly in C23 mode again. Tested with build-many-glibcs.py (arc-linux-gnu, arc-linux-gnuhf, or1k-linux-gnu-hard, or1k-linux-gnu-soft).	2024-11-19 22:25:39 +00:00
Mahesh Bodapati	3ef7e42861	powerpc64le: Optimized strcat for POWER10 This patch adds an optimized strcat which makes use of the default strcat function which calls the Power10 strcpy and strlen routines.	2024-11-19 15:59:15 -05:00
Peter Bergner	229265cc2c	powerpc: Improve the inline asm for syscall wrappers Update the inline asm syscall wrappers to match the newer register constraint usage in INTERNAL_VSYSCALL_CALL_TYPE. Use the faster mfocrf instruction when available, rather than the slower mfcr microcoded instruction.	2024-11-19 12:43:57 -05:00
gfleury	7f045c0b48	htl: move pthread_attr_init into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	1a1cedd635	htl: move pthread_attr_setguardsize into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	f26b272a75	htl: move pthread_attr_setschedparam into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	32aa498ceb	htl: move pthread_attr_setscope into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	4a8b7d7e62	htl: move pthread_attr_setstackaddr into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	d69a010e7b	htl: move pthread_attr_setstacksize into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	330c1fad5b	htl: move pthread_attr_getstack into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	1428ae39e8	htl: move pthread_attr_getstackaddr into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:37:35 +01:00
gfleury	993440a260	htl: move pthread_attr_getstacksize into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:34:34 +01:00
gfleury	4bcda927fe	htl: move pthread_attr_getscope into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:19:00 +01:00
gfleury	6caf24c972	htl: move pthread_attr_getguardsize into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:18:59 +01:00
gfleury	f55cf584ff	htl: move __pthread_default_attr into libc Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:08:27 +01:00
gfleury	736befab6c	htl: move pthread_attr_destroy into libc. Signed-off-by: gfleury <gfleury@disroot.org>	2024-11-19 01:08:14 +01:00
Maciej W. Rozycki	ce13ab5033	stdio-common: Fix C23-ism in formatted output specifier tests [BZ #32360] Nameless function parameters have only been added to ISO C with the C23 revision of the language standard. Give names to the unused parameters of the stub 'dladdr' implementation then so as to make compilation happy with the earlier language definitions, fixing errors such as: tst-printf-format-skeleton.c:374:9: error: parameter name omitted 374 | dladdr (const void *, Dl_info *) reported by older compilers. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-11-15 22:43:54 +00:00
Aurelien Jarno	6c915c73d0	elf: handle addition overflow in _dl_find_object_update_1 [BZ #32245] The remaining_to_add variable can be 0 if (current_used + count) wraps. This is caught by GCC 14+ on hppa, which determines from there that target_seg could be NULL when remaining_to_add is zero, which in turn causes a -Wstringop-overflow warning: In file included from ../include/atomic.h:49, from dl-find_object.c:20: In function '_dlfo_update_init_seg', inlined from '_dl_find_object_update_1' at dl-find_object.c:689:30, inlined from '_dl_find_object_update' at dl-find_object.c:805:13: ../sysdeps/unix/sysv/linux/hppa/atomic-machine.h:44:4: error: '__atomic_store_4' writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=] 44 | __atomic_store_n ((mem), (val), __ATOMIC_RELAXED); \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ dl-find_object.c:644:3: note: in expansion of macro 'atomic_store_relaxed' 644 | atomic_store_relaxed (&seg->size, new_seg_size); | ^~~~~~~~~~~~~~~~~~~~ In function '_dl_find_object_update': cc1: note: destination object is likely at address zero In practice, this is not possible as it represents counts of link maps. Link maps have sizes larger than 1 byte, so the sum of any two link map counts will always fit within a size_t without wrapping around. This patch therefore adds a check on remaining_to_add == 0 and tells GCC that this cannot happen using __builtin_unreachable. Thanks to Andreas Schwab for the investigation. Closes: BZ #32245 Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: John David Anglin <dave.anglin@bell.net> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-11-13 23:06:43 +01:00
Noah Goldstein	c510681a69	x86/string: Use `movsl` instead of `movsd` in strncpy/strncat [BZ #32344] `ld`, starting at 2.40, emits a warning when using `movsd`. There is no change to the actual code produced. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-11-13 10:09:30 -06:00
Jonathan Wakely	8d3fb43797	manual: Fix overeager s/int/size_t/ in memory.texi The change in `e3960d1c57` should only have affected 'int' not 'internally'. Signed-off-by: Jonathan Wakely <jwakely@redhat.com>	2024-11-13 14:43:58 +00:00
John David Anglin	b919fe1f6d	hppa: Update libm-test-ulps Update imaginary part of csin. Signed-off-by: John David Anglin <dave.anglin@bell.net>	2024-11-12 21:32:54 -05:00
Samuel Thibault	e5c2738f17	Revert "hurd: Stop depending on the default_pager stubs provided by gnumach" This reverts commit `f7f7dd8009`. default_pager is actually also used in e.g. xosview.	2024-11-13 01:34:09 +01:00
Adhemerval Zanella	461cab1de7	linux: Add support for getrandom vDSO Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque state allocated with mmap using flags specified by the vDSO. Multiple states are allocated at once, as many as fit into a page, and these are held in an array of available states to be doled out to each thread upon first use, and recycled when a thread terminates. As these states run low, more are allocated. To make this procedure async-signal-safe, a simple guard is used in the LSB of the opaque state address, falling back to the syscall if there's reentrancy contention. Also, _Fork() is handled by blocking signals on opaque state allocation (so _Fork() always sees a consistent state even if it interrupts a getrandom() call) and by iterating over the thread stack cache on reclaim_stack. Each opaque state will be in the free states list (grnd_alloc.states) or allocated to a running thread. The cancellation is handled by always using GRND_NONBLOCK flags while calling the vDSO, and falling back to the cancellable syscall if the kernel returns EAGAIN (would block). Since getrandom is not defined by POSIX and cancellation is supported as an extension, the cancellation is handled as 'may occur' instead of 'shall occur' [1], meaning that if vDSO does not block (the expected behavior) getrandom will not act as a cancellation entrypoint. It avoids a pthread_testcancel call on the fast path (different than 'shall occur' functions, like sem_wait()). It is currently enabled for x86_64, which is available in Linux 6.11, and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are available in Linux 6.12. Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html [1] Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64 Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64 Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64 Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x	2024-11-12 14:42:12 -03:00
Siddhesh Poyarekar	b583b1080b	io: Add setuid tests for faccessat Add a new test tst-faccessat-setuid that iterates through real and effective UID/GID combination and tests the faccessat() interface for default and AT_EACCESS flags. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-12 10:19:58 -05:00
Siddhesh Poyarekar	ea75860813	tst-faccessat.c: Port to libsupport Use libsupport convenience functions and macros instead of the old test-skeleton. Also add a new xdup() convenience wrapper function. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-12 10:19:58 -05:00
Siddhesh Poyarekar	04b1eb161f	support: Add xdup Add xdup as the error-checking version of dup for test cases. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-12 10:19:58 -05:00
caiyinyu	ab4388f91c	LoongArch: Update ulps Needed for test-float-cacosh, test-float-csin, test-float32-cacosh and test-float32-csin. Signed-off-by: caiyinyu <caiyinyu@loongson.cn> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-11-12 09:19:23 +08:00
Samuel Thibault	7b544224f8	stat.h: Fix missing declaration of struct timespec When building with e.g. -std=c99 and _ATFILE_SOURCE, stat.h was missing including bits/types/struct_timespec.h to get the struct timespec declaration for utimensat.	2024-11-10 00:46:42 +01:00
Samuel Thibault	d2e65aa7d6	mach: Fix __xpg_strerror_r on in-range but undefined errors [BZ #32350] For instance, 1073741906 leads to system 16, subsystem 0 and code 82, which is in range (max_code is 122), but not defined. Return EINVAL in that case, like	2024-11-09 20:00:40 +01:00
Noah Goldstein	6754b5becf	x86/string: Use `movsl` instead of `movsd` [BZ #32344] `ld`, starting at 2.40, emits a warning when using `movsd`. There is no change to the actual code produced. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-11-08 17:23:05 -06:00
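
Note on commit f4d00dd60d (AArch64 memory protection keys): the message warns that code comparing pkey_get output with == may break, because PKEY_DISABLE_ACCESS overlaps with PKEY_DISABLE_READ and PKEY_DISABLE_WRITE. A minimal sketch of bitwise decoding follows. The helper names pkey_flags_to_rwx and pkey_same_permissions are hypothetical and not part of glibc, and the fallback macro values are simply the ones listed in the commit message, used only where <sys/mman.h> does not define them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>

/* Fallback values copied from the commit message; an assumption for
   targets whose <sys/mman.h> does not provide these flags.  */
#ifndef PKEY_DISABLE_ACCESS
# define PKEY_DISABLE_ACCESS 0x1
#endif
#ifndef PKEY_DISABLE_WRITE
# define PKEY_DISABLE_WRITE 0x2
#endif
#ifndef PKEY_DISABLE_EXECUTE
# define PKEY_DISABLE_EXECUTE 0x4
#endif
#ifndef PKEY_DISABLE_READ
# define PKEY_DISABLE_READ 0x8
#endif

/* Hypothetical helper: decode restriction flags into an "rwx"-style
   string, testing individual bits instead of comparing with ==.  */
static void
pkey_flags_to_rwx (unsigned int flags, char out[4])
{
  assert (out != NULL);
  /* PKEY_DISABLE_ACCESS disables both R and W; it never disables X.  */
  int no_read = (flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ)) != 0;
  int no_write = (flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)) != 0;
  int no_exec = (flags & PKEY_DISABLE_EXECUTE) != 0;
  out[0] = no_read ? '-' : 'r';
  out[1] = no_write ? '-' : 'w';
  out[2] = no_exec ? '-' : 'x';
  out[3] = '\0';
}

/* Hypothetical helper: two flag sets are equivalent when they decode to
   the same permissions, even if redundant bits differ.  */
static int
pkey_same_permissions (unsigned int a, unsigned int b)
{
  char sa[4], sb[4];
  pkey_flags_to_rwx (a, sa);
  pkey_flags_to_rwx (b, sb);
  return strcmp (sa, sb) == 0;
}
```

With this decoding, a redundant combination such as PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ | PKEY_DISABLE_WRITE is treated the same as PKEY_DISABLE_ACCESS alone, which a plain == comparison would not do.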

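Note on commit 461cab1de7 (getrandom vDSO): the change is internal to glibc, so callers are expected to keep using getrandom() from <sys/random.h> as before. A minimal caller-side sketch follows, assuming a Linux system and a glibc that provides <sys/random.h>. The wrapper name fill_random is hypothetical, and nothing in it selects or detects the vDSO path.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical wrapper: fill BUF with LEN random bytes.  Returns 0 on
   success and -1 on error, with errno set by getrandom.  */
static int
fill_random (void *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  unsigned char *p = buf;
  while (len > 0)
    {
      /* Flags 0: block until the kernel's random source is ready.  */
      ssize_t n = getrandom (p, len, 0);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;   /* Interrupted by a signal; retry.  */
          return -1;
        }
      /* getrandom may return fewer bytes than requested.  */
      p += n;
      len -= (size_t) n;
    }
  return 0;
}
```

The loop handles short reads and EINTR, which applies whether the bytes come from the vDSO or from the syscall fallback.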

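Note on commit 6c915c73d0 (_dl_find_object_update_1): the fix tells the compiler that a zero sum cannot occur, so later code is not analysed under that assumption. A minimal sketch of the pattern follows, assuming GCC or Clang for __builtin_unreachable. The function total_to_add is a hypothetical stand-in, not the glibc code; only the variable names current_used, count and remaining_to_add are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the size computation.  The caller must
   guarantee that the sum is non-zero and does not wrap; reaching
   __builtin_unreachable is undefined behaviour.  */
static size_t
total_to_add (size_t current_used, size_t count)
{
  size_t remaining_to_add = current_used + count;
  /* Let the compiler assume the wrapped (zero) case never happens, so
     it does not warn about code that would only run in that case.  */
  if (remaining_to_add == 0)
    __builtin_unreachable ();
  return remaining_to_add;
}
```

The hint does not add a runtime check; it documents an invariant and removes the impossible path from the compiler's analysis.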
41592 Commits