glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-24 22:10:13 +00:00

Author	SHA1	Message	Date
Florian Weimer	f8d8b1b1e6	aarch64: Enhanced CPU diagnostics for ld.so This prints some information from struct cpu_features, and the midr_el1 and dczid_el0 system register contents on every CPU. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-08 16:48:55 +02:00
Florian Weimer	7a430f40c4	x86: Add generic CPUID data dumper to ld.so --list-diagnostics This is surprisingly difficult to implement if the goal is to produce reasonably sized output. With the current approaches to output compression (suppressing zeros and repeated results between CPUs, folding ranges of identical subleaves, dealing with the %ecx reflection issue), the output is less than 600 KiB even for systems with 256 logical CPUs. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-08 16:48:55 +02:00
Florian Weimer	5653ccd847	elf: Add CPU iteration support for future use in ld.so diagnostics Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-08 16:48:55 +02:00
H.J. Lu	9e1f4aef86	x86-64: Exclude FMA4 IFUNC functions for -mapxf When -mapxf is used to build glibc, the resulting glibc will never run on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used. This requires GCC which defines __APX_F__ for -mapxf with commit: 1df56719bd8 x86: Define __APX_F__ for -mapxf Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-04-06 05:03:55 -07:00
Adhemerval Zanella	c27f8763cf	Reinstate generic features-time64.h The `a4ed0471d7` removed the generic version which is included by features.h and used by Hurd. Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.	2024-04-05 09:02:36 -03:00
Adhemerval Zanella	460d9e2dfe	Cleanup __tls_get_addr on alpha/microblaze localplt.data They are not required. Checked with a make check for both ABIs.	2024-04-04 17:20:33 -03:00
Adhemerval Zanella	95700e7998	arm: Remove ld.so __tls_get_addr plt usage Use the hidden alias instead. Checked on arm-linux-gnueabihf.	2024-04-04 17:03:32 -03:00
Adhemerval Zanella	50c2be2390	aarch64: Remove ld.so __tls_get_addr plt usage Use the hidden alias instead. Checked on aarch64-linux-gnu.	2024-04-04 17:02:32 -03:00
Adhemerval Zanella	44ccc2465c	math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603) The implementations of trunc functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Adhemerval Zanella	932544efa4	math: x86 floor traps when FE_INEXACT is enabled (BZ 31601) The implementations of floor functions using x87 floating point (i386 and 86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Adhemerval Zanella	637bfc392f	math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600) The implementations of ceil functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Joe Ramsay	87cb1dfcd6	aarch64/fpu: Add vector variants of erfc Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:24 +01:00
Joe Ramsay	3d3a4fb8e4	aarch64/fpu: Add vector variants of tanh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:20 +01:00
Joe Ramsay	eedbbca0bf	aarch64/fpu: Add vector variants of sinh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:16 +01:00
Joe Ramsay	8b67920528	aarch64/fpu: Add vector variants of atanh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:12 +01:00
Joe Ramsay	81406ea3c5	aarch64/fpu: Add vector variants of asinh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:02 +01:00
Joe Ramsay	b09fee1d21	aarch64/fpu: Add vector variants of acosh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:58 +01:00
Joe Ramsay	bdb5705b7b	aarch64/fpu: Add vector variants of cosh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:52 +01:00
Joe Ramsay	cb5d84f1f8	aarch64/fpu: Add vector variants of erf Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:48 +01:00
Stafford Horne	3db9d208dd	misc: Add support for Linux uio.h RWF_NOAPPEND flag In Linux 6.9 a new flag is added to allow for Per-io operations to disable append mode even if a file was opened with the flag O_APPEND. This is done with the new RWF_NOAPPEND flag. This caused two test failures as these tests expected the flag 0x00000020 to be unused. Adding the flag definition now fixes these tests on Linux 6.9 (v6.9-rc1). FAIL: misc/tst-preadvwritev2 FAIL: misc/tst-preadvwritev64v2 This patch adds the flag, adjusts the test and adds details to documentation. Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/ Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-04 09:41:27 +01:00
Adhemerval Zanella	4dcd674b66	powerpc: Add missing arch flags on rounding ifunc variants The ifunc variants now uses the powerpc implementation which in turn uses the compiler builtin. Without the proper -mcpu switch the builtin does not generate the expected optimization. Checked on powerpc-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-04-02 15:49:31 -03:00
Adhemerval Zanella	a4ed0471d7	Always define __USE_TIME_BITS64 when 64 bit time_t is used It was raised on libc-help [1] that some Linux kernel interfaces expect the libc to define __USE_TIME_BITS64 to indicate the time_t size for the kABI. Different than defined by the initial y2038 design document [2], the __USE_TIME_BITS64 is only defined for ABIs that support more than one time_t size (by defining the _TIME_BITS for each module). The 64 bit time_t redirects are now enabled using a different internal define (__USE_TIME64_REDIRECTS). There is no expected change in semantic or code generation. Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and arm-linux-gnueabi [1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html [2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign Reviewed-by: DJ Delorie <dj@redhat.com>	2024-04-02 15:28:36 -03:00
Adhemerval Zanella	721314c980	x86_64: Remove avx512 strstr implementation As indicated in a recent thread, this it is a simple brute-force algorithm that checks the whole needle at a matching character pair (and does so 1 byte at a time after the first 64 bytes of a needle). Also it never skips ahead and thus can match at every haystack position after trying to match all of the needle, which generic implementation avoids. As indicated by Wilco, a 4x larger needle and 16x larger haystack gives a clear 65x slowdown both basic_strstr and __strstr_avx512: "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512", "__strstr_sse2_unaligned", "__strstr_generic"], { "len_haystack": 65536, "len_needle": 1024, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2] }, { "len_haystack": 1048576, "len_needle": 4096, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9] } PS: I don't have an AVX512 capable machine to verify this issues, but skimming through the code it does seems to follow what Wilco has described. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-03-27 13:48:16 -03:00
Adhemerval Zanella	2e53eb9234	signal: Avoid system signal disposition to interfere with tests Both tst-sigset2 and tst-signal1 expectes that SIGINT disposition is set to SIG_DFL.	2024-03-27 13:47:09 -03:00
Palmer Dabbelt	96d1b9ac23	RISC-V: Fix the static-PIE non-relocated object check The value of l_scope is only valid post relocation, so this original check was triggering undefined behavior. Instead just directly check to see if the object has been relocated, at which point using l_scope is safe. Reported-by: Andreas Schwab <schwab@suse.de> Closes: BZ #31317 Fixes: `e0590f41fe` ("RISC-V: Enable static-pie.") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-25 15:17:13 +01:00
Sergey Bugaev	dc1a77269c	htl: Implement some support for TLS_DTV_AT_TP Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-19-bugaevc@gmail.com>	2024-03-23 23:00:30 +01:00
Sergey Bugaev	a4273efa21	htl: Respect GL(dl_stack_flags) when allocating stacks Previously, HTL would always allocate non-executable stacks. This has never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and makes all pages implicitly executable. Since GNU Mach on AArch64 supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE immediately breaks any code that (unfortunately, still) relies on executable stacks. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>	2024-03-23 22:48:44 +01:00
Sergey Bugaev	b467cfcaee	hurd: Use the RETURN_ADDRESS macro This gives us PAC stripping on AArch64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-6-bugaevc@gmail.com>	2024-03-23 22:48:01 +01:00
Sergey Bugaev	6afeac1289	hurd: Disable Prefer_MAP_32BIT_EXEC on non-x86_64 for now While we could support it on any architecture, the tunable is currently only defined on x86_64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>	2024-03-23 22:47:46 +01:00
Sergey Bugaev	7f02511e5b	hurd: Move internal functions to internal header Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and _hurd_critical_section_unlock () inline implementations (that were already guarded by #if defined _LIBC) to the internal version of the header. While at it, add <tls.h> to the includes, and use __LIBC_NO_TLS () unconditionally. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>	2024-03-23 22:43:07 +01:00
Stafford Horne	ad05a42370	or1k: Add prctl wrapper to unwrap variadic args On OpenRISC variadic functions and regular functions have different calling conventions so this wrapper is needed to translate. This wrapper is copied from x86_64/x32. I don't know the build system enough to find a cleaner way to share the code between x86_64/x32 and or1k (maybe Implies?), so I went with the straight copy. This fixes test failures: misc/tst-prctl nptl/tst-setgetname	2024-03-22 15:43:34 +00:00
Stafford Horne	df7e29e2a4	or1k: Only define fpu rouding and exceptions with hard-float This test failure: math/test-fenv If rounding mode and exception macros are defined then the fenv tests run and always fail. This patch adds an ifdef using the __or1k_hard_float__ macro provided by gcc to avoid defining these fenv macros when they cnnot be used. This is similar to what is done in csky. Note, I will post the or1k hard-float support soon. So, I prefer to leave the hard-float bits here for now.	2024-03-22 15:43:34 +00:00
Stafford Horne	2e982a3937	or1k: Update libm test ulps To fix test failures: FAIL: math/test-float-hypot FAIL: math/test-float32-hypot	2024-03-22 15:43:34 +00:00
Wilco Dijkstra	2e94e2f5d2	AArch64: Check kernel version for SVE ifuncs Old Linux kernels disable SVE after every system call. Calling the SVE-optimized memcpy afterwards will then cause a trap to reenable SVE. As a result, applications with a high use of syscalls may run slower with the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0, except for 5.14.0 which was patched. Avoid this by checking the kernel version and selecting the SVE ifunc on modern kernels. Parse the kernel version reported by uname() into a 24-bit kernel.major.minor value without calling any library functions. If uname() is not supported or if the version format is not recognized, assume the kernel is modern. Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-03-21 16:50:51 +00:00
Amrita H S	1ea0511456	powerpc: Placeholder and infrastructure/build support to add Power11 related changes. The following three changes have been added to provide initial Power11 support. 1. Add the directories to hold Power11 files. 2. Add support to select Power11 libraries based on AT_PLATFORM. 3. Let submachine=power11 be set automatically. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-03-19 21:11:34 -05:00
Manjunath Matti	3ab9b88e2a	powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture. This patch adds a new feature for powerpc. In order to get faster access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for implementing __builtin_cpu_supports() in GCC) without the overhead of reading them from the auxiliary vector, we now reserve space for them in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-03-19 17:19:27 -05:00
Adhemerval Zanella	3d53d18fc7	elf: Enable TLS descriptor tests on aarch64 The aarch64 uses 'trad' for traditional tls and 'desc' for tls descriptors, but unlike other targets it defaults to 'desc'. The gnutls2 configure check does not set aarch64 as an ABI that uses TLS descriptors, which then disable somes stests. Also rename the internal machinery fron gnu2 to tls descriptors. Checked on aarch64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-03-19 14:53:30 -03:00
Adhemerval Zanella	64c7e34428	arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372) ARM _dl_tlsdesc_dynamic slow path has two issues: * The ip/r12 is defined by AAPCS as a scratch register, and gcc is used to save the stack pointer before on some function calls. So it should also be saved/restored as well. It fixes the tst-gnu2-tls2. * None of the possible VFP registers are saved/restored. ARM has the additional complexity to have different VFP bank sizes (depending of VFP support by the chip). The tst-gnu2-tls2 test is extended to check for VFP registers, although only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test it and it is almost a decade since newer hardware was released). With this patch there is no need to mark tst-gnu2-tls2 as XFAIL. Checked on arm-linux-gnueabihf. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-03-19 14:53:30 -03:00
Andreas Schwab	fd7ee2e6c5	Add tst-gnu2-tls2mod1 to test-internal-extras That allows sysdeps/x86_64/tst-gnu2-tls2mod1.S to use internal headers. Fixes: `717ebfa85c` ("x86-64: Allocate state buffer space for RDI, RSI and RBX")	2024-03-19 14:28:28 +01:00
H.J. Lu	717ebfa85c	x86-64: Allocate state buffer space for RDI, RSI and RBX _dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack. After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11. Define TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to STATE_SAVE_OFFSET(%rsp). +==================+<- stack frame start aligned at 8 or 16 bytes \| \|<- RDI saved in the red zone \| \|<- RSI saved in the red zone \| \|<- RBX saved in the red zone \| \|<- paddings for stack realignment of 64 bytes \|------------------\|<- xsave buffer end aligned at 64 bytes \| \|<- \| \|<- \| \|<- \|------------------\|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp) \| \|<- 8-byte padding for 64-byte alignment \| \|<- 8-byte padding for 64-byte alignment \| \|<- R11 \| \|<- R10 \| \|<- R9 \| \|<- R8 \| \|<- RDX \| \|<- RCX +==================+<- RSP aligned at 64 bytes Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI and RBX are saved onto stack without adjusting stack pointer first, using the red-zone. This fixes BZ #31501. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-03-18 19:45:13 -07:00
Darius Rad	f44f3aed31	riscv: Update nofpu libm test ulps Fix two test failures. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-03-18 11:28:50 +01:00
Florian Weimer	7a76f21867	linux: Use rseq area unconditionally in sched_getcpu (bug 31479) Originally, nptl/descr.h included <sys/rseq.h>, but we removed that in commit `2c6b4b272e` ("nptl: Unconditionally use a 32-byte rseq area"). After that, it was not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c compilation that provided a definition. This commit always checks the rseq area for CPU number information before using the other approaches. This adds an unnecessary (but well-predictable) branch on architectures which do not define RSEQ_SIG, but its cost is small compared to the system call. Most architectures that have vDSO acceleration for getcpu also have rseq support. Fixes: `2c6b4b272e` Fixes: `1d350aa060` Reviewed-by: Arjun Shankar <arjun@redhat.com>	2024-03-15 19:08:24 +01:00
Szabolcs Nagy	73c26018ed	aarch64: fix check for SVE support in assembler Due to GCC bug 110901 -mcpu can override -march setting when compiling asm code and thus a compiler targetting a specific cpu can fail the configure check even when binutils gas supports SVE. The workaround is that explicit .arch directive overrides both -mcpu and -march, and since that's what the actual SVE memcpy uses the configure check should use that too even if the GCC issue is fixed independently. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-03-14 14:27:56 +00:00
Joseph Myers	2367bf468c	Update kernel version to 6.8 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 6.8. (There are no new constants covered by these tests in 6.8 that need any other header changes.) Tested with build-many-glibcs.py.	2024-03-13 19:46:21 +00:00
Joseph Myers	3de2f8755c	Update syscall lists for Linux 6.8 Linux 6.8 adds five new syscalls. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.	2024-03-13 13:57:56 +00:00
Adhemerval Zanella	4a76fb1da8	powerpc: Remove power8 strcasestr optimization Similar to strstr (`1e9a550ba4`), power8 strcasestr does not show much improvement compared to the generic implementation. The geomean on bench-strcasestr shows: __strcasestr_power8 __strcasestr_ppc power10 1159 1120 power9 1640 1469 power8 1787 1904 The strcasestr uses the same 'trick' as power7 strstr to detect potential quadradic behavior, which only adds overheads for input that trigger quadradic behavior and it is really a hack. Checked on powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-03-12 17:11:01 -03:00
Adhemerval Zanella	2149da3683	riscv: Fix alignment-ignorant memcpy implementation The memcpy optimization (commit `587a1290a1`) has a series of mistakes: - The implementation is wrong: the chunk size calculation is wrong leading to invalid memory access. - It adds ifunc supports as default, so --disable-multi-arch does not work as expected for riscv. - It mixes Linux files (memcpy ifunc selection which requires the vDSO/syscall mechanism) with generic support (the memcpy optimization itself). - There is no __libc_ifunc_impl_list, which makes testing only check the selected implementation instead of all supported by the system. This patch also simplifies the required bits to enable ifunc: there is no need to memcopy.h; nor to add Linux-specific files. The __memcpy_noalignment tail handling now uses a branchless strategy similar to aarch64 (overlap 32-bits copies for sizes 4..7 and byte copies for size 1..3). Checked on riscv64 and riscv32 by explicitly enabling the function on __libc_ifunc_impl_list on qemu-system. Changes from v1: * Implement the memcpy in assembly to correctly handle RISCV strict-alignment. Reviewed-by: Evan Green <evan@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-12 14:38:08 -03:00
Andreas Schwab	2173173d57	linux/sigsetops: fix type confusion (bug 31468) Each mask in the sigset array is an unsigned long, so fix __sigisemptyset to use that instead of int. The __sigword function returns a simple array index, so it can return int instead of unsigned long.	2024-03-12 10:00:22 +01:00
caiyinyu	aeee41f1cf	LoongArch: Correct {__ieee754, _}_scalb -> {__ieee754, _}_scalbf	2024-03-12 14:07:27 +08:00
Sunil K Pandey	b6e3898194	x86-64: Simplify minimum ISA check ifdef conditional with if Replace minimum ISA check ifdef conditional with if. Since MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants, compiler will perform constant folding optimization, getting same results. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-03-03 15:47:53 -08:00

1 2 3 4 5 ...

16147 Commits