The z13/vector-optimized wcsncmp implementation segfaults if n=1
and there is only one character (equal in both strings) before
the page end. In that case it loads and compares one character but
fails to check n again, and the subsequent load faults.
This patch removes the extra load and compare of the first character
and starts directly with the loop that uses vector-load-to-block-boundary.
That code path also checks n.
With this patch both tests are passing:
- the simplified one mentioned in bugzilla bug 31934
- the full one in Florian Weimer's patch:
"manual: Document a GNU extension for strncmp/wcsncmp"
(https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
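A reduced reproducer in the spirit of the simplified test could look like
this (a sketch only; the mmap layout and names are illustrative, not the
exact test from the bug report):

  #include <assert.h>
  #include <sys/mman.h>
  #include <unistd.h>
  #include <wchar.h>

  int
  main (void)
  {
    /* Map two pages and make the second inaccessible so that any read
       past the first page faults.  */
    size_t pagesize = sysconf (_SC_PAGESIZE);
    char *p = mmap (NULL, 2 * pagesize, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert (p != MAP_FAILED);
    assert (mprotect (p + pagesize, pagesize, PROT_NONE) == 0);

    /* Place a single wide character directly before the page end.  */
    wchar_t *s1 = (wchar_t *) (p + pagesize) - 1;
    *s1 = L'a';
    wchar_t s2[] = L"a";

    /* With n == 1 the comparison must stop after the first (equal)
       character; the buggy z13 code performed one more load from the
       protected page.  */
    return wcsncmp (s1, s2, 1) != 0;
  }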
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 9b76514103)
These structs describe file formats under /var/log, and should not
depend on the definition of _TIME_BITS. This is achieved by
defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that
support 32-bit time_t values (where __time_t is 32 bits).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 9abdae94c7)
The default <utmp-size.h> is for ports with a 64-bit time_t.
Ports with a 32-bit time_t or with __WORDSIZE_TIME64_COMPAT32=1
need to override it.
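The point of these constants can be illustrated with a compile-time size
check along the following lines (a sketch only; the macro names and the
sizes shown are illustrative and port specific, here the long-standing
x86_64 values):

  #include <utmp.h>

  /* Hypothetical per-port constants, as a port's utmp-size.h might
     provide them.  */
  #define UTMP_SIZE 384
  #define LASTLOG_SIZE 292

  _Static_assert (sizeof (struct utmp) == UTMP_SIZE,
                  "struct utmp layout must not depend on _TIME_BITS");
  _Static_assert (sizeof (struct lastlog) == LASTLOG_SIZE,
                  "struct lastlog layout must not depend on _TIME_BITS");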
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 4d4da5aab9)
This seems to have stopped working with some GCC 14 versions,
which clobber r2. With other compilers, the kernel-provided
r2 value is still available at this point.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
(cherry picked from commit 14e56bd4ce)
Old Linux kernels disable SVE after every system call. Calling the
SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
As a result, applications with a high use of syscalls may run slower with
the SVE memcpy. This affects kernels from 4.15.0 up to, but not including,
6.2.0, except for 5.14.0, which was patched. Avoid this by checking the kernel
version and selecting the SVE ifunc on modern kernels.
Parse the kernel version reported by uname() into a 24-bit kernel.major.minor
value without calling any library functions. If uname() is not supported or
if the version format is not recognized, assume the kernel is modern.
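A user-space sketch of such a parser (illustrative only; the in-tree code
runs during ifunc resolution and therefore avoids calling library
functions, while this sketch simply uses the uname wrapper):

  #include <sys/utsname.h>

  /* Pack the version as (major << 16) | (minor << 8) | patch, e.g.
     6.2.0 -> 0x060200.  Return -1 ("assume modern") if uname fails or
     the format is not recognized.  */
  static int
  kernel_version (void)
  {
    struct utsname buf;
    if (uname (&buf) != 0)
      return -1;

    int parts[3] = { 0, 0, 0 };
    const char *p = buf.release;
    for (int i = 0; i < 3; i++)
      {
        if (*p < '0' || *p > '9')
          return -1;
        while (*p >= '0' && *p <= '9')
          parts[i] = parts[i] * 10 + (*p++ - '0');
        if (i < 2 && *p == '.')
          p++;
      }
    return (parts[0] << 16) | (parts[1] << 8) | parts[2];
  }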
Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 2e94e2f5d2)
Due to GCC bug 110901, -mcpu can override the -march setting when compiling
asm code, and thus a compiler targeting a specific CPU can fail the
configure check even when binutils gas supports SVE.
The workaround relies on the fact that an explicit .arch directive overrides
both -mcpu and -march; since that is what the actual SVE memcpy uses, the
configure check should use it too, even if the GCC issue is fixed
independently.
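The check can therefore feed the assembler a fragment with an explicit
.arch directive, roughly along these lines (illustrative, not the exact
configure test):

  .arch armv8.2-a+sve
  ptrue p0.b    // assembles only if gas supports SVE, regardless of -mcpu/-march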
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 73c26018ed)
The .cfi_return_column directive changes the return column for the whole
FDE range. But the actual intent is to tell the unwinder that the value
in x30 (lr) now resides in x15 after the move, and that is expressed by
the .cfi_register directive.
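An illustrative form of the corrected annotation (DWARF register numbers
30 and 15 correspond to x30 and x15; the surrounding code is schematic):

  mov x15, x30
  .cfi_register 30, 15    /* the return address (x30) now lives in x15 */
  ...
  mov x30, x15
  .cfi_restore 30         /* x30 holds its own value again */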
(cherry picked from commit 3f79842788)
The latest implementations of memcpy are actually faster than the Falkor
implementations [1], so remove the falkor/phecda ifuncs for memcpy and
the now unused IS_FALKOR/IS_PHECDA defines.
[1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 2f5524cc53)
Add a specialized memset for the common ZVA size of 64 to avoid the
overhead of reading the ZVA size. Since the code is identical to
__memset_falkor, remove the latter.
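For reference, the run-time ZVA-size query that the specialized variant
avoids looks roughly like this (illustrative only):

  mrs x1, dczid_el0    // generic path: read the DC ZVA block size
  and x1, x1, #15      // BS field: log2 of the block size in 4-byte words
  mov x2, #4
  lsl x2, x2, x1       // block size in bytes; 64 when BS == 4
  ...
  dc  zva, x0          // the specialized variant assumes 64 and skips the query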
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 3d7090f14b)
Clean up the emag memset: merge in the memset_base64.S file and remove
the unused ZVA code (since it is disabled on emag).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 9627ab99b5)
Clean up the ifuncs. Remove uses of libc_hidden_builtin_def, use ENTRY rather
than ENTRY_ALIGN, remove unnecessary defines and conditional compilation.
Rename strlen_mte to strlen_generic. Remove rtld-memset.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 9fd3409842)
Add support for MOPS in cpu_features and INIT_ARCH. Add ifuncs using MOPS for
memcpy, memmove and memset (use .inst for now so it works with all binutils
versions without needing complex configure and conditional compilation).
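The capability check underlying the new ifunc selection can be sketched
like this (simplified; the in-tree selectors use glibc's cpu_features and
INIT_ARCH machinery rather than getauxval, and HWCAP2_MOPS is shown with
its Linux 6.5 value as an assumption in case the installed headers
predate it):

  #include <stdio.h>
  #include <sys/auxv.h>

  #ifndef HWCAP2_MOPS
  # define HWCAP2_MOPS (1UL << 43)   /* bit assigned in Linux 6.5 */
  #endif

  int
  main (void)
  {
    unsigned long hwcap2 = getauxval (AT_HWCAP2);
    /* An ifunc resolver would return the MOPS memcpy/memmove/memset here.  */
    puts ((hwcap2 & HWCAP2_MOPS) ? "MOPS available" : "MOPS not available");
    return 0;
  }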
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 2bd0017988)
Linux 6.5 adds a new AArch64 HWCAP2 value, HWCAP2_MOPS. Add it to
glibc's bits/hwcap.h.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
(cherry picked from commit ff5d2abd18)
Improve SVE memcpy by copying 2 vectors if the size is small enough.
This improves performance of random memcpy by ~9% on Neoverse V1, and
33-64 byte copies are ~16% faster.
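The idea, sketched with ACLE intrinsics rather than the actual assembly
(illustrative only; compile with SVE enabled, e.g. -march=armv8.2-a+sve):

  #include <arm_sve.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Copy up to two SVE vectors' worth of bytes using predicated loads
     and stores; lanes at or beyond N are inactive and never touch
     memory.  */
  static void
  small_copy_sve (uint8_t *dst, const uint8_t *src, size_t n)
  {
    uint64_t vl = svcntb ();                   /* bytes per SVE vector */
    svbool_t p0 = svwhilelt_b8_u64 (0, n);
    svbool_t p1 = svwhilelt_b8_u64 (vl, n);    /* all false when n <= vl */
    svuint8_t v0 = svld1_u8 (p0, src);
    svuint8_t v1 = svld1_u8 (p1, src + vl);
    svst1_u8 (p0, dst, v0);
    svst1_u8 (p1, dst + vl, v1);
  }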
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit d2d3f3720c)
Use shrn for narrowing the mask, which simplifies the code and speeds up
small strings. Unroll the first search loop to improve performance on large
strings.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 55599d4804)
Optimize strnlen using the shrn instruction and improve the main loop.
Small strings are around 10% faster, large strings are 40% faster on
modern CPUs.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit ad098893ba)
Optimize strlen by unrolling the main loop. Large strings are 64% faster on
modern CPUs.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 03c8ce5000)
Unroll the main loop. Large strings are around 20% faster on modern CPUs.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 349e48c01e)
Simplify calculation of the mask using shrn. Unroll the main loop.
Small strings are 20% faster on modern CPUs.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 51541a2297)
Use shrn for the mask, merge tst+bne into cbnz, and tweak code alignment.
Performance improves slightly as a result.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 1bbb1a2022)
Optimize the main loop - large strings are 43% faster on modern CPUs.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 0077624177)
Optimize the main loop - large strings are 40% faster on modern CPUs.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit ce758d4f06)
Since __memcpy_simd is the fastest memcpy on almost all cores, replace
the generic memcpy with it. If SVE is available, an SVE memcpy will be
used by default (including for Neoverse N2).
(cherry picked from commit e6f3fe362f)
We found that string functions were using AND+ADDP to find the
nibble/syndrome mask, but there is an easier approach using
`SHRN dst.8b, src.8h, 4` (shift each 16-bit element right by 4 and narrow
it to 1 byte), which has the same latency as ADDP on all SIMD ARMv8
targets. There are also possible opportunities for memcmp, but that is
for another patch.
We see 10-20% savings for small to mid-size cases (<= 128 bytes), which
are the primary cases for general workloads.
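For example, the typical end-of-string check becomes something like this
(illustrative sequence; register choices and labels are arbitrary):

  ld1  {v0.16b}, [x0], 16
  cmeq v1.16b, v0.16b, #0   // 0xff in every byte that is NUL
  shrn v1.8b, v1.8h, #4     // narrow to a 64-bit syndrome, 4 bits per byte
  fmov x1, d1
  cbz  x1, Lloop            // no NUL in this 16-byte chunk
  rbit x1, x1               // little-endian: find the first set nibble
  clz  x1, x1
  lsr  x1, x1, #2           // byte index of the NUL within the chunk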
(cherry picked from commit 3c99806989)
Add an initial SVE memcpy implementation. Copies of up to 32 bytes use SVE
vectors, which improves the random memcpy benchmark significantly.
Clean up the memcpy and memmove ifunc selectors.
(cherry picked from commit 9f298bfe1f)
Rewrite memcmp to improve performance. On small and medium inputs performance
is 10-20% better. Large inputs use a SIMD loop processing 64 bytes per
iteration, which is 30-50% faster depending on the size.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit b51eb35c57)
This restores the 2.33 semantics of arena_get2. It was changed by
11a02b035b to avoid arena_get2 calling malloc (back when __get_nprocs
was refactored to use a scratch_buffer - 903bc7dcc2). __get_nprocs has
since been refactored again and no longer calls malloc.
Commit 11a02b035b did not take any performance implications into
consideration, which should have been discussed properly. The
__get_nprocs_sched function is still used as a fallback mechanism if
procfs and sysfs are not accessible.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 472894d2cf)
The ffsll function randomly regresses by ~20%, depending on how the code
gets aligned in memory. The ffsll function's code size is 17 bytes. Since
the default function alignment is 16 bytes, it can be loaded at 16-, 32-,
48- or 64-byte aligned memory. When the ffsll function is loaded at 16-,
32- or 64-byte aligned memory, the entire code fits in a single 64-byte
cache line. When the ffsll function is loaded at 48-byte aligned memory,
it is split across two cache lines, hence the random regression.
Reducing the ffsll function's size from 17 bytes to 12 bytes ensures that
it will always fit in a single 64-byte cache line.
This patch fixes the random performance regression of ffsll.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 9d94997b5f)
_dl_tlsdesc_undefweak and _dl_tlsdesc_dynamic access the thread pointer
via the tcb field in TCB:
_dl_tlsdesc_undefweak:
        _CET_ENDBR
        movq    8(%rax), %rax
        subq    %fs:0, %rax
        ret

_dl_tlsdesc_dynamic:
        ...
        subq    %fs:0, %rax
        movq    -8(%rsp), %rdi
        ret
Since the tcb field in TCB is a pointer, %fs:0 is a 32-bit location,
not 64-bit. It should use "sub %fs:0, %RAX_LP" instead. Since
_dl_tlsdesc_undefweak returns ptrdiff_t and _dl_make_tlsdesc_dynamic
returns void *, RAX_LP is appropriate here for x32 and x86-64. This
fixes BZ #31185.
(cherry picked from commit 81be2a61da)
On x32, I got
FAIL: elf/tst-tlsgap
$ gdb elf/tst-tlsgap
...
open tst-tlsgap-mod1.so
Thread 2 "tst-tlsgap" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 2268754]
_dl_tlsdesc_dynamic () at ../sysdeps/x86_64/dl-tlsdesc.S:108
108 movq (%rsi), %rax
(gdb) p/x $rsi
$4 = 0xf7dbf9005655fb18
(gdb)
This is caused by
_dl_tlsdesc_dynamic:
        _CET_ENDBR
        /* Preserve call-clobbered registers that we modify.
           We need two scratch regs anyway.  */
        movq    %rsi, -16(%rsp)
        movq    %fs:DTV_OFFSET, %rsi
Since the dtv field in TCB is a pointer, %fs:DTV_OFFSET is a 32-bit
location, not 64-bit. Load the dtv field to RSI_LP instead of rsi.
This fixes BZ #31184.
(cherry picked from commit 3502440397)
This patch fixes a very recently added leak in getaddrinfo.
This was assigned CVE-2023-5156.
Resolves: BZ #30884
Related: BZ #30842
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit ec6b95c330)
When an NSS plugin only implements the _gethostbyname2_r and
_getcanonname_r callbacks, getaddrinfo could use memory that was freed
during tmpbuf resizing, through h_name in a previous query response.
The backing store for res->at->name when doing a query with
gethostbyname3_r or gethostbyname2_r is tmpbuf, which is reallocated in
gethosts during the query. For AF_INET6 lookup with AI_ALL |
AI_V4MAPPED, gethosts gets called twice, once for a v6 lookup and a second
time for a v4 lookup. In this case, if the first call reallocates tmpbuf
enough times to result in a malloc, th->h_name (which res->at->name refers
to) ends up in heap-allocated storage in tmpbuf.
Now if the second call to gethosts also causes the plugin callback to
return NSS_STATUS_TRYAGAIN, tmpbuf will get freed, resulting in a UAF
reference in res->at->name. This then gets dereferenced in the
getcanonname_r plugin call, resulting in the use after free.
Fix this by copying h_name over and freeing it at the end. This
resolves BZ #30843, which is assigned CVE-2023-4806.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit 973fe93a56)
All other cases of failures due to lack of memory return EAI_MEMORY, so
it seems wrong to return EAI_SYSTEM here. The only reason
convert_hostent_to_gaih_addrtuple could fail is on calloc failure.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit b587456c0e)
Simplify the loop a wee bit and clean up variable names too.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit ac4653ef50)
Flatten the condition nesting and replace the alloca for RET.AT/ATR with
a single array LOCAL_AT[2]. This gets rid of alloca and alloca
accounting.
`git diff -b` is probably the best way to view this change since much of
the diff is whitespace changes.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 657472b2a5)
The macro is quite a pain to debug, so make gethosts into a function to
make it easier to maintain.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit cfa3bd48cb)
Add a new member got_ipv6 to indicate whether the results include an IPv6
address, and use it instead of the local got_ipv6.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit e7e5315b7f)
Add a free_at flag in gaih_result to indicate if res.at needs to be
freed by the caller.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit b44389cb7f)
Introduce the gaih_result structure and the general paradigm for the
cleanups that follow: process the lookup request and return a result. A
lookup function (like text_to_binary_address) should return an integer
error code and set members of gaih_result based on what it finds. If the
function does not have a result and no errors have occurred during the
lookup, it should return 0 and leave res.at set to NULL, allowing a
subsequent function to do the lookup until we run out of options.
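A schematic of that convention (simplified types and names, not the exact
internals):

  struct gaih_addrtuple;          /* address list, as in the NSS interface */

  /* Simplified stand-in for the internal result type.  */
  struct gaih_result
  {
    struct gaih_addrtuple *at;    /* address results; NULL if none yet */
    char *canon;                  /* canonical name, when requested */
  };

  /* Each lookup step returns an EAI_* error on failure, 0 with res->at
     set on success, or 0 with res->at left as NULL so that the next
     step can try.  */
  static int
  lookup_step (const char *name, struct gaih_result *res)
  {
    /* ... attempt one resolution strategy and fill *res on success ... */
    return 0;
  }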
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 26dea46119)
Refactor the code to split out the service resolution code into a
separate function. Allocate the service tuples array just once to the
size of the typeproto array, thus avoiding the unnecessary pointer
chasing and stack allocations.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 8d6cf99f2f)
Use realloc in convert_hostent_to_gaih_addrtuple and fix up pointers in
the result list so that a single block is maintained for
hostbyname3_r/hostbyname2_r and freed in gaih_inet. This result is
never merged with any other results, since the hosts database does not
permit merging.
Resolves BZ #28852.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 3004604607)
Simplify logic for allocation of canon to remove the canonbuf variable;
canon now always points to an allocated block. Also pull the canon name
set into a separate function.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit d01411f6bc)
Allocation of address tuples is currently a bit confusing because of the
pointer chasing through PAT, making it hard to observe the sequence in
which allocations have been made. Narrow the scope of the pointer chasing
through PAT so that it is only used where necessary.
This also tightens the actions behaviour with the hosts database in
getaddrinfo to comply with the manual text. The "continue" action discards
previous results and the "merge" action results in an immediate lookup
failure. Consequently, chaining of allocations across modules is no longer
necessary, thus opening up cleanup opportunities.
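For example, with hosts lines like the following (illustrative
configuration; a real /etc/nsswitch.conf would contain only one of them):

  hosts: files [SUCCESS=continue] dns   # results from files are discarded, dns is used
  hosts: files [SUCCESS=merge] dns      # merge is rejected for hosts: the lookup fails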
A test has been added that checks some combinations to ensure that they
work correctly.
Resolves: BZ #28931
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 1c37b8022e)