glibc

mirror of https://sourceware.org/git/glibc.git synced 2025-01-11 03:40:06 +00:00

Author	SHA1	Message	Date
Florian Weimer	29f833f5ab	Linux: Remove DL_FIND_ARG_COMPONENTS The generic definition is always used since the Native Client port has been removed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `2d47fa6862`)	2022-05-17 08:08:52 +02:00
Florian Weimer	1695c5e0f6	Linux: Remove HAVE_AUX_SECURE, HAVE_AUX_XID, HAVE_AUX_PAGESIZE They are always defined. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `b9c3d3382f`)	2022-05-17 08:08:52 +02:00
Florian Weimer	756d583c9e	elf: Merge dl-sysdep.c into the Linux version The generic version is the de-facto Linux implementation. It requires an auxiliary vector, so Hurd does not use it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `91c0a47ffb`)	2022-05-17 08:08:52 +02:00
Noah Goldstein	2c4fc8e5ca	x86: Optimize {str\|wcs}rchr-evex The new code unrolls the main loop slightly without adding too much overhead and minimizes the comparisons for the search CHAR. Geometric Mean of all benchmarks New / Old: 0.755 See email for all results. Full xcheck passes on x86_64 with and without multiarch enabled. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `c966099cdc`)	2022-05-16 12:42:46 -07:00
Noah Goldstein	fdbc8439ac	x86: Optimize {str\|wcs}rchr-avx2 The new code unrolls the main loop slightly without adding too much overhead and minimizes the comparisons for the search CHAR. Geometric Mean of all benchmarks New / Old: 0.832 See email for all results. Full xcheck passes on x86_64 with and without multiarch enabled. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `df7e295d18`)	2022-05-16 12:40:38 -07:00
Noah Goldstein	b05c0c8b28	x86: Optimize {str\|wcs}rchr-sse2 The new code unrolls the main loop slightly without adding too much overhead and minimizes the comparisons for the search CHAR. Geometric Mean of all benchmarks New / Old: 0.741 See email for all results. Full xcheck passes on x86_64 with and without multiarch enabled. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `5307aa9c18`)	2022-05-16 12:37:03 -07:00
H.J. Lu	bc35e22be4	x86-64: Fix SSE2 memcmp and SSSE3 memmove for x32 Clear the upper 32 bits in RDX (memory size) for x32 to fix FAIL: string/tst-size_t-memcmp FAIL: string/tst-size_t-memcmp-2 FAIL: string/tst-size_t-memcpy FAIL: wcsmbs/tst-size_t-wmemcmp on x32 introduced by `8804157ad9` x86: Optimize memcmp SSE2 in memcmp.S `26b2478322` x86: Reduce code size of mem{move\|pcpy\|cpy}-ssse3 Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com> (cherry picked from commit `8ea20ee5f6`)	2022-05-16 12:34:52 -07:00
Noah Goldstein	4d1841deb7	x86: Fix missing __wmemcmp def for disable-multiarch build commit `8804157ad9` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Fri Apr 15 12:27:59 2022 -0500 x86: Optimize memcmp SSE2 in memcmp.S Only defined wmemcmp and missed __wmemcmp. This commit fixes that by defining __wmemcmp and setting wmemcmp as a weak alias to __wmemcmp. Both multiarch and disable-multiarch builds succeed and full xchecks pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `c72a1a062a`)	2022-05-16 12:31:39 -07:00
Noah Goldstein	cee9939f67	x86: Cleanup page cross code in memcmp-avx2-movbe.S Old code was both inefficient and wasted code size. New code (-62 bytes) and comparable or better performance in the page cross case. geometric_mean(N=20) of page cross cases New / Original: 0.960 size, align0, align1, ret, New Time/Old Time 1, 4095, 0, 0, 1.001 1, 4095, 0, 1, 0.999 1, 4095, 0, -1, 1.0 2, 4094, 0, 0, 1.0 2, 4094, 0, 1, 1.0 2, 4094, 0, -1, 1.0 3, 4093, 0, 0, 1.0 3, 4093, 0, 1, 1.0 3, 4093, 0, -1, 1.0 4, 4092, 0, 0, 0.987 4, 4092, 0, 1, 1.0 4, 4092, 0, -1, 1.0 5, 4091, 0, 0, 0.984 5, 4091, 0, 1, 1.002 5, 4091, 0, -1, 1.005 6, 4090, 0, 0, 0.993 6, 4090, 0, 1, 1.001 6, 4090, 0, -1, 1.003 7, 4089, 0, 0, 0.991 7, 4089, 0, 1, 1.0 7, 4089, 0, -1, 1.001 8, 4088, 0, 0, 0.875 8, 4088, 0, 1, 0.881 8, 4088, 0, -1, 0.888 9, 4087, 0, 0, 0.872 9, 4087, 0, 1, 0.879 9, 4087, 0, -1, 0.883 10, 4086, 0, 0, 0.878 10, 4086, 0, 1, 0.886 10, 4086, 0, -1, 0.873 11, 4085, 0, 0, 0.878 11, 4085, 0, 1, 0.881 11, 4085, 0, -1, 0.879 12, 4084, 0, 0, 0.873 12, 4084, 0, 1, 0.889 12, 4084, 0, -1, 0.875 13, 4083, 0, 0, 0.873 13, 4083, 0, 1, 0.863 13, 4083, 0, -1, 0.863 14, 4082, 0, 0, 0.838 14, 4082, 0, 1, 0.869 14, 4082, 0, -1, 0.877 15, 4081, 0, 0, 0.841 15, 4081, 0, 1, 0.869 15, 4081, 0, -1, 0.876 16, 4080, 0, 0, 0.988 16, 4080, 0, 1, 0.99 16, 4080, 0, -1, 0.989 17, 4079, 0, 0, 0.978 17, 4079, 0, 1, 0.981 17, 4079, 0, -1, 0.98 18, 4078, 0, 0, 0.981 18, 4078, 0, 1, 0.98 18, 4078, 0, -1, 0.985 19, 4077, 0, 0, 0.977 19, 4077, 0, 1, 0.979 19, 4077, 0, -1, 0.986 20, 4076, 0, 0, 0.977 20, 4076, 0, 1, 0.986 20, 4076, 0, -1, 0.984 21, 4075, 0, 0, 0.977 21, 4075, 0, 1, 0.983 21, 4075, 0, -1, 0.988 22, 4074, 0, 0, 0.983 22, 4074, 0, 1, 0.994 22, 4074, 0, -1, 0.993 23, 4073, 0, 0, 0.98 23, 4073, 0, 1, 0.992 23, 4073, 0, -1, 0.995 24, 4072, 0, 0, 0.989 24, 4072, 0, 1, 0.989 24, 4072, 0, -1, 0.991 25, 4071, 0, 0, 0.99 25, 4071, 0, 1, 0.999 25, 4071, 0, -1, 0.996 26, 4070, 0, 0, 0.993 26, 4070, 0, 1, 0.995 26, 4070, 0, -1, 0.998 27, 4069, 0, 0, 0.993 27, 4069, 0, 1, 0.999 27, 4069, 0, -1, 1.0 28, 4068, 0, 0, 0.997 28, 4068, 0, 1, 1.0 28, 4068, 0, -1, 0.999 29, 4067, 0, 0, 0.996 29, 4067, 0, 1, 0.999 29, 4067, 0, -1, 0.999 30, 4066, 0, 0, 0.991 30, 4066, 0, 1, 1.001 30, 4066, 0, -1, 0.999 31, 4065, 0, 0, 0.988 31, 4065, 0, 1, 0.998 31, 4065, 0, -1, 0.998 Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `23102686ec`)	2022-05-16 12:28:37 -07:00
Noah Goldstein	0909286ffa	x86: Remove memcmp-sse4.S Code didn't actually use any sse4 instructions since `ptest` was removed in: commit `2f9062d717` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Wed Nov 10 16:18:56 2021 -0600 x86: Shrink memcmp-sse4.S code size The new memcmp-sse2 implementation is also faster. geometric_mean(N=20) of page cross cases SSE2 / SSE4: 0.905 Note there are two regressions preferring SSE2 for Size = 1 and Size = 65. Size = 1: size, align0, align1, ret, New Time/Old Time 1, 1, 1, 0, 1.2 1, 1, 1, 1, 1.197 1, 1, 1, -1, 1.2 This is intentional. Size == 1 is significantly less hot based on profiles of GCC11 and Python3 than sizes [4, 8] (which is made hotter). Python3 Size = 1 -> 13.64% Python3 Size = [4, 8] -> 60.92% GCC11 Size = 1 -> 1.29% GCC11 Size = [4, 8] -> 33.86% size, align0, align1, ret, New Time/Old Time 4, 4, 4, 0, 0.622 4, 4, 4, 1, 0.797 4, 4, 4, -1, 0.805 5, 5, 5, 0, 0.623 5, 5, 5, 1, 0.777 5, 5, 5, -1, 0.802 6, 6, 6, 0, 0.625 6, 6, 6, 1, 0.813 6, 6, 6, -1, 0.788 7, 7, 7, 0, 0.625 7, 7, 7, 1, 0.799 7, 7, 7, -1, 0.795 8, 8, 8, 0, 0.625 8, 8, 8, 1, 0.848 8, 8, 8, -1, 0.914 9, 9, 9, 0, 0.625 Size = 65: size, align0, align1, ret, New Time/Old Time 65, 0, 0, 0, 1.103 65, 0, 0, 1, 1.216 65, 0, 0, -1, 1.227 65, 65, 0, 0, 1.091 65, 0, 65, 1, 1.19 65, 65, 65, -1, 1.215 This is because A) the checks in range [65, 96] are now unrolled 2x and B) because smaller values <= 16 are now given a hotter path. By contrast the SSE4 version has a branch for Size = 80. The unrolled version has get better performance for returns which need both comparisons. size, align0, align1, ret, New Time/Old Time 128, 4, 8, 0, 0.858 128, 4, 8, 1, 0.879 128, 4, 8, -1, 0.888 As well, out of microbenchmark environments that are not full predictable the branch will have a real-cost. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `7cbc03d030`)	2022-05-16 12:25:18 -07:00
Noah Goldstein	5a8df6485c	x86: Optimize memcmp SSE2 in memcmp.S New code save size (-303 bytes) and has significantly better performance. geometric_mean(N=20) of page cross cases New / Original: 0.634 Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `8804157ad9`)	2022-05-16 12:19:59 -07:00
Noah Goldstein	af0865571a	x86: Small improvements for wcslen Just a few QOL changes. 1. Prefer `add` > `lea` as it has high execution units it can run on. 2. Don't break macro-fusion between `test` and `jcc` 3. Reduce code size by removing gratuitous padding bytes (-90 bytes). geometric_mean(N=20) of all benchmarks New / Original: 0.959 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `244b415d38`)	2022-05-16 12:17:42 -07:00
Noah Goldstein	3b710e32d8	x86: Remove AVX str{n}casecmp The rational is: 1. SSE42 has nearly identical logic so any benefit is minimal (3.4% regression on Tigerlake using SSE42 versus AVX across the benchtest suite). 2. AVX2 version covers the majority of targets that previously prefered it. 3. The targets where AVX would still be best (SnB and IVB) are becoming outdated. All in all the saving the code size is worth it. All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `305769b2a1`)	2022-05-16 12:12:10 -07:00
Noah Goldstein	fc5d42bf82	x86: Add EVEX optimized str{n}casecmp geometric_mean(N=40) of all benchmarks EVEX / SSE42: .621 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `84e7c46df4`)	2022-05-16 12:07:56 -07:00
Noah Goldstein	33fcf8344f	x86: Add AVX2 optimized str{n}casecmp geometric_mean(N=40) of all benchmarks AVX2 / SSE42: .702 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `bbf8122234`)	2022-05-16 12:00:53 -07:00
Noah Goldstein	3496d64d69	x86: Optimize str{n}casecmp TOLOWER logic in strcmp-sse42.S Slightly faster method of doing TOLOWER that saves an instruction. Also replace the hard coded 5-byte no with .p2align 4. On builds with CET enabled this misaligned entry to strcasecmp. geometric_mean(N=40) of all benchmarks New / Original: .920 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `d154758e61`)	2022-05-16 11:59:00 -07:00
Noah Goldstein	283982b362	x86: Optimize str{n}casecmp TOLOWER logic in strcmp.S Slightly faster method of doing TOLOWER that saves an instruction. Also replace the hard coded 5-byte no with .p2align 4. On builds with CET enabled this misaligned entry to strcasecmp. geometric_mean(N=40) of all benchmarks New / Original: .894 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `670b54bc58`)	2022-05-16 11:56:16 -07:00
Noah Goldstein	420cd6f155	x86: Remove strspn-sse2.S and use the generic implementation The generic implementation is faster. geometric_mean(N=20) of all benchmarks New / Original: .710 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `9c8a6ad620`)	2022-05-16 11:52:47 -07:00
Noah Goldstein	4b61d76521	x86: Remove strpbrk-sse2.S and use the generic implementation The generic implementation is faster (see strcspn commit). All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `6533585352`)	2022-05-16 11:48:09 -07:00
Noah Goldstein	2fef1961a7	x86: Remove strcspn-sse2.S and use the generic implementation The generic implementation is faster. geometric_mean(N=20) of all benchmarks New / Original: .678 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `fe28e7d9d9`)	2022-05-16 10:56:35 -07:00
Noah Goldstein	1ed2813eb1	x86: Optimize strspn in strspn-c.c Use _mm_cmpeq_epi8 and _mm_movemask_epi8 to get strlen instead of _mm_cmpistri. Also change offset to unsigned to avoid unnecessary sign extensions. geometric_mean(N=20) of all benchmarks that dont fallback on sse2; New / Original: .901 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `412d103431`)	2022-05-16 10:51:41 -07:00
Noah Goldstein	3214c878f2	x86: Optimize strcspn and strpbrk in strcspn-c.c Use _mm_cmpeq_epi8 and _mm_movemask_epi8 to get strlen instead of _mm_cmpistri. Also change offset to unsigned to avoid unnecessary sign extensions. geometric_mean(N=20) of all benchmarks that dont fallback on sse2/strlen; New / Original: .928 All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `30d627d477`)	2022-05-16 10:45:15 -07:00
Noah Goldstein	ff9772ac19	x86: Code cleanup in strchr-evex and comment justifying branch Small code cleanup for size: -81 bytes. Add comment justifying using a branch to do NULL/non-null return. All string/memory tests pass and no regressions in benchtests. geometric_mean(N=20) of all benchmarks New / Original: .985 Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `ec285ea904`)	2022-05-16 10:42:53 -07:00
Noah Goldstein	424bbd4d25	x86: Code cleanup in strchr-avx2 and comment justifying branch Small code cleanup for size: -53 bytes. Add comment justifying using a branch to do NULL/non-null return. All string/memory tests pass and no regressions in benchtests. geometric_mean(N=20) of all benchmarks Original / New: 1.00 Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `a6fbf4d51e`)	2022-05-16 10:39:59 -07:00
Adhemerval Zanella	0a10b8b181	x86_64: Remove bcopy optimizations The symbols is not present in current POSIX specification and compiler already generates memmove call. (cherry picked from commit `bf92893a14`)	2022-05-16 10:36:04 -07:00
H.J. Lu	f0a53588da	x86-64: Define __memcmpeq in ld.so Define __memcmpeq in ld.so so that compiler can generate __memcmpeq call when compiling for ld.so. (cherry picked from commit `a5659cf27d`)	2022-05-16 10:33:14 -07:00
H.J. Lu	a133623048	x86-64: Remove bzero weak alias in SS2 memset commit `3d9f171bfb` Author: H.J. Lu <hjl.tools@gmail.com> Date: Mon Feb 7 05:55:15 2022 -0800 x86-64: Optimize bzero added the optimized bzero. Remove bzero weak alias in SS2 memset to avoid undefined __bzero in memset-sse2-unaligned-erms. (cherry picked from commit `0fb8800029`)	2022-05-16 10:29:20 -07:00
H.J. Lu	18baf86f51	x86_64/multiarch: Sort sysdep_routines and put one entry per line (cherry picked from commit `c328d0152d`)	2022-05-16 10:26:53 -07:00
H.J. Lu	d422197a69	x86: Improve L to support L(XXX_SYMBOL (YYY, ZZZ)) (cherry picked from commit `1283948f23`)	2022-05-16 10:20:57 -07:00
Siddhesh Poyarekar	58947e1fa5	fortify: Ensure that __glibc_fortify condition is a constant [BZ #29141 ] The fix `c8ee1c85` introduced a -1 check for object size without also checking that object size is a constant. Because of this, the tree optimizer passes in gcc fail to fold away one of the branches in __glibc_fortify and trips on a spurious Wstringop-overflow. The warning itself is incorrect and the branch does go away eventually in DCE in the rtl passes in gcc, but the constant check is a helpful hint to simplify code early, so add it in. Resolves: BZ #29141 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2022-05-16 21:28:38 +05:30
Florian Weimer	28ea43f8d6	dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo The information is theoretically available via dl_iterate_phdr as well, but that approach is very slow if there are many shared objects. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@rehdat.com> (cherry picked from commit `d056c21213`)	2022-05-11 19:17:47 +02:00
Florian Weimer	78f82ab4ef	manual: Document the dlinfo function Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@rehdat.com> (cherry picked from commit `93804a1ee0`)	2022-05-11 19:17:47 +02:00
Adhemerval Zanella	bbb017a2bb	NEWS: Add a bug fix entry for BZ #29109	2022-05-06 11:34:18 -03:00
Adhemerval Zanella	5c0d94d780	linux: Fix posix_spawn return code if clone fails (BZ#29109) The __clone_internal returns the error on errno. Checked on x86_64-linux-gnu. (cherry picked from commit `71e2a681f1`)	2022-05-06 11:09:52 -03:00
Noah Goldstein	059e36d9ed	x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896 ] Overflow case for __wcsncmp_avx2_rtm should be __wcscmp_avx2_rtm not __wcscmp_avx2. commit `ddf0992cf5` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Sun Jan 9 16:02:21 2022 -0600 x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755] Set the wrong fallback function for `__wcsncmp_avx2_rtm`. It was set to fallback on to `__wcscmp_avx2` instead of `__wcscmp_avx2_rtm` which can cause spurious aborts. This change will need to be backported. All string/memory tests pass. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `9fef7039a7`)	2022-05-04 22:41:54 -07:00
Noah Goldstein	676f7bcf11	x86: Fix bug in strncmp-evex and strncmp-avx2 [BZ #28895 ] Logic can read before the start of `s1` / `s2` if both `s1` and `s2` are near the start of a page. To avoid having the result contimated by these comparisons the `strcmp` variants would mask off these comparisons. This was missing in the `strncmp` variants causing the bug. This commit adds the masking to `strncmp` so that out of range comparisons don't affect the result. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass as well a full xcheck on x86_64 linux. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `e108c02a5e`)	2022-05-04 22:41:15 -07:00
Noah Goldstein	c394d7e11a	x86: Set .text section in memset-vec-unaligned-erms commit `3d9f171bfb` Author: H.J. Lu <hjl.tools@gmail.com> Date: Mon Feb 7 05:55:15 2022 -0800 x86-64: Optimize bzero Remove setting the .text section for the code. This commit adds that back. (cherry picked from commit `7912236f4a`)	2022-05-04 22:40:57 -07:00
H.J. Lu	de0cd691b2	x86-64: Optimize bzero memset with zero as the value to set is by far the majority value (99%+ for Python3 and GCC). bzero can be slightly more optimized for this case by using a zero-idiom xor for broadcasting the set value to a register (vector or GPR). Co-developed-by: Noah Goldstein <goldstein.w.n@gmail.com> (cherry picked from commit `3d9f171bfb`)	2022-05-04 22:40:21 -07:00
Noah Goldstein	0bf9c8b5fe	x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only) commit `b62ace2740` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Sun Feb 6 00:54:18 2022 -0600 x86: Improve vec generation in memset-vec-unaligned-erms.S Revert usage of 'pshufb' in broadcast logic as it is an SSSE3 instruction and memset.S is restricted to only SSE2 instructions. (cherry picked from commit `1b0c60f95b`)	2022-05-04 22:40:04 -07:00
Noah Goldstein	58596411ad	x86: Improve vec generation in memset-vec-unaligned-erms.S No bug. Split vec generation into multiple steps. This allows the broadcast in AVX2 to use 'xmm' registers for the L(less_vec) case. This saves an expensive lane-cross instruction and removes the need for 'vzeroupper'. For SSE2 replace 2x 'punpck' instructions with zero-idiom 'pxor' for byte broadcast. Results for memset-avx2 small (geomean of N = 20 benchset runs). size, New Time, Old Time, New / Old 0, 4.100, 3.831, 0.934 1, 5.074, 4.399, 0.867 2, 4.433, 4.411, 0.995 4, 4.487, 4.415, 0.984 8, 4.454, 4.396, 0.987 16, 4.502, 4.443, 0.987 All relevant string/wcsmbs tests are passing. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit `b62ace2740`)	2022-05-04 22:39:44 -07:00
H.J. Lu	36766c02af	x86-64: Fix strcmp-evex.S Change "movl %edx, %rdx" to "movl %edx, %edx" in: commit `8418eb3ff4` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Mon Jan 10 15:35:39 2022 -0600 x86: Optimize strcmp-evex.S (cherry picked from commit `0e0199a9e0`)	2022-05-04 22:39:24 -07:00
H.J. Lu	250e277797	x86-64: Fix strcmp-avx2.S Change "movl %edx, %rdx" to "movl %edx, %edx" in: commit `b77b06e0e2` Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Mon Jan 10 15:35:38 2022 -0600 x86: Optimize strcmp-avx2.S (cherry picked from commit `c15efd011c`)	2022-05-04 22:39:13 -07:00
Noah Goldstein	34ef810945	x86: Optimize strcmp-evex.S Optimization are primarily to the loop logic and how the page cross logic interacts with the loop. The page cross logic is at times more expensive for short strings near the end of a page but not crossing the page. This is done to retest the page cross conditions with a non-faulty check and to improve the logic for entering the loop afterwards. This is only particular cases, however, and is general made up for by more than 10x improvements on the transition from the page cross -> loop case. The non-page cross cases as well are nearly universally improved. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> (cherry picked from commit `8418eb3ff4`)	2022-05-04 22:39:01 -07:00
Noah Goldstein	b68e782f8e	x86: Optimize strcmp-avx2.S Optimization are primarily to the loop logic and how the page cross logic interacts with the loop. The page cross logic is at times more expensive for short strings near the end of a page but not crossing the page. This is done to retest the page cross conditions with a non-faulty check and to improve the logic for entering the loop afterwards. This is only particular cases, however, and is general made up for by more than 10x improvements on the transition from the page cross -> loop case. The non-page cross cases are improved most for smaller sizes [0, 128] and go about even for (128, 4096]. The loop page cross logic is improved so some more significant speedup is seen there as well. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> (cherry picked from commit `b77b06e0e2`)	2022-05-04 22:38:29 -07:00
Siddhesh Poyarekar	ec5b79aac7	manual: Clarify that abbreviations of long options are allowed The man page and code comments clearly state that abbreviations of long option names are recognized correctly as long as they are unique. Document this fact in the glibc manual as well. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Andreas Schwab <schwab@linux-m68k.org> (cherry picked from commit `db1efe02c9`)	2022-05-04 15:58:53 +05:30
Joseph Myers	0bcba53020	Add HWCAP2_AFP, HWCAP2_RPRES from Linux 5.17 to AArch64 bits/hwcap.h Add the new HWCAP2_AFP and HWCAP2_RPRES constants from Linux 5.17. Tested with build-many-glibcs.py for aarch64-linux-gnu. (cherry picked from commit `866c599182`)	2022-05-03 10:37:20 +02:00
Joseph Myers	95759abbf3	Add SOL_MPTCP, SOL_MCTP from Linux 5.16 to bits/socket.h Linux 5.16 adds constants SOL_MPTCP and SOL_MCTP to the getsockopt / setsockopt levels; add these constants to bits/socket.h. Tested for x86_64. (cherry picked from commit `fdc1ae67fe`)	2022-05-03 10:37:00 +02:00
Joseph Myers	eed29011f9	Update kernel version to 5.17 in tst-mman-consts.py This patch updates the kernel version in the test tst-mman-consts.py to 5.17. (There are no new MAP_* constants covered by this test in 5.17 that need any other header changes.) Tested with build-many-glibcs.py. (cherry picked from commit `23808a422e`)	2022-05-03 10:36:11 +02:00
Joseph Myers	e72c363a15	Update kernel version to 5.16 in tst-mman-consts.py This patch updates the kernel version in the test tst-mman-consts.py to 5.16. (There are no new MAP_* constants covered by this test in 5.16 that need any other header changes.) Tested with build-many-glibcs.py. (cherry picked from commit `790a607e23`)	2022-05-03 10:36:08 +02:00
Joseph Myers	edc06fdd62	Update syscall lists for Linux 5.17 Linux 5.17 has one new syscall, set_mempolicy_home_node. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py. (cherry picked from commit `8ef9196b26`)	2022-05-03 10:35:23 +02:00

1 2 3 4 5 ...

38457 Commits