glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-12 22:30:12 +00:00

Noah Goldstein ffe75982cc x86: Remove memcmp-sse4.S

Code didn't actually use any sse4 instructions since `ptest` was
removed in:

    commit `2f9062d717`
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date: Wed Nov 10 16:18:56 2021 -0600

        x86: Shrink memcmp-sse4.S code size

The new memcmp-sse2 implementation is also faster.

geometric_mean(N=20) of page cross cases SSE2 / SSE4: 0.905

Note there are two regressions preferring SSE2 for Size = 1 and
Size = 65.

Size = 1:
size, align0, align1, ret, New Time/Old Time
   1,      1,      1,   0,               1.2
   1,      1,      1,   1,             1.197
   1,      1,      1,  -1,               1.2

This is intentional. Size == 1 is significantly less hot based on
profiles of GCC11 and Python3 than sizes [4, 8] (which is made
hotter).

Python3 Size = 1      -> 13.64%
Python3 Size = [4, 8] -> 60.92%

GCC11   Size = 1      ->  1.29%
GCC11   Size = [4, 8] -> 33.86%

size, align0, align1, ret, New Time/Old Time
   4,      4,      4,   0,             0.622
   4,      4,      4,   1,             0.797
   4,      4,      4,  -1,             0.805
   5,      5,      5,   0,             0.623
   5,      5,      5,   1,             0.777
   5,      5,      5,  -1,             0.802
   6,      6,      6,   0,             0.625
   6,      6,      6,   1,             0.813
   6,      6,      6,  -1,             0.788
   7,      7,      7,   0,             0.625
   7,      7,      7,   1,             0.799
   7,      7,      7,  -1,             0.795
   8,      8,      8,   0,             0.625
   8,      8,      8,   1,             0.848
   8,      8,      8,  -1,             0.914
   9,      9,      9,   0,             0.625

Size = 65:
size, align0, align1, ret, New Time/Old Time
  65,      0,      0,   0,             1.103
  65,      0,      0,   1,             1.216
  65,      0,      0,  -1,             1.227
  65,     65,      0,   0,             1.091
  65,      0,     65,   1,              1.19
  65,     65,     65,  -1,             1.215

This is because A) the checks in range [65, 96] are now unrolled 2x
and B) smaller values <= 16 are now given a hotter path. By
contrast the SSE4 version has a branch for Size = 80. The unrolled
version gets better performance for returns which need both
comparisons.

size, align0, align1, ret, New Time/Old Time
 128,      4,      8,   0,             0.858
 128,      4,      8,   1,             0.879
 128,      4,      8,  -1,             0.888

As well, outside of microbenchmark environments, which are not fully
predictable, the branch will have a real cost.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit `7cbc03d030`)

2022-05-16 18:55:16 -07:00
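The commit message summarizes its benchmark as a geometric mean of New Time/Old Time ratios. As a minimal sketch of how such a summary can be computed, the Python below takes the geometric mean of the ratios listed in the commit message for sizes 4 through 9. The function name `geometric_mean` and the list name `RATIOS_SIZE_4_TO_9` are illustrative choices, not part of glibc or its benchmark tooling, and these rows are not the 20 page-cross cases behind the 0.905 figure, so the result is not expected to match that number.

```python
import math

# New Time / Old Time ratios copied from the size 4..9 rows of the
# commit message. Values below 1.0 mean the new code is faster.
RATIOS_SIZE_4_TO_9 = [
    0.622, 0.797, 0.805,  # size 4
    0.623, 0.777, 0.802,  # size 5
    0.625, 0.813, 0.788,  # size 6
    0.625, 0.799, 0.795,  # size 7
    0.625, 0.848, 0.914,  # size 8
    0.625,                # size 9
]


def geometric_mean(values):
    """Return the geometric mean of a non-empty list of positive numbers."""
    if not values:
        raise ValueError("geometric mean of an empty list is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    # Averaging in log space avoids overflow or underflow of a long product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    print(f"geomean(New/Old) = {geometric_mean(RATIOS_SIZE_4_TO_9):.3f}")
```

A geometric mean is the usual choice for ratios because it treats speedups and slowdowns symmetrically: ratios of 0.5 and 2.0 average to 1.0, whereas their arithmetic mean would be 1.25.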
aarch64	elf: Fix runtime linker auditing on aarch64 (BZ #26643)	2022-04-12 13:33:10 -04:00
alpha	elf: Add _dl_audit_pltexit	2022-04-08 14:18:12 -04:00
arc	elf: Fix dynamic-link.h usage on rtld.c	2022-04-08 14:18:11 -04:00
arm	elf: Add _dl_audit_pltexit	2022-04-08 14:18:12 -04:00
csky	elf: Fix dynamic-link.h usage on rtld.c	2022-04-08 14:18:11 -04:00
generic	elf: Fix runtime linker auditing on aarch64 (BZ #26643)	2022-04-12 13:33:10 -04:00
gnu	hurd: Fix glob lstat compatibility	2021-07-22 20:31:52 +02:00
hppa	hppa: Fix bind-now audit (BZ #28857)	2022-04-12 13:33:17 -04:00
htl	htl: Do not expose pthread hidden proto outside libpthread	2021-07-18 20:25:33 +00:00
hurd
i386	i386: Regenerate ulps	2022-04-27 21:20:43 -04:00
ia64	elf: Issue la_symbind for bind-now (BZ #23734)	2022-04-12 13:32:59 -04:00
ieee754	Update math: redirect roundeven function	2021-06-27 07:56:57 -07:00
m68k	elf: Add _dl_audit_pltexit	2022-04-08 14:18:12 -04:00
mach	hurd if_index: Explicitly use AF_INET for if index discovery	2022-02-03 16:22:04 +01:00
microblaze	elf: Fix dynamic-link.h usage on rtld.c	2022-04-08 14:18:11 -04:00
mips	elf: Add _dl_audit_pltexit	2022-04-08 14:18:12 -04:00
nios2	elf: Fix dynamic-link.h usage on rtld.c	2022-04-08 14:18:11 -04:00
nptl	nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029)	2022-04-15 09:52:54 -03:00
posix	getcwd: Set errno to ERANGE for size == 1 (CVE-2021-3999)	2022-01-24 11:37:06 +05:30
powerpc	elf: Issue la_symbind for bind-now (BZ #23734)	2022-04-12 13:32:59 -04:00
pthread	nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029)	2022-04-15 09:52:54 -03:00
riscv	elf: Fix dynamic-link.h usage on rtld.c	2022-04-08 14:18:11 -04:00
s390	S390: Add new s390 platform z16.	2022-04-14 14:21:57 +02:00
sh	elf: Add _dl_audit_pltexit	2022-04-08 14:18:12 -04:00
sparc	elf: Add _dl_audit_pltexit	2022-04-08 14:18:12 -04:00
unix	Add HWCAP2_AFP, HWCAP2_RPRES from Linux 5.17 to AArch64 bits/hwcap.h	2022-05-03 11:08:52 +02:00
wordsize-32
wordsize-64
x86	x86: Improve L to support L(XXX_SYMBOL (YYY, ZZZ))	2022-05-16 18:52:19 -07:00
x86_64	x86: Remove memcmp-sse4.S	2022-05-16 18:55:16 -07:00