glibc

mirror of https://sourceware.org/git/glibc.git synced 2025-01-14 21:10:19 +00:00

Noah Goldstein 642933158e x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions	2022-11-08 19:22:33 -08:00

Optimizations are:
1. Use more overlapping stores to avoid branches.
2. Reduce how unrolled the aligning copies are (this is more of a code-size save; it's a negative for some sizes in terms of perf).
3. For st{r|p}n{cat|cpy} re-order the branches to minimize the number that are taken.

Performance Changes:

Times are from N = 10 runs of the benchmark suite and are reported as geometric mean of all ratios of New Implementation / Old Implementation.

strcat-avx2 -> 0.998
strcpy-avx2 -> 0.937
stpcpy-avx2 -> 0.971
strncpy-avx2 -> 0.793
stpncpy-avx2 -> 0.775
strncat-avx2 -> 0.962

Code Size Changes:

function -> Bytes New / Bytes Old -> Ratio
strcat-avx2 -> 685 / 1639 -> 0.418
strcpy-avx2 -> 560 / 903 -> 0.620
stpcpy-avx2 -> 592 / 939 -> 0.630
strncpy-avx2 -> 1176 / 2390 -> 0.492
stpncpy-avx2 -> 1268 / 2438 -> 0.520
strncat-avx2 -> 1042 / 2563 -> 0.407

Notes:
1. Because of the significant difference between the implementations they are split into three files.
   strcpy-avx2.S -> strcpy, stpcpy, strcat
   strncpy-avx2.S -> strncpy
   strncat-avx2.S -> strncat
   I couldn't find a way to merge them without making the ifdefs incredibly difficult to follow.

Full check passes on x86-64 and build succeeds for all ISA levels w/ and w/o multiarch.
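
The "overlapping stores" idea named in the commit message can be illustrated with a small C sketch. This is not code from the commit (which is AVX2 assembly); `copy_8_to_16` is a hypothetical helper, and the 8 to 16 byte range is an assumption chosen only to keep the example short. The point is that two fixed-width stores, one anchored at the start and one at the end, cover every length in the range without branching on the exact length.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from glibc): copy n bytes, where 8 <= n <= 16.
   Instead of branching on the exact length, load the first 8 bytes and
   the last 8 bytes, then store both.  When n < 16 the two stores overlap
   in the middle, which is harmless because the overlapping bytes hold
   the same source data.  Source and destination must not overlap.  */
static void copy_8_to_16(char *dst, const char *src, size_t n)
{
    uint64_t head, tail;

    /* Fixed-size memcpy calls compile to single unaligned loads/stores
       on common compilers and avoid undefined behavior from casts.  */
    memcpy(&head, src, 8);          /* bytes [0, 8)     */
    memcpy(&tail, src + n - 8, 8);  /* bytes [n - 8, n) */

    memcpy(dst, &head, 8);
    memcpy(dst + n - 8, &tail, 8);
}
```

For example, calling `copy_8_to_16(dst, src, 13)` writes bytes 0 to 7 and bytes 5 to 12, so bytes 5 to 7 are written twice with identical values. The vectorized routines apply the same principle with wider registers.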
..
aarch64	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
alpha	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
arc	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
arm	configure: Use -Wno-ignored-attributes if compiler warns about multiple aliases	2022-11-01 09:51:06 -03:00
csky	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
generic	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
gnu	errlist: add missing entry for EDEADLOCK (bug 29545)	2022-09-08 11:40:24 +02:00
hppa	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
htl	htl: Make pthread*_cond_timedwait register wref before releasing mutex	2022-08-22 22:27:24 +02:00
hurd	hurd: Fix pthread_kill on exiting/ted thread	2022-01-15 15:11:54 +01:00
i386	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
ia64	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
ieee754	Fix build with GCC 13 _FloatN, _FloatNx built-in functions	2022-10-31 23:20:08 +00:00
loongarch	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
m68k	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
mach	hurd: Add sigtimedwait and sigwaitinfo support	2022-11-07 21:16:26 +01:00
microblaze	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
mips	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
nios2	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
nptl	Use atomic_exchange_release/acquire	2022-09-26 16:58:08 +01:00
or1k	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
posix	get_nscd_addresses: Fix subscript typos [BZ #29605 ]	2022-09-28 12:47:10 -04:00
powerpc	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
pthread	Do not define static_assert or thread_local in headers for C2x	2022-09-07 18:39:28 +00:00
riscv	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
s390	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
sh	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
sparc	elf: Introduce <dl-call_tls_init_tp.h> and call_tls_init_tp (bug 29249)	2022-11-03 17:28:03 +01:00
unix	Linux: Add ppoll fortify symbol for 64 bit time_t (BZ# 29746)	2022-11-08 13:37:06 -03:00
wordsize-32	Update copyright dates with scripts/update-copyrights	2022-01-01 11:40:24 -08:00
wordsize-64	configure: Use -Wno-ignored-attributes if compiler warns about multiple aliases	2022-11-01 09:51:06 -03:00
x86	elf: Remove _dl_string_hwcap	2022-10-06 07:59:48 -03:00
x86_64	x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions	2022-11-08 19:22:33 -08:00