glibc/sysdeps
Wilco Dijkstra b31bd11454 AArch64: Improve A64FX memcpy
v2 is a complete rewrite of the A64FX memcpy. Performance is improved
by streamlining the code, aligning all large copies and using a single
unrolled loop for all sizes. The code size for memcpy and memmove goes
down from 1796 bytes to 868 bytes. Performance is better in all cases:
bench-memcpy-random is 2.3% faster overall, bench-memcpy-large is ~33%
faster for large sizes, and bench-memcpy-walk is 25% faster for small
sizes and 20% faster for the largest sizes. The geomean of all tests in
bench-memcpy is 5.1% faster, and total time is reduced by 4%.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-12-02 18:36:03 +00:00
Name | Last commit | Date
aarch64 | AArch64: Improve A64FX memcpy | 2021-12-02 18:36:03 +00:00
alpha | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
arc | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
arm | arm: Use have-mtls-dialect-gnu2 to check for ARM TLS descriptors support | 2021-11-01 16:23:15 -03:00
csky | String: Add hidden defs for __memcmpeq() to enable internal usage | 2021-10-26 16:51:29 -05:00
generic | elf: Introduce GLRO (dl_libc_freeres), called from __libc_freeres | 2021-11-17 12:20:29 +01:00
gnu | Remove "Contributed by" lines | 2021-09-03 22:06:44 +05:30
hppa | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
htl | htl: Reimplement GSCOPE | 2021-09-16 01:04:17 +02:00
hurd | Update copyright dates with scripts/update-copyrights | 2021-01-02 12:17:34 -08:00
i386 | String: Add hidden defs for __memcmpeq() to enable internal usage | 2021-10-26 16:51:29 -05:00
ia64 | String: Add hidden defs for __memcmpeq() to enable internal usage | 2021-10-26 16:51:29 -05:00
ieee754 | Fixed inaccuracy of j0f (BZ #28185) | 2021-10-05 13:45:37 +02:00
m68k | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
mach | io: Refactor close_range and closefrom | 2021-11-24 09:09:37 -03:00
microblaze | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
mips | ld.so: Initialize bootstrap_map.l_ld_readonly [BZ #28340] | 2021-10-19 06:40:38 -07:00
nios2 | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
nptl | nptl: Extract <bits/atomic_wide_counter.h> from pthread_cond_common.c | 2021-11-17 12:20:13 +01:00
posix | posix: Remove spawni.c | 2021-09-27 12:44:25 -03:00
powerpc | powerpc64[le]: Fix CFI and LR save address for asm syscalls [BZ #28532] | 2021-11-30 15:18:52 -03:00
pthread | nptl: Do not set signal mask on second setjmp return [BZ #28607] | 2021-11-24 08:59:54 +01:00
riscv | riscv: Build with -mno-relax if linker does not support R_RISCV_ALIGN | 2021-11-03 09:25:06 -03:00
s390 | s390: Use long branches across object boundaries (jgh instead of jh) | 2021-11-10 15:21:37 +01:00
sh | elf: Fix dynamic-link.h usage on rtld.c | 2021-10-14 14:52:07 -03:00
sparc | String: Add hidden defs for __memcmpeq() to enable internal usage | 2021-10-26 16:51:29 -05:00
unix | linux: Implement pipe in terms of __NR_pipe2 | 2021-11-30 13:13:03 -03:00
wordsize-32 | Disable symbol hack in libc_nonshared.a | 2021-09-27 07:46:25 -07:00
wordsize-64 | Remove "Contributed by" lines | 2021-09-03 22:06:44 +05:30
x86 | x86: Double size of ERMS rep_movsb_threshold in dl-cacheinfo.h | 2021-11-06 16:18:08 -05:00
x86_64 | x86-64: Add vector sin/sinf to libmvec microbenchmark | 2021-11-24 07:50:23 -08:00