glibc/sysdeps/aarch64/multiarch
Wilco Dijkstra b31bd11454 AArch64: Improve A64FX memcpy
v2 is a complete rewrite of the A64FX memcpy. Performance is improved
by streamlining the code, aligning all large copies and using a single
unrolled loop for all sizes. The code size for memcpy and memmove goes
down from 1796 bytes to 868 bytes. Performance is better in all cases:
bench-memcpy-random is 2.3% faster overall, bench-memcpy-large is ~33%
faster for large sizes, bench-memcpy-walk is 25% faster for small sizes
and 20% for the largest sizes. The geomean of all tests in bench-memcpy
is 5.1% faster, and total time is reduced by 4%.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-12-02 18:36:03 +00:00
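
The strategy named in the commit message above (align large copies, then run a single unrolled loop) can be illustrated in portable C. This is only a rough sketch, not the A64FX implementation, which is hand-written assembly in memcpy_a64fx.S. The name `sketch_memcpy`, the 64-byte `BLOCK`, the unroll factor of 4, the byte-wise head and tail handling, and the choice to align the destination rather than the source are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK  64   /* assumed block size, standing in for one vector register */
#define UNROLL 4    /* assumed unroll factor */

/* Hypothetical helper: copy one fixed-size block. A compiler will
   usually lower a constant-size memcpy to wide loads and stores. */
static inline void copy_block(unsigned char *d, const unsigned char *s)
{
    memcpy(d, s, BLOCK);
}

/* Illustrative only: non-overlapping copy that aligns the destination
   for large sizes and then uses one unrolled loop. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Small copies: plain byte loop, no alignment work. */
    if (n < (size_t)BLOCK * UNROLL) {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
        return dst;
    }

    /* Large copies: copy a short head so that d becomes BLOCK-aligned. */
    size_t head = (BLOCK - ((uintptr_t)d % BLOCK)) % BLOCK;
    for (size_t i = 0; i < head; i++)
        d[i] = s[i];
    d += head;
    s += head;
    n -= head;

    /* Single unrolled loop: UNROLL blocks per iteration. */
    while (n >= (size_t)BLOCK * UNROLL) {
        copy_block(d,             s);
        copy_block(d + BLOCK,     s + BLOCK);
        copy_block(d + 2 * BLOCK, s + 2 * BLOCK);
        copy_block(d + 3 * BLOCK, s + 3 * BLOCK);
        d += BLOCK * UNROLL;
        s += BLOCK * UNROLL;
        n -= BLOCK * UNROLL;
    }

    /* Tail: fewer than BLOCK * UNROLL bytes remain. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return dst;
}
```

Routing every large size through one loop keeps the code small, which is in line with the size reduction the commit reports. The real routine presumably handles heads and tails with vector operations rather than byte loops.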
ifunc-impl-list.c aarch64: Added optimized memset for A64FX 2021-05-27 09:47:53 +01:00
init-arch.h aarch64: Added optimized memcpy and memmove for A64FX 2021-05-27 09:47:53 +01:00
Makefile aarch64: Added optimized memset for A64FX 2021-05-27 09:47:53 +01:00
memchr_generic.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memchr_nosimd.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memchr.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_a64fx.S AArch64: Improve A64FX memcpy 2021-12-02 18:36:03 +00:00
memcpy_advsimd.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_falkor.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_generic.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_thunderx2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_thunderx.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy.c aarch64: Added optimized memcpy and memmove for A64FX 2021-05-27 09:47:53 +01:00
memmove.c aarch64: Added optimized memcpy and memmove for A64FX 2021-05-27 09:47:53 +01:00
memset_a64fx.S Revert "AArch64: Update A64FX memset not to degrade at 16KB" 2021-09-06 10:23:25 +01:00
memset_base64.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_emag.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_falkor.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_generic.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_kunpeng.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset.c aarch64: Added optimized memset for A64FX 2021-05-27 09:47:53 +01:00
rtld-memset.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen_asimd.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen_mte.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen.c aarch64: Move and update the definition of MTE_ENABLED 2021-01-25 15:35:43 +00:00