Mirror of https://sourceware.org/git/glibc.git, synced 2025-01-15 05:20:05 +00:00
f8f72bc0c3
This patch provides an optimised implementation of memchr using NEON instructions to improve its performance, especially with longer search regions. It improves on the Thumb2+DSP optimised code, with more significant gains for larger inputs. The NEON code also wins when the input is small (less than 8 bytes) by defaulting to a simple byte-by-byte search, which avoids the overhead of filling two quadword registers from memory.

* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for __memchr_neon. Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.

Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a full toolchain bootstrap. Benchmark tests were run on ARMv7-A and ARMv8-A hardware targets.

||
---|---|---|
.. | ||
aeabi_memcpy.c | ||
ifunc-impl-list.c | ||
Makefile | ||
memchr_impl.S | ||
memchr_neon.S | ||
memchr.S | ||
memcpy_impl.S | ||
memcpy_neon.S | ||
memcpy_vfp.S | ||
memcpy.S |
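
The commit message describes two paths: a byte-by-byte search for inputs shorter than 8 bytes, and a wide path that works on two quadword registers (32 bytes) at a time. The following is a minimal portable C sketch of that control flow, not the assembly in memchr_neon.S. The names `memchr_sketch`, `scan_bytes`, `SMALL_INPUT_LIMIT` and `BLOCK_SIZE` are hypothetical, and the inner block loop is only a stand-in for the vector compare the real code presumably performs.

```c
#include <stddef.h>

/* Hypothetical constants for illustration. The 8-byte cut-off comes from
   the commit message; 32 bytes is two 16-byte quadword registers. */
#define SMALL_INPUT_LIMIT 8
#define BLOCK_SIZE 32

/* Plain byte-by-byte scan: returns the first match in p[0..n) or NULL. */
static void *scan_bytes(const unsigned char *p, unsigned char c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p[i] == c)
            return (void *)(p + i);
    return NULL;
}

/* Sketch of the two-path structure; this is an assumption about the
   overall shape, not a transcription of the NEON implementation. */
void *memchr_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char target = (unsigned char)c;

    /* Small inputs: skip the wide path and its setup cost entirely. */
    if (n < SMALL_INPUT_LIMIT)
        return scan_bytes(p, target, n);

    /* Wide path: test a whole block for any match before locating it. */
    while (n >= BLOCK_SIZE) {
        unsigned char hit = 0;
        /* Stand-in for a vector compare: OR together per-byte matches. */
        for (size_t i = 0; i < BLOCK_SIZE; i++)
            hit |= (unsigned char)(p[i] == target);
        if (hit)
            return scan_bytes(p, target, BLOCK_SIZE);
        p += BLOCK_SIZE;
        n -= BLOCK_SIZE;
    }

    /* Tail shorter than one block. */
    return scan_bytes(p, target, n);
}
```

The sketch has the same contract as the standard `memchr`: it returns a pointer to the first matching byte, or NULL if the byte does not occur in the first `n` bytes.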