mirror of
https://sourceware.org/git/glibc.git
synced 2024-11-27 07:20:11 +00:00
f8f72bc0c3
This patch provides an optimised implementation of memchr using NEON instructions to improve its performance, especially with longer search regions. This gave an improvement in performance against the Thumb2+DSP optimised code, with more significant gains for larger inputs. The NEON code also wins in cases where the input is small (less than 8 bytes) by defaulting to a simple byte-by-byte search. This avoids the overhead imposed by filling two quadword registers from memory.

	* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to sysdep_routines.
	* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for __memchr_neon. Add ifunc definitions for __memchr_neon and __memchr_noneon.
	* sysdeps/arm/armv7/multiarch/memchr.S: New file.
	* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
	* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.

Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a full toolchain bootstrap. Benchmark tests were run on ARMv7-A and ARMv8-A hardware targets.
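
The dispatch described in the commit message (byte-by-byte search for inputs under 8 bytes, block-wise search otherwise) can be illustrated with a scalar C model. This is only a sketch, not the code in memchr_neon.S: the names memchr_sketch, bytewise_search, SMALL_LIMIT and BLOCK_SIZE are hypothetical, the 32-byte block size is an assumption derived from "two quadword registers" (2 x 16 bytes), and the inner loop merely stands in for the NEON compare.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants for illustration only. */
#define SMALL_LIMIT 8   /* commit message: inputs under 8 bytes use the byte loop */
#define BLOCK_SIZE  32  /* assumption: two 16-byte quadword registers per step */

/* Plain byte-by-byte search over n bytes. */
static void *bytewise_search(const unsigned char *p, unsigned char c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p[i] == c)
            return (void *)(p + i);
    return NULL;
}

/* Scalar model of the small-input fallback plus block-wise main loop. */
void *memchr_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    /* Small inputs: skip the cost of setting up the block loop. */
    if (n < SMALL_LIMIT)
        return bytewise_search(p, ch, n);

    while (n >= BLOCK_SIZE) {
        /* Stand-in for the vector compare: does any byte in the block match? */
        unsigned char hit = 0;
        for (size_t i = 0; i < BLOCK_SIZE; i++)
            hit |= (unsigned char)(p[i] == ch);

        /* On a hit, locate the first matching byte inside this block. */
        if (hit)
            return bytewise_search(p, ch, BLOCK_SIZE);

        p += BLOCK_SIZE;
        n -= BLOCK_SIZE;
    }

    /* Tail shorter than one block. */
    return bytewise_search(p, ch, n);
}
```

The sketch returns the same pointer as the standard memchr for any valid buffer, so it can be checked against memchr directly; it says nothing about the performance of the real NEON routine.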
4 lines
84 B
Makefile
ifeq ($(subdir),string)
sysdep_routines += memcpy_neon memcpy_vfp memchr_neon
endif