This patch refactor ARM memchr ifunc selector to a C implementation.
No functional change is expected, including ifunc resolution rules.
It also reorganize the ifunc options code:
1. The memchr_impl.S is renamed to memchr_neon.S and multiple
compilation options (which route to armv6t2/memchr one) is
removed. The code to build if __ARM_NEON__ is defined is
also simplified.
2. A memchr_noneon is added (which as build along previous ifunc
resolution) and includes the armv6t2 direct.
3. Same as 2. for loader object.
Alongside the aforementioned changes, it also some cleanus:
- Internal memchr definition (__GI_memcpy) is now a hidden
symbol.
- No need to create hidden definition for the ifunc variants.
Checked on armv7-linux-gnueabihf and with a build for arm-linux-gnueabi,
arm-linux-gnueabihf with and without multiarch support and with both
GCC 7.1 and GCC mainline.
* sysdeps/arm/armv7/multiarch/Makefile [$(subdir) = string]
(sysdeps_routines): Add memchr_noneon.
* sysdeps/arm/armv7/multiarch/ifunc-memchr.h: New file.
* sysdeps/arm/armv7/multiarch/memchr_noneon.S: Likewise.
* sysdeps/arm/armv7/multiarch/rtld-memchr.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr.S: Remove file.
* sysdeps/arm/armv7/multiarch/memchr.c: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Move to ...
* sysdeps/arm/armv7/multiarch/memchr_neon.S: ... here.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch refactor ARM memcpy ifunc selector to a C implementation.
No functional change is expected, including ifunc resolution rules.
It also adds some cleanup:
- Internal memcpy hidden definition (__GI_memcpy) is now a hidden
symbol.
- No need to create hidden definition for the ifunc variants.
Checked on armv7-linux-gnueabihf and with a build for arm-linux-gnueabi,
arm-linux-gnueabihf with and without multiarch support and with both
GCC 7.1 and GCC mainline. I also checked with the some possible
multiarch different configurations that trigger different memcpy
buids (__ARM_NEON__ && !__SOFT_FP__, !__ARM_NEON__ && !__SOFT_FP__, and
!__ARM_NEON__ && __SOFT_FP__).
* sysdeps/arm/arm-ifunc.h: New file.
* sysdeps/arm/armv7/multiarch/ifunc-memcpy.h: Likewise.
* sysdeps/arm/armv7/multiarch/memcpy.c: Likewise.
* sysdeps/arm/armv7/multiarch/memcpy_arm.S: Likewise.
* sysdeps/arm/armv7/multiarch/rtld-memcpy.S: Likewise.
* sysdeps/arm/armv7/multiarch/memcpy_neon.S [!__ARM_NEON__]
(__memcpy_neon): Avoid create hidden alias.
* sysdeps/arm/armv7/multiarch/memcpy_vfp.S [!__ARM_NEON_]
(__memcpy_vfp): Likewise.
* sysdeps/arm/armv7/multiarch/Makefile [$(subdir) = string]
(sysdep_routines): Add memcpy_arm.
* sysdeps/arm/armv7/multiarch/memcpy.S: Remove file.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch provides an optimised implementation of memchr using NEON
instructions to improve its performance, especially with longer search regions.
This gave an improvement in performance against the Thumb2+DSP optimised code,
with more significant gains for larger inputs. The NEON code also wins in cases
where the input is small (less than 8 bytes) by defaulting to a simple
byte-by-byte search. This avoids the overhead imposed by filling two quadword
registers from memory.
* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to
sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for
__memchr_neon.
Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.
Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a
full toolchain bootstrap. Benchmark tests were ran on ARMv7-A and ARMv8-A
hardware targets.
I've moved the ARM port from ports to the main sysdeps hierarchy.
Beyond the README update, the move of the files was simply
git mv ports/sysdeps/arm sysdeps/arm
git mv ports/sysdeps/unix/arm sysdeps/unix/arm
git mv ports/sysdeps/unix/sysv/linux/arm sysdeps/unix/sysv/linux/arm
and in addition to the ChangeLog entries here, I put a note at the top
of ports/ChangeLog.arm similar to that at the top of
ChangeLog.powerpc. There is deliberately no NEWS change, as I think
it makes the most sense to put in a general note above all ports
having moved if we can achieve that for 2.20.
Tested that disassembly of installed shared libraries for arm is the
same before and after this patch, except for data (not instructions)
in ld.so (there are assertions in sysdeps/arm/dl-machine.h, and the
path by which that file is found, and so by which it appears in the
assertion message, changes as a result of the move).
* sysdeps/arm: Move directory from ports/sysdeps/arm.
* sysdeps/unix/arm: Move directory from ports/sysdeps/unix/arm.
* sysdeps/unix/sysv/linux/arm: Move directory from
ports/sysdeps/unix/sysv/linux/arm.
* README: Update listing for arm-*-linux-gnueabi.
ports/ChangeLog.arm:
* sysdeps/arm: Move directory to ../sysdeps/arm.
* sysdeps/unix/arm: Move directory to ../sysdeps.arm.
* sysdeps/unix/sysv/linux/arm: Move directory to
../sysdeps/unix/sysv/linux/arm.