There exist optimized memcpy functions on s390, but no optimized mempcpy.
This patch adds mempcpy entry points in memcpy.S files, which
use the memcpy implementation. Now mempcpy itself is also an IFUNC function
as memcpy is and the variants are listed in ifunc-impl-list.c.
The s390 string.h does not define _HAVE_STRING_ARCH_mempcpy.
Instead mempcpy string/string.h inlines memcpy() + n.
If n is constant and small enough, GCC emits instructions like mvi or mvc
and avoids the function call to memcpy.
If n is not constant, then memcpy is called and n is added afterwards.
If _HAVE_STRING_ARCH_mempcpy would be defined, mempcpy would be called in
every case.
According to PR70140 "Inefficient expansion of __builtin_mempcpy"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70140) GCC should handle a
call to mempcpy in the same way as memcpy. Then either the mempcpy macro
in string/string.h has to be removed or _HAVE_STRING_ARCH_mempcpy has to
be defined for S390.
ChangeLog:
[BZ #19765]
* sysdeps/s390/mempcpy.S: New File.
* sysdeps/s390/multiarch/mempcpy.c: Likewise.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add mempcpy.
* sysdeps/s390/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
Add mempcpy variants.
* sysdeps/s390/s390-32/memcpy.S: Add mempcpy entry point.
(memcpy): Adjust to be usable from mempcpy entry point.
(__memcpy_mvcle): Likewise.
* sysdeps/s390/s390-64/memcpy.S: Likewise.
* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Add entry points
____mempcpy_z196, ____mempcpy_z10 and add __GI_ symbols for mempcpy.
(__memcpy_z196): Adjust to be usable from mempcpy entry point.
(__memcpy_z10): Likewise.
* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Likewise.
This patch provides optimized version of memrchr with the z13 vector
instructions.
ChangeLog:
* sysdeps/s390/multiarch/memrchr-c.c: New File.
* sysdeps/s390/multiarch/memrchr-vx.S: Likewise.
* sysdeps/s390/multiarch/memrchr.c: Likewise.
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Add memrchr functions.
* sysdeps/s390/multiarch/ifunc-impl-list-common.c
(__libc_ifunc_impl_list_common): Add ifunc test for memrchr.
This patch provides optimized versions of memccpy with the z13 vector
instructions.
ChangeLog:
* sysdeps/s390/multiarch/memccpy-c.c: New File.
* sysdeps/s390/multiarch/memccpy-vx.S: Likewise.
* sysdeps/s390/multiarch/memccpy.c: Likewise.
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Add memccpy functions.
* sysdeps/s390/multiarch/ifunc-impl-list-common.c
(__libc_ifunc_impl_list_common): Add ifunc test for memccpy.
* string/memccpy.c: Use MEMCCPY if defined.
This patch provides optimized versions of strcmp and wcscmp with the z13
vector instructions.
The architecture specific string.h had a typo, which leads to ommiting the
inline version in this file if __USE_STRING_INLINES is defined.
Tested this inline version by tweaking test-strcmp.c.
ChangeLog:
* sysdeps/s390/multiarch/strcmp-vx.S: New File.
* sysdeps/s390/multiarch/strcmp.c: Likewise.
* sysdeps/s390/multiarch/wcscmp-c.c: Likewise.
* sysdeps/s390/multiarch/wcscmp-vx.S: Likewise.
* sysdeps/s390/multiarch/wcscmp.c: Likewise.
* sysdeps/s390/s390-32/multiarch/strcmp.c: Likewise.
* sysdeps/s390/s390-64/multiarch/strcmp.c: Likewise.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcmp and
wcscmp functions.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add ifunc test for strcmp, wcscmp.
* string/strcmp.c (STRCMP): Define and use macro.
* benchtests/bench-wcscmp.c: New File.
* benchtests/Makefile (wcsmbs-bench): Add wcscmp.
* sysdeps/s390/bits/string.h: Fix typo: _HAVE_STRING_ARCH_strcmp
instead of _HAVE_STRING_ARCH_memchr.
This patch provides optimized versions of strlen and wcslen with the z13 vector
instructions.
The helper macro IFUNC_VX_IMPL is introduced and is used to register all
__<func>_c() and __<func>_vx() functions within __libc_ifunc_impl_list()
to the ifunc test framework.
ChangeLog:
* sysdeps/s390/multiarch/Makefile: New File.
* sysdeps/s390/multiarch/strlen-c.c: Likewise.
* sysdeps/s390/multiarch/strlen-vx.S: Likewise.
* sysdeps/s390/multiarch/strlen.c: Likewise.
* sysdeps/s390/multiarch/wcslen-c.c: Likewise.
* sysdeps/s390/multiarch/wcslen-vx.S: Likewise.
* sysdeps/s390/multiarch/wcslen.c: Likewise.
* string/strlen.c (STRLEN): Define and use macro.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(IFUNC_VX_IMPL): New macro function.
(__libc_ifunc_impl_list): Add ifunc test for strlen, wcslen.
* benchtests/Makefile (wcsmbs-bench): New variable.
(string-bench-all): Added wcsmbs-bench.
* benchtests/bench-wcslen.c: New File.
On s390 all ifunc resolvers were implemented in multiarch/ifunc-resolve.c.
The resulting single object files has undefined references to all ifunc-functions.
This patch introduces one multiarch/<func>.c file for each of memcpy, memcmp
and memset with the function specific ifunc resolver. The different function
implementations are now implemented in multiarch/<func>-s390x.S
(moved from multiarch/<func>.S).
The new multiarch/ifunc-resolve.h file contains the ifunc-resolver macro
and other helper-macros. They are merged and are now used in common for
32/64bit. Therefore the __<func>_g5/__<func>_z900 functions were renamed to
__<func>_default.
This patch also enables testing the ifunc implementations by implementing
the function __libc_ifunc_impl_list. It uses the helper-macros of ifunc-resolve.h.
ChangeLog:
* sysdeps/s390/s390-32/multiarch/Makefile (sysdep_routines):
Remove ifunc-resolve, add memset-s390, memcpy-s390, memcmp-s390.
* sysdeps/s390/s390-32/multiarch/ifunc-resolve.c: Delete File.
* sysdeps/s390/s390-32/multiarch/memcmp.S: Move to ...
* sysdeps/s390/s390-32/multiarch/memcmp-s390.S: ... here.
(memcmp, bcmp): Use __memcmp_default as alias source.
* sysdeps/s390/s390-32/multiarch/memcmp.c: New File.
* sysdeps/s390/s390-32/memcmp.S (__memcmp_g5):
Rename to __memcmp_default.
* sysdeps/s390/s390-32/multiarch/memcpy.S: Move to ...
* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: ... here.
(memcpy): Use __memcpy_default as alias source.
* sysdeps/s390/s390-32/multiarch/memcpy.c: New File.
* sysdeps/s390/s390-32/memcpy.S (__memcpy_g5):
Rename to __memcpy_default.
* sysdeps/s390/s390-32/multiarch/memset.S: Move to ...
* sysdeps/s390/s390-32/multiarch/memset-s390.S: ... here.
(memset): Use __memset_default as alias source.
* sysdeps/s390/s390-32/multiarch/memset.c: New File.
* sysdeps/s390/s390-32/memset.S (__memset_g5):
Rename to __memset_default.
* sysdeps/s390/s390-64/multiarch/Makefile (sysdep_routines):
Remove ifunc-resolve, add memset-s390x, memcpy-s390x, memcmp-s390x.
* sysdeps/s390/s390-64/multiarch/ifunc-resolve.c: Delete File.
* sysdeps/s390/s390-64/multiarch/memcmp.S: Move to ...
* sysdeps/s390/s390-64/multiarch/memcmp-s390x.S: ... here.
(memcmp, bcmp): Use __memcmp_default as alias source.
* sysdeps/s390/s390-64/multiarch/memcmp.c: New File.
* sysdeps/s390/s390-64/memcmp.S (__memcmp_z900):
Rename to __memcmp_default.
* sysdeps/s390/s390-64/multiarch/memcpy.S: Move to ...
* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: ... here.
(memcpy): Use __memcpy_default as alias source.
* sysdeps/s390/s390-64/multiarch/memcpy.c: New File.
* sysdeps/s390/s390-64/memcpy.S (__memcpy_z900):
Rename to __memcpy_default.
* sysdeps/s390/s390-64/multiarch/memset.S: Move to ...
* sysdeps/s390/s390-64/multiarch/memset-s390x.S: ... here.
(memset): Use __memset_default as alias source.
* sysdeps/s390/s390-64/multiarch/memset.c: New File.
* sysdeps/s390/s390-64/memset.S (__memset_z900):
Rename to __memset_default.
* sysdeps/s390/multiarch/ifunc-resolve.h: New File.
* sysdeps/s390/multiarch/ifunc-impl-list.c: New File.