The ifunc handling for stpncpy is adjusted to omit ifunc variants that
will never be used because the minimum architecture level already
supports newer CPUs by default.
Glibc-internal calls will then also use the "newer" ifunc variant.
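The idea can be sketched in plain C as follows. This is only an
illustration and not the code from this patch: SKETCH_DEFAULT_HAS_VX,
the sketch_* functions and the cpu_has_vx parameter are hypothetical
names, and an ordinary function pointer stands in for the real ifunc
resolver.

```c
#include <stddef.h>

/* Hypothetical build-time switch: nonzero if the compiler already
   targets a CPU with vector support by default.  */
#ifndef SKETCH_DEFAULT_HAS_VX
# define SKETCH_DEFAULT_HAS_VX 0
#endif

typedef char *(*sketch_stpncpy_fn) (char *, const char *, size_t);

/* Portable reference implementation with stpncpy semantics.  */
static char *
sketch_stpncpy_c (char *dst, const char *src, size_t n)
{
  size_t i = 0;
  while (i < n && src[i] != '\0')
    {
      dst[i] = src[i];
      i++;
    }
  char *end = dst + i;
  /* Pad the remainder with NUL bytes.  */
  while (i < n)
    dst[i++] = '\0';
  return end;
}

/* Stand-in for the vector variant; it simply reuses the C code here.  */
static char *
sketch_stpncpy_vx (char *dst, const char *src, size_t n)
{
  return sketch_stpncpy_c (dst, src, n);
}

/* If the minimum architecture level already has vector support, the
   older variant can never be chosen and no runtime check is needed.  */
static sketch_stpncpy_fn
sketch_select_stpncpy (int cpu_has_vx)
{
#if SKETCH_DEFAULT_HAS_VX
  (void) cpu_has_vx;
  return sketch_stpncpy_vx;
#else
  return cpu_has_vx ? sketch_stpncpy_vx : sketch_stpncpy_c;
#endif
}
```

With SKETCH_DEFAULT_HAS_VX set to 1, every caller is bound to the vector
stand-in, which mirrors how internal calls end up using the newer
variant.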
ChangeLog:
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Remove stpncpy variants.
* sysdeps/s390/Makefile (sysdep_routines): Add stpncpy variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Refactor ifunc handling for stpncpy.
* sysdeps/s390/multiarch/stpncpy-c.c: Move to ...
* sysdeps/s390/stpncpy-c.c: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/stpncpy-vx.S: Move to ...
* sysdeps/s390/stpncpy-vx.S: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/stpncpy.c: Move to ...
* sysdeps/s390/stpncpy.c: ... here and adjust ifunc handling.
* sysdeps/s390/ifunc-stpncpy.h: New file.
The ifunc handling for strncpy is adjusted to omit ifunc variants that
will never be used because the minimum architecture level already
supports newer CPUs by default.
Glibc-internal calls will then also use the "newer" ifunc variant.
Note: The fallback s390-32/s390-64 ifunc variants are now moved to
the strncpy-z900.S files. The s390-32/s390-64 files multiarch/strncpy.c
and strncpy.S are deleted.
ChangeLog:
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Remove strncpy variants.
* sysdeps/s390/Makefile (sysdep_routines): Add strncpy variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Refactor ifunc handling for strncpy.
* sysdeps/s390/multiarch/strncpy-vx.S: Move to ...
* sysdeps/s390/strncpy-vx.S: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/strncpy.c: Move to ...
* sysdeps/s390/strncpy.c: ... here and adjust ifunc handling.
* sysdeps/s390/ifunc-strncpy.h: New file.
* sysdeps/s390/s390-64/strncpy.S: Move to ...
* sysdeps/s390/s390-64/strncpy-z900.S: ... here
and adjust ifunc handling.
* sysdeps/s390/s390-32/strncpy.S: Move to ...
* sysdeps/s390/s390-32/strncpy-z900.S: ... here
and adjust ifunc handling.
* sysdeps/s390/s390-32/multiarch/strncpy.c: Delete file.
* sysdeps/s390/s390-64/multiarch/strncpy.c: Likewise.
The ifunc handling for stpcpy is adjusted to omit ifunc variants that
will never be used because the minimum architecture level already
supports newer CPUs by default.
Glibc-internal calls will then also use the "newer" ifunc variant.
ChangeLog:
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Remove stpcpy variants.
* sysdeps/s390/Makefile (sysdep_routines): Add stpcpy variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Refactor ifunc handling for stpcpy.
* sysdeps/s390/multiarch/stpcpy-c.c: Move to ...
* sysdeps/s390/stpcpy-c.c: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/stpcpy-vx.S: Move to ...
* sysdeps/s390/stpcpy-vx.S: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/stpcpy.c: Move to ...
* sysdeps/s390/stpcpy.c: ... here and adjust ifunc handling.
* sysdeps/s390/ifunc-stpcpy.h: New file.
The ifunc handling for strcpy is adjusted to omit ifunc variants that
will never be used because the minimum architecture level already
supports newer CPUs by default.
Glibc-internal calls will then also use the "newer" ifunc variant.
Note: The fallback s390-32/s390-64 ifunc variants with mvst instruction
are now moved to the unified strcpy-z900.S file which can be used for
31/64bit. The s390-32/s390-64 files multiarch/strcpy.c and strcpy.S
are deleted.
ChangeLog:
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Remove strcpy variants.
* sysdeps/s390/Makefile (sysdep_routines): Add strcpy variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Refactor ifunc handling for strcpy.
* sysdeps/s390/multiarch/strcpy-vx.S: Move to ...
* sysdeps/s390/strcpy-vx.S: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/strcpy.c: Move to ...
* sysdeps/s390/strcpy.c: ... here and adjust ifunc handling.
* sysdeps/s390/ifunc-strcpy.h: New file.
* sysdeps/s390/s390-64/strcpy.S: Move to ...
* sysdeps/s390/strcpy-z900.S: ... here and adjust to be usable
for 31/64bit and ifunc handling.
* sysdeps/s390/s390-32/multiarch/strcpy.c: Delete file.
* sysdeps/s390/s390-64/multiarch/strcpy.c: Likewise.
* sysdeps/s390/s390-32/strcpy.S: Likewise.
The ifunc handling for strnlen is adjusted to omit ifunc variants that
will never be used because the minimum architecture level already
supports newer CPUs by default.
Glibc-internal calls will then also use the "newer" ifunc variant.
ChangeLog:
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Remove strnlen variants.
* sysdeps/s390/Makefile (sysdep_routines): Add strnlen variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Refactor ifunc handling for strnlen.
* sysdeps/s390/multiarch/strnlen-c.c: Move to ...
* sysdeps/s390/strnlen-c.c: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/strnlen-vx.S: Move to ...
* sysdeps/s390/strnlen-vx.S: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/strnlen.c: Move to ...
* sysdeps/s390/strnlen.c: ... here and adjust ifunc handling.
* sysdeps/s390/ifunc-strnlen.h: New file.
The ifunc handling for strlen is adjusted to omit ifunc variants that
will never be used because the minimum architecture level already
supports newer CPUs by default.
Glibc-internal calls will then also use the "newer" ifunc variant.
ChangeLog:
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Remove strlen variants.
* sysdeps/s390/Makefile (sysdep_routines): Add strlen variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Refactor ifunc handling for strlen.
* sysdeps/s390/multiarch/strlen-c.c: Move to ...
* sysdeps/s390/strlen-c.c: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/strlen-vx.S: Move to ...
* sysdeps/s390/strlen-vx.S: ... here and adjust ifunc handling.
* sysdeps/s390/multiarch/strlen.c: Move to ...
* sysdeps/s390/strlen.c: ... here and adjust ifunc handling.
* sysdeps/s390/ifunc-strlen.h: New file.
This patch moves all ifunc variants for memcpy/mempcpy
to sysdeps/s390/memcpy-z900.S. The configure-check/preprocessor logic
in sysdeps/s390/ifunc-memcpy.h decides if ifunc is needed at all
and which ifunc variants should be available.
E.g. if the compiler/assembler already supports z196 by default,
the older ifunc variants are not included.
If we only need the newest ifunc variant,
then we can skip ifunc altogether.
Therefore the ifunc-resolvers and __libc_ifunc_impl_list are adjusted
in order to handle only the available ifunc variants.
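As a rough illustration of the preprocessor logic described above, the
following sketch derives the set of available variants from a
build-time architecture level. All SKETCH_* names and the numeric level
encoding are hypothetical; they are not the macros defined in
sysdeps/s390/ifunc-memcpy.h.

```c
#include <stddef.h>

/* Hypothetical encoding of the architecture level that the
   compiler/assembler targets by default: 0 = z900, 1 = z10, 2 = z196.  */
#ifndef SKETCH_MIN_ARCH_LEVEL
# define SKETCH_MIN_ARCH_LEVEL 0
#endif

/* An older variant is only built if the minimum level still allows
   a CPU that needs it.  The newest variant is always built.  */
#define SKETCH_HAVE_Z900 (SKETCH_MIN_ARCH_LEVEL < 1)
#define SKETCH_HAVE_Z10  (SKETCH_MIN_ARCH_LEVEL < 2)
#define SKETCH_HAVE_Z196 1

/* ifunc is only needed if more than one variant is built.  */
#define SKETCH_HAVE_IFUNC (SKETCH_HAVE_Z900 || SKETCH_HAVE_Z10)

/* The list of variants shrinks together with the macros above.  */
static const char *const sketch_variants[] = {
#if SKETCH_HAVE_Z196
  "__memcpy_z196",
#endif
#if SKETCH_HAVE_Z10
  "__memcpy_z10",
#endif
#if SKETCH_HAVE_Z900
  "__memcpy_default",
#endif
};

static size_t
sketch_variant_count (void)
{
  return sizeof (sketch_variants) / sizeof (sketch_variants[0]);
}
```

Compiling with -DSKETCH_MIN_ARCH_LEVEL=2 leaves a single variant and
makes SKETCH_HAVE_IFUNC evaluate to 0, which corresponds to the case
where ifunc is skipped.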
ChangeLog:
* sysdeps/s390/ifunc-memcpy.h: New File.
* sysdeps/s390/memcpy.S: Move to ...
* sysdeps/s390/memcpy-z900.S: ... here.
Move implementations from memcpy-s390x.S to here.
* sysdeps/s390/multiarch/memcpy-s390x.S: Delete File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines):
Remove memcpy/mempcpy variants.
* sysdeps/s390/Makefile (sysdep_routines):
Add memcpy/mempcpy variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Adjust ifunc variants for
memcpy and mempcpy.
* sysdeps/s390/multiarch/memcpy.c: Move ifunc resolver
to ...
* sysdeps/s390/memcpy.c: ... here.
Adjust ifunc variants for memcpy.
* sysdeps/s390/multiarch/mempcpy.c: Move to ...
* sysdeps/s390/mempcpy.c: ... here.
Adjust ifunc variants for mempcpy.
* sysdeps/s390/mempcpy.S: Delete file.
The implementation of memcpy/mempcpy for s390-32 (31bit)
and s390-64 (64bit) is nearly the same.
This patch unifies it for maintainability reasons.
__mem[p]cpy_z10 and __mem[p]cpy_z196 differ between 31 and 64bit:
-31bit needs .machinemode "zarch_nohighgprs" and llgfr %r4,%r4
-lr vs lgr; lgr can be also used on 31bit as this ifunc variant
is only called if we are on a zarch machine.
__mem[p]cpy_default differs between 31 and 64bit:
-Some 31bit vs 64bit instructions (e.g. ltr vs ltgr.
Solved with 31/64 specific instruction macros).
-The address of the mvc instruction is set up in different ways
(larl vs bras). Solved with #if defined __s390x__.
__memcpy_mvcle differs between 31 and 64bit:
-lr vs lgr; ahi vs aghi;
Solved with 31/64bit specific instruction macros.
Otherwise the 31bit and 64bit implementations share the same code structure.
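The "31/64 specific instruction macros" mentioned above can be pictured
with the following sketch. It is written as C with string literals so
that it compiles anywhere; in an assembler source the macros would
expand to the mnemonics themselves. The SKETCH_* macro names are
illustrative and are not necessarily the names used in the unified file.

```c
/* Pick the 64bit or 31bit form of an instruction at build time.  */
#if defined __s390x__
# define SKETCH_LTGR "ltgr"
# define SKETCH_LGR  "lgr"
# define SKETCH_AGHI "aghi"
#else
# define SKETCH_LTGR "ltr"
# define SKETCH_LGR  "lr"
# define SKETCH_AGHI "ahi"
#endif

/* The shared code only refers to the macro, never to a
   width-specific mnemonic.  */
static const char *
sketch_load_and_test (void)
{
  return SKETCH_LTGR;
}

static const char *
sketch_add_halfword_immediate (void)
{
  return SKETCH_AGHI;
}
```

Only the few places that really differ in structure, such as the larl
vs bras setup, then need an explicit #if defined __s390x__ block.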
ChangeLog:
* sysdeps/s390/s390-64/memcpy.S: Move to ...
* sysdeps/s390/memcpy.S: ... here.
Adjust to be usable for 31/64bit.
* sysdeps/s390/s390-32/memcpy.S: Delete File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add memcpy.
* sysdeps/s390/s390-32/multiarch/Makefile: Delete file.
* sysdeps/s390/s390-64/multiarch/Makefile: Likewise.
* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Move to ...
* sysdeps/s390/multiarch/memcpy-s390x.S: ... here.
Adjust to be usable for 31/64bit.
* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Delete File.
* sysdeps/s390/s390-64/multiarch/memcpy.c: Move to ...
* sysdeps/s390/multiarch/memcpy.c: ... here.
* sysdeps/s390/s390-32/multiarch/memcpy.c: Delete File.
This patch moves all ifunc variants for memcmp
to sysdeps/s390/memcmp-z900.S. The configure-check/preprocessor logic
in sysdeps/s390/ifunc-memcmp.h decides if ifunc is needed at all
and which ifunc variants should be available.
E.g. if the compiler/assembler already supports z196 by default,
the older ifunc variants are not included.
If we only need the newest ifunc variant,
then we can skip ifunc altogether.
Therefore the ifunc-resolvers and __libc_ifunc_impl_list are adjusted
in order to handle only the available ifunc variants.
ChangeLog:
* sysdeps/s390/ifunc-memcmp.h: New File.
* sysdeps/s390/memcmp.S: Move to ...
* sysdeps/s390/memcmp-z900.S: ... here.
Move implementations from memcmp-s390x.S to here.
* sysdeps/s390/multiarch/memcmp-s390x.S: Delete File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines):
Remove memcmp variants.
* sysdeps/s390/Makefile (sysdep_routines):
Add memcmp variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Adjust ifunc variants for
memcmp.
* sysdeps/s390/multiarch/memcmp.c: Move ifunc resolver
to ...
* sysdeps/s390/memcmp.c: ... here.
Adjust ifunc variants for memcmp.
The implementation of memcmp for s390-32 (31bit) and
s390-64 (64bit) is nearly the same.
This patch unifies it for maintainability reasons.
__memcmp_z10 and __memcmp_z196 differ between 31 and 64bit:
-31bit needs .machinemode "zarch_nohighgprs" and llgfr %r4,%r4
-lr vs lgr and some other instructions:
But lgr and co can be also used on 31bit as this ifunc variant
is only called if we are on a zarch machine.
__memcmp_default differs between 31 and 64bit:
-Some 31bit vs 64bit instructions (e.g. ltr vs ltgr.
Solved with 31/64 specific instruction macros).
-The address of the mvc instruction is set up in different ways
(larl vs bras). Solved with #if defined __s390x__.
Otherwise the 31bit and 64bit implementations share the same code structure.
ChangeLog:
* sysdeps/s390/s390-64/memcmp.S: Move to ...
* sysdeps/s390/memcmp.S: ... here.
Adjust to be usable for 31/64bit.
* sysdeps/s390/s390-32/memcmp.S: Delete File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add memcmp.
* sysdeps/s390/s390-32/multiarch/Makefile (sysdep_routines):
Remove memcmp.
* sysdeps/s390/s390-64/multiarch/Makefile: Likewise.
* sysdeps/s390/s390-64/multiarch/memcmp-s390x.S: Move to ...
* sysdeps/s390/multiarch/memcmp-s390x.S: ... here.
Adjust to be usable for 31/64bit.
* sysdeps/s390/s390-32/multiarch/memcmp-s390.S: Delete File.
* sysdeps/s390/s390-64/multiarch/memcmp.c: Move to ...
* sysdeps/s390/multiarch/memcmp.c: ... here.
* sysdeps/s390/s390-32/multiarch/memcmp.c: Delete File.
This patch moves all ifunc variants for memset
to sysdeps/s390/memset-z900.S. The configure-check/preprocessor logic
in sysdeps/s390/ifunc-memset.h decides if ifunc is needed at all
and which ifunc variants should be available.
E.g. if the compiler/assembler already supports z196 by default,
the older ifunc variants are not included.
If we only need the newest ifunc variant,
then we can skip ifunc altogether.
Therefore the ifunc-resolvers and __libc_ifunc_impl_list are adjusted
in order to handle only the available ifunc variants.
ChangeLog:
* sysdeps/s390/ifunc-memset.h: New File.
* sysdeps/s390/memset.S: Move to ...
* sysdeps/s390/memset-z900.S: ... here.
Move implementations from memset-s390x.S to here.
* sysdeps/s390/multiarch/memset-s390x.S: Delete File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines):
Remove memset variants.
* sysdeps/s390/Makefile (sysdep_routines):
Add memset variants.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Adjust ifunc variants for
memset.
* sysdeps/s390/multiarch/memset.c: Move ifunc resolver
to ...
* sysdeps/s390/memset.c: ... here.
Adjust ifunc variants for memset.
The implementation of memset for s390-32 (31bit) and
s390-64 (64bit) is nearly the same.
This patch unifies it for maintainability reasons.
__memset_z10 and __memset_z196 differ between 31 and 64bit:
-31bit needs .machinemode "zarch_nohighgprs" and llgfr %r4,%r4
-lr vs lgr and some other instructions:
But lgr and co can be also used on 31bit as this ifunc variant
is only called if we are on a zarch machine.
__memset_default differs between 31 and 64bit:
-Some 31bit vs 64bit instructions (e.g. ltr vs ltgr.
Solved with 31/64 specific instruction macros).
-The address of the mvc instruction is set up in different ways
(larl vs bras). Solved with #if defined __s390x__.
Otherwise the 31bit and 64bit implementations share the same code structure.
ChangeLog:
* sysdeps/s390/s390-64/memset.S: Move to ...
* sysdeps/s390/memset.S: ... here.
Adjust to be usable for 31/64bit.
* sysdeps/s390/s390-32/memset.S: Delete File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add memset.
* sysdeps/s390/s390-32/multiarch/Makefile (sysdep_routines):
Remove memset.
* sysdeps/s390/s390-64/multiarch/Makefile: Likewise.
* sysdeps/s390/s390-64/multiarch/memset-s390x.S: Move to ...
* sysdeps/s390/multiarch/memset-s390x.S: ... here.
Adjust to be usable for 31/64bit.
* sysdeps/s390/s390-32/multiarch/memset-s390.S: Delete File.
* sysdeps/s390/s390-64/multiarch/memset.c: Move to ...
* sysdeps/s390/multiarch/memset.c: ... here.
* sysdeps/s390/s390-32/multiarch/memset.c: Delete File.
This patch introduces an s390-specific gconv_simple.c file which provides
optimized versions for z13 with vector instructions, which will be chosen at
runtime via ifunc.
The optimized conversions can convert between internal and ascii, ucs4, ucs4le,
ucs2, ucs2le.
If the build-environment lacks vector support, then iconv/gconv_simple.c
is used without any change. Otherwise iconv/gconv_simple.c is used to create
conversion loop routines without vector instructions as fallback, if vector
instructions aren't available at runtime.
ChangeLog:
* sysdeps/s390/multiarch/gconv_simple.c: New File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add gconv_simple.
This patch introduces an s390-specific 8bit-generic.c file which provides an
optimized version for z13 with translate-/vector-instructions, which will be
chosen at runtime via ifunc.
If the build-environment lacks vector support, then iconvdata/8bit-generic.c
is used without any change. Otherwise iconvdata/8bit-generic.c is used to create
conversion loop routines without vector instructions as fallback, if vector
instructions aren't available at runtime.
The vector routines can only be used with charsets where the maximum UCS4 value
fits in 1 byte size. Then the hardware translate-instruction is used
to translate between up to 256 generic characters and "1 byte UCS4"
characters at once. The vector instructions are used to convert between
the "1 byte UCS4" and UCS4.
The gen-8bit.sh script in sysdeps/s390/multiarch generates the conversion
table to_ucs1. Therefore sysdeps/s390/multiarch/Makefile adds an
override define generate-8bit-table, which is originally defined in
iconvdata/Makefile. This version calls both the gen-8bit.sh in the iconvdata
folder and the s390 one.
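The two-step conversion described above (table translation to "1 byte
UCS4", then widening to UCS4) can be sketched in scalar C as follows.
This is a simplified illustration, not the vector code of this patch:
the function name sketch_8bit_to_ucs4 is hypothetical, and the handling
of invalid input characters is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert N bytes of an 8bit charset to UCS4.
   TO_UCS1 maps each charset byte to its "1 byte UCS4" value.  */
static void
sketch_8bit_to_ucs4 (const uint8_t to_ucs1[256], const uint8_t *in,
                     size_t n, uint32_t *out)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Step 1: table lookup; the hardware translate instruction does
         this for a whole block of bytes at once.  */
      uint8_t ucs1 = to_ucs1[in[i]];
      /* Step 2: zero-extend to 4 bytes; the vector instructions do
         this for several characters at once.  */
      out[i] = (uint32_t) ucs1;
    }
}
```

The sketch only works for charsets whose maximum UCS4 value fits in one
byte, which is the same restriction that applies to the vector routines.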
ChangeLog:
* sysdeps/s390/multiarch/8bit-generic.c: New File.
* sysdeps/s390/multiarch/gen-8bit.sh: New File.
* sysdeps/s390/multiarch/Makefile (generate-8bit-table):
New override define.
* sysdeps/s390/multiarch/iconv/skeleton.c: Likewise.
There exist optimized memcpy functions on s390, but no optimized mempcpy.
This patch adds mempcpy entry points in memcpy.S files, which
use the memcpy implementation. Now mempcpy itself is also an IFUNC function,
like memcpy, and the variants are listed in ifunc-impl-list.c.
The s390 string.h does not define _HAVE_STRING_ARCH_mempcpy.
Instead, the mempcpy macro in string/string.h inlines memcpy() + n.
If n is constant and small enough, GCC emits instructions like mvi or mvc
and avoids the function call to memcpy.
If n is not constant, then memcpy is called and n is added afterwards.
If _HAVE_STRING_ARCH_mempcpy were defined, mempcpy would be called in
every case.
According to PR70140 "Inefficient expansion of __builtin_mempcpy"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70140) GCC should handle a
call to mempcpy in the same way as memcpy. Then either the mempcpy macro
in string/string.h has to be removed or _HAVE_STRING_ARCH_mempcpy has to
be defined for S390.
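The "memcpy() + n" expansion mentioned above amounts to the following.
This is a minimal sketch written as an inline function named
sketch_mempcpy (a hypothetical name); it is not the macro text from
string/string.h.

```c
#include <stddef.h>
#include <string.h>

/* Copy N bytes and return a pointer just past the last byte written.  */
static inline void *
sketch_mempcpy (void *dst, const void *src, size_t n)
{
  return (char *) memcpy (dst, src, n) + n;
}
```

Because the body is a plain memcpy call, the compiler can still expand
it inline for small constant n and only needs to add n afterwards.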
ChangeLog:
[BZ #19765]
* sysdeps/s390/mempcpy.S: New File.
* sysdeps/s390/multiarch/mempcpy.c: Likewise.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add mempcpy.
* sysdeps/s390/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
Add mempcpy variants.
* sysdeps/s390/s390-32/memcpy.S: Add mempcpy entry point.
(memcpy): Adjust to be usable from mempcpy entry point.
(__memcpy_mvcle): Likewise.
* sysdeps/s390/s390-64/memcpy.S: Likewise.
* sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Add entry points
____mempcpy_z196, ____mempcpy_z10 and add __GI_ symbols for mempcpy.
(__memcpy_z196): Adjust to be usable from mempcpy entry point.
(__memcpy_z10): Likewise.
* sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Likewise.
This patch provides an optimized version of memrchr with the z13 vector
instructions.
ChangeLog:
* sysdeps/s390/multiarch/memrchr-c.c: New File.
* sysdeps/s390/multiarch/memrchr-vx.S: Likewise.
* sysdeps/s390/multiarch/memrchr.c: Likewise.
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Add memrchr functions.
* sysdeps/s390/multiarch/ifunc-impl-list-common.c
(__libc_ifunc_impl_list_common): Add ifunc test for memrchr.
This patch provides optimized versions of memccpy with the z13 vector
instructions.
ChangeLog:
* sysdeps/s390/multiarch/memccpy-c.c: New File.
* sysdeps/s390/multiarch/memccpy-vx.S: Likewise.
* sysdeps/s390/multiarch/memccpy.c: Likewise.
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Add memccpy functions.
* sysdeps/s390/multiarch/ifunc-impl-list-common.c
(__libc_ifunc_impl_list_common): Add ifunc test for memccpy.
* string/memccpy.c: Use MEMCCPY if defined.
This patch provides optimized versions of strcmp and wcscmp with the z13
vector instructions.
The architecture specific string.h had a typo, which led to omitting the
inline version in this file if __USE_STRING_INLINES is defined.
This inline version was tested by tweaking test-strcmp.c.
ChangeLog:
* sysdeps/s390/multiarch/strcmp-vx.S: New File.
* sysdeps/s390/multiarch/strcmp.c: Likewise.
* sysdeps/s390/multiarch/wcscmp-c.c: Likewise.
* sysdeps/s390/multiarch/wcscmp-vx.S: Likewise.
* sysdeps/s390/multiarch/wcscmp.c: Likewise.
* sysdeps/s390/s390-32/multiarch/strcmp.c: Likewise.
* sysdeps/s390/s390-64/multiarch/strcmp.c: Likewise.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcmp and
wcscmp functions.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add ifunc test for strcmp, wcscmp.
* string/strcmp.c (STRCMP): Define and use macro.
* benchtests/bench-wcscmp.c: New File.
* benchtests/Makefile (wcsmbs-bench): Add wcscmp.
* sysdeps/s390/bits/string.h: Fix typo: _HAVE_STRING_ARCH_strcmp
instead of _HAVE_STRING_ARCH_memchr.
This patch provides optimized versions of strlen and wcslen with the z13 vector
instructions.
The helper macro IFUNC_VX_IMPL is introduced and is used to register all
__<func>_c() and __<func>_vx() functions within __libc_ifunc_impl_list()
to the ifunc test framework.
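A simplified picture of such a helper macro is sketched below. It is not
the IFUNC_VX_IMPL definition from this patch: struct sketch_impl,
SKETCH_VX_IMPL and the have_vx parameter are hypothetical and only show
how one macro call can register both variants of a function.

```c
#include <stddef.h>

/* Hypothetical list entry: variant name and whether this CPU can run it.  */
struct sketch_impl
{
  const char *name;
  int usable;
};

/* Register __<func>_vx and __<func>_c in ARRAY, advancing index I.  */
#define SKETCH_VX_IMPL(array, i, func, have_vx) \
  do                                            \
    {                                           \
      (array)[(i)].name = "__" #func "_vx";     \
      (array)[(i)].usable = (have_vx);          \
      (i)++;                                    \
      (array)[(i)].name = "__" #func "_c";      \
      (array)[(i)].usable = 1;                  \
      (i)++;                                    \
    }                                           \
  while (0)

/* Fill ARRAY (at least 4 entries) and return the number of entries.  */
static size_t
sketch_impl_list (struct sketch_impl *array, int have_vx)
{
  size_t i = 0;
  SKETCH_VX_IMPL (array, i, strlen, have_vx);
  SKETCH_VX_IMPL (array, i, wcslen, have_vx);
  return i;
}
```

A test framework can then iterate over the returned entries and run
every variant that is marked usable on the current machine.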
ChangeLog:
* sysdeps/s390/multiarch/Makefile: New File.
* sysdeps/s390/multiarch/strlen-c.c: Likewise.
* sysdeps/s390/multiarch/strlen-vx.S: Likewise.
* sysdeps/s390/multiarch/strlen.c: Likewise.
* sysdeps/s390/multiarch/wcslen-c.c: Likewise.
* sysdeps/s390/multiarch/wcslen-vx.S: Likewise.
* sysdeps/s390/multiarch/wcslen.c: Likewise.
* string/strlen.c (STRLEN): Define and use macro.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(IFUNC_VX_IMPL): New macro function.
(__libc_ifunc_impl_list): Add ifunc test for strlen, wcslen.
* benchtests/Makefile (wcsmbs-bench): New variable.
(string-bench-all): Add wcsmbs-bench.
* benchtests/bench-wcslen.c: New File.