I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
DELOUSE was added to asm code to make them compatible with non-LP64
ABIs, but it is an unfortunate name and the code was not compatible
with ABIs where pointer and size_t are different. Glibc currently
only supports the LP64 ABI so these macros are not really needed or
tested, but for now the name is changed to be more meaningful instead
of removing them completely.
Some DELOUSE macros were dropped: clone, strlen and strnlen used it
unnecessarily.
The out of tree ILP32 patches are currently not maintained and will
likely need a rework to rebase them on top of the time64 changes.
This symbol is not in the implementation reserved namespace for static
linking and it was never used: it seems it was mistakenly added in the
orignal strlen_asimd commit 436e4d5b96
Optimize strlen using a mix of scalar and SIMD code. On modern micro
architectures large strings are 2.6 times faster than existing
strlen_asimd and 35% faster than the new MTE version of strlen.
On a random strlen benchmark using small sizes the speedup is 7% vs
strlen_asimd and 40% vs the MTE strlen. This fixes the main strlen
regressions on Cortex-A53 and other cores with a simple Neon unit.
Rename __strlen_generic to __strlen_mte, and select strlen_asimd when
MTE is not enabled (this is waiting on support for a HWCAP_MTE bit).
This fixes big-endian bug 25824. Passes GLIBC regression tests.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimize the strlen implementation by using vector operations and
loop unrolling in main loop.Compared to __strlen_generic,it reduces
latency of cases in bench-strlen by 7%~18% when the length of src
is greater than 128 bytes, with gains throughout the benchmark.
Checked on aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
This variant of strlen uses vector loads and operations to reduce the
size of the code and also eliminate the non-ascii fallback. This
works very well for falkor because of its two vector units and
efficient vector ops. In the best case it reduces latency of cases in
bench-strlen by 48%, with gains throughout the benchmark.
strlen-walk also sees uniform gains in the 5%-15% range.
Overall the routine appears to work better than the stock one for falkor
regardless of the benchmark, length of string or cache state.
The same cannot be said of a53 and a72 though. a53 performance was
greatly reduced and for a72 it was a bit of a mixed bag, slightly on the
negative side but I reckon it might be fast in some situations.
* sysdeps/aarch64/strlen.S (__strlen): Rename to STRLEN.
[!STRLEN](STRLEN): Set to __strlen.
* sysdeps/aarch64/multiarch/strlen.c: New file.
* sysdeps/aarch64/multiarch/strlen_generic.S: Likewise.
* sysdeps/aarch64/multiarch/strlen_asimd.S: Likewise.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add strlen.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add
strlen_generic and strlen_asimd.
Reviewed-By: szabolcs.nagy@arm.com
CC: pinskia@gmail.com