Remove simple_strcspn/strpbrk/strsep which are significantly slower than the
generic implementations. Also remove oldstrsep and oldstrtok since they are
practically identical to the generic implementation. Adjust iteration count
to reduce benchmark time.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Almost all uses of rawmemchr find the end of a string. Since most targets use
a generic implementation, replacing it with strchr is better since that is
optimized by compilers into strlen (s) + s. Also fix the generic rawmemchr
implementation to use a cast to unsigned char in the if statement.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Improve string benchtest timing. Many tests run for 0.01s which is way too
short to give accurate results. Other tests take over 40 seconds which is
way too long. Significantly increase the iterations of the short running
tests. Reduce number of alignment variations in the long running memcpy walk
tests so they take less than 5 seconds.
As a result most tests take at least 0.1s and all finish within 5 seconds.
* benchtests/bench-memcpy-random.c (do_one_test): Use medium iterations.
* benchtests/bench-memcpy-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-memmem.c (do_one_test): Use small iterations.
* benchtests/bench-memmove-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-memset-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-strcasestr.c (do_one_test): Use small iterations.
* benchtests/bench-string.h (INNER_LOOP_ITERS): Increase iterations.
(INNER_LOOP_ITERS_MEDIUM): New define.
(INNER_LOOP_ITERS_SMALL): New define.
* benchtests/bench-strpbrk.c (do_one_test): Use medium iterations.
* benchtests/bench-strsep.c (do_one_test): Use small iterations.
* benchtests/bench-strspn.c (do_one_test): Use medium iterations.
* benchtests/bench-strstr.c (do_one_test): Use small iterations.
* benchtests/bench-strtok.c (do_one_test): Use small iterations.
Currently the benchtests are run with internal GLIBC headers, which is incorrect.
Defining _ISOMAC in the makefile ensures the internal headers are bypassed.
Fix all tests which were relying on internal defines or includes.
* benchtests/Makefile: Define _ISOMAC.
* benchtests/bench-strcoll.c: Add missing sys/stat.h include.
* benchtests/bench-string.h: Define inhibit_loop_to_libcall macro.
* benchtests/bench-strstr.c: Define empty libc_hidden_builtin_def.
* benchtests/bench-strtok.c (oldstrtok): Use rawmemchr.
* benchtests/bench-timing.h: Define attribute_hidden.
calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr. Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string. Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.
Remove the special header optimization for strtok_r of a 1-character
constant string - both strspn and strcspn contain optimizations for this
case. Benchmarking this showed similar performance in the worst case,
but up to 5.5x better performance in the "found" case for large inputs.
* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
* string/strtok.c (strtok): Change to tailcall __strtok_r.
* string/strtok_r.c (__strtok_r): Optimize for performance.
* string/string-inlines.c (__old_strtok_r_1c): New function.
* string/bits/string2.h (__strtok_r): Move to string-inlines.c.