I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
The benchmark and tests must fail in case of allocation failure in the
implementation array. Also annotate the x* allocators in support.h so
that the compiler has more information about them.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Some benchmarks with a very short runtime show significantly
different results across runs on Aarch64 - up to tens of percents.
Increasing the runtime to 100ms+ makes the deviation under 5%.
Tested on Aarch64 and x86-64.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* benchtests/bench-memccpy.c: Replace INNER_LOOP_ITERS
with INNER_LOOP_ITERS_LARGE.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-string.h: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
Improve string benchtest timing. Many tests run for 0.01s which is way too
short to give accurate results. Other tests take over 40 seconds which is
way too long. Significantly increase the iterations of the short running
tests. Reduce number of alignment variations in the long running memcpy walk
tests so they take less than 5 seconds.
As a result most tests take at least 0.1s and all finish within 5 seconds.
* benchtests/bench-memcpy-random.c (do_one_test): Use medium iterations.
* benchtests/bench-memcpy-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-memmem.c (do_one_test): Use small iterations.
* benchtests/bench-memmove-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-memset-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-strcasestr.c (do_one_test): Use small iterations.
* benchtests/bench-string.h (INNER_LOOP_ITERS): Increase iterations.
(INNER_LOOP_ITERS_MEDIUM): New define.
(INNER_LOOP_ITERS_SMALL): New define.
* benchtests/bench-strpbrk.c (do_one_test): Use medium iterations.
* benchtests/bench-strsep.c (do_one_test): Use small iterations.
* benchtests/bench-strspn.c (do_one_test): Use medium iterations.
* benchtests/bench-strstr.c (do_one_test): Use small iterations.
* benchtests/bench-strtok.c (do_one_test): Use small iterations.
Drop realloc_bufs in favour of making alloc_bufs transparently
reallocate the buffers if it had allocated before. Also consolidate
computation of buffer lengths so that they don't get repeated on every
reallocation.
* benchtests/bench-string.h (buf1_size, buf2_size): New
variables.
(init_sizes): New function.
(test_init): Use it.
(alloc_buf, exit_error): New functions.
(alloc_bufs): Use ALLOC_BUF.
(realloc_bufs): Remove.
* benchtests/bench-memcmp.c (do_test): Adjust.
* benchtests/bench-memset-large.c (do_test): Likewise.
* benchtests/bench-memset-walk.c (do_test): Likewise.
* benchtests/bench-memset.c (do_test): Likewise.
* benchtests/bench-strncmp.c (do_test): Likewise.
Currently the benchtests are run with internal GLIBC headers, which is incorrect.
Defining _ISOMAC in the makefile ensures the internal headers are bypassed.
Fix all tests which were relying on internal defines or includes.
* benchtests/Makefile: Define _ISOMAC.
* benchtests/bench-strcoll.c: Add missing sys/stat.h include.
* benchtests/bench-string.h: Define inhibit_loop_to_libcall macro.
* benchtests/bench-strstr.c: Define empty libc_hidden_builtin_def.
* benchtests/bench-strtok.c (oldstrtok): Use rawmemchr.
* benchtests/bench-timing.h: Define attribute_hidden.
Keeping the same buffers along with copying the same size of data into
the same location means that the first routine is typically the
slowest since it has to bear the cost of fetching data into to cache.
Reallocating buffers stabilizes numbers by a bit.
* benchtests/bench-string.h (realloc_bufs): New function.
(test_init): Call it.
* benchtests/bench-memset-large.c (do_test): Likewise.
* benchtests/bench-memset.c (do_test): Likewise.
The test run is unnecessary and interferes with the benchmark. The
tests are done during make check, so they're unnecessary here.
* benchtests/bench-memccpy.c (do_one_test): Remove checks.
* benchtests/bench-memchr.c (do_one_test): Likewise.
* benchtests/bench-memcpy-large.c (do_one_test): Likewise.
* benchtests/bench-memcpy.c (do_one_test): Likewise.
* benchtests/bench-memmove-large.c (do_one_test): Likewise.
* benchtests/bench-memmove.c (do_one_test): Likewise.
* benchtests/bench-memset-large.c (do_one_test): Likewise.
* benchtests/bench-memset.c (do_one_test): Likewise.
* benchtests/bench-string.h (test_init): Remove memsets.
TEST_IFUNC is only tested in two headers, bench-string.h and
test-string.h, after it gets defined by those headers, and it never
gets undefined.
Thus no defines of TEST_IFUNC are needed, and the *-ifunc.c tests that
just define TEST_IFUNC and include other tests are also redundant, as
is the code to remove $(tests-ifunc) and $(xtests-ifunc) conditionally
from tests and xtests. This patch removes the useless defines and
tests of TEST_IFUNC and the associated useless tests and makefile
code. It thereby fixes a series of warnings
"../string/test-string.h:21:0: warning: "TEST_IFUNC" redefined" where
test-string.h defines TEST_IFUNC to empty, other files define it to 1
and this produces warnings.
Tested for x86_64.
* debug/test-stpcpy_chk-ifunc.c: Remove file.
* debug/test-strcpy_chk-ifunc.c: Likewise.
* wcsmbs/test-wcschr-ifunc.c: Likewise.
* wcsmbs/test-wcscmp-ifunc.c: Likewise.
* wcsmbs/test-wcscpy-ifunc.c: Likewise.
* wcsmbs/test-wcslen-ifunc.c: Likewise.
* wcsmbs/test-wcsrchr-ifunc.c: Likewise.
* wcsmbs/test-wmemcmp-ifunc.c: Likewise.
* Rules [$(multi-arch) = no] (tests): Do not filter out
$(tests-ifunc).
[$(multi-arch) = no] (xtests): Do not filter out $(xtests-ifunc).
* debug/Makefile (tests-ifunc): Remove variable.
(tests): Do not add $(tests-ifunc).
* wcsmbs/Makefile (tests-ifunc): Remove variable.
(tests): Do not add $(tests-ifunc).
* benchtests/bench-string.h (TEST_IFUNC): Remove macro.
[TEST_IFUNC]: Remove conditionals.
* string/test-string.h (TEST_IFUNC): Remove macro.
[TEST_IFUNC]: Remove conditionals.
Switch the string benchmarks to using bench-timing.h instead
of hp-timing.h directly. This allows the string benchmarks to
be run usefully on architectures such as ARM that do not have
support for hp-timing.h.
In order to do this the tests have been changed from timing each
individual call and picking the lowest execution time recorded to
timing a number of calls and taking the mean execution time.
ChangeLog:
2013-09-04 Will Newton <will.newton@linaro.org>
* benchtests/bench-timing.h (TIMING_PRINT_MEAN): New macro.
* benchtests/bench-string.h: Include bench-timing.h instead
of including hp-timing.h directly. (INNER_LOOP_ITERS): New
define. (HP_TIMING_BEST): Delete macro. (test_init): Remove
call to HP_TIMING_DIFF_INIT.
* benchtests/bench-memccpy.c: Use bench-timing.h macros
instead of hp-timing.h macros.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memcpy.c: Likewise.
* benchtests/bench-memmem.c: Likewise.
* benchtests/bench-memmove.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcasecmp.c: Likewise.
* benchtests/bench-strcasestr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcpy_chk.c: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncasecmp.c: Likewise.
* benchtests/bench-strncat.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
* benchtests/bench-strstr.c: Likewise.
This is the initial support for string function performance tests,
along with copying tests for memcpy and memcpy-ifunc as proof of
concept. The string function benchmarks perform operations at
different alignments and for different sizes and compare performance
between plain operations and the optimized string operations. Due to
this their output is incompatible with the function benchmarks where
we're interested in fastest time, throughput, etc.
In future, the correctness checks in the benchmark tests can be
removed. Same goes for the performance measurements in the
string/test-*.