345 Commits

Author SHA1 Message Date
H.J. Lu
89377d41d7 benchtests: Add small sizes (<= 64) to bench-bzero-walk.c
Small sizes (<= 64) represent a large portion of memset usages with zero
value.  Add sizes (<= 64) to bench-bzero-walk.c to cover small sizes.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2022-02-24 12:28:34 -08:00
H.J. Lu
cf97591313 benchtests: Add benches for memset with 0 value
memset with zero as the value to set is by far the majority value (99%+
for Python3 and GCC).  Add bench-memset-zero-large.c,
bench-memset-zero-walk.c and bench-memset-zero.c to measure memset
implementations for zeroing.

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2022-02-23 12:07:06 -08:00
H.J. Lu
dc98eeeb95 benchtests: Add benches for bzero
Add bench-bzero-large.c, bench-bzero-walk.c and bench-bzero.c.
2022-02-08 14:41:58 -08:00
H.J. Lu
03c9c4fce4 benchtests: Sort benches in Makefile
Put one bench per line and sort them.
2022-02-07 07:09:38 -08:00
Noah Goldstein
69e6992d79 Benchtests: Add length zero benchmark for memset in bench-memset.c
Zero is a relevant size for some workloads (roughly 5% of uses for
GCC) so we should be testing its performance as well.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-02-06 22:01:39 -06:00
Noah Goldstein
90cbb80636 Benchtests: move 'alloc_bufs' from loop in bench-memset.c
One buf allocation is sufficient. Calling `alloc_bufs' in the loop
just adds unnecessary syscall overhead.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-02-05 16:48:00 -06:00
Noah Goldstein
80e6c6554b benchtests: Add more coverage for strcmp and strncmp benchmarks
Add more small and medium sized tests for strcmp and strncmp.

Also, for strcmp, add an option for more direct control of
alignment. Previously alignment was being pushed to the end of the
page. While this is the most difficult case to implement, it is far
from the common case and so shouldn't be the only benchmark.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2022-02-03 16:41:43 -06:00
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Noah Goldstein
ac759b1fbf benchtests: Add partial overlap case in bench-memmove-walk.c
This commit adds a new partial overlap benchmark. This is generally
the most interesting performance case for memmove and was missing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-06 16:17:59 -05:00
Noah Goldstein
5e6cce9b34 benchtests: Add additional cases to bench-memcpy.c and bench-memmove.c
This commit adds more cases to the common memcpy/memmove
benchmarks. The most significant cases are the half page offsets. The
current versions leave dst and src near page aligned which leads to
false 4k aliasing on x86_64. This can add noise due to false
dependencies from one run to the next. As well, this seems like more
of an edge case than a common case so it shouldn't be the only thing
benchmarked.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-06 16:17:51 -05:00
Sunil K Pandey
2856829ee7 Revert "benchtests: Add acosf function to bench-math"
This reverts commit 79d0fc6539.
2021-11-05 16:13:12 -07:00
Adhemerval Zanella
b8a6ee43bb benchtests: Add hypotf
Based on random input arguments.  About 85% of tuples have exponents
of the two arguments close together (+-1 range).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-01 16:23:39 -03:00
Adhemerval Zanella
dba44dbe54 benchtests: Make hypot input random
Instead of inputs based on the algorithm implementation details.
About 85% of tuples have exponents of the two arguments close
together (+-1 range).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-01 16:23:22 -03:00
Sunil K Pandey
79d0fc6539 benchtests: Add acosf function to bench-math
Add acosf function to bench-math and copy acosf-inputs to benchtests.
The motivation for this patch is to prepare for upcoming new libmvec
functions.  Float and double versions of libmvec functions stay
together.

The acosf-inputs file was generated from the acos-inputs file using the
following scaling formula:

f = d * (FLT_MAX/DBL_MAX)

where d is the input (double) and f is the output (float).  If a scaled
float value is a duplicate in the new input file, nextafterf() is used to
find the next float value, ensuring no duplicates.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-10-29 08:52:30 -07:00
Wilco Dijkstra
f392915d1e benchtests: Improve bench-memcpy-random
Improve the random memcpy benchmark. Double the number of tests and increase
the size of the memory region to test between 32KB and 1024KB. This improves
accuracy on modern cores. Clean up formatting of the frequency array.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-10-29 15:45:53 +01:00
Noah Goldstein
cf3acd774f Benchtests: Add benchtests for __memcmpeq
No bug. This commit adds __memcmpeq benchmarks. The benchmarks just
use the existing ones in memcmp. This will be useful for testing
implementations of __memcmpeq that do not just alias memcmp.
2021-10-27 13:03:46 -05:00
H.J. Lu
d8e7d06381 bench-math: Sort and put each bench per line
Sort and put each math bench per line to prepare for new math benches.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-10-23 05:20:25 -07:00
Noah Goldstein
5d26d12f4a benchtests: Add medium cases and increase iters in bench-memset.c
No bug.

This commit adds new medium size cases for lengths in [512, 1024). It
also increases the iters to INNER_LOOP_ITERS_LARGE for more reliable
results.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-10-08 15:13:06 -05:00
H.J. Lu
de0a7c5a0b benchtests: Building benchmarks as static executables
Building benchmarks as static executables:
=========================================

To build benchmarks as static executables, on the build system, run:

  $ make STATIC-BENCHTESTS=yes bench-build

You can copy benchmark executables to another machine and run them
without copying the source or build directories.
2021-10-04 10:09:13 -07:00
Noah Goldstein
a1c056c9d0 benchtests: Improve reliability of memcmp benchmarks
No bug. Remove reallocation of bufs between implementation tests. Move
initialization outside of foreach implementation test loop. Increase
iteration count.

Before this commit I was generally seeing a great deal of variability
between runs. The goal of this commit is to make the results more
reliable.

Benchtests build and bench-memcmp succeed.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-09-24 18:04:05 -05:00
Naohiro Tamura
cb5088cfd3 benchtests: Fix validate_benchout.py exceptions
This patch fixes two exceptions in validate_benchout.py:
1) AttributeError
   if benchout_strings.schema.json is specified, and
2) json.decoder.JSONDecodeError
   if benchout file is not JSON.

$ ~/glibc/benchtests/scripts/validate_benchout.py bench-memset.out \
~/glibc/benchtests/scripts/benchout_strings.schema.json
Traceback (most recent call last):
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 86, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 69, in main
    bench.parse_bench(args[0], args[1])
  File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 139, in parse_bench
    do_for_all_timings(bench, lambda b, f, v:
  File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 107, in do_for_all_timings
    if 'timings' not in bench['functions'][func][k].keys():
AttributeError: 'str' object has no attribute 'keys'

$ ~/glibc/benchtests/scripts/validate_benchout.py bench-math-inlines.out \
~/glibc/benchtests/scripts/benchout_strings.schema.json
Traceback (most recent call last):
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 86, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 69, in main
    bench.parse_bench(args[0], args[1])
  File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 137, in parse_bench
    bench = json.load(benchfile)
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 17 (char 16)

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-09-16 09:19:55 +05:30
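The two failure modes described in commit cb5088cfd3 can be illustrated with a
minimal Python sketch.  This is not the actual patch: load_bench and
count_timings are hypothetical helper names, and the sketch only assumes what
the tracebacks show, namely that a non-JSON file raises
json.decoder.JSONDecodeError and that some entries under 'functions' are plain
strings without a keys() method.

```python
import json


def load_bench(text):
    """Parse benchmark output, returning None when it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def count_timings(bench):
    """Count variants carrying a 'timings' key, skipping non-dict entries."""
    found = 0
    for func in bench.get("functions", {}).values():
        if not isinstance(func, dict):
            continue
        for variant in func.values():
            # String attributes such as "bench-variant" have no keys().
            if isinstance(variant, dict) and "timings" in variant:
                found += 1
    return found
```

A caller could report a readable error when load_bench returns None instead of
letting the traceback escape.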
Naohiro Tamura
2fd36391be benchtests: Remove redundant assert.h
This patch removed redundant "#include <assert.h>" from
bench-memset-large.c and bench-memset-walk.c.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-09-13 09:08:26 +05:30
Naohiro Tamura
3886eaff9d benchtests: Enable scripts/plot_strings.py to read stdin
This patch enables scripts/plot_strings.py to read a benchmark result
file from stdin.
To keep backward compatibility, that is, to keep accepting multiple
benchmark result files as arguments, a blank argument doesn't mean stdin,
but '-' does.
Therefore the nargs parameter of the ArgumentParser.add_argument() method
is not changed to '?', but kept as '+'.

ex:
  $ jq '.' bench-memset.out | plot_strings.py -
  $ jq '.' bench-memset.out | plot_strings.py - bench-memset-large.out
  $ plot_strings.py bench-memset.out bench-memset-large.out

error ex:
  $ jq '.' bench-memset.out | plot_strings.py

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-09-13 09:04:21 +05:30
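The '-' convention from commit 3886eaff9d can be sketched as follows.  This is
an illustration under assumptions, not the code in plot_strings.py:
build_parser, open_input and the argument name "bench" are hypothetical; only
nargs='+' and the meaning of '-' come from the commit message.

```python
import argparse
import sys


def build_parser():
    """Parser with one or more positional inputs; '-' stands for stdin."""
    parser = argparse.ArgumentParser(description="hypothetical reader sketch")
    # nargs='+' keeps accepting several result files, as before, and
    # rejects an empty argument list instead of silently reading stdin.
    parser.add_argument("bench", nargs="+",
                        help="benchmark result file(s), or '-' for stdin")
    return parser


def open_input(name, stdin=None):
    """Return a readable text stream for NAME, mapping '-' to stdin."""
    if name == "-":
        return stdin if stdin is not None else sys.stdin
    return open(name, "r")
```

With this shape, an invocation without arguments fails in the parser, which
matches the "error ex" case in the commit message.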
Fangrui Song
710ba420fd Remove sysdeps/*/tls-macros.h
They provide TLS_GD/TLS_LD/TLS_IE/TLS_LE macros for TLS testing.  Now
that we have migrated to __thread and tls_model attributes, these macros
are unused and the tls-macros.h files can retire.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-08-18 09:15:20 -07:00
Paul Zimmermann
db737c79c6 Remove obsolete comments/name from several benchtest input files.
These comments refer to slow paths that were removed in
glibc 2.34 or earlier.  The corresponding "names" that yield
separate workload traces for "make bench" are thus obsolete.
We are however keeping the corresponding inputs.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-08-02 15:27:16 +02:00
Paul Zimmermann
4165dd2e95 Remove obsolete comments/name from acos-inputs, since slow path was removed.
2021-08-02 15:05:22 +02:00
Siddhesh Poyarekar
70d08ba204 tests: use xmalloc to allocate implementation array
The benchmark and tests must fail in case of allocation failure in the
implementation array.  Also annotate the x* allocators in support.h so
that the compiler has more information about them.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-07-28 17:45:19 +05:30
Naohiro Tamura
f12ec02f53 benchtests: Fixed bench-memcpy-random: buf1: mprotect failed
This patch fixed mprotect system call failure on AArch64.
This failure happened on not only A64FX but also ThunderX2.

Also this patch updated a JSON key from "max-size" to "length" so that
'plot_strings.py' can process 'bench-memcpy-random.out'
2021-05-26 12:01:06 +01:00
Noah Goldstein
fc335a0ded Bench: Add support for choosing direction of memcpy in benchtests
This patch adds support for testing memcpy with both dst > src and dst
< src. Since memcpy is implemented as memmove, which has separate
control flows for certain sizes depending on dst > src, it seems like
both information that should be provided in the benchtest output and a
variable that can be controlled for the benchmarks.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-05-23 19:36:36 -04:00
Noah Goldstein
e68d6fccca x86: Expand bench-memcmp.c and test-memcmp.c
No bug. This commit adds some additional performance test cases to
bench-memcmp.c and test-memcmp.c. The new benchtests include some
medium range sizes, as well as small sizes near page cross. The new
correctness tests correspond with the new benchtests though add some
additional cases for checking the page cross logic.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-18 22:57:39 -04:00
Matheus Castanho
f4605e611a benchtests: Use JSON for bench-rawmemchr output
Convert the output of benchtests/bench-rawmemchr to JSON like other string
benchmarks.  This makes the output more parseable and allows usage of
compare_strings.py, for example.

Reviewed-by: Lucas A. M. Magalhaes <lamm@linux.ibm.com>
2021-05-17 11:10:19 -03:00
Paul Zimmermann
8d0985b055 add workload traces for cbrtl
These workload traces cover the whole "long double" range.
This patch was prepared with the help of Adhemerval Zanella.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-10 18:45:34 +02:00
Noah Goldstein
1427d28e30 Bench: Expand bench-memchr.c
No bug. This commit adds some additional cases for bench-memchr.c
including testing medium sizes and testing short length with both an
inbound match and out of bound match.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-05-03 10:18:11 -07:00
H.J. Lu
98544f5bcf bench-memcpy: Collect data from 2KB to 4KB
Collect data on memcpy from 2KB to 4KB with the 64-byte increment value.
2021-05-03 05:08:22 -07:00
Noah Goldstein
81f6dd2135 x86: Expand test-memset.c and bench-memset.c
No bug. This commit adds test cases and benchmarks for page cross and
for memset to the end of the page without crossing. Also, in
test-memset.c this commit adds sentinels at the start/end of tstbuf to
test for overwrites.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-04-19 15:08:04 -07:00
Siddhesh Poyarekar
5660ab19f4 benchtests: Fix name of exp10f benchmark variant
Variant names don't accept brackets.
2021-04-18 12:56:33 +05:30
Siddhesh Poyarekar
a373aa25c7 benchtests: Fix pthread-locks test to produce valid json
The benchtests json allows {function {variant}} categorization of
results whereas the pthread-locks tests had {function {variant
{subvariant}}}, which broke validation.  Fix that by serializing the
subvariants as variant-subvariant.  Also update the schema to
recognize the new benchmark attributes after fixing the naming
conventions.
2021-04-18 12:56:29 +05:30
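The serialization described in commit a373aa25c7 can be sketched in Python for
illustration.  The real benchmark is written in C and its exact key names are
not shown here, so flatten_subvariants and the sample keys in the usage note
are hypothetical; only the "variant-subvariant" naming comes from the commit
message.

```python
def flatten_subvariants(functions):
    """Turn {function: {variant: {subvariant: data}}} into
    {function: {"variant-subvariant": data}}."""
    flat = {}
    for func, variants in functions.items():
        flat[func] = {}
        for variant, subvariants in variants.items():
            for sub, data in subvariants.items():
                # Join the two levels so only one nesting level remains.
                flat[func][f"{variant}-{sub}"] = data
    return flat
```

For example, a nested entry {"mutex": {"empty": {"1-thread": {...}}}} becomes
{"mutex": {"empty-1-thread": {...}}}, which fits the {function {variant}}
shape the schema expects.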
noah
81cbc3bcae x86: Expanding test-memmove.c, test-memcpy.c, bench-memcpy-large.c
No Bug. This commit expands the range of tests / benchmarks for
memmove and memcpy. The test expansion is mostly in the vein of
increasing the maximum size, increasing the number of unique
alignments tested, and testing both source < destination and vice
versa. The benchmark expansion is just to increase the number of
unique alignments. test-memcpy, test-memccpy, test-mempcpy,
test-memmove, and tst-memmove-overflow all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-04-16 12:09:56 -07:00
Paul Zimmermann
934d88d862 add workload traces for missing functions (double format)
This patch adds workload traces for all double format functions where such
files are missing.  For each function, a set of 1000 values is
generated at random using SageMath, such that the output values are
meaningful (for example avoiding too large inputs for exp10 where the
output would be +Inf).  More details about the generated values are
given at the beginning of each file.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-03-29 16:23:19 +02:00
Raphael Moreira Zinsly
6cf1911122 benchtests: Add ilogb* tests
Add a benchtest for ilogb, ilogbf and ilogbf128 based on the logb* benchtests.
2021-03-16 12:19:09 -03:00
Naohiro Tamura
7960c5eea9 benchtests: Updated json bench-variant attribute
This patch updates json "bench-variant" attribute of "bench-memset.c"
to "default" so that the script "benchtests/scripts/plot_strings.py"
can generate a file "memset_time_default_linear.png".
Without this patch, the script "benchtests/scripts/plot_strings.py"
generates a file "memset_time__linear.png" which has inconsistent form
with "memcpy_time_default_linear.png" and
"memmove_time_default_linear.png".
2021-02-10 08:50:26 +05:30
noah
a00e2fe3df strchr: Add additional benchmarks and tests
This patch adds additional benchmarks and tests for string size of
4096 and several benchmarks for string size 256 with different
alignments.
2021-02-08 11:34:00 -08:00
Arjun Shankar
3725ee39db benchtests: Do not build bench-timing-type with MODULE_NAME=libc
Since commit 2682695e5c, `make bench-build' with `--enable-static-pie'
fails due to bench-timing-type being incorrectly built with MODULE_NAME
set to `libc'.  This commit sets MODULE_NAME to nonlib, thus fixing the
build failure.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-01-26 18:14:19 +01:00
Fangrui Song
87d583c6e8 install: Replace scripts/output-format.sed with objdump -f [BZ #26559]
GNU ld and gold have supported --print-output-format since 2011. glibc
requires binutils>=2.25 (2015), so if LD is GNU ld or gold, we can
assume the option is supported.

lld is by default a cross linker supporting multiple targets. It auto
detects the file format and does not need OUTPUT_FORMAT. It does not
support --print-output-format.

By parsing objdump -f, we can support all the three linkers.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-11 12:03:36 -08:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
DJ Delorie
4be44c3208 New benchtest: pthread locks
Performance benchmarks for various posix locks: mutex, rwlock,
spinlock, condvar, and semaphore.  Each test is performed with
an empty loop body or with a computationally "interesting" filler
(i.e. difficult to optimize away, and used just to allow lock code
to be "hidden" in the filler's CPU cycles).
2020-10-21 11:03:52 -04:00
H.J. Lu
06e95b93f0 bench-strcmp.c: Add workloads on page boundary
Add strcmp workloads on page boundary.
2020-09-24 10:46:38 -07:00
H.J. Lu
c4277ba234 bench-strncmp.c: Add workloads on page boundary
Add strncmp workloads on page boundary.
2020-09-24 10:46:30 -07:00
Arjun Shankar
03e26098b1 benchtests: Run _Float128 tests only on architectures that support it
__float128 is a non-standard name and is not available on some architectures
(like aarch64 or s390x) even though they may support the standard _Float128
type.  Other architectures (like armv7) don't support quad-precision
floating-point operations at all.

This commit replaces benchtests references to __float128 with _Float128 and
runs the corresponding tests only on architectures that support it.
2020-09-23 16:11:57 +02:00
Paul Zimmermann
26fbd74059 benchtests: Add "workload" traces for sinf128
This patch adds workload traces for sin in binary128.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
ad1e1db5dc benchtests: Add "workload" traces for sinf
This patch adds workload traces for sinf in binary32.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
cfa220bfdc benchtests: Add "workload" traces for sin
This patch adds workload traces for sin in binary64.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
e24b248dcb benchtests: Add "workload" traces for powf128
This patch adds workload traces for pow in binary128.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
fba686aa42 benchtests: Add "workload" traces for pow
This patch adds workload traces for pow in binary64.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
abc9732aee benchtests: Add "workload" traces for expf128
This patch adds workload traces for exp in binary128.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
59bb418bd0 benchtests: Add "workload" traces for exp
This patch adds workload traces for exp in binary64.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00
Paul Zimmermann
50a8dd367e benchtests/README update.
Improve documentation of the 'name' directive and the 'workload' mechanism.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-08-04 12:44:41 -04:00
Petr Vorel
5500cdba40 Remove --enable-obsolete-rpc configure flag
Sun RPC was removed from glibc. This includes rpcgen program, librpcsvc,
and Sun RPC headers. Also test for bug #20790 was removed
(test for rpcgen).

Backward compatibility for old programs is kept only for architectures
and ABIs that have been added in or before version 2.28.

libtirpc is mature enough, librpcsvc and rpcgen are provided in
rpcsvc-proto project.

NOTE: libnsl code depends on Sun RPC (installed libnsl headers use
installed Sun RPC headers), thus --enable-obsolete-rpc was a dependency
for --enable-obsolete-nsl (removed in a previous commit).

The arc ABI list file has to be updated because the port was added
with the sunrpc symbols

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-13 19:36:35 +02:00
Adhemerval Zanella
2004063fb4 benchtests: Add exp10f benchmark
It is based on expf one by converting each line with the formula:

  new_val = (float) log10 (exp ((double) old_val))
2020-06-19 10:48:15 -03:00
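The conversion formula in commit 2004063fb4 can be reproduced with a short
Python sketch.  The helper names to_float32 and exp10f_input are hypothetical,
and the sketch assumes inputs for which exp() is finite and nonzero in double
precision.  The mapping gives 10**new_val == e**old_val up to rounding, so each
exp10f input produces the same result as the expf input it was derived from.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def exp10f_input(old_val):
    """Map an expf input to the exp10f input producing the same result."""
    # Mirrors: new_val = (float) log10 (exp ((double) old_val))
    return to_float32(math.log10(math.exp(float(old_val))))
```

Mathematically this is old_val * log10(e), so an expf input of 1.0 maps to
roughly 0.4343.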
H.J. Lu
e52434a2e4 benchtests: Restore the clock_gettime option
commit 7621e38bf3
Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Date:   Tue Jan 29 17:43:45 2019 +0000

    Add generic hp-timing support

removed the clock_gettime option.  Restore the clock_gettime option for
some x86 CPUs on which value from RDTSC may not be incremented at a fixed
rate.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-05 09:48:07 -07:00
H.J. Lu
f607047668 Update HP_TIMING_NOW for _ISOMAC in sysdeps/generic/hp-timing.h
commit e9698175b0
Author: Lukasz Majewski <lukma@denx.de>
Date:   Mon Mar 16 08:31:41 2020 +0100

    y2038: Replace __clock_gettime with __clock_gettime64

breaks benchtests with sysdeps/generic/hp-timing.h:

In file included from ./bench-timing.h:23,
                 from ./bench-skeleton.c:25,
                 from
/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/benchtests/bench-rint.c:45:
./bench-skeleton.c: In function ‘main’:
../sysdeps/generic/hp-timing.h:37:23: error: storage size of ‘tv’ isn’t known
   37 |   struct __timespec64 tv;      \
      |                       ^~

Define HP_TIMING_NOW with clock_gettime in sysdeps/generic/hp-timing.h
if _ISOMAC is defined.  Don't define __clock_gettime in bench-timing.h
since it is no longer needed.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-05 09:44:06 -07:00
Lukasz Majewski
e9698175b0 y2038: Replace __clock_gettime with __clock_gettime64
The __clock_gettime internal function does not support 64 bit time on
architectures with __WORDSIZE == 32 and __TIMESIZE != 64 (like e.g. ARM 32
bit).

The __clock_gettime64 function shall be used instead in glibc itself as
it supports 64 bit time on those systems.
This patch does not bring any changes to systems with __WORDSIZE == 64 as
for them the __clock_gettime64 is aliased to __clock_gettime (in
./include/time.h).
2020-05-20 16:45:16 +02:00
Shen-Ta Hsieh
642d5abaf1 Add benchtests for roundeven and roundevenf.
This patch adds benchtests for the roundeven and roundevenf functions.
The inputs are copied from trunc-inputs.
2020-03-27 23:24:02 +00:00
Alistair Francis
4f88b38097 Convert Python scripts to Python 3
Change all of the #! lines in Python scripts that are called from
Makefiles to reference /usr/bin/python3.

All of the scripts called from Makefiles are already run with Python 3,
so let's make sure they are explicitly using Python 3 if called
manually.
2020-03-03 15:52:09 -08:00
Wilco Dijkstra
511c91b114 Improve random memcpy benchmark
Improve the random memcpy benchmark.  Double the number of copies and
increase the memory sizes tested to 512KB.  Add a more detailed
distribution of memcpy alignment and sizes up to 4096 based on SPEC2017
traces.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-02-10 16:08:40 +00:00
Joseph Myers
92ce43eef7 Run bench-timing-type with newly built libc.
benchtests/timing-type is built with the newly built libc, so should
be run with it like actual tests and benchmarks.
2020-01-20 11:29:41 +00:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights.
2020-01-01 00:14:33 +00:00
Krzysztof Koch
15740788d7 Add new script for plotting string benchmark JSON output
Add a script for visualizing the JSON output generated by existing
glibc string microbenchmarks.

Overview:
plot_strings.py is capable of plotting benchmark results in the
following formats, which are controlled with the -p or --plot argument:
1. absolute timings (-p time): plot the timings as they are in the
input benchmark results file.
2. relative timings (-p rel): plot relative timing difference with
respect to a chosen ifunc (controlled with -b argument).
3. performance relative to max (-p max): for each varied parameter
value, plot 1/timing as the percentage of the maximum value out of
the plotted ifuncs.
4. throughput (-p thru): plot varied parameter value over timing
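
As an illustration of the derived plot types 2 to 4 above, the following
Python sketch computes each quantity from one row of timings (one timing per
ifunc).  It is not taken from plot_strings.py: the function names are
hypothetical, and the exact form of the relative difference (a percentage of
the baseline timing) is an assumption.

```python
def relative_timings(timings, baseline=0):
    """Percentage difference of each ifunc's timing from the baseline's."""
    base = timings[baseline]
    return [100.0 * (t - base) / base for t in timings]


def percent_of_max(timings):
    """1/timing for each ifunc as a percentage of the best 1/timing."""
    perf = [1.0 / t for t in timings]
    best = max(perf)
    return [100.0 * p / best for p in perf]


def throughput(param_value, timings):
    """Varied parameter value (e.g. length) divided by each timing."""
    return [param_value / t for t in timings]
```

Absolute timings (type 1) need no transformation, so they are omitted.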

For all types of graphs, there is an option to explicitly specify
the subset of ifuncs to plot using the --ifuncs parameter.

For plot types 1. and 4. one can hide/expose exact benchmark figures
using the --values flag.

When plotting relative timing differences between ifuncs, the first
ifunc listed in the input JSON file is the baseline, unless the
baseline implementation is explicitly chosen with the --baseline
parameter. For the ease of reading, the script marks the statistically
insignificant range on the graphs. The default is +-5% but this
value can be controlled with the --threshold parameter.

To accommodate the heterogeneity in benchmark results files,
one can control e.g. the x-axis scale, the resolution (dpi) of the
generated figures or the key to access the varied parameter value
in the JSON file. The corresponding options are --logarithmic,
--resolution or --key. The --key parameter ensures that plot_strings.py
works with all files which pass JSON schema validation. The schema
can be chosen with the --schema parameter.

If a window manager is available, one can enable interactive
figure display using the --display flag.

Finally, one can use the --grid flag to enable grid lines in the
generated figures.

Implementation:
plot_strings.py traverses the JSON tree until a 'results' array
is found and generates a separate figure for each such array.
The figure is then saved to a file in one of the available formats
(controlled with the --extension parameter).

As the tree is traversed, the recursive function tracks the metadata
about the test being run, so that each figure has a unique and
meaningful title and filename.

While plot_strings.py works with existing benchmarks, provisions
have been made to allow adding more structure and metadata to these
benchmarks. Currently, many benchmarks produce multiple timing values
for the same value of the varied parameter (typically 'length').
Multiple data points for the same parameter usually mean that some other
parameter was varied as well, for example, whether memmove's src and dst
buffers overlap or not (see bench-memmove-walk.c and
bench-memmove-walk.out).

Unfortunately, this information is not exposed in the benchmark output
file, so plot_strings.py has to resort to computing the geometric mean
of these multiple values. In the process, useful information about the
benchmark configuration is lost. Also, averaging the timings for
different alignments can hide useful characteristics of the benchmarked
ifuncs.
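
The averaging step described above can be sketched as follows.  This is a
minimal illustration, not the script's own code: geometric_mean and
merge_duplicate_points are hypothetical names, and the sketch assumes each
result entry holds the varied parameter under a key such as 'length' plus a
'timings' list with one positive value per ifunc.

```python
import math
from collections import defaultdict


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def merge_duplicate_points(results, key="length", ifunc=0):
    """Collapse results sharing the same KEY into one geometric-mean timing."""
    grouped = defaultdict(list)
    for entry in results:
        grouped[entry[key]].append(entry["timings"][ifunc])
    return {k: geometric_mean(v) for k, v in sorted(grouped.items())}
```

Two entries with length 1 and timings 2.0 and 8.0 collapse to a single point
near 4.0, which shows how the distinction between the two configurations is
lost.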

Testing:
plot_strings.py has been tested on all existing string microbenchmarks
which produce results in JSON format. The script was tested on both
Windows 10 and Ubuntu 16.04.2 LTS. It runs on both python 2 and 3
(2.7.12 and 3.5.12 tested).

Useful commands:
1. Plot timings for all ifuncs in bench-strlen.out:
$ ./plot_strings.py bench-strlen.out

2. Display help:
$ ./plot_strings.py -h

3. Plot throughput for __memset_avx512_unaligned_erms and
__memset_avx512_unaligned. Save the generated figure in pdf format to
'results/'. Use logarithmic x-axis scale, show grid lines and expose
the performance numbers:
$ ./plot_strings.py bench.out -o results/ -lgv -e pdf -p thru \
-i __memset_avx512_unaligned_erms __memset_avx512_unaligned

4. Plot relative timings for all ifuncs in bench.out with __generic_memset
as baseline. Display percentage difference threshold of +-10%:
$ ./plot_strings.py bench.out -p rel  -b __generic_memset -t 10

Discussion:
1. I would like to propose relaxing the benchout_strings.schema.json
to allow specifying either a 'results' array with 'timings' (as before)
or a 'variants' array. See below example:

{
 "timing_type": "hp_timing",
 "functions": {
  "memcpy": {
   "bench-variant": "default",
   "ifuncs": ["generic_memcpy", "__memcpy_thunderx"],
   "variants": [
    {
     "name": "powers of 2",
     "variants": [
      {
       "name": "both aligned",
       "results": [
        {
         "length": 1,
         "align1": 0,
         "align2": 0,
         "timings": [x, y]
        },
        {
         "length": 2,
         "align1": 0,
         "align2": 0,
         "timings": [x, y]
        },
...
        {
         "length": 65536,
         "align1": 0,
         "align2": 0,
         "timings": [x, y]
        }]
      },
      {
       "name": "dst misaligned",
       "results": [
        {
         "length": 1,
         "align1": 0,
         "align2": 0,
         "timings": [x, y]
        },
        {
         "length": 2,
         "align1": 0,
         "align2": 1,
         "timings": [x, y]
        },
...

'variants' array consists of objects such that each object has a 'name'
attribute to describe the configuration of a particular test in the
benchmark. This can be a description, for example, of how the parameter
was varied or what was the buffer alignment tested. The 'name' attribute
is then followed by another 'variants' array or a 'results' array.

The nesting of variants allows arbitrary grouping of benchmark timings,
while allowing description of these groups. Using recursion, it is
possible to procedurally create titles and filenames for the figures being
generated.
2019-11-13 14:18:52 +00:00
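The recursive traversal of the proposed nested 'variants' layout from commit
15740788d7 can be sketched in Python.  This is an illustration of the idea,
not the implementation in plot_strings.py: collect_results is a hypothetical
name, and joining the 'name' attributes with " / " is an arbitrary choice for
building a title.

```python
def collect_results(node, path=()):
    """Yield (title, results) for every 'results' array under NODE."""
    name = node.get("name")
    if name is not None:
        # Track the names seen so far so each figure gets a unique title.
        path = path + (name,)
    if "results" in node:
        yield " / ".join(path), node["results"]
    for child in node.get("variants", []):
        yield from collect_results(child, path)
```

Applied to the example above, starting at the "powers of 2" object, this would
yield one entry titled "powers of 2 / both aligned" and another titled
"powers of 2 / dst misaligned".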
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to clean up:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
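
For readers who want to try the two sed substitutions on sample text, the
following is a rough Python transliteration. It is only a sketch: sed -r
uses POSIX extended regular expressions, while Python's re module uses
backtracking semantics, so results could differ on unusual inputs. The
function name rewrite_line is hypothetical.

```python
import re

# First sed expression: http/ftp URLs on gnu.org, fsf.org, sourceware.org.
GNU_RE = re.compile(
    r"(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z]))")

# Second sed expression: sources.redhat.com becomes sourceware.org.
REDHAT_RE = re.compile(
    r"(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z])")


def rewrite_line(line):
    """Apply both substitutions to one line, in the same order as the script."""
    line = GNU_RE.sub(r"https\2", line)
    line = REDHAT_RE.sub(r"https\2sourceware.org\4", line)
    return line
```

For example, rewrite_line("http://www.gnu.org/licenses/") yields
"https://www.gnu.org/licenses/", while a host that merely starts with
"gnu.org", such as "http://gnu.org.example.com/", is left unchanged.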
2019-09-07 02:43:31 -07:00
Wilco Dijkstra
3c05dd79d0 Use generic memset/memcpy/memmove in benchtests
Use the generic C memset/memcpy/memmove in benchtests since comparing
against a slow byte-oriented implementation makes no sense.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

2019-08-29  Wilco Dijkstra  <wdijkstr@arm.com>

	* benchtests/bench-memcpy.c (simple_memcpy): Remove.
	(generic_memcpy): Include generic C memcpy.
	* benchtests/bench-memmove.c (simple_memmove): Remove.
	(generic_memmove): Include generic C memmove.
	* benchtests/bench-memset.c (simple_memset): Remove.
	(generic_memset): Include generic C memset.
	* benchtests/bench-memset-large.c (simple_memset): Remove.
	(generic_memset): Include generic C memset.
	* benchtests/bench-memset-walk.c (simple_memset): Remove.
	(generic_memset): Include generic C memset.
	* string/memcpy.c (MEMCPY): Add defines to enable redirection.
	* string/memset.c (MEMSET): Likewise.
	* sysdeps/x86_64/memcopy.h: Remove empty file.
2019-08-30 17:21:35 +01:00
Adhemerval Zanella
0cccd37f70 benchtests: Add logb{f} benchmark
* benchtests/Makefile (bench-math): Add logb.
	* benchtests/logb-inputs: New file.
	* benchtests/logbf-inputs: New file.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:22:22 -03:00
Adhemerval Zanella
f215dbbdf1 benchtests: hypot benchmark
Inputs are based on argument reductions from generic and powerpc
implementation.

	* benchtests/Makefile (bench-math): Add hypot.
	* benchtests/hypot-inputs: New file.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:14:04 -03:00
Wilco Dijkstra
d064591266 Further improve string bench timing
Further improve the timings of the string benchmarks.  Ensure most take
between 1 and 4 seconds to improve accuracy.  Overall time taken increases
by 35%.  Tested on AArch64.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

	* benchtests/bench-math-inlines.c: Increase iterations.
	* benchtests/bench-memcmp.c: Likewise.
	* benchtests/bench-rawmemchr.c: Likewise.
	* benchtests/bench-strcmp.c: Likewise.
	* benchtests/bench-strcpy_chk.c: Likewise.
	* benchtests/bench-string.h (INNER_LOOP_ITERS8): Add define.
	(INNER_LOOP_ITERS_MEDIUM): Increase iterations.
	(INNER_LOOP_ITERS_SMALL): Likewise.
	* benchtests/bench-strncat.c: Increase iterations.
	* benchtests/bench-strncmp.c: Increase iterations.
	* benchtests/bench-strncpy.c: Reduce iterations for wide strings.
	* benchtests/bench-strrchr.c: Increase iterations.
	* benchtests/bench-strstr.c: Keep iterations unchanged.
	* benchtests/bench-strtod.c: Increase iterations.
2019-06-28 13:42:36 +01:00
Anton Youdkevitch
afe23eb0f1 Bump up the runtime for "short" benchmarks
Some benchmarks with a very short runtime show significantly
different results across runs on AArch64, by up to tens of percent.
Increasing the runtime to 100ms or more brings the deviation under 5%.
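
The effect can be illustrated with a small sketch that is not taken from
the benchtests sources: the per-call time is the total time of an inner
loop divided by its iteration count, so a larger count lengthens the
measured interval and dilutes fixed timer overhead. INNER_LOOP_ITERS here
is a hypothetical Python constant, not the macro from bench-string.h, and
relative_spread is a hypothetical helper.

```python
import time

INNER_LOOP_ITERS = 1000  # hypothetical value, chosen only for illustration


def time_per_call(func, iters=INNER_LOOP_ITERS):
    """Average time per call in nanoseconds over an inner loop of iters calls."""
    start = time.perf_counter_ns()
    for _ in range(iters):
        func()
    stop = time.perf_counter_ns()
    return (stop - start) / iters


def relative_spread(samples):
    """Spread of repeated measurements as a fraction of the smallest one."""
    lo, hi = min(samples), max(samples)
    return (hi - lo) / lo
```

Repeating time_per_call several times with a small and a large iters
value and comparing relative_spread of the two sample sets is one way to
observe the run-to-run deviation mentioned above.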

Tested on AArch64 and x86-64.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* benchtests/bench-memccpy.c: Replace INNER_LOOP_ITERS
	with INNER_LOOP_ITERS_LARGE.
	* benchtests/bench-memchr.c: Likewise.
	* benchtests/bench-rawmemchr.c: Likewise.
	* benchtests/bench-strcat.c: Likewise.
	* benchtests/bench-strchr.c: Likewise.
	* benchtests/bench-string.h: Likewise.
	* benchtests/bench-strlen.c: Likewise.
	* benchtests/bench-strncpy.c: Likewise.
	* benchtests/bench-strnlen.c: Likewise.
2019-06-28 13:38:07 +01:00
Stefan Liebler
f0c5a803bd Fix gcc 9 build errors for make xcheck. [BZ #24556]
This patch fixes the following gcc 9 warnings for "make xcheck" / "make bench":
-string/tst-strcasestr.c:
../include/bits/../../misc/bits/error.h:42:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=]

-argp/argp-test.c:
argp-test.c:130:20: error: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Werror=format-overflow=]
argp-test.c:130:19: note: directive argument in the range [-2147483648, 122]
argp-test.c:130:5: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 10

-nss/tst-field.c:
tst-field.c:52:7: error: ‘%s’ directive argument is null [-Werror=format-overflow=]

-benchtests/bench-strstr.c:
../include/bits/../../misc/bits/error.h:42:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=]

-benchtests/bench-malloc-simple.c:
bench-malloc-simple.c:93:16: error: iteration 3 invokes undefined behavior [-Werror=aggressive-loop-optimizations]

ChangeLog:

	[BZ #24556]
	* string/test-strcasestr.c (check_result): Add NULL check.
	* nss/tst-field.c (check_rewrite): Likewise.
	* benchtests/bench-strstr.c (do_one_test): Likewise.
	* string/test-strstr.c (check_result): Likewise.
	* argp/argp-test.c (popt): Increase size of buf to 12.
	* benchtests/bench-malloc-simple.c (bench):
	Do not initialize tests array out of bounds.
2019-06-19 12:32:04 +02:00
Adhemerval Zanella
2731a326b1 benchtests: Add isnan/isinf/isfinite benchmark
* benchtests/Makefile (bench-math): Add isnan, isinf, and isfinite.
	(CFLAGS-bench-isnan.c, CFLAGS-bench-isinf.c,
	CFLAGS-bench-isfinite.c): New rule.
	* benchtests/isnan-input: New file.
	* benchtests/isinf-input: New file.
	* benchtests/isfinite-input: New file.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 11:46:30 -03:00
Wilco Dijkstra
80b2bfb535 Benchmark strstr hard needles
Benchmark needles which exhibit worst-case performance.  This shows that
basic_strstr is quadratic and thus unsuitable for large needles.
On the other hand the Two-way and new strstr implementations are linear with
increasing needle sizes.  The slowest cases of the two implementations are
within a factor of 2 on several different microarchitectures.  Two-way is
slowest on inputs which cause a branch mispredict on almost every character.
The new strstr is slowest on inputs which almost match and result in many
calls to memcmp.  Thanks to Szabolcs for providing various hard needles.
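
To illustrate why a naive search is quadratic on such inputs, here is a
small sketch that counts character comparisons. It is an assumption-laden
illustration: basic_strstr_count is a hypothetical naive search, not the
benchmark's basic_strstr, and hard_case builds a simple near-match input
rather than the hard needles used in bench-strstr.c.

```python
def basic_strstr_count(haystack, needle):
    """Naive search; return (match index or -1, character comparisons made)."""
    comparisons = 0
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break
            j += 1
        if j == m:
            return i, comparisons
    return -1, comparisons


def hard_case(m):
    """Haystack of 2*m 'a's and a needle of m-1 'a's followed by 'b'."""
    return "a" * (2 * m), "a" * (m - 1) + "b"
```

With this input every one of the m + 1 candidate positions matches m - 1
characters before failing, so the comparison count is m * (m + 1):
doubling the needle length roughly quadruples the work.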

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

	* benchtests/bench-strstr.c (test_hard_needle): New function.
2019-06-11 15:52:21 +01:00
Wilco Dijkstra
46ae07324b Improve string benchtest timing
Improve string benchtest timing.  Many tests run for 0.01s which is way too
short to give accurate results.  Other tests take over 40 seconds which is
way too long.  Significantly increase the iterations of the short running
tests.  Reduce number of alignment variations in the long running memcpy walk
tests so they take less than 5 seconds.

As a result most tests take at least 0.1s and all finish within 5 seconds.

	* benchtests/bench-memcpy-random.c (do_one_test): Use medium iterations.
	* benchtests/bench-memcpy-walk.c (test_main): Reduce alignment tests.
	* benchtests/bench-memmem.c (do_one_test): Use small iterations.
	* benchtests/bench-memmove-walk.c (test_main): Reduce alignment tests.
	* benchtests/bench-memset-walk.c (test_main): Reduce alignment tests.
	* benchtests/bench-strcasestr.c (do_one_test): Use small iterations.
	* benchtests/bench-string.h (INNER_LOOP_ITERS): Increase iterations.
	(INNER_LOOP_ITERS_MEDIUM): New define.
	(INNER_LOOP_ITERS_SMALL): New define.
	* benchtests/bench-strpbrk.c (do_one_test): Use medium iterations.
	* benchtests/bench-strsep.c (do_one_test): Use small iterations.
	* benchtests/bench-strspn.c (do_one_test): Use medium iterations.
	* benchtests/bench-strstr.c (do_one_test): Use small iterations.
	* benchtests/bench-strtok.c (do_one_test): Use small iterations.
2019-05-21 15:19:06 +01:00
Florian Weimer
b5ffdc48c2 benchtests: Enable BIND_NOW if configured with --enable-bind-now
Benchmarks should reflect distribution build policies, so it makes
sense to honor the BIND_NOW configuration for them.

This commit keeps using $(+link-tests), so that the benchmarks are
linked according to the --enable-hardcoded-path-in-tests configure
option.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2019-04-25 10:41:52 +02:00
Wilco Dijkstra
fe92a91f1e Reduce benchtests time
Reduce the total time taken by benchtests.  The malloc thread test takes 4
minutes to run which is significantly more than most other tests. Reduce
this to a more reasonable 40 seconds.  The math tests take 10 seconds each,
even though all they do is loop on the same input.  Anything more than 1
second runtime is way overkill, so set the limit to 1 second.

	* benchtests/Makefile (BENCH_DURATION): Set to 1 second.
	* benchtests/bench-malloc-thread.c (BENCH_DURATION): Set to 10 seconds.
2019-04-24 15:38:49 +01:00
Wilco Dijkstra
648279f4af Improve string benchtests
Replace slow byte-oriented tests in several string benchmarks with the
generic implementations from the string/ directory so the comparisons
are more realistic and useful.

	* benchtests/bench-stpcpy.c (SIMPLE_STPCPY): Remove function.
	(generic_stpcpy): New function.
	* benchtests/bench-stpncpy.c (SIMPLE_STPNCPY): Remove function.
	(generic_stpncpy): New function.
	* benchtests/bench-strcat.c (SIMPLE_STRCAT): Remove function.
	(generic_strcat): New function.
	* benchtests/bench-strcpy.c (SIMPLE_STRCPY): Remove function.
	(generic_strcpy): New function.
	* benchtests/bench-strncat.c (SIMPLE_STRNCAT): Remove function.
	(STUPID_STRNCAT): Remove function.
	(generic_strncat): New function.
	* benchtests/bench-strncpy.c (SIMPLE_STRNCPY): Remove function.
	(STUPID_STRNCPY): Remove function.
	(generic_strncpy): New function.
	* benchtests/bench-strnlen.c (SIMPLE_STRNLEN): Remove function.
	(generic_strnlen): New function.
	(memchr_strnlen): New function.
	* benchtests/bench-strlen.c (generic_strlen): Define for WIDE.
	(memchr_strlen): Likewise.
2019-04-09 11:54:34 +01:00
Wilco Dijkstra
93eebae516 Improve bench-strstr
Improve bench-strstr by using an extract from the manual as the input
to make the test more realistic.  Use the same input for both found and
fail cases rather than using a memset of '0' for most of the string,
which measures performance of strchr rather than strstr.  Add result
checking to catch potential errors.  Remove the repeated tests at slightly
different alignments and add more large needle and haystack testcases.

Replace stupid_strstr with an efficient basic implementation.  Add the
Two-way implementation to simplify comparisons with much faster generic
implementations.

	* benchtests/bench-strstr.c (input): Add realistic input text.
	(stupid_strstr): Remove function.
	(basic_strstr): Add function.
	(twoway_strstr): Add function.
	(do_one_test): Add result checking.
	(do_test): Use new input text.  Remove accidental early matches.
	(test_main): Improve range of tests, reduce unaligned cases.
2019-04-09 11:49:18 +01:00
Wilco Dijkstra
a173d09f85 Improve bench-memmem
Improve bench-memmem by replacing simple_memmem with a more efficient
implementation.  Add the Two-way implementation to enable direct comparison
with the optimized memmem.

	* benchtests/bench-memmem.c (simple_memmem): Remove function.
	(basic_memmem): Add function.
	(twoway_memmem): Add function.
2019-04-09 11:46:28 +01:00
Wilco Dijkstra
6103c0a811 Remove TIMING_INIT
Remove TIMING_INIT since it's no longer used.

	* benchtests/bench-malloc-simple.c: Remove TIMING_INIT.
	* benchtests/bench-malloc-thread.c: Likewise.
	* benchtests/bench-skeleton.c: Likewise.
	* benchtests/bench-strtod.c: Likewise.
	* benchtests/bench-timing.h: Likewise.
2019-04-09 11:38:24 +01:00
Wilco Dijkstra
7621e38bf3 Add generic hp-timing support
Add missing generic hp_timing support.  It uses clock_gettime (CLOCK_MONOTONIC),
which has an unspecified starting time and nanosecond resolution, and should be
faster on architectures that implement the symbol in the vDSO.
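
As a rough illustration of timing with a monotonic clock, the sketch
below uses Python's time.clock_gettime_ns with time.CLOCK_MONOTONIC
(available on Unix with Python 3.7 or later). The function names
hp_timing_now and hp_timing_diff are hypothetical and only mirror the
idea of the C macros; this is not the glibc implementation.

```python
import time


def hp_timing_now():
    """Current CLOCK_MONOTONIC reading as an integer number of nanoseconds."""
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC)


def hp_timing_diff(start, end):
    """Elapsed nanoseconds; only differences are meaningful because the
    clock's starting point is unspecified."""
    return end - start


def measure(func):
    """Time a single call of func in nanoseconds."""
    start = hp_timing_now()
    func()
    return hp_timing_diff(start, hp_timing_now())
```

Because the readings are plain 64-bit-sized integers, a difference can be
taken with ordinary subtraction, with no separate diff or accumulate
helper needed.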

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all affected ABIs.

	* benchtests/Makefile (USE_CLOCK_GETTIME) Remove.
	* benchtests/README: Update description.
	* benchtests/bench-timing.h: Default to hp-timing.
	* sysdeps/generic/hp-timing.h (HP_TIMING_DIFF, HP_TIMING_ACCUM_NT,
	HP_TIMING_PRINT): Remove.
	(HP_TIMING_NOW): Add generic implementation.
	(hp_timing_t): Change to uint64_t.
2019-03-22 17:30:44 -03:00
Adhemerval Zanella
1e372ded4f Refactor hp-timing rtld usage
This patch refactors how hp-timing is used in the loader code for statistics
reporting.  HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed, and
HP_TIMING_INLINE is used instead to check for hp-timing availability.
For alpha, which only defines HP_SMALL_TIMING_AVAIL, HP_TIMING_INLINE
is set iff IS_IN(rtld).

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all affected ABIs.

	* benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with
	HP_TIMING_INLINE.
	* nptl/descr.h: Likewise.
	* elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF,
	RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define.
	(dl_start_final_info, _dl_start_final, dl_main, print_statistics):
	Abstract hp-timing usage with RTLD_* macros.
	* sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld).
	(HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove.
	* sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL,
	HP_TIMING_NONAVAIL): Likewise.
	* sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
	Likewise.
	* sysdeps/generic/hp-timing-common.h: Update comment with
	HP_TIMING_AVAIL removal.
2019-03-22 17:30:44 -03:00
Joseph Myers
c4f50205e1 Add some spaces before '('.
This patch fixes various places where a space should have been present
before '(' in accordance with the GNU Coding Standards.  Most but not
all of the fixes in this patch are for calls to sizeof (but it's not
exhaustive regarding such calls that should be fixed).

Tested for x86_64, and with build-many-glibcs.py.

	* benchtests/bench-strcpy.c (do_test): Use space before '('.
	* benchtests/bench-string.h (cmdline_process_function): Likewise.
	* benchtests/bench-strlen.c (do_test): Likewise.
	(test_main): Likewise.
	* catgets/gencat.c (read_old): Likewise.
	* elf/cache.c (load_aux_cache): Likewise.
	* iconvdata/bug-iconv8.c (do_test): Likewise.
	* math/test-tgmath-ret.c (do_test): Likewise.
	* nis/nis_call.c (rec_dirsearch): Likewise.
	* nis/nis_findserv.c (__nis_findfastest_with_timeout): Likewise.
	* nptl/tst-audit-threads.c (do_test): Likewise.
	* nptl/tst-cancel4-common.h (set_socket_buffer): Likewise.
	* nss/nss_test1.c (init): Likewise.
	* nss/test-netdb.c (test_hosts): Likewise.
	* posix/execvpe.c (maybe_script_execute): Likewise.
	* stdio-common/tst-fmemopen4.c (do_test): Likewise.
	* stdio-common/tst-printf.c (do_test): Likewise.
	* stdio-common/vfscanf-internal.c (__vfscanf_internal): Likewise.
	* stdlib/fmtmsg.c (NKEYWORDS): Likewise.
	* stdlib/qsort.c (STACK_SIZE): Likewise.
	* stdlib/test-canon.c (do_test): Likewise.
	* stdlib/tst-swapcontext1.c (do_test): Likewise.
	* string/memcmp.c (OPSIZ): Likewise.
	* string/test-strcpy.c (do_test): Likewise.
	(do_random_tests): Likewise.
	* string/test-strlen.c (do_test): Likewise.
	(test_main): Likewise.
	* string/test-strrchr.c (do_test): Likewise.
	(do_random_tests): Likewise.
	* string/tester.c (test_memrchr): Likewise.
	(test_memchr): Likewise.
	* sysdeps/generic/memcopy.h (OPSIZ): Likewise.
	* sysdeps/generic/unwind-dw2.c (execute_stack_op): Likewise.
	* sysdeps/generic/unwind-pe.h (read_sleb128): Likewise.
	(read_encoded_value_with_base): Likewise.
	* sysdeps/hppa/dl-machine.h (elf_machine_runtime_setup): Likewise.
	* sysdeps/hppa/fpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/ia64/fpu/sfp-machine.h (TI_BITS): Likewise.
	* sysdeps/mach/hurd/spawni.c (__spawni): Likewise.
	* sysdeps/posix/spawni.c (maybe_script_execute): Likewise.
	* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (query_auxv):
	Likewise.
	* sysdeps/unix/sysv/linux/aarch64/bits/procfs.h (ELF_NGREG):
	Likewise.
	* sysdeps/unix/sysv/linux/arm/bits/procfs.h (ELF_NGREG): Likewise.
	* sysdeps/unix/sysv/linux/arm/ioperm.c (init_iosys): Likewise.
	* sysdeps/unix/sysv/linux/csky/bits/procfs.h (ELF_NGREG):
	Likewise.
	* sysdeps/unix/sysv/linux/m68k/bits/procfs.h (ELF_NGREG):
	Likewise.
	* sysdeps/unix/sysv/linux/nios2/bits/procfs.h (ELF_NGREG):
	Likewise.
	* sysdeps/unix/sysv/linux/spawni.c (maybe_script_execute):
	Likewise.
	* sysdeps/unix/sysv/linux/x86/bits/procfs.h (ELF_NGREG): Likewise.
	* sysdeps/unix/sysv/linux/x86/bits/sigcontext.h
	(FP_XSTATE_MAGIC2_SIZE): Likewise.
	* sysdeps/x86/fpu/sfp-machine.h (TI_BITS): Likewise.
	* time/test_time.c (main): Likewise.
2019-02-27 13:55:45 +00:00
Joseph Myers
34a5a1460e Break some lines before not after operators.
The GNU Coding Standards specify that line breaks in expressions
should go before an operator, not after one.  This patch fixes various
code to do this.  It only changes code that appears to be mostly
following GNU style anyway, not files and directories with
substantially different formatting.  It is not exhaustive even for
files using GNU style (for example, changes to sysdeps files are
deferred for subsequent cleanups).  Some files changed are shared with
gnulib, but most are specific to glibc.  Changes were made manually,
with places to change found by grep (so some cases, e.g. where the
operator was followed by a comment at end of line, are particularly
liable to have been missed by grep, but I did include cases where the
operator was followed by backslash-newline).

This patch generally does not attempt to address other coding style
issues in the expressions changed (for example, missing spaces before
'(', or lack of parentheses to ensure indentation of continuation
lines properly reflects operator precedence).

Tested for x86_64, and with build-many-glibcs.py.

	* benchtests/bench-memmem.c (simple_memmem): Break lines before
	rather than after operators.
	* benchtests/bench-skeleton.c (TIMESPEC_AFTER): Likewise.
	* crypt/md5.c (md5_finish_ctx): Likewise.
	* crypt/sha256.c (__sha256_finish_ctx): Likewise.
	* crypt/sha512.c (__sha512_finish_ctx): Likewise.
	* elf/cache.c (load_aux_cache): Likewise.
	* elf/dl-load.c (open_verify): Likewise.
	* elf/get-dynamic-info.h (elf_get_dynamic_info): Likewise.
	* elf/readelflib.c (process_elf_file): Likewise.
	* elf/rtld.c (dl_main): Likewise.
	* elf/sprof.c (generate_call_graph): Likewise.
	* hurd/ctty-input.c (_hurd_ctty_input): Likewise.
	* hurd/ctty-output.c (_hurd_ctty_output): Likewise.
	* hurd/dtable.c (reauth_dtable): Likewise.
	* hurd/getdport.c (__getdport): Likewise.
	* hurd/hurd/signal.h (_hurd_interrupted_rpc_timeout): Likewise.
	* hurd/hurd/sigpreempt.h (HURD_PREEMPT_SIGNAL_P): Likewise.
	* hurd/hurdfault.c (_hurdsig_fault_catch_exception_raise):
	Likewise.
	* hurd/hurdioctl.c (fioctl): Likewise.
	* hurd/hurdselect.c (_hurd_select): Likewise.
	* hurd/hurdsig.c (_hurdsig_abort_rpcs): Likewise.
	(STOPSIGS): Likewise.
	* hurd/hurdstartup.c (_hurd_startup): Likewise.
	* hurd/intr-msg.c (_hurd_intr_rpc_mach_msg): Likewise.
	* hurd/lookup-retry.c (__hurd_file_name_lookup_retry): Likewise.
	* hurd/msgportdemux.c (msgport_server): Likewise.
	* hurd/setauth.c (_hurd_setauth): Likewise.
	* include/features.h (__GLIBC_USE_DEPRECATED_SCANF): Likewise.
	* libio/libioP.h [IO_DEBUG] (CHECK_FILE): Likewise.
	* locale/programs/ld-ctype.c (set_class_defaults): Likewise.
	* localedata/tests-mbwc/tst_swscanf.c (tst_swscanf): Likewise.
	* login/tst-utmp.c (do_check): Likewise.
	(simulate_login): Likewise.
	* mach/lowlevellock.h (lll_lock): Likewise.
	(lll_trylock): Likewise.
	* math/test-fenv.c (ALL_EXC): Likewise.
	* math/test-fenvinline.c (ALL_EXC): Likewise.
	* misc/sys/cdefs.h (__attribute_deprecated_msg__): Likewise.
	* nis/nis_call.c (__do_niscall3): Likewise.
	* nis/nis_callback.c (cb_prog_1): Likewise.
	* nis/nis_defaults.c (searchaccess): Likewise.
	* nis/nis_findserv.c (__nis_findfastest_with_timeout): Likewise.
	* nis/nis_ismember.c (internal_ismember): Likewise.
	* nis/nis_local_names.c (nis_local_principal): Likewise.
	* nis/nss_nis/nis-rpc.c (_nss_nis_getrpcbyname_r): Likewise.
	* nis/nss_nisplus/nisplus-netgrp.c (_nss_nisplus_getnetgrent_r):
	Likewise.
	* nis/ypclnt.c (yp_match): Likewise.
	(yp_first): Likewise.
	(yp_next): Likewise.
	(yp_master): Likewise.
	(yp_order): Likewise.
	* nscd/hstcache.c (cache_addhst): Likewise.
	* nscd/initgrcache.c (addinitgroupsX): Likewise.
	* nss/nss_compat/compat-pwd.c (copy_pwd_changes): Likewise.
	(internal_getpwuid_r): Likewise.
	* nss/nss_compat/compat-spwd.c (copy_spwd_changes): Likewise.
	* posix/glob.h (__GLOB_FLAGS): Likewise.
	* posix/regcomp.c (peek_token): Likewise.
	(peek_token_bracket): Likewise.
	(parse_expression): Likewise.
	* posix/regexec.c (sift_states_iter_mb): Likewise.
	(check_node_accept_bytes): Likewise.
	* posix/tst-spawn3.c (do_test): Likewise.
	* posix/wordexp-test.c (testit): Likewise.
	* posix/wordexp.c (parse_tilde): Likewise.
	(exec_comm): Likewise.
	* posix/wordexp.h (__WRDE_FLAGS): Likewise.
	* resource/vtimes.c (TIMEVAL_TO_VTIMES): Likewise.
	* setjmp/sigjmp.c (__sigjmp_save): Likewise.
	* stdio-common/printf_fp.c (__printf_fp_l): Likewise.
	* stdio-common/tst-fileno.c (do_test): Likewise.
	* stdio-common/vfprintf-internal.c (vfprintf): Likewise.
	* stdlib/strfmon_l.c (__vstrfmon_l_internal): Likewise.
	* stdlib/strtod_l.c (round_and_return): Likewise.
	(____STRTOF_INTERNAL): Likewise.
	* stdlib/tst-strfrom.h (TEST_STRFROM): Likewise.
	* string/strcspn.c (STRCSPN): Likewise.
	* string/test-memmem.c (simple_memmem): Likewise.
	* termios/tcsetattr.c (tcsetattr): Likewise.
	* time/alt_digit.c (_nl_parse_alt_digit): Likewise.
	* time/asctime.c (asctime_internal): Likewise.
	* time/strptime_l.c (__strptime_internal): Likewise.
	* time/sys/time.h (timercmp): Likewise.
	* time/tzfile.c (__tzfile_compute): Likewise.
2019-02-22 01:32:36 +00:00
Wilco Dijkstra
20d0195c71 Add missing bench-malloc-simple.c file.
2019-02-14 17:10:47 +00:00
Wilco Dijkstra
3904fd85d3 Add malloc micro benchmark
Add a malloc micro benchmark to enable accurate testing of the
various paths in malloc and free.  The benchmark does a varying
number of allocations of a given block size, then frees them again.

It tests 3 different scenarios: single-threaded using main arena,
multi-threaded using thread-arena, main arena with SINGLE_THREAD_P
false.

	* benchtests/Makefile: Add malloc-simple benchmark.
	* benchtests/bench-malloc-simple.c: New benchmark.
2019-02-14 16:37:11 +00:00
Siddhesh Poyarekar
24ca04febe benchtests: Remove useless ORIG_SRC in memmove benchmarks
The ORIG_SRC argument is likely a useless relic from the original
correctness tests that are not needed in the benchmarks.  Remove the
argument and use S1 to point to the source to avoid confusion.

        * benchtests/bench-memmove.c (do_one_test): Remove unused
        ORIG_SRC.
        (do_test): Adjust.
        * benchtests/bench-memmove-large.c (do_one_test): Remove unused
        ORIG_SRC.
        (do_test): Adjust.
2019-02-14 08:22:34 +05:30
Wilco Dijkstra
16f87cfd63 String benchtest cleanup
Continue cleanup of the string benchtests.  Replace simplistic
byte-oriented versions with faster generic implementations.
Remove bcopy/bzero benchmarks (bcopy/bzero are obsolete and never
emitted by compilers).  Remove builtin versions of memcpy, memset
and strlen.  Remove all remaining "stupid" implementations given
they are always slower than the "simple" variants and thus don't
add anything useful.

	* benchtests/bench-strcasecmp.c (stupid_strcasecmp): Remove.
	* benchtests/bench-strcasestr.c (stupid_strcasestr): Remove.
	* benchtests/bench-strchr.c (stupid_strchr): Remove.
	* benchtests/bench-strcmp.c (stupid_strcmp): Remove.
	* benchtests/bench-strcspn.c (stupid_strcspn): Remove.
	* benchtests/bench-strlen.c (builtin_strlen): Remove.
	* benchtests/bench-strncasecmp.c (stupid_strncasecmp): Remove.
	* benchtests/bench-strncmp.c (stupid_strncmp): Remove.
	* benchtests/bench-strpbrk.c (stupid_strpbrk): Remove.
	* benchtests/bench-strspn.c (stupid_strspn): Remove.
	* benchtests/Makefile: Remove bench-bcopy.c and bench-bzero.c.
	* benchtests/bench-bcopy.c: Delete file.
	* benchtests/bench-bzero.c: Likewise.
	* benchtests/bench-memccpy.c (stupid_memccpy): Remove.
	(simple_memccpy): Remove.
	(generic_memccpy): Add function.
	* benchtests/bench-memcpy.c: (builtin_memcpy): Remove.
	* benchtests/bench-memmove.c (simple_bcopy): Remove.
	* benchtests/bench-mempcpy.c (simple_mempcpy): Remove.
	(generic_mempcpy): Add new function.
	* benchtests/bench-memset.c (simple_bzero): Remove.
	(builtin_bzero): Remove.
	(builtin_memset): Remove.
	* benchtests/bench-rawmemchr.c (simple_rawmemchr): Remove.
	(generic_rawmemchr): Add new function.
2019-02-12 17:19:51 +00:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Wilco Dijkstra
5289f1f56b Improve bench-strlen
The current bench-strlen compares against a slow byte-oriented strlen which
is not useful given it's too easy to beat.  Remove it and compare against the
generic C strlen version and memchr.

	* benchtests/bench-strlen.c (generic_strlen): New function.
	(memchr_strlen): New function.
2018-12-27 14:56:23 +00:00
Wilco Dijkstra
90d3320d7f Refactor string benchtests
Refactor string benchtests by moving duplicated defines into
bench-string.h.

	* benchtests/bench-memchr.c: Cleanup defines.
	* benchtests/bench-memcmp.c: Likewise.
	* benchtests/bench-memset.c: Likewise.
	* benchtests/bench-memset-large.c: Likewise.
	* benchtests/bench-memset-walk.c: Likewise.
	* benchtests/bench-stpcpy.c: Likewise.
	* benchtests/bench-stpncpy.c: Likewise.
	* benchtests/bench-strcat.c: Likewise.
	* benchtests/bench-strchr.c: Likewise.
	* benchtests/bench-strcmp.c: Likewise.
	* benchtests/bench-strcpy.c: Likewise.
	* benchtests/bench-strcspn.c: Likewise.
	* benchtests/bench-string.h: Likewise.
	* benchtests/bench-strlen.c: Likewise.
	* benchtests/bench-strncat.c: Likewise.
	* benchtests/bench-strncmp.c: Likewise.
	* benchtests/bench-strncpy.c: Likewise.
	* benchtests/bench-strnlen.c: Likewise.
	* benchtests/bench-strpbrk.c: Likewise.
	* benchtests/bench-strrchr.c: Likewise.
	* benchtests/bench-strspn.c: Likewise.
2018-12-21 18:52:40 +00:00
Leonardo Sandoval
de099757b6 benchtests: send non-consumable data to stderr
Non-consumable data, i.e. data not related to benchmarks, should be sent to
standard error so that pipelines work as expected.
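
A minimal sketch of the idea, assuming a simplified input of
(name, value) pairs. The report function and its message text are
hypothetical and do not reproduce compare_bench.py.

```python
import sys


def report(results, out=sys.stdout, err=sys.stderr):
    """Write comparison data to out and diagnostics to err."""
    for name, value in results:
        if value is None:
            # Diagnostic only: keep it out of the data stream.
            print("warning: no timings for %s" % name, file=err)
            continue
        print("%s: %g" % (name, value), file=out)
```

With this split, piping the script's standard output into another tool
passes along only the benchmark data, while warnings still reach the
terminal.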

	* benchtests/scripts/compare_bench.py (do_compare): write to stderr in case
    stat is not present.
	* benchtests/scripts/compare_bench.py (plot_graphs): write to stderr in case
    timings field is not present. Also string showing the output filename goes
    into the stderr.
2018-12-12 11:05:22 -06:00
Leonardo Sandoval
1990185f5f benchtests: include --stats parameter
Allow the user to pick statistics from the command line, defaulting to min
and mean. If a stat does not exist, catch the run-time exception and keep
comparing the rest of the benchmarked functions. Likewise, catch
division-by-zero exceptions and keep comparing the rest of the functions,
making the script more fault tolerant and thus more useful.
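
The fault-tolerant comparison can be sketched as follows, assuming each
run is a plain dictionary mapping a statistic name to a number. The
compare function, its return value and its messages are hypothetical;
they illustrate the exception handling only and are not the code of
compare_bench.py.

```python
import sys


def compare(baseline, current, stats=("min", "mean")):
    """Return {stat: percent change}, skipping stats that cannot be compared."""
    diffs = {}
    for stat in stats:
        try:
            old, new = baseline[stat], current[stat]
            diffs[stat] = (new - old) * 100.0 / old
        except KeyError:
            # The requested statistic is missing from one of the runs.
            print("stat '%s' not present, skipping" % stat, file=sys.stderr)
        except ZeroDivisionError:
            # A zero baseline makes the percentage undefined.
            print("stat '%s' has a zero baseline, skipping" % stat,
                  file=sys.stderr)
    return diffs
```

A missing or zero-valued statistic then costs only that one comparison
instead of aborting the whole run.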

	* benchtests/scripts/compare_bench.py (do_compare): Catch KeyError and
    ZeroDivisionError exceptions.
	* benchtests/scripts/compare_bench.py (compare_runs): Use stats argument to
    loop through user provided statistics.
	* benchtests/scripts/compare_bench.py (main): Include the --stats argument.
2018-12-12 11:05:22 -06:00
Leonardo Sandoval
587426d499 benchtests: keep comparing even if function timings do not match
Allows other functions to be processed, making the script more fault
tolerant and thus more useful.

	* benchtests/scripts/compare_bench.py (compare_runs): Continue instead of return.
2018-12-12 11:05:22 -06:00
Joseph Myers
c6982f7efc Patch to require Python 3.4 or later to build glibc.
This patch makes Python 3.4 or later a required tool for building
glibc, so allowing changes of awk, perl etc. code used in the build
and test to Python code without any such changes needing makefile
conditionals or to handle older Python versions.

This patch makes the configure test for Python check the version and
give an error if Python is missing or too old, and removes makefile
conditionals that are no longer needed.  It does not itself convert
any code from another language to Python, and does not remove any
compatibility with older Python versions from existing scripts.

Tested for x86_64.

	* configure.ac (PYTHON_PROG): Use AC_CHECK_PROG_VER.  Set
	critic_missing for versions before 3.4.
	* configure: Regenerated.
	* manual/install.texi (Tools for Compilation): Document
	requirement for Python to build glibc.
	* INSTALL: Regenerated.
	* Rules [PYTHON]: Make code unconditional.
	* benchtests/Makefile [PYTHON]: Likewise.
	* conform/Makefile [PYTHON]: Likewise.
	* manual/Makefile [PYTHON]: Likewise.
	* math/Makefile [PYTHON]: Likewise.
2018-10-29 15:28:05 +00:00
H.J. Lu
7cc65773f0 x86: Support RDTSCP for benchtests
RDTSCP waits until all previous instructions have executed and all
previous loads are globally visible before reading the counter.  RDTSC
doesn't wait until all previous instructions have been executed before
reading the counter.  All x86 processors since 2010 support the RDTSCP
instruction.  This patch adds RDTSCP support to benchtests.

	* benchtests/Makefile (CPPFLAGS-nonlib): Add -DUSE_RDTSCP if
	USE_RDTSCP is defined.
	* sysdeps/x86/hp-timing.h (HP_TIMING_NOW): Use RDTSCP if
	USE_RDTSCP is defined.
2018-10-24 02:19:34 -07:00