Commit Graph

352 Commits

Author SHA1 Message Date
Wilco Dijkstra
5289f1f56b Improve bench-strlen
The current bench-strlen compares against a slow byte-oriented strlen which
is not useful given it's too easy to beat.  Remove it and compare against the
generic C strlen version and memchr.

	* benchtests/bench-strlen.c (generic_strlen): New function.
	(memchr_strlen): New function.
2018-12-27 14:56:23 +00:00
Wilco Dijkstra
90d3320d7f Refactor string benchtests
Refactor string benchtests by moving duplicated defines into
bench-string.h.

	* benchtests/bench-memchr.c: Cleanup defines.
	* benchtests/bench-memcmp.c: Likewise.
	* benchtests/bench-memset.c: Likewise.
	* benchtests/bench-memset-large.c: Likewise.
	* benchtests/bench-memset-walk.c: Likewise.
	* benchtests/bench-stpcpy.c: Likewise.
	* benchtests/bench-stpncpy.c: Likewise.
	* benchtests/bench-strcat.c: Likewise.
	* benchtests/bench-strchr.c: Likewise.
	* benchtests/bench-strcmp.c: Likewise.
	* benchtests/bench-strcpy.c: Likewise.
	* benchtests/bench-strcspn.c: Likewise.
	* benchtests/bench-string.h: Likewise.
	* benchtests/bench-strlen.c: Likewise.
	* benchtests/bench-strncat.c: Likewise.
	* benchtests/bench-strncmp.c: Likewise.
	* benchtests/bench-strncpy.c: Likewise.
	* benchtests/bench-strnlen.c: Likewise.
	* benchtests/bench-strpbrk.c: Likewise.
	* benchtests/bench-strrchr.c: Likewise.
	* benchtests/bench-strspn.c: Likewise.
2018-12-21 18:52:40 +00:00
Leonardo Sandoval
de099757b6 benchtests: send non-consumable data to stderr
Non-consumable data, alias data not related to benchmarks, should be sent to
the standard error, thus pipelines can work as expected.

	* benchtests/scripts/compare_bench.py (do_compare): write to stderr in case
    stat is not present.
	* benchtests/scripts/compare_bench.py (plot_graphs): write to stderr in case
    timings field is not present. Also string showing the output filename goes
    into the stderr.
2018-12-12 11:05:22 -06:00
Leonardo Sandoval
1990185f5f benchtests: include --stats parameter
Allows user to pick a statistic, defaulting to min and mean, from command
line. At the same time, if stat does not exit, catch the run-time exception
and keep comparing the rest of benchmarked functions. Finally, take care of
division-by-zero exceptions and as the latter, keep comparing the rest of the
functions, turning the script a bit more fault tolerant thus useful.

	* benchtests/scripts/compare_bench.py (do_compare): Catch KeyError and
    ZeroDivisorError exceptions.
	* benchtests/scripts/compare_bench.py (compare_runs): Use stats argument to
    loop through user provided statistics.
	* benchtests/scripts/compare_bench.py (main): Include the --stats argument.
2018-12-12 11:05:22 -06:00
Leonardo Sandoval
587426d499 benchtests: keep comparing even if function timings do not match
Allows other functions to be processed, making the script a bit more fault
tolerant thus useful.

	* benchtests/scripts/compare_bench.py (compare_runs): Continue instead of return.
2018-12-12 11:05:22 -06:00
Joseph Myers
c6982f7efc Patch to require Python 3.4 or later to build glibc.
This patch makes Python 3.4 or later a required tool for building
glibc, so allowing changes of awk, perl etc. code used in the build
and test to Python code without any such changes needing makefile
conditionals or to handle older Python versions.

This patch makes the configure test for Python check the version and
give an error if Python is missing or too old, and removes makefile
conditionals that are no longer needed.  It does not itself convert
any code from another language to Python, and does not remove any
compatibility with older Python versions from existing scripts.

Tested for x86_64.

	* configure.ac (PYTHON_PROG): Use AC_CHECK_PROG_VER.  Set
	critic_missing for versions before 3.4.
	* configure: Regenerated.
	* manual/install.texi (Tools for Compilation): Document
	requirement for Python to build glibc.
	* INSTALL: Regenerated.
	* Rules [PYTHON]: Make code unconditional.
	* benchtests/Makefile [PYTHON]: Likewise.
	* conform/Makefile [PYTHON]: Likewise.
	* manual/Makefile [PYTHON]: Likewise.
	* math/Makefile [PYTHON]: Likewise.
2018-10-29 15:28:05 +00:00
H.J. Lu
7cc65773f0 x86: Support RDTSCP for benchtests
RDTSCP waits until all previous instructions have executed and all
previous loads are globally visible before reading the counter.  RDTSC
doesn't wait until all previous instructions have been executed before
reading the counter.  All x86 processors since 2010 support RDTSCP
instruction.  This patch adds RDTSCP support to benchtests.

	* benchtests/Makefile (CPPFLAGS-nonlib): Add -DUSE_RDTSCP if
	USE_RDTSCP is defined.
	* sysdeps/x86/hp-timing.h (HP_TIMING_NOW): Use RDTSCP if
	USE_RDTSCP is defined.
2018-10-24 02:19:34 -07:00
Andreas Schwab
ce5a7de6cd Don't reduce test timeout to less than default
This removes all overrides of TIMEOUT that are less than or equal to the
default timeout.
2018-10-17 09:34:13 +02:00
Leonardo Sandoval
c892ae04f4 benchtests: Set float type on --threshold argument
Otherwise, we see the following runtime error when using the parameter:

  File "./glibc/benchtests/scripts/compare_bench.py", line 46, in do_compare
    if d > threshold:
TypeError: '>' not supported between instances of 'float' and 'str'

	* benchtests/scripts/compare_bench.py (main): set float type on
	threshold argument.
2018-10-08 09:11:30 -05:00
Siddhesh Poyarekar
34f86d6168 Reallocate buffers for every run in strlen
Try and avoid influencing performance of neighbouring functions.
2018-08-16 14:11:57 +05:30
Siddhesh Poyarekar
953a5a4a59 Print strlen benchmark output in json
Allow reading the benchmark using the compare_strings.py script.
2018-08-16 14:11:56 +05:30
Siddhesh Poyarekar
8cac1f2635 [benchtests] Add workload test properties to schema
Add the workload test properties (max-throughput, latency, etc.) to
the schema to prevent benchmark output validation from failing.

	* benchtests/scripts/benchout.schema.json (properties): Add
	new properties.
2018-08-11 18:55:09 +05:30
Siddhesh Poyarekar
44727aec4f [benchtests] Add mandatory attributes to workload tests
Add the duration and iterations attributes to the workloads tests to
make the json schema parser happy

	* benchtests/bench-skeleton.c (main): Add duration and
	iterations attributes.
2018-08-11 18:55:07 +05:30
Siddhesh Poyarekar
014efdd7ea benchtests: Clean up the alloc_bufs
Drop realloc_bufs in favour of making alloc_bufs transparently
reallocate the buffers if it had allocated before.  Also consolidate
computation of buffer lengths so that they don't get repeated on every
reallocation.

	* benchtests/bench-string.h (buf1_size, buf2_size): New
	variables.
	(init_sizes): New function.
	(test_init): Use it.
	(alloc_buf, exit_error): New functions.
	(alloc_bufs): Use ALLOC_BUF.
	(realloc_bufs): Remove.
	* benchtests/bench-memcmp.c (do_test): Adjust.
	* benchtests/bench-memset-large.c (do_test): Likewise.
	* benchtests/bench-memset-walk.c (do_test): Likewise.
	* benchtests/bench-memset.c (do_test): Likewise.
	* benchtests/bench-strncmp.c (do_test): Likewise.
2018-08-08 00:44:56 +05:30
Siddhesh Poyarekar
d67d634bef [benchtests] Fix compare_strings.py for python2
Python 2 does not have a FileNotFoundError so drop it in favour of
simply printing out the last (and most informative) line of the
exception.

	* benchtests/scripts/compare_strings.py: Import traceback.
	(parse_file): Pretty-print error.
2018-08-03 00:26:45 +05:30
Leonardo Sandoval
1cf4ae7fe6 benchtests: improve argument parsing through argparse library
The argparse library is used on compare_bench script to improve command line
argument parsing. The 'schema validation file' is now optional, reducing by
one the number of required parameters.

	* benchtests/scripts/compare_bench.py (__main__): use the argparse
	library to improve command line parsing.
	(__main__): make schema file as optional parameter (--schema),
	defaulting to benchtests/scripts/benchout.schema.json.
	(main): move out of the parsing stuff to __main_  and leave it
	only as caller of main comparison functions.
2018-07-19 14:53:37 -05:00
Wilco Dijkstra
3ae725dfb6 Improve strstr performance
Improve strstr performance.  Strstr tends to be slow because it uses
many calls to memchr and a slow byte loop to scan for the next match.
Performance is significantly improved by using strnlen on larger blocks
and using strchr to search for the next matching character.  strcasestr
can also use strnlen to scan ahead, and memmem can use memchr to check
for the next match.

On the GLIBC bench tests the performance gains on Cortex-A72 are:
strstr: +25%
strcasestr: +4.3%
memmem: +18%

On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.

    Reviewd-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2018-07-16 17:51:52 +01:00
H.J. Lu
cb8f6affed benchtests: Add -f/--functions argument
On x86-64, there may be multiple IFUNC implementations for a given
function.  But we may be only interested in a subset of them.  This
patch adds -f/--functions argument to compare a subset of IFUNC
implementations.

	* benchtests/scripts/compare_strings.py (process_results): Add
	funcs argument.  Compare only functions which are selected.
	(main): Check if base function is among selected functions.
	Pass selected functions to process_results.
	(__main__): Add -f/--functions argument.
2018-06-12 09:10:42 -07:00
Leonardo Sandoval
a650b05ebe benchtests: Catch exceptions in input arguments
Catch runtime exceptions in case the user provided: wrong base
function, attribute(s) or input file. In any of the latter, quit
immediately with non-zero return code.

	* benchtests/scripts/compare_string.py: (process_results) Catch
	exception in non-existent base_func and catch exception in
	non-existent attribute.
	(parse_file) Catch exception in non-existent input file.
2018-06-01 16:32:43 -05:00
Leonardo Sandoval
195abbf4cd benchtests: Add --no-diff and --no-header options
Having a string comparison report with neither diff numbers nor header
yields a more useful output to be consumed by other tools.

	* benchtests/scripts/compare_string.py: Add --no-diff and --no-header
	options to avoid diff calculation and omit header, respectively.
	(main): process --no-diff and --no-header
2018-06-01 16:32:43 -05:00
Siddhesh Poyarekar
543477f78b benchtests: Move iterator declaration into loop header
This is a minor style change to move the definition of I to its usage
scope instead of at the top of the function.  This is consistent with
glibc style guidelines and more importantly it was getting in the way
of my testing.

	* benchtests/bench-memcpy-walk.c (do_test): Move declaration
	of I into loop header.
	* benchtests/bench-memmove-walk.c (do_test): Likewise.
2018-05-07 20:54:31 +05:30
Wilco Dijkstra
fbce6f7260 Undefine attribute_hidden to fix benchtests
Add an undefine of attribute_hidden since it may be defined in some cases
(it must be defined since it is used by some hp-timing configurations).

	* benchtests/bench-timing.h (attribute_hidden): Undefine.
2018-03-19 10:53:14 +00:00
Wilco Dijkstra
f1c8185d34 Use correct includes in benchtests
Currently the benchtests are run with internal GLIBC headers, which is incorrect.
Defining _ISOMAC in the makefile ensures the internal headers are bypassed.
Fix all tests which were relying on internal defines or includes.

	* benchtests/Makefile: Define _ISOMAC.
	* benchtests/bench-strcoll.c: Add missing sys/stat.h include.
	* benchtests/bench-string.h: Define inhibit_loop_to_libcall macro.
	* benchtests/bench-strstr.c: Define empty libc_hidden_builtin_def.
	* benchtests/bench-strtok.c (oldstrtok): Use rawmemchr.
	* benchtests/bench-timing.h: Define attribute_hidden.
2018-03-15 15:44:58 +00:00
Siddhesh Poyarekar
0963ea8e8c benchtests: Don't benchmark 0 length calls for strncmp
The 0 length strncmp is interesting for correctness but not for
performance.

	* benchtests/bench-strncmp.c (test_main): Remove 0 length tests.
	(do_test_limit): Likewise.
2018-03-06 18:29:57 +05:30
Siddhesh Poyarekar
7bb3a8a556 benchtests: Reallocate buffers for every strncmp implementation
Don't reuse buffers for different strncmp implementations since the
earlier implementation will end up warming the cache for the later
one.  Eventually there should be a more elegant way to do this.

	* benchtests/bench-strncmp.c (do_test_limit): Reallocate buffers
	for every implementation.
	(do_test): Likewise.
2018-03-06 18:29:52 +05:30
Siddhesh Poyarekar
ad4e816e06 benchtests: Convert strncmp benchmark output to json
Make the output usable through the compare_strings.py script.

	* benchtests/bench-strncmp.c: Convert output to json.
2018-03-06 18:29:34 +05:30
Wilco Dijkstra
c3d466cba1 Remove slow paths from pow
Remove the slow paths from pow.  Like several other double precision math
functions, pow is exactly rounded.  This is not required from math functions
and causes major overheads as it requires multiple fallbacks using higher
precision arithmetic if a result is close to 0.5ULP.  Ridiculous slowdowns
of up to 100000x have been reported when the highest precision path triggers.

All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1).
The worst case error is ~0.506ULP.  A simple test over a few hundred million
values shows pow is 10% faster on average.  This fixes BZ #13932.

	[BZ #13932]
	* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
	* benchtests/pow-inputs: Update comment for slow path cases.
	* manual/probes.texi (slowpow_p10): Delete removed probe.
	(slowpow_p10): Likewise.
	* math/Makefile: Remove halfulp.c and slowpow.c.
	* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/generic/math_private.h (__exp1): Remove error argument.
	(__halfulp): Remove.
	(__slowpow): Remove.
	* sysdeps/i386/fpu/halfulp.c: Delete file.
	* sysdeps/i386/fpu/slowpow.c: Likewise.
	* sysdeps/ia64/fpu/halfulp.c: Likewise.
	* sysdeps/ia64/fpu/slowpow.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
	improve comments and add error analysis.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
	(power1): Remove function:
	(log1): Remove error argument, add error analysis.
	(my_log2): Remove function.
	* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
	* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
	* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
	* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
	slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
2018-02-12 10:47:09 +00:00
Siddhesh Poyarekar
96e6a7167e benchtests: Make bench-memcmp print json
The benchamrk result can now be studied using the compare_strings.py
script.

	* benchtests/bench-memcmp.c: Print json instead of plain text.
2018-02-02 09:56:47 +05:30
Siddhesh Poyarekar
3dfcbfa1a4 benchtests: Reallocate buffers for every test run
Keeping the buffers the same across test runs gives later invocations
the advantage since they access cached data.  Reallocate so that all
test runs are on equal grounds.

	* benchtests/bench-memcmp.c (do_test): Call realloc_buf for
	every test run.
2018-02-02 09:55:45 +05:30
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Siddhesh Poyarekar
5f1603c331 Convert strcmp benchmark output to json format
The format is now parseable with the compare_strings.py script.
2017-12-15 02:31:41 +05:30
Victor Rodriguez
0422ed1e84 benchtests: Enable BENCHSET to run subset of tests
This patch adds BENCHSET variable to benchtests/Makefile in order to
provide the capability to run a list of subsets of benchmark tests, ie;

    make bench BENCHSET="bench-pthread bench-math malloc-thread"

This helps users to benchmark specific glibc area

ChangeLog:

        * benchtests/Makefile:Add BENCHSET to allow subsets of
        benchmarks to be run.
        * benchtests/README: Add documentation for: Running subsets of
        benchmarks.

Signed-off-by: Victor Rodriguez <victor.rodriguez.bahena@intel.com>
Signed-off-by: Icarus Sparry <icarus.w.sparry@intel.com>
Reviewed-By: Siddhesh Poyarekar <siddhesh@sourceware.org>
2017-11-28 19:57:46 +05:30
Victor Rodriguez
d5090db30e benchtests: Expand range of tests names in schema.json
When executing bench-math the benchmark output is invalid with this
error msg:

    Invalid benchmark output: 'workload-spec2006.wrf' does not match any of
    the regexes: '^[_a-zA-Z0-9]*$¹ or Invalid benchmark output: Additional
    properties are not allowed ('workload-spec2006.wrf' was unexpected)

The error was seen when running the test:
workload-spec2006.wrf, 'stack=1024,guard=1' and 'stack=1024,guard=2'.
The problem is that the current regex's do not accept the hyphen, dot, equal
and comma in the output.

This patch changes the regex in benchout.schema.json to accept symbols in
benchmark tests names.

ChangeLog:

        * benchtests/scripts/benchout.schema.json: Fix regex to accept a
        wider range of tests names.

Signed-off-by: Victor Rodriguez <victor.rodriguez.bahena@intel.com>
Reviewed-By: Siddhesh Poyarekar <siddhesh@sourceware.org>
2017-11-28 19:52:57 +05:30
Victor Rodriguez
0595e36034 benchtests: Adjust valid and accepted properties
Benchmark workload-spec2006.wrf does not produce max, min or mean
results but instead produce throughput. This is represented in
benchtests/bench-skeleton.c. This patch adjust benchout.schema.json to consider
bench.out from bench-math benchmarks as valid

ChangeLog:

	* benchtests/scripts/benchout.schema.json: Add throughput as accepted
	result from property and remove "max", min" and "mean" from required
	properties based on benchtests/bench-skeleton.c.

Signed-off-by: Victor Rodriguez <victor.rodriguez.bahena@intel.com>
Reviewed-By: Siddhesh Poyarekar <siddhesh@sourceware.org>
2017-11-28 19:49:59 +05:30
Siddhesh Poyarekar
eb332f9feb benchtests: Bump start size since smaller sizes are noisy
Numbers for very small sizes (< 128B) are much noisier for non-cached
benchmarks like the walk benchmarks, so don't include them.

	* benchtests/bench-memcpy-walk.c (START_SIZE): Set to 128.
	* benchtests/bench-memmove-walk.c (START_SIZE): Likewise.
	* benchtests/bench-memset-walk.c (START_SIZE): Likewise.
2017-11-20 18:03:32 +05:30
Siddhesh Poyarekar
4d7632ff68 benchtests: Fix walking sizes and directions for *-walk benchmarks
Make the walking benchmarks walk only backwards since copying both
ways is biased in favour of implementations that use non-temporal
stores for larger sizes; falkor is one of them.  This also fixes up
bugs in computation of the result which ended up multiplying the
length with the timing result unnecessarily.

	* benchtests/bench-memcpy-walk.c (do_one_test): Copy only
	backwards.  Fix timing computation.
	* benchtests/bench-memmove-walk.c (do_one_test): Likewise.
	* benchtests/bench-memset-walk.c (do_one_test): Walk backwards
	on memset by N at a time.  Fix timing computation.
2017-11-20 18:03:32 +05:30
Rajalakshmi Srinivasaraghavan
077ee12978 Benchtests for sinf, cosf and sincosf
Numbers used from cos and sin inputs.
2017-10-13 14:19:45 +05:30
Siddhesh Poyarekar
5bfb04042d benchtests: Memory walking benchmark for memmove
This benchmark is an attempt to eliminate cache effects from string
benchmarks.  The benchmark walks both ways through a large memory area
and copies different sizes of memory and alignments one at a time
instead of looping around in the same memory area.  This is a good
metric to have alongside the simple memmove benchmark (which is only
really useful for smaller sizes) especially for larger sizes where the
likelihood of the call being done only once is pretty high.

This benchmark is different from memcpy in that it also tests
overlapping copies.

	* benchtests/bench-memmove-walk.c: New file.
	* benchtests/Makefile (string-benchset): Add it.
2017-10-05 22:20:23 +05:30
Siddhesh Poyarekar
36bb8edf51 benchtests: Memory walking benchmark for memset
This benchmark is an attempt to eliminate cache effects from string
benchmarks.  The benchmark walks backward through a large memory area
and sets different sizes of memory and alignments one at a time
instead of looping around in the same memory area.  This is a good
metric to have alongside the simple memset benchmark (which is only
really useful for smaller sizes) especially for larger sizes where the
likelihood of the call being done only once is pretty high.

	* benchtests/bench-memset-walk.c: New file.
	* benchtests/Makefile (string-benchset): Add it.
2017-10-05 22:20:23 +05:30
Siddhesh Poyarekar
9ec87fd2b1 benchtests: Memory walking benchmark for memcpy
This benchmark is an attempt to eliminate cache effects from string
benchmarks.  The benchmark walks both ways through a large memory area
and copies different sizes of memory and alignments one at a time
instead of looping around in the same memory area.  This is a good
metric to have alongside the other memcpy benchmarks, especially for
larger sizes where the likelihood of the call being done only once is
pretty high.

	* benchtests/bench-memcpy-walk.c: New file.
	* benchtests/Makefile (string-benchset): Add it.
2017-10-05 22:20:23 +05:30
Szabolcs Nagy
0525ce4850 Add exp2f and log2f benchmark trace
exp2f and log2f benchmark traces are just copies of the existing
expf and logf traces from wrf_r.

	* benchtests/Makefile: Add exp2f and log2f benchmarks.
	* benchtests/exp2f-inputs: Copy of expf-inputs.
	* benchtests/log2f-inputs: Copy of logf-inputs.
2017-09-20 10:04:12 +01:00
Wilco Dijkstra
a5dcc87e77 Add logf trace
Add a trace for logf.  This is a reduced trace based on 2.8 billion
samples extracted from wrf_r.

	* benchtests/Makefile: Add logf benchmark.
	* benchtests/logf-inputs: Add reduced trace from wrf_r.
2017-09-19 15:14:46 +01:00
Wilco Dijkstra
7024d5446d Add expf trace
Add a trace for expf.  This is a reduced trace based on 2.4 billion
samples extracted from wrf_r.

	* benchtests/Makefile: Add expf benchmark.
	* benchtests/expf-inputs: Add reduced trace from wrf_r.
2017-09-19 15:14:18 +01:00
Joseph Myers
eb375def3d Add benchtests for trunc and truncf.
This patch adds benchtests for the trunc and truncf functions.  The
inputs listed are fairly arbitrary; I do not assert they are
representative of any particular application.

	* benchtests/Makefile (bench-math): Add trunc and truncf.
	(CFLAGS-bench-trunc.c): New variable.
	(CFLAGS-bench-truncf.c): Likewise.
	* benchtests/trunc-inputs: New file.
	* benchtests/truncf-inputs: Likewise.
2017-09-19 12:59:01 +00:00
Siddhesh Poyarekar
140647ea6f benchtests: New -g option to generate graphs in compare_strings.py
The compare_strings.py option unconditionally generates a graph PNG
image of the input data, which can be unnecessary and slow.  Put this
behind an optional flag -g.

	* benchtests/scripts/compare_strings.py: New option -g.
	(draw_graph): Print a message that a graph is being generated.
	(process_results): Generate graph only if -g is passed.
	(main): Process option -g.
2017-09-16 15:24:00 +05:30
Siddhesh Poyarekar
5a6547b7b9 benchtests: Make compare_strings.py output a bit prettier
Make the column widths for the outputs fixed so that they look a
little less messy.  They will still look bad with lots of IFUNCs (like
on x86) but it's still a step forward.

	* benchtests/scripts/compare_strings.py (process_results):
	Better spacing for output.
2017-09-16 15:23:12 +05:30
Siddhesh Poyarekar
06b1de2378 benchtests: Use argparse to parse arguments
Make the script more usable by adding proper command line options
along with a way to query the options.  The script is capable of doing
a bunch of things right now like choosing a base for comparison,
choosing to generate graphs, etc. and they should be accessible via
command line switches.

	* benchtests/scripts/compare_strings.py: Use argparse.
	* benchtests/README: Document existence of compare_strings.py.
2017-09-16 11:47:32 +05:30
Siddhesh Poyarekar
503c92c37a benchtests: Reallocate buffers for memset
Keeping the same buffers along with copying the same size of data into
the same location means that the first routine is typically the
slowest since it has to bear the cost of fetching data into to cache.
Reallocating buffers stabilizes numbers by a bit.

	* benchtests/bench-string.h (realloc_bufs): New function.
	(test_init): Call it.
	* benchtests/bench-memset-large.c (do_test): Likewise.
	* benchtests/bench-memset.c (do_test): Likewise.
2017-09-14 22:54:24 +05:30
Siddhesh Poyarekar
29c933fb35 benchtests: Make memset benchmarks print json
Make the memset benchmarks (bench-memset and bench-memset-large) print
their output in JSON so that they can be evaluated using the
compare_strings.py script.

	* benchtests/bench-memset-large.c: Print output in JSON
	format.
	* benchtests/bench-memset.c: Likewise.
2017-09-14 22:54:23 +05:30
Mike FABIAN
799c8d6905 Add new codepage charmaps/IBM858 [BZ #21084]
This code page is identical to code page 850 except that X'D5'
has been changed from LI61 (dotless i) to SC20 (euro symbol).

The code points from /x01 to /x1f in the /localedata/charmaps/IBM858
file have the same mapping as those in localedata/charmaps/ANSI_X3.4-1968.
That means they disagree with with

ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00858.txt

in that range.
For example, localedata/charmaps/IBM858 and localedata/charmaps/ANSI_X3.4-1968 have:

   “<U0001>     /x01         START OF HEADING (SOH)”

whereas CP00858.txt has:

   “01 SS000000        Smiling Face”

That means that CP00858.txt is not really ASCII-compatible and to make
it ASCII-compatible we deviate fro CP00858.txt in the code points from /x01
to /x1f.

	[BZ #21084]
	* benchtests/strcoll-inputs/filelist#en_US.UTF-8: Add IBM858 and ibm858.c.
	* iconvdata/Makefile: Add IBM858.
	* iconvdata/gconv-modules: Add IBM858.
	* iconvdata/ibm858.c: New file.
	* iconvdata/tst-tables.sh: Add IBM858
	* localedata/charmaps/IBM858: New file.
2017-09-14 15:50:57 +02:00
Florian Weimer
4504783c0f benchtests: Do not compile benchmark objects as libc modules [BZ #21864]
Otherwise, this will lead to link failures due to hidden symbol
references.
2017-08-21 19:28:54 +02:00
Wilco Dijkstra
d4505b895f Add math benchmark latency test
This patch further improves math function benchmarking by adding a latency
test in addition to throughput.  This enables more accurate comparisons of the
math functions. The latency test works by creating a dependency on the previous
iteration: func_res = F (func_res * zero + input[i]). The multiply by zero
avoids changing the input.

It reports reciprocal throughput and latency in nanoseconds (depending on the
timing header used) and max/min throughput in iterations per second:

   "workload-spec2006.wrf": {
    "reciprocal-throughput": 100,
    "latency": 200,
    "max-throughput": 1.0e+07,
    "min-throughput": 5.0e+06
   }

	* benchtests/bench-skeleton.c (main): Add support for
	latency benchmarking.
	* benchtests/scripts/bench.py: Add support for latency benchmarking.
2017-08-17 16:27:20 +01:00
Siddhesh Poyarekar
86c6519ee7 benchtests: Print json in memmove benchmark
Make the memmove benchmarks (bench-memmove and bench-memmove-large)
print their output in JSON so that they can be evaluated using the
compare_strings.py script.

	* benchtests/bench-memmove-large.c: Print output in JSON
	format.
	* benchtests/bench-memmove.c: Likewise.
2017-08-11 12:19:27 +05:30
Siddhesh Poyarekar
61c982910d benchtests: Remove verification runs from benchmark tests
The test run is unnecessary and interferes with the benchmark.  The
tests are done during make check, so they're unnecessary here.

	* benchtests/bench-memccpy.c (do_one_test): Remove checks.
	* benchtests/bench-memchr.c (do_one_test): Likewise.
	* benchtests/bench-memcpy-large.c (do_one_test): Likewise.
	* benchtests/bench-memcpy.c (do_one_test): Likewise.
	* benchtests/bench-memmove-large.c (do_one_test): Likewise.
	* benchtests/bench-memmove.c (do_one_test): Likewise.
	* benchtests/bench-memset-large.c (do_one_test): Likewise.
	* benchtests/bench-memset.c (do_one_test): Likewise.
	* benchtests/bench-string.h (test_init): Remove memsets.
2017-08-11 12:19:26 +05:30
Siddhesh Poyarekar
dd3e86ad7c benchtests: Avoid a display error when running in text terminal
The compare_strings.py script generates a graph for the benchmarks it
performs a comparison on and that fails if X is not available.  Avoid
the error and ensure that only the graph is generated and saved as a
PNG file.

	* benchtests/scripts/compare_strings.py: Avoid display error
	when generating graph.
2017-08-08 00:56:10 +05:30
Siddhesh Poyarekar
b115e819af benchtests: Allow selecting baseline for compare_string.py
This patch allows one to provide the function name using an optional
-base option to compare all other functions against.  This is useful
when pitching one implementation of a string function against
alternatives.  In the absence of this option, comparisons are done
against the first ifunc in the list.

	* benchtests/scripts/compare_strings.py (main): Add an
	optional -base option.
	(process_results): New argument base_func.
2017-08-08 00:55:12 +05:30
Siddhesh Poyarekar
7ee38e6040 benchtests: Use TEST_NAME instead of hardcoding memcpy
The hardcoded 'memcpy' name turns up in other derived tests like
mempcpy.

       * benchtests/bench-memcpy.c (test_main): Use TEST_NAME instead of
       hardcoding memcpy.
       * benchtests/bench-memcpy-large.c (test_name): Likewise.
       * benchtests/bench-memcpy-random.c (test_name): Likewise.
2017-08-08 00:44:00 +05:30
Siddhesh Poyarekar
25d5247277 benchtests: New script to parse memcpy results
Read the memcpy results in json and print out the results in tabular
form, in addition to generating a graph of the results to compare all
of the implementations.

The format of the output is extensible enough to allow this kind of
analysis to be done on other string functions as well.

	* benchtests/scripts/benchout_strings.schema.json: New file.
	* benchtests/scripts/compare_strings.py: New file.
2017-06-22 23:44:51 +05:30
Siddhesh Poyarekar
5ee1e3cebc benchtests: Make memcpy benchmarks print results in json
Print the benchmark output for various memcpy benchmarks in json so
that it can be predictably parsed and analyzed.

	* benchtests/bench-memcpy-large.c: Include json-lib.h.
	(do_one_test): Print json.
	(do_test): Likewise.
	(test_main): Likewise.
	* benchtests/bench-memcpy-random.c: Include json-lib.h.
	(do_one_test): Print json.
	(do_test): Likewise.
	(test_main): Likewise.
	* benchtests/bench-memcpy.c: Include json-lib.h.
	(do_one_test): Print json.
	(do_test): Likewise.
	(test_main): Likewise.
2017-06-22 23:44:19 +05:30
Siddhesh Poyarekar
738a9914a0 benchtests: Print string array elements, int and uint in json
Enhance the json module in benchtests to print signed and unsigned
integers and string array elements.

	* benchtests/json-lib.h: Include inttypes.h.
	(json_attr_int, json_attr_int, json_element_string,
	json_element_int, json_element_uint): New functions.
	* benchtests/json-lib.c: (json_attr_int, json_attr_int,
	json_element_string, json_element_int, json_element_uint): New
	functions.
2017-06-22 23:44:12 +05:30
Wilco Dijkstra
18b759355d Add powf trace
Add a workload for powf.  This is a reduced trace based on 2.3 billion
samples extracted from wrf.  The distribution of values, in particular
frequency of commonly used operands is the same as in the full trace.

    * benchtests/powf-inputs: Add reduced trace from wrf.
2017-06-20 16:50:37 +01:00
Wilco Dijkstra
beb52f502f Improve math benchmark infrastructure
Improve support for math function benchmarking.  This patch adds
a feature that allows accurate benchmarking of traces extracted
from real workloads.  This is done by iterating over all samples
rather than repeating each sample many times (which completely
ignores branch prediction and cache effects).  A trace can be
added to existing math function inputs via
"## name: workload-<name>", followed by the trace.

        * benchtests/README: Describe workload feature.
        * benchtests/bench-skeleton.c (main): Add support for
        benchmarking traces from workloads.
2017-06-20 16:26:26 +01:00
Paul Clarke
4cedcaea8d Add powf bench tests
Add powf() bench test with input which covers these cases:
- positive base to positive exponent
- exponent 0
- negative base to even exponent
- exponent 1
- exponent -1
- squared
- squareroot
- 1 to negative exponent
- -1 to negative exponent
- base 0
- -1 to even exponent
- small base
- small exponent

	* benchtests/Makefile (bench-math): Add powf.
	* benchtests/powf-inputs: New file.
2017-06-20 10:14:42 -03:00
Adhemerval Zanella
0edbf12301 nptl: Invert the mmap/mprotect logic on allocated stacks (BZ#18988)
Current allocate_stack logic for create stacks is to first mmap all
the required memory with the desirable memory and then mprotect the
guard area with PROT_NONE if required.  Although it works as expected,
it pessimizes the allocation because it requires the kernel to actually
increase commit charge (it counts against the available physical/swap
memory available for the system).

The only issue is to actually check this change since side-effects are
really Linux specific and to actually account them it would require a
kernel specific tests to parse the system wide information.  On the kernel
I checked /proc/self/statm does not show any meaningful difference for
vmm and/or rss before and after thread creation.  I could only see
really meaningful information checking on system wide /proc/meminfo
between thread creation: MemFree, MemAvailable, and Committed_AS shows
large difference without the patch.  I think trying to use these
kind of information on a testcase is fragile.

The BZ#18988 reports shows that the commit pages are easily seen with
mlockall (MCL_FUTURE) (with lock all pages that become mapped in the
process) however a more straighfoward testcase shows that pthread_create
could be faster using this patch:

--
static const int inner_count = 256;
static const int outer_count = 128;

static
void *thread1(void *arg)
{
  return NULL;
}

static
void *sleeper(void *arg)
{
  pthread_t ts[inner_count];
  for (int i = 0; i < inner_count; i++)
    pthread_create (&ts[i], &a, thread1, NULL);
  for (int i = 0; i < inner_count; i++)
    pthread_join (ts[i], NULL);

  return NULL;
}

int main(void)
{
  pthread_attr_init(&a);
  pthread_attr_setguardsize(&a, 1<<20);
  pthread_attr_setstacksize(&a, 1134592);

  pthread_t ts[outer_count];
  for (int i = 0; i < outer_count; i++)
    pthread_create(&ts[i], &a, sleeper, NULL);
  for (int i = 0; i < outer_count; i++)
    pthread_join(ts[i], NULL);
    assert(r == 0);
  }
  return 0;
}

--

On x86_64 (4.4.0-45-generic, gcc 5.4.0) running the small benchtests
I see:

$ time ./test

real	0m3.647s
user	0m0.080s
sys	0m11.836s

While with the patch I see:

$ time ./test

real	0m0.696s
user	0m0.040s
sys	0m1.152s

So I added a pthread_create benchtest (thread_create) which check
the thread creation latency.  As for the simple benchtests, I saw
improvements in thread creation on all architectures I tested the
change.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu.

	[BZ #18988]
	* benchtests/thread_create-inputs: New file.
	* benchtests/thread_create-source.c: Likewise.
	* support/xpthread_attr_setguardsize.c: Likewise.
	* support/Makefile (libsupport-routines): Add
	xpthread_attr_setguardsize object.
	* support/xthread.h: Add xpthread_attr_setguardsize prototype.
	* benchtests/Makefile (bench-pthread): Add thread_create.
	* nptl/allocatestack.c (allocate_stack): Call mmap with PROT_NONE and
	then mprotect the required area.
2017-06-14 17:22:35 -03:00
H.J. Lu
6b69f98dcd benchtests: Add more tests for memrchr
bench-memchr.c is shared with bench-memrchr.c.  This patch adds some
tests for positions close to the beginning for memrchr, which are
equivalent to positions close to the end for memchr.

	* benchtests/bench-memchr.c (do_test): Print out both length
	and position.
	(test_main): Also test the position close to the beginning for
	memrchr.
2017-06-04 09:45:09 -07:00
Zack Weinberg
2bfdaeddaa Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk.
cppflags-iterator.mk no longer has anything to do with CPPFLAGS; all
it does is set libof-$(foo) for a list of files.  extra-modules.mk
does the same thing, but with a different input variable, and doesn't
let the caller control the module.  Therefore, this patch gives
cppflags-iterator.mk a better name, removes extra-modules.mk, and
updates all uses of both.

	* extra-modules.mk: Delete file.
	* cppflags-iterator.mk: Rename to ...
	* libof-iterator.mk: ...this.  Adjust comments.

	* Makerules, extra-lib.mk, benchtests/Makefile, elf/Makefile
	* elf/rtld-Rules, iconv/Makefile, locale/Makefile, malloc/Makefile
	* nscd/Makefile, sunrpc/Makefile, sysdeps/s390/Makefile:
	Use libof-iterator.mk instead of cppflags-iterator.mk or
	extra-modules.mk.

	* benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove
	extra-modules.mk and cppflags-iterator.mk, add libof-iterator.mk.
2017-05-09 07:06:29 -04:00
Steve Ellcey
29d92a8eda Change TEST_NAME to memcpy to fix IFUNC testing of multiple versions.
* benchtests/bench-memcpy-random.c (TEST_NAME): Change to memcpy.
	(IMPL) Call with 1 instead of 0 as argument.
2017-03-28 09:07:03 -07:00
Siddhesh Poyarekar
d01cbb6e8e Actually add bench-memcpy-random
git-add and commit the benchmark that Wilco posted on the list.
2017-03-26 19:01:50 +05:30
Wilco Dijkstra
8d2030d659 Add a new randomized memcpy test for copies up to 256 bytes. The distribution
of the size and alignment is based on a trace of SPEC2006.  Instead of
repeating the same copy over and over again like the existing tests, it times
several thousand different copies to more accurately estimate the overhead of
branch prediction.

	* benchtests/Makefile (string-benchset): Add memcpy-random.
	* benchtests/bench-memcpy-random.c: New file.
2017-03-23 19:00:02 +00:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Siddhesh Poyarekar
8ce8299f94 Add configure check for python program
Add a configure check that looks for python3 and python in that order
since we had agreed in the past to prefer python3 over python in all
our code.  The patch also adjusts invocations through the various
Makefiles to use the set variable.

	* configure.ac: Check for python3 or python.
	* configure: Regenerated.
	* config.make.in (PYTHON): New variable.
	* benchtests/Makefile: Don't define PYTHON.
	(bench): Define target only if PYTHON was defined.
	* Rules: Don't define PYTHON.
	Define pretty printer targets only if PYTHON was defined.
	(tests-printers): Add to tests-unsupported if PYTHON is not
	found.
	(python-flags, python-invoke): Remove.
	(tests-printers-out): Use PYTHON instead of python-invoke.
2016-12-22 23:07:52 +05:30
Wilco Dijkstra
5625f666ce This patch cleans up the strsep implementation and improves performance.
Currently strsep calls strpbrk is is now a veneer to strcspn.  Calling
strcspn directly is faster.  Since it handles a delimiter string of size
1 as a special case, this is not needed in strsep itself.  Although this
means there is a slightly higher overhead if the delimiter size is 1,
all other cases are slightly faster.  The overall performance gain is 5-10%
on AArch64.

The string/bits/string2.h header contains optimizations for constant
delimiters of size 1-3.  Benchmarking these showed similar performance for
size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3
can give up to 2x speedup for small input strings.  However if these cases
are common it seems much better to add this optimization to strcspn.
So move these header optimizations to string-inlines.c.

Improve the strsep benchmark so that it actually benchmarks something.
The current version contains a delimiter character at every position in the
input string, so there is very little work to do, and the extremely inefficent
simple_strsep implementation appears fastest in every case.  The new version
has either no match in the input for the fail case and a match halfway in the
input for the success case.  The input is then restored so that each iteration
does exactly the same amount of work.  Reduce the number of testcases since
simple_strsep takes a lot of time now.

	* benchtests/bench-strsep.c (oldstrsep): Add old implementation.
	(do_one_test) Restore original string so iteration works.
	* string/string-inlines.c (do_test): Create better input strings.
	(test_main) Reduce number of testruns.
	* string/string-inlines.c (__old_strsep_1c): New function.
	(__old_strsep_2c): Likewise.
	(__old_strsep_3c): Likewise.
	* string/strsep.c (__strsep): Remove case of small delim string.
	Call strcspn directly rather than strpbrk.
	* string/bits/string2.h (__strsep): Remove define.
	(__strsep_1c): Remove.
	(__strsep_2c): Remove.
	(__strsep_3c): Remove.
	(strsep): Remove.
	* sysdeps/unix/sysv/linux/internal_statvfs.c
	(__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
Adhemerval Zanella
da16c9b524 benchtests: Add fmaxf/fminf benchmarks
This patch adds fmaxf and fminf benchtests.  It is based on
math/s_fmax_template.c implementation which checks for basically four
different classes:

  1. if x is greater or equal than y.
  2. if x is less than y.
  3. if x or y is signaling.
  4. if y is nan.

Cases 1 and 2 are used for default input number (by mixing normal double
numbers and infinity), while case 3 and 4 are used each for on for a
benchmark class.

Checked on x86_64-linux-gnu and powerpc64-linux-gnu.

	* benchtests/Makefile (bench-math): Add fminf and fmaxf.
	(CFLAGS-bench-fmaxf.c): New rule.
	(CFLAGS-bench-fminf.c): Likewise.
        * benchtests/fmaxf-inputs: New file.
        * benchtests/fminf-inputs: Likewise.
2016-12-19 16:04:16 -02:00
Adhemerval Zanella
5d1f604a87 benchtests: Add fmax/fmin benchmarks
This patch adds fmax and fmin benchtests.  It is based math/s_fmax_template.c
implementation which checks for basically four different classes:

  1. if x is greater or equal than y.
  2. if x is less than y.
  3. if x or y is signaling.
  4. if y is nan.

Cases 1 and 2 are used for default input number (by mixing normal double
numbers and infinity), while case 3 and 4 are used each for on for a
benchmark class.

Checked on x86_64-linux-gnu and powerpc64-linux-gnu.

	* benchtests/Makefile (bench-math): Add fmin and fmax.
	(CFLAGS-bench-fmax.c): New rule.
	(CFLAGS-bench-fmin.c): New rule.
	* benchtests/fmax-inputs: New file.
	* benchtests/fmin-inputs: Likewise.
2016-12-19 16:04:16 -02:00
Adhemerval Zanella
b598e13477 Adjust benchtests to new support library.
This patch basically replaces the test-skeleton.c inclusion by
support/test-driver.c and also minor adjustments in bench-string.h.

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.

	* benchtests/bench-string.h (TEST_FUNCTION): Use name without
	parenthesis.
	(CMDLINE_PROCESS): Define using function instead of macro.
	* benchtests/bench-memccpy.c: Include <support/test-driver.c> instead
	of test-skeleton.
	* benchtests/bench-memchr.c: Likewise.
	* benchtests/bench-memcmp.c: Likewise.
	* benchtests/bench-memcpy-large.c: Likewise.
	* benchtests/bench-memcpy.c: Likewise.
	* benchtests/bench-memmem.c: Likewise.
	* benchtests/bench-memmove-large.c: Likewise.
	* benchtests/bench-memmove.c: Likewise.
	* benchtests/bench-memset-large.c: Likewise.
	* benchtests/bench-memset.c: Likewise.
	* benchtests/bench-rawmemchr.c: Likewise.
	* benchtests/bench-strcasecmp.c: Likewise.
	* benchtests/bench-strcasestr.c: Likewise.
	* benchtests/bench-strcat.c: Likewise.
	* benchtests/bench-strchr.c: Likewise.
	* benchtests/bench-strcmp.c: Likewise.
	* benchtests/bench-strcpy.c: Likewise.
	* benchtests/bench-strcpy_chk.c: Likewise.
	* benchtests/bench-strlen.c: Likewise.
	* benchtests/bench-strncasecmp.c: Likewise.
	* benchtests/bench-strncmp.c: Likewise.
	* benchtests/bench-strncpy.c: Likewise.
	* benchtests/bench-strnlen.c: Likewise.
	* benchtests/bench-strpbrk.c: Likewise.
	* benchtests/bench-strrchr.c: Likewise.
	* benchtests/bench-strsep.c: Likewise.
	* benchtests/bench-strspn.c: Likewise.
	* benchtests/bench-strstr.c: Likewise.
	* benchtests/bench-strtok.c: Likewise.
2016-12-19 16:04:16 -02:00
Siddhesh Poyarekar
009ba649b4 Link benchset tests against libsupport
Benchsets in benchtests use test-skeleton, so they too need to be
linked against the new libsupport DSO.

       * benchtests/Makefile (binaries-benchset): Depend on libsupport
       DSO.
2016-12-18 01:22:29 +05:30
Wilco Dijkstra
d58ab810a6 Improve strtok and strtok_r performance. Instead of calling strpbrk which
calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr.  Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string.  Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.

Remove the special header optimization for strtok_r of a 1-character
constant string - both strspn and strcspn contain optimizations for this
case.  Benchmarking this showed similar performance in the worst case,
but up to 5.5x better performance in the "found" case for large inputs.

	* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
	* string/strtok.c (strtok): Change to tailcall __strtok_r.
	* string/strtok_r.c (__strtok_r): Optimize for performance.
	* string/string-inlines.c (__old_strtok_r_1c): New function.
	* string/bits/string2.h (__strtok_r): Move to string-inlines.c.
2016-12-14 15:12:18 +00:00
Joseph Myers
7a8330c01b Use -fno-builtin for sqrt benchmark.
This patch makes the sqrt benchmark use -fno-builtin, as already done
for benchmarks of ffs and ffsll, so that it actually benchmarks the
glibc function as (presumably) intended even in the presence of the
compiler inlining sqrt.

Tested for x86_64 and also used for benchmarking my ARM sqrt patch.

	* benchtests/Makefile (CFLAGS-bench-sqrt.c): New variable.
2016-10-21 21:18:03 +00:00
H.J. Lu
447720b03b Clear destination buffer updated by the previous run
Clear the destination buffer updated by the previous run in bench-memcpy.c
and test-memcpy.c to catch the error when the following implementations do
not copy anything.

	[BZ #19907]
	* benchtests/bench-memcpy.c (do_one_test): Clear the destination
	buffer updated by the previous run.
	* string/test-memcpy.c (do_one_test): Likewise.
	* benchtests/bench-memmove.c (do_one_test): Add a comment.
	* string/test-memmove.c (do_one_test): Likewise.
2016-05-18 05:51:59 -07:00
Siddhesh Poyarekar
2d304f3c6f benchtests: Support for cross-building benchmarks
This patch adds full support for cross-building benchmarks.  Some
benchmarks like those that need locales to be generated cannot be
built and are hence skipped for cross builds.

Tested by cross building for aarch64 on x86_64 and then running the
generated benchmark on aarch64.

	* benchtests/Makefile (wcsmbs-benchset): Include only for
	native builds and runs.
	(LOCALES): Likewise.
	(bench-build): Build timing-type here instead of the bench
	target.  Generate locale only for native builds.
	* benchtests/README: Add note for cross-building.
2016-04-20 13:19:01 +05:30
Siddhesh Poyarekar
d7aea0cf06 benchtests: Clean up extra-objs
The bench-clean target would leave behind json-lib.o.  Fix up to clean
up all extra-objs registered in benchtests.
2016-04-20 13:15:50 +05:30
Siddhesh Poyarekar
bfdda211c6 benchtests: Update README to include instructions for bench-build target 2016-04-20 10:58:20 +05:30
Siddhesh Poyarekar
a00d3f4a8c New make target to only build benchmark binaries
For situations where we are cross-building or where we want to avoid
building on the target system, we want a way to only build benchmarks
and then copy them over to the target system to run them.  I have also
added a simple enhancement for the 'bench' target where all benchmark
binaries are built and then the benchmarks executed.

Tested on arm.

	Makefile.in (bench-build): New target.
	Rules (PHONY): Add bench-build target.
	benchtests/Makefile (bench): Depend on bench-build.
	(bench-build): New target.
2016-04-20 10:23:28 +05:30
Mike Frysinger
20003c4988 localedata: iw_IL: delete old/deprecated locale [BZ #16137]
From the bug:
Obsolete locale.  The ISO-639 code for Hebrew was changed from 'iw'
to 'he' in 1989, according to Bruno Haible on libc-alpha 2003-09-01.

Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
2016-04-08 18:56:34 -04:00
H.J. Lu
a25322f4e8 Add memcpy/memmove/memset benchmarks with large data
Add memcpy, memmove and memset benchmarks with large data sizes.

	* benchtests/Makefile (string-benchset): Add memcpy-large,
	memmove-large and memset-large.
	* benchtests/bench-memcpy-large.c: New file.
	* benchtests/bench-memmove-large.c: Likewise.
	* benchtests/bench-memmove-large.c: Likewise.
	* benchtests/bench-string.h (TIMEOUT): Don't redefine.
2016-04-06 08:37:39 -07:00
H.J. Lu
344303f3cf Test 64-byte alignment in memset benchtest
Add 64-byte alignment tests in memset benchtest for 64-byte vector
registers.

	* benchtests/bench-memset.c (do_test): Support 64-byte
	alignment.
	(test_main): Test 64-byte alignment.
2016-04-01 10:00:12 -07:00
H.J. Lu
aea44bf61a Test 64-byte alignment in memmove benchtest
Add 64-byte alignment tests in memmove benchtest for 64-byte vector
registers.

	* benchtests/bench-memmove.c (test_main): Test 64-byte
	alignment.
2016-04-01 09:59:09 -07:00
H.J. Lu
32b28d24a1 Test 64-byte alignment in memcpy benchtest
Add 64-byte alignment tests in memcpy benchtest for 64-byte vector
registers.

	* benchtests/bench-memcpy.c (test_main): Test 64-byte alignment.
2016-04-01 09:57:53 -07:00
H.J. Lu
87da630b22 Support --enable-hardcoded-path-in-tests in benchtests
benchtests should use $(test-via-rtld-prefix) and $(+link-tests) like
other glibc tests.

	[BZ #19783]
	* benchtests/Makefile (run-bench): Replace $(rtld-prefix) with
	$(test-via-rtld-prefix).
	($(binaries-bench)): Replace $(+link) with $(+link-tests).
2016-03-08 04:53:38 -08:00
Carlos O'Donell
67fc563718 Use $(PYTHON) to run benchtests python files. 2016-01-13 11:00:57 -05:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Siddhesh Poyarekar
aad287f35a benchtests: ffs and ffsll are string functions, not math
The ffs and ffsll functions were listed as math functions when they
are actually defined in strings.h and string.h respectively.  Shuffle
around the Makefile variables a bit and make a separate space for ffs
and ffsll.
2015-12-09 00:15:15 +05:30
Siddhesh Poyarekar
520e7edb85 benchtests: Add inputs from sin and cos to sincos
The sincos benchmark has only about a dozen inputs that don't measure
the impact of changes to various passes.  Since much of the code
properties are inherited from sin and cos, copy those inputs in to get
more comprehensive coverage.
2015-12-09 00:10:51 +05:30
Siddhesh Poyarekar
4916acd87b benchtests: Mark output variables as used
Prevent function calls that don't return anything from being optimized
out by the compiler by marking its input variables as used.

This prevents the sincos function call from being optimized out in the
benchmark.
2015-11-17 16:01:15 +05:30
Wilco Dijkstra
cb2f668d46 Add a new benchmark for isinf/isnan/isnormal/isfinite/fpclassify. The test uses 2 arrays with 1024 doubles, one with 99% finite FP numbers (10% zeroes, 10% negative) and 1% inf/NaN, the other with 50% inf, and 50% Nan.
ChangeLog:
2015-09-18  Wilco Dijkstra  <wdijkstr@arm.com>

	* benchtests/Makefile: Add bench-math-inlines, link with libm.
	* benchtests/bench-math-inlines.c: New benchmark.
	* benchtests/bench-util.h: New file.
	* benchtests/bench-util.c: New file.
	* benchtests/bench-skeleton.c: Add include of bench-util.c/h.
2015-09-18 16:02:38 +01:00
Stefan Liebler
f21216015b S390: Optimize wmemcmp.
This patch provides optimized version of wmemcmp with the z13 vector
instructions.

ChangeLog:

	* sysdeps/s390/multiarch/wmemcmp-c.c: New File.
	* sysdeps/s390/multiarch/wmemcmp-vx.S: Likewise.
	* sysdeps/s390/multiarch/wmemcmp.c: Likewise.
	* sysdeps/s390/multiarch/Makefile
	(sysdep_routines): Add wmemcmp functions.
	* sysdeps/s390/multiarch/ifunc-impl-list-common.c
	(__libc_ifunc_impl_list_common): Add ifunc test for wmemcmp.
	* benchtests/bench-wmemcmp.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wmemcmp.
2015-08-26 10:26:25 +02:00
Stefan Liebler
2e9e166761 S390: Optimize wmemset.
This patch provides optimized version of wmemset with the z13 vector
instructions.

ChangeLog:

	* sysdeps/s390/multiarch/wmemset-c.c: New File.
	* sysdeps/s390/multiarch/wmemset-vx.S: Likewise.
	* sysdeps/s390/multiarch/wmemset.c: Likewise.
	* sysdeps/s390/multiarch/Makefile
	(sysdep_routines): Add wmemset functions.
	* sysdeps/s390/multiarch/ifunc-impl-list-common.c
	(__libc_ifunc_impl_list_common): Add ifunc test for wmemset.
	* wcsmbs/wmemset.c: Use WMEMSET if defined.
	* string/test-memset.c: Add wmemset support.
	* wcsmbs/test-wmemset.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wmemset.
	* benchtests/bench-memset.c: Add wmemset support.
	* benchtests/bench-wmemset.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wmemset.
2015-08-26 10:26:25 +02:00
Stefan Liebler
88eefd344b S390: Optimize memchr, rawmemchr and wmemchr.
This patch provides optimized versions of memchr, rawmemchr and wmemchr with the
z13 vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/memchr-vx.S: New File.
	* sysdeps/s390/multiarch/memchr.c: Likewise.
	* sysdeps/s390/multiarch/rawmemchr-c.c: Likewise.
	* sysdeps/s390/multiarch/rawmemchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/rawmemchr.c: Likewise.
	* sysdeps/s390/multiarch/wmemchr-c.c: Likewise.
	* sysdeps/s390/multiarch/wmemchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/wmemchr.c: Likewise.
	* sysdeps/s390/s390-32/multiarch/memchr.c: Likewise.
	* sysdeps/s390/s390-64/multiarch/memchr.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add memchr, wmemchr
	and rawmemchr functions.
	* sysdeps/s390/multiarch/ifunc-impl-list-common.c
	(__libc_ifunc_impl_list_common): Add ifunc test for memchr, rawmemchr
	and wmemchr.
	* wcsmbs/wmemchr.c: Use WMEMCHR if defined.
	* string/test-memchr.c: Add wmemchr support.
	* wcsmbs/test-wmemchr.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wmemchr.
	* benchtests/bench-memchr.c: Add wmemchr support.
	* benchtests/bench-wmemchr.c: New File.
	* benchtests/Makefile (wcsmbs-bench): wmemchr.
2015-08-26 10:26:24 +02:00
Stefan Liebler
b4c21601b1 S390: Optimize strcspn and wcscspn.
This patch provides optimized versions of strcspn and wcscspn with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strcspn-c.c: New File.
	* sysdeps/s390/multiarch/strcspn-vx.S: Likewise.
	* sysdeps/s390/multiarch/strcspn.c: Likewise.
	* sysdeps/s390/multiarch/wcscspn-c.c: Likewise.
	* sysdeps/s390/multiarch/wcscspn-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcscspn.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcspn and
	wcscspn functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strcspn, wcscspn.
	* wcsmbs/wcscspn.c: Use WCSCSPN if defined.
	* string/test-strcspn.c: Add wcscspn support.
	* wcsmbs/test-wcscspn.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcscspn.
	* benchtests/bench-strcspn.c: Add wcscspn support.
	* benchtests/bench-wcscspn.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcscspn.
2015-08-26 10:26:24 +02:00
Stefan Liebler
f0ba659847 S390: Optimize strpbrk and wcspbrk.
This patch provides optimized versions of strpbrk and wcspbrk with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strpbrk-c.c: New File.
	* sysdeps/s390/multiarch/strpbrk-vx.S: Likewise.
	* sysdeps/s390/multiarch/strpbrk.c: Likewise.
	* sysdeps/s390/multiarch/wcspbrk-c.c: Likewise.
	* sysdeps/s390/multiarch/wcspbrk-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcspbrk.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strpbrk and
	wcspbrk functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strpbrk, wcspbrk.
	* wcsmbs/wcspbrk.c: Use WCSPBRK if defined.
	* string/test-strpbrk.c: Add wcspbrk support.
	* wcsmbs/test-wcspbrk.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcspbrk.
	* benchtests/bench-strpbrk.c: Add wcspbrk support.
	* benchtests/bench-wcspbrk.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcspbrk.
2015-08-26 10:26:24 +02:00
Stefan Liebler
f1ffad98be S390: Optimize strspn and wcsspn.
This patch provides optimized versions of strspn and wcsspn with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strspn-c.c: New File.
	* sysdeps/s390/multiarch/strspn-vx.S: Likewise.
	* sysdeps/s390/multiarch/strspn.c: Likewise.
	* sysdeps/s390/multiarch/wcsspn-c.c: Likewise.
	* sysdeps/s390/multiarch/wcsspn-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcsspn.c: Likewise.
	* wcsmbs/wcsspn.c: Use WCSSPN if defined.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strspn and
	wcsspn functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strspn, wcsspn.
	* string/test-strspn.c: Add wcsspn support.
	* wcsmbs/test-wcsspn.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcsspn.
	* benchtests/bench-strspn.c: Add wcsspn support.
	* benchtests/bench-wcsspn.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcsspn.
2015-08-26 10:26:24 +02:00
Stefan Liebler
f40132d4bd S390: Optimize strrchr and wcsrchr.
This patch provides optimized versions of strrchr and wcsrchr with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strrchr-c.c: New File.
	* sysdeps/s390/multiarch/strrchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/strrchr.c: Likewise.
	* sysdeps/s390/multiarch/wcsrchr-c.c: Likewise.
	* sysdeps/s390/multiarch/wcsrchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcsrchr.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strrchr and
	wcsrchr functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strrchr, wcsrchr.
	* benchtests/bench-wcsrchr.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcsrchr.
2015-08-26 10:26:23 +02:00
Stefan Liebler
d23d4ef19f S390: Optimize strchrnul and wcschrnul.
This patch provides optimized versions of strchrnul and wcschrnul with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strchrnul-c.c: New File.
	* sysdeps/s390/multiarch/strchrnul-vx.S: Likewise.
	* sysdeps/s390/multiarch/strchrnul.c: Likewise.
	* sysdeps/s390/multiarch/wcschrnul-c.c: Likewise.
	* sysdeps/s390/multiarch/wcschrnul-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcschrnul.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strchrnul and
	wcschrnul functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strchrnul, wcschrnul.
	* wcsmbs/wcschrnul.c: Use WCSCHRNUL if defined.
	* string/test-strchr.c: Add wcschrnul support.
	* wcsmbs/test-wcschrnul.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcschrnul.
	* benchtests/bench-strchr.c: Add wcschrnul support.
	* benchtests/bench-wcschrnul.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcschrnul.
2015-08-26 10:26:23 +02:00
Stefan Liebler
cf150d45a9 S390: Optimize strchr and wcschr.
This patch provides optimized versions of strchr and wcschr with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strchr-c.c: New File.
	* sysdeps/s390/multiarch/strchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/strchr.c: Likewise.
	* sysdeps/s390/multiarch/wcschr-c.c: Likewise.
	* sysdeps/s390/multiarch/wcschr-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcschr.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strchr and
	wcschr functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strchr, wcschr.
	* string/strchr.c (STRCHR): Define and use macro.
	* benchtests/bench-wcschr.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcschr.
2015-08-26 10:26:23 +02:00
Stefan Liebler
cee82e70cc S390: Optimize strncmp and wcsncmp.
This patch provides optimized versions of strncmp and wcsncmp with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strncmp-c.c: New File.
	* sysdeps/s390/multiarch/strncmp-vx.S: Likewise.
	* sysdeps/s390/multiarch/strncmp.c: Likewise.
	* sysdeps/s390/multiarch/wcsncmp-c.c: Likewise.
	* sysdeps/s390/multiarch/wcsncmp-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcsncmp.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strncmp and
	wcsncmp functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strncmp, wcsncmp.
	* wcsmbs/wcsncmp.c (WCSNCMP): Define and use macro.
	* benchtests/bench-strncmp.c: Add wcsncmp support.
	* benchtests/bench-wcsncmp.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcsncmp.
2015-08-26 10:26:22 +02:00
Stefan Liebler
63724a6db6 S390: Optimize strcmp and wcscmp.
This patch provides optimized versions of strcmp and wcscmp with the z13
vector instructions.

The architecture specific string.h had a typo, which leads to ommiting the
inline version in this file if __USE_STRING_INLINES is defined.
Tested this inline version by tweaking test-strcmp.c.

ChangeLog:

	* sysdeps/s390/multiarch/strcmp-vx.S: New File.
	* sysdeps/s390/multiarch/strcmp.c: Likewise.
	* sysdeps/s390/multiarch/wcscmp-c.c: Likewise.
	* sysdeps/s390/multiarch/wcscmp-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcscmp.c: Likewise.
	* sysdeps/s390/s390-32/multiarch/strcmp.c: Likewise.
	* sysdeps/s390/s390-64/multiarch/strcmp.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcmp and
	wcscmp functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strcmp, wcscmp.
	* string/strcmp.c (STRCMP): Define and use macro.
	* benchtests/bench-wcscmp.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcscmp.
	* sysdeps/s390/bits/string.h: Fix typo: _HAVE_STRING_ARCH_strcmp
	instead of _HAVE_STRING_ARCH_memchr.
2015-08-26 10:26:22 +02:00
Stefan Liebler
e1fe91180e S390: Optimize strncat wcsncat.
This patch provides optimized versions of strncat and wcsncat with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strncat-c.c: New File.
	* sysdeps/s390/multiarch/strncat-vx.S: Likewise.
	* sysdeps/s390/multiarch/strncat.c: Likewise.
	* sysdeps/s390/multiarch/wcsncat-c.c: Likewise.
	* sysdeps/s390/multiarch/wcsncat-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcsncat.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strncat and
	wcsncat functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strncat, wcsncat.
	* wcsmbs/wcsncat.c (WCSNCAT): Define and use macro.
	* string/test-strncat.c: Add wcsncat support.
	* wcsmbs/test-wcsncat.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcsncat.
	* benchtests/bench-strncat.c: Add wcsncat support.
	* benchtests/bench-wcsncat.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcsncat.
2015-08-26 10:26:22 +02:00
Stefan Liebler
d626a24f23 S390: Optimize strcat and wcscat.
This patch provides optimized versions of strcat and wcscat with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strcat-c.c: New File.
	* sysdeps/s390/multiarch/strcat-vx.S: Likewise.
	* sysdeps/s390/multiarch/strcat.c: Likewise.
	* sysdeps/s390/multiarch/wcscat-c.c: Likewise.
	* sysdeps/s390/multiarch/wcscat-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcscat.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcat and
	wcscat functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strcat, wcscat.
	* string/strcat.c (STRCAT): Define and use macro.
	* wcsmbs/wcscat.c: Use WCSCAT if defined.
	* string/test-strcat.c: Add wcscat support.
	* wcsmbs/test-wcscat.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcscat.
	* benchtests/bench-strcat.c: Add wcscat support.
	* benchtests/bench-wcscat.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcscat.
2015-08-26 10:26:21 +02:00
Stefan Liebler
b3a0c176d1 S390: Optimize stpncpy and wcpncpy.
This patch provides optimized versions of stpncpy and wcpncpy with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/stpncpy-c.c: New File.
	* sysdeps/s390/multiarch/stpncpy-vx.S: Likewise.
	* sysdeps/s390/multiarch/stpncpy.c: Likewise.
	* sysdeps/s390/multiarch/wcpncpy-c.c: Likewise.
	* sysdeps/s390/multiarch/wcpncpy-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcpncpy.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add stpncpy and
	wcpncpy functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for stpncpy, wcpncpy.
	* wcsmbs/wcpncpy.c: Use WCPNCPY if defined.
	* string/test-stpncpy.c: Add wcpncpy support.
	* wcsmbs/test-wcpncpy.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcpncpy.
	* benchtests/bench-stpncpy.c: Add wcpncpy support.
	* benchtests/bench-wcpncpy.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcpncpy.
2015-08-26 10:26:21 +02:00
Stefan Liebler
d183b96ee6 S390: Optimize strncpy and wcsncpy.
This patch provides optimized versions of strncpy and wcsncpy with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strncpy-vx.S: New File.
	* sysdeps/s390/multiarch/strncpy.c: Likewise.
	* sysdeps/s390/multiarch/wcsncpy-c.c: Likewise.
	* sysdeps/s390/multiarch/wcsncpy-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcsncpy.c: Likewise.
	* sysdeps/s390/s390-32/multiarch/strncpy.c: Likewise.
	* sysdeps/s390/s390-64/multiarch/strncpy.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strncpy and
	wcsncpy functions.
	* wcsmbs/wcsncpy.c: Use WCSNCPY if defined.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strncpy, wcsncpy.
	* string/test-strncpy.c: Add wcsncpy support.
	* wcsmbs/test-wcsncpy.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcsncpy.
	* benchtests/bench-strncpy.c: Add wcsncpy support.
	* benchtests/bench-wcsncpy.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcsncpy
2015-08-26 10:26:21 +02:00
Stefan Liebler
8ade3db78d S390: Optimize stpcpy and wcpcpy.
This patch provides optimized versions of stpcpy and wcpcpy with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/stpcpy-c.c: New File.
	* sysdeps/s390/multiarch/stpcpy-vx.S: Likewise.
	* sysdeps/s390/multiarch/stpcpy.c: Likewise.
	* sysdeps/s390/multiarch/wcpcpy-c.c: Likewise.
	* sysdeps/s390/multiarch/wcpcpy-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcpcpy.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add stpcpy and
	wcpcpy functions.
	* string/stpcpy.c: Use STPCPY if defined.
	* wcsmbs/wcpcpy.c: Use WCPCPY if defined.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for stpcpy, wcpcpy.
	* string/test-stpcpy.c: Add wcpcpy support.
	* wcsmbs/test-wcpcpy.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcpcpy.
	* benchtests/bench-stpcpy.c: Add wcpcpy support.
	* benchtests/bench-wcpcpy.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcpcpy.
2015-08-26 10:26:21 +02:00
Stefan Liebler
680df122ab S390: Optimize strcpy and wcscpy.
This patch provides optimized versions of strcpy and wcscpy with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strcpy-vx.S: New File.
	* sysdeps/s390/multiarch/strcpy.c: Likewise.
	* sysdeps/s390/multiarch/wcscpy-c.c: Likewise.
	* sysdeps/s390/multiarch/wcscpy-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcscpy.c: Likewise.
	* sysdeps/s390/s390-32/multiarch/strcpy.c: Likewise.
	* sysdeps/s390/s390-64/multiarch/strcpy.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcpy and
	wcscpy functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strcpy, wcscpy.
	* benchtests/bench-wcscpy.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcscpy.
2015-08-26 10:26:20 +02:00
Stefan Liebler
fcf40ebe26 S390: Optimize strnlen and wcsnlen.
This patch provides optimized versions of strnlen and wcsnlen with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strnlen-c.c: New File.
	* sysdeps/s390/multiarch/strnlen-vx.S: Likewise.
	* sysdeps/s390/multiarch/strnlen.c: Likewise.
	* sysdeps/s390/multiarch/wcsnlen-c.c: Likewise.
	* sysdeps/s390/multiarch/wcsnlen-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcsnlen.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strnlen and
	wcsnlen functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strnlen, wcsnlen.
	* wcsmbs/wcsnlen.c: Use WCSNLEN if defined.
	* string/test-strnlen.c: Add wcsnlen support.
	* wcsmbs/test-wcsnlen.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcsnlen.
	* benchtests/bench-strnlen.c: Add wcsnlen support.
	* benchtests/bench-wcsnlen.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcsnlen.
2015-08-26 10:26:20 +02:00
Stefan Liebler
9472f35a0a S390: Optimize strlen and wcslen.
This patch provides optimized versions of strlen and wcslen with the z13 vector
instructions.
The helper macro IFUNC_VX_IMPL is introduced and is used to register all
__<func>_c() and __<func>_vx() functions within __libc_ifunc_impl_list()
to the ifunc test framework.

ChangeLog:

	* sysdeps/s390/multiarch/Makefile: New File.
	* sysdeps/s390/multiarch/strlen-c.c: Likewise.
	* sysdeps/s390/multiarch/strlen-vx.S: Likewise.
	* sysdeps/s390/multiarch/strlen.c: Likewise.
	* sysdeps/s390/multiarch/wcslen-c.c: Likewise.
	* sysdeps/s390/multiarch/wcslen-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcslen.c: Likewise.
	* string/strlen.c (STRLEN): Define and use macro.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(IFUNC_VX_IMPL): New macro function.
	(__libc_ifunc_impl_list): Add ifunc test for strlen, wcslen.
	* benchtests/Makefile (wcsmbs-bench): New variable.
	(string-bench-all): Added wcsmbs-bench.
	* benchtests/bench-wcslen.c: New File.
2015-08-26 10:26:20 +02:00
Khem Raj
536fb97780 Reflect renaming of bh_IN and tu_IN in SUPPORTED file [BZ #17475] 2015-07-20 22:09:07 -04:00
Stefan Liebler
2419de0720 Fix benchtests build failure after 'add benchmark for strcoll'
This patch fixes implicit declaration of function strdup, strtok,
strcoll, strchr and removes unused variable res.

ChangeLog:

	* benchtests/bench-strcoll.c:
	Include string.h.
	(main): Remove unused variable res.
2015-06-19 13:47:59 +02:00
Siddhesh Poyarekar
0cd2828695 benchtest: script to compare two benchmarks
This script is a sample implementation that uses import_bench to
construct two benchmark objects and compare them.  If detailed timing
information is available (when one does `make DETAILED=1 bench`), it
writes out graphs for all functions it benchmarks and prints
significant differences in timings of the two benchmark runs.  If
detailed timing information is not available, it points out
significant differences in aggregate times.

Call this script as follows:

  compare_bench.py schema_file.json bench1.out bench2.out

Alternatively, if one wants to set a different threshold for warnings
(default is a 10% difference):

  compare_bench.py schema_file.json bench1.out bench2.out 25

The threshold in the example above is 25%.  schema_file.json is the
JSON schema (which is $srcdir/benchtests/scripts/benchout.schema.json
for the benchmark output file) and bench1.out and bench2.out are the
two benchmark output files to compare.

The key functionality here is the compress_timings function which
groups together points that are close together into a single point
that is the mean of all its representative points.  Any point in such
a group is at most 1.5x the smallest point in that group.  The
detailed derivation is a comment in the function.

	* benchtests/scripts/compare_bench.py: New file.
	* benchtests/scripts/import_bench.py (mean): New function.
	(split_list): Likewise.
	(do_for_all_timings): Likewise.
	(compress_timings): Likewise.
2015-06-01 23:14:11 +05:30
Siddhesh Poyarekar
0994b9b6f6 New module to import and process benchmark output
This is the beginning of a module to import and process benchmark
outputs.  The module currently supports importing of a bench.out and
validating it against a schema file.  In future this could grow a set
of routines that benchmark consumers may find useful to build their
own analysis tools.  I have altered validate_bench to use this module
too.

	* benchtests/scripts/import_bench.py: New file.
	* benchtests/scripts/validate_benchout.py: Import import_bench
	instead of jsonschema.
	(validate_bench): Remove function.
	(main): Use import_bench.
2015-06-01 23:13:29 +05:30
Carlos O'Donell
608f897106 Add sprintf benchmark.
Tests position and non-positional arguments with two
test string.
2015-05-21 10:02:52 -04:00
Leonhard Holz
60ccaf7514 Add strcoll benchmark 2015-05-13 13:05:29 +05:30
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Rajalakshmi Srinivasaraghavan
f59ad976ed powerpc: POWER7 strcpy optimization for unaligned strings
This patch optimizes strcpy for ppc64/power7 for unaligned source or
destination address.  The source or destination address is aligned
to doubleword and data is shifted based on the alignment and
added with the previous loaded data to be written as a doubleword.
For each load, cmpb instruction is used for faster null check.

The word aligned optimization is also removed, since the new unaligned
code path shows better results handling word-aligned strings.

More combination of unaligned inputs is also added in benchtest
to measure the improvement.The new optimization shows 2 to 80% of
performance improvement for longer string though it does not show
big difference on string size less than 16 due to additional checks.
2014-12-31 14:35:59 -05:00
Joseph Myers
bd5dadac87 Remove TEST_IFUNC, tests-ifunc and *-ifunc.c tests.
TEST_IFUNC is only tested in two headers, bench-string.h and
test-string.h, after it gets defined by those headers, and it never
gets undefined.

Thus no defines of TEST_IFUNC are needed, and the *-ifunc.c tests that
just define TEST_IFUNC and include other tests are also redundant, as
is the code to remove $(tests-ifunc) and $(xtests-ifunc) conditionally
from tests and xtests.  This patch removes the useless defines and
tests of TEST_IFUNC and the associated useless tests and makefile
code.  It thereby fixes a series of warnings
"../string/test-string.h:21:0: warning: "TEST_IFUNC" redefined" where
test-string.h defines TEST_IFUNC to empty, other files define it to 1
and this produces warnings.

Tested for x86_64.

	* debug/test-stpcpy_chk-ifunc.c: Remove file.
	* debug/test-strcpy_chk-ifunc.c: Likewise.
	* wcsmbs/test-wcschr-ifunc.c: Likewise.
	* wcsmbs/test-wcscmp-ifunc.c: Likewise.
	* wcsmbs/test-wcscpy-ifunc.c: Likewise.
	* wcsmbs/test-wcslen-ifunc.c: Likewise.
	* wcsmbs/test-wcsrchr-ifunc.c: Likewise.
	* wcsmbs/test-wmemcmp-ifunc.c: Likewise.
	* Rules [$(multi-arch) = no] (tests): Do not filter out
	$(tests-ifunc).
	[$(multi-arch) = no] (xtests): Do not filter out $(xtests-ifunc).
	* debug/Makefile (tests-ifunc): Remove variable.
	(tests): Do not add $(tests-ifunc).
	* wcsmbs/Makefile (tests-ifunc): Remove variable.
	(tests): Do not add $(tests-ifunc).
	* benchtests/bench-string.h (TEST_IFUNC): Remove macro.
	[TEST_IFUNC]: Remove conditionals.
	* string/test-string.h (TEST_IFUNC): Remove macro.
	[TEST_IFUNC]: Remove conditionals.
2014-11-26 12:53:36 +00:00
Will Newton
b01ee67cb5 benchtests: Add malloc microbenchmark
Add a microbenchmark for measuring malloc and free performance with
varying numbers of threads. The benchmark allocates and frees buffers
of random sizes in a random order and measures the overall execution
time and RSS. Variants of the benchmark are run with 1, 8, 16 and
32 threads.

The random block sizes used follow an inverse square distribution
which is intended to mimic the behaviour of real applications which
tend to allocate many more small blocks than large ones.

ChangeLog:

2014-11-05  Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile: (bench-malloc): Add malloc thread
	scalability benchmark.
	* benchtests/bench-malloc-threads.c: New file.
2014-11-05 14:13:00 +00:00
Adhemerval Zanella
71ae86478e PowerPC: memset optimization for POWER8/PPC64
This patch adds an optimized memset implementation for POWER8.  For
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to
POWER7 optimized one is used.

For size higher than 255 two strategies are used:

1. If the constant is different than 0, the memory is written with
   altivec vector instruction;

2. If constant is 0, dbcz instructions are used.  The loop is unrolled
   to clear 512 byte at time.

Using vector instructions increases throughput considerable, with a
double performance for sizes larger than 1024.  The dcbz loops unrolls
also shows performance improvement, by doubling throughput for sizes
larger than 8192 bytes.
2014-09-10 07:39:46 -04:00
Richard Henderson
428dd03f5a Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead
Without HP_TIMING_ACCUM, dl_hp_timing_overhead is write-only.
If we remove it, there's no point in HP_TIMING_DIFF_INIT either.
2014-07-03 08:38:25 -07:00
Siddhesh Poyarekar
91b84fe588 Remove unnecessary $(.)
The variable is not necessary, especially since it does not exist.
2014-06-19 17:02:48 +05:30
Siddhesh Poyarekar
42b1161e8c Validate bench.out against a JSON schema
This patch adds a JSON schema for the benchmark output file and also
adds a script that validates the generated output against the schema.
2014-06-11 14:16:29 +05:30
Joseph Myers
8540f6d2a7 Don't require test wrappers to preserve environment variables, use more consistent environment.
One wart in the original support for test wrappers for cross testing,
as noted in
<https://sourceware.org/ml/libc-alpha/2012-10/msg00722.html>, is the
requirement for test wrappers to pass a poorly-defined set of
environment variables from the build system to the system running the
glibc under test.  Although some variables are passed explicitly via
$(test-wrapper-env), including LD_* variables that simply can't be
passed implicitly because of the side effects they'd have on the build
system's dynamic linker, others are passed implicitly, including
variables such as GCONV_PATH and LOCPATH that could potentially affect
the build system's libc (so effectively relying on any such effects
not breaking the wrappers).  In addition, the code in
cross-test-ssh.sh for preserving environment variables is fragile (it
depends on how bash formats a list of exported variables, and could
well break for multi-line variable definitions where the contents
contain things looking like other variable definitions).

This patch moves to explicitly passing environment variables via
$(test-wrapper-env).  Makefile variables that previously used
$(test-wrapper) are split up into -before-env and -after-env parts
that can be passed separately to the various .sh files used in
testing, so those files can then insert environment settings between
the two parts.

The common default environment settings in make-test-out are made into
a separate makefile variable that can also be passed to scripts,
rather than many scripts duplicating those settings (for testing an
installed glibc, it is desirable to have the GCONV_PATH setting on
just one place, so just that one place needs to support it pointing to
an installed sysroot instead of the build tree).  The default settings
are included in the variables such as $(test-program-prefix), so that
if tests do not need any non-default settings they can continue to use
single variables rather than the split-up variables.

Although this patch cleans up LC_ALL=C settings (that being part of
the common defaults), various LANG=C and LANGUAGE=C settings remain.
Those are generally unnecessary and I propose a subsequent cleanup to
remove them.  LC_ALL takes precedence over LANG, and while LANGUAGE
takes precedence over LC_ALL, it only does so for settings other than
LC_ALL=C.  So LC_ALL=C on its own is sufficient to ensure the C
locale, and anything that gets LC_ALL=C does not need the other
settings.

While preparing this patch I noticed some tests with .sh files that
appeared to do nothing beyond what the generic makefile support for
tests can do (localedata/tst-wctype.sh - the makefiles support -ENV
variables and .input files - and localedata/tst-mbswcs.sh - just runs
five tests that could be run individually from the makefile).  So I
propose another subsequent cleanup to move those to using the generic
support instead of special .sh files.

Tested x86_64 (native) and powerpc32 (cross).

	* Makeconfig (run-program-env): New variable.
	(run-program-prefix-before-env): Likewise.
	(run-program-prefix-after-env): Likewise.
	(run-program-prefix): Define in terms of new variables.
	(built-program-cmd-before-env): New variable.
	(built-program-cmd-after-env): Likewise.
	(built-program-cmd): Define in terms of new variables.
	(test-program-prefix-before-env): New variable.
	(test-program-prefix-after-env): Likewise.
	(test-program-prefix): Define in terms of new variables.
	(test-program-cmd-before-env): New variable.
	(test-program-cmd-after-env): Likewise.
	(test-program-cmd): Define in terms of new variables.
	* Rules (make-test-out): Use $(run-program-env).
	* scripts/cross-test-ssh.sh (env_blacklist): Remove variable.
	(help): Do not mention environment variables.  Mention
	--timeoutfactor option.
	(timeoutfactor): New variable.
	(blacklist_exports): Remove function.
	(exports): Remove variable.
	(command): Do not include ${exports}.
	* manual/install.texi (Configuring and compiling): Do not mention
	test wrappers preserving environment variables.  Mention that last
	assignment to a variable must take precedence.
	* INSTALL: Regenerated.
	* benchtests/Makefile (run-bench): Use $(run-program-env).
	* catgets/Makefile ($(objpfx)test1.cat): Use
	$(built-program-cmd-before-env), $(run-program-env) and
	$(built-program-cmd-after-env).
	($(objpfx)test2.cat): Do not specify environment variables
	explicitly.
	($(objpfx)de/libc.cat): Use $(built-program-cmd-before-env),
	$(run-program-env) and $(built-program-cmd-after-env).
	($(objpfx)test-gencat.out): Use $(test-program-cmd-before-env),
	$(run-program-env) and $(test-program-cmd-after-env).
	($(objpfx)sample.SJIS.cat): Do not specify environment variables
	explicitly.
	* catgets/test-gencat.sh: Use test_program_cmd_before_env,
	run_program_env and test_program_cmd_after_env arguments.
	* elf/Makefile ($(objpfx)tst-pathopt.out): Use $(run-program-env).
	* elf/tst-pathopt.sh: Use run_program_env argument.
	* iconvdata/Makefile ($(objpfx)iconv-test.out): Use
	$(test-wrapper-env) and $(run-program-env).
	* iconvdata/run-iconv-test.sh: Use test_wrapper_env and
	run_program_env arguments.
	* iconvdata/tst-table.sh: Do not set GCONV_PATH explicitly.
	* intl/Makefile ($(objpfx)tst-gettext.out): Use
	$(test-program-prefix-before-env), $(run-program-env) and
	$(test-program-prefix-after-env).
	($(objpfx)tst-gettext2.out): Likewise.
	* intl/tst-gettext.sh: Use test_program_prefix_before_env,
	run_program_env and test_program_prefix_after_env arguments.
	* intl/tst-gettext2.sh: Likewise.
	* intl/tst-gettext4.sh: Do not set environment variables
	explicitly.
	* intl/tst-gettext6.sh: Likewise.
	* intl/tst-translit.sh: Likewise.
	* malloc/Makefile ($(objpfx)tst-mtrace.out): Use
	$(test-program-prefix-before-env), $(run-program-env) and
	$(test-program-prefix-after-env).
	* malloc/tst-mtrace.sh: Use test_program_prefix_before_env,
	run_program_env and test_program_prefix_after_env arguments.
	* math/Makefile (run-regen-ulps): Use $(run-program-env).
	* nptl/Makefile ($(objpfx)tst-tls6.out): Use $(run-program-env).
	* nptl/tst-tls6.sh: Use run_program_env argument.  Set LANG=C
	explicitly with each use of ${test_wrapper_env}.
	* posix/Makefile ($(objpfx)wordexp-tst.out): Use
	$(test-program-prefix-before-env), $(run-program-env) and
	$(test-program-prefix-after-env).
	* posix/tst-getconf.sh: Do not set environment variables
	explicitly.
	* posix/wordexp-tst.sh: Use test_program_prefix_before_env,
	run_program_env and test_program_prefix_after_env arguments.
	* stdio-common/tst-printf.sh: Do not set environment variables
	explicitly.
	* stdlib/Makefile ($(objpfx)tst-fmtmsg.out): Use
	$(test-program-prefix-before-env), $(run-program-env) and
	$(test-program-prefix-after-env).
	* stdlib/tst-fmtmsg.sh: Use test_program_prefix_before_env,
	run_program_env and test_program_prefix_after_env arguments.
	Split $test calls into $test_pre and $test.
	* timezone/Makefile (build-testdata): Use
	$(built-program-cmd-before-env), $(run-program-env) and
	$(built-program-cmd-after-env).

localedata/ChangeLog:
	* Makefile ($(addprefix $(objpfx),$(CTYPE_FILES))): Use
	$(built-program-cmd-before-env), $(run-program-env) and
	$(built-program-cmd-after-env).
	($(objpfx)sort-test.out): Use $(test-program-prefix-before-env),
	$(run-program-env) and $(test-program-prefix-after-env).
	($(objpfx)tst-fmon.out): Use $(run-program-prefix-before-env),
	$(run-program-env) and $(run-program-prefix-after-env).
	($(objpfx)tst-locale.out): Use $(built-program-cmd-before-env),
	$(run-program-env) and $(built-program-cmd-after-env).
	($(objpfx)tst-trans.out): Use $(run-program-prefix-before-env),
	$(run-program-env), $(run-program-prefix-after-env),
	$(test-program-prefix-before-env) and
	$(test-program-prefix-after-env).
	($(objpfx)tst-ctype.out): Use $(test-program-cmd-before-env),
	$(run-program-env) and $(test-program-cmd-after-env).
	($(objpfx)tst-wctype.out): Likewise.
	($(objpfx)tst-langinfo.out): Likewise.
	($(objpfx)tst-langinfo-static.out): Likewise.
	* gen-locale.sh: Use localedef_before_env, run_program_env and
	localedef_after_env arguments.
	* sort-test.sh: Use test_program_prefix_before_env,
	run_program_env and test_program_prefix_after_env arguments.
	* tst-ctype.sh: Use tst_ctype_before_env, run_program_env and
	tst_ctype_after_env arguments.
	* tst-fmon.sh: Use run_program_prefix_before_env, run_program_env
	and run_program_prefix_after_env arguments.
	* tst-langinfo.sh: Use tst_langinfo_before_env, run_program_env
	and tst_langinfo_after_env arguments.
	* tst-locale.sh: Use localedef_before_env, run_program_env and
	localedef_after_env arguments.
	* tst-mbswcs.sh: Do not set environment variables explicitly.
	* tst-numeric.sh: Likewise.
	* tst-rpmatch.sh: Likewise.
	* tst-trans.sh: Use run_program_prefix_before_env,
	run_program_env, run_program_prefix_after_env,
	test_program_prefix_before_env and test_program_prefix_after_env
	arguments.
	* tst-wctype.sh: Use tst_wctype_before_env, run_program_env and
	tst_wctype_after_env arguments.
2014-06-06 22:19:27 +00:00
Siddhesh Poyarekar
15eaf6ffe3 benchtests: Add new directive for benchmark initialization hook
Add a new 'init' directive that specifies the name of the function to
call to do function-specific initialization.  This is useful for
benchmarks that need to do a one-time initialization before the
functions are executed.
2014-05-26 12:37:29 +05:30
Joseph Myers
79520f4bd6 Use existing makefile variables for dependencies on glibc libraries.
glibc's Makeconfig defines some variables such as $(libm) and $(libdl)
for linking with libraries built by glibc, and nptl/Makeconfig
(included by the toplevel Makeconfig) defines others such as
$(shared-thread-library).

In some places glibc's Makefiles use those variables when linking
against the relevant libraries, but in other places they hardcode the
location of the libraries in the build tree.  This patch cleans up
various places to use the variables that already exist (in the case of
libm, replacing several duplicate definitions of a $(link-libm)
variable in subdirectory Makefiles).  (It's not necessarily exactly
equivalent to what the existing code does - in particular,
$(shared-thread-library) includes libpthread_nonshared, but is
replacing places that just referred to libpthread.so.  But I think
that change is desirable on the general principle of linking things as
close as possible to the way in which they would be linked with an
installed library, unless there is a clear reason not to do so.)

To support running tests with an installed copy of glibc without
needing the full build tree from when that copy was built, I think it
will be useful to use such variables more generally and systematically
- every time the rules for building a test refer to some file from the
build tree that's also installed by glibc, use a makefile variable so
that the installed-testing case can point those variables to installed
copies of the files.  This patch just deals with straightforward cases
where such variables already exist.

It's quite possible some uses of $(shared-thread-library) should
actually be a new $(thread-library) variable that's set appropriately
in the --disable-shared case, if those uses would in fact work without
shared libraries.  I didn't change the status quo that those cases
hardcode use of a shared library whether or not it's actually needed
(but other uses such as $(libm) and $(libdl) would now get the static
library if the shared library isn't built, when some previously
hardcoded use of the shared library - if they actually need shared
libraries, the test itself needs an enable-shared conditional anyway).

Tested x86_64.

	* benchtests/Makefile
	($(addprefix $(objpfx)bench-,$(bench-math))): Depend on $(libm),
	not $(common-objpfx)math/libm.so.
	($(addprefix $(objpfx)bench-,$(bench-pthread))): Depend on
	$(shared-thread-library), not $(common-objpfx)nptl/libpthread.so.
	* elf/Makefile ($(objpfx)noload): Depend on $(libdl), not
	$(common-objpfx)dlfcn/libdl.so.
	($(objpfx)tst-audit8): Depend on $(libm), not
	$(common-objpfx)math/libm.so.
	* malloc/Makefile ($(objpfx)libmemusage.so): Depend on $(libdl),
	not $(common-objpfx)dlfcn/libdl.so.
	* math/Makefile
	($(addprefix $(objpfx),$(filter-out $(tests-static),$(tests)))):
	Depend on $(libm), not $(objpfx)libm.so.  Do not condition on
	[$(build-shared) = yes].
	($(objpfx)test-fenv-tls): Depend on $(shared-thread-library), not
	$(common-objpfx)nptl/libpthread.so.
	* misc/Makefile ($(objpfx)tst-tsearch): Depend on $(libm), not
	$(common-objpfx)math/libm.so$(libm.so-version) or
	$(common-objpfx)math/libm.a depending on [$(build-shared) = yes].
	* nptl/Makefile ($(objpfx)tst-unload): Depend on $(libdl), not
	$(common-objpfx)dlfcn/libdl.so.
	* setjmp/Makefile (link-libm): Remove variable.
	($(objpfx)tst-setjmp-fp): Depend on $(libm), not $(link-libm).
	* stdio-common/Makefile (link-libm): Remove variable.
	($(objpfx)tst-printf-round): Depend on $(libm), not $(link-libm).
	* stdlib/Makefile (link-libm): Remove variable.
	($(objpfx)bug-getcontext): Depend on $(libm), not $(link-libm).
	($(objpfx)tst-strtod-round): Likewise.
	($(objpfx)tst-tininess): Likewise.
	($(objpfx)tst-strtod-underflow): Likewise.
	($(objpfx)tst-strtod6): Likewise.
	($(objpfx)tst-tls-atexit): Depend on $(shared-thread-library) and
	$(libdl), not $(common-objpfx)nptl/libpthread.so and
	$(common-objpfx)dlfcn/libdl.so.
2014-05-16 21:38:08 +00:00
Siddhesh Poyarekar
bb9c256fb0 benchtests: Link against objects in build directory
Using -lm and -lpthread results in the shared objects in the system
being used to link against.  This happened to work for libm because
there haven't been any changes to the libm ABI recently that could
break the existing benchmarks.  This doesn't always work for the
pthread benchmarks.  The correct way to build against libraries in the
build directory is to have the binaries explicitly depend on them so
that $(+link) can pick them up.
2014-04-15 14:33:06 +05:30
Will Newton
970c602aa6 benchtests: Improve readability of JSON output
Add a small library to print JSON values and use it to improve the
readability of the benchmark output and the readability of the
benchmark code.

ChangeLog:

2014-04-11  Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile (extra-objs): Add json-lib.o.
	(bench-func): Tidy up JSON output.
	* benchtests/bench-skeleton.c: Include json-lib.h.
	(main): Use JSON library functions to do output of
	benchmark results.
	* benchtests/bench-timing-type.c (main): Output the
	timing type simply, leaving formatting to the user.
	* benchtests/json-lib.c: New file.
	* benchtests/json-lib.h: Likewise.
2014-04-11 16:05:03 +01:00
Torvald Riegel
6a5d6ea128 benchtests: Add pthread_once common-case test.
We have a single thread that runs a no-op initialization once and then
repeatedly runs checks of the initialization (i.e., an acquire load and
conditional jump) in a tight loop.  This gives us, on average, the
best-case latency of pthread_once (the initialization is the
exactly-once slow path, and we're not looking at initialization-related
synchronization overheads in this case).
2014-04-10 21:22:28 +02:00
Will Newton
f6c557968c benchtests: Build ffs and ffsl benchtests with -fno-builtin
Without this flag it is possible that the compiler will optimize
away the calls to ffs/ffsll.

ChangeLog:

2014-04-01  Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile (CFLAGS-bench-ffs.c): Add
	-fno-builtin.  (CFLAGS-bench-ffsll.c): Likewise.
2014-04-01 12:50:41 +01:00
Will Newton
c760f5c210 benchtests: Add benchtests for ffs and ffsll
Add benchtests for ffs and ffsll. There is no benchtest for ffsl as
it is identical to one of the other functions.

2014-03-31  Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile (bench): Add ffs and ffsll to list
	of tests.
	* benchtests/ffs-inputs: New file.
	* benchtests/ffsll-inputs: Likewise.
2014-03-31 12:50:41 +01:00
Siddhesh Poyarekar
5673750800 Detailed benchmark outputs for functions
This patch adds an option to get detailed benchmark output for
functions.  Invoking the benchmark with 'make DETAILED=1 bench' causes
each benchmark program to store a mean execution time for each input
it works on.  This is useful to give a more comprehensive picture of
performance of functions compared to just the single mean figure.
2014-03-29 09:40:19 +05:30
Siddhesh Poyarekar
cb5e4aada7 Make bench.out in json format
This patch changes the output format of the main benchmark output file
(bench.out) to an extensible format.  I chose JSON over XML because in
addition to being extensible, it is also not too verbose.
Additionally it has good support in python.

The significant change I have made in terms of functionality is to put
timing information as an attribute in JSON instead of a string and to
do that, there is a separate program that prints out a JSON snippet
mentioning the type of timing (hp_timing or clock_gettime).  The mean
timing has now changed from iterations per unit to actual timing per
iteration.
2014-03-29 09:37:44 +05:30
Siddhesh Poyarekar
cf806aff60 [benchtests] Use inputs file for modf
The modf benchmark can now use the framework since the introduction of
output arguments.
2014-03-29 09:35:50 +05:30
Will Newton
60a2f3c166 benchtests/bench-strtod.c: Increase timeout value
This benchmark can take longer than the default 2 seconds on slower
platforms, so increase it to 10 seconds.

ChangeLog:

2014-03-26  Will Newton <will.newton@linaro.org>

	* benchtests/bench-strtod.c (TIMEOUT): Define to 10.
2014-03-26 09:43:28 +00:00
Siddhesh Poyarekar
27c673b8de benchtests: Move bench.py to benchtests/scripts/
It makes much more sense to have all benchmarking-related scripts in a
single place away from everything else.
2014-03-24 21:16:36 +05:30
Siddhesh Poyarekar
df26ea5359 Implement benchmarking script in python
Implemented the benchmark script in python since it is much cleaner
and simpler to maintain.
2014-03-21 17:32:50 +05:30
Ondřej Bílka
7b3551e3a8 Make strtok benchmark competive.
We include a generic version of strtok to result which could be faster
when underlying primitives are better optimized than current version.
2014-02-28 22:45:33 +01:00
Joseph Myers
a5f891ac8d Consistently include Makeconfig after defining subdir.
In <https://sourceware.org/ml/libc-alpha/2014-01/msg00196.html> I
noted it was necessary to add includes of Makeconfig early in various
subdirectory makefiles for the tests-special variable settings added
by that patch to be conditional on configuration information.  No-one
commented on the general question there of whether Makeconfig should
always be included immediately after the definition of subdir.

This patch implements that early inclusion of Makeconfig in each
directory (which is a lot easier than consistent placement of includes
of Rules).  Includes are added if needed, or moved up if already
present.  Subdirectory "all:" targets are removed, since Makeconfig
provides one.

There is potential for further cleanups I haven't done.  Rules and
Makerules have code such as

ifneq   "$(findstring env,$(origin headers))" ""
headers :=
endif

to override to empty any value of various variables that came from the
environment.  I think there is a case for Makeconfig setting all the
subdirectory variables (other than subdir) to empty to ensure no
outside value is going to take effect if a subdirectory fails to
define a variable.  (A list of such variables, possibly out of date
and incomplete, is in manual/maint.texi.)  Rules and Makerules would
give errors if Makeconfig hadn't already been included, instead of
including it themselves.  The special code to override values coming
from the environment would then be obsolete and could be removed.

Tested x86_64, including that installed binaries are identical before
and after the patch.

	* argp/Makefile: Include Makeconfig immediately after defining
	subdir.
	* assert/Makefile: Likewise.
	* benchtests/Makefile: Likewise.
	* catgets/Makefile: Likewise.
	* conform/Makefile: Likewise.
	* crypt/Makefile: Likewise.
	* csu/Makefile: Likewise.
	(all): Remove target.
	* ctype/Makefile: Include Makeconfig immediately after defining
	subdir.
	* debug/Makefile: Likewise.
	* dirent/Makefile: Likewise.
	* dlfcn/Makefile: Likewise.
	* gmon/Makefile: Likewise.
	* gnulib/Makefile: Likewise.
	* grp/Makefile: Likewise.
	* gshadow/Makefile: Likewise.
	* hesiod/Makefile: Likewise.
	* hurd/Makefile: Likewise.
	(all): Remove target.
	* iconvdata/Makefile: Include Makeconfig immediately after
	defining subdir.
	* inet/Makefile: Likewise.
	* intl/Makefile: Likewise.
	* io/Makefile: Likewise.
	* libio/Makefile: Likewise.
	(all): Remove target.
	* locale/Makefile: Include Makeconfig immediately after defining
	subdir.
	* login/Makefile: Likewise.
	* mach/Makefile: Likewise.
	(all): Remove target.
	* malloc/Makefile: Include Makeconfig immediately after defining
	subdir.
	(all): Remove target.
	* manual/Makefile: Include Makeconfig immediately after defining
	subdir.
	* math/Makefile: Likewise.
	* misc/Makefile: Likewise.
	* nis/Makefile: Likewise.
	* nss/Makefile: Likewise.
	* po/Makefile: Likewise.
	(all): Remove target.
	* posix/Makefile: Include Makeconfig immediately after defining
	subdir.
	* pwd/Makefile: Likewise.
	* resolv/Makefile: Likewise.
	* resource/Makefile: Likewise.
	* rt/Makefile: Likewise.
	* setjmp/Makefile: Likewise.
	* shadow/Makefile: Likewise.
	* signal/Makefile: Likewise.
	* socket/Makefile: Likewise.
	* soft-fp/Makefile: Likewise.
	* stdio-common/Makefile: Likewise.
	* stdlib/Makefile: Likewise.
	* streams/Makefile: Likewise.
	* string/Makefile: Likewise.
	* sunrpc/Makefile: Likewise.
	(all): Remove target.
	* sysvipc/Makefile: Include Makeconfig immediately after defining
	subdir.
	* termios/Makefile: Likewise.
	* time/Makefile: Likewise.
	* timezone/Makefile: Likewise.
	(all): Remove target.
	* wcsmbs/Makefile: Include Makeconfig immediately after defining
	subdir.
	* wctype/Makefile: Likewise.

libidn/ChangeLog:
	* Makefile: Include Makeconfig immediately after defining subdir.

localedata/ChangeLog:
	* Makefile: Include Makeconfig immediately after defining subdir.
	(all): Remove target.

nptl/ChangeLog:
	* Makefile: Include Makeconfig immediately after defining subdir.

nptl_db/ChangeLog:
	* Makefile: Include Makeconfig immediately after defining subdir.
2014-02-26 23:12:03 +00:00
Siddhesh Poyarekar
b8cd1c4ea5 Minor formatting fix 2014-02-21 11:31:41 +05:30
Rajalakshmi Srinivasaraghavan
bd939d2322 print length in strrchr benchtest
The return criteria of strrchr() is to read till NULL even if the
search character is hit.  So its better to print len instead of pos.
2014-02-21 11:30:03 +05:30
Ondřej Bílka
a1ffb40e32 Use glibc_likely instead __builtin_expect. 2014-02-10 15:07:12 +01:00
Mike Frysinger
c5bb8e2399 tests: unify fortification handler logic
We have multiple tests that copy & paste the same logic for disabling the
fortification output.  Let's unify this in the test-skeleton instead.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2014-02-08 06:58:43 -05:00
Siddhesh Poyarekar
649ecea212 Correct inputs for sin and cos
The inputs for the slowest path in asin and acos were incorrect and
had some fast path inputs there too.
2014-01-10 09:57:51 +05:30
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Siddhesh Poyarekar
dd1d85e5dd Benchmark inputs for cos and sin
Add a comprehensive number of inputs for all branches in sin and cos
computation, excluding the fast paths.  This also adds a number of
inputs for the multiple precision slow paths.
2013-12-31 12:12:46 +05:30
Siddhesh Poyarekar
1acbb90f7a benchmark inputs for atan
Add a more comprehensive set of inputs for the atan function.  I have
also fixed the name on the multiple precision fallback inputs (I
couldn't find any new inputs there) to reflect the fact that the
fallback is only 144bits and not 768bits as I had earlier mentioned.
2013-12-31 12:11:13 +05:30
Siddhesh Poyarekar
4c012ed391 benchmark inputs for tanh and atanh 2013-12-31 12:06:30 +05:30
Siddhesh Poyarekar
eff9832405 benchmark inputs for asinh and acosh
Like sinh and cosh, this patch has benchmark inputs for asinh and
acosh, generated using a random number generator and spread over
significant branches, ignoring the fast return paths.
2013-12-31 12:05:16 +05:30
Siddhesh Poyarekar
ce641152c4 benchmark inputs for sinh and cosh
Add a full set of inputs for sinh and cosh functions generated using a
random number generator and spreading it over all branches in the
function, ignoring the fast paths (i.e. immediate return for special
values).
2013-12-31 12:03:44 +05:30
Siddhesh Poyarekar
b19221b9a5 benchmark inputs for asin and acos
Add a comprehensive set of inputs for asin and acos functions,
including the multiple precision fallback path.
2013-12-31 12:01:40 +05:30
Rajalakshmi Srinivasaraghavan
9f6e964c3a benchtests: Add strtok benchmark 2013-12-19 06:45:54 -05:00
Siddhesh Poyarekar
19b5525e52 benchmark inputs for exp2, log2, log and tan 2013-12-12 09:31:53 +05:30
Siddhesh Poyarekar
9298ecba15 Accept output arguments to benchmark functions
This patch adds the ability to accept output arguments to functions
being benchmarked, by nesting the argument type in <> in the args
directive.  It includes the sincos implementation as an example, where
the function would have the following args directive:

  ## args: double:<double *>:<double *>

This simply adds a definition for a static variable whose pointer gets
passed into the function, so it's not yet possible to pass something
more complicated like a pre-allocated string or array.  That would be
a good feature to add if a function needs it.

The values in the input file will map only to the input arguments.  So
if I had a directive like this for a function foo:

  ## args: int:<int *>:int:<int *>

and I have a value list like this:

1, 2
3, 4
5, 6

then the function calls generated would be:

foo (1, &out1, 2, &out2);
foo (3, &out1, 4, &out2);
foo (5, &out1, 6, &out2);
2013-12-05 10:12:59 +05:30
Steve Ellcey
fe7da22091 Benchmark test for sqrt function. 2013-12-02 09:37:18 -08:00
Ondřej Bílka
a950349667 Also remove benchtests/bench-strsep-ifunc.c 2013-11-26 17:34:34 +01:00
Ondřej Bílka
826fa85580 Remove duplicate ifunc benchtests. 2013-11-26 12:48:33 +01:00
Rajalakshmi Srinivasaraghavan
250c23bdd9 benchtests: Add strsep benchmark 2013-11-18 06:49:44 -06:00
Steve Ellcey
e8470ea216 2013-11-13 Steve Ellcey <sellcey@mips.com>
* benchtests/bench-timing.h: Include time.h.
2013-11-13 08:48:25 -08:00
Adhemerval Zanella
450a2e2d19 benchtests: Add strtod benchmark 2013-11-11 11:24:07 -02:00
Siddhesh Poyarekar
dfa1b402a0 Benchmark inputs for pow
These inputs cover all normal processing paths for pow including all
its slow paths.
2013-10-28 16:36:46 +05:30
Siddhesh Poyarekar
54f73d9c08 New inputs for exp
A more comprehensive set of inputs for exp, including all slow paths.
The inputs have been shuffled so that they don't give a false-positive
due to a hot cache.
2013-10-28 16:35:08 +05:30
Torvald Riegel
40fefba1b5 benchtests: Add include-sources directive.
This adds the "include-sources" directive to scripts/bench.pl.  This
allows for including source code (vs including headers, which might get
a different search path) after the inclusion of any headers.
2013-10-10 14:45:30 +03:00
Siddhesh Poyarekar
a357259bf8 Add more directives to benchmark input files
This patch adds some more directives to the benchmark inputs file,
moving functionality from the Makefile and making the code generation
script a bit cleaner.  The function argument and return types that
were earlier added as variables in the makefile and passed to the
script via command line arguments are now the 'args' and 'ret'
directive respectively.  'args' should be a colon separated list of
argument types (skipped if the function doesn't accept any arguments)
and 'ret' should be the return type.

Additionally, an 'includes' directive may have a comma separated list
of headers to include in the source.  For example, the pow input file
now looks like this:

42.0, 42.0
1.0000000000000020, 1.5

I did this to unclutter the benchtests Makefile a bit and eventually
eliminate dependency of the tests on the Makefile and have tests
depend on their respective include files only.
2013-10-07 11:51:25 +05:30
Siddhesh Poyarekar
7849ff938c Add benchmark inputs for sincos 2013-09-19 16:55:27 +05:30
Will Newton
b987c77672 benchtests: Rename argument to TIMING_INIT macro.
The TIMING_INIT macro currently sets the number of loop iterations
to 1000, which limits usefulness. Make the argument a clock
resolution value and multiply by 1000 in bench-skeleton.c instead
to allow easier reuse.

ChangeLog:

2013-09-11  Will Newton  <will.newton@linaro.org>

	* benchtests/bench-timing.h (TIMING_INIT): Rename ITERS
	parameter to RES. Remove hardcoded 1000 value.
	* benchtests/bench-skeleton.c (main): Pass RES parameter
	to TIMING_INIT and multiply result by 1000.
2013-09-11 15:18:20 +01:00
Adhemerval Zanella
e029e2e5c5 benchtests: Add memrchr benchmark 2013-09-06 09:24:52 -03:00
Will Newton
bbf6e8e4f4 benchtests/Makefile: Run benchmark for memcpy.
The benchmark for memcpy got disabled accidentally. Re-enable it.

ChangeLog:

2013-09-06   Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile (string-bench): Add memcpy.
2013-09-06 11:59:00 +01:00
Will Newton
44558701ff benchtests: Switch string benchmarks to use bench-timing.h.
Switch the string benchmarks to using bench-timing.h instead
of hp-timing.h directly. This allows the string benchmarks to
be run usefully on architectures such as ARM that do not have
support for hp-timing.h.

In order to do this the tests have been changed from timing each
individual call and picking the lowest execution time recorded to
timing a number of calls and taking the mean execution time.

ChangeLog:

2013-09-04   Will Newton  <will.newton@linaro.org>

	* benchtests/bench-timing.h (TIMING_PRINT_MEAN): New macro.
	* benchtests/bench-string.h: Include bench-timing.h instead
	of including hp-timing.h directly. (INNER_LOOP_ITERS): New
	define. (HP_TIMING_BEST): Delete macro. (test_init): Remove
	call to HP_TIMING_DIFF_INIT.
	* benchtests/bench-memccpy.c: Use bench-timing.h macros
	instead of hp-timing.h macros.
	* benchtests/bench-memchr.c: Likewise.
	* benchtests/bench-memcmp.c: Likewise.
	* benchtests/bench-memcpy.c: Likewise.
	* benchtests/bench-memmem.c: Likewise.
	* benchtests/bench-memmove.c: Likewise.
	* benchtests/bench-memset.c: Likewise.
	* benchtests/bench-rawmemchr.c: Likewise.
	* benchtests/bench-strcasecmp.c: Likewise.
	* benchtests/bench-strcasestr.c: Likewise.
	* benchtests/bench-strcat.c: Likewise.
	* benchtests/bench-strchr.c: Likewise.
	* benchtests/bench-strcmp.c: Likewise.
	* benchtests/bench-strcpy.c: Likewise.
	* benchtests/bench-strcpy_chk.c: Likewise.
	* benchtests/bench-strlen.c: Likewise.
	* benchtests/bench-strncasecmp.c: Likewise.
	* benchtests/bench-strncat.c: Likewise.
	* benchtests/bench-strncmp.c: Likewise.
	* benchtests/bench-strncpy.c: Likewise.
	* benchtests/bench-strnlen.c: Likewise.
	* benchtests/bench-strpbrk.c: Likewise.
	* benchtests/bench-strrchr.c: Likewise.
	* benchtests/bench-strspn.c: Likewise.
	* benchtests/bench-strstr.c: Likewise.
2013-09-04 15:40:12 +01:00
Will Newton
cae16d6675 benchtests/Makefile: Use LDLIBS instead of LDFLAGS.
LDFLAGS puts the library too early in the command line if --as-needed
is being used. Use LDLIBS instead.

ChangeLog:

2013-09-04  Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile: Use LDLIBS instead of LDFLAGS.
2013-09-04 15:38:41 +01:00
Adhemerval Zanella
85c2e6110c Fix loop construction to functions calls
Check wheter the compiler has the option -fno-tree-loop-distribute-patterns
to inhibit loop transformation to library calls and uses it on memset
and memmove default implementation to avoid recursive calls.
2013-06-20 19:42:05 -05:00
Siddhesh Poyarekar
94aca5e740 Port remaining string benchmarks
There were a few more string benchmarks (strcpy_chk and stpcpy_check)
in the debug directory that needed to be ported over.
2013-06-11 20:51:55 +05:30
Siddhesh Poyarekar
9702047480 Copy over string performance tests into benchtests
Copy over already existing string performance tests into benchtests.
Bits not related to performance measurements have been omitted.
2013-06-11 15:08:13 +05:30
Siddhesh Poyarekar
c1f75dc386 Begin porting string performance tests to benchtests
This is the initial support for string function performance tests,
along with copying tests for memcpy and memcpy-ifunc as proof of
concept.  The string function benchmarks perform operations at
different alignments and for different sizes and compare performance
between plain operations and the optimized string operations.  Due to
this their output is incompatible with the function benchmarks where
we're interested in fastest time, throughput, etc.

In future, the correctness checks in the benchmark tests can be
removed.  Same goes for the performance measurements in the
string/test-*.
2013-06-11 15:08:13 +05:30
Siddhesh Poyarekar
50b818bf96 Avoid overwriting earlier flags in CPPFLAGS-nonlib in benchtests
When setting BENCH_DURATION in CPPFLAGS-nonlib, append to the variable
instead of assigning to it, to avoid overwriting earlier set flags,
notably the -DNOT_IN_libc=1 flag.
2013-06-10 10:08:46 +05:30
Siddhesh Poyarekar
3ce9e01097 Sort benchmark functions 2013-05-22 11:07:39 +05:30
Siddhesh Poyarekar
051063c88b Add benchmark inputs for math functions
Add benchmark inputs for inverse and hyperbolic trigonometric
functions and log.
2013-05-22 11:07:33 +05:30
Siddhesh Poyarekar
fef94eab0b Add a README for benchtests
Move instructions from the Makefile here and expand on them.
2013-05-21 14:59:50 +05:30
Siddhesh Poyarekar
48a18de1e1 Prevent optimizing out of benchmark function call
Resolves: #15424

The compiler would optimize the benchmark function call out of the
loop and call it only once, resulting in blazingly fast times for some
benchmarks (notably atan, sin and cos).  Mark the inputs as volatile
so that the code is forced to read again from the input for each
iteration.
2013-05-17 19:10:33 +05:30
Siddhesh Poyarekar
43fe811b73 Use HP_TIMING for benchmarks if available
HP_TIMING uses native timestamping instructions if available, thus
greatly reducing the overhead of recording start and end times for
function calls.  For architectures that don't have HP_TIMING
available, we fall back to the clock_gettime bits.  One may also
override this by invoking the benchmark as follows:

  make USE_CLOCK_GETTIME=1 bench

and get the benchmark results using clock_gettime.  One has to do
`make bench-clean` to ensure that the benchmark programs are rebuilt.
2013-05-13 13:44:32 +05:30
Siddhesh Poyarekar
5c637fe5ee Fix coding style 2013-05-10 17:44:27 +05:30
Ondrej Bilka
bb7cf681e9 Preheat CPU in benchtests.
A benchmark could be skewed by CPU initialy working on minimal
frequency and speeding up later. We first run code in loop
to partialy fix this issue.
2013-05-08 08:25:08 +02:00
Siddhesh Poyarekar
f0ee064b7d Allow multiple input domains to be run in the same benchmark program
Some math functions have distinct performance characteristics in
specific domains of inputs, where some inputs return via a fast path
while other inputs require multiple precision calculations, that too
at different precision levels.  The way to implement different domains
was to have a separate source file and benchmark definition, resulting
in separate programs.

This clutters up the benchmark, so this change allows these domains to
be consolidated into the same input file.  To do this, the input file
format is now enhanced to allow comments with a preceding # and
directives with two # at the begining of a line.  A directive that
looks like:

tells the benchmark generation script that what follows is a different
domain of inputs.  The value of the 'name' directive (in this case,
foo) is used in the output.  The two input domains are then executed
sequentially and their results collated separately.  with the above
directive, there would be two lines in the result that look like:

func(): ....
func(foo): ...
2013-04-30 14:17:57 +05:30
Siddhesh Poyarekar
d569c6eeb4 Maintain runtime of each benchmark at ~10 seconds
The idea to run benchmarks for a constant number of iterations is
problematic.  While the benchmarks may run for 10 seconds on x86_64,
they could run for about 30 seconds on powerpc and worse, over 3
minutes on arm.  Besides that, adding a new benchmark is cumbersome
since one needs to find out the number of iterations needed for a
sufficient runtime.

A better idea would be to run each benchmark for a specific amount of
time.  This patch does just that.  The run time defaults to 10 seconds
and it is configurable at command line:

  make BENCH_DURATION=5 bench
2013-04-30 14:10:20 +05:30
Siddhesh Poyarekar
45d69176e8 Mention files in which fast/slow paths of math functions are implemented 2013-04-24 14:07:40 +05:30
Adhemerval Zanella
3c0265394d PowerPC: modf optimization
This patch implements modf/modff optimization for POWER by focus
on FP operations instead of relying in integer ones.
2013-04-23 13:38:52 -05:00
Siddhesh Poyarekar
037714dd49 Add benchmark inputs for cos and tan 2013-04-17 17:45:55 +05:30
Siddhesh Poyarekar
4856bcd2df Define NOT_IN_libc when compiling benchmark programs 2013-04-16 18:34:03 +05:30
Siddhesh Poyarekar
a296407432 Add target bench-clean 2013-04-16 14:07:21 +05:30
Siddhesh Poyarekar
206a669911 Write to bench.out-tmp only once
Appending benchmark program output on every run could result in a case
where the benchmark run was cancelled, resulting in a partially
written file.  This file gets used again on the next run, resulting in
results being appended to old results.

It could have been possible to remove the file before every benchmark
run, but it is easier to just write the output to bench.out-tmp only
once.
2013-04-15 13:53:35 +05:30
Siddhesh Poyarekar
acb4325fc7 Rebuild benchmark sources when Makefile is updated
Benchmark programs are generated using parameters from the Makefile,
so it is necessary to rebuild them whenever the parameters in the
Makefile are updated.  Hence, added a dependency for the generated C
source on the Makefile so that it gets regenerated when the Makefile
is updated.
2013-04-15 11:17:01 +05:30
Siddhesh Poyarekar
8fc1bee546 Move bench target to benchtests
The bench target will only be used within the benchtests directory.
2013-04-12 15:01:44 +05:30
Siddhesh Poyarekar
64aabd4b80 Add benchmark inputs for atan
Add separate inputs for slow and fast paths of atan
2013-04-03 15:50:15 +05:30
Siddhesh Poyarekar
92e3664bb5 Add benchmark inputs for sin 2013-04-02 17:48:47 +05:30
Siddhesh Poyarekar
81f311c2ee Add benchmark tests for slowpow and slowexp
Separate benchmarks for the fast and slow implementations of pow and
exp since measuring both together doesn't make sense.  Adjust the
iterations for pow and exp accordingly so that they run long enough
for the measurements to be meaningful.
2013-04-02 17:45:45 +05:30