152 Commits

Author SHA1 Message Date
Florian Weimer
4504783c0f benchtests: Do not compile benchmark objects as libc modules [BZ #21864]
Otherwise, this will lead to link failures due to hidden symbol
references.
2017-08-21 19:28:54 +02:00
Wilco Dijkstra
d4505b895f Add math benchmark latency test
This patch further improves math function benchmarking by adding a latency
test in addition to throughput.  This enables more accurate comparisons of the
math functions. The latency test works by creating a dependency on the previous
iteration: func_res = F (func_res * zero + input[i]). The multiply by zero
avoids changing the input.

It reports reciprocal throughput and latency in nanoseconds (depending on the
timing header used) and max/min throughput in iterations per second:

   "workload-spec2006.wrf": {
    "reciprocal-throughput": 100,
    "latency": 200,
    "max-throughput": 1.0e+07,
    "min-throughput": 5.0e+06
   }
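
The difference between the two measurements can be sketched in C as follows.  This is only an illustration of the dependency trick described above, not the code in bench-skeleton.c: math_fn, twice, run_throughput, run_latency and ns_to_iter_per_sec are hypothetical names, and the conversion assumes that the timing header reports nanoseconds.

```c
#include <stddef.h>

typedef double (*math_fn) (double);

/* Hypothetical stand-in for the math function under test.  */
static double
twice (double x)
{
  return 2.0 * x;
}

/* Throughput-style loop: no call depends on the result of an earlier call,
   so the CPU is free to overlap successive calls.  */
static double
run_throughput (math_fn f, const double *input, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += f (input[i]);
  return sum;
}

/* Latency-style loop: every argument depends on the previous result, so
   each call has to wait for the one before it.  For finite results the
   multiply by zero leaves the input value unchanged.  */
static double
run_latency (math_fn f, const double *input, size_t n)
{
  /* Read zero through a volatile so the compiler cannot fold the multiply
     away and break the dependency chain.  */
  volatile double zero_source = 0.0;
  double zero = zero_source;
  double func_res = 0.0;
  for (size_t i = 0; i < n; i++)
    func_res = f (func_res * zero + input[i]);
  return func_res;
}

/* Convert a time per iteration in nanoseconds to iterations per second.  */
static double
ns_to_iter_per_sec (double ns_per_iter)
{
  return 1e9 / ns_per_iter;
}
```

Under the nanosecond assumption, the sample output above is self-consistent: a reciprocal throughput of 100 converts to a max-throughput of 1.0e+07, and a latency of 200 converts to a min-throughput of 5.0e+06.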

	* benchtests/bench-skeleton.c (main): Add support for
	latency benchmarking.
	* benchtests/scripts/bench.py: Add support for latency benchmarking.
2017-08-17 16:27:20 +01:00
Siddhesh Poyarekar
86c6519ee7 benchtests: Print json in memmove benchmark
Make the memmove benchmarks (bench-memmove and bench-memmove-large)
print their output in JSON so that they can be evaluated using the
compare_strings.py script.

	* benchtests/bench-memmove-large.c: Print output in JSON
	format.
	* benchtests/bench-memmove.c: Likewise.
2017-08-11 12:19:27 +05:30
Siddhesh Poyarekar
61c982910d benchtests: Remove verification runs from benchmark tests
The test run is unnecessary and interferes with the benchmark.  The
tests are done during make check, so they're unnecessary here.

	* benchtests/bench-memccpy.c (do_one_test): Remove checks.
	* benchtests/bench-memchr.c (do_one_test): Likewise.
	* benchtests/bench-memcpy-large.c (do_one_test): Likewise.
	* benchtests/bench-memcpy.c (do_one_test): Likewise.
	* benchtests/bench-memmove-large.c (do_one_test): Likewise.
	* benchtests/bench-memmove.c (do_one_test): Likewise.
	* benchtests/bench-memset-large.c (do_one_test): Likewise.
	* benchtests/bench-memset.c (do_one_test): Likewise.
	* benchtests/bench-string.h (test_init): Remove memsets.
2017-08-11 12:19:26 +05:30
Siddhesh Poyarekar
dd3e86ad7c benchtests: Avoid a display error when running in text terminal
The compare_strings.py script generates a graph for the benchmarks it
performs a comparison on and that fails if X is not available.  Avoid
the error and ensure that only the graph is generated and saved as a
PNG file.

	* benchtests/scripts/compare_strings.py: Avoid display error
	when generating graph.
2017-08-08 00:56:10 +05:30
Siddhesh Poyarekar
b115e819af benchtests: Allow selecting baseline for compare_strings.py
This patch allows one to provide the function name using an optional
-base option to compare all other functions against.  This is useful
when pitching one implementation of a string function against
alternatives.  In the absence of this option, comparisons are done
against the first ifunc in the list.

	* benchtests/scripts/compare_strings.py (main): Add an
	optional -base option.
	(process_results): New argument base_func.
2017-08-08 00:55:12 +05:30
Siddhesh Poyarekar
7ee38e6040 benchtests: Use TEST_NAME instead of hardcoding memcpy
The hardcoded 'memcpy' name turns up in other derived tests like
mempcpy.

       * benchtests/bench-memcpy.c (test_main): Use TEST_NAME instead of
       hardcoding memcpy.
       * benchtests/bench-memcpy-large.c (test_name): Likewise.
       * benchtests/bench-memcpy-random.c (test_name): Likewise.
2017-08-08 00:44:00 +05:30
Siddhesh Poyarekar
25d5247277 benchtests: New script to parse memcpy results
Read the memcpy results in json and print out the results in tabular
form, in addition to generating a graph of the results to compare all
of the implementations.

The format of the output is extensible enough to allow this kind of
analysis to be done on other string functions as well.

	* benchtests/scripts/benchout_strings.schema.json: New file.
	* benchtests/scripts/compare_strings.py: New file.
2017-06-22 23:44:51 +05:30
Siddhesh Poyarekar
5ee1e3cebc benchtests: Make memcpy benchmarks print results in json
Print the benchmark output for various memcpy benchmarks in json so
that it can be predictably parsed and analyzed.

	* benchtests/bench-memcpy-large.c: Include json-lib.h.
	(do_one_test): Print json.
	(do_test): Likewise.
	(test_main): Likewise.
	* benchtests/bench-memcpy-random.c: Include json-lib.h.
	(do_one_test): Print json.
	(do_test): Likewise.
	(test_main): Likewise.
	* benchtests/bench-memcpy.c: Include json-lib.h.
	(do_one_test): Print json.
	(do_test): Likewise.
	(test_main): Likewise.
2017-06-22 23:44:19 +05:30
Siddhesh Poyarekar
738a9914a0 benchtests: Print string array elements, int and uint in json
Enhance the json module in benchtests to print signed and unsigned
integers and string array elements.

	* benchtests/json-lib.h: Include inttypes.h.
	(json_attr_int, json_attr_uint, json_element_string,
	json_element_int, json_element_uint): New functions.
	* benchtests/json-lib.c (json_attr_int, json_attr_uint,
	json_element_string, json_element_int, json_element_uint): New
	functions.
2017-06-22 23:44:12 +05:30
Wilco Dijkstra
18b759355d Add powf trace
Add a workload for powf.  This is a reduced trace based on 2.3 billion
samples extracted from wrf.  The distribution of values, in particular
frequency of commonly used operands is the same as in the full trace.

    * benchtests/powf-inputs: Add reduced trace from wrf.
2017-06-20 16:50:37 +01:00
Wilco Dijkstra
beb52f502f Improve math benchmark infrastructure
Improve support for math function benchmarking.  This patch adds
a feature that allows accurate benchmarking of traces extracted
from real workloads.  This is done by iterating over all samples
rather than repeating each sample many times (which completely
ignores branch prediction and cache effects).  A trace can be
added to existing math function inputs via
"## name: workload-<name>", followed by the trace.

        * benchtests/README: Describe workload feature.
        * benchtests/bench-skeleton.c (main): Add support for
        benchmarking traces from workloads.
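
The two iteration orders contrasted in this message can be sketched as below.  This is a hedged illustration rather than the bench-skeleton.c code; math_fn, identity, bench_repeat_each and bench_trace are hypothetical names.

```c
#include <stddef.h>

typedef double (*math_fn) (double);

/* Hypothetical stand-in for the math function under test.  */
static double
identity (double x)
{
  return x;
}

/* Repeat each sample many times before moving on: the branch predictor
   and caches see the same input over and over.  */
static double
bench_repeat_each (math_fn f, const double *in, size_t n, size_t iters)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    for (size_t k = 0; k < iters; k++)
      sum += f (in[i]);
  return sum;
}

/* Workload order: walk the whole trace on every pass, so consecutive
   calls see different inputs, as they would in the traced program.  */
static double
bench_trace (math_fn f, const double *in, size_t n, size_t iters)
{
  double sum = 0.0;
  for (size_t k = 0; k < iters; k++)
    for (size_t i = 0; i < n; i++)
      sum += f (in[i]);
  return sum;
}
```

Both loops make the same number of calls with the same arguments; only the order differs, which is what exposes branch prediction and cache effects.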
2017-06-20 16:26:26 +01:00
Paul Clarke
4cedcaea8d Add powf bench tests
Add powf() bench test with input which covers these cases:
- positive base to positive exponent
- exponent 0
- negative base to even exponent
- exponent 1
- exponent -1
- squared
- squareroot
- 1 to negative exponent
- -1 to negative exponent
- base 0
- -1 to even exponent
- small base
- small exponent

	* benchtests/Makefile (bench-math): Add powf.
	* benchtests/powf-inputs: New file.
2017-06-20 10:14:42 -03:00
Adhemerval Zanella
0edbf12301 nptl: Invert the mmap/mprotect logic on allocated stacks (BZ#18988)
The current allocate_stack logic for creating stacks is to first mmap all
the required memory with the desired protection and then mprotect the
guard area with PROT_NONE if required.  Although it works as expected,
it pessimizes the allocation because it requires the kernel to actually
increase the commit charge (it counts against the physical/swap
memory available to the system).  This patch instead calls mmap with
PROT_NONE and then mprotects only the required area.
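
A minimal sketch of the inverted order, assuming a stack that grows down with the guard area at the low end of the mapping.  map_stack_sketch is a hypothetical helper and not glibc's allocate_stack; both sizes are assumed to be multiples of the page size.

```c
/* Ask glibc to expose MAP_ANONYMOUS even in strict ISO C mode.  */
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map SIZE bytes with no access rights, then make
   everything above the first GUARD bytes readable and writable.  SIZE and
   GUARD must be multiples of the page size, with GUARD smaller than SIZE.
   Returns NULL on failure.  */
static void *
map_stack_sketch (size_t size, size_t guard)
{
  /* Map everything PROT_NONE first, so the guard area never becomes
     accessible.  */
  void *mem = mmap (NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS,
		    -1, 0);
  if (mem == MAP_FAILED)
    return NULL;

  /* Only the usable part is made accessible; the guard stays PROT_NONE.  */
  if (mprotect ((char *) mem + guard, size - guard,
		PROT_READ | PROT_WRITE) != 0)
    {
      munmap (mem, size);
      return NULL;
    }
  return mem;
}
```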

The only issue is how to actually check this change, since the side effects
are Linux specific and accounting for them would require a kernel-specific
test that parses system-wide information.  On the kernel I checked,
/proc/self/statm does not show any meaningful difference in
vmm and/or rss before and after thread creation.  I could only see
meaningful information by checking the system-wide /proc/meminfo
between thread creations: MemFree, MemAvailable, and Committed_AS show
a large difference without the patch.  I think trying to use this
kind of information in a testcase is fragile.

BZ#18988 shows that the committed pages are easily seen with
mlockall (MCL_FUTURE) (which locks all pages that become mapped in the
process); however, a more straightforward testcase shows that pthread_create
can be faster with this patch:

--
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static const int inner_count = 256;
static const int outer_count = 128;

/* Thread attributes shared by every thread created below.  */
static pthread_attr_t a;

static
void *thread1(void *arg)
{
  return NULL;
}

static
void *sleeper(void *arg)
{
  pthread_t ts[inner_count];
  for (int i = 0; i < inner_count; i++)
    pthread_create (&ts[i], &a, thread1, NULL);
  for (int i = 0; i < inner_count; i++)
    pthread_join (ts[i], NULL);

  return NULL;
}

int main(void)
{
  pthread_attr_init(&a);
  pthread_attr_setguardsize(&a, 1<<20);
  pthread_attr_setstacksize(&a, 1134592);

  pthread_t ts[outer_count];
  for (int i = 0; i < outer_count; i++)
    pthread_create(&ts[i], &a, sleeper, NULL);
  for (int i = 0; i < outer_count; i++)
    {
      int r = pthread_join(ts[i], NULL);
      assert(r == 0);
    }
  return 0;
}

--

On x86_64 (4.4.0-45-generic, gcc 5.4.0) running the small benchtests
I see:

$ time ./test

real	0m3.647s
user	0m0.080s
sys	0m11.836s

While with the patch I see:

$ time ./test

real	0m0.696s
user	0m0.040s
sys	0m1.152s

So I added a pthread_create benchtest (thread_create) which checks
the thread creation latency.  As with the simple benchtest, I saw
improvements in thread creation on all architectures on which I tested
the change.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu.

	[BZ #18988]
	* benchtests/thread_create-inputs: New file.
	* benchtests/thread_create-source.c: Likewise.
	* support/xpthread_attr_setguardsize.c: Likewise.
	* support/Makefile (libsupport-routines): Add
	xpthread_attr_setguardsize object.
	* support/xthread.h: Add xpthread_attr_setguardsize prototype.
	* benchtests/Makefile (bench-pthread): Add thread_create.
	* nptl/allocatestack.c (allocate_stack): Call mmap with PROT_NONE and
	then mprotect the required area.
2017-06-14 17:22:35 -03:00
H.J. Lu
6b69f98dcd benchtests: Add more tests for memrchr
bench-memchr.c is shared with bench-memrchr.c.  This patch adds some
tests for positions close to the beginning for memrchr, which are
equivalent to positions close to the end for memchr.

	* benchtests/bench-memchr.c (do_test): Print out both length
	and position.
	(test_main): Also test the position close to the beginning for
	memrchr.
2017-06-04 09:45:09 -07:00
Zack Weinberg
2bfdaeddaa Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk.
cppflags-iterator.mk no longer has anything to do with CPPFLAGS; all
it does is set libof-$(foo) for a list of files.  extra-modules.mk
does the same thing, but with a different input variable, and doesn't
let the caller control the module.  Therefore, this patch gives
cppflags-iterator.mk a better name, removes extra-modules.mk, and
updates all uses of both.

	* extra-modules.mk: Delete file.
	* cppflags-iterator.mk: Rename to ...
	* libof-iterator.mk: ...this.  Adjust comments.

	* Makerules, extra-lib.mk, benchtests/Makefile, elf/Makefile
	* elf/rtld-Rules, iconv/Makefile, locale/Makefile, malloc/Makefile
	* nscd/Makefile, sunrpc/Makefile, sysdeps/s390/Makefile:
	Use libof-iterator.mk instead of cppflags-iterator.mk or
	extra-modules.mk.

	* benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove
	extra-modules.mk and cppflags-iterator.mk, add libof-iterator.mk.
2017-05-09 07:06:29 -04:00
Steve Ellcey
29d92a8eda Change TEST_NAME to memcpy to fix IFUNC testing of multiple versions.
* benchtests/bench-memcpy-random.c (TEST_NAME): Change to memcpy.
	(IMPL) Call with 1 instead of 0 as argument.
2017-03-28 09:07:03 -07:00
Siddhesh Poyarekar
d01cbb6e8e Actually add bench-memcpy-random
git-add and commit the benchmark that Wilco posted on the list.
2017-03-26 19:01:50 +05:30
Wilco Dijkstra
8d2030d659 Add a new randomized memcpy test for copies up to 256 bytes. The distribution
of the size and alignment is based on a trace of SPEC2006.  Instead of
repeating the same copy over and over again like the existing tests, it times
several thousand different copies to more accurately estimate the overhead of
branch prediction.

	* benchtests/Makefile (string-benchset): Add memcpy-random.
	* benchtests/bench-memcpy-random.c: New file.
2017-03-23 19:00:02 +00:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Siddhesh Poyarekar
8ce8299f94 Add configure check for python program
Add a configure check that looks for python3 and python in that order
since we had agreed in the past to prefer python3 over python in all
our code.  The patch also adjusts invocations through the various
Makefiles to use the set variable.

	* configure.ac: Check for python3 or python.
	* configure: Regenerated.
	* config.make.in (PYTHON): New variable.
	* benchtests/Makefile: Don't define PYTHON.
	(bench): Define target only if PYTHON was defined.
	* Rules: Don't define PYTHON.
	Define pretty printer targets only if PYTHON was defined.
	(tests-printers): Add to tests-unsupported if PYTHON is not
	found.
	(python-flags, python-invoke): Remove.
	(tests-printers-out): Use PYTHON instead of python-invoke.
2016-12-22 23:07:52 +05:30
Wilco Dijkstra
5625f666ce This patch cleans up the strsep implementation and improves performance.
Currently strsep calls strpbrk, which is now a veneer to strcspn.  Calling
strcspn directly is faster.  Since it handles a delimiter string of size
1 as a special case, this is not needed in strsep itself.  Although this
means there is a slightly higher overhead if the delimiter size is 1,
all other cases are slightly faster.  The overall performance gain is 5-10%
on AArch64.
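
As a hedged illustration, a strsep built directly on strcspn can look like the sketch below.  strsep_sketch is a hypothetical name, and this is not necessarily the exact code in string/strsep.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strsep variant: return the next token from *STRINGP and
   advance *STRINGP past the delimiter, or set it to NULL when the input
   is exhausted.  */
static char *
strsep_sketch (char **stringp, const char *delim)
{
  char *begin = *stringp;
  if (begin == NULL)
    return NULL;

  /* strcspn yields the length of the token, so the end of the token is
     known without a separate strpbrk call.  */
  char *end = begin + strcspn (begin, delim);
  if (*end != '\0')
    {
      /* Terminate the token and continue after the delimiter.  */
      *end++ = '\0';
      *stringp = end;
    }
  else
    /* No delimiter left: this is the last token.  */
    *stringp = NULL;

  return begin;
}
```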

The string/bits/string2.h header contains optimizations for constant
delimiters of size 1-3.  Benchmarking these showed similar performance for
size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3
can give up to 2x speedup for small input strings.  However if these cases
are common it seems much better to add this optimization to strcspn.
So move these header optimizations to string-inlines.c.

Improve the strsep benchmark so that it actually benchmarks something.
The current version contains a delimiter character at every position in the
input string, so there is very little work to do, and the extremely inefficient
simple_strsep implementation appears fastest in every case.  The new version
has no match in the input for the fail case and a match halfway in the
input for the success case.  The input is then restored so that each iteration
does exactly the same amount of work.  Reduce the number of testcases since
simple_strsep takes a lot of time now.

	* benchtests/bench-strsep.c (oldstrsep): Add old implementation.
	(do_one_test) Restore original string so iteration works.
	(do_test): Create better input strings.
	(test_main) Reduce number of testruns.
	* string/string-inlines.c (__old_strsep_1c): New function.
	(__old_strsep_2c): Likewise.
	(__old_strsep_3c): Likewise.
	* string/strsep.c (__strsep): Remove case of small delim string.
	Call strcspn directly rather than strpbrk.
	* string/bits/string2.h (__strsep): Remove define.
	(__strsep_1c): Remove.
	(__strsep_2c): Remove.
	(__strsep_3c): Remove.
	(strsep): Remove.
	* sysdeps/unix/sysv/linux/internal_statvfs.c
	(__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
Adhemerval Zanella
da16c9b524 benchtests: Add fmaxf/fminf benchmarks
This patch adds fmaxf and fminf benchtests.  It is based on the
math/s_fmax_template.c implementation, which checks for basically four
different classes:

  1. if x is greater than or equal to y.
  2. if x is less than y.
  3. if x or y is signaling.
  4. if y is nan.

Cases 1 and 2 are used for the default input numbers (by mixing normal double
numbers and infinity), while cases 3 and 4 are each used for a separate
benchmark class.

Checked on x86_64-linux-gnu and powerpc64-linux-gnu.

	* benchtests/Makefile (bench-math): Add fminf and fmaxf.
	(CFLAGS-bench-fmaxf.c): New rule.
	(CFLAGS-bench-fminf.c): Likewise.
        * benchtests/fmaxf-inputs: New file.
        * benchtests/fminf-inputs: Likewise.
2016-12-19 16:04:16 -02:00
Adhemerval Zanella
5d1f604a87 benchtests: Add fmax/fmin benchmarks
This patch adds fmax and fmin benchtests.  It is based on the
math/s_fmax_template.c implementation, which checks for basically four
different classes:

  1. if x is greater than or equal to y.
  2. if x is less than y.
  3. if x or y is signaling.
  4. if y is nan.

Cases 1 and 2 are used for the default input numbers (by mixing normal double
numbers and infinity), while cases 3 and 4 are each used for a separate
benchmark class.
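
The order of the checks listed above can be sketched as follows.  This is a simplified illustration, not the contents of math/s_fmax_template.c: fmax_sketch is a hypothetical name, and the signaling NaN check (class 3) is left out because it needs a non-standard macro.

```c
#include <math.h>

/* Hypothetical, simplified fmax.  The signaling NaN case (class 3) is
   deliberately omitted from this sketch.  */
static double
fmax_sketch (double x, double y)
{
  if (isgreaterequal (x, y))	/* Class 1: x >= y.  */
    return x;
  else if (isless (x, y))	/* Class 2: x < y.  */
    return y;
  else if (isnan (y))		/* Class 4: y is a NaN, so return x.  */
    return x;
  else				/* Remaining case: only x is a NaN.  */
    return y;
}
```

The comparison macros return false without raising an exception when either operand is a NaN, which is why the NaN classes are only reached after both ordered comparisons fail.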

Checked on x86_64-linux-gnu and powerpc64-linux-gnu.

	* benchtests/Makefile (bench-math): Add fmin and fmax.
	(CFLAGS-bench-fmax.c): New rule.
	(CFLAGS-bench-fmin.c): New rule.
	* benchtests/fmax-inputs: New file.
	* benchtests/fmin-inputs: Likewise.
2016-12-19 16:04:16 -02:00
Adhemerval Zanella
b598e13477 Adjust benchtests to new support library.
This patch basically replaces the test-skeleton.c inclusion by
support/test-driver.c and also minor adjustments in bench-string.h.

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.

	* benchtests/bench-string.h (TEST_FUNCTION): Use name without
	parenthesis.
	(CMDLINE_PROCESS): Define using function instead of macro.
	* benchtests/bench-memccpy.c: Include <support/test-driver.c> instead
	of test-skeleton.
	* benchtests/bench-memchr.c: Likewise.
	* benchtests/bench-memcmp.c: Likewise.
	* benchtests/bench-memcpy-large.c: Likewise.
	* benchtests/bench-memcpy.c: Likewise.
	* benchtests/bench-memmem.c: Likewise.
	* benchtests/bench-memmove-large.c: Likewise.
	* benchtests/bench-memmove.c: Likewise.
	* benchtests/bench-memset-large.c: Likewise.
	* benchtests/bench-memset.c: Likewise.
	* benchtests/bench-rawmemchr.c: Likewise.
	* benchtests/bench-strcasecmp.c: Likewise.
	* benchtests/bench-strcasestr.c: Likewise.
	* benchtests/bench-strcat.c: Likewise.
	* benchtests/bench-strchr.c: Likewise.
	* benchtests/bench-strcmp.c: Likewise.
	* benchtests/bench-strcpy.c: Likewise.
	* benchtests/bench-strcpy_chk.c: Likewise.
	* benchtests/bench-strlen.c: Likewise.
	* benchtests/bench-strncasecmp.c: Likewise.
	* benchtests/bench-strncmp.c: Likewise.
	* benchtests/bench-strncpy.c: Likewise.
	* benchtests/bench-strnlen.c: Likewise.
	* benchtests/bench-strpbrk.c: Likewise.
	* benchtests/bench-strrchr.c: Likewise.
	* benchtests/bench-strsep.c: Likewise.
	* benchtests/bench-strspn.c: Likewise.
	* benchtests/bench-strstr.c: Likewise.
	* benchtests/bench-strtok.c: Likewise.
2016-12-19 16:04:16 -02:00
Siddhesh Poyarekar
009ba649b4 Link benchset tests against libsupport
Benchsets in benchtests use test-skeleton, so they too need to be
linked against the new libsupport DSO.

       * benchtests/Makefile (binaries-benchset): Depend on libsupport
       DSO.
2016-12-18 01:22:29 +05:30
Wilco Dijkstra
d58ab810a6 Improve strtok and strtok_r performance. Instead of calling strpbrk which
calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr.  Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string.  Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.

Remove the special header optimization for strtok_r of a 1-character
constant string - both strspn and strcspn contain optimizations for this
case.  Benchmarking this showed similar performance in the worst case,
but up to 5.5x better performance in the "found" case for large inputs.
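
A hedged sketch of the approach described above: skip leading delimiters with strspn, find the end of the token with strcspn, and exit early on an empty string.  strtok_r_sketch is a hypothetical name and is not necessarily identical to string/strtok_r.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strtok_r variant.  As with strtok_r, the first call passes
   the string in S and later calls pass NULL.  */
static char *
strtok_r_sketch (char *s, const char *delim, char **save_ptr)
{
  if (s == NULL)
    s = *save_ptr;

  /* Early exit for an empty string: no strspn or strcspn call is needed
     after the last token.  */
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* Skip leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* strcspn gives the end of the token directly.  */
  char *end = s + strcspn (s, delim);
  if (*end == '\0')
    {
      *save_ptr = end;
      return s;
    }

  /* Terminate the token and make *SAVE_PTR point past it.  */
  *end = '\0';
  *save_ptr = end + 1;
  return s;
}
```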

	* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
	* string/strtok.c (strtok): Change to tailcall __strtok_r.
	* string/strtok_r.c (__strtok_r): Optimize for performance.
	* string/string-inlines.c (__old_strtok_r_1c): New function.
	* string/bits/string2.h (__strtok_r): Move to string-inlines.c.
2016-12-14 15:12:18 +00:00
Joseph Myers
7a8330c01b Use -fno-builtin for sqrt benchmark.
This patch makes the sqrt benchmark use -fno-builtin, as already done
for benchmarks of ffs and ffsll, so that it actually benchmarks the
glibc function as (presumably) intended even in the presence of the
compiler inlining sqrt.

Tested for x86_64 and also used for benchmarking my ARM sqrt patch.

	* benchtests/Makefile (CFLAGS-bench-sqrt.c): New variable.
2016-10-21 21:18:03 +00:00
H.J. Lu
447720b03b Clear destination buffer updated by the previous run
Clear the destination buffer updated by the previous run in bench-memcpy.c
and test-memcpy.c to catch the error when the following implementations do
not copy anything.

	[BZ #19907]
	* benchtests/bench-memcpy.c (do_one_test): Clear the destination
	buffer updated by the previous run.
	* string/test-memcpy.c (do_one_test): Likewise.
	* benchtests/bench-memmove.c (do_one_test): Add a comment.
	* string/test-memmove.c (do_one_test): Likewise.
2016-05-18 05:51:59 -07:00
Siddhesh Poyarekar
2d304f3c6f benchtests: Support for cross-building benchmarks
This patch adds full support for cross-building benchmarks.  Some
benchmarks like those that need locales to be generated cannot be
built and are hence skipped for cross builds.

Tested by cross building for aarch64 on x86_64 and then running the
generated benchmark on aarch64.

	* benchtests/Makefile (wcsmbs-benchset): Include only for
	native builds and runs.
	(LOCALES): Likewise.
	(bench-build): Build timing-type here instead of the bench
	target.  Generate locale only for native builds.
	* benchtests/README: Add note for cross-building.
2016-04-20 13:19:01 +05:30
Siddhesh Poyarekar
d7aea0cf06 benchtests: Clean up extra-objs
The bench-clean target would leave behind json-lib.o.  Fix up to clean
up all extra-objs registered in benchtests.
2016-04-20 13:15:50 +05:30
Siddhesh Poyarekar
bfdda211c6 benchtests: Update README to include instructions for bench-build target 2016-04-20 10:58:20 +05:30
Siddhesh Poyarekar
a00d3f4a8c New make target to only build benchmark binaries
For situations where we are cross-building or where we want to avoid
building on the target system, we want a way to only build benchmarks
and then copy them over to the target system to run them.  I have also
added a simple enhancement for the 'bench' target where all benchmark
binaries are built and then the benchmarks executed.

Tested on arm.

	Makefile.in (bench-build): New target.
	Rules (PHONY): Add bench-build target.
	benchtests/Makefile (bench): Depend on bench-build.
	(bench-build): New target.
2016-04-20 10:23:28 +05:30
Mike Frysinger
20003c4988 localedata: iw_IL: delete old/deprecated locale [BZ #16137]
From the bug:
Obsolete locale.  The ISO-639 code for Hebrew was changed from 'iw'
to 'he' in 1989, according to Bruno Haible on libc-alpha 2003-09-01.

Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
2016-04-08 18:56:34 -04:00
H.J. Lu
a25322f4e8 Add memcpy/memmove/memset benchmarks with large data
Add memcpy, memmove and memset benchmarks with large data sizes.

	* benchtests/Makefile (string-benchset): Add memcpy-large,
	memmove-large and memset-large.
	* benchtests/bench-memcpy-large.c: New file.
	* benchtests/bench-memmove-large.c: Likewise.
	* benchtests/bench-memset-large.c: Likewise.
	* benchtests/bench-string.h (TIMEOUT): Don't redefine.
2016-04-06 08:37:39 -07:00
H.J. Lu
344303f3cf Test 64-byte alignment in memset benchtest
Add 64-byte alignment tests in memset benchtest for 64-byte vector
registers.

	* benchtests/bench-memset.c (do_test): Support 64-byte
	alignment.
	(test_main): Test 64-byte alignment.
2016-04-01 10:00:12 -07:00
H.J. Lu
aea44bf61a Test 64-byte alignment in memmove benchtest
Add 64-byte alignment tests in memmove benchtest for 64-byte vector
registers.

	* benchtests/bench-memmove.c (test_main): Test 64-byte
	alignment.
2016-04-01 09:59:09 -07:00
H.J. Lu
32b28d24a1 Test 64-byte alignment in memcpy benchtest
Add 64-byte alignment tests in memcpy benchtest for 64-byte vector
registers.

	* benchtests/bench-memcpy.c (test_main): Test 64-byte alignment.
2016-04-01 09:57:53 -07:00
H.J. Lu
87da630b22 Support --enable-hardcoded-path-in-tests in benchtests
benchtests should use $(test-via-rtld-prefix) and $(+link-tests) like
other glibc tests.

	[BZ #19783]
	* benchtests/Makefile (run-bench): Replace $(rtld-prefix) with
	$(test-via-rtld-prefix).
	($(binaries-bench)): Replace $(+link) with $(+link-tests).
2016-03-08 04:53:38 -08:00
Carlos O'Donell
67fc563718 Use $(PYTHON) to run benchtests python files. 2016-01-13 11:00:57 -05:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Siddhesh Poyarekar
aad287f35a benchtests: ffs and ffsll are string functions, not math
The ffs and ffsll functions were listed as math functions when they
are actually defined in strings.h and string.h respectively.  Shuffle
around the Makefile variables a bit and make a separate space for ffs
and ffsll.
2015-12-09 00:15:15 +05:30
Siddhesh Poyarekar
520e7edb85 benchtests: Add inputs from sin and cos to sincos
The sincos benchmark has only about a dozen inputs, which don't measure
the impact of changes to various passes.  Since many of the code's
properties are inherited from sin and cos, copy those inputs in to get
more comprehensive coverage.
2015-12-09 00:10:51 +05:30
Siddhesh Poyarekar
4916acd87b benchtests: Mark output variables as used
Prevent calls to functions that don't return anything from being optimized
out by the compiler by marking their output variables as used.

This prevents the sincos function call from being optimized out in the
benchmark.
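
One common way to mark an output as used is an empty inline asm statement that takes the variable's address.  The sketch below assumes a GCC-compatible compiler and is not necessarily what the benchtests generate; MARK_USED, sincos_like and bench_void_func are hypothetical names.

```c
#include <stddef.h>

/* Assumption: GCC-style extended asm.  The empty asm takes the address of
   the object and clobbers memory, so the compiler must assume the stored
   value is read and cannot drop the call that produced it.  */
#define MARK_USED(var) __asm__ volatile ("" : : "r" (&(var)) : "memory")

/* Hypothetical stand-in for a function such as sincos that returns its
   results only through pointers.  */
static void
sincos_like (double x, double *a, double *b)
{
  *a = x * 2.0;
  *b = x + 1.0;
}

/* Call the function once per input and return the number of calls.  */
static size_t
bench_void_func (const double *in, size_t n)
{
  size_t calls = 0;
  for (size_t i = 0; i < n; i++)
    {
      double a, b;
      sincos_like (in[i], &a, &b);
      /* Without these, the compiler could remove the call entirely,
	 because neither output is otherwise read.  */
      MARK_USED (a);
      MARK_USED (b);
      calls++;
    }
  return calls;
}
```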
2015-11-17 16:01:15 +05:30
Wilco Dijkstra
cb2f668d46 Add a new benchmark for isinf/isnan/isnormal/isfinite/fpclassify. The test uses 2 arrays with 1024 doubles, one with 99% finite FP numbers (10% zeroes, 10% negative) and 1% inf/NaN, the other with 50% inf and 50% NaN.
ChangeLog:
2015-09-18  Wilco Dijkstra  <wdijkstr@arm.com>

	* benchtests/Makefile: Add bench-math-inlines, link with libm.
	* benchtests/bench-math-inlines.c: New benchmark.
	* benchtests/bench-util.h: New file.
	* benchtests/bench-util.c: New file.
	* benchtests/bench-skeleton.c: Add include of bench-util.c/h.
2015-09-18 16:02:38 +01:00
Stefan Liebler
f21216015b S390: Optimize wmemcmp.
This patch provides an optimized version of wmemcmp with the z13 vector
instructions.

ChangeLog:

	* sysdeps/s390/multiarch/wmemcmp-c.c: New File.
	* sysdeps/s390/multiarch/wmemcmp-vx.S: Likewise.
	* sysdeps/s390/multiarch/wmemcmp.c: Likewise.
	* sysdeps/s390/multiarch/Makefile
	(sysdep_routines): Add wmemcmp functions.
	* sysdeps/s390/multiarch/ifunc-impl-list-common.c
	(__libc_ifunc_impl_list_common): Add ifunc test for wmemcmp.
	* benchtests/bench-wmemcmp.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wmemcmp.
2015-08-26 10:26:25 +02:00
Stefan Liebler
2e9e166761 S390: Optimize wmemset.
This patch provides an optimized version of wmemset with the z13 vector
instructions.

ChangeLog:

	* sysdeps/s390/multiarch/wmemset-c.c: New File.
	* sysdeps/s390/multiarch/wmemset-vx.S: Likewise.
	* sysdeps/s390/multiarch/wmemset.c: Likewise.
	* sysdeps/s390/multiarch/Makefile
	(sysdep_routines): Add wmemset functions.
	* sysdeps/s390/multiarch/ifunc-impl-list-common.c
	(__libc_ifunc_impl_list_common): Add ifunc test for wmemset.
	* wcsmbs/wmemset.c: Use WMEMSET if defined.
	* string/test-memset.c: Add wmemset support.
	* wcsmbs/test-wmemset.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wmemset.
	* benchtests/bench-memset.c: Add wmemset support.
	* benchtests/bench-wmemset.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wmemset.
2015-08-26 10:26:25 +02:00
Stefan Liebler
88eefd344b S390: Optimize memchr, rawmemchr and wmemchr.
This patch provides optimized versions of memchr, rawmemchr and wmemchr with the
z13 vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/memchr-vx.S: New File.
	* sysdeps/s390/multiarch/memchr.c: Likewise.
	* sysdeps/s390/multiarch/rawmemchr-c.c: Likewise.
	* sysdeps/s390/multiarch/rawmemchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/rawmemchr.c: Likewise.
	* sysdeps/s390/multiarch/wmemchr-c.c: Likewise.
	* sysdeps/s390/multiarch/wmemchr-vx.S: Likewise.
	* sysdeps/s390/multiarch/wmemchr.c: Likewise.
	* sysdeps/s390/s390-32/multiarch/memchr.c: Likewise.
	* sysdeps/s390/s390-64/multiarch/memchr.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add memchr, wmemchr
	and rawmemchr functions.
	* sysdeps/s390/multiarch/ifunc-impl-list-common.c
	(__libc_ifunc_impl_list_common): Add ifunc test for memchr, rawmemchr
	and wmemchr.
	* wcsmbs/wmemchr.c: Use WMEMCHR if defined.
	* string/test-memchr.c: Add wmemchr support.
	* wcsmbs/test-wmemchr.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wmemchr.
	* benchtests/bench-memchr.c: Add wmemchr support.
	* benchtests/bench-wmemchr.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wmemchr.
2015-08-26 10:26:24 +02:00
Stefan Liebler
b4c21601b1 S390: Optimize strcspn and wcscspn.
This patch provides optimized versions of strcspn and wcscspn with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strcspn-c.c: New File.
	* sysdeps/s390/multiarch/strcspn-vx.S: Likewise.
	* sysdeps/s390/multiarch/strcspn.c: Likewise.
	* sysdeps/s390/multiarch/wcscspn-c.c: Likewise.
	* sysdeps/s390/multiarch/wcscspn-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcscspn.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strcspn and
	wcscspn functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strcspn, wcscspn.
	* wcsmbs/wcscspn.c: Use WCSCSPN if defined.
	* string/test-strcspn.c: Add wcscspn support.
	* wcsmbs/test-wcscspn.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcscspn.
	* benchtests/bench-strcspn.c: Add wcscspn support.
	* benchtests/bench-wcscspn.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcscspn.
2015-08-26 10:26:24 +02:00
Stefan Liebler
f0ba659847 S390: Optimize strpbrk and wcspbrk.
This patch provides optimized versions of strpbrk and wcspbrk with the z13
vector instructions.

ChangeLog:

	* sysdeps/s390/multiarch/strpbrk-c.c: New File.
	* sysdeps/s390/multiarch/strpbrk-vx.S: Likewise.
	* sysdeps/s390/multiarch/strpbrk.c: Likewise.
	* sysdeps/s390/multiarch/wcspbrk-c.c: Likewise.
	* sysdeps/s390/multiarch/wcspbrk-vx.S: Likewise.
	* sysdeps/s390/multiarch/wcspbrk.c: Likewise.
	* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add strpbrk and
	wcspbrk functions.
	* sysdeps/s390/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ifunc test for strpbrk, wcspbrk.
	* wcsmbs/wcspbrk.c: Use WCSPBRK if defined.
	* string/test-strpbrk.c: Add wcspbrk support.
	* wcsmbs/test-wcspbrk.c: New File.
	* wcsmbs/Makefile (strop-tests): Add wcspbrk.
	* benchtests/bench-strpbrk.c: Add wcspbrk support.
	* benchtests/bench-wcspbrk.c: New File.
	* benchtests/Makefile (wcsmbs-bench): Add wcspbrk.
2015-08-26 10:26:24 +02:00