glibc/benchtests/bench-strsep.c

212 lines
4.9 KiB
C
Raw Normal View History

2013-11-18 12:03:07 +00:00
/* Measure strsep functions.
Copyright (C) 2013-2019 Free Software Foundation, Inc.
2013-11-18 12:03:07 +00:00
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#define TEST_MAIN
#define TEST_NAME "strsep"
#include "bench-string.h"
char *
simple_strsep (char **s1, char *s2)
{
char *begin;
char *s;
size_t j = 0;
begin = *s1;
s = begin;
if (begin == NULL)
return NULL;
ssize_t s2len = strlen (s2);
while (*s)
{
for (j = 0; j < s2len; j++)
{
if (*s == s2[j])
{
s[0] = '\0';
*s1 = s + 1;
return begin;
}
}
s++;
}
*s1 = NULL;
return begin;
}
This patch cleans up the strsep implementation and improves performance. Currently strsep calls strpbrk is is now a veneer to strcspn. Calling strcspn directly is faster. Since it handles a delimiter string of size 1 as a special case, this is not needed in strsep itself. Although this means there is a slightly higher overhead if the delimiter size is 1, all other cases are slightly faster. The overall performance gain is 5-10% on AArch64. The string/bits/string2.h header contains optimizations for constant delimiters of size 1-3. Benchmarking these showed similar performance for size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3 can give up to 2x speedup for small input strings. However if these cases are common it seems much better to add this optimization to strcspn. So move these header optimizations to string-inlines.c. Improve the strsep benchmark so that it actually benchmarks something. The current version contains a delimiter character at every position in the input string, so there is very little work to do, and the extremely inefficent simple_strsep implementation appears fastest in every case. The new version has either no match in the input for the fail case and a match halfway in the input for the success case. The input is then restored so that each iteration does exactly the same amount of work. Reduce the number of testcases since simple_strsep takes a lot of time now. * benchtests/bench-strsep.c (oldstrsep): Add old implementation. (do_one_test) Restore original string so iteration works. * string/string-inlines.c (do_test): Create better input strings. (test_main) Reduce number of testruns. * string/string-inlines.c (__old_strsep_1c): New function. (__old_strsep_2c): Likewise. (__old_strsep_3c): Likewise. * string/strsep.c (__strsep): Remove case of small delim string. Call strcspn directly rather than strpbrk. * string/bits/string2.h (__strsep): Remove define. (__strsep_1c): Remove. (__strsep_2c): Remove. (__strsep_3c): Remove. (strsep): Remove. * sysdeps/unix/sysv/linux/internal_statvfs.c (__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
char *
oldstrsep (char **stringp, const char *delim)
{
char *begin, *end;
begin = *stringp;
if (begin == NULL)
return NULL;
/* A frequent case is when the delimiter string contains only one
character. Here we don't need to call the expensive `strpbrk'
function and instead work using `strchr'. */
if (delim[0] == '\0' || delim[1] == '\0')
{
char ch = delim[0];
if (ch == '\0')
end = NULL;
else
{
if (*begin == ch)
end = begin;
else if (*begin == '\0')
end = NULL;
else
end = strchr (begin + 1, ch);
}
}
else
/* Find the end of the token. */
end = strpbrk (begin, delim);
if (end)
{
/* Terminate the token and set *STRINGP past NUL character. */
*end++ = '\0';
*stringp = end;
}
else
/* No more delimiters; this is the last token. */
*stringp = NULL;
return begin;
}
2013-11-18 12:03:07 +00:00
typedef char *(*proto_t) (const char **, const char *);
IMPL (simple_strsep, 0)
IMPL (strsep, 1)
This patch cleans up the strsep implementation and improves performance. Currently strsep calls strpbrk is is now a veneer to strcspn. Calling strcspn directly is faster. Since it handles a delimiter string of size 1 as a special case, this is not needed in strsep itself. Although this means there is a slightly higher overhead if the delimiter size is 1, all other cases are slightly faster. The overall performance gain is 5-10% on AArch64. The string/bits/string2.h header contains optimizations for constant delimiters of size 1-3. Benchmarking these showed similar performance for size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3 can give up to 2x speedup for small input strings. However if these cases are common it seems much better to add this optimization to strcspn. So move these header optimizations to string-inlines.c. Improve the strsep benchmark so that it actually benchmarks something. The current version contains a delimiter character at every position in the input string, so there is very little work to do, and the extremely inefficent simple_strsep implementation appears fastest in every case. The new version has either no match in the input for the fail case and a match halfway in the input for the success case. The input is then restored so that each iteration does exactly the same amount of work. Reduce the number of testcases since simple_strsep takes a lot of time now. * benchtests/bench-strsep.c (oldstrsep): Add old implementation. (do_one_test) Restore original string so iteration works. * string/string-inlines.c (do_test): Create better input strings. (test_main) Reduce number of testruns. * string/string-inlines.c (__old_strsep_1c): New function. (__old_strsep_2c): Likewise. (__old_strsep_3c): Likewise. * string/strsep.c (__strsep): Remove case of small delim string. Call strcspn directly rather than strpbrk. * string/bits/string2.h (__strsep): Remove define. (__strsep_1c): Remove. (__strsep_2c): Remove. (__strsep_3c): Remove. (strsep): Remove. * sysdeps/unix/sysv/linux/internal_statvfs.c (__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
IMPL (oldstrsep, 2)
2013-11-18 12:03:07 +00:00
static void
do_one_test (impl_t * impl, const char *s1, const char *s2)
{
size_t i, iters = INNER_LOOP_ITERS;
timing_t start, stop, cur;
TIMING_NOW (start);
for (i = 0; i < iters; ++i)
{
This patch cleans up the strsep implementation and improves performance. Currently strsep calls strpbrk is is now a veneer to strcspn. Calling strcspn directly is faster. Since it handles a delimiter string of size 1 as a special case, this is not needed in strsep itself. Although this means there is a slightly higher overhead if the delimiter size is 1, all other cases are slightly faster. The overall performance gain is 5-10% on AArch64. The string/bits/string2.h header contains optimizations for constant delimiters of size 1-3. Benchmarking these showed similar performance for size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3 can give up to 2x speedup for small input strings. However if these cases are common it seems much better to add this optimization to strcspn. So move these header optimizations to string-inlines.c. Improve the strsep benchmark so that it actually benchmarks something. The current version contains a delimiter character at every position in the input string, so there is very little work to do, and the extremely inefficent simple_strsep implementation appears fastest in every case. The new version has either no match in the input for the fail case and a match halfway in the input for the success case. The input is then restored so that each iteration does exactly the same amount of work. Reduce the number of testcases since simple_strsep takes a lot of time now. * benchtests/bench-strsep.c (oldstrsep): Add old implementation. (do_one_test) Restore original string so iteration works. * string/string-inlines.c (do_test): Create better input strings. (test_main) Reduce number of testruns. * string/string-inlines.c (__old_strsep_1c): New function. (__old_strsep_2c): Likewise. (__old_strsep_3c): Likewise. * string/strsep.c (__strsep): Remove case of small delim string. Call strcspn directly rather than strpbrk. * string/bits/string2.h (__strsep): Remove define. (__strsep_1c): Remove. (__strsep_2c): Remove. (__strsep_3c): Remove. (strsep): Remove. * sysdeps/unix/sysv/linux/internal_statvfs.c (__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
const char *s1a = s1;
CALL (impl, &s1a, s2);
if (s1a != NULL)
((char*)s1a)[-1] = '1';
2013-11-18 12:03:07 +00:00
}
TIMING_NOW (stop);
TIMING_DIFF (cur, start, stop);
TIMING_PRINT_MEAN ((double) cur, (double) iters);
}
static void
do_test (size_t align1, size_t align2, size_t len1, size_t len2, int fail)
{
char *s2 = (char *) (buf2 + align2);
This patch cleans up the strsep implementation and improves performance. Currently strsep calls strpbrk is is now a veneer to strcspn. Calling strcspn directly is faster. Since it handles a delimiter string of size 1 as a special case, this is not needed in strsep itself. Although this means there is a slightly higher overhead if the delimiter size is 1, all other cases are slightly faster. The overall performance gain is 5-10% on AArch64. The string/bits/string2.h header contains optimizations for constant delimiters of size 1-3. Benchmarking these showed similar performance for size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3 can give up to 2x speedup for small input strings. However if these cases are common it seems much better to add this optimization to strcspn. So move these header optimizations to string-inlines.c. Improve the strsep benchmark so that it actually benchmarks something. The current version contains a delimiter character at every position in the input string, so there is very little work to do, and the extremely inefficent simple_strsep implementation appears fastest in every case. The new version has either no match in the input for the fail case and a match halfway in the input for the success case. The input is then restored so that each iteration does exactly the same amount of work. Reduce the number of testcases since simple_strsep takes a lot of time now. * benchtests/bench-strsep.c (oldstrsep): Add old implementation. (do_one_test) Restore original string so iteration works. * string/string-inlines.c (do_test): Create better input strings. (test_main) Reduce number of testruns. * string/string-inlines.c (__old_strsep_1c): New function. (__old_strsep_2c): Likewise. (__old_strsep_3c): Likewise. * string/strsep.c (__strsep): Remove case of small delim string. Call strcspn directly rather than strpbrk. * string/bits/string2.h (__strsep): Remove define. (__strsep_1c): Remove. (__strsep_2c): Remove. (__strsep_3c): Remove. (strsep): Remove. * sysdeps/unix/sysv/linux/internal_statvfs.c (__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
/* Search for a delimiter in a string containing mostly '0', so don't
use '0' as a delimiter. */
static const char d[] = "123456789abcdefg";
2013-11-18 12:03:07 +00:00
#define dl (sizeof (d) - 1)
char *ss2 = s2;
for (size_t l = len2; l > 0; l = l > dl ? l - dl : 0)
{
size_t t = l > dl ? dl : l;
ss2 = mempcpy (ss2, d, t);
}
s2[len2] = '\0';
printf ("Length %4zd/%zd, alignment %2zd/%2zd, %s:",
len1, len2, align1, align2, fail ? "fail" : "found");
FOR_EACH_IMPL (impl, 0)
{
char *s1 = (char *) (buf1 + align1);
This patch cleans up the strsep implementation and improves performance. Currently strsep calls strpbrk is is now a veneer to strcspn. Calling strcspn directly is faster. Since it handles a delimiter string of size 1 as a special case, this is not needed in strsep itself. Although this means there is a slightly higher overhead if the delimiter size is 1, all other cases are slightly faster. The overall performance gain is 5-10% on AArch64. The string/bits/string2.h header contains optimizations for constant delimiters of size 1-3. Benchmarking these showed similar performance for size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3 can give up to 2x speedup for small input strings. However if these cases are common it seems much better to add this optimization to strcspn. So move these header optimizations to string-inlines.c. Improve the strsep benchmark so that it actually benchmarks something. The current version contains a delimiter character at every position in the input string, so there is very little work to do, and the extremely inefficent simple_strsep implementation appears fastest in every case. The new version has either no match in the input for the fail case and a match halfway in the input for the success case. The input is then restored so that each iteration does exactly the same amount of work. Reduce the number of testcases since simple_strsep takes a lot of time now. * benchtests/bench-strsep.c (oldstrsep): Add old implementation. (do_one_test) Restore original string so iteration works. * string/string-inlines.c (do_test): Create better input strings. (test_main) Reduce number of testruns. * string/string-inlines.c (__old_strsep_1c): New function. (__old_strsep_2c): Likewise. (__old_strsep_3c): Likewise. * string/strsep.c (__strsep): Remove case of small delim string. Call strcspn directly rather than strpbrk. * string/bits/string2.h (__strsep): Remove define. (__strsep_1c): Remove. (__strsep_2c): Remove. (__strsep_3c): Remove. (strsep): Remove. * sysdeps/unix/sysv/linux/internal_statvfs.c (__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
memset (s1, '0', len1);
if (!fail)
s1[len1 / 2] = '1';
2013-11-18 12:03:07 +00:00
s1[len1] = '\0';
do_one_test (impl, s1, s2);
}
putchar ('\n');
}
static int
test_main (void)
{
test_init ();
printf ("%23s", "");
FOR_EACH_IMPL (impl, 0)
printf ("\t%s", impl->name);
putchar ('\n');
for (size_t klen = 2; klen < 32; ++klen)
This patch cleans up the strsep implementation and improves performance. Currently strsep calls strpbrk is is now a veneer to strcspn. Calling strcspn directly is faster. Since it handles a delimiter string of size 1 as a special case, this is not needed in strsep itself. Although this means there is a slightly higher overhead if the delimiter size is 1, all other cases are slightly faster. The overall performance gain is 5-10% on AArch64. The string/bits/string2.h header contains optimizations for constant delimiters of size 1-3. Benchmarking these showed similar performance for size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3 can give up to 2x speedup for small input strings. However if these cases are common it seems much better to add this optimization to strcspn. So move these header optimizations to string-inlines.c. Improve the strsep benchmark so that it actually benchmarks something. The current version contains a delimiter character at every position in the input string, so there is very little work to do, and the extremely inefficent simple_strsep implementation appears fastest in every case. The new version has either no match in the input for the fail case and a match halfway in the input for the success case. The input is then restored so that each iteration does exactly the same amount of work. Reduce the number of testcases since simple_strsep takes a lot of time now. * benchtests/bench-strsep.c (oldstrsep): Add old implementation. (do_one_test) Restore original string so iteration works. * string/string-inlines.c (do_test): Create better input strings. (test_main) Reduce number of testruns. * string/string-inlines.c (__old_strsep_1c): New function. (__old_strsep_2c): Likewise. (__old_strsep_3c): Likewise. * string/strsep.c (__strsep): Remove case of small delim string. Call strcspn directly rather than strpbrk. * string/bits/string2.h (__strsep): Remove define. (__strsep_1c): Remove. (__strsep_2c): Remove. (__strsep_3c): Remove. (strsep): Remove. * sysdeps/unix/sysv/linux/internal_statvfs.c (__statvfs_getflags): Rename to __strsep.
2016-12-21 15:16:29 +00:00
for (size_t hlen = 4 * klen; hlen < 8 * klen; hlen += klen)
2013-11-18 12:03:07 +00:00
{
do_test (0, 0, hlen, klen, 0);
do_test (0, 0, hlen, klen, 1);
do_test (0, 3, hlen, klen, 0);
do_test (0, 3, hlen, klen, 1);
do_test (0, 9, hlen, klen, 0);
do_test (0, 9, hlen, klen, 1);
do_test (0, 15, hlen, klen, 0);
do_test (0, 15, hlen, klen, 1);
do_test (3, 0, hlen, klen, 0);
do_test (3, 0, hlen, klen, 1);
do_test (3, 3, hlen, klen, 0);
do_test (3, 3, hlen, klen, 1);
do_test (3, 9, hlen, klen, 0);
do_test (3, 9, hlen, klen, 1);
do_test (3, 15, hlen, klen, 0);
do_test (3, 15, hlen, klen, 1);
do_test (9, 0, hlen, klen, 0);
do_test (9, 0, hlen, klen, 1);
do_test (9, 3, hlen, klen, 0);
do_test (9, 3, hlen, klen, 1);
do_test (9, 9, hlen, klen, 0);
do_test (9, 9, hlen, klen, 1);
do_test (9, 15, hlen, klen, 0);
do_test (9, 15, hlen, klen, 1);
do_test (15, 0, hlen, klen, 0);
do_test (15, 0, hlen, klen, 1);
do_test (15, 3, hlen, klen, 0);
do_test (15, 3, hlen, klen, 1);
do_test (15, 9, hlen, klen, 0);
do_test (15, 9, hlen, klen, 1);
do_test (15, 15, hlen, klen, 0);
do_test (15, 15, hlen, klen, 1);
}
do_test (0, 0, page_size - 1, 16, 0);
do_test (0, 0, page_size - 1, 16, 1);
return ret;
}
Adjust benchtests to new support library. This patch basically replaces the test-skeleton.c inclusion by support/test-driver.c and also minor adjustments in bench-string.h. Checked on x86_64-linux-gnu and powerpc64le-linux-gnu. * benchtests/bench-string.h (TEST_FUNCTION): Use name without parenthesis. (CMDLINE_PROCESS): Define using function instead of macro. * benchtests/bench-memccpy.c: Include <support/test-driver.c> instead of test-skeleton. * benchtests/bench-memchr.c: Likewise. * benchtests/bench-memcmp.c: Likewise. * benchtests/bench-memcpy-large.c: Likewise. * benchtests/bench-memcpy.c: Likewise. * benchtests/bench-memmem.c: Likewise. * benchtests/bench-memmove-large.c: Likewise. * benchtests/bench-memmove.c: Likewise. * benchtests/bench-memset-large.c: Likewise. * benchtests/bench-memset.c: Likewise. * benchtests/bench-rawmemchr.c: Likewise. * benchtests/bench-strcasecmp.c: Likewise. * benchtests/bench-strcasestr.c: Likewise. * benchtests/bench-strcat.c: Likewise. * benchtests/bench-strchr.c: Likewise. * benchtests/bench-strcmp.c: Likewise. * benchtests/bench-strcpy.c: Likewise. * benchtests/bench-strcpy_chk.c: Likewise. * benchtests/bench-strlen.c: Likewise. * benchtests/bench-strncasecmp.c: Likewise. * benchtests/bench-strncmp.c: Likewise. * benchtests/bench-strncpy.c: Likewise. * benchtests/bench-strnlen.c: Likewise. * benchtests/bench-strpbrk.c: Likewise. * benchtests/bench-strrchr.c: Likewise. * benchtests/bench-strsep.c: Likewise. * benchtests/bench-strspn.c: Likewise. * benchtests/bench-strstr.c: Likewise. * benchtests/bench-strtok.c: Likewise.
2016-12-16 17:35:06 +00:00
#include <support/test-driver.c>