Commit Graph

13 Commits

Author SHA1 Message Date
Leonardo Sandoval
1457016337 x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2
Optimize x86-64 strcmp/wcscmp and strncmp/wcsncmp with AVX2. It uses vector
comparison as much as possible. Peak performance observed on a SkyLake
machine: 9x, 3x, 2.5x and 5.5x for strcmp, strncmp, wcscmp and wcsncmp,
respectively. The larger the comparison length, the more benefit using
avx2 functions, except on the strcmp, where peak is observed at length
== 32 bytes. Select AVX2 strcmp/wcscmp on AVX2 machines where vzeroupper
is preferred and AVX unaligned load is fast.

NB: It uses TZCNT instead of BSF since TZCNT produces the same result
as BSF for non-zero input.  TZCNT is faster than BSF and is executed
as BSF if machine doesn't support TZCNT.

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	strcmp-avx2, strncmp-avx2, wcscmp-avx2, wcscmp-sse2, wcsncmp-avx2 and
	wcsncmp-sse2.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add tests for __strcmp_avx2,
	__strncmp_avx2,	__wcscmp_avx2, __wcsncmp_avx2, __wcscmp_sse2
	and __wcsncmp_sse2.
	* sysdeps/x86_64/multiarch/strcmp.c (OPTIMIZE (avx2)):
	(IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if
	AVX unaligned load is fast and vzeroupper is preferred.
	* sysdeps/x86_64/multiarch/strncmp.c: Likewise.
	* sysdeps/x86_64/multiarch/strcmp-avx2.S: New file.
	* sysdeps/x86_64/multiarch/strncmp-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscmp-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscmp-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscmp.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsncmp-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsncmp-sse2.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsncmp.c: Likewise.
	* sysdeps/x86_64/wcscmp.S (__wcscmp): Add alias only if __wcscmp
	is undefined.
2018-06-01 16:32:43 -05:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Joseph Myers
2f44ee08db Fix regcomp wcscoll, wcscmp namespace (bug 18497).
regcomp brings in references to wcscoll, which isn't in all the
standards that contain regcomp.  In turn, wcscoll brings in references
to wcscmp, also not in all those standards.  This patch fixes this by
making those functions into weak aliases of __wcscoll and __wcscmp and
calling those names instead as needed.

Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).

	[BZ #18497]
	* wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead
	of wcscmp.
	(wcscmp): Define as weak alias of WCSCMP.
	* wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of
	wcscoll.
	(USE_HIDDEN_DEF): Define.
	[!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of
	__wcscoll.  Don't use libc_hidden_weak.
	* wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of
	wcscmp.
	* sysdeps/i386/i686/multiarch/wcscmp-c.c
	[SHARED] (libc_hidden_def): Define __GI___wcscmp instead of
	__GI_wcscmp.
	(weak_alias): Undefine and redefine.
	* sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to
	__wcscmp and define as weak alias of __wcscmp.
	* sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
	* include/wchar.h (__wcscmp): Declare.  Use libc_hidden_proto.
	(__wcscoll): Likewise.
	(wcscmp): Don't use libc_hidden_proto.
	(wcscoll): Likewise.
	* posix/regcomp.c (build_range_exp): Call __wcscoll instead of
	wcscoll.
	* posix/regexec.c (check_node_accept_bytes): Likewise.
	* conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove
	variable.
	(test-xfail-XPG4/regex.h/linknamespace): Likewise.
	(test-xfail-POSIX/regex.h/linknamespace): Likewise.
2015-06-09 21:07:30 +00:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Paul Eggert
59ba27a63a Replace FSF snail mail address with URLs. 2012-02-09 23:18:22 +00:00
Ulrich Drepper
f17424ed53 Fix WS 2011-10-23 13:35:24 -04:00
Liubov Dmitrieva
95584d3b33 Fix signedness in wcscmp comparison 2011-10-23 13:34:15 -04:00
Ulrich Drepper
8e1294e83f Remove now-wrong comment 2011-09-06 17:20:33 -04:00
Ulrich Drepper
49d42c37ba Add optimized x86-64 wcscmp 2011-09-05 14:08:23 -04:00