glibc/sysdeps/x86_64/multiarch
H.J. Lu e94c310357 x86-64: Optimize memcmp-avx2-movbe.S for short difference
Check the first 32 bytes before checking size when size >= 32 bytes
to avoid an unnecessary branch if the difference is in the first 32 bytes.
Replace vpmovmskb/subl/jnz with vptest/jnc.

On Haswell, the new version is as fast as the previous one.  On Skylake,
the new version is slightly faster.

	* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S (MEMCMP): Check
	the first 32 bytes before checking size when size >= 32 bytes.
	Replace vpmovmskb/subl/jnz with vptest/jnc.
2017-06-27 07:55:00 -07:00
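
The two changes described in the commit message above can be illustrated in C with AVX2 intrinsics. This is a minimal sketch, not the code in memcmp-avx2-movbe.S: the function names are hypothetical, the routine only answers "equal or not" rather than returning memcmp's ordering, and the tail is a plain byte loop. It assumes GCC or Clang on x86-64 and a CPU with AVX2.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical helper, "old" pattern: vpcmpeqb + vpmovmskb, then compare
   the 32-bit byte mask against all-ones (the subl/jnz step in assembly). */
__attribute__((target("avx2")))
int first32_equal_movemask(const void *a, const void *b)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i eq = _mm256_cmpeq_epi8(va, vb);   /* 0xff in each matching byte */
    return (unsigned)_mm256_movemask_epi8(eq) == 0xffffffffu;
}

/* Hypothetical helper, "new" pattern: vpcmpeqb + vptest.  The carry flag is
   set only when (~eq & ones) == 0, that is, when every byte matched, so a
   mismatch corresponds to "carry clear" (the jnc step in assembly). */
__attribute__((target("avx2")))
int first32_equal_vptest(const void *a, const void *b)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i eq = _mm256_cmpeq_epi8(va, vb);
    __m256i ones = _mm256_set1_epi8(-1);
    return _mm256_testc_si256(eq, ones);      /* returns the carry flag */
}

/* Hypothetical driver: once n is known to be at least 32, compare the first
   32 bytes right away, before any further decisions based on n.  A mismatch
   in those bytes returns without touching the rest of the size logic. */
int equal_check_first32_early(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    if (n >= 32) {
        if (!first32_equal_vptest(pa, pb))
            return 0;                         /* early exit on a short difference */
        pa += 32;
        pb += 32;
        n -= 32;
    }
    /* Simplified tail; the assembly uses further vector steps instead. */
    for (size_t i = 0; i < n; i++)
        if (pa[i] != pb[i])
            return 0;
    return 1;
}
```

Both helpers return 1 when the 32-byte blocks match and 0 otherwise. The second derives the answer from the flags that vptest sets, without first moving a byte mask into a general-purpose register.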
bcopy.S Use IFUNC memmove/memset in x86-64 bcopy/bzero 2012-10-11 13:58:16 -07:00
ifunc-avx2.h x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
ifunc-impl-list.c x86-64: Implement memcmp family IFUNC selectors in C 2017-06-15 08:49:57 -07:00
ifunc-memcmp.h x86-64: Implement memcmp family IFUNC selectors in C 2017-06-15 08:49:57 -07:00
ifunc-memmove.h x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
ifunc-memset.h x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
ifunc-sse4_2.h x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
ifunc-strcasecmp.h x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
ifunc-unaligned-ssse3.h x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
ifunc-wmemset.h x86-64: Rename wmemset.h to ifunc-wmemset.h 2017-06-07 14:48:34 -07:00
Makefile x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
memchr-avx2.S x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
memchr-sse2.S x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
memchr.c x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
memcmp-avx2-movbe.S x86-64: Optimize memcmp-avx2-movbe.S for short difference 2017-06-27 07:55:00 -07:00
memcmp-sse2.S x86-64: Implement memcmp family IFUNC selectors in C 2017-06-15 08:49:57 -07:00
memcmp-sse4.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcmp-ssse3.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcmp.c x86-64: Implement memcmp family IFUNC selectors in C 2017-06-15 08:49:57 -07:00
memcpy_chk-nonshared.S x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memcpy_chk.c x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memcpy-ssse3-back.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcpy-ssse3.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcpy.c x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memmove_chk-nonshared.S x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memmove_chk.c x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memmove-avx512-no-vzeroupper.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memmove-avx512-unaligned-erms.S Require binutils 2.24 to build x86-64 glibc [BZ #20139] 2016-07-01 06:03:05 -07:00
memmove-avx-unaligned-erms.S X86-64: Use non-temporal store in memcpy on large data 2016-04-12 08:10:47 -07:00
memmove-sse2-unaligned-erms.S x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memmove-ssse3-back.S Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7 2010-06-30 08:26:11 -07:00
memmove-ssse3.S Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7 2010-06-30 08:26:11 -07:00
memmove-vec-unaligned-erms.S x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
memmove.c x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
mempcpy_chk-nonshared.S x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
mempcpy_chk.c x86-64: Implement memmove family IFUNC selectors in C 2017-06-14 12:11:10 -07:00
mempcpy.c Fix fallout from bits/string.h removal. 2017-06-20 09:39:08 -04:00
memrchr-avx2.S x86-64: Optimize memrchr with AVX2 2017-06-09 05:44:41 -07:00
memrchr-sse2.S x86-64: Optimize memrchr with AVX2 2017-06-09 05:44:41 -07:00
memrchr.c x86-64: Optimize memrchr with AVX2 2017-06-09 05:44:41 -07:00
memset_chk-nonshared.S x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
memset_chk.c x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
memset-avx2-unaligned-erms.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
memset-avx512-no-vzeroupper.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memset-avx512-unaligned-erms.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
memset-sse2-unaligned-erms.S x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
memset-vec-unaligned-erms.S x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
memset.c x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
rawmemchr-avx2.S x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
rawmemchr-sse2.S Fix typo when undefining weak_alias 2017-06-19 14:56:40 +05:30
rawmemchr.c x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
sched_cpucount.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
stpcpy-sse2-unaligned.S Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
stpcpy-sse2.S x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
stpcpy-ssse3.S Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
stpcpy.c Fix fallout from bits/string.h removal. 2017-06-20 09:39:08 -04:00
stpncpy-c.c x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
stpncpy-sse2-unaligned.S Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
stpncpy-ssse3.S Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
stpncpy.c x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
strcasecmp_l-avx.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcasecmp_l-sse2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcasecmp_l-sse4_2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcasecmp_l-ssse3.S Fix x86-64 build without multiarch. 2010-08-14 14:56:32 -07:00
strcasecmp_l.c x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcasecmp.c x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcat-sse2-unaligned.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcat-sse2.S x86-64: Implement strcat family IFUNC selectors in C 2017-06-15 08:56:59 -07:00
strcat-ssse3.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcat.c x86-64: Implement strcat family IFUNC selectors in C 2017-06-15 08:56:59 -07:00
strchr-avx2.S x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
strchr-sse2-no-bsf.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strchr-sse2.S x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
strchr.c x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
strchrnul-avx2.S x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
strchrnul-sse2.S x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
strchrnul.c x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
strcmp-sse2-unaligned.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcmp-sse2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcmp-sse4_2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcmp-sse42.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcmp-ssse3.S Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
strcmp.c x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strcpy-sse2-unaligned.S Fix x86 strncat optimized implementation for large sizes 2017-01-03 14:24:53 -02:00
strcpy-sse2.S x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
strcpy-ssse3.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcpy.c x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
strcspn-c.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcspn-sse2.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strcspn.c x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strlen-avx2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
strlen-sse2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
strlen.c x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
strncase_l-avx.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncase_l-sse2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncase_l-sse4_2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncase_l-ssse3.S Add optimized strncasecmp versions for x86-64. 2010-08-14 22:04:01 -07:00
strncase_l.c x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncase.c x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncat-c.c Remove bits/string.h. 2017-06-20 08:21:24 -04:00
strncat-sse2-unaligned.S Improve 64 bit strcat functions with SSE2/SSSE3 2011-07-19 17:11:54 -04:00
strncat-ssse3.S Improve 64 bit strcat functions with SSE2/SSSE3 2011-07-19 17:11:54 -04:00
strncat.c x86-64: Implement strcat family IFUNC selectors in C 2017-06-15 08:56:59 -07:00
strncmp-sse2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncmp-sse4_2.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncmp-ssse3.S x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncmp.c x86-64: Implement strcmp family IFUNC selectors in C 2017-06-21 12:11:06 -07:00
strncpy-c.c x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
strncpy-sse2-unaligned.S Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
strncpy-ssse3.S Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
strncpy.c x86-64: Implement strcpy family IFUNC selectors in C 2017-06-12 09:06:09 -07:00
strnlen-avx2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
strnlen-sse2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
strnlen.c x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
strpbrk-c.c x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strpbrk-sse2.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strpbrk.c x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strrchr-avx2.S x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00
strrchr-sse2.S x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00
strrchr.c x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00
strspn-c.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strspn-sse2.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strspn.c x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strstr-sse2-unaligned.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strstr.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-multiarch.c Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
varshift.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
varshift.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wcschr-avx2.S x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
wcschr-sse2.S x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
wcschr.c x86-64: Optimize strchr/strchrnul/wcschr with AVX2 2017-06-09 05:42:29 -07:00
wcscpy-c.c Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
wcscpy-ssse3.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wcscpy.c x86-64: Implement wcscpy IFUNC selector in C 2017-06-15 08:57:52 -07:00
wcslen-avx2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
wcslen-sse2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
wcslen.c x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
wcsnlen-avx2.S x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
wcsnlen-c.c x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S 2017-06-06 06:12:32 -07:00
wcsnlen-sse4_1.S x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S 2017-06-06 06:12:32 -07:00
wcsnlen.c x86-64: Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 2017-06-09 05:18:18 -07:00
wcsrchr-avx2.S x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00
wcsrchr-sse2.S x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00
wcsrchr.c x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00
wmemchr-avx2.S x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
wmemchr-sse2.S x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
wmemchr.c x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
wmemcmp-avx2-movbe.S x86-64: Optimize memcmp/wmemcmp with AVX2 and MOVBE 2017-06-05 12:52:55 -07:00
wmemcmp-c.c Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
wmemcmp-sse4.S Optimized memcmp and wmemcmp for x86-64 and x86-32 2011-10-15 11:10:08 -04:00
wmemcmp-ssse3.S Optimized memcmp and wmemcmp for x86-64 and x86-32 2011-10-15 11:10:08 -04:00
wmemcmp.c x86-64: Implement memcmp family IFUNC selectors in C 2017-06-15 08:49:57 -07:00
wmemset_chk-nonshared.S x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
wmemset_chk.c x86-64: Rename wmemset.h to ifunc-wmemset.h 2017-06-07 14:48:34 -07:00
wmemset.c x86-64: Implement memset family IFUNC selectors in C 2017-06-15 08:33:35 -07:00
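
Many entries above refer to "IFUNC selectors in C". As a rough illustration of the mechanism, the sketch below uses the public GCC/Clang `ifunc` attribute and `__builtin_cpu_supports` rather than glibc's internal selector macros and cpu_features data, which are not shown in this listing. All names are hypothetical, both "implementations" are placeholders that call memcmp, and the example assumes an ELF target with IFUNC support, such as GNU/Linux with glibc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for two tuned implementations.  In this sketch
   both simply forward to the C library. */
static int my_memcmp_baseline(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

static int my_memcmp_avx2(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

typedef int (*my_memcmp_fn)(const void *, const void *, size_t);

/* Resolver: run once by the dynamic loader when the symbol is bound.
   It returns the address of the implementation to use from then on. */
static my_memcmp_fn my_memcmp_select(void)
{
    __builtin_cpu_init();                 /* needed before CPU queries in a resolver */
    if (__builtin_cpu_supports("avx2"))
        return my_memcmp_avx2;
    return my_memcmp_baseline;
}

/* The public symbol has no body of its own; calls are routed to whatever
   the resolver returned. */
int my_memcmp(const void *a, const void *b, size_t n)
    __attribute__((ifunc("my_memcmp_select")));
```

Callers use `my_memcmp` like any ordinary function. The selection cost is paid once at symbol resolution rather than on every call.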