glibc/sysdeps/x86_64
Noah Goldstein b77b06e0e2 x86: Optimize strcmp-avx2.S
The optimizations are primarily to the loop logic and to how the page
cross logic interacts with the loop.

The page cross logic is at times more expensive for short strings that
are near the end of a page but do not cross it. This is done to retest
the page cross conditions with a non-faulting check and to improve the
logic for entering the loop afterwards. This affects only particular
cases, however, and is generally made up for by more than 10x
improvements on the transition from the page cross -> loop case.
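
For readers unfamiliar with the term, a "page cross" check asks whether a
full vector load starting at a given address would run into the next page,
which might be unmapped. The following is a minimal C sketch of that idea
only, not the code from strcmp-avx2.S. PAGE_SIZE, VEC_SIZE, and both
function names are assumptions chosen for illustration (4 KiB pages and
32-byte AVX2 vectors).

```c
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size, illustration only */
#define VEC_SIZE  32u   /* assumed width of one AVX2 vector in bytes */

/* Hypothetical helper: returns 1 if a VEC_SIZE-byte load starting at
   addr would touch the following page, 0 otherwise.  */
static int
load_crosses_page (uintptr_t addr)
{
  return (addr & (PAGE_SIZE - 1)) > PAGE_SIZE - VEC_SIZE;
}

/* Hypothetical helper: a two-string compare must be careful if a load
   from either string could cross a page.  */
static int
either_crosses_page (uintptr_t s1, uintptr_t s2)
{
  return load_crosses_page (s1) || load_crosses_page (s2);
}
```

Under these assumptions, a load at page offset 4064 ends exactly on the
last byte of the page and is safe, while a load at offset 4065 is not. A
short string that sits near the end of a page still trips this check even
though the string itself never reaches the next page, which is the case
the message describes as sometimes more expensive.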

The non-page cross cases are improved most for smaller sizes [0, 128]
and are about even for (128, 4096]. The loop page cross logic is also
improved, so a more significant speedup is seen there as well.

test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2022-02-03 16:41:38 -06:00
64 Move architecture shlib-versions files to Linux-specific directories. 2014-07-17 14:31:12 +00:00
fpu math: Add more inputs to atan2 accuracy tests [BZ #28765] 2022-01-14 06:00:06 -08:00
multiarch x86: Optimize strcmp-avx2.S 2022-02-03 16:41:38 -06:00
nptl Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
x32 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
____longjmp_chk.S ____longjmp_chk is now OS-specific. 2009-07-30 21:42:27 -07:00
__longjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
_mcount.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
abort-instr.h
add_n.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
addmul_1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
bzero.S Make an empty file. 2007-10-16 05:59:15 +00:00
configure elf: Remove Intel MPX support (lazy PLT, ld.so profile, and LD_AUDIT) 2021-10-11 11:14:02 -07:00
configure.ac elf: Remove Intel MPX support (lazy PLT, ld.so profile, and LD_AUDIT) 2021-10-11 11:14:02 -07:00
crti.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
crtn.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-hwcaps-subdirs.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-irel.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-machine.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-runtime.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tls.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tls.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tlsdesc.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tlsdesc.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-trampoline.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-trampoline.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ffs.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ffsll.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
htonl.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Implies Remove dbl-64/wordsize-64 (part 2) 2021-01-07 15:26:26 +00:00
isa.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
l10nflist.c Minor optimization of popcount in l10nflist 2011-08-11 14:07:04 -04:00
link-defines.sym elf: Remove Intel MPX support (lazy PLT, ld.so profile, and LD_AUDIT) 2021-10-11 11:14:02 -07:00
locale-defines.sym Implement optimized strcaecmp for x86-64. 2010-07-30 00:14:04 -07:00
localplt.data mtrace: Wean away from malloc hooks 2021-07-22 18:38:06 +05:30
lshift.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
machine-gmon.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Makefile x86-64: Remove compiler -mavx512f check 2021-08-24 07:05:35 -07:00
memchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmpeq.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memmove_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memrchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mp_clz_tab.c * sysdeps/x86_64/mp_clz_tab.c: New file. 2009-04-15 04:30:41 +00:00
mul_1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
preconfigure rename configure.in to configure.ac 2013-10-30 17:32:08 +10:00
preconfigure.ac rename configure.in to configure.ac 2013-10-30 17:32:08 +10:00
rawmemchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
rshift.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
rtld-offsets.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
setjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stackguard-macros.h BZ #15754: CVE-2013-4788 2013-09-23 00:52:09 -04:00
stackinfo.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
start.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stpcpy.S Update. 2004-05-28 06:56:51 +00:00
strcasecmp_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasecmp_l.S Implement optimized strcaecmp for x86-64. 2010-07-30 00:14:04 -07:00
strcasecmp.S Implement optimized strcaecmp for x86-64. 2010-07-30 00:14:04 -07:00
strcat.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchrnul.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcpy.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcspn.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncase_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncase_l.S Add optimized strncasecmp versions for x86-64. 2010-08-14 22:04:01 -07:00
strncase.S Add optimized strncasecmp versions for x86-64. 2010-08-14 22:04:01 -07:00
strncmp.S Add SSE2 support to str{,n}cmp for x86-64. 2009-07-26 13:32:28 -07:00
strnlen.S Faster strlen on x64. 2013-03-18 07:39:12 +01:00
strpbrk.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strrchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strspn.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
sub_n.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
submul_1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
sysdep.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tls_get_addr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tlsdesc.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tlsdesc.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tst-audit3.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit4.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit5.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit6.c Modify several tests to use test-skeleton.c 2015-07-15 15:10:23 +05:30
tst-audit7.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit10.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-auditmod3a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod3b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod4a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod4b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod5a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod5b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod6b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6c.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod7a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod7b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-auditmod10b.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx512-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx512.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx512mod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avxmod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-glibc-hwcaps.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-platform-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-platformmod-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-platformmod-2.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quad1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quad1pie.c Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quad2.c Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quad2pie.c Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quadmod1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quadmod1pie.S Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quadmod2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quadmod2pie.S Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-rsi-strlen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-rsi-wcslen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-split-dynreloc.c Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-split-dynreloc.lds Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-sse.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-ssemod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-x86_64-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-x86_64mod-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-x86-64-tls-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Versions Move __fentry__ version definition to sysdeps/{i386,x86_64} 2018-08-10 09:07:44 +02:00
wcschr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcscmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcslen.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcsrchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemset_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemset.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wordcopy.c X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00