glibc/sysdeps/x86_64
Noah Goldstein b4209615a0 x86: Optimize memrchr-evex.S
The new code:
    1. prioritizes smaller user-arg lengths more.
    2. optimizes target placement more carefully
    3. reuses logic more
    4. fixes up various inefficiencies in the logic. The biggest
       case here is the `lzcnt` logic for checking returns which
       saves either a branch or multiple instructions.

The total code size saving is: 263 bytes
Geometric Mean of all benchmarks New / Old: 0.755

Regressions:
There are some regressions. Particularly where the length (user arg
length) is large but the position of the match char is near the
beginning of the string (in first VEC). This case has roughly a
20% regression.

This is because the new logic gives the hot path for immediate matches
to shorter lengths (the more common input). This case has roughly
a 35% speedup.

Full xcheck passes on x86_64.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-07 13:10:24 -07:00
..
64
fpu x86_64: Optimize sincos where sin/cos is optimized (bug 29193) 2022-06-01 10:29:52 +02:00
multiarch x86: Optimize memrchr-evex.S 2022-06-07 13:10:24 -07:00
nptl nptl: Add backoff mechanism to spinlock loop 2022-05-09 14:38:40 -07:00
x32 x86_64: Remove _dl_skip_args usage 2022-05-30 16:33:34 -03:00
____longjmp_chk.S
__longjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
_mcount.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
abort-instr.h
add_n.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
addmul_1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
configure x86/configure.ac: Define PI_STATIC_AND_HIDDEN/SUPPORT_STATIC_PIE 2022-02-14 07:34:54 -08:00
configure.ac x86/configure.ac: Define PI_STATIC_AND_HIDDEN/SUPPORT_STATIC_PIE 2022-02-14 07:34:54 -08:00
crti.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
crtn.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-hwcaps-subdirs.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-irel.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-machine.h x86_64: Remove _dl_skip_args usage 2022-05-30 16:33:34 -03:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-runtime.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tls.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tls.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tlsdesc.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tlsdesc.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-trampoline.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-trampoline.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ffs.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ffsll.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
htonl.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Implies Remove dbl-64/wordsize-64 (part 2) 2021-01-07 15:26:26 +00:00
isa.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
l10nflist.c
link-defines.sym elf: Remove Intel MPX support (lazy PLT, ld.so profile, and LD_AUDIT) 2021-10-11 11:14:02 -07:00
locale-defines.sym
localplt.data mtrace: Wean away from malloc hooks 2021-07-22 18:38:06 +05:30
lshift.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
machine-gmon.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Makefile build: Properly generate .d dependency files [BZ #28922] 2022-02-25 10:35:45 -08:00
memchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmp.S x86-64: Fix SSE2 memcmp and SSSE3 memmove for x32 2022-04-22 11:23:15 -07:00
memcmpeq.S x86: Optimize memcmp SSE2 in memcmp.S 2022-04-15 13:08:35 -05:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcpy.S
memmove_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy.S
memrchr.S x86: Optimize memrchr-sse2.S 2022-06-07 13:09:36 -07:00
memset_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset.S x86_64: Remove bzero optimization 2022-05-16 09:36:06 -03:00
mp_clz_tab.c
mul_1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
preconfigure
preconfigure.ac
rawmemchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
rshift.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
rtld-offsets.sym
setjmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stackguard-macros.h
stackinfo.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
start.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stpcpy.S
strcasecmp_l-nonascii.c
strcasecmp_l.S
strcasecmp.S
strcat.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchrnul.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcmp.S x86: Remove str{n}{case}cmp-ssse3 2022-04-14 23:21:41 -05:00
strcpy.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncase_l-nonascii.c
strncase_l.S
strncase.S
strncmp.S
strnlen.S
strrchr.S x86: Optimize {str|wcs}rchr-sse2 2022-04-22 23:08:36 -05:00
sub_n.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
submul_1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
sysdep.h x86: Add COND_VZEROUPPER that can replace vzeroupper if no ret 2022-06-07 13:08:28 -07:00
tls_get_addr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tlsdesc.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tlsdesc.sym
tst-audit3.c
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit4.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit5.c
tst-audit6.c
tst-audit7.c
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit10.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-audit.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-auditmod3a.c
tst-auditmod3b.c
tst-auditmod4a.c
tst-auditmod4b.c
tst-auditmod5a.c
tst-auditmod5b.c
tst-auditmod6a.c
tst-auditmod6b.c
tst-auditmod6c.c
tst-auditmod7a.c
tst-auditmod7b.c
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-auditmod10b.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx512-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx512.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx512mod.c
tst-avx-aux.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avx.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-avxmod.c
tst-glibc-hwcaps.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-platform-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-platformmod-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-platformmod-2.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quad1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quad1pie.c
tst-quad2.c
tst-quad2pie.c
tst-quadmod1.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quadmod1pie.S
tst-quadmod2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-quadmod2pie.S
tst-rsi-strlen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-rsi-wcslen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-split-dynreloc.c
tst-split-dynreloc.lds
tst-sse.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-ssemod.c
tst-x86_64-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-x86_64mod-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-x86-64-tls-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Versions
wcschr.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcscmp.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcslen.S x86: Small improvements for wcslen 2022-03-28 15:00:03 -05:00
wcsrchr.S x86: Optimize {str|wcs}rchr-sse2 2022-04-22 23:08:36 -05:00
wmemcmp.S x86: Fix missing __wmemcmp def for disable-multiarch build 2022-04-19 20:18:57 -05:00
wmemset_chk.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemset.S
wordcopy.c