glibc/sysdeps/x86_64
Noah Goldstein 104c7b1967 x86: Add EVEX optimized memchr family not safe for RTM
No bug.

This commit adds a new EVEX memchr implementation that is not safe
for RTM because it uses vzeroupper. The benefit is that, by using
ymm0-ymm15, it can use VEX-encoded vpcmpeq together with vpternlogd in
the 4x loop. This is faster than the RTM-safe version, which cannot
use that form of vpcmpeq because the EVEX encoding of the instruction
writes its result to a mask register rather than a vector register.
All parts of the implementation aside from the 4x loop are the same
in the two versions, and the optimization is only relevant for large
sizes.
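
For readers unfamiliar with the "4x loop", the following is a minimal
portable C sketch of the idea, not the assembly from this commit. It
assumes a 32-byte vector width; block_has_byte stands in for a single
vector compare, the bitwise OR stands in for the combining step, and
the names memchr_4x_sketch and block_has_byte are hypothetical.

```c
#include <stddef.h>

#define VEC_SIZE 32  /* assumed vector width in bytes (one ymm register) */

/* Stand-in for one vector compare: nonzero if any of the VEC_SIZE
   bytes at p equals c.  */
static int block_has_byte(const unsigned char *p, unsigned char c)
{
    unsigned acc = 0;
    for (size_t i = 0; i < VEC_SIZE; i++)
        acc |= (unsigned) (p[i] == c);
    return (int) acc;
}

/* Hypothetical illustration of a 4x-unrolled memchr main loop.  */
static void *memchr_4x_sketch(const void *s, int ch, size_t n)
{
    const unsigned char *p = s;
    unsigned char c = (unsigned char) ch;

    /* Main loop: test four blocks per iteration and branch only once
       on the combined result.  */
    while (n >= 4 * VEC_SIZE) {
        int any = block_has_byte(p, c)
                | block_has_byte(p + VEC_SIZE, c)
                | block_has_byte(p + 2 * VEC_SIZE, c)
                | block_has_byte(p + 3 * VEC_SIZE, c);
        if (any)
            break;  /* a match lies somewhere in these four blocks */
        p += 4 * VEC_SIZE;
        n -= 4 * VEC_SIZE;
    }

    /* Locate the exact byte, or scan the tail shorter than four blocks.  */
    for (size_t i = 0; i < n; i++)
        if (p[i] == c)
            return (void *) (p + i);
    return NULL;
}
```

The point of the structure is that the loop pays for one branch per
four blocks, so cheaper compare-and-combine instructions inside the
loop matter most when the buffer is large.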

Tigerlake:
size  , algn  , Pos   , Cur T , New T , Win     , Dif
512   , 6     , 192   , 9.2   , 9.04  , no-RTM  , 0.16
512   , 7     , 224   , 9.19  , 8.98  , no-RTM  , 0.21
2048  , 0     , 256   , 10.74 , 10.54 , no-RTM  , 0.2
2048  , 0     , 512   , 14.81 , 14.87 , RTM     , 0.06
2048  , 0     , 1024  , 22.97 , 22.57 , no-RTM  , 0.4
2048  , 0     , 2048  , 37.49 , 34.51 , no-RTM  , 2.98   <--

Icelake:
size  , algn  , Pos   , Cur T , New T , Win     , Dif
512   , 6     , 192   , 7.6   , 7.3   , no-RTM  , 0.3
512   , 7     , 224   , 7.63  , 7.27  , no-RTM  , 0.36
2048  , 0     , 256   , 8.48  , 8.38  , no-RTM  , 0.1
2048  , 0     , 512   , 11.57 , 11.42 , no-RTM  , 0.15
2048  , 0     , 1024  , 17.92 , 17.38 , no-RTM  , 0.54
2048  , 0     , 2048  , 30.37 , 27.34 , no-RTM  , 3.03   <--
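
The Dif column is an absolute difference between Cur T and New T, and
the commit message does not state the time unit. As a rough aid for
reading the marked rows, a relative saving can be computed as below.
The helper name percent_saved is hypothetical and not part of glibc.

```c
/* Relative time saved, in percent, when going from cur_t to new_t.  */
static double percent_saved(double cur_t, double new_t)
{
    return 100.0 * (cur_t - new_t) / cur_t;
}
```

Applied to the size 2048, Pos 2048 rows, this gives roughly 8% for
Tigerlake (37.49 to 34.51) and roughly 10% for Icelake (30.37 to
27.34).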

test-memchr, test-wmemchr, and test-rawmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-08 16:26:30 -04:00
64
fpu regenerate ulps on x86_64 with -march=native 2021-04-28 12:46:00 +02:00
multiarch x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
nptl nptl: Move pthread_spin_trylock into libc 2021-04-23 17:06:48 +02:00
x32 Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
____longjmp_chk.S
__longjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
_mcount.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
abort-instr.h
add_n.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
addmul_1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
bzero.S
configure configure: Check for static PIE support 2021-01-21 15:54:50 +00:00
configure.ac configure: Check for static PIE support 2021-01-21 15:54:50 +00:00
crti.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
crtn.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-hwcaps-subdirs.c <sys/platform/x86.h>: Remove the C preprocessor magic 2021-01-21 05:58:17 -08:00
dl-irel.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-machine.h x86_64: Remove lazy tlsdesc relocation related code 2021-04-15 09:47:47 +01:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-runtime.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-tls.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-tls.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-tlsdesc.h x86_64: Remove lazy tlsdesc relocation related code 2021-04-15 09:47:47 +01:00
dl-tlsdesc.S x86_64: Remove lazy tlsdesc relocation related code 2021-04-15 09:47:47 +01:00
dl-trampoline.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
dl-trampoline.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ffs.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ffsll.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
htonl.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
Implies Remove dbl-64/wordsize-64 (part 2) 2021-01-07 15:26:26 +00:00
isa.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
l10nflist.c
link-defines.sym
locale-defines.sym
localplt.data ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486] 2020-02-15 11:01:23 +01:00
lshift.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
machine-gmon.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
Makefile x86_64: Correct THREAD_SETMEM/THREAD_SETMEM_NC for movq [BZ #27591] 2021-04-01 07:00:22 -07:00
memchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memmove_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mempcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memrchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memusage.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mp_clz_tab.c
mul_1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
preconfigure
preconfigure.ac
rawmemchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
rshift.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
rtld-offsets.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
setjmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
stackguard-macros.h
stackinfo.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
start.S Reduce the statically linked startup code [BZ #23323] 2021-02-25 12:13:02 +01:00
stpcpy.S
strcasecmp_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasecmp_l.S
strcasecmp.S
strcat.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchrnul.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcpy.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcspn.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncase_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncase_l.S
strncase.S
strncmp.S
strnlen.S
strpbrk.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strrchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strspn.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
sub_n.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
submul_1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
sysdep.h x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
tls_get_addr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tls-macros.h
tlsdesc.c elf: Remove lazy tlsdesc relocation related code 2021-04-21 14:35:53 +01:00
tlsdesc.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tst-audit3.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit4.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit5.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit6.c Modify several tests to use test-skeleton.c 2015-07-15 15:10:23 +05:30
tst-audit7.c
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit10.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-audit.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-auditmod3a.c
tst-auditmod3b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod4a.c
tst-auditmod4b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod5a.c
tst-auditmod5b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6a.c
tst-auditmod6b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6c.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod7a.c
tst-auditmod7b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-auditmod10b.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx512-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx512.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx512mod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx-aux.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avx.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-avxmod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-glibc-hwcaps.c <sys/platform/x86.h>: Remove the C preprocessor magic 2021-01-21 05:58:17 -08:00
tst-mallocalign1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-platform-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-platformmod-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-platformmod-2.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quad1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quad1pie.c
tst-quad2.c
tst-quad2pie.c
tst-quadmod1.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quadmod1pie.S
tst-quadmod2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-quadmod2pie.S
tst-split-dynreloc.c Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-split-dynreloc.lds Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-sse.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-ssemod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-x86_64-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-x86_64mod-1.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
tst-x86-64-tls-1.c x86_64: Correct THREAD_SETMEM/THREAD_SETMEM_NC for movq [BZ #27591] 2021-04-01 07:00:22 -07:00
Versions Move __fentry__ version definition to sysdeps/{i386,x86_64} 2018-08-10 09:07:44 +02:00
wcschr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcscmp.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcslen.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcsrchr.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemset_chk.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemset.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wordcopy.c X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00