mirror of
https://sourceware.org/git/glibc.git
synced 2024-11-22 13:00:06 +00:00
2d2493a644
Changes to generated code are: 1. In a few places use `vpcmpeqb` instead of `vpcmpneq` to save a byte of code size. 2. Add a branch for length <= (VEC_SIZE * 6) as opposed to doing the entire block of [VEC_SIZE * 4 + 1, VEC_SIZE * 8] in a single basic-block (the space to add the extra branch without changing code size is bought with the above change). Change (2) has roughly a 20-25% speedup for sizes in [VEC_SIZE * 4 + 1, VEC_SIZE * 6] and negligible to no-cost for [VEC_SIZE * 6 + 1, VEC_SIZE * 8] From N=10 runs on Tigerlake: align1,align2 ,length ,result ,New Time ,Cur Time ,New Time / Old Time 0 ,0 ,129 ,0 ,5.404 ,6.887 ,0.785 0 ,0 ,129 ,1 ,5.308 ,6.826 ,0.778 0 ,0 ,129 ,18446744073709551615 ,5.359 ,6.823 ,0.785 0 ,0 ,161 ,0 ,5.284 ,6.827 ,0.774 0 ,0 ,161 ,1 ,5.317 ,6.745 ,0.788 0 ,0 ,161 ,18446744073709551615 ,5.406 ,6.778 ,0.798 0 ,0 ,193 ,0 ,6.804 ,6.802 ,1.000 0 ,0 ,193 ,1 ,6.950 ,6.754 ,1.029 0 ,0 ,193 ,18446744073709551615 ,6.792 ,6.719 ,1.011 0 ,0 ,225 ,0 ,6.625 ,6.699 ,0.989 0 ,0 ,225 ,1 ,6.776 ,6.735 ,1.003 0 ,0 ,225 ,18446744073709551615 ,6.758 ,6.738 ,0.992 0 ,0 ,256 ,0 ,5.402 ,5.462 ,0.989 0 ,0 ,256 ,1 ,5.364 ,5.483 ,0.978 0 ,0 ,256 ,18446744073709551615 ,5.341 ,5.539 ,0.964 Rewriting with VMM API allows for memcmpeq-evex to be used with evex512 by including "x86-evex512-vecs.h" at the top. Complete check passes on x86-64. |
||
---|---|---|
.. | ||
aarch64 | ||
alpha | ||
arc | ||
arm | ||
csky | ||
generic | ||
gnu | ||
hppa | ||
htl | ||
hurd | ||
i386 | ||
ia64 | ||
ieee754 | ||
loongarch | ||
m68k | ||
mach | ||
microblaze | ||
mips | ||
nios2 | ||
nptl | ||
or1k | ||
posix | ||
powerpc | ||
pthread | ||
riscv | ||
s390 | ||
sh | ||
sparc | ||
unix | ||
wordsize-32 | ||
wordsize-64 | ||
x86 | ||
x86_64 |