glibc/sysdeps
Noah Goldstein b79f8ff26a x86: Optimize strnlen-evex.S and implement with VMM headers
Optimizations are:
1. Use the fact that bsf(0) leaves the destination unchanged to save a
   branch in short string case.
2. Restructure code so that small strings are given the hot path.
        - This is a net-zero on the benchmark suite but in general makes
      sense as smaller sizes are far more common.
3. Use more code-size efficient instructions.
	- tzcnt ...     -> bsf ...
	- vpcmpb $0 ... -> vpcmpeq ...
4. Align labels less aggressively, especially if it doesn't save fetch
   blocks / causes the basic-block to span extra cache-lines.

The optimizations (especially for point 2) make the strnlen and
strlen code essentially incompatible so split strnlen-evex
to a new file.

Code Size Changes:
strlen-evex.S       :  -23 bytes
strnlen-evex.S      : -167 bytes

Net perf changes:

Reported as geometric mean of all improvements / regressions from N=10
runs of the benchtests. Value as New Time / Old Time so < 1.0 is
improvement and 1.0 is regression.

strlen-evex.S       : 0.992 (No real change)
strnlen-evex.S      : 0.947

Full results attached in email.

Full check passes on x86-64.
2022-10-19 17:31:03 -07:00
..
aarch64 Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
alpha Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
arc Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
arm Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
csky Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
generic Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
gnu errlist: add missing entry for EDEADLOCK (bug 29545) 2022-09-08 11:40:24 +02:00
hppa hppa: Fix initialization of dp register [BZ 29635] 2022-10-01 19:49:25 +00:00
htl htl: Make pthread*_cond_timedwait register wref before releasing mutex 2022-08-22 22:27:24 +02:00
hurd hurd: Fix pthread_kill on exiting/ted thread 2022-01-15 15:11:54 +01:00
i386 Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
ia64 Update _FloatN header support for C++ in GCC 13 2022-09-28 20:10:08 +00:00
ieee754 math: Fix asin and acos invalid exception with old gcc 2022-10-17 08:18:52 +01:00
loongarch Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
m68k Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
mach Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
microblaze Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
mips Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
nios2 Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
nptl Use atomic_exchange_release/acquire 2022-09-26 16:58:08 +01:00
or1k Use atomic_exchange_release/acquire 2022-09-26 16:58:08 +01:00
posix get_nscd_addresses: Fix subscript typos [BZ #29605] 2022-09-28 12:47:10 -04:00
powerpc Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
pthread Do not define static_assert or thread_local in headers for C2x 2022-09-07 18:39:28 +00:00
riscv Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
s390 Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
sh Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
sparc Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources 2022-10-18 17:04:10 +02:00
unix Introduce <pointer_guard.h>, extracted from <sysdep.h> 2022-10-18 17:03:55 +02:00
wordsize-32 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wordsize-64 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
x86 elf: Remove _dl_string_hwcap 2022-10-06 07:59:48 -03:00
x86_64 x86: Optimize strnlen-evex.S and implement with VMM headers 2022-10-19 17:31:03 -07:00