glibc/wcsnlen-evex.S at 419c832aba43276e285586998261d1db06033193

AuroraMiddleware/glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-29 05:51:10 +00:00

Noah Goldstein b79f8ff26a x86: Optimize strnlen-evex.S and implement with VMM headers

Optimizations are:
1. Use the fact that bsf(0) leaves the destination unchanged to save a
   branch in short string case.
2. Restructure code so that small strings are given the hot path.
   - This is a net-zero on the benchmark suite but in general makes
     sense as smaller sizes are far more common.
3. Use more code-size efficient instructions.
   - tzcnt ...     -> bsf ...
   - vpcmpb $0 ... -> vpcmpeq ...
4. Align labels less aggressively, especially if it doesn't save fetch
   blocks / causes the basic-block to span extra cache-lines.
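
Point 1 can be modelled in portable C. This is only a sketch of the
idea, not the assembly from strnlen-evex.S: bsf_model and
short_strnlen_model are hypothetical names, and the sketch assumes the
NUL-match mask of the first vector and the length bound are already
available.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of x86 `bsf src, dst` as the commit relies on it: a zero
   source leaves the destination untouched.  The `if' below only
   models the instruction; in the assembly it is not a branch.  */
static uint64_t
bsf_model (uint64_t dst, uint64_t src)
{
  if (src == 0)
    return dst;
  uint64_t idx = 0;
  while ((src & 1) == 0)
    {
      src >>= 1;
      idx++;
    }
  return idx;
}

/* Hypothetical short-string step.  Bit i of MASK is set when
   character i of the first vector is NUL; MAXLEN is the strnlen
   bound.  */
static size_t
short_strnlen_model (uint64_t mask, size_t maxlen)
{
  /* Preload the result with the bound and let bsf overwrite it only
     when a NUL was found, so no separate "mask == 0" test is needed.  */
  size_t len = (size_t) bsf_model (maxlen, mask);
  return len < maxlen ? len : maxlen;
}
```

With an empty mask the preloaded bound survives, so
short_strnlen_model (0, 5) yields 5, while a NUL at index 2 yields 2.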

The optimizations (especially point 2) make the strnlen and
strlen code essentially incompatible, so strnlen-evex is split
into a new file.

Code Size Changes:
strlen-evex.S       :  -23 bytes
strnlen-evex.S      : -167 bytes

Net perf changes:

Reported as geometric mean of all improvements / regressions from N=10
runs of the benchtests. Values are New Time / Old Time, so < 1.0 is an
improvement and > 1.0 is a regression.

strlen-evex.S       : 0.992 (No real change)
strnlen-evex.S      : 0.947
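
For reference, the summary statistic above can be written out as
follows. The symbols r_i, T_new,i, T_old,i and n are notation introduced
here, and how the N=10 runs are combined for each benchmark
configuration is not stated in this message, so treat this as a sketch.

```latex
% r_i is the time ratio for benchmark configuration i, n the number
% of configurations.
\[
  G = \Bigl( \prod_{i=1}^{n} r_i \Bigr)^{1/n}
    = \exp\Bigl( \frac{1}{n} \sum_{i=1}^{n} \ln r_i \Bigr),
  \qquad
  r_i = \frac{T_{\mathrm{new},i}}{T_{\mathrm{old},i}}
\]
% Hypothetical values, not taken from the benchtests:
% r = (0.5, 2.0) gives G = \sqrt{0.5 \cdot 2.0} = 1.0,
% whereas the arithmetic mean would be 1.25.
```

The geometric mean treats a 2x speedup and a 2x slowdown as cancelling
out, which is why it suits ratios better than an arithmetic mean.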

Full results attached in email.

Full check passes on x86-64.

2022-10-19 17:31:03 -07:00


#ifndef WCSNLEN
# define WCSNLEN	__wcsnlen_evex
#endif
#define STRNLEN	WCSNLEN
#define USE_AS_WCSLEN 1
#include "strnlen-evex.S"
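
The wrapper above reuses one shared body under a different symbol name
and character width. A rough C analogue of that pattern, using a macro
instead of an #include and a scalar loop instead of EVEX vectors, might
look like the sketch below. DEFINE_STRNLEN, model_strnlen and
model_wcsnlen are hypothetical names that are not part of glibc.

```c
#include <stddef.h>
#include <wchar.h>

/* Stand-in for the shared body: one implementation, specialised by
   the name and character type supplied at expansion time.  */
#define DEFINE_STRNLEN(NAME, CHAR_T)                    \
  size_t NAME (const CHAR_T *s, size_t maxlen)          \
  {                                                     \
    size_t i = 0;                                       \
    /* Stop at the first NUL or after MAXLEN characters.  */ \
    while (i < maxlen && s[i] != 0)                     \
      i++;                                              \
    return i;                                           \
  }

/* Byte-string variant, analogous to building strnlen-evex.S directly.  */
DEFINE_STRNLEN (model_strnlen, char)

/* Wide-character variant, analogous to defining USE_AS_WCSLEN and
   renaming STRNLEN before including the shared file.  */
DEFINE_STRNLEN (model_wcsnlen, wchar_t)
```

Both variants count characters rather than bytes, so
model_wcsnlen (L"hello", 10) returns 5 regardless of sizeof (wchar_t).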