glibc/memmove-avx-unaligned-erms.S at e805606193e1a39956ca5ef73cb44a8796730686

AuroraMiddleware/glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-11 22:00:08 +00:00

Noah Goldstein a7392db2ff x86: Optimize memmove-vec-unaligned-erms.S

No bug.

The optimizations are as follows:

1) Always align entry to 64 bytes. This makes behavior more
   predictable and makes other frontend optimizations easier.

2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have
   significant benefits when dst is only slightly ahead of src:
        0 < (dst - src) < N, with N somewhere in [256, 512]

3) Align before `rep movsb`. For ERMS this is roughly a 0% to 30%
   improvement; for FSRM the change ranges from -10% to 25%.
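
As a rough illustration of item 2, a 4k aliasing check can be thought of as a
test on the low 12 bits of (dst - src). The sketch below is C rather than the
assembly touched by this commit. The helper name `within_4k_alias_window`, the
4 KiB `PAGE_SIZE`, and the idea of passing the window size as a parameter are
assumptions made for illustration, not names or values taken from glibc.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Hypothetical helper (not a glibc function).
   Returns 1 when (dst - src) modulo 4 KiB is smaller than `window`,
   i.e. dst sits less than `window` bytes ahead of src once whole
   pages are ignored.  `window` must be a power of two <= PAGE_SIZE.  */
static int within_4k_alias_window(uintptr_t dst, uintptr_t src,
                                  uintptr_t window)
{
    assert(window != 0 && window <= PAGE_SIZE
           && (window & (window - 1)) == 0);

    /* Unsigned subtraction wraps, which is what we want here:
       only the low 12 bits of the difference are inspected.  */
    uintptr_t diff = dst - src;

    /* PAGE_SIZE - window keeps the bits of the page offset that are
       at or above `window`; all of them clear means offset < window.  */
    return (diff & (PAGE_SIZE - window)) == 0;
}
```

With a 256-byte window, `within_4k_alias_window(0x2040, 0x1000, 256)` is true
(the difference is 0x1040, page offset 0x40), while
`within_4k_alias_window(0x2100, 0x1000, 256)` is false (page offset 0x100).
A copy routine could use such a predicate to pick a copy direction or loop
variant whose loads do not trail its own recent stores by a near multiple of
4 KiB.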

In addition to these primary changes there is general cleanup
throughout to optimize the aligning routines and control flow logic.
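
The idea behind item 3 (aligning before `rep movsb`) can be sketched in C as
follows. This is not the code from this commit: `memcpy` stands in for the
bulk `rep movsb` copy, the helper names `align_skip` and
`copy_head_then_aligned` are hypothetical, and the 64-byte destination
alignment is an assumption. The sketch also assumes non-overlapping buffers
and a length of at least 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN 64u   /* assumption: align the destination to 64 bytes */

/* Hypothetical helper: bytes from `addr` up to the next ALIGN-byte
   boundary strictly above it.  The result is in the range 1..ALIGN.  */
static size_t align_skip(uintptr_t addr)
{
    return ALIGN - (addr & (ALIGN - 1));
}

/* Hypothetical helper: copy n >= ALIGN bytes between non-overlapping
   buffers so that the bulk copy starts at an aligned destination.  */
static void *copy_head_then_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n >= ALIGN);

    /* 1. Copy the first ALIGN bytes with no alignment requirement.  */
    memcpy(d, s, ALIGN);

    /* 2. Step both pointers forward so that d + skip is ALIGN-aligned.
          Because skip <= ALIGN, every skipped byte was already copied
          in step 1; any overlap with step 1 just rewrites the same data.  */
    size_t skip = align_skip((uintptr_t)d);

    /* 3. Bulk copy of the remainder (stand-in for `rep movsb`).  */
    memcpy(d + skip, s + skip, n - skip);
    return dst;
}
```

The design choice is to pay for one unconditional, possibly redundant head
copy instead of branching on the exact misalignment, so the bulk copy always
begins on an aligned destination.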

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit a6b7502ec0)

2022-04-26 18:18:16 -07:00



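/* AVX (256-bit ymm) parameters for the shared implementation in
   memmove-vec-unaligned-erms.S, which is written in terms of these
   macros.  */
#if IS_IN (libc)
# define VEC_SIZE	32		/* bytes per ymm register */
# define VEC(i)		ymm##i		/* vector register number i */
# define VMOVNT		vmovntdq	/* non-temporal store */
# define VMOVU		vmovdqu		/* unaligned load/store */
# define VMOVA		vmovdqa		/* aligned load/store */
# define MOV_SIZE	4
# define SECTION(p)		p##.avx
# define MEMMOVE_SYMBOL(p,s)	p##_avx_##s
# include "memmove-vec-unaligned-erms.S"
#endif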
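
To see what the token-pasting macros in this file produce, the small C sketch
below stringifies two expansions. `STR`, `STR_`, `symbol_name` and `vec_name`
are helpers added only for this illustration, and the arguments
`__memmove` and `unaligned_erms` are an assumption about how the included
file invokes `MEMMOVE_SYMBOL`. `SECTION` is left out because pasting onto
`.avx` is only meaningful when preprocessing assembly.

```c
#include <string.h>

/* Same definitions as in the AVX wrapper file.  */
#define VEC(i)               ymm##i
#define MEMMOVE_SYMBOL(p,s)  p##_avx_##s

/* Illustration-only helpers: expand the argument first, then
   turn the resulting token into a string literal.  */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Name of the symbol the macro would generate for the assumed
   arguments (__memmove, unaligned_erms).  */
static const char *symbol_name(void)
{
    return STR(MEMMOVE_SYMBOL(__memmove, unaligned_erms));
}

/* Name of vector register 0 for this (AVX) configuration.  */
static const char *vec_name(void)
{
    return STR(VEC(0));
}
```

Here `symbol_name()` yields "__memmove_avx_unaligned_erms" and `vec_name()`
yields "ymm0". Defining the macros in a small per-ISA wrapper and then
including one shared body lets the same source be assembled once per vector
width.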