glibc/memmove-avx-unaligned-erms.S at 5307aa9c1800f36a64c183c091c9af392c1fa75c - glibc - Gitea: Git with a cup of tea

AuroraMiddleware/glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-25 06:20:06 +00:00

Noah Goldstein a6b7502ec0 x86: Optimize memmove-vec-unaligned-erms.S

No bug.

The optimizations are as follows:

1) Always align entry to 64 bytes. This makes behavior more
   predictable and makes other frontend optimizations easier.

2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have
   significant benefits in the case that:
        0 < (dst - src) < [256, 512]

3) Align before `rep movsb`. For ERMS this is roughly a [0, 30%]
   improvement and for FSRM [-10%, 25%].

In addition to these primary changes there is general cleanup
throughout to optimize the aligning routines and control flow logic.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

2021-11-06 16:18:03 -05:00

13 lines

275 B

ArmAsm

Raw Blame History

 #if IS_IN (libc)
 # define VEC_SIZE	32
 # define VEC(i)		ymm##i
 # define VMOVNT		vmovntdq
 # define VMOVU		vmovdqu
 # define VMOVA		vmovdqa
 # define MOV_SIZE	4
 # define SECTION(p)		p##.avx
 # define MEMMOVE_SYMBOL(p,s)	p##_avx_##s
 # include "memmove-vec-unaligned-erms.S"
 #endif