This patch makes the following improvements:
- Replace VPCMP with VPCMPEQ.
- Replace page cross check logic with sall.
- Remove extra lea from align_more.
- Remove unconditional loop jump.
- Use bsf to check max length in first vector.
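The last item can be illustrated with a minimal C sketch. It is not the
assembly from this patch: the helper name, the NEED_MORE sentinel and the
32-byte VEC_SIZE are assumptions made for illustration, and
__builtin_ctzll (a GCC/Clang builtin) stands in for bsf. The idea shown is
that one bit scan over the null-byte mask of the first vector yields a
position that can be compared directly against the maximum length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_SIZE 32                 /* assumed vector width in bytes */
#define NEED_MORE ((size_t) -1)     /* hypothetical "check later vectors" value */

/* null_mask has bit i set when byte i of the first vector is zero.
   Returns the bounded length if the first vector decides it, else
   NEED_MORE.  */
static size_t
first_vec_strnlen (uint64_t null_mask, size_t maxlen)
{
  assert ((null_mask >> VEC_SIZE) == 0);

  /* __builtin_ctzll is undefined for 0, so map "no null byte" to
     VEC_SIZE explicitly.  */
  size_t pos = null_mask != 0
	       ? (size_t) __builtin_ctzll (null_mask) : VEC_SIZE;

  if (maxlen <= pos)
    return maxlen;		/* Max length reached before any null.  */
  if (pos < VEC_SIZE)
    return pos;			/* Null byte found inside the vector.  */
  return NEED_MORE;
}
```

With a null byte at index 3, first_vec_strnlen (0x8, 10) gives 3 and
first_vec_strnlen (0x8, 2) gives 2. An empty mask with a max length
larger than VEC_SIZE gives NEED_MORE.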
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
To avoid duplicating the VMM / GPR / mask insn macros in all incoming
evex512 files, use the macros defined in 'reg-macros.h' and
'{vec}-macros.h'.
This commit does not change libc.so.
Tested build on x86-64.
1. Add default ISA level selection in non-multiarch/rtld
implementations.
2. Add ISA level build guards to different implementations.
- For example, strcmp-avx2.S, which is ISA level 3, will only build
if the compiled ISA level is <= 3. Otherwise there is no reason to
include it, as we will always use one of the ISA level 4
implementations (strcmp-evex.S).
3. Refactor the ifunc selector and ifunc implementation list to use
the ISA level aware wrapper macros that allow functions below the
compiled ISA level (with a guaranteed replacement) to be skipped.
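The build-guard rule in item 2 can be sketched in plain C preprocessor
terms. COMPILED_ISA_LEVEL, SHOULD_BUILD and impl_is_built are
hypothetical names chosen for this illustration, not the macros the patch
adds, and the strings are only labels for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the ISA level the library is compiled for.  */
#ifndef COMPILED_ISA_LEVEL
# define COMPILED_ISA_LEVEL 3
#endif

/* An implementation of level N is only worth building when the
   compiled level is <= N; above that, a higher-level implementation
   is always usable instead.  */
#define SHOULD_BUILD(impl_level) (COMPILED_ISA_LEVEL <= (impl_level))

static const char *const built_impls[] = {
#if SHOULD_BUILD (4)
  "strcmp-evex",		/* ISA level 4 */
#endif
#if SHOULD_BUILD (3)
  "strcmp-avx2",		/* ISA level 3 */
#endif
#if SHOULD_BUILD (2)
  "strcmp-sse4_2",		/* ISA level 2, skipped when compiled for 3 */
#endif
};

/* Returns 1 if NAME survived the build guards, 0 otherwise.  */
static int
impl_is_built (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof built_impls / sizeof built_impls[0]; i++)
    if (strcmp (built_impls[i], name) == 0)
      return 1;
  return 0;
}
```

With the default level of 3, impl_is_built ("strcmp-avx2") is 1 and
impl_is_built ("strcmp-sse4_2") is 0. Compiling with
-DCOMPILED_ISA_LEVEL=4 would leave only the level 4 entry.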
Tested with and without multiarch on x86_64 for ISA levels:
{generic, x86-64-v2, x86-64-v3, x86-64-v4}
And m32 with and without multiarch.
This patch implements the following evex512 versions of string functions.
Perf gain for evex512 version is up to 50% as compared to evex,
depending on length and alignment.
Placeholder function, not used by any processor at the moment.
- String length function using 512 bit vectors.
- String N length using 512 bit vectors.
- Wide string length using 512 bit vectors.
- Wide string N length using 512 bit vectors.
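As a rough illustration of the 64-byte-at-a-time approach, the portable C
sketch below emulates the compare-to-mask step with a scalar loop. It is
not the evex512 assembly: the function names and the demo buffer are
assumptions, __builtin_ctzll is a GCC/Clang builtin, and the caller is
assumed to guarantee that the buffer is readable in whole 64-byte blocks
up to the one holding the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_SIZE 64		/* 512 bits */

/* Scalar emulation of a 64-byte compare against zero that yields one
   mask bit per byte.  */
static uint64_t
zero_byte_mask (const unsigned char *block)
{
  uint64_t mask = 0;
  for (size_t i = 0; i < VEC_SIZE; i++)
    mask |= (uint64_t) (block[i] == 0) << i;
  return mask;
}

/* Assumption: BUF is readable in whole VEC_SIZE blocks up to and
   including the block that contains the terminating null byte.  */
static size_t
blockwise_strlen (const char *buf)
{
  assert (buf != NULL);
  const unsigned char *p = (const unsigned char *) buf;
  for (size_t off = 0;; off += VEC_SIZE)
    {
      uint64_t mask = zero_byte_mask (p + off);
      if (mask != 0)
	/* Lowest set bit is the first null byte in this block.  */
	return off + (size_t) __builtin_ctzll (mask);
    }
}

/* Zero-padded demo buffer that satisfies the readability assumption.  */
static const char demo[VEC_SIZE * 2] = "hello, evex512";
```

Here blockwise_strlen (demo) returns 14, found in the first block.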
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>