This patch implements the following evex512 versions of string functions.
The evex512 versions take up to 30% fewer cycles than the evex versions,
depending on length and alignment.
- memchr function using 512-bit vectors.
- rawmemchr function using 512-bit vectors.
- wmemchr function using 512-bit vectors.
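
For illustration only, the vector-at-a-time idea behind the functions listed
above can be sketched in portable C. This is not the code in this patch: the
real implementations are written in assembly, and the names
memchr_block_sketch and match_mask are hypothetical. The sketch assumes that
each 64-byte block is compared against the target byte to build a 64-bit
mask, as a 512-bit byte compare would, and that the lowest set bit gives the
first match.

```c
#include <stddef.h>
#include <stdint.h>

#define VEC_SIZE 64 /* Bytes in one 512-bit vector.  */

/* Hypothetical helper: portable stand-in for a 512-bit byte compare.
   Bit i of the result is set when p[i] == c, for i < n (n <= 64).  */
static uint64_t
match_mask (const unsigned char *p, unsigned char c, size_t n)
{
  uint64_t mask = 0;
  for (size_t i = 0; i < n; i++)
    mask |= (uint64_t) (p[i] == c) << i;
  return mask;
}

/* Hypothetical sketch of a block-wise memchr; not the glibc code.  */
static void *
memchr_block_sketch (const void *s, int c, size_t len)
{
  const unsigned char *p = s;
  unsigned char ch = (unsigned char) c;

  while (len > 0)
    {
      /* Handle one full block, or the shorter tail.  */
      size_t n = len < VEC_SIZE ? len : VEC_SIZE;
      uint64_t mask = match_mask (p, ch, n);
      if (mask != 0)
        {
          /* The lowest set bit marks the first matching byte.  */
          size_t idx = 0;
          while (((mask >> idx) & 1) == 0)
            idx++;
          return (void *) (p + idx);
        }
      p += n;
      len -= n;
    }
  return NULL;
}
```

A wider vector means fewer loop iterations per input byte, which is one
plausible reason for the cycle reduction reported here.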
Code size data:
memchr-evex.o 762 bytes
memchr-evex512.o 576 bytes (-24%)
rawmemchr-evex.o 461 bytes
rawmemchr-evex512.o 412 bytes (-11%)
wmemchr-evex.o 794 bytes
wmemchr-evex512.o 552 bytes (-30%)
These are placeholder functions, not used by any processor at the moment.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>