On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
Clear the upper 32 bits of RSI register.
* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
and tst-size_t-wcsnlen.
* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.
Since the result of testl is never used, this patch removes it.
Tested on 64-bit AVX2 machine.
* sysdeps/x86_64/multiarch/strlen-avx2.S (STRLEN): Remove the
unnecessary testl.
Optimize strlen/strnlen/wcslen/wcsnlen with AVX2 to check 32 bytes with
a single vector compare instruction. It is as fast as SSE2 versions for
size <= 16 bytes and up to 1X faster for or size > 16 bytes on Haswell.
Select AVX2 version on AVX2 machines where vzeroupper is preferred and
AVX unaligned load is fast.
NB: It uses TZCNT instead of BSF since TZCNT produces the same result
as BSF for non-zero input. TZCNT is faster than BSF and is executed
as BSF if machine doesn't support TZCNT.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
strlen-sse2, strnlen-sse2, strlen-avx2, strnlen-avx2,
wcslen-sse2, wcslen-avx2 and wcsnlen-avx2.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add tests for __strlen_avx2,
__strlen_sse2, __strnlen_avx2, __strnlen_sse2, __wcslen_avx2,
__wcslen_sse2 and __wcsnlen_avx2.
* sysdeps/x86_64/multiarch/strlen-avx2.S: New file.
* sysdeps/x86_64/multiarch/strlen-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/strlen.c: Likewise.
* sysdeps/x86_64/multiarch/strnlen-avx2.S: Likewise.
* sysdeps/x86_64/multiarch/strnlen-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/strnlen.c: Likewise.
* sysdeps/x86_64/multiarch/wcslen-avx2.S: Likewise.
* sysdeps/x86_64/multiarch/wcslen-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/wcslen.c: Likewise.
* sysdeps/x86_64/multiarch/wcsnlen-avx2.S: Likewise.
* sysdeps/x86_64/multiarch/wcsnlen.c (OPTIMIZE (avx2)): New.
(IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX2 machines where
vzeroupper is preferred and AVX unaligned load is fast.