Commit Graph

3 Commits

Author SHA1 Message Date
H.J. Lu
696ac77484 Add a comment in memset-sse2-unaligned-erms.S
* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Add
	a comment on VMOVU and VMOVA.
2016-04-05 05:19:18 -07:00
H.J. Lu
5cd7af016d Don't put SSE2/AVX/AVX512 memmove/memset in ld.so
Since memmove and memset in ld.so don't use IFUNC, don't put SSE2, AVX
and AVX512 memmove and memset in ld.so.

	* sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: Skip
	if not in libc.
	* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
2016-04-03 14:35:38 -07:00
H.J. Lu
830566307f Add x86-64 memset with unaligned store and rep stosb
Implement x86-64 memset with unaligned store and rep movsb.  Support
16-byte, 32-byte and 64-byte vector register sizes.  A single file
provides 2 implementations of memset, one with rep stosb and the other
without rep stosb.  They share the same codes when size is between 2
times of vector register size and REP_STOSB_THRESHOLD which defaults
to 2KB.

Key features:

1. Use overlapping store to avoid branch.
2. For size <= 4 times of vector register size, fully unroll the loop.
3. For size > 4 times of vector register size, store 4 times of vector
register size at a time.

	[BZ #19881]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and
	memset-avx512-unaligned-erms.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned,
	__memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned,
	__memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned,
	__memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned,
	__memset_sse2_unaligned_erms, __memset_erms,
	__memset_avx2_unaligned, __memset_avx2_unaligned_erms,
	__memset_avx512_unaligned_erms and __memset_avx512_unaligned.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New
	file.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:
	Likewise.
2016-03-31 10:06:07 -07:00