glibc/sysdeps/i386/i686
Andrew Senkevich 8b4416d83c i386: memcpy functions with SSE2 unaligned load/store
These new memcpy functions are the 32-bit versions of the x86_64 SSE2
unaligned memcpy.  The average memcpy performance benefit is 18% on
Silvermont; other platforms also improved, by about 35%.  Benchmarks were
run on Silvermont, Haswell, Ivy Bridge, Sandy Bridge and Westmere; the
performance results are attached in

https://sourceware.org/ml/libc-alpha/2014-07/msg00157.html
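
As a rough illustration of what "SSE2 unaligned load/store" means, the sketch
below copies 16 bytes at a time with the unaligned SSE2 intrinsics and
finishes with a byte loop.  It is not the code from this commit: the real
routines are hand-written assembly, and the function name
copy_sse2_unaligned_sketch is hypothetical.  The sketch assumes the two
regions do not overlap and falls back to the byte loop when the compiler
does not define __SSE2__.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_loadu_si128, _mm_storeu_si128 */
#endif

/* Hypothetical helper, NOT the glibc implementation.
   Copies n bytes from src to dst; the regions must not overlap.  */
void *
copy_sse2_unaligned_sketch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  assert (n == 0 || (dst != NULL && src != NULL));

#if defined(__SSE2__)
  /* Move 16-byte chunks.  The "u" variants accept any address, so no
     alignment prologue is needed before the main loop.  */
  while (n >= 16)
    {
      __m128i chunk = _mm_loadu_si128 ((const __m128i *) s);
      _mm_storeu_si128 ((__m128i *) d, chunk);
      s += 16;
      d += 16;
      n -= 16;
    }
#endif

  /* Copy the remaining tail (or everything, without SSE2) bytewise.  */
  while (n > 0)
    {
      *d++ = *s++;
      n--;
    }

  return dst;
}
```

Calling copy_sse2_unaligned_sketch (dst + 1, src + 3, 35) exercises two
16-byte chunks plus a 3-byte tail on deliberately misaligned pointers.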

	* sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S: New file.
	* sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove-sse2-unaligned.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy-sse2-unaligned.S: Likewise.
	* sysdeps/i386/i686/multiarch/bcopy.S: Select the sse2_unaligned
	version if bit_Fast_Unaligned_Load is set.
	* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
	bcopy-sse2-unaligned, memcpy-sse2-unaligned,
	memmove-sse2-unaligned and mempcpy-sse2-unaligned.
	* sysdeps/i386/i686/multiarch/ifunc-impl-list.c (MAX_IFUNC): Set
	to 4.
	(__libc_ifunc_impl_list): Test __bcopy_sse2_unaligned,
	__memmove_chk_sse2_unaligned, __memmove_sse2_unaligned,
	__memcpy_chk_sse2_unaligned, __memcpy_sse2_unaligned,
	__mempcpy_chk_sse2_unaligned, and __mempcpy_sse2_unaligned.
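
The selection rule in the entries above ("Select the sse2_unaligned version
if bit_Fast_Unaligned_Load is set") can be modelled in plain C as choosing a
function pointer from a feature word.  This is a simplified sketch under
stated assumptions, not the selector from this commit: the real selectors
are assembly and also weigh other variants.  Every name ending in _sketch,
the SKETCH_FAST_UNALIGNED_LOAD flag, and its bit value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Hypothetical stand-in for bit_Fast_Unaligned_Load; the bit position
   is illustrative only.  */
#define SKETCH_FAST_UNALIGNED_LOAD (1u << 0)

/* Stand-in for the generic variant: a plain byte loop.  */
void *
memcpy_baseline_sketch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  assert (n == 0 || (dst != NULL && src != NULL));
  while (n-- > 0)
    *d++ = *s++;
  return dst;
}

/* Stand-in for the sse2_unaligned variant.  It simply delegates to the
   C library so that the example stays portable.  */
void *
memcpy_sse2_unaligned_sketch (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Pick an implementation once, based on the CPU feature word.  */
memcpy_fn
select_memcpy_sketch (unsigned int feature_bits)
{
  if (feature_bits & SKETCH_FAST_UNALIGNED_LOAD)
    return memcpy_sse2_unaligned_sketch;
  return memcpy_baseline_sketch;
}
```

A caller would resolve the pointer once, for example
memcpy_fn f = select_memcpy_sketch (SKETCH_FAST_UNALIGNED_LOAD), and then
route every copy through f.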
2014-12-30 07:19:38 -08:00
..
fpu Fix __ieee754_logl (-LDBL_MAX) in FE_DOWNWARD mode (bug 17022). 2014-06-18 12:32:01 +00:00
multiarch i386: memcpy functions with SSE2 unaligned load/store 2014-12-30 07:19:38 -08:00
nptl x86: Consolidate unnecessary nptl/ subdirectories. 2014-06-24 19:17:43 -07:00
add_n.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
bcopy.S Optimize 32bit memset/memcpy with SSE2/SSSE3. 2010-01-12 11:22:03 -08:00
bzero.S Remove remaining bounded-pointers support from i386 .S files. 2013-02-21 22:21:52 +00:00
cacheinfo.c Change __x86_64 prefix in cache size to __x86 2013-01-05 16:00:38 -08:00
dl-hash.h Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
ffs.c Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
hp-timing.h Always provide HP_SMALL_TIMING_AVAIL 2014-07-03 08:38:36 -07:00
Implies [BZ #106] 2004-08-09 01:01:10 +00:00
Makefile Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead 2014-07-03 08:38:25 -07:00
memcmp.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
memcpy_chk.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
memcpy.S Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
memmove_chk.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
memmove.S Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
mempcpy_chk.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
mempcpy.S Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
memset_chk.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
memset.S Remove NOT_IN_libc 2014-11-24 15:03:45 +05:30
memusage.h Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
pthread_spin_trylock.S x86: Consolidate unnecessary nptl/ subdirectories. 2014-06-24 19:17:43 -07:00
stack-aliasing.h Clean up stack-coloring macros. 2014-06-20 19:50:16 -07:00
strcmp.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
strtok_r.S Remove remaining bounded-pointers support from i386 .S files. 2013-02-21 22:21:52 +00:00
strtok.S Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
tst-stack-align.h Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00