glibc/sysdeps
Patrick McGehearty e6a1c5dc77 sparc: M7 optimized memset/bzero
Add support for identifying SPARC M7/T7/S7/M8/T8 processor capability.
Performance tests were run on a SPARC S7, comparing the new code with
the old niagara4 code.

Optimizations for memset also apply to bzero as they share code.
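
A minimal C sketch of what "share code" can mean here: both entry points
funnel into one fill routine, so tuning that routine benefits both. The
names shared_fill, sketch_memset and sketch_bzero are hypothetical; the
actual niagara7 implementation is SPARC assembly and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared store loop.  It stands in for the common body
   that both entry points reach; it is not the real implementation.  */
static void *shared_fill(void *dst, unsigned char byte, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = byte;
    return dst;
}

/* memset-like entry point: narrows the fill value and forwards.  */
static void *sketch_memset(void *dst, int c, size_t n)
{
    assert(dst != NULL || n == 0);
    return shared_fill(dst, (unsigned char) c, n);
}

/* bzero-like entry point: same body with the fill value fixed at 0.  */
static void sketch_bzero(void *dst, size_t n)
{
    assert(dst != NULL || n == 0);
    (void) shared_fill(dst, 0, n);
}
```

Under this assumption, any speedup in the shared body shows up in both
sketch_memset and sketch_bzero without separate work.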

For memset/bzero, performance comparison with niagara4 code:
For memset nonzero data,
  256-1023 bytes - 60-90% gain (in cache); 5% gain (out of cache)
  1024+ bytes - 80-260% gain (in cache); 40-80% gain (out of cache)
For memset zero data (and bzero),
  256-1023 bytes - 80-120% gain (in cache); 0% gain (out of cache)
  1024+ bytes - 2-4x gain (in cache); 10-35% gain (out of cache)
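
For readers who want to take similar measurements, here is a rough C
timing sketch. It is not the harness used for the figures above, which
the commit message does not describe. The helper name time_memset is
hypothetical, and clock() measures CPU time at coarse resolution, so
treat its results as indicative only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: CPU seconds spent on `iters` calls to memset
   over a `size`-byte buffer.  Returns a negative value if the buffer
   cannot be allocated.  */
static double time_memset(size_t size, int value, long iters)
{
    assert(size > 0 && iters > 0);

    unsigned char *buf = malloc(size);
    if (buf == NULL)
        return -1.0;

    /* Reading back through a volatile object discourages the compiler
       from discarding the stores; it is not a guarantee.  */
    volatile unsigned char sink = 0;

    clock_t start = clock();
    for (long i = 0; i < iters; i++) {
        memset(buf, value, size);
        sink ^= buf[(size_t) i % size];
    }
    clock_t end = clock();

    (void) sink;
    free(buf);
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Repeatedly filling one small buffer, as this does, approximates the
"in cache" case. Approximating "out of cache" would need a working set
larger than the last-level cache, for example by cycling through many
buffers; that part is left out of the sketch.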

Tested on sparcv9-*-* and sparc64-*-* targets in both multiarch and
non-multiarch configurations.

	Patrick McGehearty <patrick.mcgehearty@oracle.com>
	Adhemerval Zanella  <adhemerval.zanella@linaro.org>

	* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
	(sysdeps_routines): Add memset-niagara7.
	* sysdeps/sparc/sparc64/multiarch/Makefile (sysdeps_routines):
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memset-niagara7.S: New
	file.
	* sysdeps/sparc/sparc64/multiarch/memset-niagara7.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add __bzero_niagara7 and __memset_niagara7.
	* sysdeps/sparc/sparc64/multiarch/ifunc-memset.h (IFUNC_SELECTOR):
	Add niagara7 option.
	* NEWS: Mention sparc m7 optimized memcpy, mempcpy, memmove, and
	memset.
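
The selector change listed above can be pictured with a portable C
sketch: pick the most specific implementation that the reported
capability bits allow. The CAP_* bits, the *_stub functions and
select_memset are hypothetical names for illustration; the real
IFUNC_SELECTOR in ifunc-memset.h, its hardware capability flags and its
full list of variants are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capability bits; the real selector tests hardware
   capability flags that are not shown in this sketch.  */
enum {
    CAP_NIAGARA4 = 1 << 0,
    CAP_NIAGARA7 = 1 << 1
};

typedef void *(*memset_fn)(void *, int, size_t);

/* Stubs standing in for the per-CPU assembly variants.  All of them
   simply call the C library memset.  */
static void *memset_generic_stub(void *d, int c, size_t n)
{
    return memset(d, c, n);
}

static void *memset_niagara4_stub(void *d, int c, size_t n)
{
    return memset(d, c, n);
}

static void *memset_niagara7_stub(void *d, int c, size_t n)
{
    return memset(d, c, n);
}

/* Check the newest capability first so that a CPU reporting both
   bits gets the niagara7 variant.  */
static memset_fn select_memset(unsigned int hwcap)
{
    if (hwcap & CAP_NIAGARA7)
        return memset_niagara7_stub;
    if (hwcap & CAP_NIAGARA4)
        return memset_niagara4_stub;
    return memset_generic_stub;
}
```

The ordering is the point of the sketch: the newest variant is tested
first, and older or generic variants serve as fallbacks.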
2017-12-14 08:48:19 -02:00
aarch64 aarch64: Improve strcmp unaligned performance 2017-12-13 18:50:27 +05:30
alpha Update Alpha libm-test-ulps 2017-12-06 18:55:09 -02:00
arm Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
generic Support _Float32 in libm_alias_float. 2017-12-06 22:14:09 +00:00
gnu The -Wstringop-truncation option new in GCC 8 detects common misuses 2017-11-15 17:39:59 -07:00
hppa Handle __gmon_start__ as undefined weak on hppa. 2017-12-02 14:43:28 -05:00
i386 Add _Float32 function aliases. 2017-12-07 00:48:31 +00:00
ia64 Update IA64 libm-test-ulps 2017-12-12 16:57:41 -02:00
ieee754 Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
init_array Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
m68k Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
mach Introduce NO_RTLD_HIDDEN, make hurd use it instead of NO_HIDDEN 2017-10-03 01:33:38 +02:00
microblaze Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
mips Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
nios2 Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
nptl nptl: Define __PTHREAD_MUTEX_{NUSERS_AFTER_KIND,USE_UNION} 2017-11-07 09:48:41 -02:00
posix posix: Fix generic p{read,write}v buffer allocation (BZ#22457) 2017-11-24 12:16:15 -02:00
powerpc Remove --with-fp / --without-fp. 2017-12-12 13:56:47 +00:00
pthread aio: Remove internal_function function attribute 2017-08-31 15:59:06 +02:00
s390 S390: Add CFI rule in _dl_runtime_resolve[_vx] for unwinding. 2017-12-11 08:47:51 +01:00
sh Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
sparc sparc: M7 optimized memset/bzero 2017-12-14 08:48:19 -02:00
tile Add sysdeps/ieee754/soft-fp. 2017-12-12 23:35:21 +00:00
unix ia64: Add ipc_priv.h header to set __IPC_64 to zero 2017-12-12 12:19:24 -02:00
wordsize-32 Build divdi3 only for architecture that required it 2017-04-06 15:14:34 -03:00
wordsize-64 posix: Consolidate Linux glob implementation 2017-09-08 16:34:02 +02:00
x86 Add _Float64x function aliases. 2017-11-27 14:16:47 +00:00
x86_64 x86-64: Add cosf with FMA 2017-12-12 15:32:58 -08:00