Siddhesh Poyarekar 5a67c4fa01 aarch64: Optimized memset for falkor
The generic memset reads dczid_el0 on every call.  This has a
significant impact on falkor for a range of sizes because reading
dczid_el0 is slow.

The DZP bit in the dczid_el0 register does not change dynamically, so
it is safe to read once during program startup.  With this patch
dczid_el0 is read once during startup and zva_size is cached.  The
cached zva_size is used to select the falkor-specific memset; the
generic memset routine remains unchanged.
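
The read-once-and-cache idea above can be sketched in portable C.  This is
an illustration, not the code from the patch: the function names and the
`cached_zva_size` variable are hypothetical, and the raw register value is
passed in as a plain integer so the sketch runs on any host.  The macro
names appear in the ChangeLog entry below; their values here assume the
architectural layout of DCZID_EL0 (DZP in bit 4, BS in bits 3:0, block
size of 4 << BS bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed DCZID_EL0 layout: DZP is bit 4, BS is bits [3:0].  */
#define DCZID_DZP_MASK (1u << 4)  /* Set when DC ZVA is prohibited.  */
#define DCZID_BS_MASK  0xfu       /* log2 of the block size in 4-byte words.  */

/* Decode a raw DCZID_EL0 value into a block size in bytes.
   Returns 0 when DC ZVA must not be used.  */
static unsigned
zva_size_from_dczid (uint64_t dczid)
{
  if (dczid & DCZID_DZP_MASK)
    return 0;
  return 4u << (dczid & DCZID_BS_MASK);
}

/* Hypothetical startup cache: filled once, read by later selection code.  */
static unsigned cached_zva_size;

static void
init_zva_size (uint64_t dczid)
{
  cached_zva_size = zva_size_from_dczid (dczid);
  /* A usable size is always a power of two.  */
  assert (cached_zva_size == 0
          || (cached_zva_size & (cached_zva_size - 1)) == 0);
}
```

On real hardware the raw value would come from a single `mrs` read of
dczid_el0 at startup; after that, only the cached size is consulted, which
avoids the slow register read in each memset call.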

The gains due to this are significant for falkor, with run time
reductions as high as 48%.  Here's a sample from the falkor tests:

Function: memset
Variant: walk
                      simple_memset       __memset_falkor  __memset_generic
===========================================================================
length=256, char=0:   139.96 (-698.28%)    9.07 ( 48.26%)  17.53
length=257, char=0:   140.50 (-699.03%)    9.53 ( 45.80%)  17.58
length=258, char=0:   140.96 (-703.95%)    9.58 ( 45.36%)  17.53
length=259, char=0:   141.56 (-705.16%)    9.53 ( 45.79%)  17.58
length=260, char=0:   142.15 (-710.76%)    9.57 ( 45.39%)  17.53
length=261, char=0:   142.50 (-710.39%)    9.53 ( 45.78%)  17.58
length=262, char=0:   142.97 (-715.09%)    9.57 ( 45.42%)  17.54
length=263, char=0:   143.51 (-716.18%)    9.53 ( 45.80%)  17.58
length=264, char=0:   143.93 (-720.55%)    9.58 ( 45.39%)  17.54
length=265, char=0:   144.56 (-722.07%)    9.53 ( 45.80%)  17.59
length=266, char=0:   144.98 (-726.42%)    9.58 ( 45.42%)  17.54
length=267, char=0:   145.53 (-727.53%)    9.53 ( 45.80%)  17.59
length=268, char=0:   146.25 (-731.81%)    9.53 ( 45.79%)  17.58
length=269, char=0:   146.52 (-735.39%)    9.53 ( 45.66%)  17.54
length=270, char=0:   146.97 (-735.81%)    9.53 ( 45.80%)  17.58
length=271, char=0:   147.54 (-741.08%)    9.58 ( 45.38%)  17.54
length=512, char=0:   268.26 (-1307.85%)  12.06 ( 36.71%)  19.05
length=513, char=0:   268.73 (-1273.89%)  13.56 ( 30.68%)  19.56
length=514, char=0:   269.31 (-1276.89%)  13.56 ( 30.68%)  19.56
length=515, char=0:   269.73 (-1279.05%)  13.56 ( 30.68%)  19.56
length=516, char=0:   270.34 (-1282.24%)  13.56 ( 30.67%)  19.56
length=517, char=0:   270.83 (-1284.71%)  13.56 ( 30.66%)  19.56
length=518, char=0:   271.20 (-1286.54%)  13.56 ( 30.67%)  19.56
length=519, char=0:   271.67 (-1288.67%)  13.65 ( 30.24%)  19.56
length=520, char=0:   272.14 (-1291.04%)  13.65 ( 30.22%)  19.56
length=521, char=0:   272.66 (-1293.69%)  13.65 ( 30.23%)  19.56
length=522, char=0:   273.14 (-1296.13%)  13.65 ( 30.20%)  19.56
length=523, char=0:   273.64 (-1298.75%)  13.65 ( 30.23%)  19.56
length=524, char=0:   274.34 (-1302.16%)  13.66 ( 30.20%)  19.57
length=525, char=0:   274.64 (-1297.78%)  13.56 ( 30.99%)  19.65
length=526, char=0:   275.20 (-1300.04%)  13.56 ( 31.01%)  19.66
length=527, char=0:   275.66 (-1302.86%)  13.56 ( 30.99%)  19.65
length=1024, char=0:  524.46 (-2169.75%)  20.12 ( 12.92%)  23.11
length=1025, char=0:  525.14 (-2124.63%)  21.62 (  8.40%)  23.61
length=1026, char=0:  525.59 (-2125.36%)  21.88 (  7.37%)  23.62
length=1027, char=0:  525.98 (-2127.14%)  21.62 (  8.46%)  23.62
length=1028, char=0:  526.68 (-2131.10%)  21.62 (  8.42%)  23.61
length=1029, char=0:  527.10 (-2131.70%)  21.79 (  7.73%)  23.62
length=1030, char=0:  527.54 (-2118.51%)  21.62 (  9.10%)  23.78
length=1031, char=0:  527.98 (-2136.37%)  21.62 (  8.43%)  23.61
length=1032, char=0:  528.70 (-2139.38%)  21.62 (  8.43%)  23.61
length=1033, char=0:  529.25 (-2124.37%)  21.62 (  9.11%)  23.79
length=1034, char=0:  529.48 (-2142.95%)  21.62 (  8.43%)  23.61
length=1035, char=0:  530.11 (-2145.13%)  21.62 (  8.44%)  23.61
length=1036, char=0:  530.76 (-2147.10%)  21.79 (  7.73%)  23.62
length=1037, char=0:  531.03 (-2149.45%)  21.62 (  8.42%)  23.61
length=1038, char=0:  531.64 (-2151.87%)  21.62 (  8.42%)  23.61
length=1039, char=0:  531.99 (-2151.63%)  21.80 (  7.75%)  23.63
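
The parenthesised figures in the table read as the run-time reduction of
each variant relative to __memset_generic, with negative values meaning
slower.  A minimal sketch of that arithmetic follows; the function name is
hypothetical, and recomputing from the rounded timings shown here matches
the printed percentages only approximately, presumably because the
benchmark works from unrounded values.

```c
#include <assert.h>

/* Percentage run-time reduction of CANDIDATE relative to BASELINE.
   Positive means the candidate is faster; negative means it is slower.  */
static double
reduction_percent (double baseline, double candidate)
{
  assert (baseline > 0.0);
  return (baseline - candidate) / baseline * 100.0;
}
```

For the length=256 row, `reduction_percent (17.53, 9.07)` is about 48.26,
which is the "as high as 48%" figure quoted in the commit message.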

	* sysdeps/aarch64/memset-reg.h: New file.
	* sysdeps/aarch64/memset.S: Use it.
	(__memset): Rename to MEMSET macro.
	[ZVA_MACRO]: Use zva_macro.
	* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
	Add memset_generic and memset_falkor.
	* sysdeps/aarch64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add memset ifuncs.
	* sysdeps/aarch64/multiarch/init-arch.h (INIT_ARCH): New
	local variable zva_size.
	* sysdeps/aarch64/multiarch/memset.c: New file.
	* sysdeps/aarch64/multiarch/memset_generic.S: New file.
	* sysdeps/aarch64/multiarch/memset_falkor.S: New file.
	* sysdeps/aarch64/multiarch/rtld-memset.S: New file.
	* sysdeps/unix/sysv/linux/aarch64/cpu-features.c
	(DCZID_DZP_MASK): New macro.
	(DCZID_BS_MASK): Likewise.
	(init_cpu_features): Read and set zva_size.
	* sysdeps/unix/sysv/linux/aarch64/cpu-features.h
	(struct cpu_features): New member zva_size.
2017-11-20 18:25:04 +05:30
argp Mark internal argp functions with attribute_hidden [BZ #18822] 2017-10-01 15:10:27 -07:00
assert Fix position of tests-unsupported definition in assert/Makefile. 2017-08-22 00:30:51 +00:00
benchtests benchtests: Bump start size since smaller sizes are noisy 2017-11-20 18:03:32 +05:30
bits Support bits/floatn.h inclusion from .S files. 2017-11-17 22:01:43 +00:00
catgets Don't compile non-lib modules as lib modules [BZ #21864] 2017-08-21 05:34:54 -07:00
ChangeLog.old Add missing reference to bug 21654 2017-10-07 13:14:36 +02:00
conform Fix mcontext_t sigcontext namespace (bug 21457). 2017-08-30 22:02:04 +00:00
crypt Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
csu Use newly built crt*.o files to build shared objects [BZ #22362] 2017-11-06 08:29:57 -08:00
ctype Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
debug Enable unwind info in libc-start.c and backtrace.c 2017-09-19 15:07:58 +01:00
dev Rename xlocale.h to bits/types/__locale_t.h. 2017-06-20 20:28:11 -04:00
dirent hurd: Fix dirfd symbol exposition from ftw 2017-09-28 00:49:05 +02:00
dlfcn Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
elf ld.so: Add architecture specific fields 2017-11-13 08:02:52 -08:00
gmon Add a test for profiling static executable 2017-10-14 12:58:55 -07:00
gnulib Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
grp Remove compat from DEFAULT_CONFIG lookup strings 2017-09-12 10:21:48 -07:00
gshadow Remove __need macros from stdio.h and wchar.h. 2017-06-08 13:58:17 -04:00
hesiod Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
hurd hurd: fix gethostname(NULL, 0) 2017-09-07 00:51:17 +02:00
iconv localedef: Add --no-warnings/--warnings option 2017-10-25 13:36:54 -07:00
iconvdata Add new codepage charmaps/IBM858 [BZ #21084] 2017-09-14 15:50:57 +02:00
include ld.so: Add architecture specific fields 2017-11-13 08:02:52 -08:00
inet Hide internal idna functions [BZ #18822] 2017-10-01 17:33:22 -07:00
intl Hide internal __hash_string function [BZ #18822] 2017-10-01 17:41:34 -07:00
io Assume that _DIRENT_HAVE_D_TYPE is always defined. 2017-10-30 15:48:33 +01:00
libidn Remove add-ons mechanism. 2017-10-05 15:58:13 +00:00
libio Always do locking when iterating over list of streams (bug 15142) 2017-10-05 17:26:05 +02:00
locale Assume that _DIRENT_HAVE_D_TYPE is always defined. 2017-10-30 15:48:33 +01:00
localedata Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
login openpty: use TIOCGPTPEER to open slave side fd 2017-10-08 17:47:58 +02:00
mach hurd: Remove duplicate symbol version 2017-08-28 14:19:55 +02:00
malloc Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
manual manual: Document the MAP_HUGETLB, MADV_HUGEPAGE, MADV_NOHUGEPAGE flags 2017-11-20 13:23:17 +01:00
math Use __builtin_tgmath in tgmath.h with GCC 8 (bug 21660). 2017-11-15 02:08:56 +00:00
mathvec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
misc The -Wstringop-truncation option new in GCC 8 detects common misuses 2017-11-15 17:39:59 -07:00
nis nscd: remove reference to libnsl 2017-10-11 15:51:52 +02:00
nptl Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
nptl_db Move all old ChangeLogs to a top-level ChangeLog.old directory. 2017-09-01 09:31:43 -04:00
nscd nscd: remove reference to libnsl 2017-10-11 15:51:52 +02:00
nss nss_files: Avoid large buffers with many host addresses [BZ #22078] 2017-10-11 07:07:51 +02:00
po Update translations from the Translation Project 2017-11-03 23:30:23 +00:00
posix posix/tst-glob-tilde.c: Add test for bug 22332 2017-11-02 11:06:45 +01:00
pwd Remove __need macros from stdio.h and wchar.h. 2017-06-08 13:58:17 -04:00
resolv support: Add <support/next_to_fault.h> 2017-11-13 19:29:32 +01:00
resource Hide internal __setrlimit function [BZ #18822] 2017-10-01 17:46:54 -07:00
rt aio: Remove internal_function function attribute 2017-08-31 15:59:06 +02:00
scripts Use Linux 4.14 in build-many-glibcs.py. 2017-11-15 17:10:03 +00:00
setjmp Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
shadow Remove __need macros from stdio.h and wchar.h. 2017-06-08 13:58:17 -04:00
signal Optimize sigrelse implementation 2017-11-15 15:45:39 -02:00
socket __opensock: Remove internal_function attribute 2017-08-17 10:18:15 +02:00
soft-fp Use libm_alias_* in soft-fp. 2017-10-11 00:03:46 +00:00
stdio-common Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
stdlib Handle more _FloatN, _FloatNx types in type-generic strtod tests. 2017-11-07 18:08:44 +00:00
streams Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
string Fix string/tester.c build with GCC 8. 2017-11-14 17:52:26 +00:00
sunrpc Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
support support_become_root: Fix comment style 2017-11-18 17:54:24 +01:00
sysdeps aarch64: Optimized memset for falkor 2017-11-20 18:25:04 +05:30
sysvipc Fix test-sysvsem on some platforms 2017-01-02 18:53:50 -02:00
termios Hide internal __tcgetattr function [BZ #18822] 2017-10-01 17:48:24 -07:00
time time: Remove the internal_function attribute 2017-08-31 15:59:07 +02:00
timezone timezone: pacify GCC -Wstringop-truncation 2017-11-12 22:00:28 -08:00
wcsmbs Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
wctype Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
.gitattributes
.gitignore
abi-tags Remove the bulk of the NaCl port. 2017-05-20 08:09:10 -04:00
aclocal.m4 gmon: Add test for basic mcount/gprof functionality 2017-08-15 15:49:45 +02:00
ChangeLog aarch64: Optimized memset for falkor 2017-11-20 18:25:04 +05:30
config.h.in Don't use hidden visibility in libc.a with PIE on i386 2017-10-04 17:18:42 -07:00
config.make.in Use newly built crt*.o files to build shared objects [BZ #22362] 2017-11-06 08:29:57 -08:00
configure Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
configure.ac Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
COPYING
COPYING.LIB
extra-lib.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
gen-locales.mk
INSTALL Fix botched up regeneration in the last commit 2017-11-16 12:08:52 +05:30
libc-abis
libof-iterator.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
LICENSES
MAINTAINERS Add MAINTAINERS 2017-05-11 13:38:30 -04:00
Makeconfig Add a test for profiling static executable 2017-10-14 12:58:55 -07:00
Makefile Remove add-ons mechanism. 2017-10-05 15:58:13 +00:00
Makefile.in
Makerules Use newly built crt*.o files to build shared objects [BZ #22362] 2017-11-06 08:29:57 -08:00
NEWS Prefer https for Sourceware links 2017-11-16 11:49:26 +05:30
o-iterator.mk
README Require Linux kernel 3.2 or later on x86 / x86_64. 2017-05-08 10:45:20 +00:00
Rules Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
shlib-versions Extend NSS test suite 2017-07-17 15:52:44 -04:00
test-skeleton.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
version.h version.h: Switch to ".9000" as the suffix for the development version 2017-10-16 21:39:18 +02:00

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.  The current
GNU/Hurd support requires out-of-tree patches that will eventually be
incorporated into an official GNU C Library release.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arm-*-linux-gnueabi
	hppa-*-linux-gnu	Not currently functional without patches.
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu
	tilegx-*-linux-gnu
	tilepro-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see http://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at http://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see http://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.