Noah Goldstein af992e7abd x86: Increase non_temporal_threshold to roughly sizeof_L3 / 4
The current `non_temporal_threshold` is set to roughly '3/4 *
sizeof_L3 / ncores_per_socket'. This patch updates that value to
roughly 'sizeof_L3 / 4'.

The original value (specifically the division by `ncores_per_socket`)
was chosen to limit the amount of other threads' data a
`memcpy`/`memset` could evict.

Dividing by 'ncores_per_socket', however, leads to exceedingly low
non-temporal thresholds, so non-temporal stores end up being used in
cases where REP MOVSB is multiple times faster.

Furthermore, non-temporal stores are written directly to main memory,
so using them at a size much smaller than L3 can place soon-to-be
accessed data much further away than it otherwise could be. In
addition, modern machines are able to detect streaming patterns
(especially if REP MOVSB is used) and provide LRU hints to the memory
subsystem. This in effect caps the total amount of eviction at
1/cache_associativity, far below meaningfully thrashing the entire
cache.

As best I can tell, the benchmarks that led to this small threshold
were done comparing non-temporal stores versus standard cacheable
stores. A better comparison (linked below) is to REP MOVSB which,
on the measured systems, is nearly 2x faster than non-temporal stores
at the low end of the previous threshold, and within 10% for copies
over 100MB (well past even the current threshold). In cases with a
low number of threads competing for bandwidth, REP MOVSB is ~2x faster
up to `sizeof_L3`.

The divisor of `4` is a somewhat arbitrary value. From benchmarks it
seems Skylake and Icelake both prefer a divisor of `2`, but older CPUs
such as Broadwell prefer something closer to `8`. This patch is meant
to be followed up by another one to make the divisor cpu-specific, but
in the meantime (and for easier backporting), this patch settles on
`4` as a middle-ground.

Benchmarks comparing non-temporal stores, REP MOVSB, and cacheable
stores were done using:
https://github.com/goldsteinn/memcpy-nt-benchmarks

Sheets results (also available in pdf on the github):
https://docs.google.com/spreadsheets/d/e/2PACX-1vS183r0rW_jRX6tG_E90m9qVuFiMbRIJvi5VAE8yYOvEOIEEc3aSNuEsrFbuXw5c3nGboxMmrupZD7K/pubhtml
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-06-12 11:33:39 -05:00
argp tests: fix warn unused result on asprintf calls 2023-06-06 08:23:53 -04:00
assert assert: Reformat Makefile. 2023-05-18 12:56:45 -04:00
benchtests Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
bits Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
catgets Move {read,write}_all functions to a dedicated header 2023-06-06 08:23:53 -04:00
ChangeLog.old Create ChangeLog.old/ChangeLog.26. 2023-01-31 22:27:45 -05:00
conform wchar: Define va_list for POSIX (BZ #30035) 2023-05-25 16:43:29 -03:00
crypt Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
csu Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
ctype Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
debug Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
dirent tests: replace write by xwrite 2023-06-01 12:40:05 -04:00
dlfcn Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
elf elf: Sort Makefile variables. 2023-06-02 21:43:05 -04:00
gmon Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
gnulib Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
grp Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
gshadow Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions 2023-03-27 13:57:55 -03:00
hesiod Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
htl Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
hurd Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
iconv Fix misspellings in iconv/ and iconvdata/ -- BZ 25337 2023-05-27 16:37:14 +00:00
iconvdata Fix misspellings in iconv/ and iconvdata/ -- BZ 25337 2023-05-27 16:37:14 +00:00
include Move {read,write}_all functions to a dedicated header 2023-06-06 08:23:53 -04:00
inet Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
intl Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions 2023-03-27 13:57:55 -03:00
io Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
libio Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
locale locale/programs/locarchive.c: fix warn unused result 2023-05-24 21:38:46 -04:00
localedata localedata: de_DE should not use Fräulein 2023-02-27 16:54:22 +01:00
login Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
mach Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
malloc Move {read,write}_all functions to a dedicated header 2023-06-06 08:23:53 -04:00
manual Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
math Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
mathvec Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
misc tests: Replace various function calls with their x variant 2023-06-06 08:23:53 -04:00
nis nis: Fix stringop-truncation warning with -O3 in nis_local_host. 2023-03-02 14:22:54 +01:00
nptl Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
nptl_db Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
nscd Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
nss tests: Replace various function calls with their x variant 2023-06-06 08:23:53 -04:00
po Update all PO files in preparation for release. 2023-01-31 17:51:40 -05:00
posix tests: Replace various function calls with their x variant 2023-06-06 08:23:53 -04:00
pwd Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions 2023-03-27 13:57:55 -03:00
resolv resolv_conf: release lock on allocation failure (bug 30527) 2023-06-07 12:44:25 +02:00
resource Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
rt Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
scripts Add lint-makefiles Makefile linting test. 2023-06-02 21:43:05 -04:00
setjmp Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
shadow Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions 2023-03-27 13:57:55 -03:00
signal Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
socket Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
soft-fp Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
stdio-common tests: fix warn unused result on asprintf calls 2023-06-06 08:23:53 -04:00
stdlib tests: Replace various function calls with their x variant 2023-06-06 08:23:53 -04:00
string Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
sunrpc Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
support support: Add delayed__exit (with two underscores) 2023-06-06 11:37:30 +02:00
sysdeps x86: Increase non_temporal_threshold to roughly sizeof_L3 / 4 2023-06-12 11:33:39 -05:00
sysvipc Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
termios Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
time Remove unused DATEMSK file for tst-getdate 2023-06-09 16:25:36 +02:00
timezone Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
wcsmbs Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
wctype Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
.clang-format Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
.gitattributes Assume __NR_openat is always defined 2016-03-23 23:35:08 +01:00
.gitignore Add *.pyc to .gitignore 2015-05-18 15:26:26 +05:30
abi-tags Remove the bulk of the NaCl port. 2017-05-20 08:09:10 -04:00
aclocal.m4 configure: Move nm, objdump, and readelf to LIBC_PROG_BINUTILS 2023-01-12 09:05:09 -03:00
config.h.in Stop checking if MiG supports retcode. 2023-05-11 01:28:34 +02:00
config.make.in Remove --enable-tunables configure option 2023-03-29 14:33:06 -03:00
configure Remove --enable-tunables configure option 2023-03-29 14:33:06 -03:00
configure.ac Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
CONTRIBUTED-BY Remove "Contributed by" lines 2021-09-03 22:06:44 +05:30
COPYING Update to latest versions of GPL-2.0 and LGPL-2.1 2013-09-09 12:52:48 +10:00
COPYING.LIB Update to latest versions of GPL-2.0 and LGPL-2.1 2013-09-09 12:52:48 +10:00
extra-lib.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
gen-locales.mk Improve gen-locales.mk and gen-locale.sh to make test files with @ options work 2018-02-27 17:01:57 +01:00
INSTALL Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
libc-abis riscv: support GNU indirect function 2021-01-10 21:25:13 -05:00
libof-iterator.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
LICENSES arc4random: simplify design for better safety 2022-07-27 08:58:27 -03:00
MAINTAINERS Add MAINTAINERS 2017-05-11 13:38:30 -04:00
Makeconfig Remove --enable-tunables configure option 2023-03-29 14:33:06 -03:00
Makefile Add lint-makefiles Makefile linting test. 2023-06-02 21:43:05 -04:00
Makefile.help Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Makefile.in New make target to only build benchmark binaries 2016-04-20 10:23:28 +05:30
Makerules Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
NEWS Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
o-iterator.mk
README hurd: Enable x86_64 build script 2023-05-02 21:40:36 +02:00
Rules libio: Do not autogenerate stdio_lim.h 2023-03-27 13:57:55 -03:00
SECURITY.md Add a SECURITY.md 2023-05-18 12:07:34 -04:00
SHARED-FILES Mention today's regex merge in SHARED-FILES 2021-09-21 18:00:10 -07:00
shlib-versions nss: Do not mention NSS test modules in <gnu/lib-names.h> 2022-03-11 08:24:04 +01:00
test-skeleton.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
version.h Open master branch for glibc 2.38 development 2023-01-31 22:39:21 -05:00

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu and x86_64-gnu.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arc*-*-linux-gnu
	arm-*-linux-gnueabi
	csky-*-linux-gnuabiv2
	hppa-*-linux-gnu
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	loongarch64-*-linux-gnu Hardware floating point, LE only.
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	or1k-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	riscv32-*-linux-gnu
	riscv64-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see https://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at https://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see https://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.