glibc/sysdeps/x86
Noah Goldstein af992e7abd x86: Increase non_temporal_threshold to roughly sizeof_L3 / 4
The current `non_temporal_threshold` is set to roughly `3/4 * sizeof_L3 /
ncores_per_socket`. This patch updates that value to roughly
`sizeof_L3 / 4`.

The original value (specifically, dividing by `ncores_per_socket`) was
chosen to limit the amount of other threads' data a `memcpy`/`memset`
could evict.

Dividing by `ncores_per_socket`, however, leads to exceedingly low
non-temporal thresholds, so non-temporal stores end up being used in
cases where REP MOVSB is multiple times faster.
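
To make the difference concrete, the two formulas quoted above can be written out as a minimal sketch. The function names and the example cache size and core count are illustrative assumptions, not glibc code; the real computation lives in dl-cacheinfo.h and includes details omitted here.

```c
#include <stddef.h>

/* Illustrative only: approximates the formulas from the commit message.
   These helper names are hypothetical and do not exist in glibc. */

/* Previous behaviour: roughly 3/4 * sizeof_L3 / ncores_per_socket. */
static size_t old_threshold(size_t l3_bytes, unsigned ncores_per_socket)
{
    return l3_bytes * 3 / 4 / ncores_per_socket;
}

/* New behaviour: roughly sizeof_L3 / 4, independent of core count. */
static size_t new_threshold(size_t l3_bytes)
{
    return l3_bytes / 4;
}
```

For an assumed 32 MiB L3 shared by 16 cores, `old_threshold` gives 1.5 MiB while `new_threshold` gives 8 MiB, which shows how quickly the old value shrinks as the core count grows.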

Furthermore, non-temporal stores are written directly to main memory,
so using them at a size much smaller than L3 can place soon-to-be
accessed data much further away than it otherwise would be. In addition,
modern machines are able to detect streaming patterns (especially if
REP MOVSB is used) and provide LRU hints to the memory subsystem. This
in effect caps the total amount of eviction at 1/cache_associativity,
far below meaningfully thrashing the entire cache.

As best I can tell, the benchmarks that led to this small threshold
were done comparing non-temporal stores against standard cacheable
stores. A better comparison (linked below) is against REP MOVSB which,
on the measured systems, is nearly 2x faster than non-temporal stores
at the low end of the previous threshold, and within 10% for copies
over 100MB (well past even the current threshold). In cases with a
low number of threads competing for bandwidth, REP MOVSB is ~2x faster
up to `sizeof_L3`.

The divisor of `4` is a somewhat arbitrary value. From benchmarks it
seems Skylake and Icelake both prefer a divisor of `2`, but older CPUs
such as Broadwell prefer something closer to `8`. This patch is meant
to be followed up by another one to make the divisor cpu-specific, but
in the meantime (and for easier backporting), this patch settles on
`4` as a middle-ground.
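
As a rough sketch of what a CPU-specific divisor could look like, the values mentioned above can be put into a lookup. Everything here is hypothetical: the `enum uarch` tags and both helper functions are invented for illustration and are not taken from glibc or from the follow-up patch.

```c
#include <stddef.h>

/* Hypothetical microarchitecture tags; not glibc identifiers. */
enum uarch { UARCH_UNKNOWN, UARCH_BROADWELL, UARCH_SKYLAKE, UARCH_ICELAKE };

/* Divisors as suggested by the benchmarks described above:
   2 for Skylake/Icelake, about 8 for Broadwell, 4 as the fallback. */
static unsigned nt_divisor(enum uarch u)
{
    switch (u) {
    case UARCH_SKYLAKE:
    case UARCH_ICELAKE:
        return 2;
    case UARCH_BROADWELL:
        return 8;
    default:
        return 4;
    }
}

/* Threshold is the L3 size divided by the per-CPU divisor. */
static size_t nt_threshold(size_t l3_bytes, enum uarch u)
{
    return l3_bytes / nt_divisor(u);
}
```

With an unknown CPU the sketch falls back to the divisor of `4` that this patch uses everywhere.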

Benchmarks comparing non-temporal stores, REP MOVSB, and cacheable
stores were done using:
https://github.com/goldsteinn/memcpy-nt-benchmarks

Spreadsheet results (also available as a PDF in the GitHub repository):
https://docs.google.com/spreadsheets/d/e/2PACX-1vS183r0rW_jRX6tG_E90m9qVuFiMbRIJvi5VAE8yYOvEOIEEc3aSNuEsrFbuXw5c3nGboxMmrupZD7K/pubhtml
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-06-12 11:33:39 -05:00
..
bits <sys/platform/x86.h>: Add PREFETCHI support 2023-04-05 14:46:10 -07:00
fpu Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
htl Fix a few more typos I missed in previous round -- BZ 25337 2023-06-02 23:46:32 +00:00
include <sys/platform/x86.h>: Add PREFETCHI support 2023-04-05 14:46:10 -07:00
nptl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
sys/platform Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
__longjmp_cancel.S Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
abi-note.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
atomic-machine.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
cacheinfo.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
cacheinfo.h Remove --enable-tunables configure option 2023-03-29 14:33:06 -03:00
cet-control.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
check-cet.awk Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
configure x86_64: State assembler is being tested on sysdeps/x86/configure 2022-12-06 13:47:47 -03:00
configure.ac x86_64: State assembler is being tested on sysdeps/x86/configure 2022-12-06 13:47:47 -03:00
cpu-features-offsets.sym x86: Cleanup cpu-features-offsets.sym 2018-08-03 06:42:09 -07:00
cpu-features.c Fix misspellings in sysdeps/ -- BZ 25337 2023-05-30 23:02:29 +00:00
cpu-tunables.c Remove --enable-tunables configure option 2023-03-29 14:33:06 -03:00
dl-cacheinfo.h x86: Increase non_temporal_threshold to roughly sizeof_L3 / 4 2023-06-12 11:33:39 -05:00
dl-cet.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-diagnostics-cpu.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-get-cpu-features.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-hwcap2.h x86: Set FSGSBASE to active if enabled by kernel 2023-04-03 11:36:48 -07:00
dl-hwcap.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-isa-level.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-lookupcfg.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-minsigstacksize.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-new-hash.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-procinfo.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-procruntime.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dl-prop.h Fix misspellings in sysdeps/ -- BZ 25337 2023-05-30 23:02:29 +00:00
dl-tunables.list Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
elf-initfini.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
elide.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
float128-abi.h Move __isnanf128 to libc.so 2021-03-30 14:58:19 +05:30
fpu_control.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
get-cpuid-feature-leaf.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
get-isa-level.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
hp-timing.h Fix misspellings in sysdeps/ -- BZ 25337 2023-05-30 23:02:29 +00:00
init-arch.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
isa-ifunc-macros.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
isa-level.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
isa-level.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
jmp_buf-ssp.sym x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
ldbl2mpn.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
ldsodefs.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
libc-start.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
libc-start.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
link_map.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
linkmap.h Rename bits/linkmap.h to linkmap.h (bug 14912). 2015-09-04 19:44:27 +00:00
longjmp.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Makeconfig Add _Float64x function aliases. 2017-11-27 14:16:47 +00:00
Makefile Remove --enable-tunables configure option 2023-03-29 14:33:06 -03:00
sysdep.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tininess.h Use sysdeps/x86/tininess.h for i386 and x86_64 2012-10-30 20:38:31 -07:00
tst-cet-legacy-1.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-1a.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-2.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-2a.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-3.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-4.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-4a.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-4b.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-4c.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-5.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-5a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-5b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-6.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-6a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-6b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-7.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-8.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-9-static.c x86: Properly set usable CET feature bits [BZ #26625] 2021-01-29 03:58:11 -08:00
tst-cet-legacy-9.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-10-static.c x86: Properly set usable CET feature bits [BZ #26625] 2021-01-29 03:58:11 -08:00
tst-cet-legacy-10.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-1.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-2.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-4.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-mod-5.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-5a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-5b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-5c.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-6.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-6a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-6b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-6c.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-cet-legacy-mod-6d.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cpu-features-cpuinfo-static.c x86: Add PTWRITE feature detection [BZ #27346] 2021-02-07 08:01:14 -08:00
tst-cpu-features-cpuinfo.c x86: Set FSGSBASE to active if enabled by kernel 2023-04-03 11:36:48 -07:00
tst-cpu-features-supports-static.c x86: Add PTWRITE feature detection [BZ #27346] 2021-02-07 08:01:14 -08:00
tst-cpu-features-supports.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-get-cpu-features-static.c Add _dl_x86_cpu_features to rtld_global 2015-08-13 03:41:22 -07:00
tst-get-cpu-features.c <sys/platform/x86.h>: Add PREFETCHI support 2023-04-05 14:46:10 -07:00
tst-ifunc-isa-1-static.c x86: Check ifunc resolver with CPU_FEATURE_USABLE [BZ #27072] 2021-01-21 10:22:26 -08:00
tst-ifunc-isa-1.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-ifunc-isa-2-static.c x86: Check ifunc resolver with CPU_FEATURE_USABLE [BZ #27072] 2021-01-21 10:22:26 -08:00
tst-ifunc-isa-2.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-ifunc-isa.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-isa-level-1.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-isa-level-mod-1-baseline.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1-v2.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1-v3.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1-v4.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-ldbl-nonnormal-printf.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-memchr-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-memcmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-memmove-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-memrchr-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-memset-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-setjmp-cet.c x86: Set header.feature_1 in TCB for always-on CET [BZ #27177] 2021-01-13 05:03:34 -08:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strcasecmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strchr-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strcmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strcpy-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-string-rtm.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strlen-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strncasecmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strncmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-strrchr-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-sysconf-cache-linesize-static.c x86: Handle _SC_LEVEL1_ICACHE_LINESIZE [BZ #27444] 2021-03-15 05:43:26 -07:00
tst-sysconf-cache-linesize.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-wcscmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tst-wcsncmp-rtm.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Versions <sys/platform/x86.h>: Remove the C preprocessor magic 2021-01-21 05:58:17 -08:00