@node Tunables
@c @node Tunables, , Internal Probes, Top
@c %MENU% Tunable switches to alter libc internal behavior
@chapter Tunables
@cindex tunables

@dfn{Tunables} are a feature in @theglibc{} that allows application authors and
distribution maintainers to alter the runtime library behavior to match
their workload. These are implemented as a set of switches that may be
modified in different ways. The current default method to do this is via
the @env{GLIBC_TUNABLES} environment variable by setting it to a string
of colon-separated @var{name}=@var{value} pairs. For example, the following
example enables @code{malloc} checking and sets the @code{malloc}
trim threshold to 128 bytes:

@example
GLIBC_TUNABLES=glibc.malloc.trim_threshold=128:glibc.malloc.check=3
export GLIBC_TUNABLES
@end example

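
The variable can also be set for a single command instead of being exported
to the whole shell session. The following is a minimal sketch;
@file{./myprog} is a hypothetical program name used only for illustration:

@example
GLIBC_TUNABLES=glibc.malloc.trim_threshold=128 ./myprog
@end example
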
Tunables are not part of the @glibcadj{} stable ABI, and they are
subject to change or removal across releases. Additionally, the method to
modify tunable values may change between releases and across distributions.
It is possible to implement multiple `frontends' for the tunables allowing
distributions to choose their preferred method at build time.

Finally, the set of tunables available may vary between distributions as
the tunables feature allows distributions to add their own tunables under
their own namespace.

Passing @option{--list-tunables} to the dynamic loader prints all
tunables with their minimum and maximum values:

@example
$ /lib64/ld-linux-x86-64.so.2 --list-tunables
glibc.rtld.nns: 0x4 (min: 0x1, max: 0x10)
glibc.elision.skip_lock_after_retries: 3 (min: 0, max: 2147483647)
glibc.malloc.trim_threshold: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.perturb: 0 (min: 0, max: 255)
glibc.cpu.x86_shared_cache_size: 0x100000 (min: 0x0, max: 0xffffffffffffffff)
glibc.pthread.rseq: 1 (min: 0, max: 1)
glibc.cpu.prefer_map_32bit_exec: 0 (min: 0, max: 1)
glibc.mem.tagging: 0 (min: 0, max: 255)
glibc.elision.tries: 3 (min: 0, max: 2147483647)
glibc.elision.enable: 0 (min: 0, max: 1)
glibc.malloc.hugetlb: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_rep_movsb_threshold: 0x2000 (min: 0x100, max: 0xffffffffffffffff)
glibc.malloc.mxfast: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.rtld.dynamic_sort: 2 (min: 1, max: 2)
glibc.elision.skip_lock_busy: 3 (min: 0, max: 2147483647)
glibc.malloc.top_pad: 0x20000 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_rep_stosb_threshold: 0x800 (min: 0x1, max: 0xffffffffffffffff)
glibc.cpu.x86_non_temporal_threshold: 0xc0000 (min: 0x4040, max: 0xfffffffffffffff)
glibc.cpu.x86_shstk:
glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647)
glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647)
glibc.cpu.plt_rewrite: 0 (min: 0, max: 2)
glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_ibt:
glibc.cpu.hwcaps:
glibc.elision.skip_lock_internal_abort: 3 (min: 0, max: 2147483647)
glibc.malloc.arena_max: 0x0 (min: 0x1, max: 0xffffffffffffffff)
glibc.malloc.mmap_threshold: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_data_cache_size: 0x8000 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.tcache_count: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.arena_test: 0x0 (min: 0x1, max: 0xffffffffffffffff)
glibc.pthread.mutex_spin_count: 100 (min: 0, max: 32767)
glibc.rtld.optional_static_tls: 0x200 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.tcache_max: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.check: 0 (min: 0, max: 3)
@end example

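
Because the listing is plain text, it can be filtered with ordinary tools.
As a sketch, assuming the same loader path as in the listing above (the path
differs between architectures and distributions), the following shows only
the tunables in the @code{glibc.malloc} namespace:

@example
$ /lib64/ld-linux-x86-64.so.2 --list-tunables | grep '^glibc\.malloc\.'
@end example
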
@menu
* Tunable names::  The structure of a tunable name
* Memory Allocation Tunables::  Tunables in the memory allocation subsystem
* Dynamic Linking Tunables::  Tunables in the dynamic linking subsystem
* Elision Tunables::  Tunables in the elision subsystem
* POSIX Thread Tunables::  Tunables in the POSIX thread subsystem
* Hardware Capability Tunables::  Tunables that modify the hardware
                                  capabilities seen by @theglibc{}
* Memory Related Tunables::  Tunables that control the use of memory by
                             @theglibc{}.
* gmon Tunables::  Tunables that control the gmon profiler, used in
                   conjunction with gprof
@end menu

@node Tunable names
@section Tunable names
@cindex Tunable names
@cindex Tunable namespaces

A tunable name is split into three components: a top namespace, a tunable
namespace and the tunable name. The top namespace for tunables implemented in
@theglibc{} is @code{glibc}. Distributions that choose to add custom tunables
in their maintained versions of @theglibc{} may choose to do so under their own
top namespace.

The tunable namespace is a logical grouping of tunables in a single
module. This currently holds no special significance, although that may
change in the future.

The tunable name is the actual name of the tunable. It is possible that
different tunable namespaces may have tunables within them that have the
same name, likewise for top namespaces. Hence, we only support
identification of tunables by their full name, i.e. with the top
namespace, tunable namespace and tunable name, separated by periods.

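
As an illustration, the full name @code{glibc.malloc.trim_threshold} splits
at the periods into its three components. The sketch below uses the standard
@command{cut} utility purely to demonstrate the naming scheme; it is not
something @theglibc{} itself provides:

@example
echo glibc.malloc.trim_threshold | cut -d. -f1   # top namespace: glibc
echo glibc.malloc.trim_threshold | cut -d. -f2   # tunable namespace: malloc
echo glibc.malloc.trim_threshold | cut -d. -f3   # tunable name: trim_threshold
@end example
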
@node Memory Allocation Tunables
@section Memory Allocation Tunables
@cindex memory allocation tunables
@cindex malloc tunables
@cindex tunables, malloc

@deftp {Tunable namespace} glibc.malloc
Memory allocation behavior can be modified by setting any of the
following tunables in the @code{malloc} namespace:
@end deftp

@deftp Tunable glibc.malloc.check
This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
identical in features. This tunable has no effect by default and needs the
debug library @file{libc_malloc_debug} to be preloaded using the
@code{LD_PRELOAD} environment variable.

Setting this tunable to a non-zero value less than 4 enables a special (less
efficient) memory allocator for the @code{malloc} family of functions that is
designed to be tolerant against simple errors such as double calls of
@code{free} with the same argument, or overruns of a single byte (off-by-one
bugs). Not all such errors can be protected against, however, and memory
leaks can result. Any detected heap corruption results in immediate
termination of the process.

Like @env{MALLOC_CHECK_}, @code{glibc.malloc.check} has a problem in that it
diverges from normal program behavior by writing to @code{stderr}, which could
be exploited in SUID and SGID binaries. Therefore, @code{glibc.malloc.check}
is disabled by default for SUID and SGID binaries.
@end deftp

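
A minimal sketch of enabling the checks for a single run follows. The
library file name @file{libc_malloc_debug.so.0} is an assumption about how
the debug library is installed on the system, and @file{./myprog} is a
hypothetical program:

@example
LD_PRELOAD=libc_malloc_debug.so.0 GLIBC_TUNABLES=glibc.malloc.check=3 ./myprog
@end example
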
@deftp Tunable glibc.malloc.top_pad
This tunable supersedes the @env{MALLOC_TOP_PAD_} environment variable and is
identical in features.

This tunable determines the amount of extra memory in bytes to obtain from the
system when any of the arenas need to be extended. It also specifies the
number of bytes to retain when shrinking any of the arenas. This provides the
necessary hysteresis in heap size such that an excessive number of system calls
can be avoided.

The default value of this tunable is @samp{131072} (128 KB).
@end deftp

@deftp Tunable glibc.malloc.perturb
This tunable supersedes the @env{MALLOC_PERTURB_} environment variable and is
identical in features.

If set to a non-zero value, memory blocks are initialized with values depending
on some low order bits of this tunable when they are allocated (except when
allocated by @code{calloc}) and freed. This can be used to debug the use of
uninitialized or freed heap memory. Note that this option does not guarantee
that the freed block will have any specific values. It only guarantees that the
content the block had before it was freed will be overwritten.

The default value of this tunable is @samp{0}.
@end deftp

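
For example, the following sketch asks the allocator to perturb memory on
allocation and on free for a single run. The value 165 is an arbitrary
non-zero choice and @file{./myprog} is a hypothetical program:

@example
GLIBC_TUNABLES=glibc.malloc.perturb=165 ./myprog
@end example
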
@deftp Tunable glibc.malloc.mmap_threshold
This tunable supersedes the @env{MALLOC_MMAP_THRESHOLD_} environment variable
and is identical in features.

When this tunable is set, all chunks larger than this value in bytes are
allocated outside the normal heap, using the @code{mmap} system call. This way
it is guaranteed that the memory for these chunks can be returned to the system
on @code{free}. Note that requests smaller than this threshold might still be
allocated via @code{mmap}.

If this tunable is not set, the default value is set to @samp{131072} bytes and
the threshold is adjusted dynamically to suit the allocation patterns of the
program. If the tunable is set, the dynamic adjustment is disabled and the
value is set as static.
@end deftp

@deftp Tunable glibc.malloc.trim_threshold
This tunable supersedes the @env{MALLOC_TRIM_THRESHOLD_} environment variable
and is identical in features.

The value of this tunable is the minimum size (in bytes) of the top-most,
releasable chunk in an arena that will trigger a system call in order to return
memory to the system from that arena.

If this tunable is not set, the default value is set as 128 KB and the
threshold is adjusted dynamically to suit the allocation patterns of the
program. If the tunable is set, the dynamic adjustment is disabled and the
value is set as static.
@end deftp

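
Setting either threshold explicitly fixes it for the lifetime of the process
and turns off the dynamic adjustment. As a sketch, the following pins both
@code{glibc.malloc.mmap_threshold} and @code{glibc.malloc.trim_threshold} to
1048576 bytes; the value is an arbitrary illustration, not a recommendation:

@example
GLIBC_TUNABLES=glibc.malloc.mmap_threshold=1048576:glibc.malloc.trim_threshold=1048576
export GLIBC_TUNABLES
@end example
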
@deftp Tunable glibc.malloc.mmap_max
This tunable supersedes the @env{MALLOC_MMAP_MAX_} environment variable and is
identical in features.

The value of this tunable is the maximum number of chunks to allocate with
@code{mmap}. Setting this to zero disables all use of @code{mmap}.

The default value of this tunable is @samp{65536}.
@end deftp

@deftp Tunable glibc.malloc.arena_test
This tunable supersedes the @env{MALLOC_ARENA_TEST} environment variable and is
identical in features.

The @code{glibc.malloc.arena_test} tunable specifies the number of arenas that
can be created before the test on the limit to the number of arenas is
conducted. The value is ignored if @code{glibc.malloc.arena_max} is set.

The default value of this tunable is 2 for 32-bit systems and 8 for 64-bit
systems.
@end deftp

@deftp Tunable glibc.malloc.arena_max
This tunable supersedes the @env{MALLOC_ARENA_MAX} environment variable and is
identical in features.

This tunable sets the number of arenas to use in a process regardless of the
number of cores in the system.

The default value of this tunable is @code{0}, meaning that the limit on the
number of arenas is determined by the number of CPU cores online. For 32-bit
systems the limit is twice the number of cores online and on 64-bit systems, it
is 8 times the number of cores online.
@end deftp

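
As a sketch, the following limits a process to two arenas regardless of how
many cores are online. The value 2 is an arbitrary illustration:

@example
GLIBC_TUNABLES=glibc.malloc.arena_max=2
export GLIBC_TUNABLES
@end example
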
@deftp Tunable glibc.malloc.tcache_max
The maximum size of a request (in bytes) which may be met via the
per-thread cache. The default (and maximum) value is 1032 bytes on
64-bit systems and 516 bytes on 32-bit systems.
@end deftp

@deftp Tunable glibc.malloc.tcache_count
The maximum number of chunks of each size to cache. The default is 7.
The upper limit is 65535. If set to zero, the per-thread cache is effectively
disabled.

The approximate maximum overhead of the per-thread cache is thus equal
to the number of bins times the chunk count in each bin times the size
of each chunk. With defaults, the maximum overhead of the
per-thread cache is approximately 236 KB on 64-bit systems and 118 KB
on 32-bit systems.
@end deftp

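
The default overhead figures can be reproduced with shell arithmetic. The
sketch below assumes 64 bins whose chunk sizes are evenly spaced between 24
and 1032 bytes on 64-bit systems (12 and 516 bytes on 32-bit systems), so
that the total is the chunk count times the number of bins times the average
chunk size. The bin count and the smallest sizes are assumptions that are
not stated in this manual:

@example
echo $(( 7 * 64 * (24 + 1032) / 2 ))   # 64-bit: prints 236544
echo $(( 7 * 64 * (12 + 516) / 2 ))    # 32-bit: prints 118272
@end example
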
@deftp Tunable glibc.malloc.tcache_unsorted_limit
When the user requests memory and the request cannot be met via the
per-thread cache, the arenas are used to meet the request. At this
time, additional chunks will be moved from existing arena lists to
pre-fill the corresponding cache. While copies from the fastbins,
smallbins, and regular bins are bounded and predictable due to the bin
sizes, copies from the unsorted bin are not bounded, and incur
additional time penalties as they need to be sorted as they're
scanned. To make scanning the unsorted list more predictable and
bounded, the user may set this tunable to limit the number of chunks
that are scanned from the unsorted list while searching for chunks to
pre-fill the per-thread cache with. The default, or when set to zero,
is no limit.
@end deftp

@deftp Tunable glibc.malloc.mxfast
One of the optimizations @code{malloc} uses is to maintain a series of ``fast
bins'' that hold chunks up to a specific size. The default and
maximum size which may be held this way is 80 bytes on 32-bit systems
or 160 bytes on 64-bit systems. Applications which value size over
speed may choose to reduce the size of requests which are serviced
from fast bins with this tunable. Note that the value specified
includes @code{malloc}'s internal overhead, which is normally the size of one
pointer, so add 4 on 32-bit systems or 8 on 64-bit systems to the size
passed to @code{malloc} for the largest bin size to enable.
@end deftp

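
As a worked example of this rule, to keep requests of up to 64 bytes in the
fast bins on a 64-bit system, add the 8 bytes of overhead and set the tunable
to 72. The request size of 64 bytes is an arbitrary illustration:

@example
GLIBC_TUNABLES=glibc.malloc.mxfast=72
export GLIBC_TUNABLES
@end example
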
@deftp Tunable glibc.malloc.hugetlb
This tunable controls the usage of Huge Pages on @code{malloc} calls. The
default value is @code{0}, which disables any additional support on
@code{malloc}.

Setting its value to @code{1} enables the use of @code{madvise} with
@code{MADV_HUGEPAGE} after memory allocation with @code{mmap}. It is enabled
only if the system supports Transparent Huge Pages (currently only on Linux).

Setting its value to @code{2} enables the use of Huge Pages directly with
@code{mmap} with the use of the @code{MAP_HUGETLB} flag. The huge page size
to use will be the default one provided by the system. A value larger than
@code{2} specifies the huge page size, which will be matched against the
sizes supported by the system. If the provided value is invalid,
@code{MAP_HUGETLB} will not be used.
@end deftp

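
As a sketch, the first setting below requests Transparent Huge Pages through
@code{madvise}, while the second requests huge pages of the system default
size directly through @code{mmap}. @file{./myprog} is a hypothetical program:

@example
GLIBC_TUNABLES=glibc.malloc.hugetlb=1 ./myprog
GLIBC_TUNABLES=glibc.malloc.hugetlb=2 ./myprog
@end example
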
@node Dynamic Linking Tunables
@section Dynamic Linking Tunables
@cindex dynamic linking tunables
@cindex rtld tunables

@deftp {Tunable namespace} glibc.rtld
Dynamic linker behavior can be modified by setting the
following tunables in the @code{rtld} namespace:
@end deftp

@deftp Tunable glibc.rtld.nns
|
|
|
|
Sets the number of supported dynamic link namespaces (see @code{dlmopen}).
|
|
|
|
Currently this limit can be set between 1 and 16 inclusive, the default is 4.
|
|
|
|
Each link namespace consumes some memory in all thread, and thus raising the
|
|
|
|
limit will increase the amount of memory each thread uses. Raising the limit
|
2020-07-07 09:49:11 +00:00
|
|
|
is useful when your application uses more than 4 dynamic link namespaces as
|
|
|
|
created by @code{dlmopen} with an lmid argument of @code{LM_ID_NEWLM}.
|
|
|
|
Dynamic linker audit modules are loaded in their own dynamic link namespaces,
|
|
|
|
but they are not accounted for in @code{glibc.rtld.nns}. They implicitly
|
|
|
|
increase the per-thread memory usage as necessary, so this tunable does
|
|
|
|
not need to be changed to allow many audit modules e.g. via @env{LD_AUDIT}.
|
2020-06-09 08:57:28 +00:00
|
|
|
@end deftp
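
As a minimal sketch, assuming an application that creates up to eight
namespaces with @code{dlmopen}, the @code{glibc.rtld.nns} limit could be
raised before starting it. The value @samp{8} is illustrative only:

@example
# Illustrative value; any value from 1 to 16 is accepted.
GLIBC_TUNABLES=glibc.rtld.nns=8
export GLIBC_TUNABLES
@end example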

@deftp Tunable glibc.rtld.optional_static_tls
Sets the amount of surplus static TLS in bytes to allocate at program
startup. Every thread created allocates this amount of specified surplus
static TLS. This is a minimum value and additional space may be allocated
for internal purposes including alignment. Optional static TLS is used for
optimizing dynamic TLS access for platforms that support such optimizations,
e.g., TLS descriptors or optimized TLS access for POWER (@code{DT_PPC64_OPT}
and @code{DT_PPC_OPT}). In order to make the best use of such optimizations
the value should be as many bytes as would be required to hold all TLS
variables in all dynamically loaded shared libraries. The value cannot be
known by the dynamic loader because it doesn't know the expected set of shared
libraries which will be loaded. The existing static TLS space cannot be
changed once allocated at process startup. The default allocation of
optional static TLS is 512 bytes and is allocated in every thread.
@end deftp
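
As a sketch, assuming the shared libraries an application loads at run
time need about 4096 bytes of TLS in total (a hypothetical figure), the
surplus set by @code{glibc.rtld.optional_static_tls} could be raised
from its 512-byte default like this:

@example
# Hypothetical size in bytes; measure the TLS needs of the real libraries.
GLIBC_TUNABLES=glibc.rtld.optional_static_tls=4096
export GLIBC_TUNABLES
@end example

Because the surplus is allocated in every thread, a larger value raises
per-thread memory use.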

@deftp Tunable glibc.rtld.dynamic_sort
Sets the algorithm to use for DSO sorting; valid values are @samp{1} and
@samp{2}. For a value of @samp{1}, an older O(n^3) algorithm is used, which
is long tested, but may have performance issues when dependencies between
shared objects contain cycles. When set to the
value of @samp{2}, a different algorithm is used, which implements a
topological sort through depth-first search, and does not exhibit the
performance issues of @samp{1}.

The default value of this tunable is @samp{2}.
@end deftp
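
For instance, to compare behavior against the older algorithm, for
example while investigating a suspected ordering difference,
@code{glibc.rtld.dynamic_sort} could be set back to @samp{1}. This is a
sketch of the setting only:

@example
# Select the older O(n^3) sorting algorithm instead of the default 2.
GLIBC_TUNABLES=glibc.rtld.dynamic_sort=1
export GLIBC_TUNABLES
@end example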

@deftp Tunable glibc.rtld.enable_secure
Used to run a program as if it were a setuid process. The only valid value
is @samp{1} as this tunable can only be used to set and not unset
@code{enable_secure}. Setting this tunable to @samp{1} also disables all other
tunables. This tunable is intended to facilitate more extensive verification
tests for @code{AT_SECURE} programs and is not meant to be a security feature.

The default value of this tunable is @samp{0}.
@end deftp

@node Elision Tunables
@section Elision Tunables
@cindex elision tunables
@cindex tunables, elision

@deftp {Tunable namespace} glibc.elision
Contended locks are usually slow and can lead to performance and scalability
issues in multithreaded code. Lock elision uses memory transactions, under
certain conditions, to elide locks and improve performance.
Elision behavior can be modified by setting the following tunables in
the @code{elision} namespace:
@end deftp

@deftp Tunable glibc.elision.enable
The @code{glibc.elision.enable} tunable enables lock elision if the feature is
supported by the hardware. If elision is not supported by the hardware this
tunable has no effect.

Elision tunables are supported for 64-bit Intel, IBM POWER, and z System
architectures.
@end deftp

@deftp Tunable glibc.elision.skip_lock_busy
The @code{glibc.elision.skip_lock_busy} tunable sets how many times to use a
non-transactional lock after a transactional failure has occurred because the
lock is already acquired. Expressed in number of lock acquisition attempts.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_lock_internal_abort
The @code{glibc.elision.skip_lock_internal_abort} tunable sets how many times
the thread should avoid using elision if a transaction aborted for any reason
other than a different thread's memory accesses. Expressed in number of lock
acquisition attempts.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_lock_after_retries
The @code{glibc.elision.skip_lock_after_retries} tunable sets how many times
to try to elide a lock with transactions that failed only due to a different
thread's memory accesses, before falling back to a regular lock.
Expressed in number of lock elision attempts.

This tunable is supported only on IBM POWER and z System architectures.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.tries
The @code{glibc.elision.tries} tunable sets how many times to retry elision if
there is a chance for the transaction to finish execution, e.g., it wasn't
aborted due to the lock being already acquired. If elision is not supported
by the hardware this tunable is set to @samp{0} to avoid retries.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_trylock_internal_abort
The @code{glibc.elision.skip_trylock_internal_abort} tunable sets how many
times the thread should avoid trying the lock if a transaction aborted due to
reasons other than a different thread's memory accesses. Expressed in number
of try lock attempts.

The default value of this tunable is @samp{3}.
@end deftp
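
Several elision tunables can be combined in one colon-separated string.
The following sketch assumes that @samp{1} is the value that turns
@code{glibc.elision.enable} on, and uses an illustrative retry count of
@samp{5} in place of the default @samp{3}:

@example
# Assumption: 1 enables elision; 5 is an arbitrary retry count.
GLIBC_TUNABLES=glibc.elision.enable=1:glibc.elision.tries=5
export GLIBC_TUNABLES
@end example

On hardware without transactional memory support these settings have no
effect.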

@node POSIX Thread Tunables
@section POSIX Thread Tunables
@cindex pthread mutex tunables
@cindex thread mutex tunables
@cindex mutex tunables
@cindex tunables thread mutex

@deftp {Tunable namespace} glibc.pthread
The behavior of POSIX threads can be tuned to gain performance improvements
according to specific hardware capabilities and workload characteristics by
setting the following tunables in the @code{pthread} namespace:
@end deftp

@deftp Tunable glibc.pthread.mutex_spin_count
The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
a thread should spin on the lock before calling into the kernel to block.
Adaptive spin is used for mutexes initialized with the
@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension. It affects both
@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.

The thread spins until either the maximum spin count is reached or the lock
is acquired.

The default value of this tunable is @samp{100}.
@end deftp
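
As a sketch, a workload with very short critical sections on adaptive
mutexes might try a higher @code{glibc.pthread.mutex_spin_count}. The
value @samp{200} below is an arbitrary starting point for measurement,
not a recommendation:

@example
# Arbitrary example value; the default is 100.
GLIBC_TUNABLES=glibc.pthread.mutex_spin_count=200
export GLIBC_TUNABLES
@end example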

@deftp Tunable glibc.pthread.stack_cache_size
This tunable configures the maximum size of the stack cache. Once the
stack cache exceeds this size, unused thread stacks are returned to
the kernel, to bring the cache size below this limit.

The value is measured in bytes. The default is @samp{41943040}
(forty mebibytes).
@end deftp
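
Since the value of @code{glibc.pthread.stack_cache_size} is given in
bytes, shell arithmetic can make a setting easier to read. The default
of 41943040 is 40 * 1024 * 1024. The sketch below doubles it to eighty
mebibytes, an illustrative figure:

@example
# 80 MiB expressed in bytes; the size itself is only an example.
GLIBC_TUNABLES=glibc.pthread.stack_cache_size=$((80 * 1024 * 1024))
export GLIBC_TUNABLES
@end example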

@deftp Tunable glibc.pthread.rseq
The @code{glibc.pthread.rseq} tunable can be set to @samp{0} to disable
restartable sequences support in @theglibc{}. This enables applications
to perform direct restartable sequence registration with the kernel.
The default is @samp{1}, which means that @theglibc{} performs
registration on behalf of the application.

Restartable sequences are a Linux-specific extension.
@end deftp
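
For example, an application or library that is assumed to register
restartable sequences with the kernel itself could be run with the
registration in @theglibc{} turned off:

@example
# Let the application perform its own rseq registration.
GLIBC_TUNABLES=glibc.pthread.rseq=0
export GLIBC_TUNABLES
@end example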

@deftp Tunable glibc.pthread.stack_hugetlb
This tunable controls whether to use Huge Pages in the stacks created by
@code{pthread_create}. This tunable only affects the stacks created by
@theglibc{}; it has no effect on stacks assigned with
@code{pthread_attr_setstack}.

The default is @samp{1}, which uses the system default value. Setting
its value to @code{0} enables the use of @code{madvise} with
@code{MADV_NOHUGEPAGE} after stack creation with @code{mmap}.

This is a memory utilization optimization, since internal glibc setup of either
the thread descriptor or the guard page might force the kernel to move the
thread stack originally backed by Huge Pages to default pages.
@end deftp
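
A minimal sketch of turning off Huge Pages for the thread stacks that
@theglibc{} creates itself:

@example
# Advise the kernel with MADV_NOHUGEPAGE for stacks made by pthread_create.
GLIBC_TUNABLES=glibc.pthread.stack_hugetlb=0
export GLIBC_TUNABLES
@end example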

@node Hardware Capability Tunables
@section Hardware Capability Tunables
@cindex hardware capability tunables
@cindex hwcap tunables
@cindex tunables, hwcap
@cindex hwcaps tunables
@cindex tunables, hwcaps
@cindex data_cache_size tunables
@cindex tunables, data_cache_size
@cindex shared_cache_size tunables
@cindex tunables, shared_cache_size
@cindex non_temporal_threshold tunables
@cindex tunables, non_temporal_threshold

@deftp {Tunable namespace} glibc.cpu
Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
by setting the following tunables in the @code{cpu} namespace:
@end deftp

@deftp Tunable glibc.cpu.hwcap_mask
This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
identical in features.

The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
extensions available in the processor at runtime for some architectures. The
@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
capabilities at runtime, thus disabling use of those extensions.
@end deftp

@deftp Tunable glibc.cpu.hwcaps
The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
enable CPU/ARCH feature @code{yyy} and disable CPU/ARCH features @code{xxx}
and @code{zzz}, where the feature names are case-sensitive and have to match
the ones in @code{sysdeps/x86/include/cpu-features.h}.

On s390x, the supported HWCAP and STFLE features can be found in
@code{sysdeps/s390/cpu-features.c}. In addition the user can also set
a CPU arch-level like @code{z13} instead of single HWCAP and STFLE features.

On powerpc, the supported HWCAP and HWCAP2 features can be found in
@code{sysdeps/powerpc/dl-procinfo.c}.

On loongarch, the supported HWCAP features can be found in
@code{sysdeps/loongarch/cpu-tunables.c}.

This tunable is specific to i386, x86-64, s390x, powerpc and loongarch.
@end deftp
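
As a sketch for x86-64, the following disables two features through
@code{glibc.cpu.hwcaps}. The names @code{AVX2} and @code{ERMS} are
assumed to match the feature names of the installed version of
@theglibc{}; check the header named above, since the accepted names can
differ between versions:

@example
# Assumed feature names; a leading '-' disables the feature.
GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX2,-ERMS
export GLIBC_TUNABLES
@end example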

@deftp Tunable glibc.cpu.cached_memopt
The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
enable optimizations recommended for cacheable memory. If set to
@code{1}, @theglibc{} assumes that the process memory image consists
of cacheable (non-device) memory only. The default, @code{0},
indicates that the process may use device memory.

This tunable is specific to powerpc, powerpc64 and powerpc64le.
@end deftp

@deftp Tunable glibc.cpu.name
The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to
assume that the CPU is @code{xxx}, where @code{xxx} may have one of these
values: @code{generic}, @code{thunderxt88}, @code{thunderx2t99},
@code{thunderx2t99p1}, @code{ares}, @code{emag}, @code{kunpeng},
@code{a64fx}.

This tunable is specific to aarch64.
@end deftp
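
For example, on an aarch64 system @theglibc{} could be told to assume
one of the CPU names listed for @code{glibc.cpu.name}. The choice of
@code{thunderx2t99} here is arbitrary:

@example
# Any of the CPU names listed above may be used in place of thunderx2t99.
GLIBC_TUNABLES=glibc.cpu.name=thunderx2t99
export GLIBC_TUNABLES
@end example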

@deftp Tunable glibc.cpu.x86_data_cache_size
The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set
the data cache size in bytes for use in memory and string routines.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_shared_cache_size
The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to
set the shared cache size in bytes for use in memory and string routines.
@end deftp

@deftp Tunable glibc.cpu.x86_non_temporal_threshold
The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user
to set the threshold in bytes for non-temporal stores. Non-temporal stores
give a hint to the hardware to move data directly to memory without
displacing other data from the cache. This tunable is used by some
platforms to determine when to use non-temporal stores in operations
like @code{memmove} and @code{memcpy}.

This tunable is specific to i386 and x86-64.
@end deftp
|
2018-07-18 18:34:35 +00:00
|
|
|
|
2020-07-06 18:48:09 +00:00
|
|
|
@deftp Tunable glibc.cpu.x86_rep_movsb_threshold
|
|
|
|
The @code{glibc.cpu.x86_rep_movsb_threshold} tunable allows the user to
|
|
|
|
set threshold in bytes to start using "rep movsb". The value must be
|
|
|
|
greater than zero, and currently defaults to 2048 bytes.
|
|
|
|
|
|
|
|
This tunable is specific to i386 and x86-64.
|
|
|
|
@end deftp
|
|
|
|
|
|
|
|
@deftp Tunable glibc.cpu.x86_rep_stosb_threshold
|
|
|
|
The @code{glibc.cpu.x86_rep_stosb_threshold} tunable allows the user to
|
|
|
|
set threshold in bytes to start using "rep stosb". The value must be
|
|
|
|
greater than zero, and currently defaults to 2048 bytes.
|
|
|
|
|
|
|
|
This tunable is specific to i386 and x86-64.
|
|
|
|
@end deftp
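
As a minimal sketch of how the two thresholds above might be set together,
the following shell fragment builds a @env{GLIBC_TUNABLES} string.  The
value 4096 and the program name @file{./myprogram} are purely illustrative
assumptions, not recommendations; the only constraint taken from the text
is that each value must be greater than zero.

```shell
# Hypothetical thresholds in bytes; 4096 is an illustrative value only.
movsb_threshold=4096
stosb_threshold=4096

# Both tunables require a value greater than zero.
thresholds_ok=yes
for value in "$movsb_threshold" "$stosb_threshold"; do
  if [ "$value" -le 0 ]; then
    thresholds_ok=no
  fi
done

# Join the name=value pairs with a colon, as GLIBC_TUNABLES expects.
tunables="glibc.cpu.x86_rep_movsb_threshold=$movsb_threshold"
tunables="$tunables:glibc.cpu.x86_rep_stosb_threshold=$stosb_threshold"
echo "$tunables"

# A program would then be started like this (./myprogram is a placeholder):
#   GLIBC_TUNABLES="$tunables" ./myprogram
```

Whether a different threshold helps depends on the processor and the
workload, so any change should be measured rather than assumed.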
|
|
|
|
|
2018-08-02 18:19:19 +00:00
|
|
|
@deftp Tunable glibc.cpu.x86_ibt
|
|
|
|
The @code{glibc.cpu.x86_ibt} tunable allows the user to control how
|
2018-07-18 18:34:35 +00:00
|
|
|
indirect branch tracking (IBT) should be enabled. Accepted values are
|
|
|
|
@code{on}, @code{off}, and @code{permissive}. @code{on} always turns
|
|
|
|
on IBT regardless of whether IBT is enabled in the executable and its
|
|
|
|
dependent shared libraries. @code{off} always turns off IBT regardless
|
|
|
|
of whether IBT is enabled in the executable and its dependent shared
|
|
|
|
libraries. @code{permissive} is the same as the default which disables
|
|
|
|
IBT on non-CET executables and shared libraries.
|
|
|
|
|
|
|
|
This tunable is specific to i386 and x86-64.
|
|
|
|
@end deftp
|
|
|
|
|
2018-08-02 18:19:19 +00:00
|
|
|
@deftp Tunable glibc.cpu.x86_shstk
|
|
|
|
The @code{glibc.cpu.x86_shstk} tunable allows the user to control how
|
2018-07-18 18:34:35 +00:00
|
|
|
the shadow stack (SHSTK) should be enabled. Accepted values are
|
|
|
|
@code{on}, @code{off}, and @code{permissive}. @code{on} always turns on
|
|
|
|
SHSTK regardless of whether SHSTK is enabled in the executable and its
|
|
|
|
dependent shared libraries. @code{off} always turns off SHSTK regardless
|
|
|
|
of whether SHSTK is enabled in the executable and its dependent shared
|
|
|
|
libraries. @code{permissive} changes how dlopen works on non-CET shared
|
|
|
|
libraries. By default, when SHSTK is enabled, dlopening a non-CET shared
|
|
|
|
library returns an error. With @code{permissive}, it turns off SHSTK
|
|
|
|
instead.
|
|
|
|
|
|
|
|
This tunable is specific to i386 and x86-64.
|
|
|
|
@end deftp
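
The following shell fragment is a sketch of how the IBT and SHSTK tunables
might be combined.  It checks each value against the three accepted
keywords before building the @env{GLIBC_TUNABLES} string; the particular
choice of @code{on} and @code{permissive}, and the program name
@file{./myprogram}, are illustrative assumptions.

```shell
# Illustrative choices; each must be one of on, off, or permissive.
ibt=on
shstk=permissive

# Reject anything other than the three documented keywords.
values_ok=yes
for value in "$ibt" "$shstk"; do
  case "$value" in
    on|off|permissive) ;;
    *) values_ok=no ;;
  esac
done

tunables="glibc.cpu.x86_ibt=$ibt:glibc.cpu.x86_shstk=$shstk"
echo "$tunables"

# A program would then be started like this (./myprogram is a placeholder):
#   GLIBC_TUNABLES="$tunables" ./myprogram
```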
|
2020-12-21 15:03:03 +00:00
|
|
|
|
2023-01-26 16:26:18 +00:00
|
|
|
@deftp Tunable glibc.cpu.prefer_map_32bit_exec
|
2023-02-23 04:04:26 +00:00
|
|
|
When this tunable is set to @code{1}, shared libraries of non-setuid
|
2023-01-26 16:26:18 +00:00
|
|
|
programs will be loaded below 2GB with MAP_32BIT.
|
|
|
|
|
|
|
|
Note that the @env{LD_PREFER_MAP_32BIT_EXEC} environment variable is an alias of
|
|
|
|
this tunable.
|
|
|
|
|
|
|
|
This tunable is specific to 64-bit x86-64.
|
|
|
|
@end deftp
|
|
|
|
|
2024-01-05 04:19:39 +00:00
|
|
|
@deftp Tunable glibc.cpu.plt_rewrite
|
|
|
|
When this tunable is set to @code{1}, the dynamic linker will rewrite
|
|
|
|
the PLT section with a 32-bit direct jump. When it is set to @code{2},
|
|
|
|
the dynamic linker will rewrite the PLT section with a 32-bit direct
|
|
|
|
jump and, on APX processors, with a 64-bit absolute jump.
|
|
|
|
|
|
|
|
This tunable is specific to x86-64 and effective only when lazy
|
|
|
|
binding is disabled.
|
|
|
|
@end deftp
|
|
|
|
|
2020-12-21 15:03:03 +00:00
|
|
|
@node Memory Related Tunables
|
|
|
|
@section Memory Related Tunables
|
|
|
|
@cindex memory related tunables
|
|
|
|
|
|
|
|
@deftp {Tunable namespace} glibc.mem
|
|
|
|
This tunable namespace supports operations that affect the way @theglibc{}
|
|
|
|
and the process manage memory.
|
|
|
|
@end deftp
|
|
|
|
|
|
|
|
@deftp Tunable glibc.mem.tagging
|
|
|
|
If the hardware supports memory tagging, this tunable can be used to
|
|
|
|
control the way @theglibc{} uses this feature. At present this is only
|
2023-05-27 16:41:44 +00:00
|
|
|
supported on AArch64 systems with the MTE extension; it is ignored for
|
2020-12-21 15:03:03 +00:00
|
|
|
all other systems.
|
|
|
|
|
|
|
|
This tunable takes a value between 0 and 255 and acts as a bitmask
|
|
|
|
that enables various capabilities.
|
|
|
|
|
|
|
|
Bit 0 (the least significant bit) causes the @code{malloc}
|
|
|
|
subsystem to allocate
|
2020-12-21 15:03:03 +00:00
|
|
|
tagged memory, with each allocation being assigned a random tag.
|
|
|
|
|
|
|
|
Bit 1 enables precise faulting mode for tag violations on systems that
|
|
|
|
support deferred tag violation reporting. This may cause programs
|
|
|
|
to run more slowly.
|
|
|
|
|
2022-06-27 18:00:50 +00:00
|
|
|
Bit 2 enables either precise or deferred faulting mode for tag violations,
|
|
|
|
whichever is preferred by the system.
|
|
|
|
|
2020-12-21 15:03:03 +00:00
|
|
|
Other bits are currently reserved.
|
|
|
|
|
|
|
|
@Theglibc{} startup code will automatically enable memory tagging
|
|
|
|
support in the kernel if this tunable has any non-zero value.
|
|
|
|
|
|
|
|
The default value is @samp{0}, which disables all memory tagging.
|
|
|
|
@end deftp
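
Because @code{glibc.mem.tagging} is a bitmask, its value is the bitwise OR
of the bits to be enabled.  The sketch below computes the value that
enables tagged @code{malloc} allocations (bit 0) together with precise
faulting mode (bit 1).  The shell variable names and the program name
@file{./myprogram} are illustrative assumptions.

```shell
# Bit positions as described for glibc.mem.tagging.
tag_malloc=$((1 << 0))    # bit 0: malloc allocates tagged memory
tag_precise=$((1 << 1))   # bit 1: precise faulting mode

# Combine the selected bits into a single value.
tagging=$((tag_malloc | tag_precise))

tunables="glibc.mem.tagging=$tagging"
echo "$tunables"

# A program would then be started like this (./myprogram is a placeholder):
#   GLIBC_TUNABLES="$tunables" ./myprogram
```

On systems without memory tagging support the tunable is ignored, so the
same command line can be used unchanged across machines.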
|
|
|
|
|
|
|
|
@deftp Tunable glibc.mem.decorate_maps
|
|
|
|
If the kernel supports naming anonymous virtual memory areas (since
|
|
|
|
Linux version 5.17, although not always enabled by some kernel
|
|
|
|
configurations), this tunable can be used to control whether
|
|
|
|
@theglibc{} decorates the underlying memory obtained from the operating
|
|
|
|
system with a string describing its usage (for instance, on the thread
|
|
|
|
stack created by @code{pthread_create} or memory allocated by
|
|
|
|
@code{malloc}).
|
|
|
|
|
|
|
|
The process mappings can be obtained by reading @code{/proc/<pid>/maps}
|
|
|
|
(with @code{pid} being either the @dfn{process ID} or @code{self} for the
|
|
|
|
process's own mappings).
|
|
|
|
|
|
|
|
This tunable takes a value of 0 or 1, where 1 enables the feature.
|
|
|
|
The default value is @samp{0}, which disables the decoration.
|
|
|
|
@end deftp
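
One way to check whether the decoration takes effect is to start a
short-lived program with the tunable set and inspect its own mappings.
The sketch below does this with @code{cat}.  It assumes that decorated
mappings carry a label containing the text @samp{glibc:}, which is an
assumption about the label format rather than something stated above; on
kernels without support for naming anonymous mappings the count is simply
zero.

```shell
tunables="glibc.mem.decorate_maps=1"

# Let cat print its own mappings with decoration requested, then count the
# lines whose label mentions glibc (the label text is an assumption).
named=$(GLIBC_TUNABLES="$tunables" cat /proc/self/maps 2>/dev/null \
        | grep -c 'glibc:' || true)

echo "decorated mappings seen: $named"
```

A count of zero does not necessarily mean the tunable was rejected; the
kernel may lack the feature, or the program may not have created any
mapping that @theglibc{} decorates.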
|
|
|
|
|
|
|
|
@node gmon Tunables
|
|
|
|
@section gmon Tunables
|
|
|
|
@cindex gmon tunables
|
|
|
|
|
|
|
|
@deftp {Tunable namespace} glibc.gmon
|
|
|
|
This tunable namespace affects the behaviour of the gmon profiler.
|
|
|
|
gmon is a component of @theglibc{} which is normally used in
|
|
|
|
conjunction with gprof.
|
|
|
|
|
|
|
|
When GCC compiles a program with the @code{-pg} option, it instruments
|
|
|
|
the program with calls to the @code{mcount} function, to record the
|
|
|
|
program's call graph. At program startup, a memory buffer is allocated
|
|
|
|
to store this call graph; the size of the buffer is calculated using a
|
|
|
|
heuristic based on code size. If during execution, the buffer is found
|
|
|
|
to be too small, profiling will be aborted and no @file{gmon.out} file
|
|
|
|
will be produced. In that case, you will see the following message
|
|
|
|
printed to standard error:
|
|
|
|
|
|
|
|
@example
|
|
|
|
mcount: call graph buffer size limit exceeded, gmon.out will not be generated
|
|
|
|
@end example
|
|
|
|
|
|
|
|
Most of the symbols discussed in this section are defined in the header
|
|
|
|
@file{sys/gmon.h}. However, some symbols (for example @code{mcount})
|
|
|
|
are not defined in any header file, since they are only intended to be
|
|
|
|
called from code generated by the compiler.
|
|
|
|
@end deftp
|
|
|
|
|
|
|
|
@deftp Tunable glibc.gmon.minarcs
|
|
|
|
The heuristic for sizing the call graph buffer is known to be
|
|
|
|
insufficient for small programs; hence, the calculated value is clamped
|
|
|
|
to be at least a minimum size. The default minimum (in units of
|
|
|
|
call graph entries, @code{struct tostruct}), is given by the macro
|
|
|
|
@code{MINARCS}. If you have some program with an unusually complex
|
|
|
|
call graph, for which the heuristic fails to allocate enough space,
|
|
|
|
you can use this tunable to increase the minimum to a larger value.
|
|
|
|
@end deftp
|
|
|
|
|
|
|
|
@deftp Tunable glibc.gmon.maxarcs
|
|
|
|
To prevent excessive memory consumption when profiling very large
|
|
|
|
programs, the call graph buffer is allowed to have a maximum of
|
|
|
|
@code{MAXARCS} entries. For some very large programs, the default
|
|
|
|
value of @code{MAXARCS} defined in @file{sys/gmon.h} is too small; in
|
|
|
|
that case, you can use this tunable to increase it.
|
|
|
|
|
|
|
|
Note the value of the @code{maxarcs} tunable must be greater than or equal
|
|
|
|
to that of the @code{minarcs} tunable; if this constraint is violated,
|
|
|
|
a warning will be printed to standard error at program startup, and
|
|
|
|
the @code{minarcs} value will be used as the maximum as well.
|
|
|
|
|
|
|
|
Setting either tunable too high may result in a call graph buffer
|
|
|
|
whose size exceeds the available memory; in that case, an out of memory
|
|
|
|
error will be printed at program startup, the profiler will be
|
|
|
|
disabled, and no @file{gmon.out} file will be generated.
|
|
|
|
@end deftp
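
As a sketch of how the two gmon tunables might be set for a program that
reports the call graph buffer overflow shown earlier, the fragment below
builds a @env{GLIBC_TUNABLES} string and raises @code{maxarcs} where needed
so that it is never smaller than @code{minarcs}.  The numeric values and
the program name @file{./myprogram} are illustrative assumptions; suitable
values depend on the program being profiled.

```shell
# Hypothetical sizes, in call graph entries.
minarcs=100000
maxarcs=50000

# maxarcs must be greater than or equal to minarcs; raise it here rather
# than relying on the warning and fallback at program startup.
if [ "$maxarcs" -lt "$minarcs" ]; then
  maxarcs=$minarcs
fi

tunables="glibc.gmon.minarcs=$minarcs:glibc.gmon.maxarcs=$maxarcs"
echo "$tunables"

# A program built with -pg would then be started like this
# (./myprogram is a placeholder):
#   GLIBC_TUNABLES="$tunables" ./myprogram
```

Increasing the values gradually keeps the buffer from exceeding the
available memory, which would disable the profiler altogether.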
|