glibc/manual
Adhemerval Zanella bf033c0072 elf: Add glibc.mem.decorate_maps tunable
The PR_SET_VMA_ANON_NAME support is only enabled through a configurable
kernel switch, mainly because assigning a name to an anonymous
virtual memory area might prevent that area from being
merged with adjacent virtual memory areas.

For instance, with the following code:

   void *p1 = mmap (NULL,
                    1024 * 4096,
                    PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS,
                    -1,
                    0);

   void *p2 = mmap (p1 + (1024 * 4096),
                    1024 * 4096,
                    PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS,
                    -1,
                    0);

The kernel will potentially merge both mappings, resulting in only one
segment of size 0x800000.  If the segments are named with
PR_SET_VMA_ANON_NAME using different names, the result is two mappings.
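
For reference, naming a region looks roughly like the sketch below.  It
is a minimal illustration, not the code glibc uses: the helper names
map_anon and name_region are hypothetical, the fallback constant
definitions are only there for kernel headers older than Linux 5.17, and
the prctl call fails on kernels built without CONFIG_ANON_VMA_NAME.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/* Fallbacks for kernel headers that predate Linux 5.17.  */
#ifndef PR_SET_VMA
# define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
# define PR_SET_VMA_ANON_NAME 0
#endif

/* Hypothetical helper: map LEN bytes of private anonymous memory,
   optionally at HINT.  Returns MAP_FAILED on error.  */
static void *
map_anon (void *hint, size_t len)
{
  return mmap (hint, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Hypothetical helper: attach NAME to [ADDR, ADDR + LEN).  Returns 0 on
   success and -1 if the kernel rejects the request.  */
static int
name_region (void *addr, size_t len, const char *name)
{
  assert (addr != NULL && name != NULL);
  return prctl (PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                (unsigned long) addr, (unsigned long) len,
                (unsigned long) name);
}
```

On a kernel that supports naming, calling name_region with two different
names on the adjacent mappings above would be expected to leave two
separate "[anon:...]" entries in /proc/self/maps.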

Although this is unlikely to be an issue for pthread stacks and malloc
arenas (since for pthread stacks the guard page will result in
a PROT_NONE segment, similar to the alignment requirement for the arena
block), it might still prevent merging of the mmap memory allocated
by malloc.

There is also another potential scalability issue: the prctl call
needs to take the global mmap lock, which is still not fully fixed in
Linux [1] (for pthread stacks and arenas, it is mitigated by the stack
cache and the arena reuse).

So this patch disables anonymous mapping annotations by default and
adds a new tunable, glibc.mem.decorate_maps, that can be used to enable
them.

[1] https://lwn.net/Articles/906852/

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07 10:27:57 -03:00
examples crypt: Remove libcrypt support 2023-10-30 13:03:59 -03:00
argp.texi stdlib: Remove use of mergesort on qsort (BZ 21719) 2023-10-31 14:18:05 -03:00
arith.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
charset.texi wcrtomb: Make behavior POSIX compliant 2022-05-13 19:15:46 +05:30
check-safety.sh Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
conf.texi Call "CST" a time zone abbreviation, not a name 2023-06-22 13:49:09 -07:00
contrib.texi crypt: Remove libcrypt support 2023-10-30 13:03:59 -03:00
creature.texi manual: fix texinfo typo 2023-04-08 13:51:26 -07:00
crypt.texi crypt: Remove libcrypt support 2023-10-30 13:03:59 -03:00
ctype.texi manual: Enhance documentation of the <ctype.h> functions 2023-07-03 12:36:56 +02:00
debug.texi Add manual documentation for threads.h 2018-07-24 14:07:31 -03:00
dir .. 2005-11-21 15:45:19 +00:00
dynlink.texi manual: Fix ld.so diagnostics menu/section structure 2023-09-06 18:37:21 +02:00
errno.texi manual: Update documentation of strerror and related functions 2023-07-03 12:36:56 +02:00
fdl-1.3.texi Sync FDL from https://www.gnu.org/licenses/fdl-1.3.texi 2021-01-02 12:46:25 -08:00
filesys.texi Improve documentation for malloc etc. (BZ#27719) 2021-04-13 12:17:56 -07:00
freemanuals.texi Prefer https to http for gnu.org and fsf.org URLs 2019-09-07 02:43:31 -07:00
getopt.texi manual: Clarify that abbreviations of long options are allowed 2022-05-04 15:56:47 +05:30
header.texi manual: Replace summary.awk with summary.pl. 2017-06-15 21:26:20 -07:00
install-plain.texi BZ #15941: Fix INSTALL file regeneration failure with makeinfo 5.x 2013-12-05 09:58:20 +05:30
install.texi crypt: Remove manul entry for --enable-crypt 2023-10-31 10:59:04 -03:00
intro.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
io.texi Clean up glibc manual references to "GNU system" (bug 6911). 2012-03-08 01:27:38 +00:00
ipc.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
job.texi manual/jobs.texi: Add missing @item EPERM for getpgid 2023-08-25 11:43:30 +02:00
lang.texi manual: Drop obsolete @refill 2022-01-12 14:28:44 +05:30
lgpl-2.1.texi Use canonical FSF .texi files for LGPL and FDL texts. 2011-06-06 16:16:55 -07:00
libc-texinfo.sh grep: egrep -> grep -E, fgrep -> grep -F 2022-06-05 12:09:02 -07:00
libc.texinfo Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
libcbook.texi
llio.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
locale.texi stdlib: Remove use of mergesort on qsort (BZ 21719) 2023-10-31 14:18:05 -03:00
macros.texi manual: Replace summary.awk with summary.pl. 2017-06-15 21:26:20 -07:00
maint.texi manual: Manual update for strlcat, strlcpy, wcslcat, wclscpy 2023-06-14 18:10:27 +02:00
Makefile Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
math.texi arc4random: simplify design for better safety 2022-07-27 08:58:27 -03:00
memory.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
message.texi manual: Use @code{errno} instead of @var{errno} [BZ #24063] 2019-01-07 11:42:04 +01:00
nss.texi nss: Use "files dns" as the default for the hosts database (bug 28700) 2021-12-17 12:01:25 +01:00
nsswitch.texi Remove --enable-obsolete-nsl configure flag 2020-07-08 17:25:57 +02:00
pattern.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
pipe.texi manual: Replace summary.awk with summary.pl. 2017-06-15 21:26:20 -07:00
platform.texi x86: Add support for AVX10 preset and vec size in cpu-features 2023-09-29 14:18:42 -05:00
probes.texi elf: Add _dl_find_object function 2021-12-28 22:52:56 +01:00
process.texi linux: Add pidfd_getpid 2023-09-05 13:08:59 -03:00
README.pretty-printers Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
README.tunables tunables: Simplify TUNABLE_SET interface 2021-02-10 19:08:33 +05:30
resource.texi Move vtimes to a compatibility symbol 2020-10-19 16:44:20 -03:00
search.texi stdlib: Remove use of mergesort on qsort (BZ 21719) 2023-10-31 14:18:05 -03:00
setjmp.texi manual: Drop obsolete @refill 2022-01-12 14:28:44 +05:30
signal.texi manual: SA_ONSTACK is ignored without alternate stack 2022-02-28 11:50:41 +01:00
socket.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
startup.texi Argument Syntax: Use "option", @option, and @command. 2020-10-30 13:08:38 -04:00
stdio-fp.c
stdio.texi C2x scanf %wN, %wfN support 2023-09-28 17:28:15 +00:00
string.texi manual: Manual update for strlcat, strlcpy, wcslcat, wclscpy 2023-06-14 18:10:27 +02:00
summary.pl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
sysinfo.texi manual: Correct argument order in mount examples [BZ #27207] 2021-01-22 14:22:41 -05:00
syslog.texi manual: Replace summary.awk with summary.pl. 2017-06-15 21:26:20 -07:00
terminal.texi manual: document posix_openpt (bug 17010) 2023-04-26 12:29:39 +00:00
texinfo.tex Update miscellaneous files from upstream sources. 2019-01-01 00:52:59 +00:00
texis.awk Correct close statement. 2001-05-18 13:01:32 +00:00
threads.texi Fix misspellings in manual/ -- BZ 25337 2023-05-27 16:41:44 +00:00
time.texi Call "CST" a time zone abbreviation, not a name 2023-06-22 13:49:09 -07:00
tsort.awk Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
tunables.texi elf: Add glibc.mem.decorate_maps tunable 2023-11-07 10:27:57 -03:00
users.texi crypt: Remove libcrypt support 2023-10-30 13:03:59 -03:00
xtract-typefun.awk Make shebang interpreter directives consistent 2016-01-07 04:03:21 -05:00

			TUNABLE FRAMEWORK
			=================

Tunables is a feature in the GNU C Library that allows application authors and
distribution maintainers to alter the runtime library behaviour to match their
workload.

The tunable framework allows modules within glibc to register variables that
may be tweaked through an environment variable.  It aims to enforce a strict
namespace rule to bring consistency to naming of these tunable environment
variables across the project.  This document is a guide for glibc developers to
add tunables to the framework.
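
To make the namespace rule concrete: tunables are addressed by a dotted
name such as glibc.malloc.mmap_max, and the GLIBC_TUNABLES environment
variable carries colon-separated name=value pairs.  The sketch below
splits one such pair into its components.  It is not glibc's own parser;
split_tunable and struct tunable_ref are hypothetical names, and the
31-character limit per component is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parts of one tunable assignment.  */
struct tunable_ref
{
  char top[32];    /* top namespace, e.g. "glibc" */
  char ns[32];     /* tunable namespace, e.g. "malloc" */
  char name[32];   /* tunable name, e.g. "mmap_max" */
  char value[32];  /* textual value */
};

/* Split a single "top.namespace.name=value" entry into OUT.  Returns 1
   on success and 0 if the entry does not have exactly that shape.  */
static int
split_tunable (const char *entry, struct tunable_ref *out)
{
  assert (entry != NULL && out != NULL);
  char tail;
  int n = sscanf (entry, "%31[^.=:].%31[^.=:].%31[^.=:]=%31[^:]%c",
                  out->top, out->ns, out->name, out->value, &tail);
  /* Exactly four fields must match; a fifth means trailing input.  */
  return n == 4;
}
```

A caller handling a full GLIBC_TUNABLES string would first split it on
':' and then pass each piece to a helper like this one.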

ADDING A NEW TUNABLE
--------------------

The TOP_NAMESPACE macro is defined by default as 'glibc'.  If distributions
intend to add their own tunables, they should do so in a different top
namespace by overriding the TOP_NAMESPACE macro for that tunable.  Downstream
implementations are discouraged from using the 'glibc' top namespace for
tunables they don't already have consensus to push upstream.

There are three steps to adding a tunable:

1. Add a tunable to the list and fully specify its properties:

For each tunable you want to add, make an entry in elf/dl-tunables.list.  The
format of the file is as follows:

TOP_NAMESPACE {
  NAMESPACE1 {
    TUNABLE1 {
      # tunable attributes, one per line
    }
    # A tunable with default attributes, i.e. string variable.
    TUNABLE2
    TUNABLE3 {
      # its attributes
    }
  }
  NAMESPACE2 {
    ...
  }
}

The allowed attributes are:

- type:			Data type.  Defaults to STRING.  Allowed types are:
			INT_32, UINT_64, SIZE_T and STRING.  Numeric values may
			also be given in octal or hexadecimal.

- minval:		Optional minimum acceptable value.  For a string type
			this is the minimum length of the value.

- maxval:		Optional maximum acceptable value.  For a string type
			this is the maximum length of the value.

- default:		Specify an optional default value for the tunable.

- env_alias:		An environment variable that acts as an alias for the
			tunable.

- security_level:	Specify security level of the tunable for AT_SECURE
			binaries.  Valid values are:

			SXID_ERASE: (default) Do not read and do not pass on to
			child processes.
			SXID_IGNORE: Do not read, but retain for non-AT_SECURE
			child processes.
			NONE: Read all the time.
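
The interaction of a numeric type with minval and maxval can be sketched
as follows.  This is an illustration only: parse_bounded is a
hypothetical helper, and it uses strtoull with base 0 as a stand-in for
the parser that glibc uses internally.  What glibc does with a rejected
value is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse TEXT as an unsigned number in decimal,
   octal (leading 0) or hexadecimal (leading 0x) and store it in *OUT
   if it lies within [MINVAL, MAXVAL].  Returns 1 if the value was
   accepted and 0 otherwise.  */
static int
parse_bounded (const char *text, uint64_t minval, uint64_t maxval,
               uint64_t *out)
{
  assert (text != NULL && out != NULL);

  /* Reject signs and leading whitespace, which strtoull would accept.  */
  if (*text < '0' || *text > '9')
    return 0;

  char *end;
  errno = 0;
  unsigned long long v = strtoull (text, &end, 0);
  if (errno != 0 || end == text || *end != '\0')
    return 0;

  if (v < minval || v > maxval)
    return 0;

  *out = v;
  return 1;
}
```

With bounds of 0 and 100, the strings "16", "0x10" and "020" would all
be accepted with the value 16, while "200" would be rejected.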

2. Use TUNABLE_GET/TUNABLE_SET/TUNABLE_SET_WITH_BOUNDS to get and set tunables.

3. OPTIONAL: If tunables in a namespace are being used multiple times within a
   specific module, set the TUNABLE_NAMESPACE macro to reduce the amount of
   typing.

GETTING AND SETTING TUNABLES
----------------------------

When the TUNABLE_NAMESPACE macro is defined, one may get tunables in that
module using the TUNABLE_GET macro as follows:

  val = TUNABLE_GET (check, int32_t, TUNABLE_CALLBACK (check_callback))

where 'check' is the tunable name, 'int32_t' is the C type of the tunable and
'check_callback' is the function to call if the tunable got initialized to a
non-default value.  The macro returns the value as type 'int32_t'.

The callback function should be defined as follows:

  void
  TUNABLE_CALLBACK (check_callback) (int32_t *valp)
  {
  ...
  }

where it can expect the tunable value to be passed in VALP.
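
Because TUNABLE_GET is only available inside glibc, the following
standalone mock may help to show the contract described above: the value
is always returned, and the callback runs only when the tunable was
initialized to a non-default value.  Everything prefixed with demo_ is
hypothetical and is not part of glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the tunable list.  */
struct demo_tunable
{
  int32_t value;     /* current value */
  int initialized;   /* nonzero if set to a non-default value */
};

typedef void (*demo_callback_t) (int32_t *valp);

/* Mock of the documented behaviour: return the value, and invoke CB
   with a pointer to it only if the tunable was explicitly set.  */
static int32_t
demo_tunable_get (struct demo_tunable *t, demo_callback_t cb)
{
  assert (t != NULL);
  if (t->initialized && cb != NULL)
    cb (&t->value);
  return t->value;
}

/* Records the value most recently passed to the callback.  */
static int32_t seen_value = -1;

static void
check_callback (int32_t *valp)
{
  seen_value = *valp;
}
```

Calling demo_tunable_get on an entry with initialized set to 0 returns
its default value and leaves seen_value untouched.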

Tunables in the module can be updated using:

  TUNABLE_SET (check, val)

where 'check' is the tunable name and 'val' is a value of the same type.

To get and set tunables in a different namespace from that module, use the full
form of the macros as follows:

  val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL)

  TUNABLE_SET_FULL (glibc, cpu, hwcap_mask, val)

where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the
remaining arguments are the same as the short form macros.

The minimum and maximum values can be updated together with the tunable
value using:

  TUNABLE_SET_WITH_BOUNDS (check, val, min, max)

where 'check' is the tunable name, 'val' is a value of the same type, and
'min' and 'max' are the minimum and maximum values of the tunable.

To set the minimum and maximum values of tunables in a different namespace
from that module, use the full form of the macros as follows:

  val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL)

  TUNABLE_SET_WITH_BOUNDS_FULL (glibc, cpu, hwcap_mask, val, min, max)

where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the
remaining arguments are the same as the short form macros.

When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to
TUNABLE_GET_FULL, so you will need to provide full namespace information for
both macros.  Likewise for TUNABLE_SET, TUNABLE_SET_FULL,
TUNABLE_SET_WITH_BOUNDS and TUNABLE_SET_WITH_BOUNDS_FULL.
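
The relationship between the short and full forms can be illustrated
with a small preprocessor sketch.  The DEMO_ macros below are
hypothetical and only build the dotted tunable name as a string; the
real macros are internal to glibc and may be implemented differently.

```c
#include <assert.h>
#include <string.h>

#define DEMO_STR(x) #x

/* Full form: every component of the name is spelled out.  */
#define DEMO_NAME_FULL(top, ns, name) \
  DEMO_STR (top) "." DEMO_STR (ns) "." DEMO_STR (name)

/* Short form: the top namespace and the tunable namespace come from
   macros that a module defines once.  */
#define DEMO_TOP_NAMESPACE glibc
#define DEMO_NAMESPACE cpu
#define DEMO_NAME(name) \
  DEMO_NAME_FULL (DEMO_TOP_NAMESPACE, DEMO_NAMESPACE, name)

/* Returns nonzero if A and B name the same tunable.  */
static int
same_tunable (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return strcmp (a, b) == 0;
}
```

Here DEMO_NAME (hwcap_mask) and DEMO_NAME_FULL (glibc, cpu, hwcap_mask)
expand to the same string, which mirrors how the short-form macros save
typing once the namespace macro is defined.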

** IMPORTANT NOTE **

The tunable list is set as read-only after the dynamic linker relocates itself,
so setting tunable values must be limited to tunables within the dynamic
linker, and only before relocation.

FUTURE WORK
-----------

The framework currently only allows a one-time initialization of variables
through environment variables and, in some cases, modification of variables via
an API call.  Future goals for this project include:

- Setting system-wide and user-wide defaults for tunables through some
  mechanism like a configuration file.

- Allowing some tunables to be tweaked at runtime.