The PR_SET_VMA_ANON_NAME support is only enabled through a configurable
kernel switch, mainly because assigning a name to a
anonymous virtual memory area might prevent that area from being
merged with adjacent virtual memory areas.
For instance, with the following code:
void *p1 = mmap (NULL,
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
void *p2 = mmap (p1 + (1024 * 4096),
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
The kernel will potentially merge both mappings resulting in only one
segment of size 0x800000. If the segment is names with
PR_SET_VMA_ANON_NAME with different names, it results in two mappings.
Although this will unlikely be an issue for pthread stacks and malloc
arenas (since for pthread stacks the guard page will result in
a PROT_NONE segment, similar to the alignment requirement for the arena
block), it still might prevent the mmap memory allocated for detail
malloc.
There is also another potential scalability issue, where the prctl
requires
to take the mmap global lock which is still not fully fixed in Linux
[1] (for pthread stacks and arenas, it is mitigated by the stack
cached and the arena reuse).
So this patch disables anonymous mapping annotations as default and
add a new tunable, glibc.mem.decorate_maps, can be used to enable
it.
[1] https://lwn.net/Articles/906852/
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add anonymous mmap annotations on loader malloc, malloc when it
allocates memory with mmap, and on malloc arena. The /proc/self/maps
will now print:
[anon: glibc: malloc arena]
[anon: glibc: malloc]
[anon: glibc: loader malloc]
On arena allocation, glibc annotates only the read/write mapping.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 4.5 removed thread stack annotations due to the complexity of
computing them [1], and Linux added PR_SET_VMA_ANON_NAME on 5.17
as a way to name anonymous virtual memory areas.
This patch adds decoration on the stack created and used by
pthread_create, for glibc crated thread stack the /proc/self/maps will
now show:
[anon: glibc: pthread stack: <tid>]
And for user-provided stacks:
[anon: glibc: pthread user stack: <tid>]
The guard page is not decorated, and the mapping name is cleared when
the thread finishes its execution (so the cached stack does not have any
name associated).
Checked on x86_64-linux-gnu aarch64 aarch64-linux-gnu.
[1] 65376df582
Co-authored-by: Ian Rogers <irogers@google.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 5.17 added support to naming anonymous virtual memory areas
through the prctl syscall. The __set_vma_name is a wrapper to avoid
optimizing the prctl call if the kernel does not support it.
If the kernel does not support PR_SET_VMA_ANON_NAME, prctl returns
EINVAL. And it also returns the same error for an invalid argument.
Since it is an internal-only API, it assumes well-formatted input:
aligned START, with (START, START+LEN) being a valid memory range,
and NAME with a limit of 80 characters without an invalid one
("\\`$[]").
Reviewed-by: DJ Delorie <dj@redhat.com>
When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.
In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist. After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".
The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.
This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.
Fix the bug by simplifying the logic when choosing open(2) flags and
making sure O_CREAT is not set when the semaphore file is expected to
exist.
A regression test for this issue would require a complex and cpu time
consuming logic, since to trigger the wrong code path is not
straightforward due the racy condition. There is a somewhat reliable
reproducer in the bug, but it requires using OpenMPI.
This resolves BZ #30789.
See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Fixes: 533deafbdf ("Use O_CLOEXEC in more places (BZ #15722)")
It adds NT_X86_SHSTK (2fab02b25ae7cf5), NT_RISCV_CSR/NT_RISCV_VECTOR
(9300f00439743c4), and NT_LOONGARCH_HW_BREAK/NT_LOONGARCH_HW_WATCH
(1a69f7a161a78ae).
The years of dealing with Binutils, GCC and GDB test results
made the community create good tools for comparison and analysis
of DejaGnu test results. This change allows to use those tools
for Glibc's test results as well.
The motivation for this change is Linaro's pre-commit testers,
which use a modified version of GCC's validate_failures.py
to create test xfail lists with baseline failures and known
flaky tests. See below links for an example xfails file (only
one link is supposed to work at any given time):
- https://ci.linaro.org/job/tcwg_glibc_check--master-arm-build/lastSuccessfulBuild/artifact/artifacts/artifacts.precommit/sumfiles/xfails.xfail/*view*/
- https://ci.linaro.org/job/tcwg_glibc_check--master-arm-build/lastSuccessfulBuild/artifact/artifacts/sumfiles/xfails.xfail/*view*/
Specifacally, this patch changes format of glibc's .sum files from ...
<cut>
FAIL: elf/test1
PASS: string/test2
</cut>
... to ...
<cut>
=== glibc tests ===
Running elf ...
FAIL: elf/test1
Running string ...
PASS: string/test2
</cut>.
And output of "make check" from ...
<cut>
FAIL: elf/test1
</cut>
... to ...
<cut>
FAIL: elf/test1
=== Summary of results ===
1 FAIL
1 PASS
</cut>.
Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cleanup ifuncs. Remove uses of libc_hidden_builtin_def, use ENTRY rather than
ENTRY_ALIGN, remove unnecessary defines and conditional compilation. Rename
strlen_mte to strlen_generic. Remove rtld-memset.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Commit 7f602256ab moved the tst-rfc3484*
tests from posix/ to nss/, but didn't correct references to point to
their new subdir when building for mach and arm. This commit fixes
that.
Tested with build-many-glibcs.sh for i686-gnu.
This patch adds a qsort and qsort_r to trigger the worst case
scenario for the quicksort (which glibc current lacks coverage).
The test is done with random input, dfferent internal types (uint8_t,
uint16_t, uint32_t, uint64_t, large size), and with
different set of element numbers.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch removes the mergesort optimization on qsort implementation
and uses the introsort instead. The mergesort implementation has some
issues:
- It is as-safe only for certain types sizes (if total size is less
than 1 KB with large element sizes also forcing memory allocation)
which contradicts the function documentation. Although not required
by the C standard, it is preferable and doable to have an O(1) space
implementation.
- The malloc for certain element size and element number adds
arbitrary latency (might even be worse if malloc is interposed).
- To avoid trigger swap from memory allocation the implementation
relies on system information that might be virtualized (for instance
VMs with overcommit memory) which might lead to potentially use of
swap even if system advertise more memory than actually has. The
check also have the downside of issuing syscalls where none is
expected (although only once per execution).
- The mergesort is suboptimal on an already sorted array (BZ#21719).
The introsort implementation is already optimized to use constant extra
space (due to the limit of total number of elements from maximum VM
size) and thus can be used to avoid the malloc usage issues.
Resulting performance is slower due the usage of qsort, specially in the
worst-case scenario (partialy or sorted arrays) and due the fact
mergesort uses a slight improved swap operations.
This change also renders the BZ#21719 fix unrequired (since it is meant
to fix the sorted input performance degradation for mergesort). The
manual is also updated to indicate the function is now async-cancel
safe.
Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch makes the quicksort implementation to acts as introsort, to
avoid worse-case performance (and thus making it O(nlog n)). It switch
to heapsort when the depth level reaches 2*log2(total elements). The
heapsort is a textbook implementation.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The optimization takes in consideration both the most common elements
are either 32 or 64 bit in size and inputs are aligned to the word
boundary. This is similar to what msort does.
For large buffer the swap operation uses memcpy/mempcpy with a
small fixed size buffer (so compiler might inline the operations).
Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The prototype is:
void __memswap (void *restrict p1, void *restrict p2, size_t n)
The function swaps the content of two memory blocks P1 and P2 of
len N. Memory overlap is NOT handled.
It will be used on qsort optimization.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
All the crypt related functions, cryptographic algorithms, and
make requirements are removed, with only the exception of md5
implementation which is moved to locale folder since it is
required by localedef for integrity protection (libc's
locale-reading code does not check these, but localedef does
generate them).
Besides thec code itself, both internal documentation and the
manual is also adjusted. This allows to remove both --enable-crypt
and --enable-nss-crypt configure options.
Checked with a build for all affected ABIs.
Co-authored-by: Zack Weinberg <zack@owlfolio.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The libcrypt was maked to be phase out on 2.38, and a better project
already exist that provide both compatibility and better API
(libxcrypt). The sparc optimizations add the burden to extra
build-many-glibcs.py configurations.
Checked on sparc64 and sparcv9.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add support for MOPS in cpu_features and INIT_ARCH. Add ifuncs using MOPS for
memcpy, memmove and memset (use .inst for now so it works with all binutils
versions without needing complex configure and conditional compilation).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
getnameinfo is an entry points for nss functionality. This commit moves
it from the 'inet' subdirectory to 'nss'. The corresponding Versions
entry is also moved from 'posix' into 'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
getaddrinfo is an entry point for nss functionality. This commit moves
it from 'sysdeps/posix' to 'nss', gets rid of the stub in 'posix', and
moves all associated tests as well.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getservby* and getservent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getrpcby* and getrpcent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'. The Versions entries for these routines along with a test,
located in the 'sunrpc' subdirectory, are also moved into 'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getprotoby* and getprotoent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getnetby* and getnetent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These netgroup routines are entry points for nss functionality.
This commit moves them along with netgroup.h from the 'inet'
subdirectory to 'nss', and adjusts any references accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The gethostby* and gethostent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
ether_hostton and ether_ntohost are entry points for nss functionality.
This commit moves them from the 'inet' subdirectory to 'nss', and
adjusts any references accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The aliases routines are entry points for nss functionality. This
commit moves aliases.h and the aliases routines from the 'inet'
subdirectory to 'nss', and adjusts any external references.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of shadow routines are entry points for nss functionality.
This commit removes the 'shadow' subdirectory and moves all
functionality and tests to 'nss'. References to shadow/ are accordingly
changed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of pwd routines are entry points for nss functionality.
This commit removes the 'pwd' subdirectory and moves all functionality
and tests to 'nss'. References to pwd/ are accordingly changed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of gshadow routines are entry points for nss functionality.
This commit removes the 'gshadow' subdirectory and moves all
functionality and tests to 'nss'. References to gshadow/ are
accordingly changed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of grp routines are entry points for nss functionality.
This commit removes the 'grp' subdirectory and moves all nss-relevant
functionality and all tests to 'nss', and the 'setgroups' stub into
'posix' (alongside the 'getgroups' stub). References to grp/ are
accordingly changed. In addition, compat-initgroups.c, a fallback
implementation of initgroups is renamed to initgroups-fallback.c so that
the build system does not confuse it for nss_compat/compat-initgroups.c.
Build time improves very slightly; e.g. down from an average of 45.5s to
44.5s on an 8-thread mobile x86_64 CPU.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>