The dl-symbol-redir-ifunc.h redirects compiler-generated libcalls to
arch-specific memory implementations to avoid ifunc calls where it is not
yet possible. The memcmp-isa-default-impl.h aims to fix the same issue
by calling the specific memset implementation directly.
Using the memcmp symbol directly allows the compiler to inline the memset
calls (especially because _dl_tunable_set_hwcaps uses constants values),
generating better code.
Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The strlen might trigger and invalid GOT entry if it used before
the process is self-relocated (for instance on dl-tunables if any
error occurs).
For i386, _dl_writev with PIE requires to use the old 'int $0x80'
syscall mode because the calling the TLS register (gs) is not yet
initialized.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Instead of ignoring ill-formatted tunable strings, first, check all the
tunable definitions are correct and then set each tunable value. It
means that partially invalid strings, like "key1=value1:key2=key2=value'
or 'key1=value':key2=value2=value2' do not enable 'key1=value1'. It
avoids possible user-defined errors in tunable definitions.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Tunable definitions with more than one '=' on are parsed and enabled,
and any subsequent '=' are ignored. It means that tunables in the form
'tunable=tunable=value' or 'tunable=value=value' are handled as
'tunable=value'. These inputs are likely user input errors, which
should not be accepted.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Some environment variables allow alteration of allocator behavior
across setuid boundaries, where a setuid program may ignore the
tunable, but its non-setuid child can read it and adjust the memory
allocator behavior accordingly.
Most library behavior tunings is limited to the current process and does
not bleed in scope; so it is unclear how pratical this misfeature is.
If behavior change across privilege boundaries is desirable, it would be
better done with a wrapper program around the non-setuid child that sets
these envvars, instead of using the setuid process as the messenger.
The patch as fixes tst-env-setuid, where it fail if any unsecvars is
set. It also adds a dynamic test, although it requires
--enable-hardcoded-path-in-tests so kernel correctly sets the setuid
bit (using the loader command directly would require to set the
setuid bit on the loader itself, which is not a usual deployment).
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The tunable privilege levels were a retrofit to try and keep the malloc
tunable environment variables' behavior unchanged across security
boundaries. However, CVE-2023-4911 shows how tricky can be
tunable parsing in a security-sensitive environment.
Not only parsing, but the malloc tunable essentially changes some
semantics on setuid/setgid processes. Although it is not a direct
security issue, allowing users to change setuid/setgid semantics is not
a good security practice, and requires extra code and analysis to check
if each tunable is safe to use on all security boundaries.
It also means that security opt-in features, like aarch64 MTE, would
need to be explicit enabled by an administrator with a wrapper script
or with a possible future system-wide tunable setting.
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
setuid/setgid process now ignores any glibc tunables, and filters out
all environment variables that might changes its behavior. This patch
also adds GLIBC_TUNABLES, so any spawned process by setuid/setgid
processes should set tunable explicitly.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since malloc debug support moved to a different library
(libc_malloc_debug.so), the glibc.malloc.check requires preloading the
debug library to enable it. It means that suid-debug support has not
been working since 2.34.
To restore its support, it would require to add additional information
and parsing to where to find libc_malloc_debug.so.
It is one thing less that might change AT_SECURE binaries' behavior
due to environment configurations.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The existing logic avoided internal stack overflow. To avoid
a denial-of-service condition with adversarial input, it is necessary
to fall over to heapsort if tail-recursing deeply, too, which does
not result in a deep stack of pending partitions.
The new test stdlib/tst-qsort5 is based on Douglas McIlroy's paper
on this subject.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The previous implementation did not consistently apply the rule that
the child nodes of node K are at 2 * K + 1 and 2 * K + 2, or
that the parent node is at (K - 1) / 2.
Add an internal test that targets the heapsort implementation
directly.
Reported-by: Stepan Golosunov <stepan@golosunov.pp.ru>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In the insertion phase, we could run off the start of the array if the
comparison function never runs zero. In that case, it never finds the
initial element that terminates the iteration.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the unused 'char *name;' from the example.
Use write instead of putchar to write input as it is read.
Example tested on x86_64 by compiling and running the example.
Tested by building the manual pdf and reviewing the results.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux 6.6 (09da082b07bbae1c) added support for fchmodat2, which has
similar semantics as fchmodat with an extra flag argument. This
allows fchmodat to implement AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH
without the need for procfs.
The syscall is registered on all architectures (with value of 452
except on alpha which is 562, commit 78252deb023cf087).
The tst-lchmod.c requires a small fix where fchmodat checks two
contradictory assertions ('(st.st_mode & 0777) == 2' and
'(st.st_mode & 0777) == 3').
Checked on x86_64-linux-gnu on a 6.6 kernel.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
pool_max_size denotes total allocated rows in pool but possibly not yet
initialized. it's pool_size that represents number of actually occupied
rows hence use it when freeing pool to avoid freeing random addresses.
Signed-off-by: Jan Palus <jpalus@fastmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Arguments to a memchr call were swapped, causing incorrect skipping
of files.
Files related to dpkg have different names: they actually end in
.dpkg-new and .dpkg-tmp, not .tmp as I mistakenly assumed.
Fixes commit 2aa0974d25 ("elf: ldconfig should skip
temporary files created by package managers").
This ensures that the test still links with a linker that refuses
to create an executable stack marker automatically.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
So that the test is harder to confuse with elf/tst-execstack
(although the tests are supposed to be the same).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Existing MiG does not support untyped messages and the Hurd will
continue to use typed messages for the foreseeable future.
Message-ID: <ZVmYX6j4pYNUfqn4@jupiter.tail36e24.ts.net>
The `ty` pointer is only set at the end of the loop so that
`msgtl_header.msgt_inline` and `msgtl_header.msgt_deallocate` remain
valid. Also, when deallocating memory, we use the length from the
message directly rather than hard coding mach_port_t since we want to
deallocate any kind of OOL data.
Message-ID: <ZVlGVD6eEN-dXsOr@jupiter.tail36e24.ts.net>
The force_first parameter was ineffective because the dlclose'd
object was not necessarily the first in the maps array. Also
enable force_first handling unconditionally, regardless of namespace.
The initial object in a namespace should be destructed first, too.
The _dl_sort_maps_dfs function had early returns for relocation
dependency processing which broke force_first handling, too, and
this is fixed in this change as well.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The open_path stops if a relative path in search path contains a
component that is a non directory (for instance, if the component
is an existing file).
For instance:
$ cat > lib.c <<EOF
> void foo (void) {}
> EOF
$ gcc -shared -fPIC -o lib.so lib.c
$ cat > main.c <<EOF
extern void foo ();
int main () { foo (); return 0; }
EOF
$ gcc -o main main.c lib.so
$ LD_LIBRARY_PATH=. ./main
$ LD_LIBRARY_PATH=non-existing/path:. ./main
$ LD_LIBRARY_PATH=$(pwd)/main:. ./main
$ LD_LIBRARY_PATH=./main:. ./main
./main: error while loading shared libraries: lib.so: cannot open shared object file: No such file or directory
The invalid './main' should be ignored as a non-existent one,
instead as a valid but non accessible file.
Absolute paths do not trigger this issue because their status are
initialized as 'unknown' and open_path check if this is a directory.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
strrchr-evex-base used `vpcompress{b|d}` in the page cross logic but
was missing the CPU_FEATURE checks for VBMI2 in the
ifunc/ifunc-impl-list.
The fix is either to add those checks or change the logic to not use
`vpcompress{b|d}`. Choosing the latter here so that the strrchr-evex
implementation is usable on SKX.
New implementation is a bit slower, but this is in a cold path so its
probably okay.
Fixes commit a61933fe27 ("sparc: Remove bzero optimization") that
after moving code jumped to the wrong label 4.
Verfied by successfully running string/test-memset on sparc32.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When the given options do not include MACH_SEND_INTERRUPT,
_hurd_intr_rpc_mach_msg (aka mach_msg) is not supposed to return
MACH_SEND_INTERRUPTED. In such a case we thus have to retry sending the
message.
This was observed to fix various occurrences of spurious
"(ipc/send) interrupted" errors when running haskell programs.
The latest implementations of memcpy are actually faster than the Falkor
implementations [1], so remove the falkor/phecda ifuncs for memcpy and
the now unused IS_FALKOR/IS_PHECDA defines.
[1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add a specialized memset for the common ZVA size of 64 to avoid the
overhead of reading the ZVA size. Since the code is identical to
__memset_falkor, remove the latter.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cleanup emag memset - merge the memset_base64.S file, remove
the unused ZVA code (since it is disabled on emag).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This improves compatibility with applications which assume that qsort
does not invoke the comparison function with equal pointer arguments.
The newly introduced branches should be predictable, as leading to a
call to the comparison function. If the prediction fails, we avoid
calling the function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The PR_SET_VMA_ANON_NAME support is only enabled through a configurable
kernel switch, mainly because assigning a name to a
anonymous virtual memory area might prevent that area from being
merged with adjacent virtual memory areas.
For instance, with the following code:
void *p1 = mmap (NULL,
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
void *p2 = mmap (p1 + (1024 * 4096),
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
The kernel will potentially merge both mappings resulting in only one
segment of size 0x800000. If the segment is names with
PR_SET_VMA_ANON_NAME with different names, it results in two mappings.
Although this will unlikely be an issue for pthread stacks and malloc
arenas (since for pthread stacks the guard page will result in
a PROT_NONE segment, similar to the alignment requirement for the arena
block), it still might prevent the mmap memory allocated for detail
malloc.
There is also another potential scalability issue, where the prctl
requires
to take the mmap global lock which is still not fully fixed in Linux
[1] (for pthread stacks and arenas, it is mitigated by the stack
cached and the arena reuse).
So this patch disables anonymous mapping annotations as default and
add a new tunable, glibc.mem.decorate_maps, can be used to enable
it.
[1] https://lwn.net/Articles/906852/
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add anonymous mmap annotations on loader malloc, malloc when it
allocates memory with mmap, and on malloc arena. The /proc/self/maps
will now print:
[anon: glibc: malloc arena]
[anon: glibc: malloc]
[anon: glibc: loader malloc]
On arena allocation, glibc annotates only the read/write mapping.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>