Some environment variables allow alteration of allocator behavior
across setuid boundaries, where a setuid program may ignore the
tunable, but its non-setuid child can read it and adjust the memory
allocator behavior accordingly.
Most library behavior tunings is limited to the current process and does
not bleed in scope; so it is unclear how pratical this misfeature is.
If behavior change across privilege boundaries is desirable, it would be
better done with a wrapper program around the non-setuid child that sets
these envvars, instead of using the setuid process as the messenger.
The patch as fixes tst-env-setuid, where it fail if any unsecvars is
set. It also adds a dynamic test, although it requires
--enable-hardcoded-path-in-tests so kernel correctly sets the setuid
bit (using the loader command directly would require to set the
setuid bit on the loader itself, which is not a usual deployment).
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The tunable privilege levels were a retrofit to try and keep the malloc
tunable environment variables' behavior unchanged across security
boundaries. However, CVE-2023-4911 shows how tricky can be
tunable parsing in a security-sensitive environment.
Not only parsing, but the malloc tunable essentially changes some
semantics on setuid/setgid processes. Although it is not a direct
security issue, allowing users to change setuid/setgid semantics is not
a good security practice, and requires extra code and analysis to check
if each tunable is safe to use on all security boundaries.
It also means that security opt-in features, like aarch64 MTE, would
need to be explicit enabled by an administrator with a wrapper script
or with a possible future system-wide tunable setting.
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
setuid/setgid process now ignores any glibc tunables, and filters out
all environment variables that might changes its behavior. This patch
also adds GLIBC_TUNABLES, so any spawned process by setuid/setgid
processes should set tunable explicitly.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since malloc debug support moved to a different library
(libc_malloc_debug.so), the glibc.malloc.check requires preloading the
debug library to enable it. It means that suid-debug support has not
been working since 2.34.
To restore its support, it would require to add additional information
and parsing to where to find libc_malloc_debug.so.
It is one thing less that might change AT_SECURE binaries' behavior
due to environment configurations.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Arguments to a memchr call were swapped, causing incorrect skipping
of files.
Files related to dpkg have different names: they actually end in
.dpkg-new and .dpkg-tmp, not .tmp as I mistakenly assumed.
Fixes commit 2aa0974d25 ("elf: ldconfig should skip
temporary files created by package managers").
So that the test is harder to confuse with elf/tst-execstack
(although the tests are supposed to be the same).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The force_first parameter was ineffective because the dlclose'd
object was not necessarily the first in the maps array. Also
enable force_first handling unconditionally, regardless of namespace.
The initial object in a namespace should be destructed first, too.
The _dl_sort_maps_dfs function had early returns for relocation
dependency processing which broke force_first handling, too, and
this is fixed in this change as well.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The open_path stops if a relative path in search path contains a
component that is a non directory (for instance, if the component
is an existing file).
For instance:
$ cat > lib.c <<EOF
> void foo (void) {}
> EOF
$ gcc -shared -fPIC -o lib.so lib.c
$ cat > main.c <<EOF
extern void foo ();
int main () { foo (); return 0; }
EOF
$ gcc -o main main.c lib.so
$ LD_LIBRARY_PATH=. ./main
$ LD_LIBRARY_PATH=non-existing/path:. ./main
$ LD_LIBRARY_PATH=$(pwd)/main:. ./main
$ LD_LIBRARY_PATH=./main:. ./main
./main: error while loading shared libraries: lib.so: cannot open shared object file: No such file or directory
The invalid './main' should be ignored as a non-existent one,
instead as a valid but non accessible file.
Absolute paths do not trigger this issue because their status are
initialized as 'unknown' and open_path check if this is a directory.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The PR_SET_VMA_ANON_NAME support is only enabled through a configurable
kernel switch, mainly because assigning a name to a
anonymous virtual memory area might prevent that area from being
merged with adjacent virtual memory areas.
For instance, with the following code:
void *p1 = mmap (NULL,
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
void *p2 = mmap (p1 + (1024 * 4096),
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
The kernel will potentially merge both mappings resulting in only one
segment of size 0x800000. If the segment is names with
PR_SET_VMA_ANON_NAME with different names, it results in two mappings.
Although this will unlikely be an issue for pthread stacks and malloc
arenas (since for pthread stacks the guard page will result in
a PROT_NONE segment, similar to the alignment requirement for the arena
block), it still might prevent the mmap memory allocated for detail
malloc.
There is also another potential scalability issue, where the prctl
requires
to take the mmap global lock which is still not fully fixed in Linux
[1] (for pthread stacks and arenas, it is mitigated by the stack
cached and the arena reuse).
So this patch disables anonymous mapping annotations as default and
add a new tunable, glibc.mem.decorate_maps, can be used to enable
it.
[1] https://lwn.net/Articles/906852/
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add anonymous mmap annotations on loader malloc, malloc when it
allocates memory with mmap, and on malloc arena. The /proc/self/maps
will now print:
[anon: glibc: malloc arena]
[anon: glibc: malloc]
[anon: glibc: loader malloc]
On arena allocation, glibc annotates only the read/write mapping.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 4.5 removed thread stack annotations due to the complexity of
computing them [1], and Linux added PR_SET_VMA_ANON_NAME on 5.17
as a way to name anonymous virtual memory areas.
This patch adds decoration on the stack created and used by
pthread_create, for glibc crated thread stack the /proc/self/maps will
now show:
[anon: glibc: pthread stack: <tid>]
And for user-provided stacks:
[anon: glibc: pthread user stack: <tid>]
The guard page is not decorated, and the mapping name is cleared when
the thread finishes its execution (so the cached stack does not have any
name associated).
Checked on x86_64-linux-gnu aarch64 aarch64-linux-gnu.
[1] 65376df582
Co-authored-by: Ian Rogers <irogers@google.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
It adds NT_X86_SHSTK (2fab02b25ae7cf5), NT_RISCV_CSR/NT_RISCV_VECTOR
(9300f00439743c4), and NT_LOONGARCH_HW_BREAK/NT_LOONGARCH_HW_WATCH
(1a69f7a161a78ae).
All the crypt related functions, cryptographic algorithms, and
make requirements are removed, with only the exception of md5
implementation which is moved to locale folder since it is
required by localedef for integrity protection (libc's
locale-reading code does not check these, but localedef does
generate them).
Besides thec code itself, both internal documentation and the
manual is also adjusted. This allows to remove both --enable-crypt
and --enable-nss-crypt configure options.
Checked with a build for all affected ABIs.
Co-authored-by: Zack Weinberg <zack@owlfolio.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This avoids crashes due to partially written files, after a package
update is interrupted.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This reverts commit 6985865bc3.
Reason for revert:
The commit changes the order of ELF destructor calls too much relative
to what applications expect or can handle. In particular, during
process exit and _dl_fini, after the revert commit, we no longer call
the destructors of the main program first; that only happens after
some dlopen'ed objects have been destructed. This robs applications
of an opportunity to influence destructor order by calling dlclose
explicitly from the main program's ELF destructors. A couple of
different approaches involving reverse constructor order were tried,
and none of them worked really well. It seems we need to keep the
dependency sorting in _dl_fini.
There is also an ambiguity regarding nested dlopen calls from ELF
constructors: Should those destructors run before or after the object
that called dlopen? Commit 6985865bc3 used reverse order
of the start of ELF constructor calls for destructors, but arguably
using completion of constructors is more correct. However, that alone
is not sufficient to address application compatibility issues (it
does not change _dl_fini ordering at all).
The string parsing routine may end up writing beyond bounds of tunestr
if the input tunable string is malformed, of the form name=name=val.
This gets processed twice, first as name=name=val and next as name=val,
resulting in tunestr being name=name=val:name=val, thus overflowing
tunestr.
Terminate the parsing loop at the first instance itself so that tunestr
does not overflow.
This also fixes up tst-env-setuid-tunables to actually handle failures
correct and add new tests to validate the fix for this CVE.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Compilation fails when building with -DNDEBUG after commit a3189f66a5.
Here is the error:
dl-close.c: In function ‘_dl_close_worker’:
dl-close.c:140:22: error: unused variable ‘nloaded’ [-Werror=unused-variable]
140 | const unsigned int nloaded = ns->_ns_nloaded;
Add __attribute_maybe_unused__ for‘nloaded’to fix it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Now binutils use some E_MIPS_* macros and EF_MIPS_* macros, it is
difficult to decide which style macro we should use when we want
to add new ELF file header flags.
IRIX used to use EF_MIPS_* macros and in elf/elf.h there also has
comments "The following are unofficial names and should not be used".
So we should use EF_MIPS_* to keep same style with the beginning.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It is a left-over from commit 52a01100ad
("elf: Remove ad-hoc restrictions on dlopen callers [BZ #22787]").
When backporting commmit 6985865bc3
("elf: Always call destructors in reverse constructor order
(bug 30785)"), we can move the l_init_called_next field to this
place, so that the internal GLIBC_PRIVATE ABI does not change.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The current implementation of dlclose (and process exit) re-sorts the
link maps before calling ELF destructors. Destructor order is not the
reverse of the constructor order as a result: The second sort takes
relocation dependencies into account, and other differences can result
from ambiguous inputs, such as cycles. (The force_first handling in
_dl_sort_maps is not effective for dlclose.) After the changes in
this commit, there is still a required difference due to
dlopen/dlclose ordering by the application, but the previous
discrepancies went beyond that.
A new global (namespace-spanning) list of link maps,
_dl_init_called_list, is updated right before ELF constructors are
called from _dl_init.
In dl_close_worker, the maps variable, an on-stack variable length
array, is eliminated. (VLAs are problematic, and dlclose should not
call malloc because it cannot readily deal with malloc failure.)
Marking still-used objects uses the namespace list directly, with
next and next_idx replacing the done_index variable.
After marking, _dl_init_called_list is used to call the destructors
of now-unused maps in reverse destructor order. These destructors
can call dlopen. Previously, new objects do not have l_map_used set.
This had to change: There is no copy of the link map list anymore,
so processing would cover newly opened (and unmarked) mappings,
unloading them. Now, _dl_init (indirectly) sets l_map_used, too.
(dlclose is handled by the existing reentrancy guard.)
After _dl_init_called_list traversal, two more loops follow. The
processing order changes to the original link map order in the
namespace. Previously, dependency order was used. The difference
should not matter because relocation dependencies could already
reorder link maps in the old code.
The changes to _dl_fini remove the sorting step and replace it with
a traversal of _dl_init_called_list. The l_direct_opencount
decrement outside the loader lock is removed because it appears
incorrect: the counter manipulation could race with other dynamic
loader operations.
tst-audit23 needs adjustments to the changes in LA_ACT_DELETE
notifications. The new approach for checking la_activity should
make it clearer that la_activty calls come in pairs around namespace
updates.
The dependency sorting test cases need updates because the destructor
order is always the opposite order of constructor order, even with
relocation dependencies or cycles present.
There is a future cleanup opportunity to remove the now-constant
force_first and for_fini arguments from the _dl_sort_maps function.
Fixes commit 1df71d32fe ("elf: Implement
force_first handling in _dl_sort_maps_dfs (bug 28937)").
Reviewed-by: DJ Delorie <dj@redhat.com>
In short: __tls_get_addr checks the global generation counter and if
the current dtv is older then _dl_update_slotinfo updates dtv up to the
generation of the accessed module. So if the global generation is newer
than generation of the module then __tls_get_addr keeps hitting the
slow dtv update path. The dtv update path includes a number of checks
to see if any update is needed and this already causes measurable tls
access slow down after dlopen.
It may be possible to detect up-to-date dtv faster. But if there are
many modules loaded (> TLS_SLOTINFO_SURPLUS) then this requires at
least walking the slotinfo list.
This patch tries to update the dtv to the global generation instead, so
after a dlopen the tls access slow path is only hit once. The modules
with larger generation than the accessed one were not necessarily
synchronized before, so additional synchronization is needed.
This patch uses acquire/release synchronization when accessing the
generation counter.
Note: in the x86_64 version of dl-tls.c the generation is only loaded
once, since relaxed mo is not faster than acquire mo load.
I have not benchmarked this. Tested by Adhemerval Zanella on aarch64,
powerpc, sparc, x86 who reported that it fixes the performance issue
of bug 19924.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Parts of elf/tst-rtld-list-diagnostics.py have been copied from
scripts/tst-ld-trace.py.
The abnf module is entirely optional and used to verify the
ABNF grammar as included in the manual.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Leads to build failures (preprocessor redefinitions), and there is not
enough time to address this properly. Deferred until after 2.38 release.
This reverts commit 59dc07637f.
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Add new definitions for the MIPS target, specifically: relocation
types, machine flags, section type names, and object attribute tags
and values. On MIPS64, up to three relocations may be specified
within r_info, by the r_type, r_type2, and r_type3 fields, so add new
macros to get the respective reloc types for MIPS64.
Starting with commit 1bcfe0f732, the
test was enhanced and the object for __builtin_return_address (0)
is searched with _dl_find_object.
Unfortunately on e.g. s390 (31bit), a postprocessing step is needed
as the highest bit has to be masked out. This can be done with
__builtin_extract_return_addr.
Without this postprocessing, _dl_find_object returns with -1 and the
content of dlfo is invalid, which may lead to segfaults in basename.
Therefore those checks are now only done on success.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The sparc ABI has multiple cases on how to handle JMP_SLOT relocations,
(sparc_fixup_plt/sparc64_fixup_plt). For BINDNOW, _dl_audit_symbind
will be responsible to setup the final relocation value; while for
lazy binding _dl_fixup/_dl_profile_fixup will call the audit callback
and tail cail elf_machine_fixup_plt (which will call
sparc64_fixup_plt).
This patch fixes by issuing the SPARC specific routine on bindnow and
forwarding the audit value to elf_machine_fixup_plt for lazy resolution.
It fixes the la_symbind for bind-now tests on sparc64 and sparcv9:
elf/tst-audit24a
elf/tst-audit24b
elf/tst-audit24c
elf/tst-audit24d
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Success is reported with a 0 return value, and failure is -1.
Enhance the kitchen sink test elf/tst-audit28 to cover
_dl_find_object as well.
Fixes commit 5d28a8962d ("elf: Add _dl_find_object function")
and bug 30515.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Add --enable-fortify-source option.
It is now possible to enable fortification through a configure option.
The level may be given as parameter, if none is provided, the configure
script will determine what is the highest level possible that can be set
considering GCC built-ins availability and set it.
If level is explicitly set to 3, configure checks if the compiler
supports the built-in function necessary for it or raise an error if it
isn't.
If the configure option isn't explicitly enabled, it _FORTIFY_SOURCE is
forcibly undefined (and therefore disabled).
The result of the configure checks are new variables, ${fortify_source}
and ${no_fortify_source} that can be used to appropriately populate
CFLAGS.
A dedicated patch will follow to make use of this variable in Makefiles
when necessary.
Updated NEWS and INSTALL.
Adding dedicated x86_64 variant that enables the configuration.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The first segment in a shared library may be read-only, not executable.
To support LD_PREFER_MAP_32BIT_EXEC on such shared libraries, we also
check MAP_DENYWRITE to decide if MAP_32BIT should be passed to mmap.
Normally the first segment is mapped with MAP_COPY, which is defined
as (MAP_PRIVATE | MAP_DENYWRITE). But if the segment alignment is
greater than the page size, MAP_COPY isn't used to allocate enough
space to ensure that the segment can be properly aligned. Map the
first segment with MAP_COPY in this case to fix BZ #30452.
ldconfig was allocating PATH_MAX bytes on the stack for the library file
name. The issues with PATH_MAX usage are well documented [0][1]; even if
a program does not rely on paths being limited to PATH_MAX bytes,
allocating 4096 bytes on the stack for paths that are typically rather
short (strlen ("/lib64/libc.so.6") is 16) is wasteful and dangerous.
[0]: https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
[1]: https://eklitzke.org/path-max-is-tricky
Instead, make use of asprintf to dynamically allocate memory of just the
right size on the heap.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
With fortification enabled, system calls return result needs to be checked,
has it gets the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Sort Makefile variables using scrips/sort-makefile-lines.py.
No code generation changes observed in non-test binary artifacts.
No regressions on x86_64 and i686.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Previously, after destructors for a DSO have been invoked, ld.so refused
to bind against that DSO in all cases. Relax this restriction somewhat
if the referencing object is itself a DSO that is being unloaded. This
assumes that the symbol reference is not going to be stored anywhere.
The situation in the test case can arise fairly easily with C++ and
objects that are built with different optimization levels and therefore
define different functions with vague linkage.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Applying this commit results in bit-identical libc.so.6.
The elf/ld-linux-x86-64.so.2 does change, but only in .note.gnu.build-id
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Linux 6.3 adds constants AT_RSEQ_FEATURE_SIZE and AT_RSEQ_ALIGN; add
them to glibc's elf.h. (Recall that, although elf.h is a
system-independent header, so far we've put AT_* constants there even
if Linux-specific, as discussed in bug 15794. So rather than making
any attempt to fix that issue, the new constants are just added there
alongside the existing ones.)
Tested for x86_64.
This patch checks _dl_debug_vdprintf, by passing various inputs to
_dl_dprintf and comparing the output with invocations of snprintf.
Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
_dl_debug_vdprintf is a bare-bones printf implementation; currently
printing a signed integer (using "%d" format specifier) behaves
incorrectly when the number is negative, as it just prints the
corresponding unsigned integer, preceeded by a minus sign.
For example, _dl_printf("%d", -1) would print '-4294967295'.
Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
f55727ca53 updated open_path to use the
r_search_path_struct struct but failed to update the comment.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
When dlopen is being called, efforts have been made to improve
future lookup performance. This includes marking a search path
as non-existent using `stat`. However, if the root directory
is given as a search path, there exists a bug which erroneously
marks it as non-existing.
The bug is reproduced under the following sequence:
1. dlopen is called to open a shared library, with at least:
1) a dependency 'A.so' not directly under the '/' directory
(e.g. /lib/A.so), and
2) another dependency 'B.so' resides in '/'.
2. for this bug to reproduce, 'A.so' should be searched *before* 'B.so'.
3. it first tries to find 'A.so' in /, (e.g. /A.so):
- this will (obviously) fail,
- since it's the first time we have seen the '/' directory,
its 'status' is 'unknown'.
4. `buf[buflen - namelen - 1] = '\0'` is executed:
- it intends to remove the leaf and its final slash,
- because of the speciality of '/', its buflen == namelen + 1,
- it erroneously clears the entire buffer.
6. it then calls 'stat' with the empty buffer:
- which will result in an error.
7. so it marks '/' as 'nonexisting', future lookups will not consider
this path.
8. while /B.so *does* exist, failure to look it up in the '/'
directory leads to a 'cannot open shared object file' error.
This patch fixes the bug by preventing 'buflen', an index to put '\0',
from being set to 0, so that the root '/' is always kept.
Relative search paths are always considered as 'existing' so this
wont be affected.
Writeup by Moody Liu <mooodyhunter@outlook.com>
Suggested-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Qixing ksyx Xue <qixingxue@outlook.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Sort tests against updated scripts/sort-makefile-lines.py.
No changes in generated code.
No regressions on x86_64 and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Fix list terminator whitspace.
Sort using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
In some cases, we do not want to go through the resolver for function
calls. For example, functions with vector arguments will use vector
registers to pass arguments. In the resolver, we do not save/restore the
vector argument registers for lazy binding efficiency. To avoid ruining
the vector arguments, functions with vector arguments will not go
through the resolver.
To achieve the goal, we will annotate the function symbols with
STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
section. In the first pass on PLT relocations, we do not set up to call
_dl_runtime_resolve. Instead, we resolve the functions directly.
Signed-off-by: Hsiangkai Wang <kai.wang@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://inbox.sourceware.org/libc-alpha/20230314162512.35802-1-kito.cheng@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Support for SFrame format is available in Binutils 2.40. The GNU ld merges
the input .sframe sections and creates an output .sframe section in a
segment PT_GNU_SFRAME.
When opening a temporary file without O_CLOEXEC we risk leaking the
file descriptor if another thread calls (fork and then) exec while we
have the fd open. Fix this by consistently passing O_CLOEXEC everywhere
where we open a file for internal use (and not to return it to the user,
in which case the API defines whether or not the close-on-exec flag
shall be set on the returned fd).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230419160207.65988-4-bugaevc@gmail.com>
And make always supported. The configure option was added on glibc 2.25
and some features require it (such as hwcap mask, huge pages support, and
lock elisition tuning). It also simplifies the build permutations.
Changes from v1:
* Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs
more discussion.
* Cleanup more code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Now that there is no need to use a special linker script to hardening
internal data structures, remove the --with-default-link configure
option and associated definitions.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Instead of using a special ELF section along with a linker script
directive to put the IO vtables within the RELRO section, the libio
vtables are all moved to an array marked as data.relro (so linker
will place in the RELRO segment without the need of extra directives).
To avoid static linking namespace issues and including all vtable
referenced objects, all required function pointers are set to weak alias.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
They are both used by __libc_freeres to free all library malloc
allocated resources to help tooling like mtrace or valgrind with
memory leak tracking.
The current scheme uses assembly markers and linker script entries
to consolidate the free routine function pointers in the RELRO segment
and to be freed buffers in BSS.
This patch changes it to use specific free functions for
libc_freeres_ptrs buffers and call the function pointer array directly
with call_function_static_weak.
It allows the removal of both the internal macros and the linker
script sections.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
After commit ed3ce71f5c ("elf: Move la_activity (LA_ACT_ADD) after
_dl_add_to_namespace_list() (BZ #28062)") it is no longer necessary to
reset the debugger state in the error case, since the debugger
notification only happens after no more errors can occur.
It was possible to run this test individually and have it fail because
it can't find testobj1.so. This patch adds that dependency, to prevent
such issues.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Some toolchains, such as that used on Gentoo Hardened, set -z now out of
the box. This trips up a couple of tests.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
When mcount overflows, no gmon.out file is generated, but no message is printed
to the user, leaving the user with no idea why, and thinking maybe there is
some bug - which is how BZ 27576 ended up being logged. Print a message to
stderr in this case so the user knows what is going on.
As a comment in sys/gmon.h acknowledges, the hardcoded MAXARCS value is too
small for some large applications, including the test case in that BZ. Rather
than increase it, add tunables to enable MINARCS and MAXARCS to be overridden
at runtime (glibc.gmon.minarcs and glibc.gmon.maxarcs). So if a user gets the
mcount overflow error, they can try increasing maxarcs (they might need to
increase minarcs too if the heuristic is wrong in their case.)
Note setting minarcs/maxarcs too large can cause monstartup to fail with an
out of memory error. If you set them large enough, it can cause an integer
overflow in calculating the buffer size. I haven't done anything to defend
against that - it would not generally be a security vulnerability, since these
tunables will be ignored in suid/sgid programs (due to the SXID_ERASE default),
and if you can set GLIBC_TUNABLES in the environment of a process, you can take
it over anyway (LD_PRELOAD, LD_LIBRARY_PATH, etc). I thought about modifying
the code of monstartup to defend against integer overflows, but doing so is
complicated, and I realise the existing code is susceptible to them even prior
to this change (e.g. try passing a pathologically large highpc argument to
monstartup), so I decided just to leave that possibility in-place.
Add a test case which demonstrates mcount overflow and the tunables.
Document the new tunables in the manual.
Signed-off-by: Simon Kissane <skissane@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
While cleaning up old libc version support, the deprecated libc4 code was
accidentally kept in `implicit_soname`, instead of the libc6 code.
This causes additional symlinks to be created by `ldconfig` for libraries
without a soname, e.g. a library `libsomething.123.456.789` without a soname
will create a `libsomething.123` -> `libsomething.123.456.789` symlink.
As the libc6 version of the `implicit_soname` code is a trivial `xstrdup`,
just inline it and remove `implicit_soname` altogether.
Some further simplification looks possible (e.g. the call to `create_links`
looks like a no-op if `soname == NULL`, other than the verbose printfs), but
logic is kept as-is for now.
Fixes: BZ #30125
Fixes: 8ee878592c ("Assume only FLAG_ELF_LIBC6 suport")
Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The test is sufficient to detect the ldconfig bug fixed in
commit 9fe6f63638 ("elf: Fix 64 time_t
support for installed statically binaries").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Almost all uses of rawmemchr find the end of a string. Since most targets use
a generic implementation, replacing it with strchr is better since that is
optimized by compilers into strlen (s) + s. Also fix the generic rawmemchr
implementation to use a cast to unsigned char in the if statement.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The 73fc4e28b9 refactor did not add the GL(dl_phdr) and
GL(dl_phnum) for static build, relying on the __ehdr_start symbol,
which is always added by the static linker, to get the correct values.
This is problematic in some ways:
- The segment may see its in-memory size differ from its in-file
size (or the binary may have holes). The Linux has fixed is to
provide concise values for both AT_PHDR and AT_PHNUM (commit
0da1d5002745c - "fs/binfmt_elf: Fix AT_PHDR for unusual ELF files")
- Some archs (alpha for instance) the hidden weak reference is not
correctly pulled by the static linker and __ehdr_start address
end up being 0, which makes GL(dl_phdr) and GL(dl_phnum) have both
invalid values (and triggering a segfault later on libc.so while
accessing TLS variables).
The safer fix is to just restore the previous behavior to setup
GL(dl_phdr) and GL(dl_phnum) for static based on kernel auxv. The
__ehdr_start fallback can also be simplified by not assuming weak
linkage (as for PIE).
The libc-static.c auxv init logic is moved to dl-support.c, since
the later is build without SHARED and then GLRO macro is defined
to access the variables directly.
The _dl_phdr is also assumed to be always non NULL, since an invalid
NULL values does not trigger TLS initialization (which is used in
various libc systems).
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
I've updated copyright dates in glibc for 2023. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.
Always null-terminate the buffer and set E2BIG if the buffer is too
small. This fixes bug 27857.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
vfprintf is entangled with vfwprintf (of course), __printf_fp,
__printf_fphex, __vstrfmon_l_internal, and the strfrom family of
functions. The latter use the internal snprintf functionality,
so vsnprintf is converted as well.
The simples conversion is __printf_fphex, followed by
__vstrfmon_l_internal and __printf_fp, and finally
__vfprintf_internal and __vfwprintf_internal. __vsnprintf_internal
and strfrom* are mostly consuming the new interfaces, so they
are comparatively simple.
__printf_fp is a public symbol, so the FILE *-based interface
had to preserved.
The __printf_fp rewrite does not change the actual binary-to-decimal
conversion algorithm, and digits are still not emitted directly to
the target buffer. However, the staging buffer now uses bytes
instead of wide characters, and one buffer copy is eliminated.
The changes are at least performance-neutral in my testing.
Floating point printing and snprintf improved measurably, so that
this Lua script
for i=1,5000000 do
print(i, i * math.pi)
end
runs about 5% faster for me. To preserve fprintf performance for
a simple "%d" format, this commit has some logic changes under
LABEL (unsigned_number) to avoid additional function calls. There
are certainly some very easy performance improvements here: binary,
octal and hexadecimal formatting can easily avoid the temporary work
buffer (the number of digits can be computed ahead-of-time using one
of the __builtin_clz* built-ins). Decimal formatting can use a
specialized version of _itoa_word for base 10.
The existing (inconsistent) width handling between strfmon and printf
is preserved here. __print_fp_buffer_1 would have to use
__translated_number_width to achieve ISO conformance for printf.
Test expectations in libio/tst-vtables-common.c are adjusted because
the internal staging buffer merges all virtual function calls into
one.
In general, stack buffer usage is greatly reduced, particularly for
unbuffered input streams. __printf_fp can still use a large buffer
in binary128 mode for %g, though.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These buffers will eventually be used instead of FILE * objects
to implement printf functions. The multibyte buffer is struct
__printf_buffer, the wide buffer is struct __wprintf_buffer.
To enable writing type-generic code, the header files
printf_buffer-char.h and printf_buffer-wchar_t.h define the
Xprintf macro differently, enabling Xprintf (buffer) to stand
for __printf_buffer and __wprintf_buffer as appropriate. For
common cases, macros like Xprintf_buffer are provided as a more
syntactically convenient shortcut.
Buffer-specific flush callbacks are implemented with a switch
statement instead of a function pointer, to avoid hardening issues
similar to those of libio vtables. struct __printf_buffer_as_file
is needed to support custom printf specifiers because the public
interface for that requires passing a FILE *, which is why there
is a trapdoor back from these buffers to FILE * streams.
Since the immediate user of these interfaces knows when processing
has finished, there is no flush callback for the end of processing,
only a flush callback for the intermediate buffer flush.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Current scheme only consideres the first argument for both --required
and --optional, where the idea is to append a new item.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The static linker might impose any order or internal function
position, so change the test to check if the audit prints the
symbol only once in any order.
Check the return value of malloc based on the function header comment of
_dl_make_tlsdesc_dynamic(). If the return value fails, NULL is returned.
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This makes it more likely that the compiler can compute the strlen
argument in _startup_fatal at compile time, which is required to
avoid a dependency on strlen this early during process startup.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The old exception handling implementation used function interposition
to replace the dynamic loader implementation (no TLS support) with the
libc implementation (TLS support). This results in problems if the
link order between the dynamic loader and libc is reversed (bug 25486).
The new implementation moves the entire implementation of the
exception handling functions back into the dynamic loader, using
THREAD_GETMEM and THREAD_SETMEM for thread-local data support.
These depends on Hurd support for these macros, added in commit
b65a82e4e7 ("hurd: Add THREAD_GET/SETMEM/_NC").
One small obstacle is that the exception handling facilities are used
before the TCB has been set up, so a check is needed if the TCB is
available. If not, a regular global variable is used to store the
exception handling information.
Also rename dl-error.c to dl-catch.c, to avoid confusion with the
dlerror function.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The maximum number of directives is already limited by the maximum
value of iovec, and current padding usage on _dl_map_object_from_fd
specifies a value of 16 (2 times sizeof (void *)) in hexa, which is
less than the INT_STRLEN_BOUND(void *) (20 for LP64).
This works if pointers are larger than 8 bytes, for instance 16.
In this case the maximum padding would be 32 and the IFMTSIZE would
be 40.
The resulting code does use a slightly larger static stack, the
output of -fstack-usage (for x86_64):
* master:
dl-printf.c:35:1:_dl_debug_vdprintf 1344 dynamic
* patch:
dl-printf.c:36:1:_dl_debug_vdprintf 2416 static
However, there is an improvement in code generation:
* master
text data bss dec hex filename
3309 0 0 3309 ced elf/dl-printf.os
* patch
text data bss dec hex filename
3151 0 0 3151 c4f elf/dl-printf.os
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
When --enable-hardcoded-path-in-tests is used only with DT_RUNPATH,
elf/tst-relr3 and elf/tst-relr4 failed to run. Their dependency
libraries, tst-relr-mod3a.so and tst-relr-mod4a.so, are failed to
load since DT_RUNPATH on executable doesn't apply to them. Build
tst-relr-mod3a.so and tst-relr-mod4a.so with $(LDFLAGS-rpath-ORIGIN)
to add DT_RUNPATH for their dependency libraries.
Reviewed-by: Fangrui Song <maskray@google.com>
By default glibc only allocates enough static TLS for 4 link namespaces
including the initial one. So only use 3 dlmopens in the test.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The alloca size did not consider the optional width parameter for
padding which could cause buffer underflow. The width is currently used
e.g. by _dl_map_object_from_fd which passes 2 * sizeof(void *) which
can be larger than the alloca buffer size on targets where
sizeof(void *) >= 2 * sizeof(unsigned long).
Even if large width is not used on existing targets it is better to fix
the formatting code to avoid surprises.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This consolidates the destructor invocations from _dl_fini and
dlclose. Remove the micro-optimization that avoids
calling _dl_call_fini if they are no destructors (as dlclose is quite
expensive anyway). The debug log message is now printed
unconditionally.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This allows the rest of dynamic loader to check whether the TCB
has been set up (and THREAD_GETMEM and THREAD_SETMEM will work).
Reviewed-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
The data in the _ns_debug member must be preserved, otherwise
_dl_debug_initialize enters an infinite loop. To be conservative,
only clear the libc_map member for now, to fix bug 29528.
Fixes commit d0e357ff45
("elf: Call __libc_early_init for reused namespaces (bug 29528)"),
by reverting most of it.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Besides the option being gcc specific, this approach is still fragile
and not future proof since we do not know if this will be the only
optimization option gcc will add that transforms loops to memset
(or any libcall).
This patch adds a new header, dl-symbol-redir-ifunc.h, that can b
used to redirect the compiler generated libcalls to port the generic
memset implementation if required.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The print_hwcap_1 machinery was useful for the legacy hwcaps
subdirectories, but it is not worth the trouble now that they
are gone.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
They were set to 0 on initialisation and never changed later after
last commit.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Last commit made it so that the value passed for that parameter was
always 0 at its only call site.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove support for the legacy hwcaps subdirectories from ldconfig.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove support for the legacy hwcaps subdirectories from the dynamic
loader.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The Z modifier is a nonstandard synonymn for z (that predates z
itself) and compiler might issue an warning for in invalid
conversion specifier.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The need to maintain elf/elf.h and scripts/glibcelf.py in parallel
results in a backporting hazard: they need to be kept in sync to
avoid elf/tst-glibcelf consistency check failures. glibcelf (unlike
tst-glibcelf) does not use the C implementation to extract constants.
This applies the additional glibcpp syntax checks to <elf.h>.
This changereplaces the types derived from Python enum types with
custom types _TypedConstant, _IntConstant, and _FlagConstant. These
types have fewer safeguards, but this also allows incremental
construction and greater flexibility for grouping constants among
the types. Architectures-specific named constants are now added
as members into their superclasses (but value-based lookup is
still restricted to generic constants only).
Consequently, check_duplicates in elf/tst-glibcelf has been adjusted
to accept differently-named constants of the same value if their
subtypes are distinct. The ordering check for named constants
has been dropped because they are no longer strictly ordered.
Further test adjustments: Some of the type names are different.
The new types do not support iteration (because it is unclear
whether iteration should cover the all named values (including
architecture-specific constants), or only the generic named values),
so elf/tst-glibcelf now uses by_name explicit (to get all constants).
PF_HP_SBP and PF_PARISC_SBP are now of distinct types (PfHP and
PfPARISC), so they are how both present on the Python side. EM_NUM
and PT_NUM are filtered (which was an oversight in the old
conversion).
The new version of glibcelf should also be compatible with earlier
Python versions because it no longer depends on the enum module and its
advanced features.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The implementation in _dl_close_worker requires that the first
element of l_initfini is always this very map (“We are always the
zeroth entry, and since we don't include ourselves in the
dependency analysis start at 1.”). Rather than fixing that
assumption, this commit adds an implementation of the force_first
argument to the new dependency sorting algorithm. This also means
that the directly dlopen'ed shared object is always initialized last,
which is the least surprising behavior in the presence of cycles.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
make-4.4 will add long flags to MAKEFLAGS variable:
* WARNING: Backward-incompatibility!
Previously only simple (one-letter) options were added to the MAKEFLAGS
variable that was visible while parsing makefiles. Now, all options
are available in MAKEFLAGS.
This causes locale builds to fail when long options are used:
$ make --shuffle
...
make -C localedata install-locales
make: invalid shuffle mode: '1662724426r'
The change fixes it by passing eash option via whitespace and dashes.
That way option is appended to both single-word form and whitespace
separated form.
While at it fixed --silent mode detection in $(MAKEFLAGS) by filtering
out --long-options. Otherwise options like --shuffle flag enable silent
mode unintentionally. $(silent-make) variable consolidates the checks.
Resolves: BZ# 29564
CC: Paul Smith <psmith@gnu.org>
CC: Siddhesh Poyarekar <siddhesh@gotplt.org>
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Commit dad90d5282 added glibc-hwcaps
support for LD_LIBRARY_PATH and, for this, it adjusted the total
string size required in _dl_important_hwcaps. However, in doing so
it inadvertently altered the calculation of the size required for
the power set strings, as the computation of the power set string
size depended on the first value assigned to the total variable,
which is later shifted, resulting in overallocation of string
space. Fix this now by using a different variable to hold the
string size required for glibc-hwcaps.
Signed-off-by: Javier Pello <devel@otheo.eu>
This did not cause a warning before because the token sequence for
the two definitions was identical.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The d7703d3176 changed how vDSO like
dependencies are printed, instead of just the name and address it
follows other libraries mode and prints 'name => path'.
Unfortunately, this broke some ldd consumer that uses the output to
filter out the program's dependencies. For instance CMake
bundleutilities module [1], where GetPrequirite uses the regex to filter
out 'name => path' [2].
This patch restore the previous way to print just the name and the
mapping address.
Checked on x86_64-linux-gnu.
[1] https://github.com/Kitware/CMake/tree/master/Tests/BundleUtilities
[2] https://github.com/Kitware/CMake/blob/master/Modules/GetPrerequisites.cmake#L733
Reviewed-by: Florian Weimer <fweimer@redhat.com>
libc_map is never reset to NULL, neither during dlclose nor on a
dlopen call which reuses the namespace structure. As a result, if a
namespace is reused, its libc is not initialized properly. The most
visible result is a crash in the <ctype.h> functions.
To prevent similar bugs on namespace reuse from surfacing,
unconditionally initialize the chosen namespace to zero using memset.
This reverts commit 6f85dbf102.
Once this change hits the release branches, it will require relinking
of all statically linked applications before static dlopen works
again, for the majority of updates on release branches: The NEWS file
is regularly updated with bug references, so the __libc_early_init
suffix changes, and static dlopen cannot find the function anymore.
While this ABI check is still technically correct (we do require
rebuilding & relinking after glibc updates to keep static dlopen
working), it is too drastic for stable release branches.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
contribute to the version fingerprint used for detection. The
fingerprint can be further refined using the --with-extra-version-id
configure argument.
_dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init.
The new function is used store a pointer to libc.so's
__libc_early_init function in the libc_map_early_init member of the
ld.so namespace structure. This function pointer can then be called
directly, so the separate invocation function is no longer needed.
The versioned symbol lookup needs the symbol versioning data
structures, so the initialization of libc_map and libc_map_early_init
is now done from _dl_check_map_versions, after this information
becomes available. (_dl_map_object_from_fd does not set this up
in time, so the initialization code had to be moved from there.)
This means that the separate initialization code can be removed from
dl_main because _dl_check_map_versions covers all maps, including
the initial executable loaded by the kernel. The lookup still happens
before relocation and the invocation of IFUNC resolvers, so IFUNC
resolvers are protected from ABI mismatch.
The __libc_early_init function pointer is not protected because
so little code runs between the pointer write and the invocation
(only dynamic linker code and IFUNC resolvers).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
ELF and GNU hashes can now be computed using the elf_hash and
gnu_hash functions.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The test is valid for all TLS models, but we want to make a reasonable
effort to test the GNU2 model specifically. For example, aarch64
defaults to GNU2, but does not have -mtls-dialect=gnu2, and the test
was not run there.
Suggested-by: Martin Coufal <mcoufal@redhat.com>
GCC normally does this optimization for us in
strlen_pass::handle_builtin_strcpy but only for optimized
build. To avoid needing to include strcpy.S in the rtld build to
support the debug build, just do the optimization by hand.
The older libc versions are obsolete for over twenty years now.
This patch removes the special flags for libc5 and libc4 and assumes
that all libraries cached are libc6 compatible and use FLAG_ELF_LIBC6.
Checked with a build for all affected architectures.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Redirect internal assertion failures to __libc_assert_fail, based on
based on __libc_message, which writes directly to STDERR_FILENO
and calls abort. Also disable message translation and reword the
error message slightly (adjusting stdlib/tst-bz20544 accordingly).
As a result of these changes, malloc no longer needs its own
redefinition of __assert_fail.
__libc_assert_fail needs to be stubbed out during rtld dependency
analysis because the rtld rebuilds turn __libc_assert_fail into
__assert_fail, which is unconditionally provided by elf/dl-minimal.c.
This change is not possible for the public assert macro and its
__assert_fail function because POSIX requires that the diagnostic
is written to stderr.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The fix done b2cd93fce6 does not really
work since macro strification does not expand the sizeof nor the
arithmetic operation.
Checked on x86_64-linux-gnu.
NODELETE status is propagated from the referencing object to the
referenced object, not the other way round. The code is correct, only
the log message has the wrong direction.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Using -Werror and -DNDEBUG at the same time will trigger the
following compiler error:
cache.c: In function 'save_cache':
cache.c:758:15: error: unused variable 'old_offset' [-Werror=unused-variable]
758 | off64_t old_offset = lseek64 (fd, extension_offset, SEEK_SET);
| ^~~~~~~~~~
-DNDEBUG disables the assertion, making old_offset unused.
Use __attribute__ ((unused)) to disable this warning.
By adding an internal alias to avoid the GOT indirection.
On some architecture, __libc_single_thread may be accessed through
copy relocations and thus it requires to update also the copies
default copy.
This is done by adding a new internal macro,
libc_hidden_data_{proto,def}, which has an addition argument that
specifies the alias name (instead of default __GI_ one).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Fangrui Song <maskray@google.com>
If an executable has copy relocations for extern protected data, that
can only work if the library containing the definition is built with
assumptions (a) the compiler emits GOT-generating relocations (b) the
linker produces R_*_GLOB_DAT instead of R_*_RELATIVE. Otherwise the
library uses its own definition directly and the executable accesses a
stale copy. Note: the GOT relocations defeat the purpose of protected
visibility as an optimization, but allow rtld to make the executable and
library use the same copy when copy relocations are present, but it
turns out this never worked perfectly.
ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange semantics when both
a.so and b.so define protected var and the executable copy relocates
var: b.so accesses its own copy even with GLOB_DAT. The behavior change
is from commit 62da1e3b00 (x86) and then
copied to nios2 (ae5eae7cfc) and arc
(0e7d930c4c).
Without ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA, b.so accesses the copy
relocated data like a.so.
There is now a warning for copy relocation on protected symbol since
commit 7374c02b68. It's extremely
unlikely anyone relies on the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA
behavior, so let's remove it: this removes a check in the symbol lookup
code.
Newer versions of GNU grep (after grep 3.7, not inclusive) will warn on
'egrep' and 'fgrep' invocations.
Convert usages within the tree to their expanded non-aliased counterparts
to avoid irritating warnings during ./configure and the test suite.
Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Fangrui Song <maskray@google.com>
When the first object providing foo defines both foo@v1 and foo@@v2,
dlsym(RTLD_NEXT, "foo") returns foo@v1 while dlsym(RTLD_DEFAULT, "foo")
returns foo@@v2. The issue is that RTLD_DEFAULT uses the
DL_LOOKUP_RETURN_NEWEST flag while RTLD_NEXT doesn't. Fix the RTLD_NEXT
branch to use DL_LOOKUP_RETURN_NEWEST.
Note: the new behavior matches FreeBSD rtld. Future sanitizers will not
need to add versioned interceptors like https://reviews.llvm.org/D96348
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
An __always_inline static function is better to find where exactly a
crash happens, so one can step into the function with GDB.
Reviewed-by: Fangrui Song <maskray@google.com>
Unroll slightly and enforce good instruction scheduling. This improves
performance on out-of-order machines. The unrolling allows for
pipelined multiplies.
As well, as an optional sysdep, reorder the operations and prevent
reassosiation for better scheduling and higher ILP. This commit
only adds the barrier for x86, although it should be either no
change or a win for any architecture.
Unrolling further started to induce slowdowns for sizes [0, 4]
but can help the loop so if larger sizes are the target further
unrolling can be beneficial.
Results for _dl_new_hash
Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
Time as Geometric Mean of N=30 runs
Geometric of all benchmark New / Old: 0.674
type, length, New Time, Old Time, New Time / Old Time
fixed, 0, 2.865, 2.72, 1.053
fixed, 1, 3.567, 2.489, 1.433
fixed, 2, 2.577, 3.649, 0.706
fixed, 3, 3.644, 5.983, 0.609
fixed, 4, 4.211, 6.833, 0.616
fixed, 5, 4.741, 9.372, 0.506
fixed, 6, 5.415, 9.561, 0.566
fixed, 7, 6.649, 10.789, 0.616
fixed, 8, 8.081, 11.808, 0.684
fixed, 9, 8.427, 12.935, 0.651
fixed, 10, 8.673, 14.134, 0.614
fixed, 11, 10.69, 15.408, 0.694
fixed, 12, 10.789, 16.982, 0.635
fixed, 13, 12.169, 18.411, 0.661
fixed, 14, 12.659, 19.914, 0.636
fixed, 15, 13.526, 21.541, 0.628
fixed, 16, 14.211, 23.088, 0.616
fixed, 32, 29.412, 52.722, 0.558
fixed, 64, 65.41, 142.351, 0.459
fixed, 128, 138.505, 295.625, 0.469
fixed, 256, 291.707, 601.983, 0.485
random, 2, 12.698, 12.849, 0.988
random, 4, 16.065, 15.857, 1.013
random, 8, 19.564, 21.105, 0.927
random, 16, 23.919, 26.823, 0.892
random, 32, 31.987, 39.591, 0.808
random, 64, 49.282, 71.487, 0.689
random, 128, 82.23, 145.364, 0.566
random, 256, 152.209, 298.434, 0.51
Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>