An __always_inline static function makes it easier to find where exactly a
crash happens, since one can step into the function with GDB.
Reviewed-by: Fangrui Song <maskray@google.com>
Unroll slightly and enforce good instruction scheduling. This improves
performance on out-of-order machines. The unrolling allows for
pipelined multiplies.
As well, as an optional sysdep, reorder the operations and prevent
reassociation for better scheduling and higher ILP. This commit
only adds the barrier for x86, although it should be either no
change or a win for any architecture.
Unrolling further started to induce slowdowns for sizes [0, 4]
but can help the loop so if larger sizes are the target further
unrolling can be beneficial.
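As a rough illustration of the unrolling idea (a sketch only, not the code from this commit), the functions below compute the same h = h * 33 + c hash one byte and two bytes per iteration. The names gnu_hash_simple and gnu_hash_unrolled2 are made up for this example. In the unrolled form, h * 1089 and c0 * 33 do not depend on each other, which is what lets an out-of-order core pipeline the multiplies.

```c
#include <stdint.h>

/* Reference version: one byte per iteration, h = h * 33 + c.  */
static uint32_t
gnu_hash_simple (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;
  while (*p != '\0')
    h = h * 33 + *p++;
  return h;
}

/* Sketch of a 2x unrolled version.  (h * 33 + c0) * 33 + c1 is
   rewritten as h * (33 * 33) + (c0 * 33 + c1); both forms agree
   modulo 2^32.  */
static uint32_t
gnu_hash_unrolled2 (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;
  for (;;)
    {
      uint32_t c0 = p[0];
      if (c0 == 0)
        return h;
      uint32_t c1 = p[1];
      if (c1 == 0)
        return h * 33 + c0;
      h = h * (33 * 33) + (c0 * 33 + c1);
      p += 2;
    }
}
```

Both functions return the same value for every NUL-terminated string; only the dependency chain between iterations differs.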
Results for _dl_new_hash
Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
Time as Geometric Mean of N=30 runs
Geometric mean of all benchmarks New / Old: 0.674
type, length, New Time, Old Time, New Time / Old Time
fixed, 0, 2.865, 2.72, 1.053
fixed, 1, 3.567, 2.489, 1.433
fixed, 2, 2.577, 3.649, 0.706
fixed, 3, 3.644, 5.983, 0.609
fixed, 4, 4.211, 6.833, 0.616
fixed, 5, 4.741, 9.372, 0.506
fixed, 6, 5.415, 9.561, 0.566
fixed, 7, 6.649, 10.789, 0.616
fixed, 8, 8.081, 11.808, 0.684
fixed, 9, 8.427, 12.935, 0.651
fixed, 10, 8.673, 14.134, 0.614
fixed, 11, 10.69, 15.408, 0.694
fixed, 12, 10.789, 16.982, 0.635
fixed, 13, 12.169, 18.411, 0.661
fixed, 14, 12.659, 19.914, 0.636
fixed, 15, 13.526, 21.541, 0.628
fixed, 16, 14.211, 23.088, 0.616
fixed, 32, 29.412, 52.722, 0.558
fixed, 64, 65.41, 142.351, 0.459
fixed, 128, 138.505, 295.625, 0.469
fixed, 256, 291.707, 601.983, 0.485
random, 2, 12.698, 12.849, 0.988
random, 4, 16.065, 15.857, 1.013
random, 8, 19.564, 21.105, 0.927
random, 16, 23.919, 26.823, 0.892
random, 32, 31.987, 39.591, 0.808
random, 64, 49.282, 71.487, 0.689
random, 128, 82.23, 145.364, 0.566
random, 256, 152.209, 298.434, 0.51
Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
No change to the code other than moving the function to
dl-new-hash.h. The name is changed so it is now in the reserved namespace.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
_dl_skip_args is always 0, so the target specific code that modifies
argv after relro protection is applied is no longer used.
After the patch relro protection is applied to _dl_argv consistently
on all targets.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When an executable is invoked as
./ld.so [ld.so-args] ./exe [exe-args]
then the argv is adjusted in ld.so before calling the entry point of
the executable so ld.so args are not visible to it. On most targets
this requires moving argv, env and auxv on the stack to ensure correct
stack alignment at the entry point. This had several issues:
- The code for this adjustment on the stack is written in asm as part
of the target specific ld.so _start code which is hard to maintain.
- The adjustment is done after _dl_start returns, where it's too late
to update GLRO(dl_auxv), as it is already readonly, so it points to
memory that was clobbered by the adjustment. This is bug 23293.
- _environ is also wrong in ld.so after the adjustment, but it is
likely not used after _dl_start returns so this is not user visible.
- _dl_argv was updated, but for this it was moved out of relro, which
changes security properties across targets unnecessarily.
This patch introduces a generic _dl_start_args_adjust function that
handles the argument adjustments after ld.so processed its own args
and before relro protection is applied.
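The shuffle described above can be sketched as follows. This is a simplified model, not the glibc implementation: shift_start_args is a hypothetical name, the stack is modelled as an array of uintptr_t words starting at argc, and auxv entries are assumed to be (type, value) word pairs terminated by type 0.

```c
#include <stddef.h>
#include <stdint.h>

/* SP points at argc.  Drop the first SKIP argv entries by moving
   argv, envp and auxv down in place, keeping SP itself unchanged so
   the stack alignment at the entry point is preserved.  */
static void
shift_start_args (uintptr_t *sp, size_t skip)
{
  if (skip == 0)
    return;

  uintptr_t *dst = sp + 1;         /* Slot of the new argv[0].  */
  uintptr_t *src = sp + 1 + skip;  /* First argument that is kept.  */

  sp[0] -= skip;                   /* Adjust argc.  */

  /* Move argv down, including its terminating NULL.  */
  do
    *dst++ = *src;
  while (*src++ != 0);

  /* Move envp down, including its terminating NULL.  */
  do
    *dst++ = *src;
  while (*src++ != 0);

  /* Move auxv down, including the terminating type-0 entry.  */
  for (;;)
    {
      uintptr_t type = src[0];
      dst[0] = src[0];
      dst[1] = src[1];
      dst += 2;
      src += 2;
      if (type == 0)
        break;
    }
}
```

Any pointers into the old argv, envp or auxv positions have to be updated as well, which is why the commit does this before relro protection is applied.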
The same algorithm is used on all targets, _dl_skip_args is now 0, so
existing target specific adjustment code is no longer used. The bug
affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc,
other targets don't need the change in principle, only for consistency.
The GNU Hurd start code relied on _dl_skip_args after dl_main returned,
now it checks directly if args were adjusted and fixes the Hurd startup
data accordingly.
Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO.
Tested on aarch64-linux-gnu and cross tested on i686-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The Linux version used by i686 and m68k provides three overrides for
generic code:
1. DISTINGUISH_LIB_VERSIONS to print additional information when
libc5 is used by a dependency.
2. EXTRA_LD_ENVVARS to enable the LD_LIBRARY_VERSION environment
variable.
3. EXTRA_UNSECURE_ENVVARS to add two environment variables related
to aout support.
None are really required: it has been decades since libc5 and aout
support were removed, and Linux has even removed support for aout files.
LD_LIBRARY_VERSION is also dead code, since dl_correct_cache_id is not
used anywhere.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The kernel version check is used to prevent glibc from running on older
kernels where some syscalls are not available and no fallback code is
enabled to fail gracefully. However, it does not help
if the kernel does not correctly advertise its version through the
vDSO note, uname or procfs.
Kernel version checks are also sometimes not desirable for users
who want to deploy on different systems with different kernel
versions, knowing that the minimum set of syscalls is always present on
such systems.
The kernel version check has been removed along with the
LD_ASSUME_KERNEL environment variable. The minimum kernel used to
build glibc is still provided through the NT_GNU_ABI_TAG ELF note and
is also printed when libc.so is invoked.
Checked on x86_64-linux-gnu.
This implements mmap fallback for a brk failure during TLS
allocation.
scripts/tls-elf-edit.py is updated to support the new patching method.
The script no longer requires that the input object is of ET_DYN
type.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When neither DT_HASH nor DT_GNU_HASH is present, the code scans
[DT_SYMTAB, DT_STRTAB). However, there is no guarantee that .dynstr
immediately follows .dynsym (e.g. lld typically places .gnu.version
after .dynsym).
In the absence of a hash table, symbol lookup will always fail
(map->l_nbuckets == 0 in dl-lookup.c) as if the object has no symbol, so
it seems fair for dladdr to do the same.
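For contrast, when a hash table is present the symbol count can be derived from the table itself instead of from section layout. The sketch below does this for DT_GNU_HASH. It is an illustration under assumptions, not code from this commit: gnu_hash_symbol_count and bloom_entry_bytes are hypothetical names, and the layout (four 32-bit header words, the bloom filter, the buckets, then the chain, with the low bit of a chain word marking the end of a bucket's chain) follows the commonly documented GNU hash format.

```c
#include <stddef.h>
#include <stdint.h>

/* TABLE points at the DT_GNU_HASH data viewed as 32-bit words.
   BLOOM_ENTRY_BYTES is 4 for ELFCLASS32 and 8 for ELFCLASS64.
   Returns the number of entries in the dynamic symbol table.  */
static uint32_t
gnu_hash_symbol_count (const uint32_t *table, size_t bloom_entry_bytes)
{
  uint32_t nbuckets = table[0];
  uint32_t symoffset = table[1];
  uint32_t bloom_size = table[2];
  /* table[3] is the bloom shift, not needed here.  */

  const uint32_t *buckets = table + 4 + bloom_size * (bloom_entry_bytes / 4);
  const uint32_t *chain = buckets + nbuckets;

  /* Find the highest symbol index that starts a bucket.  */
  uint32_t last = 0;
  for (uint32_t i = 0; i < nbuckets; ++i)
    if (buckets[i] > last)
      last = buckets[i];

  /* No hashed symbols: only the unhashed prefix exists.  */
  if (last < symoffset)
    return symoffset;

  /* Walk to the end of that bucket's chain (low bit set).  */
  while ((chain[last - symoffset] & 1) == 0)
    ++last;

  return last + 1;
}
```

For DT_HASH the equivalent value is simply the second word of the table (nchain).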
Reviewed-by: Florian Weimer <fweimer@redhat.com>
__ehdr_start is already used in rtld.c:dl_main, and can serve the
same purpose as _begin. Besides tidying the code, using linker
defined section relative symbols rather than "-defsym _begin=0" better
reflects the intent of _dl_start_final use of _begin, which is to
refer to the load address of ld.so rather than absolute address zero.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
In _dl_map_object the underlying file is not opened in trace mode
(in other cases where the underlying file can't be opened,
_dl_map_object quits with an error). If any missing libraries are
being processed, they are not counted in the final nlist size
passed to _dl_sort_maps later in the function. That size is then used by
_dl_sort_maps_dfs for the stack-allocated working maps:
  /* Array to hold RPO sorting results, before we copy back to maps[]. */
  struct link_map *rpo[nmaps];

  /* The 'head' position during each DFS iteration. Note that we start at
     one past the last element due to first-decrement-then-store (see the
     bottom of above dfs_traversal() routine). */
  struct link_map **rpo_head = &rpo[nmaps];
However, while traversing 'l_initfini' in dfs_traversal it will
still consider the l_faked maps and thus update rpo more times than the
allocated working 'rpo' has room for, overflowing the stack object.
As suggested in bugzilla, one option would be to avoid sorting the maps
in trace mode. However, I think ignoring l_faked objects does make
sense (there is one less constraint on calling the sorting function), it
allows slightly less stack usage for trace, and it is a slightly simpler
solution.
The test does trigger the stack overflow; however, I tried to make
it more generic to check different scenarios of missing objects.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Verify that:
1. A DT_RELR shared library without DT_NEEDED works.
2. A DT_RELR shared library without DT_VERNEED works.
3. A DT_RELR shared library without libc.so on DT_NEEDED works.
With DT_RELR, there may be no relocations in DT_RELA/DT_REL and their
entry values are zero. Don't relocate DT_RELA/DT_REL and update the
combined relocation start address if their entry values are zero.
The EI_ABIVERSION field of the ELF header in executables and shared
libraries can be bumped to indicate the minimum ABI requirement on the
dynamic linker. However, EI_ABIVERSION in executables isn't checked by
the Linux kernel ELF loader nor the existing dynamic linker. Executables
will crash mysteriously if the dynamic linker doesn't support the ABI
features required by the EI_ABIVERSION field. The dynamic linker should
be changed to check EI_ABIVERSION in executables.
Add a glibc version, GLIBC_ABI_DT_RELR, to indicate DT_RELR support so
that the existing dynamic linkers will issue an error on executables with
GLIBC_ABI_DT_RELR dependency. When there is a DT_VERNEED entry with
libc.so on DT_NEEDED, issue an error if there is a DT_RELR entry without
GLIBC_ABI_DT_RELR dependency.
Support __placeholder_only_for_empty_version_map as the placeholder symbol
used only for empty version map to generate GLIBC_ABI_DT_RELR without any
symbols.
PI_STATIC_AND_HIDDEN indicates whether accesses to internal linkage
variables and hidden visibility variables in a shared object (ld.so)
need dynamic relocations (usually R_*_RELATIVE). PI (position
independent) in the macro name is a misnomer: a code sequence using GOT
is typically position-independent as well, but using dynamic relocations
does not meet the requirement.
Not defining PI_STATIC_AND_HIDDEN is legacy and we expect that all new
ports will define PI_STATIC_AND_HIDDEN. More current ports define
PI_STATIC_AND_HIDDEN than do not, so change the configure
default.
No functional change.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When audit modules are loaded, ld.so initialization is not yet
complete, and rtld_active () returns false even though ld.so is
mostly working. Instead, the static dlopen hook is used, but that
does not work at all because this is not a static dlopen situation.
Commit 466c1ea15f ("dlfcn: Rework
static dlopen hooks") moved the hook pointer into _rtld_global_ro,
which means that separate protection is not needed anymore and the
hook pointer can be checked directly.
The guard for disabling libio vtable hardening in _IO_vtable_check
should stay for now.
Fixes commit 8e1472d2c1 ("ld.so:
Examine GLRO to detect inactive loader [BZ #20204]").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On non-PI_STATIC_AND_HIDDEN architectures, getting the address of
_rtld_local_ro (for GLRO (dl_final_object)) goes through a GOT entry.
The GOT load may be reordered before self relocation, leading to an
unrelocated/incorrect _rtld_local_ro address.
84e02af1eb tickled GCC powerpc32 to
reorder the GOT load before relative relocations, leading to ld.so
crash. This is similar to the m68k jump table reordering issue fixed by
a8e9b5b807.
Move code after self relocation into _dl_start_final to avoid the
reordering. This fixes powerpc32 and may help other architectures when
ELF_DYNAMIC_RELOCATE is simplified in the future.
This is necessary to place the libio vtables into the RELRO segment.
New tests elf/tst-relro-ldso and elf/tst-relro-libc are added to
verify that this is what actually happens.
The new tests fail on ia64 due to lack of (default) RELRO support
in binutils, so they are XFAILed there.
Hopefully, this will lead to tests that are easier to maintain. The
current approach of parsing readelf -W output using regular expressions
is not necessarily easier than parsing the ELF data directly.
This module is still somewhat incomplete (e.g., coverage of relocation
types and versioning information is missing), but it is sufficient to
perform basic symbol analysis or program header analysis.
The EM_* mapping for architecture-specific constant classes (e.g.,
SttX86_64) is not yet implemented. The classes are defined for the
benefit of elf/tst-glibcelf.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
elf_dynamic_do_Rel checks RTLD_BOOTSTRAP in several #ifdef branches.
Create an outside RTLD_BOOTSTRAP branch to simplify reasoning about the
function at the cost of a few duplicate lines.
Since dl_naudit is zero in RTLD_BOOTSTRAP code, the RTLD_BOOTSTRAP
branch can avoid _dl_audit_symbind calls to decrease code size.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
After 73fc4e28b9,
__libc_enable_secure_decided is always 0 and a statically linked
executable may overwrite __libc_enable_secure without considering
AT_SECURE.
The __libc_enable_secure has been correctly initialized in _dl_aux_init,
so just remove __libc_enable_secure_decided and __libc_init_secure.
This allows us to remove some startup_get*id functions from
22b79ed7f4.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The new IBM z16 is added to platform string array.
The macro _DL_PLATFORMS_COUNT is incremented.
_dl_hwcaps_subdir is extended by "z16" if HWCAP_S390_VXRS_PDE2
is set. HWCAP_S390_NNPA is not tested in _dl_hwcaps_subdirs_active
as those instructions may be replaced or removed in future.
tst-glibc-hwcaps.c is extended in order to test z16 via new marker5.
A fatal glibc error is dumped if glibc was built with architecture
level set for z16, but run on an older machine. (See dl-hwcap-check.h)
On 32-bit machines this has no effect. On 64-bit machines
{u}int_fast{16|32} are set as {u}int64_t which is often not
ideal. Particularly on x86_64, this change both saves code size and
may save instruction cost.
Full xcheck passes on x86_64.
The count can be zero if an object that has already been loaded as
an indirect dependency (so that l_searchlist.r_list in its link
map is still NULL) is promoted to global scope via RTLD_GLOBAL.
Fixes commit 5d28a8962d ("elf: Add _dl_find_object function").
If glibc is configured with --disable-default-pie and built on
s390 with -O3, the tests elf/tst-audit25a and elf/tst-audit25b
fail as there are additional la_symbind lines for free and malloc.
It turns out that those belong to the executable. In fact those are
the PLT-stubs. Furthermore la_symbind is also called for calloc and
realloc symbols, but those belong to libc.
Those functions are not called at all, but dlsym'ed in
elf/dl-minimal.c:
__rtld_malloc_init_real (struct link_map *main_map)
{
...
void *new_calloc = lookup_malloc_symbol (main_map, "calloc", &version);
void *new_free = lookup_malloc_symbol (main_map, "free", &version);
void *new_malloc = lookup_malloc_symbol (main_map, "malloc", &version);
void *new_realloc = lookup_malloc_symbol (main_map, "realloc", &version);
...
}
Therefore, this commit just ignores symbols with the LA_SYMB_DLSYM flag.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
If the build itself is run in a container, we may not be able to
fully set up a nested container for test-container testing.
Notable is the mounting of /proc, since it's critical that it
be mounted from within the same PID namespace as its users, and
thus cannot be bind mounted from outside the container like other
mounts.
This patch defaults to using the parent's PID namespace instead of
creating a new one, as this is more likely to be allowed.
If the test needs an isolated PID namespace, it should add the "pidns"
command to its init script.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
And optimize it slightly.
This is commit 8c8510ab27 revised.
In _dl_aux_init in elf/dl-support.c, use an explicit loop
and -fno-tree-loop-distribute-patterns to avoid memset.
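A minimal sketch of the technique, assuming GCC: an explicit zeroing loop whose function is built without the loop-distribution pattern pass, so the compiler does not turn the loop back into a memset call. The macro NO_LOOP_TO_LIBCALL and the function clear_values are hypothetical names for this example. The per-function optimize attribute is used here in place of the command-line flag named above, and on other compilers the macro expands to nothing.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC-specific: disable the pass that recognizes memset-like loops.  */
#if defined __GNUC__ && !defined __clang__
# define NO_LOOP_TO_LIBCALL \
  __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define NO_LOOP_TO_LIBCALL
#endif

/* Zero COUNT words with a plain loop instead of calling memset,
   which may not be usable this early in startup.  */
static void NO_LOOP_TO_LIBCALL
clear_values (uintptr_t *values, size_t count)
{
  for (size_t i = 0; i < count; ++i)
    values[i] = 0;
}
```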
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
1. Also generate .d dependency files for $(tests-container) and
$(tests-printers).
2. elf: Add tst-auditmod17.os to extra-test-objs.
3. iconv: Add tst-gconv-init-failure-mod.os to extra-test-objs.
4. malloc: Rename extra-tests-objs to extra-test-objs.
5. linux: Add tst-sysconf-iov_max-uapi.o to extra-test-objs.
6. x86_64: Add tst-x86_64mod-1.o, tst-platformmod-2.o, test-libmvec.o,
test-libmvec-avx.o, test-libmvec-avx2.o and test-libmvec-avx512f.o to
extra-test-objs.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Changes in v2:
1. Update commit log.
commit 163f625cf9
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Dec 21 12:35:47 2021 -0800
elf: Remove excessive p_align check on PT_LOAD segments [BZ #28688]
removed the p_align check against the page size. It caused the loader
error or crash on elf/tst-p_align3 when loading elf/tst-p_alignmod3.so,
which has the invalid p_align in PT_LOAD segments, added by
commit d8d94863ef
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Dec 21 13:42:28 2021 -0800
The loader failure caused by a negative length passed to __mprotect is
random, depending on architecture and toolchain. Update _dl_map_segments
to detect invalid holes. This fixes BZ #28838.
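The kind of check involved can be sketched as below. This is an illustration with hypothetical names (struct load_cmd, hole_length), not the actual _dl_map_segments code: the hole between the first segment and the last one is only protected if its length is non-negative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Page-aligned mapping range of one PT_LOAD segment.  */
struct load_cmd
{
  uintptr_t mapstart;
  uintptr_t mapend;
};

/* Store the size of the hole between the end of the first segment
   and the start of the last one in *LENGTH.  Return false if the
   segments are laid out so that the length would be negative.  */
static bool
hole_length (const struct load_cmd *cmds, size_t n, uintptr_t *length)
{
  *length = 0;
  if (n < 2)
    return true;
  if (cmds[n - 1].mapstart < cmds[0].mapend)
    return false;               /* Invalid: would underflow.  */
  *length = cmds[n - 1].mapstart - cmds[0].mapend;
  return true;
}
```

Rejecting the object up front gives a deterministic error instead of passing a wrapped-around length to __mprotect.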
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This reverts commit 8c8510ab27. The
revert is not perfect because the commit included a bug fix for
_dl_sysdep_start with an empty argv, introduced in commit
2d47fa6862 ("Linux: Remove
DL_FIND_ARG_COMPONENTS"), and this bug fix is kept.
The revert is necessary because the reverted commit introduced an
early memset call on aarch64, which leads to crash due to lack of TCB
initialization.
The fix for BZ#22716 replaced LD_TRACE_LOADED_OBJECTS with
LD_TRACE_PRELINKING so mtrace could record executable address
position.
To provide the same information, LD_TRACE_LOADED_OBJECTS is
extended so that a value of '2' also prints the executable address.
This avoids adding another loader environment variable
to be used solely for mtrace. The vDSO will be printed as
a default library (with '=>' pointing to the same name), which is
ok since both mtrace and ldd already handle it.
The mtrace script is changed to also parse the new format. To
correctly support PIE and non-PIE executables, both the default
mtrace address and the calculated one are used (this fixes mtrace
for non-PIE executables as BZ#22716 did for PIE).
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Prelinked binaries and libraries still work; the dynamic tags
DT_GNU_PRELINKED, DT_GNU_LIBLIST, DT_GNU_CONFLICT are just ignored
(meaning the process is relocated as default).
The loader environment variable TRACE_PRELINKING is also removed,
since it was used solely by prelink.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
And optimize it slightly.
The large switch statement in _dl_sysdep_start can be replaced with
a large array. This reduces source code and binary size. On
i686-linux-gnu:
Before:
text data bss dec hex filename
7791 12 0 7803 1e7b elf/dl-sysdep.os
After:
text data bss dec hex filename
7135 12 0 7147 1beb elf/dl-sysdep.os
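The switch-to-array idea can be sketched as follows. The types and names here (struct aux_entry, the AUX_* constants, parse_auxv) are defined locally for illustration and are not the glibc definitions. AUX_PAGESZ and AUX_ENTRY mirror the Linux AT_PAGESZ and AT_ENTRY values, and AUX_LIMIT is an arbitrary bound chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct aux_entry
{
  uintptr_t type;
  uintptr_t value;
};

enum { AUX_NULL = 0, AUX_PAGESZ = 6, AUX_ENTRY = 9, AUX_LIMIT = 32 };

/* Instead of one switch case per auxv type, store every value in an
   array indexed by type and let callers pick out what they need.  */
static void
parse_auxv (const struct aux_entry *av, uintptr_t values[AUX_LIMIT])
{
  for (size_t i = 0; i < AUX_LIMIT; ++i)
    values[i] = 0;
  for (; av->type != AUX_NULL; ++av)
    if (av->type < AUX_LIMIT)
      values[av->type] = av->value;
}
```

After the single pass, a lookup such as values[AUX_PAGESZ] replaces what used to be a dedicated case label.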
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The generic version is the de-facto Linux implementation. It
requires an auxiliary vector, so Hurd does not use it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On hppa, a function pointer returned by la_symbind is actually a function
descriptor that has the plabel bit set (bit 30). This must be cleared to get
the actual address of the descriptor. If the descriptor has been bound,
the first word of the descriptor is the physical address of the function;
otherwise, the first word of the descriptor points to a trampoline in the
PLT.
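The decoding step can be sketched as below. This is a host-independent model with hypothetical names (PLABEL_BIT, is_plabel, descriptor_address, descriptor_first_word), not the hppa code from this patch. It assumes PA-RISC's MSB-first bit numbering, under which bit 30 of a 32-bit word corresponds to the mask 0x2.

```c
#include <stdint.h>

/* Assumption: "bit 30" in PA-RISC numbering is mask 0x2.  */
#define PLABEL_BIT ((uintptr_t) 2)

/* True if FPTR is a tagged pointer to a function descriptor.  */
static int
is_plabel (uintptr_t fptr)
{
  return (fptr & PLABEL_BIT) != 0;
}

/* Clear the tag bit to get the address of the descriptor.  */
static uintptr_t
descriptor_address (uintptr_t fptr)
{
  return fptr & ~PLABEL_BIT;
}

/* First word of the descriptor: the function address once bound,
   otherwise a pointer to a PLT trampoline.  */
static uintptr_t
descriptor_first_word (uintptr_t fptr)
{
  const uintptr_t *desc = (const uintptr_t *) descriptor_address (fptr);
  return desc[0];
}
```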
This patch also adds a workaround to the tests because on hppa (which seems
to be the only ABI where I have seen it), some shared libraries add a dynamic PLT
relocation to an empty symbol name:
$ readelf -r elf/tst-audit25mod1.so
[...]
Relocation section '.rela.plt' at offset 0x464 contains 6 entries:
Offset Info Type Sym.Value Sym. Name + Addend
00002008 00000081 R_PARISC_IPLT 508
[...]
It breaks some assumptions in the test, where a symbol with an empty
name ("") is passed to la_symbind.
Checked on x86_64-linux-gnu and hppa-linux-gnu.
Replace tst-audit24bmod2.so with tst-audit24bmod2 to silence:
make[2]: Entering directory '/export/gnu/import/git/gitlab/x86-glibc/elf'
Makefile:2201: warning: overriding recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so'
../Makerules:765: warning: ignoring old recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so'
The rtld audit support shows two problems on aarch64:
1. _dl_runtime_resolve does not preserve x8, the indirect result
location register, which might produce wrong results
depending on the function signature.
2. The NEON Q registers pushed onto the stack by _dl_runtime_resolve
were twice the size of D registers extracted from the stack frame by
_dl_runtime_profile.
While 2. might result in wrong information being passed to the PLT tracing,
1. generates wrong runtime behaviour.
The aarch64 rtld audit support is changed to:
* Both La_aarch64_regs and La_aarch64_retval are expanded to include
both x8 and the full sized NEON V registers, as defined by the
ABI.
* _dl_runtime_profile needs to extract registers saved by
_dl_runtime_resolve and put them into the new correctly sized
La_aarch64_regs structure.
* The LAV_CURRENT check is changed to only accept new audit modules
to avoid the undefined behavior of not saving/restoring x8.
* Unlike other architectures, audit modules older than
LAV_CURRENT are rejected (both La_aarch64_regs and La_aarch64_retval
changed their layout and there is no requirement to support multiple
audit interfaces with the inherent aarch64 issues).
* A new field is also reserved on both La_aarch64_regs and
La_aarch64_retval to support variant pcs symbols.
Similar to x86, a new La_aarch64_vector type to represent the NEON
register is added on the La_aarch64_regs (so each type can be accessed
directly).
Since LAV_CURRENT was already bumped to support bind-now, there is
no need to increase it again.
Checked on aarch64-linux-gnu.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The audit symbind callback is not called for binaries built with
-Wl,-z,now or when LD_BIND_NOW=1 is used, nor the PLT tracking callbacks
(plt_enter and plt_exit) since this would change the expected
program semantics (where no PLT is expected) and would have performance
implications (such as for BZ#15533).
LAV_CURRENT is also bumped to indicate the audit ABI change (where
la_symbind flags are set by the loader to indicate no possible PLT
trace).
To handle powerpc64 ELFv1 function descriptors, _dl_audit_symbind
needs to know whether bind-now is used so the symbol value is
updated to the function text segment instead of the OPD (for lazy binding
this is done by PPC64_LOAD_FUNCPTR on _dl_runtime_resolve).
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
powerpc64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
For audit modules and dependencies with initial-exec TLS, we can not
set the initial TLS image on default loader initialization because it
would already be set by the audit setup. However, subsequent thread
creation would need to follow the default behaviour.
This patch fixes it by setting l_auditing link_map field not only
for the audit modules, but also for all its dependencies. This is
used on _dl_allocate_tls_init to avoid the static TLS initialization
at load time.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
la_activity is not called during application exit, even though
la_objclose is.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Add <dl-r_debug.h> to get the address of the r_debug structure after
relocation and its offset before relocation from the PT_DYNAMIC segment
to support DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.
Co-developed-by: Xi Ruoyao <xry111@mengyan1223.wang>
There was no direct or indirect make dependency on testobj3.so so the
test could fail with
/B/elf/loadfail: failed to load shared object: testobj3.so: cannot open
shared object file: No such file or directory
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The glibc 2.34 release really should have added a GLIBC_2.34
symbol to the dynamic loader. With it, we could move functions such
as dlopen or pthread_key_create that work on process-global state
into the dynamic loader (once we have fixed a longstanding issue
with static linking). Without the GLIBC_2.34 symbol, yet another
new symbol version would be needed because old glibc will fail to
load binaries due to the missing symbol version in ld.so that newly
linked programs will require.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This avoids the cross-compiling breakage when the test should not run
($(run-built-tests) equal to no).
Checked on x86_64-linux-gnu and i686-linux-gnu as well with a cross
compile to aarch64-linux-gnu and powerpc64-linux-gnu.
Build tst-p_alignmod3.so with 256 byte page size and verify that it is
rejected with a proper error message.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add tst-p_alignmod2-edit to edit the copy of tst-p_alignmod-base.so to
set p_align of the first PT_LOAD segment to 1 and verify that the shared
library can be loaded normally.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add tst-p_alignmod1-edit to edit the copy of tst-p_alignmod-base.so to
reduce p_align of the first PT_LOAD segment by half and verify that the
shared library is mapped with the maximum p_align of all PT_LOAD segments.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
DT_RUNPATH is only used to find the immediate dependencies of the
executable or shared object containing the DT_RUNPATH entry:
1. Define link-test-modules-rpath-link if $(build-hardcoded-path-in-tests)
is yes.
2. Use $(link-test-modules-rpath-link) in build-module-helper so that
test modules can dlopen modules with DT_RUNPATH.
3. Add a test to show why link-test-modules-rpath-link is needed.
This partially fixes BZ #28455.
Check whether valgrind is available in the test environment.
If not, skip the test. Run smoke tests with valgrind to verify the dynamic loader.
First, check if valgrind works with the system ld.so in the test
environment. Then run the actual test inside the test environment,
using the just built ld.so and new libraries.
Co-authored-by: Mark Wielaard <mark@klomp.org>
Linker may set p_align of a PT_LOAD segment larger than p_align of the
first PT_LOAD segment to satisfy a section alignment:
Elf file type is DYN (Shared object file)
Entry point 0x0
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000834 0x0000000000000834 R E 0x1000
LOAD 0x0000000000000e00 0x0000000000001e00 0x0000000000001e00
0x0000000000000230 0x0000000000000230 RW 0x1000
LOAD 0x0000000000400000 0x0000000000400000 0x0000000000400000
0x0000000000000004 0x0000000000000008 RW 0x400000
...
Section to Segment mapping:
Segment Sections...
00 .note.gnu.property .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
01 .init_array .fini_array .data.rel.ro .dynamic .got .got.plt
02 .data .bss
We should align the first PT_LOAD segment to the maximum p_align of all
PT_LOAD segments, similar to the kernel commit:
commit ce81bb256a224259ab686742a6284930cbe4f1fa
Author: Chris Kennelly <ckennelly@google.com>
Date: Thu Oct 15 20:12:32 2020 -0700
fs/binfmt_elf: use PT_LOAD p_align values for suitable start address
This fixes BZ #28676.
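The alignment rule can be sketched as follows. The names struct load_segment, max_load_align and align_up are hypothetical, and this is an illustration of the rule stated above, not the loader's code. p_align values are assumed to be powers of two.

```c
#include <stddef.h>
#include <stdint.h>

/* The fields of a PT_LOAD program header that matter here.  */
struct load_segment
{
  uint64_t p_vaddr;
  uint64_t p_align;
};

/* Largest p_align of all PT_LOAD segments, but at least PAGE_SIZE.  */
static uint64_t
max_load_align (const struct load_segment *segs, size_t n,
                uint64_t page_size)
{
  uint64_t align = page_size;
  for (size_t i = 0; i < n; ++i)
    if (segs[i].p_align > align)
      align = segs[i].p_align;
  return align;
}

/* Round ADDR up to a multiple of ALIGN (a power of two).  */
static uint64_t
align_up (uint64_t addr, uint64_t align)
{
  return (addr + align - 1) & ~(align - 1);
}
```

With the program headers shown above, the maximum is 0x400000, so the start address of the whole mapping is rounded up to a 4 MiB boundary even though the first segment only asks for 0x1000.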
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
And compile it with the early CFLAGS. _dl_setup_hash is called
very early for the ld.so link map, so it should be compiled
differently.
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
Tested-by: Stefan Liebler <stli@linux.ibm.com>
The usage of internal static symbol for statically linked binaries
does not work correctly for objects built with -D_TIME_BITS=64,
since the internal definition does not provide the expected aliases.
This patch makes it use the default stat functions instead (which
use the default 64-bit time_t alias and types).
Checked on i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
l_contiguous was not initialized at all for the main map and
always 0. This commit adds code to check if the LOAD segments
are adjacent to each other, and sets l_contiguous accordingly.
This helps _dl_find_object because it is more efficient if the
main mapping is contiguous.
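A possible shape for such a check is sketched below. The names struct load_range and loads_are_contiguous are hypothetical, and the exact comparison in glibc may differ. This version treats two segments as adjacent when no whole unmapped page lies between them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unaligned address range [start, end) of one LOAD segment, in
   increasing address order.  */
struct load_range
{
  uintptr_t start;
  uintptr_t end;
};

/* Return true if no page-sized hole separates consecutive segments.
   PAGE_SIZE must be a power of two.  */
static bool
loads_are_contiguous (const struct load_range *r, size_t n,
                      uintptr_t page_size)
{
  for (size_t i = 1; i < n; ++i)
    {
      uintptr_t prev_end
        = (r[i - 1].end + page_size - 1) & ~(page_size - 1);
      uintptr_t next_start = r[i].start & ~(page_size - 1);
      if (next_start > prev_end)
        return false;
    }
  return true;
}
```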
Note that not all (PIE or non-PIE) binaries are contiguous in this
way because BFD ld creates executables with LOAD holes:
ELF LOAD segments creating holes in the process image on GNU/Linux
https://sourceware.org/pipermail/binutils/2022-January/119082.html
https://sourceware.org/bugzilla/show_bug.cgi?id=28743
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The usage of internal static symbol for statically linked binaries
does not work correctly for objects built with -D_TIME_BITS=64,
since the internal definition does not provide the expected aliases.
This patch makes it use the default stat functions instead (which
use the default 64-bit time_t alias and types).
Checked on i686-linux-gnu.
With the current set of fences, the version update at the start
of the TM write operation is redundant, and the version update
at the end does not need to use an atomic read-modify-write
operation.
Also use relaxed MO stores during the dlclose update, and skip any
version changes there.
Suggested-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
As explained in Hans Boehm, Can Seqlocks Get Along with Programming
Language Memory Models?, an acquire fence is needed in
_dlfo_read_success. The lack of a fence resulted in an observable
bug on powerpc64le caused by compile-time load reordering.
The fence in _dlfo_mappings_begin_update has been reordered, turning
the fence/store sequence into a release MO store equivalent.
Relaxed MO loads are used on the reader side, and relaxed MO stores
on the writer side for the shared data, to avoid formal data races.
This is just to be conservative; it should not actually be necessary
given how the data is used.
This commit also fixes the test run time. The intent was to run it
for 3 seconds, but 0.3 seconds was enough to uncover the bug very
occasionally (while 3 seconds did not reliably show the bug on every
test run).
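For reference, a generic single-writer seqlock along the lines of the Boehm paper can be sketched with C11 atomics as below. This is not the _dl_find_object data structure: struct seq_pair, seq_write and seq_read are hypothetical names. It only shows where the acquire fence sits on the reader side and why the data accesses are relaxed atomics.

```c
#include <stdatomic.h>
#include <stdint.h>

struct seq_pair
{
  _Atomic uint64_t version;   /* Odd while a write is in progress.  */
  _Atomic uint64_t a;
  _Atomic uint64_t b;
};

/* Writer side.  Assumes there is only one writer at a time.  */
static void
seq_write (struct seq_pair *s, uint64_t a, uint64_t b)
{
  uint64_t v = atomic_load_explicit (&s->version, memory_order_relaxed);
  atomic_store_explicit (&s->version, v + 1, memory_order_relaxed);
  /* Keep the version bump ordered before the data stores.  */
  atomic_thread_fence (memory_order_release);
  atomic_store_explicit (&s->a, a, memory_order_relaxed);
  atomic_store_explicit (&s->b, b, memory_order_relaxed);
  atomic_store_explicit (&s->version, v + 2, memory_order_release);
}

/* Reader side.  Retries until it sees a stable, even version.  */
static void
seq_read (struct seq_pair *s, uint64_t *a, uint64_t *b)
{
  for (;;)
    {
      uint64_t v0 = atomic_load_explicit (&s->version,
                                          memory_order_acquire);
      *a = atomic_load_explicit (&s->a, memory_order_relaxed);
      *b = atomic_load_explicit (&s->b, memory_order_relaxed);
      /* Without this fence the data loads could be reordered after
         the second version load.  */
      atomic_thread_fence (memory_order_acquire);
      uint64_t v1 = atomic_load_explicit (&s->version,
                                          memory_order_relaxed);
      if (v0 == v1 && (v0 & 1) == 0)
        return;
    }
}
```

The relaxed atomic accesses to a and b keep the sketch free of formal data races even when a read overlaps a write; the version comparison then discards any torn result.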
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
DT_RUNPATH is only used to find the immediate dependencies of the
executable or shared object containing the DT_RUNPATH entry. Update
LD_AUDIT dlopen call to try the DT_RUNPATH entry of the executable.
Add tst-audit14a, which is copied from tst-audit14, to test DT_RUNPATH and
build tst-audit14 with -Wl,--disable-new-dtags to test DT_RPATH.
This partially fixes BZ #28455.
Add <dl-debug.h> to set up the debugging entry in the PT_DYNAMIC segment to support
DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.
Tested on x86-64, x32 and i686 as well as with build-many-glibcs.py.
I've updated copyright dates in glibc for 2022. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.
Please remember to include 2022 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
It can be used to speed up the libgcc unwinder, and the internal
_dl_find_dso_for_object function (which is used for caller
identification in dlopen and related functions, and in dladdr).
_dl_find_object is in the internal namespace due to bug 28503.
If libgcc switches to _dl_find_object, this namespace issue will
be fixed. It is located in libc for two reasons: it is necessary
to forward the call to the static libc after static dlopen, and
there is a link ordering issue with -static-libgcc and libgcc_eh.a
because libc.so is not a linker script that includes ld.so in the
glibc build tree (so that GCC's internal -lc after libgcc_eh.a does
not pick up ld.so).
It is necessary to do the i386 customization in the
sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because
otherwise, multilib installations are broken.
The implementation uses software transactional memory, as suggested
by Torvald Riegel. Two copies of the supporting data structures are
used, also achieving full async-signal-safety.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
dl_main sets the LM_ID_BASE state to RT_ADD just before starting to
load new shared objects. The state is set to RT_CONSISTENT just
after all objects are loaded.
However, if an audit module tries to dlmopen a nonexistent module,
_dl_open will assert that the namespace is in an inconsistent
state.
This is different from dlopen: first, it does not use
LM_ID_BASE, and second, _dl_map_object_from_fd is solely responsible
for setting and resetting the r_state value.
So the assert in _dl_open cannot really tell whether the state is
consistent, since dl_main resets it. This patch removes the assert.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The vDSO is listed in the link_map chain, but is never the subject of
an la_objopen call. A new internal flag __RTLD_VDSO is added that
acts as __RTLD_OPENEXEC to allocate the required 'struct auditstate'
extra space for the 'struct link_map'.
The return value from the callback is currently ignored, since glibc
makes no PLT call when using the vDSO, nor is the vDSO exported
directly.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The rtld-audit interfaces introduce a slowdown due to enabling
profiling instrumentation (as if LD_AUDIT implied LD_PROFILE).
However, instrumenting is only necessary if one of the audit libraries
provides PLT callbacks (la_pltenter or la_pltexit symbols). Otherwise,
the slowdown can be avoided.
This patch adjusts the logic that enables profiling to iterate
over all audit modules and check if any of them provides a PLT hook.
To keep la_symbind working even without PLT callbacks, _dl_fixup now
calls the audit callback if the module implements it.
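A minimal sketch of the check described above, assuming a simplified per-module callback table. The type audit_module and the function audit_needs_plt_profiling are hypothetical stand-ins, not the loader's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified record of the callbacks one audit module
   provides.  Each member holds the address found for the symbol, or
   NULL if the module does not define it.  */
struct audit_module
{
  const void *pltenter;
  const void *pltexit;
  const void *symbind;
  struct audit_module *next;
};

/* Return true if at least one module provides a PLT hook, which is the
   only case where the profiling trampolines are needed.  */
bool
audit_needs_plt_profiling (const struct audit_module *list)
{
  for (const struct audit_module *m = list; m != NULL; m = m->next)
    if (m->pltenter != NULL || m->pltexit != NULL)
      return true;
  return false;
}
```

A module that only defines la_symbind does not trigger instrumentation in this sketch, which is why the symbind callback then has to be invoked from the regular fixup path.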
Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_pltexit audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_pltenter audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_preinit audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_symbind{32,64} audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objclose audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objsearch audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_activity audit
callback.
Also, for a new Lmid_t the namespace link_map list is empty, so it
must be checked before being used. This can happen when an audit
module is used along with dlmopen.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objopen audit callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove AArch64 from comment for AT_MINSIGSTKSZ to match
commit 7cd60e43a6def40ecb75deb8decc677995970d0b
Author: Chang S. Bae <chang.seok.bae@intel.com>
Date: Tue May 18 13:03:15 2021 -0700
uapi/auxvec: Define the aux vector AT_MINSIGSTKSZ
Define AT_MINSIGSTKSZ in the generic uapi header. It is already used
as generic ABI in glibc's generic elf.h, and this define will prevent
future namespace conflicts. In particular, x86 is also using this
generic definition.
in Linux kernel 5.14.
p_align does not have to be a multiple of the page size. Only PT_LOAD
segment layout should be aligned to the page size.
1. Remove p_align check against the page size.
2. Use the page size, instead of p_align, to check PT_LOAD segment layout.
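The layout check in item 2 can be sketched as follows. The function name and parameters are hypothetical; the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A PT_LOAD segment can be mapped directly only if its virtual address
   and its file offset are congruent modulo the page size.  p_align is
   deliberately not consulted here.  */
bool
load_segment_layout_ok (uint64_t p_vaddr, uint64_t p_offset,
			uint64_t page_size)
{
  return ((p_vaddr - p_offset) & (page_size - 1)) == 0;
}
```

For example, with a 4 KiB page size, p_vaddr 0x201000 and p_offset 0x1000 pass the check, while p_vaddr 0x201010 with the same offset does not.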
Reviewed-by: Florian Weimer <fweimer@redhat.com>
With the morecore hook removed, there is no easy way to provide huge
page support with the glibc allocator without resorting to transparent
huge pages. And some users and programs do prefer to use huge pages
directly instead of THP for multiple reasons: no splitting or re-merging
by the VM, no TLB shootdowns for running processes, fast allocation
from the reserve pool, no competition with the rest of the processes
unlike THP, no swapping at all, etc.
This patch extends the 'glibc.malloc.hugetlb' tunable: the value
'2' means to use huge pages directly with the system default size,
while a positive value means a specific page size that is matched
against the ones supported by the system.
Currently only memory allocated in sysmalloc() is handled; the arenas
still use the default system page size.
To test it, a new rule, tests-malloc-hugetlb2, is added, which runs the
added tests with the required GLIBC_TUNABLES setting. On systems without
a reserved huge page pool, it just stresses the mmap(MAP_HUGETLB)
allocation failure path. To improve test coverage, a pool with some
allocated pages must be created.
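As a rough illustration of the direct huge page path and its failure case, the following Linux-only sketch tries mmap with MAP_HUGETLB and falls back to a regular anonymous mapping. The function alloc_prefer_hugetlb is hypothetical and is not the allocator's code; it also ignores the explicit page size selection mentioned above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Try to back LEN bytes with huge pages of the system default size and
   fall back to normal pages if no huge pages are available.  Returns
   NULL if both attempts fail.  */
void *
alloc_prefer_hugetlb (size_t len)
{
  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
  void *p = MAP_FAILED;
#ifdef MAP_HUGETLB
  /* Fails (typically with ENOMEM) when the reserved pool is empty.  */
  p = mmap (NULL, len, PROT_READ | PROT_WRITE, flags | MAP_HUGETLB, -1, 0);
#endif
  if (p == MAP_FAILED)
    p = mmap (NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}
```

On a machine without a reserved pool, only the fallback branch is exercised, which matches the limited coverage noted above.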
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux Transparent Huge Pages (THP) currently supports three different
states: 'never', 'madvise', and 'always'. 'never' is
self-explanatory and 'always' enables THP for all anonymous
pages. However, 'madvise' is still the default on some systems, and
in that case THP is only used if the memory range is explicitly
advised by the program through a madvise(MADV_HUGEPAGE) call.
To enable it, a new tunable is provided, 'glibc.malloc.hugetlb',
where setting it to a value different from 0 enables the madvise call.
This patch issues the madvise(MADV_HUGEPAGE) call after a successful
mmap() call in sysmalloc() with sizes larger than the default huge
page size. The madvise() call is disabled if the system does not support
THP or if it has the mode set to "never". On Linux, only one page size
is supported for THP, even if the architecture supports multiple
sizes.
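A minimal Linux-only sketch of the mmap-then-madvise sequence. The function alloc_with_thp_hint and its thp_size parameter (the THP page size as determined by the caller, 0 if THP is unavailable) are hypothetical, and the size threshold used here is an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map LEN bytes of anonymous memory and, if the mapping is at least
   THP_SIZE bytes, ask the kernel to back it with transparent huge
   pages.  The hint is advisory, so a madvise failure is ignored.  */
void *
alloc_with_thp_hint (size_t len, size_t thp_size)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
#ifdef MADV_HUGEPAGE
  if (thp_size != 0 && len >= thp_size)
    (void) madvise (p, len, MADV_HUGEPAGE);
#endif
  return p;
}
```

Passing thp_size as 0 models the case where THP is unsupported or set to "never", in which the madvise call is skipped entirely.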
To test it, a new rule, tests-malloc-hugetlb1, is added, which runs the
added tests with the required GLIBC_TUNABLES setting.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add <tst-file-align.h> to support target specific ALIGN for variable
alignment test:
1. Alpha: Use 0x10000.
2. MicroBlaze and Nios II: Use 0x8000.
3. All others: Use 0x200000.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On Linux/x86-64, for elf/tst-align3, we now get
munmap(0x7f88f9401000, 1126424) = 0
instead of
munmap(0x7f1615200018, 544768) = -1 EINVAL (Invalid argument)
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The default has to change eventually, and there are no known failures
that require a delay.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When PT_LOAD segment alignment > the page size, allocate enough space to
ensure that the segment can be properly aligned. This change makes it
simpler for code segments to use huge pages.
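One common way to obtain such an aligned range is to over-allocate and trim. The sketch below assumes that approach; the function map_aligned is hypothetical and is not the loader's implementation. ALIGN is assumed to be a power of two and a multiple of the page size, and LEN a multiple of the page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Reserve LEN bytes of address space starting at a multiple of ALIGN.
   Returns NULL on failure.  */
void *
map_aligned (size_t len, size_t align)
{
  /* Reserve enough extra space that an aligned start always fits.  */
  size_t total = len + align;
  void *raw = mmap (NULL, total, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED)
    return NULL;

  uintptr_t start = ((uintptr_t) raw + align - 1) & ~((uintptr_t) align - 1);
  size_t head = start - (uintptr_t) raw;
  size_t tail = total - head - len;

  /* Give back the unused parts before and after the aligned range.  */
  if (head != 0)
    munmap (raw, head);
  if (tail != 0)
    munmap ((char *) start + len, tail);
  return (void *) start;
}
```

With a 2 MiB alignment, the returned address is suitable for a segment that is meant to be backed by 2 MiB huge pages.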
This fixes [BZ #28676].
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
This makes ld.so features such as --preload, --audit,
and --list-diagnostics more accessible to end users because they
do not need to know the ABI name of the dynamic loader.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
TLS_INIT_TCB_ALIGN is not actually used. TLS_TCB_ALIGN was likely
introduced to support a configuration where the thread pointer
does not have the same alignment as THREAD_SELF. Only ia64 seems to use
that, but for the stack/pointer guard, not for storing tcbhead_t.
Some ports use TLS_TCB_OFFSET and TLS_PRE_TCB_SIZE to shift
the thread pointer, potentially landing in a different residue class
modulo the alignment, but the changes should not impact that.
In general, given that TLS variables have their own alignment
requirements, having different alignment for the (unshifted) thread
pointer and struct pthread would potentially result in dynamic
offsets, leading to more complexity.
hppa had different values before: __alignof__ (tcbhead_t), which
seems to be 4, and __alignof__ (struct pthread), which was 8
(old default) and is now 32. However, it defines THREAD_SELF as:
/* Return the thread descriptor for the current thread. */
# define THREAD_SELF \
({ struct pthread *__self; \
__self = __get_cr27(); \
__self - 1; \
})
So the thread pointer points after struct pthread (hence __self - 1),
and they have to have the same alignment on hppa as well.
Similarly, on ia64, the definitions were different. We have:
# define TLS_PRE_TCB_SIZE \
(sizeof (struct pthread) \
+ (PTHREAD_STRUCT_END_PADDING < 2 * sizeof (uintptr_t) \
? ((2 * sizeof (uintptr_t) + __alignof__ (struct pthread) - 1) \
& ~(__alignof__ (struct pthread) - 1)) \
: 0))
# define THREAD_SELF \
((struct pthread *) ((char *) __thread_self - TLS_PRE_TCB_SIZE))
And TLS_PRE_TCB_SIZE is a multiple of the struct pthread alignment
(confirmed by the new _Static_assert in sysdeps/ia64/libc-tls.c).
On m68k, we have a larger gap between tcbhead_t and struct pthread.
But as far as I can tell, the port is fine with that. The definition
of TCB_OFFSET is sufficient to handle the shifted TCB scenario.
This fixes commit 23c77f6018
("nptl: Increase default TCB alignment to 32").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Programs without dynamic dependencies and without a program
interpreter are now run via execve.
Previously, the dynamic linker either crashed while attempting to
read a non-existing dynamic segment (looking for DT_AUDIT/DT_DEPAUDIT
data), or the self-relocation in the static PIE executable crashed
because the outer dynamic linker had already applied RELRO protection.
<dl-execve.h> is needed because execve is not available in the
dynamic loader on Hurd.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Since b05fae4d8e, __minimal malloc code is used during static
startup before PIE self-relocation (_dl_relocate_static_pie).
So it requires the same fix done for other objects by 47618209d0.
Checked on aarch64, x86_64, and i686 with and without static-pie.
1. Use a temporary file to generate Makefile fragments for DSO sorting
tests and use -include on them.
2. Add Makefile fragments to postclean-generated so that a "make clean"
removes the autogenerated fragments and a subsequent "make" regenerates
them.
This partially fixes BZ #28550.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The include cleanup on dl-minimal.c removed too much for some
targets.
Also for Hurd, __sbrk is removed from localplt.data now that
tunables allocate memory through mmap.
Checked with a build for all affected architectures.
The rtld_malloc functions are moved to their own file so they can be
used in csu code. Also, the functions are renamed to __minimal_*
(since they are now used not only in loader code).
Using __minimal_malloc in tunables_strdup() avoids potential
issues with sbrk() calls while processing the tunables (I see
sporadic elf/tst-dso-ordering9 failures on powerpc64le, with different
tests failing due to ASLR).
Also, using __minimal_malloc over plain mmap optimizes the memory
allocation in both the static and dynamic cases (since it will use any
unused space either in the last page of the data segments, avoiding an
mmap() call, or from the previous mmap() call).
Checked on x86_64-linux-gnu, i686-linux-gnu, and powerpc64le-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Separated debuginfo files have PT_DYNAMIC with p_filesz == 0. We
need to check for that before the _dl_map_segments call because
that could attempt to write to mappings that extend beyond the end
of the file, resulting in SIGBUS.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The patch removes the ELF_DURING_STARTUP optimization and assumes that
.rel.dyn and .rel.plt might not be contiguous. This allows some
code simplification since relocation will be handled independently
where it is done during bootstrap.
At least on x86_64, I cannot measure any performance implications.
Running the command
LD_DEBUG=statistics ./elf/ld.so ./libc.so
10000 times and filtering the "total startup time in dynamic loader"
result, the geometric mean is:
                 patched  master
Ryzen 7 5900x      24140   24952
i7-4510U           45957   45982
(The results do show some variation; I did not perform any statistical
analysis.)
It also allows building arm with lld, since it inserts ".ARM.exidx"
between ".rel.dyn" and ".rel.plt" for the loader.
Checked on x86_64-linux-gnu and arm-linux-gnueabihf.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
These tests take the address of a protected symbol (foo_protected)
and lld does not support copy relocations on protected data symbols.
Checked on x86_64-linux-gnu.
Reviewed-by: Fangrui Song <maskray@google.com>
The global test is linked with globalmod1.so which dlopens reldepmod4.so.
Make global.out depend on reldepmod4.so. This fixes BZ #28457.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This second patch contains the actual implementation of a new sorting algorithm
for shared objects in the dynamic loader, which solves the slow behavior that
the current "old" algorithm falls into when the DSO set contains circular
dependencies.
The new algorithm implemented here is simply depth-first search (DFS) to obtain
the Reverse-Post Order (RPO) sequence, a topological sort. A new l_visited:1
bitfield is added to struct link_map to more elegantly facilitate such a search.
The DFS algorithm is applied to the input maps[nmap-1] backwards towards
maps[0]. This has the effect of a more "shallow" recursion depth in general
since the input is in BFS. Also, when combined with the natural order of
processing l_initfini[] at each node, this creates a resulting output sorting
closer to the intuitive "left-to-right" order in most cases.
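The traversal described above can be sketched on a toy dependency graph. The node type and both functions are hypothetical simplifications (the visited flag plays the role of l_visited), and every dependency is assumed to also appear in the input array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical stand-in for a link map with its dependency list.  */
struct node
{
  const char *name;
  struct node *deps[MAX_DEPS + 1];	/* NULL-terminated.  */
  bool visited;
};

/* Post-order DFS that fills the output array from the end, so that the
   finished array is in reverse post-order.  */
static void
dfs (struct node *n, struct node ***head)
{
  if (n->visited)
    return;
  n->visited = true;		/* Also breaks dependency cycles.  */
  for (size_t i = 0; n->deps[i] != NULL; i++)
    dfs (n->deps[i], head);
  *--(*head) = n;
}

/* Sort MAPS[0..NMAPS-1] into OUT so that each object precedes its
   dependencies.  The input is walked backwards, from the last element
   towards the first.  */
void
sort_rpo (struct node **maps, size_t nmaps, struct node **out)
{
  struct node **head = out + nmaps;
  for (size_t i = 0; i < nmaps; i++)
    maps[i]->visited = false;
  for (size_t i = nmaps; i-- > 0;)
    dfs (maps[i], &head);
}
```

Because the input is in BFS order, starting from the last element means most recursive calls hit already visited leaves early, which keeps the recursion shallow.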
Another notable implementation adjustment related to this _dl_sort_maps change
is the removing of two char arrays 'used' and 'done' in _dl_close_worker to
represent two per-map attributes. This has been changed to simply use two new
bit-fields l_map_used:1, l_map_done:1 added to struct link_map. This also allows
discarding the clunky 'used' array sorting that _dl_sort_maps had to sometimes
do along the way.
Tunable support for switching between different sorting algorithms at runtime is
also added. A new tunable 'glibc.rtld.dynamic_sort' with current valid values 1
(old algorithm) and 2 (new DFS algorithm) has been added. At time of commit
of this patch, the default setting is 1 (old algorithm).
Signed-off-by: Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is the first of a 2-part patch set that fixes slow DSO sorting behavior in
the dynamic loader, as reported in BZ #17645. In order to facilitate such a
large modification to the dynamic loader, this first patch implements a testing
framework for validating shared object sorting behavior, to enable comparison
between old/new sorting algorithms, and any later enhancements.
This testing infrastructure consists of a Python script
'scripts/dso-ordering-test.py' which takes in a description language, consisting
of strings that describe a set of link dependency relations between DSOs, and
generates testcase programs and Makefile fragments to automatically test the
described situation, for example:
a->b->c->d # four objects linked one after another
a->[bc]->d;b->c # a depends on b and c, which both depend on d,
# b depends on c (b,c linked to object a in fixed order)
a->b->c;{+a;%a;-a} # a, b, c serially dependent, main program uses
# dlopen/dlsym/dlclose on object a
a->b->c;{}!->[abc] # a, b, c serially dependent; multiple tests generated
# to test all permutations of a, b, c ordering linked
# to main program
(Above is just a short description of what the script can do, more
documentation is in the script comments.)
Two files containing several new tests, elf/dso-sort-tests-[12].def are added,
including test scenarios for BZ #15311 and Redhat issue #1162810 [1].
Due to the nature of dynamic loader tests, where the sorting behavior and test
output occurs before/after main(), generating testcases to use
support/test-driver.c does not suffice to control meaningful timeout for ld.so.
Therefore a new utility program 'support/test-run-command', based on
test-driver.c/support_test_main.c has been added. This does the same testcase
control, but for a program specified through a command-line rather than at the
source code level. This utility is used to run the dynamic loader testcases
generated by dso-ordering-test.py.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1162810
Signed-off-by: Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As noted in bug 28475, the access attribute on memfrob in <string.h>
is incorrect: the function both reads and writes the memory pointed to
by its argument, so it needs to use __read_write__, not
__write_only__. This incorrect attribute results in a build failure
for accessing uninitialized memory for s390x-linux-gnu-O3 with
build-many-glibcs.py using GCC mainline.
Correct the attribute. Fixing this shows up that some calls to
memfrob in elf/ tests are reading uninitialized memory; I'm not
entirely sure of the purpose of those calls, but guess they are about
ensuring that the stack space is indeed allocated at that point in the
function, and so it matters that they are calling a function whose
semantics are unknown to the compiler. Thus, change the first memfrob
call in those tests to use explicit_bzero instead, as suggested by
Florian in
<https://sourceware.org/pipermail/libc-alpha/2021-October/132119.html>,
to avoid the use of uninitialized memory.
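The attribute difference can be shown on a stand-alone function that behaves like memfrob (XOR of each byte with 42). The names my_frob and ATTR_ACCESS_RW are hypothetical; the attribute is only applied for GCC 10 or later, which is an assumption about where it is available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined __GNUC__ && !defined __clang__ && __GNUC__ >= 10
# define ATTR_ACCESS_RW(ptr_idx, size_idx) \
  __attribute__ ((access (read_write, ptr_idx, size_idx)))
#else
# define ATTR_ACCESS_RW(ptr_idx, size_idx)
#endif

/* The buffer is both read and written, so read_write is the accurate
   access mode.  Declaring it write_only would tell the compiler that
   the previous contents are never read.  */
void *my_frob (void *s, size_t n) ATTR_ACCESS_RW (1, 2);

void *
my_frob (void *s, size_t n)
{
  unsigned char *p = s;
  for (size_t i = 0; i < n; i++)
    p[i] ^= 42;
  return s;
}
```

With the read_write mode, passing an uninitialized buffer can be diagnosed by the compiler, which is how the uninitialized reads in the elf/ tests became visible.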
Tested for x86_64, and with build-many-glibcs.py (GCC mainline) for
s390x-linux-gnu-O3.
1. Define DL_RO_DYN_SECTION to initialize bootstrap_map.l_ld_readonly
before calling elf_get_dynamic_info to get dynamic info in bootstrap_map.
2. Define a single
  static inline bool
  dl_relocate_ld (const struct link_map *l)
  {
    /* Don't relocate dynamic section if it is readonly */
    return !(l->l_ld_readonly || DL_RO_DYN_SECTION);
  }
This updates BZ #28340 fix.
Commit d6d89608ac broke powerpc for --enable-bind-now because it turned
out that, contrary to the patch's assumption, the rtld
elf_get_dynamic_info() does need to handle RTLD_BOOTSTRAP to avoid
the DT_FLAGS and DT_RUNPATH handling (more specifically the GLRO usage,
which is not relocated yet).
This patch fixes it by passing two arguments to elf_get_dynamic_info()
to indicate whether it is called by rtld (bootstrap) or by static pie
initialization (static_pie_bootstrap). I think using explicit arguments
is much clearer than buried C preprocessor conditionals, and the
compiler should remove the dead code.
I checked on x86_64 and i686 with default options, --enable-bind-now,
and --enable-bind-now and --enable-static-pie. I also checked on
aarch64, armhf, powerpc64, and powerpc with default and
--enable-bind-now.
The 4af6982e4c fix does not fully handle RTLD_BOOTSTRAP usage in
rtld.c due to two issues:
1. RTLD_BOOTSTRAP is also used on dl-machine.h on various
architectures and it changes the semantics of various machine
relocation functions.
2. The elf_get_dynamic_info() change was done sideways: prior
to 490e6c62aa, get-dynamic-info.h was included by the first
dynamic-link.h include *without* RTLD_BOOTSTRAP being defined.
It means that the code within elf_get_dynamic_info() that uses
RTLD_BOOTSTRAP is in fact unused.
To fix 1., this patch now includes dynamic-link.h only once with
RTLD_BOOTSTRAP defined. The ELF_DYNAMIC_RELOCATE call will now have
the relocation functions with the expected semantics for the loader.
And to fix 2., part of 4af6982e4c is reverted (the check argument of
elf_get_dynamic_info() is not required) and the RTLD_BOOTSTRAP
pieces are removed.
To reorganize the includes, the static TLS definition is moved to
its own header to avoid a circular dependency (it is defined in
dynamic-link.h and dl-machine.h requires it, while at the same time other
dynamic-link.h definitions require dl-machine.h definitions).
Also ELF_MACHINE_NO_REL, ELF_MACHINE_NO_RELA, and ELF_MACHINE_PLT_REL
are moved to their own header. Only ancient ABIs need special values
(arm, i386, and mips), so a generic one is used as default.
The powerpc Elf64_FuncDesc is also moved to its own header, since
csu code requires its definition (which would otherwise require either
including the elf/ folder or adding a full path with elf/).
Checked on x86_64, i686, aarch64, armhf, powerpc64, powerpc32,
and powerpc64le.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The tst-audit14, tst-audit15 and tst-audit16 tests all have audit
modules that write to stdout; the test reads from stdout to confirm
what was written. This assumes that stdout is a file, which is not the
case when run over ssh.
This patch updates the tests to use a post-run cmp command to compare
the output against an .exp file. This is similar to how many other
tests work and it fixes the stdout limitation. Also, this means the
test code can be greatly simplified.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Prior to 490e6c62aa ('elf: Avoid nested functions in the loader
[BZ #27220]'), elf_get_dynamic_info() was defined twice in rtld.c: in
the first dynamic-link.h include and later within _dl_start(). The
former definition did not define DONT_USE_BOOTSTRAP_MAP and it is used
in setup_vdso() (since it is a global definition), while the latter does
define DONT_USE_BOOTSTRAP_MAP and it is used in loader self-relocation.
With the commit change, the function is now included and defined once
instead of defined as a nested function. So rtld.c defines it without
defining RTLD_BOOTSTRAP, and this breaks at least powerpc32.
This patch fixes it by moving the get-dynamic-info.h include out of
dynamic-link.h, so that the caller can correctly set the expected
semantics by defining STATIC_PIE_BOOTSTRAP, RTLD_BOOTSTRAP, and/or
RESOLVE_MAP.
It was also required to enable some asserts only for the loader bootstrap
to avoid issues when called from setup_vdso().
As a side note, this is another issue with nested functions: it is
not clear from the pre-processed output (-E -dD) how the function will
be built and what its semantics are (since a nested function will be
local and extra C defines may change it).
I checked on x86_64-linux-gnu (w/o --enable-static-pie),
i686-linux-gnu, powerpc64-linux-gnu, powerpc-linux-gnu-power4,
aarch64-linux-gnu, arm-linux-gnu, sparc64-linux-gnu, and
s390x-linux-gnu.
Reviewed-by: Fangrui Song <maskray@google.com>
dynamic-link.h is included more than once in some elf/ files (rtld.c,
dl-conflict.c, dl-reloc.c, dl-reloc-static-pie.c) and uses GCC nested
functions. This harms readability, and the nested function usage
is the biggest obstacle preventing a Clang build (Clang doesn't support
GCC nested functions).
The key idea for unnesting is to add extra parameters (struct link_map *
and struct r_scope_elem *[]) to RESOLVE_MAP,
ELF_MACHINE_BEFORE_RTLD_RELOC, ELF_DYNAMIC_RELOCATE, elf_machine_rel[a],
elf_machine_lazy_rel, and elf_machine_runtime_setup. (This is inspired
by Stan Shebs' ppc64/x86-64 implementation in the
google/grte/v5-2.27/master which uses mixed extra parameters and static
variables.)
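The unnesting idea can be illustrated generically: state that a GCC nested function would capture from its enclosing function is passed as an explicit parameter instead. All names below (object, sym, lookup, relocate_one) are hypothetical and far simpler than the loader's macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym
{
  const char *name;
  unsigned long value;
};

struct object
{
  const struct sym *syms;
  size_t nsyms;
};

/* As a nested function, this helper would have referred to the
   enclosing function's OBJ implicitly.  As a file-scope function it
   receives OBJ as an extra parameter, which is valid ISO C.  */
static const struct sym *
lookup (const struct object *obj, const char *name)
{
  for (size_t i = 0; i < obj->nsyms; i++)
    if (strcmp (obj->syms[i].name, name) == 0)
      return &obj->syms[i];
  return NULL;
}

/* Caller that forwards its context explicitly.  Returns 0 if the
   symbol is not found.  */
unsigned long
relocate_one (const struct object *obj, const char *name,
	      unsigned long addend)
{
  const struct sym *s = lookup (obj, name);
  return s != NULL ? s->value + addend : 0;
}
```

The cost is a longer parameter list on every helper, which is why the simplifications listed below aim at dropping parameters that are no longer needed.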
Future simplification:
* If mips elf_machine_runtime_setup no longer needs RESOLVE_GOTSYM,
elf_machine_runtime_setup can drop the `scope` parameter.
* If TLSDESC no longer needs to be in elf_machine_lazy_rel,
elf_machine_lazy_rel can drop the `scope` parameter.
Tested on aarch64, i386, x86-64, powerpc64le, powerpc64, powerpc32,
sparc64, sparcv9, s390x, s390, hppa, ia64, armhf, alpha, and mips64.
In addition, tested build-many-glibcs.py with {arc,csky,microblaze,nios2}-linux-gnu
and riscv64-linux-gnu-rv64imafdc-lp64d.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When performing symbol lookup for references in executable without
indirect external access:
1. Disallow copy relocations in executable against protected data symbols
in a shared object with indirect external access.
2. Disallow non-zero symbol values of undefined function symbols in
executable, which are used as the function pointer, against protected
function symbols in a shared object with indirect external access.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
1. Add GNU_PROPERTY_1_NEEDED:
#define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO
to indicate the needed properties by the object file.
2. Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS:
#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)
to indicate that the object file requires canonical function pointers and
cannot be used with copy relocation.
3. Scan GNU_PROPERTY_1_NEEDED property and store it in l_1_needed.
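Step 3 can be sketched as a scan over the property array of a GNU property note. The function scan_1_needed is hypothetical; it assumes the caller has already located the descriptor bytes, that the values are in host byte order, and that ALIGN is 8 for ELF64 and 4 for ELF32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef GNU_PROPERTY_UINT32_OR_LO
# define GNU_PROPERTY_UINT32_OR_LO 0xb0008000
#endif
#ifndef GNU_PROPERTY_1_NEEDED
# define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO
#endif
#ifndef GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS
# define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)
#endif

/* Walk the properties in P[0..LEN-1].  Each property consists of a
   32-bit type, a 32-bit data size, and the data padded to ALIGN bytes.
   Returns the OR of all GNU_PROPERTY_1_NEEDED values found.  */
uint32_t
scan_1_needed (const unsigned char *p, size_t len, size_t align)
{
  uint32_t needed = 0;
  while (len >= 8)
    {
      uint32_t type, datasz;
      memcpy (&type, p, 4);
      memcpy (&datasz, p + 4, 4);
      p += 8;
      len -= 8;
      if (datasz > len)
	break;			/* Truncated property.  */
      if (type == GNU_PROPERTY_1_NEEDED && datasz == 4)
	{
	  uint32_t value;
	  memcpy (&value, p, 4);
	  needed |= value;
	}
      size_t step = (datasz + align - 1) & ~(align - 1);
      if (step > len)
	break;
      p += step;
      len -= step;
    }
  return needed;
}
```

A caller would then test the result against GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS and record it per object, as l_1_needed does.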
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linker creates the DT_DEBUG entry only in executables. Don't fill the
non-existent DT_DEBUG entry in ld.so with the run-time address of the
r_debug structure. This fixes BZ #28129.
The fix for bug 19329 caused a regression such that pthread_create can
deadlock when concurrent ctors from dlopen are waiting for it to finish.
Use a new GL(dl_load_tls_lock) in pthread_create that is not taken
around ctors in dlopen.
The new lock is also used in __tls_get_addr instead of GL(dl_load_lock).
The new lock is held in _dl_open_worker and _dl_close_worker around
most of the logic before/after the init/fini routines. When init/fini
routines are running then TLS is in a consistent, usable state.
In _dl_open_worker the new lock requires catching and reraising dlopen
failures that happen in the critical section.
The new lock is reinitialized in a fork child, to keep the existing
behaviour and it is kept recursive in case malloc interposition or TLS
access from signal handlers can retake it. It is not obvious if this
is necessary or helps, but avoids changing the preexisting behaviour.
The new lock may be more appropriate for dl_iterate_phdr too than
GL(dl_load_write_lock), since TLS state of an incompletely loaded
module may be accessed. If the new lock can replace the old one,
that can be a separate change.
Fixes bug 28357.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
commit ec935dea63
Author: Florian Weimer <fweimer@redhat.com>
Date: Fri Apr 24 22:31:15 2020 +0200
elf: Implement __libc_early_init
has
@@ -856,6 +876,11 @@ no more namespaces available for dlmopen()"));
/* See if an error occurred during loading. */
if (__glibc_unlikely (exception.errstring != NULL))
{
+ /* Avoid keeping around a dangling reference to the libc.so link
+ map in case it has been cached in libc_map. */
+ if (!args.libc_already_loaded)
+ GL(dl_ns)[nsid].libc_map = NULL;
+
do_dlopen calls _dl_open with nsid == __LM_ID_CALLER (-2), which calls
dl_open_worker with args.nsid = nsid. dl_open_worker updates args.nsid
if it is __LM_ID_CALLER. After dl_open_worker returns, it is wrong to
use nsid.
Replace nsid with args.nsid after dl_open_worker returns. This fixes
BZ #27609.
When adding ld.so to a new namespace, we don't actually load ld.so. We
create a new link map that refers to the real one for almost everything.
Copy l_addr and l_ld from the real ld.so link map to avoid the GDB warning:
warning: .dynamic section for ".../elf/ld-linux-x86-64.so.2" is not at the expected address (wrong library or version mismatch?)
when handling shared libraries loaded by dlmopen.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add tst-ro-dynamic-mod to modules-names-nobuild to avoid
../Makerules:767: warning: ignoring old recipe for target '.../elf/tst-ro-dynamic-mod.so'
This updates BZ #28340 fix.
We can't relocate entries in dynamic section if it is readonly:
1. Add a l_ld_readonly field to struct link_map to indicate if dynamic
section is readonly and set it based on p_flags of PT_DYNAMIC segment.
2. Replace DL_RO_DYN_SECTION with dl_relocate_ld to decide if dynamic
section should be relocated.
3. Remove DL_RO_DYN_TEMP_CNT.
4. Don't use a static dynamic section to make readonly dynamic section
in vDSO writable.
5. Remove the temp argument from elf_get_dynamic_info.
This fixes BZ #28340.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Glibc does not provide an interface for debuggers to access libraries
loaded in multiple namespaces via dlmopen.
The current rtld-debugger interface is described in the file:
elf/rtld-debugger-interface.txt
under the "Standard debugger interface" heading. This interface only
provides access to the first link-map (LM_ID_BASE).
1. Bump r_version to 2 when multiple namespaces are used. This triggers
the GDB bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=28236
2. Add struct r_debug_extended to extend struct r_debug into a linked-list,
where each element corresponds to a unique namespace.
3. Initialize the r_debug_extended structure. Bump r_version to 2 for
the new namespace and add the new namespace to the namespace linked list.
4. Add _dl_debug_update to return the address of struct r_debug of a
namespace.
5. Add a hidden symbol, _r_debug_extended, for struct r_debug_extended.
6. Provide the symbol, _r_debug, with size of struct r_debug, as an alias
of _r_debug_extended, for programs which reference _r_debug.
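From a debugger's point of view, the extended layout can be walked roughly as follows. The types dbg and dbg_extended are hypothetical mirrors of struct r_debug and struct r_debug_extended, reduced to the members needed here; the member names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced mirror of struct r_debug.  */
struct dbg
{
  int r_version;
  const void *r_map;		/* Head of this namespace's link map.  */
};

/* Hypothetical mirror of the extended structure: the base structure
   comes first, followed by the link to the next namespace.  */
struct dbg_extended
{
  struct dbg base;
  struct dbg_extended *r_next;
};

/* Count the namespaces reachable from D.  With r_version 1 only the
   base namespace is described and the extension must not be read.  */
size_t
count_namespaces (const struct dbg *d)
{
  if (d->r_version < 2)
    return 1;
  size_t n = 0;
  for (const struct dbg_extended *e = (const struct dbg_extended *) d;
       e != NULL; e = e->r_next)
    n++;
  return n;
}
```

Keeping the base structure as the first member is what allows the old symbol to alias the new one: a consumer that only knows the original layout still reads valid data.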
This fixes BZ #15971.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
All the ports now have THREAD_GSCOPE_IN_TCB set to 1. Remove all
support for !THREAD_GSCOPE_IN_TCB, along with the definition itself.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-4-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
While originally this definition was indeed used to distinguish between
the cases where the GSCOPE flag was stored in TCB or not, it has since
become used as a general way to distinguish between HTL and NPTL.
THREAD_GSCOPE_IN_TCB will be removed in the following commits, as HTL,
which currently is the only port that does not put the flag into TCB,
will get ported to put the GSCOPE flag into the TCB as well. To prepare
for that change, migrate all code that wants to distinguish between HTL
and NPTL to use PTHREAD_IN_LIBC instead, which is a better choice since
the distinction mostly has to do with whether libc has access to the
list of thread structures and therefore can initialize thread-local
storage.
The parts of code that actually depend on whether the GSCOPE flag is in
TCB are left unchanged.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210907133325.255690-2-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Avoid triggering a false positive from valgrind by copying the terminating
null in tunables_strdup. At this point the heap is still clean, but
valgrind is stricter here.
elf/tls-macros.h was added for TLS testing when GCC did not support
__thread. __thread and tls_model attributes are mature now and have been
used by many newer tests.
Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and
unsupported by Clang/LLD). .tls_common and .tbss definitions are almost
identical after linking, so the runtime test doesn't add additional
coverage. Assembler and linker tests should be on the binutils side.
When LLD 13.0.0 is allowed in configure.ac
(https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html),
`make check` result is on par with glibc built with GNU ld on aarch64
and x86_64.
As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_LE macros can be removed from
sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad}
tests to ensure coverage.
Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
We can consider __ehdr_start (from binutils 2.23 onwards)
unconditionally supported, since configure.ac requires binutils>=2.25.
The configure.ac check is related to an ia64 bug fixed by binutils 2.24.
See https://sourceware.org/pipermail/libc-alpha/2014-August/053503.html
Tested on x86_64-linux-gnu. Tested build-many-glibcs.py with
aarch64-linux-gnu and s390x-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Commit 03e187a41d added a regression when an audit module does not have
libc as DT_NEEDED (although unusual it is possible).
Checked on x86_64-linux-gnu.
Remove all malloc hook uses from core malloc functions and move it
into a new library libc_malloc_debug.so. With this, the hooks now no
longer have any effect on the core library.
libc_malloc_debug.so is a malloc interposer that needs to be preloaded
to get hooks functionality back so that the debugging features that
depend on the hooks, i.e. malloc-check, mcheck and mtrace work again.
Without the preloaded DSO these debugging features will be nops.
These features will be ported away from hooks in subsequent patches.
Similarly, legacy applications that need hooks functionality need to
preload libc_malloc_debug.so.
The symbols exported by libc_malloc_debug.so are maintained at exactly
the same version as libc.so.
Finally, static binaries will no longer be able to use malloc
debugging features since they cannot preload the debugging DSO.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This is an updated version of 572bd547d5 (reverted by 40ebfd016a)
that fixes the _dl_next_tls_modid issues.
The issue with the 572bd547d5 patch is that the DTV entry is only
updated in dl_open_worker() by the update_tls_slotinfo() call, after
all dependencies have been processed by _dl_map_object_deps(). However,
_dl_map_object_deps() itself might call _dl_next_tls_modid(), and since
the _dl_tls_dtv_slotinfo_list::map is not yet set the entry will be
wrongly reused.
This patch fixes it by renaming the _dl_next_tls_modid() function to
_dl_assign_tls_modid() and by passing the link_map so it can set
the slotinfo value, so that a subsequent _dl_next_tls_modid() call will
see the entry as allocated.
The intermediary value is cleared up on remove_slotinfo() for the case
a library fails to load with RTLD_NOW.
This patch fixes BZ #27135.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
As a result, it is not necessary to specify __attribute__ ((nocommon))
on individual definitions.
GCC 10 defaults to -fno-common on all architectures except ARC,
but this change is compatible with older GCC versions and ARC, too.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Both tests try to dlopen libm.so at runtime, so make them depend on it
so that they're executed if libm.so has been updated.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
They are no longer needed after everything has been moved into
libc. The _dl_vsym test has to be removed because the symbol
cannot be used outside libc anymore.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In dlerror_run, free corresponds to the local malloc in the
namespace, but GLRO (dl_catch_error) uses the malloc from the base
namespace. elf/tst-dlmopen-gethostbyname triggers this mismatch,
but it does not crash, presumably because of a fastbin deallocation.
Fixes commit c2059edce2 ("elf: Use
_dl_catch_error from base namespace in dl-libc.c [BZ #27646]") and
commit b2964eb1d9 ("dlfcn: Failures
after dlmopen should not terminate process [BZ #24772]").
librt.so is no longer installed for PTHREAD_IN_LIBC, and tests
are not linked against it. $(librt) is introduced globally for
shared tests that need to be linked for both PTHREAD_IN_LIBC
and !PTHREAD_IN_LIBC.
GLIBC_PRIVATE symbols that were needed during the transition are
removed again.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Previously, the installed objects were named like libc-2.33.so,
and the ABI soname libc.so.6 was just a symbolic link.
The Makefile targets to install these symbolic links are no longer
needed after this, so they are removed with this commit. The more
general $(make-link) command (which invokes scripts/rellns-sh) is
retained because other symbolic links are still needed.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This introduces <dl-is_dso.h> and the _dl_is_dso function. A
test ensures that the official names of libc.so, ld.so, and their
versioned names are recognized.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Revert "elf: Fix DTV gap reuse logic [BZ #27135]"
This reverts commit 572bd547d5.
It turns out that the _dl_next_tls_modid in _dl_map_object_from_fd keeps
returning the same modid over and over again if there is a gap and
more than one TLS-using module is loaded in one dlopen call. This corrupts
TLS data structures. The bug is still present after a revert, but
empirically it is much more difficult to trigger (because it involves a
dlopen failure).
If lib->flags (in the cache) did not match GLRO (dl_correct_cache_id),
searching for further glibc-hwcaps entries did not happen, and it
was possible that the best glibc-hwcaps was not found. By accident,
this causes a test failure for elf/tst-glibc-hwcaps-prepend-cache
on armv7l.
This commit changes the cache lookup logic to continue searching
if (a) no match has been found, (b) a named glibc-hwcaps match
has been found, or (c) a non-glibc-hwcaps match has been found
and the entry flags and cache default flags do not match.
_DL_CACHE_DEFAULT_ID is used instead of GLRO (dl_correct_cache_id)
because the latter is only written once on i386 if loading
of libc.so.5 libraries is selected, so GLRO (dl_correct_cache_id)
should probably be removed in a future change.
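
The three continue-searching conditions can be written as a small
predicate. This is a sketch only: the enum and function names are
hypothetical and do not come from elf/dl-cache.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the best cache match seen so far.  */
enum match_kind
{
  MATCH_NONE,           /* (a) nothing found yet */
  MATCH_HWCAPS_NAMED,   /* (b) a named glibc-hwcaps entry */
  MATCH_PLAIN           /* (c) an entry outside glibc-hwcaps */
};

/* Return true if the lookup should keep scanning cache entries.
   ENTRY_FLAGS are the flags of the current best match; DEFAULT_FLAGS
   stands in for the cache default (_DL_CACHE_DEFAULT_ID in the text).  */
bool
keep_searching (enum match_kind best, int entry_flags, int default_flags)
{
  switch (best)
    {
    case MATCH_NONE:
    case MATCH_HWCAPS_NAMED:
      return true;
    case MATCH_PLAIN:
      return entry_flags != default_flags;
    }
  return false;
}
```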
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
For the legacy ABI, which supports 32-bit time_t, it calls the 64-bit
time_t variant directly, since the LFS symbols call the 64-bit time_t
ones internally.
Checked on i686-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
dlerror_run in elf/dl-libc.c needs to call GLRO (dl_catch_error)
from the base namespace, just like the exported dlerror
implementation.
Fixes commit b2964eb1d9 ("dlfcn:
Failures after dlmopen should not terminate process [BZ #24772]").
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since commit 0c1c3a771e
("dlfcn: Move dlopen into libc") libdl.a is empty, so linking
against it is no longer necessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The testcase elf/tst-tls9-static sometimes fails with:
cannot open 'tst-tlsmod5.so': tst-tlsmod5.so: cannot open shared object file: No such file or directory
cannot open 'tst-tlsmod6.so': tst-tlsmod6.so: cannot open shared object file: No such file or directory
After recent commit
6f1c701026
"dlfcn: Cleanups after -ldl is no longer required"
the libdl variable is not set anymore and thus the
dependencies were missing.
Consolidate all hooks structures into a single one. There are
no static dlopen ABI concerns because glibc 2.34 already comes
with substantial ABI-incompatible changes in this area. (Static
dlopen requires the exact same dynamic glibc version that was used
for static linking.)
The new approach uses a pointer to the hooks structure in
_rtld_global_ro and initializes it in __rtld_static_init. This avoids
a back-and-forth with various callback functions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This commit removes the ELF constructor and internal variables from
dlfcn/dlfcn.c. The file now serves the same purpose as
nptl/libpthread-compat.c, so it is renamed to dlfcn/libdl-compat.c.
The use of libdl-shared-only-routines ensures that libdl.a is empty.
This commit adjusts the test suite not to use $(libdl). The libdl.so
symbolic link is no longer installed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbol was moved using scripts/move-symbol-to-libc.py.
In elf/Makefile, remove the $(libdl) dependency from testobj1.so
because the unused libdl DSO now causes elf/tst-unused-deps to
fail.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Once libpthread is empty and no longer marked NODELETE, it no longer
can be used for testing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the __nptl_tls_static_size_for_stack inline function instead,
and the GLRO (dl_tls_static_align) value directly.
The computation of GLRO (dl_tls_static_align) in
_dl_determine_tlsoffset ensures that the alignment is at least
TLS_TCB_ALIGN, which is at least STACK_ALIGN (see allocate_stack).
Therefore, the additional rounding-up step is removed.
Also move the initialization of the default stack size from
__pthread_initialize_minimal_internal to __pthread_early_init.
This introduces an extra system call during single-threaded startup,
but this simplifies the initialization sequence. No locking is
needed around the writes to __default_pthread_attr because the
process is single-threaded at this point.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Coverity discovered that paths allocated by chroot_canon are not freed
in a couple of routines in ldconfig.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
A coverity run identified a number of resource leaks in cache.c.
There are a couple of simple memory leaks where a local allocation is
not freed before function return. Then there is a mmap leak and a
file descriptor leak where a map is not unmapped in the error case and
a file descriptor remains open respectively.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This helps to clarify that the caching of these fields in libpthread
(in __static_tls_size, __static_tls_align_m1) is unnecessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
After static dlopen, a copy of ld.so is loaded into the inner
namespace, but that copy is not initialized at all. Some
architectures run into serious problems as a result, which is why the
_dl_var_init mechanism was invented. With libpthread moving into
libc and parts into ld.so, more architectures are impacted, so it makes
sense to switch to a generic mechanism which performs the partial
initialization.
As a result, getauxval now works after static dlopen (bug 20802).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(FYI, this is a repost of
https://sourceware.org/pipermail/libc-alpha/2019-July/105035.html now
that FSF papers have been signed and confirmed on FSF side).
This trivial patch attempts to fix BZ 24106. Basically the bash locally
used when building glibc on the host shall not leak into the installed
glibc, as the system where it is installed might be different and use
another bash location.
So I have looked for all occurrences of @BASH@ or $(BASH) in installed
files, and replaced it by /bin/bash. This was suggested by Florian
Weimer in the bug report.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
For some reason only dlopen failure caused dtv gaps to be reused.
It is possible that the intent was to never reuse modids for a
different module, but after dlopen failure all gaps are reused
not just the ones caused by the unfinished dlopen.
So the code has to handle reused modids already which seems to
work, however the data races at thread creation and tls access
(see bug 19329 and bug 27111) may be more severe if slots are
reused so this is scheduled after those fixes. I think fixing
the races is not simpler if reuse is disallowed and reuse has
other benefits, so set GL(dl_tls_dtv_gaps) whenever entries are
removed from the middle of the slotinfo list. The value does
not have to be correct: incorrect true value causes the next
modid query to do a slotinfo walk, incorrect false will leave
gaps and new entries are added at the end.
Fixes bug 27135.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Test concurrent dlopen and pthread_create when the loaded modules have
TLS. This triggers dl-tls assertion failures more reliably than the
nptl/tst-stack4 test.
The dlopened module has 100 DT_NEEDED dependencies with TLS; they were
reused from an existing TLS test. The number of created threads during
dlopen depends on filesystem speed and hardware, but at most 3 threads
are alive at a time to limit resource usage.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is a follow up patch to the fix for bug 19329. This adds relaxed
MO atomics to accesses that were previously data races but are now
race conditions, and where relaxed MO is sufficient.
The race conditions all follow the pattern that the write is behind the
dlopen lock, but a read can happen concurrently (e.g. during tls access)
without holding the lock. For slotinfo entries the read value only
matters if it reads from a synchronized write in dlopen or dlclose,
otherwise the related dtv entry is not valid to access so it is fine
to leave it in an inconsistent state. The same applies for
GL(dl_tls_max_dtv_idx) and GL(dl_tls_generation), but there the
algorithm relies on the fact that the read of the last synchronized
write is an increasing value.
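
The pattern described above (write under the lock, lock-free read with
relaxed memory order) might look like this in C11 atomics. This is a
sketch under stated assumptions: the variable and function names are
hypothetical stand-ins, and glibc uses its own internal atomic macros
rather than <stdatomic.h>.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the dlopen lock and a generation counter.  */
static pthread_mutex_t load_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic size_t generation;   /* Only ever increases.  */

/* Writer: always runs with the lock held, so writers never race with
   each other.  The store is atomic only so that concurrent readers do
   not constitute a data race.  */
size_t
bump_generation (void)
{
  pthread_mutex_lock (&load_lock);
  size_t next
    = atomic_load_explicit (&generation, memory_order_relaxed) + 1;
  atomic_store_explicit (&generation, next, memory_order_relaxed);
  pthread_mutex_unlock (&load_lock);
  return next;
}

/* Reader: may run concurrently without the lock.  It can observe a
   stale value, but never a torn one and never a value that goes
   backwards relative to an earlier read by the same thread.  */
size_t
read_generation (void)
{
  return atomic_load_explicit (&generation, memory_order_relaxed);
}
```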
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
DTV setup at thread creation (_dl_allocate_tls_init) is changed
to take the dlopen lock, GL(dl_load_lock). Avoiding data races
here without locks would require design changes: the map that is
accessed for static TLS initialization here may be concurrently
freed by dlclose. That use after free may be solved by only
locking around static TLS setup or by ensuring dlclose does not
free modules with static TLS, however currently every link map
with TLS has to be accessed at least to see if it needs static
TLS. And even if that's solved, still a lot of atomics would be
needed to synchronize DTV related globals without a lock. So fix
both bug 19329 and bug 27111 with a lock that prevents DTV setup
running concurrently with dlopen or dlclose.
_dl_update_slotinfo at TLS access still does not use any locks
so CONCURRENCY NOTES are added to explain the synchronization.
The early exit from the slotinfo walk when max_modid is reached
is not strictly necessary, but does not hurt either.
An incorrect acquire load was removed from _dl_resize_dtv: it
did not synchronize with any release store or fence and
synchronization is now handled separately at thread creation
and TLS access time.
There are still a number of racy read accesses to globals that
will be changed to relaxed MO atomics in a followup patch. This
should not introduce regressions compared to existing behaviour
and avoid cluttering the main part of the fix.
Not all TLS access related data races got fixed here: there are
additional races at lazy tlsdesc relocations; see bug 27137.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
All the stack lists are now in _rtld_global, so it is possible
to change stack permissions directly from there, instead of
calling into libpthread to do the change.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Permissions of the cached stacks may have to be updated if an object
is loaded that requires executable stacks, so the dynamic loader
needs to know about these cached stacks.
The move of in_flight_stack and stack_cache_actsize is a requirement for
merging __reclaim_stacks into the fork implementation in libc.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is an early variant of __tls_init_tp, primarily for initializing
thread-related elements of _rtld_global/GL.
Some existing initialization code not needed for NPTL is moved into
the generic version of this function.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
If libpthread is included in libc, it is not necessary to delay
initialization of the lock/unlock function pointers until libpthread
is loaded. This eliminates two unprotected function pointers
from _rtld_global and removes some initialization code from
libpthread.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt. This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.
The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_RESET macros. They call the
__pthread_* functions unconditionally now. The macros are still
needed because shared code uses them; Hurd has different definitions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The stack list is available in ld.so since commit
1daccf403b ("nptl: Move stack list
variables into _rtld_global"), so it's possible to walk the stack
list directly in ld.so and perform the initialization there.
This eliminates an unprotected function pointer from _rtld_global
and reduces the libpthread initialization code.
TLS_INIT_TP is processor-specific, so it is not a good place to
put thread library initialization code (it would have to be repeated
for all CPUs). Introduce __tls_init_tp as a separate function,
to be called immediately after TLS_INIT_TP. Move the existing
stack list setup code for NPTL to this function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Calling free directly may end up freeing a pointer allocated by the
dynamic loader using malloc from libc.so in the base namespace using
the allocator from libc.so in a secondary namespace, which results in
crashes.
This commit redirects the free call through GLRO and the dynamic
linker, to reach the correct namespace. It also cleans up the dlerror
handling along the way, so that pthread_setspecific is no longer
needed (which avoids triggering bug 24774).
Commit 9e78f6f6e7 ("Implement
_dl_catch_error, _dl_signal_error in libc.so [BZ #16628]") has the
side effect that distinct namespaces, as created by dlmopen, now have
separate implementations of the rtld exception mechanism. This means
that the call to _dl_catch_error from libdl in a secondary namespace
does not actually install an exception handler because the
thread-local variable catch_hook in the libc.so copy in the secondary
namespace is distinct from that of the base namespace. As a result, a
dlsym/dlopen/... failure in a secondary namespace terminates the process
with a dynamic linker error because it looks to the exception handler
mechanism as if no handler has been installed.
This commit restores GLRO (dl_catch_error) and uses it to set the
handler in the base namespace.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It's necessary to stub out __libc_disable_asynccancel and
__libc_enable_asynccancel via rtld-stubbed-symbols because the new
direct references to the unwinder result in symbol conflicts when the
rtld exception handling from libc is linked in during the construction
of librtld.map.
unwind-forcedunwind.c is merged into unwind-resume.c. libc now needs
the functions that were previously only used in libpthread.
The GLIBC_PRIVATE exports of __libc_longjmp and __libc_siglongjmp are
no longer needed, so switch them to hidden symbols.
The symbol __pthread_unwind_next has been moved using
scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove generic tlsdesc code related to lazy tlsdesc processing since
lazy tlsdesc relocation is no longer supported. This includes removing
GL(dl_load_lock) from _dl_make_tlsdesc_dynamic which is only called at
load time when that lock is already held.
Added a documentation comment too.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
map is not valid to access here because it can be freed by a concurrent
dlclose: during tls access (via __tls_get_addr) _dl_update_slotinfo is
called without holding dlopen locks. So don't check the modid of map.
The map == 0 and map != 0 code paths can be shared (avoiding the dtv
resize in case of map == 0 is just an optimization: larger dtv than
necessary would be fine too).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since
commit a509eb117f
Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
the generation counter update is not needed in the failure path.
That commit ensures allocation in _dl_add_to_slotinfo happens before
the demarcation point in dlopen (it is called twice, first time is for
allocation only where dlopen can still be reverted on failure, then
second time actual dtv updates are done which then cannot fail).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The test dlopens a large number of modules with TLS; they are reused
from an existing test.
The test relies on the reuse of slotinfo entries after dlclose; without
bug 27135 fixed this needs a failing dlopen. With a slotinfo list that
has non-monotone increasing generation counters, bug 27136 can trigger.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The max modid is a valid index in the dtv, it should not be skipped.
The bug is observable if the last module has modid == 64 and its
generation is same or less than the max generation of the previous
modules. Then dtv[0].counter implies dtv[64] is initialized but
it isn't. Fixes bug 27136.
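
The off-by-one can be illustrated with a toy dtv. This is a sketch, not
the _dl_allocate_tls_init code; the array and function name are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mark dtv entries for modids 1..MAX_MODID as initialized and return how
   many were touched.  INITIALIZED must have MAX_MODID + 1 elements;
   index 0 is reserved (it holds the counter in the real dtv).  */
size_t
init_dtv_entries (bool *initialized, size_t max_modid)
{
  size_t count = 0;
  /* "<=" rather than "<": MAX_MODID itself is a valid index.  */
  for (size_t modid = 1; modid <= max_modid; ++modid)
    {
      initialized[modid] = true;
      ++count;
    }
  return count;
}
```

With a "<" bound the entry for modid 64 would stay uninitialized, which
is the situation the commit message describes.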
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When parse_tunables tries to erase a tunable marked as SXID_ERASE for
setuid programs, it ends up setting the envvar string iterator
incorrectly, because of which it may parse the next tunable
incorrectly. Given that currently the implementation allows malformed
and unrecognized tunables to pass through, it may even allow SXID_ERASE
tunables to go through.
This change revamps the SXID_ERASE implementation so that:
- Only valid tunables are written back to the tunestr string, because
of which children of SXID programs will only inherit a clean list of
identified tunables that are not SXID_ERASE.
- Unrecognized tunables get scrubbed off from the environment and
subsequently from the child environment.
- This has the side-effect that a tunable that is not identified by
the setxid binary, will not be passed on to a non-setxid child even
if the child could have identified that tunable. This may break
applications that expect this behaviour but expecting such tunables
to cross the SXID boundary is wrong.
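
The write-back behaviour can be sketched as an in-place filter over a
colon-separated name=value list. This is a simplified model, not the
parse_tunables code: the table contents, the sxid_erase markings and the
function names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of recognized tunables for this sketch.  */
struct known_tunable
{
  const char *name;
  bool sxid_erase;
};

static const struct known_tunable known[] = {
  { "glibc.malloc.check", true },
  { "glibc.malloc.mmap_max", false },
};

static const struct known_tunable *
lookup (const char *name, size_t len)
{
  for (size_t i = 0; i < sizeof known / sizeof known[0]; ++i)
    if (strlen (known[i].name) == len
        && memcmp (known[i].name, name, len) == 0)
      return &known[i];
  return NULL;
}

/* Rewrite STR in place so that it keeps only recognized name=value
   items that are not marked for erasure.  Returns STR.  */
char *
scrub_tunables (char *str)
{
  char *out = str;
  const char *p = str;
  while (*p != '\0')
    {
      size_t item_len = strcspn (p, ":");
      const char *eq = memchr (p, '=', item_len);
      const struct known_tunable *t
        = eq != NULL ? lookup (p, (size_t) (eq - p)) : NULL;
      if (t != NULL && !t->sxid_erase)
        {
          if (out != str)
            *out++ = ':';
          /* OUT never runs ahead of P, so memmove is safe here.  */
          memmove (out, p, item_len);
          out += item_len;
        }
      p += item_len;
      if (*p == ':')
        ++p;
    }
  *out = '\0';
  return str;
}
```

Because only accepted items are copied back, malformed and unrecognized
entries disappear from the string a child process would inherit.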
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Instead of passing GLIBC_TUNABLES via the environment, pass the
environment variable from parent to child. This allows us to test
multiple variables to ensure better coverage.
The test list currently only includes the case that's already being
tested. More tests will be added later.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The simplification of tunable_set interfaces took care of
signed/unsigned conversions while setting values, but comparison with
bounds ended up being incorrect; comparing TUNABLE_SIZE_T values for
example will fail because SIZE_MAX is seen as -1.
Add comparison helpers that take tunable types into account and use
them to do comparison instead.
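
A type-aware bounds check could look like the following. This is a
sketch only: the enum, the helper names and the exact shape of the
comparison are assumptions, with tunable_num_t taken to be intmax_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef intmax_t tunable_num_t;   /* Assumed storage type.  */

/* Hypothetical type tags for this sketch.  */
enum sketch_type { SKETCH_INT_32, SKETCH_UINT_64, SKETCH_SIZE_T };

static bool
is_unsigned_type (enum sketch_type t)
{
  return t == SKETCH_UINT_64 || t == SKETCH_SIZE_T;
}

/* Compare two stored values, reinterpreting them as unsigned when the
   tunable's declared type is unsigned.  */
static bool
val_le (tunable_num_t a, tunable_num_t b, enum sketch_type t)
{
  return is_unsigned_type (t) ? (uintmax_t) a <= (uintmax_t) b : a <= b;
}

/* True if MIN <= V <= MAX under the signedness implied by T.  */
bool
in_bounds (tunable_num_t v, tunable_num_t min, tunable_num_t max,
           enum sketch_type t)
{
  return val_le (min, v, t) && val_le (v, max, t);
}
```

With an upper bound stored as the all-ones bit pattern, a plain signed
comparison rejects every positive value, while the unsigned path accepts
it.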
dlopen updates libname_list by writing to lastp->next, but concurrent
reads in _dl_name_match_p were not synchronized when it was called
without holding GL(dl_load_lock), which can happen during lazy symbol
resolution.
This patch fixes the race between _dl_name_match_p reading lastp->next
and add_name_to_object writing to it. This could cause segfault on
targets with weak memory order when lastp->next->name is read, which
was observed on an arm system. Fixes bug 21349.
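
The synchronization can be sketched with C11 release/acquire atomics.
This is an illustration under assumptions, not the patch itself: the
node type and function names are hypothetical, and glibc uses its own
atomic macros instead of <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a libname_list element.  */
struct name_node
{
  const char *name;
  struct name_node *_Atomic next;
};

/* Writer (the caller is assumed to hold the load lock): initialize
   NEWNODE completely, then publish it with a release store so that a
   reader which sees the pointer also sees the initialized fields.  */
void
append_name (struct name_node *last, struct name_node *newnode,
             const char *name)
{
  newnode->name = name;
  atomic_init (&newnode->next, NULL);
  atomic_store_explicit (&last->next, newnode, memory_order_release);
}

/* Reader: walks the list without any lock.  The acquire load pairs with
   the release store above.  */
bool
name_match (struct name_node *head, const char *name)
{
  for (struct name_node *p = head; p != NULL;
       p = atomic_load_explicit (&p->next, memory_order_acquire))
    if (strcmp (p->name, name) == 0)
      return true;
  return false;
}
```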
(Code is from Maninder Singh, comments and description are from Szabolcs
Nagy.)
Co-authored-by: Vaneet Narang <v.narang@samsung.com>
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This does not change the emitted code since __libc_start_main does not
return, but is important for formal flags compliance.
This also cleans up the cosmetic inconsistency in the stack protector
flags in csu, especially the incorrect value of STACK_PROTECTOR_LEVEL.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Enabling --enable-stack-protector=all causes the following tests to fail:
FAIL: elf/ifuncmain9picstatic
FAIL: elf/ifuncmain9static
Nick Alcock (who committed the stack protector code) marked the IFUNC
resolvers with inhibit_stack_protector when he did the original work and
suggested doing so again @ BZ #25680. This patch adds
inhibit_stack_protector to ifuncmain9.
After the patch is applied, --enable-stack-protector=all does not fail the
above tests.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In this case, use the link map of the dynamic loader itself as
a replacement. This is more than just a hack: if we ever support
DT_RUNPATH/DT_RPATH for the dynamic loader, reporting it for
ld.so --help (without further command line arguments) would be the
right thing to do.
Fixes commit 3324213125 ("elf: Always
set l in _dl_init_paths (bug 23462)").
After d1d5471579 ("Remove dead
DL_DST_REQ_STATIC code.") we always setup the link map l to make the
static and shared cases the same. The bug is that in elf/dl-load.c
(_dl_init_paths) we conditionally set l only in the #ifdef SHARED
case, but unconditionally use it later. The simple solution is to
remove the #ifdef SHARED conditional, because it's no longer needed,
and unconditionally setup l for both the static and shared cases. A
regression test is added to run a static binary with
LD_LIBRARY_PATH='$ORIGIN' which crashes before the fix and runs after
the fix.
Co-Authored-By: Florian Weimer <fweimer@redhat.com>
It turns out the startup code in csu/elf-init.c has a perfect pair of
ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A
New Method to Bypass 64-bit Linux ASLR"). These functions are not
needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY
are already processed by the dynamic linker. However, the dynamic
linker skipped the main program for some reason. For maximum
backwards compatibility, this is not changed, and instead, the main
map is consulted from __libc_start_main if the init function argument
is a NULL pointer.
For statically linked binaries, the old approach based on linker
symbols is still used because there is nothing else available.
A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because
new binaries running on an old libc would not run their ELF
constructors, leading to difficult-to-debug issues.
The elision interfaces are closely aligned between the targets that
implement them, so declare them in the generic <lowlevellock.h>
file.
Empty .c stubs are provided, so that fewer makefile updates
under sysdeps are needed. Also simplify initialization via
__libc_early_init.
The symbols __lll_clocklock_elision, __lll_lock_elision,
__lll_trylock_elision, __lll_unlock_elision, __pthread_force_elision
move into libc. For the time being, non-hidden references are used
from libpthread to access them, but once that part of libpthread
is moved into libc, hidden symbols will be used again. (Hidden
references seem desirable to reduce the likelihood of transaction
aborts.)
The kernel does not put the vDSO at special addresses, so writev can
write the name directly. Also remove the incorrect comment about not
setting l_name.
Andy Lutomirski confirmed in
<https://lore.kernel.org/linux-api/442A16C0-AE5A-4A44-B261-FE6F817EAF3C@amacapital.net/>
that this copy is not necessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The glibc.malloc.mmap_max tunable as well as all of the INT_32 tunables
don't have use for negative values, so pin the hardcoded limits in the
non-negative range of INT. There's no real benefit in any of those
use cases for the extended range of unsigned, so I have avoided adding
a new type to keep things simple.
The TUNABLE_SET interface took a primitive C type argument, which
resulted in inconsistent type conversions internally due to incorrect
dereferencing of types, especially on 32-bit architectures. This
change simplifies the TUNABLE setting logic along with the interfaces.
Now all numeric tunable values are stored as signed numbers in
tunable_num_t, which is intmax_t. All calls to set tunables cast the
input value to its primitive type and then to tunable_num_t for
storage. This relies on gcc-specific (although I suspect other
compilers would also do the same) unsigned to signed integer conversion
semantics, i.e. the bit pattern is conserved. The reverse conversion
is guaranteed by the standard.
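A minimal sketch of the round trip described above. The helper names
store_uint64 and load_uint64 are hypothetical and are not part of the
tunables code; the typedef only stands in for the real tunable_num_t.

```c
#include <stdint.h>

/* Stand-in for the tunable_num_t described above (intmax_t).  */
typedef intmax_t tunable_num_t;

/* Hypothetical helper: store an unsigned value as a signed number.
   For values above INTMAX_MAX this conversion is implementation-defined
   in ISO C; GCC documents it as reduction modulo 2^N, so the bit
   pattern is preserved.  */
static tunable_num_t
store_uint64 (uint64_t v)
{
  return (tunable_num_t) v;
}

/* Hypothetical helper: read the value back.  Signed to unsigned
   conversion is defined by the standard as reduction modulo 2^64.  */
static uint64_t
load_uint64 (tunable_num_t v)
{
  return (uint64_t) v;
}
```

With GCC, load_uint64 (store_uint64 (x)) yields x for every uint64_t x,
including values that do not fit in a signed 64-bit integer.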
Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from
AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack
space required in order to guarantee successful, non-nested handling
of a single signal whose handler is an empty function, and _SC_SIGSTKSZ
which is the suggested minimum number of bytes of stack space required
for a signal stack.
If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns
MINSIGSTKSZ. On Linux/x86 with XSAVE, the signal frame used by kernel
is composed of the following areas and laid out as:
------------------------------
| alignment padding          |
------------------------------
| xsave buffer               |
------------------------------
| fsave header (32-bit only) |
------------------------------
| siginfo + ucontext         |
------------------------------
Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave
header (32-bit only) + size of siginfo and ucontext + alignment padding.
If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ
are redefined as
/* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). */
# undef SIGSTKSZ
# define SIGSTKSZ sysconf (_SC_SIGSTKSZ)
/* Minimum stack size for a signal handler: SIGSTKSZ. */
# undef MINSIGSTKSZ
# define MINSIGSTKSZ SIGSTKSZ
Compilation will fail if the source assumes constant MINSIGSTKSZ or
SIGSTKSZ.
The reason for not simply increasing the kernel's MINSIGSTKSZ #define
(apart from the fact that it is rarely used, due to glibc's shadowing
definitions) was that userspace binaries will have baked in the old
value of the constant and may be making assumptions about it.
For example, the type (char [MINSIGSTKSZ]) changes if this #define
changes. This could be a problem if a newly built library tries to
memcpy() or dump such an object defined by an old binary.
Bounds-checking and the stack sizes passed to things like sigaltstack()
and makecontext() could similarly go wrong.
The existing code specifies -Wl,--defsym=malloc=0 and other malloc.os
definitions before libc_pic.a so that libc_pic.a(malloc.os) is not
fetched. This trick is used to avoid multiple definition errors which
would happen as a chain result:
dl-allobjs.os has an undefined __libc_scratch_buffer_set_array_size
__libc_scratch_buffer_set_array_size fetches libc_pic.a(scratch_buffer_set_array_size.os)
libc_pic.a(scratch_buffer_set_array_size.os) has an undefined free
free fetches libc_pic.a(malloc.os)
libc_pic.a(malloc.os) has an undefined __libc_message
__libc_message fetches libc_pic.a(libc_fatal.os)
libc_fatal.os will cause a multiple definition error (__GI___libc_fatal)
>>> defined at dl-fxstatat64.c
>>> /tmp/p/glibc/Release/elf/dl-allobjs.os:(__GI___libc_fatal)
>>> defined at libc_fatal.c
>>> libc_fatal.os:(.text+0x240) in archive /tmp/p/glibc/Release/libc_pic.a
LLD processes --defsym after all input files, so this trick does not
suppress multiple definition errors with LLD. Split the step into two
and use an object file to make the intention more obvious and make LLD
work.
This is conceptually more appropriate because --defsym defines a SHN_ABS
symbol while a normal definition is relative to the image base.
See https://sourceware.org/pipermail/libc-alpha/2020-March/111910.html
for discussions about the --defsym semantics.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
For configurations with cross-compiling equal to 'maybe' or 'no',
ldconfig will not run and thus the ld.so.cache will not be created
on the container testroot.pristine.
This leads to failures on both tst-glibc-hwcaps-prepend-cache and
tst-ldconfig-ld_so_conf-update on environments where the same
compiler can be used to build different ABIs (powerpc and x86 for
instance).
This patch adds a new test-container hook, ldconfig.run, that
triggers a ldconfig execution prior to the test execution.
Checked on x86_64-linux-gnu and i686-linux-gnu.
elf/tst-prelink-cmp was initially added for x86 (commit fe534fe898) to validate
the fix for Bug 19178, and later applied to all architectures that use GLOB_DAT
relocations (commit 89569c8bb6). However, that bug only affected targets that
handle GLOB_DAT relocations as ELF_TYPE_CLASS_EXTERN_PROTECTED_DATA, so the test
should only apply to targets defining DL_EXTERN_PROTECTED_DATA, which gates the
usage of the elf type class above. For all other targets not meeting that
criterion, the test now returns with UNSUPPORTED status.
Fixes the test on POWER10 processors, which started using R_PPC64_GLOB_DAT.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Extern symbol access in position independent code usually involves GOT
indirection which needs RELATIVE reloc in a static linked PIE. (On
some targets this is avoided e.g. because the linker can relax a GOT
access to a pc-relative access, but this is not generally true.) Code
that runs before static PIE self relocation must avoid relying on
dynamic relocations which can be ensured by using hidden visibility.
However we cannot just make all symbols hidden:
On i386, all calls to IFUNC functions must go through PLT and calls to
hidden functions CANNOT go through PLT in PIE since EBX used in PIE PLT
may not be set up for local calls to hidden IFUNC functions.
This patch aims to make symbol references hidden in code that is used
before and by _dl_relocate_static_pie when building a static PIE libc.
Note: for an object that is used in the startup code, its references
and definition may not have consistent visibility: it is only forced
hidden in the startup code.
This is needed for fixing bug 27072.
Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
With static pie linking pointers in the tunables list need
RELATIVE relocs since the absolute address is not known at link
time. We want to avoid relocations so the static pie self
relocation can be done after tunables are initialized.
This is a simple fix that embeds the tunable strings into the
tunable list instead of using pointers. It is possible to have
a more compact representation of tunables with some additional
complexity in the generator and tunable parser logic. Such
optimization will be useful if the list of tunables grows.
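The difference between the two layouts can be sketched as follows.
The struct names, the 32-byte bound and the find_tunable helper are
hypothetical and much simpler than the generated tunable list; only the
idea of an embedded character array replacing a pointer is taken from
the description above.

```c
#include <stddef.h>
#include <string.h>

#define TUNABLE_NAME_MAX 32	/* Hypothetical bound for this sketch.  */

/* Pointer form: in a static PIE each name pointer needs a RELATIVE
   relocation, because the string's address is unknown at link time.  */
struct tunable_by_pointer
{
  const char *name;
  int id;
};

/* Embedded form: the bytes live inside the list entry itself, so the
   table contains no addresses and needs no relocation.  */
struct tunable_embedded
{
  char name[TUNABLE_NAME_MAX];
  int id;
};

static const struct tunable_embedded tunable_list[] =
{
  { "glibc.malloc.mmap_max", 0 },
  { "glibc.malloc.check", 1 },
};

/* Hypothetical lookup that only touches relocation-free data.  */
static const struct tunable_embedded *
find_tunable (const char *name)
{
  size_t count = sizeof tunable_list / sizeof tunable_list[0];
  for (size_t i = 0; i < count; ++i)
    if (strcmp (tunable_list[i].name, name) == 0)
      return &tunable_list[i];
  return NULL;
}
```

The cost is padding: every entry reserves the full name buffer, which
is the compactness trade-off mentioned above.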
There is still an issue that tunables_strdup allocates and the
failure handling code path is sufficiently complex that it can
easily have RELATIVE relocations. It is possible to avoid the
early allocation and only change environment variables in a
setuid exe after relocations are processed. But that is a
bigger change and early failure is fatal anyway so it is not
as critical to fix right away. This is bug 27181.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The representation of the tunables including type information and
the tunable list structure are only used in the implementation not
in the tunables api that is exposed to usage within glibc.
This patch moves the representation related definitions into the
existing dl-tunable-types.h and uses that only for implementation.
The tunable callback and related types are moved to dl-tunables.h
because they are part of the tunables api.
This reduces the details exposed in the tunables api so the internals
are easier to change.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since __libc_init_secure is called before ARCH_SETUP_TLS, it must use
"int $0x80" for system calls in i386 static PIE. Add startup_getuid,
startup_geteuid, startup_getgid and startup_getegid to <startup.h>.
Update __libc_init_secure to use them.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Set the default _dl_sysinfo in _dl_aux_init to avoid RELATIVE relocation
in static PIE.
This is needed for fixing bug 27072 on x86.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On x86, ifuncmain6pie failed with:
[hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
[hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
00003ff4 00000706 R_386_GLOB_DAT 0000400c foo_ptr
00003ff8 00000406 R_386_GLOB_DAT 00000000 foo
0000400c 00000401 R_386_32 00000000 foo
[hjl@gnu-cfl-2 build-i686-linux]$
Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
trigger the circular IFUNC dependency, and build ifuncmain6pie with
-Wl,-z,lazy.
Store ISA level in the portion of the unused upper 32 bits of the hwcaps
field in cache and the unused pad field in aux cache. ISA level is stored
and checked only for shared objects in glibc-hwcaps subdirectories. The
shared objects in the default directories aren't checked since there are
no fallbacks for these shared objects.
Tested on x86-64-v2, x86-64-v3 and x86-64-v4 machines with
--disable-hardcoded-path-in-tests and --enable-hardcoded-path-in-tests.
Since commit 2f056e8a5d
"aarch64: define PI_STATIC_AND_HIDDEN",
building glibc with gcc-8 on aarch64 fails with
/BLD/elf/librtld.os: in function `elf_get_dynamic_info':
/SRC/elf/get-dynamic-info.h:70:(.text+0xad8): relocation truncated to
fit: R_AARCH64_ADR_PREL_PG_HI21 against symbol `_rtld_local' defined
in .data section in /BLD/elf/librtld.os
This is a gcc bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98618
The bug is fixed on gcc-10 and not yet backported. gcc-9 is affected,
but the issue happens not to trigger in glibc; gcc-8 and older seem
to miscompile rtld.os.
Rewriting the affected code in elf_get_dynamic_info seems to make the
issue go away on <= gcc-9.
The change makes the logic a bit clearer too (by separating the index
computation and array update) and drops an older gcc workaround (since
gcc 4.6 is no longer supported).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
GCC 11 supports -march=x86-64-v[234] to enable x86 micro-architecture ISA
levels:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250
and -mneeded to emit GNU_PROPERTY_X86_ISA_1_NEEDED property with
GNU_PROPERTY_X86_ISA_1_V[234] marker:
https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/13
Binutils support for GNU_PROPERTY_X86_ISA_1_V[234] marker was added by
commit b0ab06937385e0ae25cebf1991787d64f439bf12
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Oct 30 06:49:57 2020 -0700
x86: Support GNU_PROPERTY_X86_ISA_1_BASELINE marker
and
commit 32930e4edbc06bc6f10c435dbcc63131715df678
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Oct 9 05:05:57 2020 -0700
x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker
GNU_PROPERTY_X86_ISA_1_NEEDED property in x86 ELF binaries indicates the
micro-architecture ISA level required to execute the binary. The marker
must be added by programmers explicitly in one of 3 ways:
1. Pass -mneeded to GCC.
2. Add the marker in the linker inputs as this patch does.
3. Pass -z x86-64-v[234] to the linker.
Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker support to ld.so if binutils 2.32 or newer is used to build glibc:
1. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
markers to elf.h.
2. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker to abi-note.o based on the ISA level used to compile abi-note.o,
assuming that the same ISA level is used to compile the whole glibc.
3. Add isa_1 to cpu_features to record the supported x86 ISA level.
4. Rename _dl_process_cet_property_note to _dl_process_property_note and
add GNU_PROPERTY_X86_ISA_1_V[234] marker detection.
5. Update _rtld_main_check and _dl_open_check to check loaded objects
with the incompatible ISA level.
6. Add a testcase to verify that dlopen an x86-64-v4 shared object fails
on lesser platforms.
7. Use <get-isa-level.h> in dl-hwcaps-subdirs.c and tst-glibc-hwcaps.c.
Tested under i686, x32 and x86-64 modes on x86-64-v2, x86-64-v3 and
x86-64-v4 machines.
Marked elf/tst-isa-level-1 with x86-64-v4, ran it on x86-64-v3 machine
and got:
[hjl@gnu-cfl-2 build-x86_64-linux]$ ./elf/tst-isa-level-1
./elf/tst-isa-level-1: CPU ISA level is lower than required
[hjl@gnu-cfl-2 build-x86_64-linux]$
I've updated copyright dates in glibc for 2021. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.
Please remember to include 2021 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Add a new glibc tunable: mem.tagging. This is a decimal constant in
the range 0-255 but used as a bit-field.
Bit 0 enables use of tagged memory in the malloc family of functions.
Bit 1 enables precise faulting of tag failure on platforms where this
can be controlled.
Other bits are currently unused, but if set will cause memory tag
checking for the current process to be enabled in the kernel.
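The bit assignments above could be decoded along these lines. The
struct and function names are hypothetical; this is only a sketch of
the bit-field interpretation, not the code that reads the tunable.

```c
#include <stdbool.h>

/* Hypothetical decoded view of the mem.tagging value.  */
struct mem_tagging_bits
{
  bool malloc_tagging;	/* Bit 0: tagged memory in the malloc family.  */
  bool precise_faults;	/* Bit 1: precise faulting of tag failures.  */
  bool other_bits_set;	/* Bits 2-7: currently unused.  */
};

/* Decode VALUE into *OUT.  Returns false if VALUE is outside the
   documented range 0-255.  */
static bool
decode_mem_tagging (unsigned int value, struct mem_tagging_bits *out)
{
  if (value > 255)
    return false;
  out->malloc_tagging = (value & 1u) != 0;
  out->precise_faults = (value & 2u) != 0;
  out->other_bits_set = (value & ~3u) != 0;
  return true;
}
```

For example, the decimal value 3 sets both documented bits, while 4
sets only a currently unused bit.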
Change sbrk to fail for !__libc_initial (in the generic
implementation). As a result, sbrk is (relatively) safe to use
for the __libc_initial case (from the main libc). It is therefore
no longer necessary to avoid using it in that case (or updating the
brk cache), and the __libc_initial flag does not need to be updated
as part of dlmopen or static dlopen.
As before, direct brk system calls on Linux may lead to memory
corruption.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Otherwise, it will not participate in the dependency sorting.
Fixes commit 9ffa50b26b
("elf: Include libc.so.6 as main program in dependency sort
(bug 20972)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The failure paths in _dl_map_object_from_fd did not clean every
potentially allocated resource up.
Handle l_phdr, l_libname and mapped segments in the common failure
handling code.
There are various bits that may not be cleaned properly on failure
(e.g. executable stack, incomplete dl_map_segments); fixing those
needs further changes.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
_dl_map_object_from_fd has complex error handling with cleanups.
It was managed by a separate function to avoid code bloat at
every failure case, but since the code was changed to use gotos
there is no longer such code bloat from inlining.
Maintaining a separate error handling function is harder as it
needs to access local state which has to be passed down. And the
same lose function was used in open_verify which is error prone.
The goto labels are changed since there is no longer a call.
The new code generates slightly smaller binary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since elf.h is a public header file copied to other projects,
try to make it free from spelling typos.
This change fixes the following spelling typos in comments of elf.h:
Auxialiary -> Auxiliary
tenatively -> tentatively
compatability -> compatibility
_dl_map_object_deps always sorts the initially loaded object first
during dependency sorting. This means it is relocated last in
dl_open_worker. This results in crashes in IFUNC resolvers without
lazy bindings if libraries are preloaded that refer to IFUNCs in
libc.so.6: the resolvers are called when libc.so.6 has not been
relocated yet, so references to _rtld_global_ro etc. crash.
The fix is to check against the libc.so.6 link map recorded by the
__libc_early_init framework, and let it participate in the dependency
sort.
This fixes bug 20972.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.
When the fd is not available (kernel loaded modules), then -1
is passed.
The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Program headers are processed in two passes: after the first pass
load segments are mmapped so in the second pass target specific
note processing logic can access the notes.
The second pass is moved later so various link_map fields are
set up that may be useful for note processing such as l_phdr.
The second pass should be before the fd is closed so that it is
available.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Subdirectories z13, z14, z15 can be selected, mostly based on the
level of support for vector instructions.
Co-Authored-By: Stefan Liebler <stli@linux.ibm.com>
The misattributed dependencies can cause failures in parallel testing
if the dependencies have not been built yet.
Fixes commit a332bd1518
("elf: Add elf/tst-dlopenfail-2 [BZ #25396]").
This recognizes the DL_CACHE_HWCAP_EXTENSION flag in cache entries,
and picks the supported cache entry with the highest priority.
The elf/tst-glibc-hwcaps-prepend-cache test documents a non-desired
aspect of the current cache implementation: If the cache selects a DSO
that does not exist on disk, _dl_map_object falls back to open_path,
which may or may not find an alternative implementation. This is an
existing limitation that also applies to the legacy hwcaps processing
for ld.so.cache.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Libraries from these subdirectories are added to the cache
with a special hwcap bit DL_CACHE_HWCAP_EXTENSION, so that
they are ignored by older dynamic loaders.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This simplifies the string table construction in elf/cache.c
because there is no more need to keep track of offsets explicitly;
the string table implementation does this internally.
This change slightly reduces the size of the cache on disk. The
file format does not change as a result. The strings are
null-terminated, without explicit length, so tail merging is
transparent to readers.
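Why tail merging is invisible to readers can be seen from a small
sketch. The function find_tail is hypothetical and uses a linear scan
for brevity; it is not how the string table implementation locates
candidates.

```c
#include <stddef.h>
#include <string.h>

/* TABLE holds TABLE_LEN bytes of consecutive null-terminated strings.
   Return the offset at which S already occurs as the tail of some
   entry, or (size_t) -1 if no entry ends with S.  */
static size_t
find_tail (const char *table, size_t table_len, const char *s)
{
  size_t len = strlen (s);
  size_t pos = 0;
  while (pos < table_len)
    {
      size_t entry_len = strlen (table + pos);
      /* A reader following the returned offset sees exactly S followed
         by the entry's own terminator.  */
      if (entry_len >= len
	  && strcmp (table + pos + entry_len - len, s) == 0)
	return pos + entry_len - len;
      pos += entry_len + 1;
    }
  return (size_t) -1;
}
```

If the table already contains "libc.so.6", a request for "so.6" can be
answered with an offset into that entry instead of storing new bytes.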
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This will be used in ldconfig to reduce the ld.so.cache size slightly.
Tail merging is an optimization where a pointer points into another
string if the first string is a suffix of the second string.
The hash function FNV-1a was chosen because it is simple and achieves
good dispersion even for short strings (so that the hash table bucket
count can be a power of two). It is clearly superior to the hsearch
hash and the ELF hash in this regard.
The hash table uses chaining for collision resolution.
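For reference, a minimal sketch of 32-bit FNV-1a with the standard
offset basis and prime, plus bucket selection for a power-of-two table.
The function names are hypothetical, and the 32-bit width is an
assumption of this sketch rather than something stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime.  */
static uint32_t
fnv1a (const char *s, size_t len)
{
  uint32_t h = 2166136261u;	/* FNV offset basis.  */
  for (size_t i = 0; i < len; ++i)
    {
      h ^= (unsigned char) s[i];
      h *= 16777619u;		/* FNV prime.  */
    }
  return h;
}

/* Bucket selection; BUCKET_COUNT must be a power of two so that
   masking is equivalent to the modulo operation.  */
static size_t
bucket_index (uint32_t hash, size_t bucket_count)
{
  return hash & (bucket_count - 1);
}
```

Masking keeps only the low bits of the hash, which is why good
dispersion in those bits matters for a power-of-two bucket count.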
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
A previously unused new-format header field is used to record
the address of an extension directory.
This change adds a demo extension which records the version of
ldconfig which builds a file.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use a reserved byte in the new format cache header to indicate whether
the file is in little endian or big endian format. Eventually, this
information could be used to provide a unified cache for qemu-user
and similar scenarios.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This hacks non-power-set processing into _dl_important_hwcaps.
Once the legacy hwcaps handling goes away, the subdirectory
handling needs to be reworked, but it is premature to do this
while both approaches are still supported.
ld.so supports two new arguments, --glibc-hwcaps-prepend and
--glibc-hwcaps-mask. Each accepts a colon-separated list of
glibc-hwcaps subdirectory names. The prepend option adds additional
subdirectories that are searched first, in the specified order. The
mask option restricts the automatically selected subdirectories to
those listed in the option argument. For example, on systems where
/usr/lib64 is on the library search path,
--glibc-hwcaps-prepend=valgrind:debug causes the dynamic loader to
search the directories /usr/lib64/glibc-hwcaps/valgrind and
/usr/lib64/glibc-hwcaps/debug just before /usr/lib64 is searched.
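The expansion in the example above can be sketched like this. The
function expand_hwcaps_prepend and its fixed 256-byte path buffers are
hypothetical; the sketch only shows how a colon-separated list maps to
directories for one library search path entry.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* For each name in the colon-separated LIST, write
   LIBDIR/glibc-hwcaps/NAME into OUT, preserving the listed order.
   Empty names are skipped.  Returns the number of paths written,
   at most MAX.  */
static size_t
expand_hwcaps_prepend (const char *libdir, const char *list,
		       char out[][256], size_t max)
{
  size_t n = 0;
  while (*list != '\0' && n < max)
    {
      size_t len = strcspn (list, ":");
      if (len > 0)
	{
	  snprintf (out[n], 256, "%s/glibc-hwcaps/%.*s",
		    libdir, (int) len, list);
	  ++n;
	}
      list += len;
      if (*list == ':')
	++list;
    }
  return n;
}
```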
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On GNU/Hurd we not only need $(common-objpfx) in LD_LIBRARY_PATH when loading
dynamic objects, but also $(common-objpfx)/mach and $(common-objpfx)/hurd. This
adds an ld-library-path variable to be used as LD_LIBRARY_PATH basis in
Makefiles, and a sysdep-ld-library-path variable for sysdeps to add some
more paths, here mach/ and hurd/.
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
We need NO_RTLD_HIDDEN because of the need for PLT calls in ld.so.
See Roland's comment in
https://sourceware.org/bugzilla/show_bug.cgi?id=15605
"in the Hurd it's crucial that calls like __mmap be the libc ones
instead of the rtld-local ones after the bootstrap phase, when the
dynamic linker is being used for dlopen and the like."
We used to just avoid all hidden use in the rtld; this commit switches to
keeping only those that should use PLT calls, i.e. essentially those defined in
sysdeps/mach/hurd/dl-sysdep.c:
__assert_fail
__assert_perror_fail
__*stat64
_exit
This fixes a few startup issues, notably the call to __tunable_get_val that is
made before PLTs are set up.
struct file_entry_new starts with the fields of struct file_entry,
so the code can be shared if the size computation is made dynamic.
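A sketch of the sharing idea, using simplified stand-in structs (the
names entry_old and entry_new and their exact fields are assumptions
of this example, not the definitions in the cache header). The entry
size is passed at run time, and only the common leading fields are
read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins: entry_new begins with the fields of entry_old.  */
struct entry_old
{
  int32_t flags;
  uint32_t key;
  uint32_t value;
};

struct entry_new
{
  int32_t flags;
  uint32_t key;
  uint32_t value;
  uint32_t osversion;
  uint64_t hwcap;
};

/* Sum the key fields of COUNT entries that are ENTRY_SIZE bytes apart.
   Works for either array because only the shared prefix is copied out.  */
static uint32_t
sum_keys (const void *entries, size_t count, size_t entry_size)
{
  const char *p = entries;
  uint32_t sum = 0;
  for (size_t i = 0; i < count; ++i)
    {
      struct entry_old e;
      /* memcpy avoids aliasing problems when the array really holds
         entry_new objects.  */
      memcpy (&e, p + i * entry_size, sizeof e);
      sum += e.key;
    }
  return sum;
}
```

The same function handles both formats when called with
sizeof (struct entry_old) or sizeof (struct entry_new) as the stride.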
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The elf/elf.h header is shared, verbatim, by the elfutils project.
However, elfutils can be used on systems with libcs other than glibc,
making the presence of __BEGIN_DECLS, __END_DECLS and <features.h> in
the file something that downstream distros may have to add patches for.
Furthermore, this file doesn't declare anything with language linkage,
so `extern "C" {}` blocks aren't necessary; it also doesn't have any
conditional definitions based on feature test macros, making inclusion
of features.h unnecessary.
The SXID_* tunable properties only influence processes that are
AT_SECURE, so make that a bit more explicit in the documentation and
comment.
Revisiting the code after a few years I managed to confuse myself, so
I imagine there could be others who may have incorrectly assumed like
I did that the SXID_ERASE tunables are not inherited by children of
non-AT_SECURE processes.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
They have been renamed from env_path_list and rtld_search_dirs to
avoid linknamespace issues.
This change will allow future use of these variables in diagnostics.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This requires defining a macro for the full path, matching the
-Wl,--dynamic-link= arguments used for linking glibc programs,
and ldd script.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This prints out version information for the dynamic loader and
exits immediately, without further command line processing
(which seems to match what some GNU tools do).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
--help processing is deferred to the point where the executable has
been loaded, so that it is possible to eventually include information
from the main executable in the help output.
As suggested in the GNU command-line interface guidelines, the help
message is printed to standard output, and the exit status is
successful.
Handle usage errors closer to the GNU command-line interface
guidelines.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Also add a comment to elf/Makefile, explaining why we cannot use
config.status for autoconf template processing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>