DTV setup at thread creation (_dl_allocate_tls_init) is changed
to take the dlopen lock, GL(dl_load_lock). Avoiding data races
here without locks would require design changes: the map that is
accessed for static TLS initialization here may be concurrently
freed by dlclose. That use after free may be solved by only
locking around static TLS setup or by ensuring dlclose does not
free modules with static TLS, however currently every link map
with TLS has to be accessed at least to see if it needs static
TLS. And even if that's solved, still a lot of atomics would be
needed to synchronize DTV related globals without a lock. So fix
both bug 19329 and bug 27111 with a lock that prevents DTV setup
running concurrently with dlopen or dlclose.
_dl_update_slotinfo at TLS access still does not use any locks
so CONCURRENCY NOTES are added to explain the synchronization.
The early exit from the slotinfo walk when max_modid is reached
is not strictly necessary, but does not hurt either.
An incorrect acquire load was removed from _dl_resize_dtv: it
did not synchronize with any release store or fence and
synchronization is now handled separately at thread creation
and TLS access time.
There are still a number of racy read accesses to globals that
will be changed to relaxed MO atomics in a followup patch. This
should not introduce regressions compared to existing behaviour
and avoid cluttering the main part of the fix.
Not all TLS access related data races got fixed here: there are
additional races at lazy tlsdesc relocations see bug 27137.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
All the stack lists are now in _rtld_global, so it is possible
to change stack permissions directly from there, instead of
calling into libpthread to do the change.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Permissions of the cached stacks may have to be updated if an object
is loaded that requires executable stacks, so the dynamic loader
needs to know about these cached stacks.
The move of in_flight_stack and stack_cache_actsize is a requirement for
merging __reclaim_stacks into the fork implementation in libc.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is an early variant of __tls_init_tp, primarily for initializing
thread-related elements of _rtld_global/GL.
Some existing initialization code not needed for NPTL is moved into
the generic version of this function.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
If libpthread is included in libc, it is not necessary to delay
initialization of the lock/unlock function pointers until libpthread
is loaded. This eliminates two unprotected function pointers
from _rtld_global and removes some initialization code from
libpthread.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt. This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.
The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros. They call the
__pthread_* functions unconditionally now. The macros are still
needed because shared code uses them; Hurd has different definitions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The stack list is available in ld.so since commit
1daccf403b ("nptl: Move stack list
variables into _rtld_global"), so it's possible to walk the stack
list directly in ld.so and perform the initialization there.
This eliminates an unprotected function pointer from _rtld_global
and reduces the libpthread initialization code.
TLS_INIT_TP is processor-specific, so it is not a good place to
put thread library initialization code (it would have to be repeated
for all CPUs). Introduce __tls_init_tp as a separate function,
to be called immediately after TLS_INIT_TP. Move the existing
stack list setup code for NPTL to this function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Calling free directly may end up freeing a pointer allocated by the
dynamic loader using malloc from libc.so in the base namespace using
the allocator from libc.so in a secondary namespace, which results in
crashes.
This commit redirects the free call through GLRO and the dynamic
linker, to reach the correct namespace. It also cleans up the dlerror
handling along the way, so that pthread_setspecific is no longer
needed (which avoids triggering bug 24774).
Commit 9e78f6f6e7 ("Implement
_dl_catch_error, _dl_signal_error in libc.so [BZ #16628]") has the
side effect that distinct namespaces, as created by dlmopen, now have
separate implementations of the rtld exception mechanism. This means
that the call to _dl_catch_error from libdl in a secondary namespace
does not actually install an exception handler because the
thread-local variable catch_hook in the libc.so copy in the secondary
namespace is distinct from that of the base namepace. As a result, a
dlsym/dlopen/... failure in a secondary namespace terminates the process
with a dynamic linker error because it looks to the exception handler
mechanism as if no handler has been installed.
This commit restores GLRO (dl_catch_error) and uses it to set the
handler in the base namespace.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It's necessary to stub out __libc_disable_asynccancel and
__libc_enable_asynccancel via rtld-stubbed-symbols because the new
direct references to the unwinder result in symbol conflicts when the
rtld exception handling from libc is linked in during the construction
of librtld.map.
unwind-forcedunwind.c is merged into unwind-resume.c. libc now needs
the functions that were previously only used in libpthread.
The GLIBC_PRIVATE exports of __libc_longjmp and __libc_siglongjmp are
no longer needed, so switch them to hidden symbols.
The symbol __pthread_unwind_next has been moved using
scripts/move-symbol-to-libc.py.
Reviewed-by: Adhemerva Zanella <adhemerval.zanella@linaro.org>
Remove generic tlsdesc code related to lazy tlsdesc processing since
lazy tlsdesc relocation is no longer supported. This includes removing
GL(dl_load_lock) from _dl_make_tlsdesc_dynamic which is only called at
load time when that lock is already held.
Added a documentation comment too.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
map is not valid to access here because it can be freed by a concurrent
dlclose: during tls access (via __tls_get_addr) _dl_update_slotinfo is
called without holding dlopen locks. So don't check the modid of map.
The map == 0 and map != 0 code paths can be shared (avoiding the dtv
resize in case of map == 0 is just an optimization: larger dtv than
necessary would be fine too).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since
commit a509eb117f
Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
the generation counter update is not needed in the failure path.
That commit ensures allocation in _dl_add_to_slotinfo happens before
the demarcation point in dlopen (it is called twice, first time is for
allocation only where dlopen can still be reverted on failure, then
second time actual dtv updates are done which then cannot fail).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The test dlopens a large number of modules with TLS, they are reused
from an existing test.
The test relies on the reuse of slotinfo entries after dlclose, without
bug 27135 fixed this needs a failing dlopen. With a slotinfo list that
has non-monotone increasing generation counters, bug 27136 can trigger.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The max modid is a valid index in the dtv, it should not be skipped.
The bug is observable if the last module has modid == 64 and its
generation is same or less than the max generation of the previous
modules. Then dtv[0].counter implies dtv[64] is initialized but
it isn't. Fixes bug 27136.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When parse_tunables tries to erase a tunable marked as SXID_ERASE for
setuid programs, it ends up setting the envvar string iterator
incorrectly, because of which it may parse the next tunable
incorrectly. Given that currently the implementation allows malformed
and unrecognized tunables pass through, it may even allow SXID_ERASE
tunables to go through.
This change revamps the SXID_ERASE implementation so that:
- Only valid tunables are written back to the tunestr string, because
of which children of SXID programs will only inherit a clean list of
identified tunables that are not SXID_ERASE.
- Unrecognized tunables get scrubbed off from the environment and
subsequently from the child environment.
- This has the side-effect that a tunable that is not identified by
the setxid binary, will not be passed on to a non-setxid child even
if the child could have identified that tunable. This may break
applications that expect this behaviour but expecting such tunables
to cross the SXID boundary is wrong.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Instead of passing GLIBC_TUNABLES via the environment, pass the
environment variable from parent to child. This allows us to test
multiple variables to ensure better coverage.
The test list currently only includes the case that's already being
tested. More tests will be added later.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The simplification of tunable_set interfaces took care of
signed/unsigned conversions while setting values, but comparison with
bounds ended up being incorrect; comparing TUNABLE_SIZE_T values for
example will fail because SIZE_MAX is seen as -1.
Add comparison helpers that take tunable types into account and use
them to do comparison instead.
dlopen updates libname_list by writing to lastp->next, but concurrent
reads in _dl_name_match_p were not synchronized when it was called
without holding GL(dl_load_lock), which can happen during lazy symbol
resolution.
This patch fixes the race between _dl_name_match_p reading lastp->next
and add_name_to_object writing to it. This could cause segfault on
targets with weak memory order when lastp->next->name is read, which
was observed on an arm system. Fixes bug 21349.
(Code is from Maninder Singh, comments and description is from Szabolcs
Nagy.)
Co-authored-by: Vaneet Narang <v.narang@samsung.com>
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This does not change the emitted code since __libc_start_main does not
return, but is important for formal flags compliance.
This also cleans up the cosmetic inconsistency in the stack protector
flags in csu, especially the incorrect value of STACK_PROTECTOR_LEVEL.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Enabling --enable-stack-protector=all causes the following tests to fail:
FAIL: elf/ifuncmain9picstatic
FAIL: elf/ifuncmain9static
Nick Alcock (who committed the stack protector code) marked the IFUNC
resolvers with inhibit_stack_protector when he done the original work and
suggested doing so again @ BZ #25680. This patch adds
inhibit_stack_protector to ifuncmain9.
After patch is applied, --enable-stack-protector=all does not fail the
above tests.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In this case, use the link map of the dynamic loader itself as
a replacement. This is more than just a hack: if we ever support
DT_RUNPATH/DT_RPATH for the dynamic loader, reporting it for
ld.so --help (without further command line arguments) would be the
right thing to do.
Fixes commit 3324213125 ("elf: Always
set l in _dl_init_paths (bug 23462)").
After d1d5471579 ("Remove dead
DL_DST_REQ_STATIC code.") we always setup the link map l to make the
static and shared cases the same. The bug is that in elf/dl-load.c
(_dl_init_paths) we conditionally set l only in the #ifdef SHARED
case, but unconditionally use it later. The simple solution is to
remove the #ifdef SHARED conditional, because it's no longer needed,
and unconditionally setup l for both the static and shared cases. A
regression test is added to run a static binary with
LD_LIBRARY_PATH='$ORIGIN' which crashes before the fix and runs after
the fix.
Co-Authored-By: Florian Weimer <fweimer@redhat.com>
It turns out the startup code in csu/elf-init.c has a perfect pair of
ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A
New Method to Bypass 64-bit Linux ASLR"). These functions are not
needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY
are already processed by the dynamic linker. However, the dynamic
linker skipped the main program for some reason. For maximum
backwards compatibility, this is not changed, and instead, the main
map is consulted from __libc_start_main if the init function argument
is a NULL pointer.
For statically linked binaries, the old approach based on linker
symbols is still used because there is nothing else available.
A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because
new binaries running on an old libc would not run their ELF
constructors, leading to difficult-to-debug issues.
The elision interfaces are closely aligned between the targets that
implement them, so declare them in the generic <lowlevellock.h>
file.
Empty .c stubs are provided, so that fewer makefile updates
under sysdeps are needed. Also simplify initialization via
__libc_early_init.
The symbols __lll_clocklock_elision, __lll_lock_elision,
__lll_trylock_elision, __lll_unlock_elision, __pthread_force_elision
move into libc. For the time being, non-hidden references are used
from libpthread to access them, but once that part of libpthread
is moved into libc, hidden symbols will be used again. (Hidden
references seem desirable to reduce the likelihood of transactions
aborts.)
The kernel does not put the vDSO at special addresses, so writev can
write the name directly. Also remove the incorrect comment about not
setting l_name.
Andy Lutomirski confirmed in
<https://lore.kernel.org/linux-api/442A16C0-AE5A-4A44-B261-FE6F817EAF3C@amacapital.net/>
that this copy is not necessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The glibc.malloc.mmap_max tunable as well as al of the INT_32 tunables
don't have use for negative values, so pin the hardcoded limits in the
non-negative range of INT. There's no real benefit in any of those
use cases for the extended range of unsigned, so I have avoided added
a new type to keep things simple.
The TUNABLE_SET interface took a primitive C type argument, which
resulted in inconsistent type conversions internally due to incorrect
dereferencing of types, especialy on 32-bit architectures. This
change simplifies the TUNABLE setting logic along with the interfaces.
Now all numeric tunable values are stored as signed numbers in
tunable_num_t, which is intmax_t. All calls to set tunables cast the
input value to its primitive type and then to tunable_num_t for
storage. This relies on gcc-specific (although I suspect other
compilers woul also do the same) unsigned to signed integer conversion
semantics, i.e. the bit pattern is conserved. The reverse conversion
is guaranteed by the standard.
Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from
AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack
space required in order to gurantee successful, non-nested handling
of a single signal whose handler is an empty function, and _SC_SIGSTKSZ
which is the suggested minimum number of bytes of stack space required
for a signal stack.
If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns
MINSIGSTKSZ. On Linux/x86 with XSAVE, the signal frame used by kernel
is composed of the following areas and laid out as:
------------------------------
| alignment padding |
------------------------------
| xsave buffer |
------------------------------
| fsave header (32-bit only) |
------------------------------
| siginfo + ucontext |
------------------------------
Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave
header (32-bit only) + size of siginfo and ucontext + alignment padding.
If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ
are redefined as
/* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). */
# undef SIGSTKSZ
# define SIGSTKSZ sysconf (_SC_SIGSTKSZ)
/* Minimum stack size for a signal handler: SIGSTKSZ. */
# undef MINSIGSTKSZ
# define MINSIGSTKSZ SIGSTKSZ
Compilation will fail if the source assumes constant MINSIGSTKSZ or
SIGSTKSZ.
The reason for not simply increasing the kernel's MINSIGSTKSZ #define
(apart from the fact that it is rarely used, due to glibc's shadowing
definitions) was that userspace binaries will have baked in the old
value of the constant and may be making assumptions about it.
For example, the type (char [MINSIGSTKSZ]) changes if this #define
changes. This could be a problem if an newly built library tries to
memcpy() or dump such an object defined by and old binary.
Bounds-checking and the stack sizes passed to things like sigaltstack()
and makecontext() could similarly go wrong.
The existing code specifies -Wl,--defsym=malloc=0 and other malloc.os
definitions before libc_pic.a so that libc_pic.a(malloc.os) is not
fetched. This trick is used to avoid multiple definition errors which
would happen as a chain result:
dl-allobjs.os has an undefined __libc_scratch_buffer_set_array_size
__libc_scratch_buffer_set_array_size fetches libc_pic.a(scratch_buffer_set_array_size.os)
libc_pic.a(scratch_buffer_set_array_size.os) has an undefined free
free fetches libc_pic.a(malloc.os)
libc_pic.a(malloc.os) has an undefined __libc_message
__libc_message fetches libc_pic.a(libc_fatal.os)
libc_fatal.os will cause a multiple definition error (__GI___libc_fatal)
>>> defined at dl-fxstatat64.c
>>> /tmp/p/glibc/Release/elf/dl-allobjs.os:(__GI___libc_fatal)
>>> defined at libc_fatal.c
>>> libc_fatal.os:(.text+0x240) in archive /tmp/p/glibc/Release/libc_pic.a
LLD processes --defsym after all input files, so this trick does not
suppress multiple definition errors with LLD. Split the step into two
and use an object file to make the intention more obvious and make LLD
work.
This is conceptually more appropriate because --defsym defines a SHN_ABS
symbol while a normal definition is relative to the image base.
See https://sourceware.org/pipermail/libc-alpha/2020-March/111910.html
for discussions about the --defsym semantics.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
For configurations with cross-compiling equal to 'maybe' or 'no',
ldconfig will not run and thus the ld.so.cache will not be created
on the container testroot.pristine.
This lead to failures on both tst-glibc-hwcaps-prepend-cache and
tst-ldconfig-ld_so_conf-update on environments where the same
compiler can be used to build different ABIs (powerpc and x86 for
instance).
This patch addas a new test-container hook, ldconfig.run, that
triggers a ldconfig execution prior the test execution.
Checked on x86_64-linux-gnu and i686-linux-gnu.
elf/tst-prelink-cmp was initially added for x86 (commit fe534fe898) to validate
the fix for Bug 19178, and later applied to all architectures that use GLOB_DAT
relocations (commit 89569c8bb6). However, that bug only affected targets that
handle GLOB_DAT relocations as ELF_TYPE_CLASS_EXTERN_PROTECTED_DATA, so the test
should only apply to targets defining DL_EXTERN_PROTECTED_DATA, which gates the
usage of the elf type class above. For all other targets not meeting that
criteria, the test now returns with UNSUPPORTED status.
Fixes the test on POWER10 processors, which started using R_PPC64_GLOB_DAT.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Extern symbol access in position independent code usually involves GOT
indirection which needs RELATIVE reloc in a static linked PIE. (On
some targets this is avoided e.g. because the linker can relax a GOT
access to a pc-relative access, but this is not generally true.) Code
that runs before static PIE self relocation must avoid relying on
dynamic relocations which can be ensured by using hidden visibility.
However we cannot just make all symbols hidden:
On i386, all calls to IFUNC functions must go through PLT and calls to
hidden functions CANNOT go through PLT in PIE since EBX used in PIE PLT
may not be set up for local calls to hidden IFUNC functions.
This patch aims to make symbol references hidden in code that is used
before and by _dl_relocate_static_pie when building a static PIE libc.
Note: for an object that is used in the startup code, its references
and definition may not have consistent visibility: it is only forced
hidden in the startup code.
This is needed for fixing bug 27072.
Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
With static pie linking pointers in the tunables list need
RELATIVE relocs since the absolute address is not known at link
time. We want to avoid relocations so the static pie self
relocation can be done after tunables are initialized.
This is a simple fix that embeds the tunable strings into the
tunable list instead of using pointers. It is possible to have
a more compact representation of tunables with some additional
complexity in the generator and tunable parser logic. Such
optimization will be useful if the list of tunables grows.
There is still an issue that tunables_strdup allocates and the
failure handling code path is sufficiently complex that it can
easily have RELATIVE relocations. It is possible to avoid the
early allocation and only change environment variables in a
setuid exe after relocations are processed. But that is a
bigger change and early failure is fatal anyway so it is not
as critical to fix right away. This is bug 27181.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The representation of the tunables including type information and
the tunable list structure are only used in the implementation not
in the tunables api that is exposed to usage within glibc.
This patch moves the representation related definitions into the
existing dl-tunable-types.h and uses that only for implementation.
The tunable callback and related types are moved to dl-tunables.h
because they are part of the tunables api.
This reduces the details exposed in the tunables api so the internals
are easier to change.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since __libc_init_secure is called before ARCH_SETUP_TLS, it must use
"int $0x80" for system calls in i386 static PIE. Add startup_getuid,
startup_geteuid, startup_getgid and startup_getegid to <startup.h>.
Update __libc_init_secure to use them.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Set the default _dl_sysinfo in _dl_aux_init to avoid RELATIVE relocation
in static PIE.
This is needed for fixing bug 27072 on x86.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On x86, ifuncmain6pie failed with:
[hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
[hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
00003ff4 00000706 R_386_GLOB_DAT 0000400c foo_ptr
00003ff8 00000406 R_386_GLOB_DAT 00000000 foo
0000400c 00000401 R_386_32 00000000 foo
[hjl@gnu-cfl-2 build-i686-linux]$
Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
trigger the circular IFUNC dependency, and build ifuncmain6pie with
-Wl,-z,lazy.