l_contiguous was not initialized at all for the main map and
always 0. This commit adds code to check if the LOAD segments
are adjacent to each other, and sets l_contiguous accordingly.
This helps _dl_find_object because it is more efficient if the
main mapping is contiguous.
Note that not all (PIE or non-PIE) binaries are contiguous in this
way because BFD ld creates executables with LOAD holes:
ELF LOAD segments creating holes in the process image on GNU/Linux
https://sourceware.org/pipermail/binutils/2022-January/119082.htmlhttps://sourceware.org/bugzilla/show_bug.cgi?id=28743
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The usage of internal static symbol for statically linked binaries
does not work correctly for objects built with -D_TIME_BITS=64,
since the internal definition does not provide the expected aliases.
This patch makes it to use the default stat functions instead (which
uses the default 64 time_t alias and types).
Checked on i686-linux-gnu.
With the current set of fences, the version update at the start
of the TM write operation is redundant, and the version update
at the end does not need to use an atomic read-modify-write
operation.
Also use relaxed MO stores during the dlclose update, and skip any
version changes there.
Suggested-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
As explained in Hans Boehm, Can Seqlocks Get Along with Programming
Language Memory Models?, an acquire fence is needed in
_dlfo_read_success. The lack of a fence resulted in an observable
bug on powerpc64le compile-time load reordering.
The fence in _dlfo_mappings_begin_update has been reordered, turning
the fence/store sequence into a release MO store equivalent.
Relaxed MO loads are used on the reader side, and relaxed MO stores
on the writer side for the shared data, to avoid formal data races.
This is just to be conservative; it should not actually be necessary
given how the data is used.
This commit also fixes the test run time. The intent was to run it
for 3 seconds, but 0.3 seconds was enough to uncover the bug very
occasionally (while 3 seconds did not reliably show the bug on every
test run).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
DT_RUNPATH is only used to find the immediate dependencies of the
executable or shared object containing the DT_RUNPATH entry. Update
LD_AUDIT dlopen call to try the DT_RUNPATH entry of the executable.
Add tst-audit14a, which is copied from tst-audit14, to DT_RUNPATH and
build tst-audit14 with -Wl,--disable-new-dtags to test DT_RPATH.
This partially fixes BZ #28455.
Add <dl-debug.h> to setup debugging entry in PT_DYNAMIC segment to support
DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.
Tested on x86-64, x32 and i686 as well as with build-many-glibcs.py.
I've updated copyright dates in glibc for 2022. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.
Please remember to include 2022 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
It can be used to speed up the libgcc unwinder, and the internal
_dl_find_dso_for_object function (which is used for caller
identification in dlopen and related functions, and in dladdr).
_dl_find_object is in the internal namespace due to bug 28503.
If libgcc switches to _dl_find_object, this namespace issue will
be fixed. It is located in libc for two reasons: it is necessary
to forward the call to the static libc after static dlopen, and
there is a link ordering issue with -static-libgcc and libgcc_eh.a
because libc.so is not a linker script that includes ld.so in the
glibc build tree (so that GCC's internal -lc after libgcc_eh.a does
not pick up ld.so).
It is necessary to do the i386 customization in the
sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because
otherwise, multilib installations are broken.
The implementation uses software transactional memory, as suggested
by Torvald Riegel. Two copies of the supporting data structures are
used, also achieving full async-signal-safety.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The dl_main sets the LM_ID_BASE to RT_ADD just before starting to
add load new shared objects. The state is set to RT_CONSISTENT just
after all objects are loaded.
However if a audit modules tries to dlmopen an inexistent module,
the _dl_open will assert that the namespace is in an inconsistent
state.
This is different than dlopen, since first it will not use
LM_ID_BASE and second _dl_map_object_from_fd is the sole responsible
to set and reset the r_state value.
So the assert on _dl_open can not really be seen if the state is
consistent, since _dt_main resets it. This patch removes the assert.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The vDSO is is listed in the link_map chain, but is never the subject of
an la_objopen call. A new internal flag __RTLD_VDSO is added that
acts as __RTLD_OPENEXEC to allocate the required 'struct auditstate'
extra space for the 'struct link_map'.
The return value from the callback is currently ignored, since there
is no PLT call involved by glibc when using the vDSO, neither the vDSO
are exported directly.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The rtld-audit interfaces introduces a slowdown due to enabling
profiling instrumentation (as if LD_AUDIT implied LD_PROFILE).
However, instrumenting is only necessary if one of audit libraries
provides PLT callbacks (la_pltenter or la_pltexit symbols). Otherwise,
the slowdown can be avoided.
The following patch adjusts the logic that enables profiling to iterate
over all audit modules and check if any of those provides a PLT hook.
To keep la_symbind to work even without PLT callbacks, _dl_fixup now
calls the audit callback if the modules implements it.
Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_pltexit audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_pltenter audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_preinit audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_symbind{32,64} audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objclose audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objsearch audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_activity audit
callback.
Also for a new Lmid_t the namespace link_map list are empty, so it
requires to check if before using it. This can happen for when audit
module is used along with dlmopen.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objopen audit callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove AArch64 from comment for AT_MINSIGSTKSZ to match
commit 7cd60e43a6def40ecb75deb8decc677995970d0b
Author: Chang S. Bae <chang.seok.bae@intel.com>
Date: Tue May 18 13:03:15 2021 -0700
uapi/auxvec: Define the aux vector AT_MINSIGSTKSZ
Define AT_MINSIGSTKSZ in the generic uapi header. It is already used
as generic ABI in glibc's generic elf.h, and this define will prevent
future namespace conflicts. In particular, x86 is also using this
generic definition.
in Linux kernel 5.14.
p_align does not have to be a multiple of the page size. Only PT_LOAD
segment layout should be aligned to the page size.
1: Remove p_align check against the page size.
2. Use the page size, instead of p_align, to check PT_LOAD segment layout.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
With the morecore hook removed, there is not easy way to provide huge
pages support on with glibc allocator without resorting to transparent
huge pages. And some users and programs do prefer to use the huge pages
directly instead of THP for multiple reasons: no splitting, re-merging
by the VM, no TLB shootdowns for running processes, fast allocation
from the reserve pool, no competition with the rest of the processes
unlike THP, no swapping all, etc.
This patch extends the 'glibc.malloc.hugetlb' tunable: the value
'2' means to use huge pages directly with the system default size,
while a positive value means and specific page size that is matched
against the supported ones by the system.
Currently only memory allocated on sysmalloc() is handled, the arenas
still uses the default system page size.
To test is a new rule is added tests-malloc-hugetlb2, which run the
addes tests with the required GLIBC_TUNABLE setting. On systems without
a reserved huge pages pool, is just stress the mmap(MAP_HUGETLB)
allocation failure. To improve test coverage it is required to create
a pool with some allocated pages.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux Transparent Huge Pages (THP) current supports three different
states: 'never', 'madvise', and 'always'. The 'never' is
self-explanatory and 'always' will enable THP for all anonymous
pages. However, 'madvise' is still the default for some system and
for such case THP will be only used if the memory range is explicity
advertise by the program through a madvise(MADV_HUGEPAGE) call.
To enable it a new tunable is provided, 'glibc.malloc.hugetlb',
where setting to a value diffent than 0 enables the madvise call.
This patch issues the madvise(MADV_HUGEPAGE) call after a successful
mmap() call at sysmalloc() with sizes larger than the default huge
page size. The madvise() call is disable is system does not support
THP or if it has the mode set to "never" and on Linux only support
one page size for THP, even if the architecture supports multiple
sizes.
To test is a new rule is added tests-malloc-hugetlb1, which run the
addes tests with the required GLIBC_TUNABLE setting.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add <tst-file-align.h> to support target specific ALIGN for variable
alignment test:
1. Alpha: Use 0x10000.
2. MicroBlaze and Nios II: Use 0x8000.
3. All others: Use 0x200000.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On Linux/x86-64, for elf/tst-align3, we now get
munmap(0x7f88f9401000, 1126424) = 0
instead of
munmap(0x7f1615200018, 544768) = -1 EINVAL (Invalid argument)
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The default has to change eventually, and there are no known failures
that require a delay.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When PT_LOAD segment alignment > the page size, allocate enough space to
ensure that the segment can be properly aligned. This change helps code
segments use huge pages become simple and available.
This fixes [BZ #28676].
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
This makes ld.so features such as --preload, --audit,
and --list-diagnostics more accessible to end users because they
do not need to know the ABI name of the dynamic loader.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
TLS_INIT_TCB_ALIGN is not actually used. TLS_TCB_ALIGN was likely
introduced to support a configuration where the thread pointer
has not the same alignment as THREAD_SELF. Only ia64 seems to use
that, but for the stack/pointer guard, not for storing tcbhead_t.
Some ports use TLS_TCB_OFFSET and TLS_PRE_TCB_SIZE to shift
the thread pointer, potentially landing in a different residue class
modulo the alignment, but the changes should not impact that.
In general, given that TLS variables have their own alignment
requirements, having different alignment for the (unshifted) thread
pointer and struct pthread would potentially result in dynamic
offsets, leading to more complexity.
hppa had different values before: __alignof__ (tcbhead_t), which
seems to be 4, and __alignof__ (struct pthread), which was 8
(old default) and is now 32. However, it defines THREAD_SELF as:
/* Return the thread descriptor for the current thread. */
# define THREAD_SELF \
({ struct pthread *__self; \
__self = __get_cr27(); \
__self - 1; \
})
So the thread pointer points after struct pthread (hence __self - 1),
and they have to have the same alignment on hppa as well.
Similarly, on ia64, the definitions were different. We have:
# define TLS_PRE_TCB_SIZE \
(sizeof (struct pthread) \
+ (PTHREAD_STRUCT_END_PADDING < 2 * sizeof (uintptr_t) \
? ((2 * sizeof (uintptr_t) + __alignof__ (struct pthread) - 1) \
& ~(__alignof__ (struct pthread) - 1)) \
: 0))
# define THREAD_SELF \
((struct pthread *) ((char *) __thread_self - TLS_PRE_TCB_SIZE))
And TLS_PRE_TCB_SIZE is a multiple of the struct pthread alignment
(confirmed by the new _Static_assert in sysdeps/ia64/libc-tls.c).
On m68k, we have a larger gap between tcbhead_t and struct pthread.
But as far as I can tell, the port is fine with that. The definition
of TCB_OFFSET is sufficient to handle the shifted TCB scenario.
This fixes commit 23c77f6018
("nptl: Increase default TCB alignment to 32").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Programs without dynamic dependencies and without a program
interpreter are now run via execve.
Previously, the dynamic linker either crashed while attempting to
read a non-existing dynamic segment (looking for DT_AUDIT/DT_DEPAUDIT
data), or the self-relocated in the static PIE executable crashed
because the outer dynamic linker had already applied RELRO protection.
<dl-execve.h> is needed because execve is not available in the
dynamic loader on Hurd.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>