vfprintf is entangled with vfwprintf (of course), __printf_fp,
__printf_fphex, __vstrfmon_l_internal, and the strfrom family of
functions. The latter use the internal snprintf functionality,
so vsnprintf is converted as well.
The simples conversion is __printf_fphex, followed by
__vstrfmon_l_internal and __printf_fp, and finally
__vfprintf_internal and __vfwprintf_internal. __vsnprintf_internal
and strfrom* are mostly consuming the new interfaces, so they
are comparatively simple.
__printf_fp is a public symbol, so the FILE *-based interface
had to preserved.
The __printf_fp rewrite does not change the actual binary-to-decimal
conversion algorithm, and digits are still not emitted directly to
the target buffer. However, the staging buffer now uses bytes
instead of wide characters, and one buffer copy is eliminated.
The changes are at least performance-neutral in my testing.
Floating point printing and snprintf improved measurably, so that
this Lua script
for i=1,5000000 do
print(i, i * math.pi)
end
runs about 5% faster for me. To preserve fprintf performance for
a simple "%d" format, this commit has some logic changes under
LABEL (unsigned_number) to avoid additional function calls. There
are certainly some very easy performance improvements here: binary,
octal and hexadecimal formatting can easily avoid the temporary work
buffer (the number of digits can be computed ahead-of-time using one
of the __builtin_clz* built-ins). Decimal formatting can use a
specialized version of _itoa_word for base 10.
The existing (inconsistent) width handling between strfmon and printf
is preserved here. __print_fp_buffer_1 would have to use
__translated_number_width to achieve ISO conformance for printf.
Test expectations in libio/tst-vtables-common.c are adjusted because
the internal staging buffer merges all virtual function calls into
one.
In general, stack buffer usage is greatly reduced, particularly for
unbuffered input streams. __printf_fp can still use a large buffer
in binary128 mode for %g, though.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These buffers will eventually be used instead of FILE * objects
to implement printf functions. The multibyte buffer is struct
__printf_buffer, the wide buffer is struct __wprintf_buffer.
To enable writing type-generic code, the header files
printf_buffer-char.h and printf_buffer-wchar_t.h define the
Xprintf macro differently, enabling Xprintf (buffer) to stand
for __printf_buffer and __wprintf_buffer as appropriate. For
common cases, macros like Xprintf_buffer are provided as a more
syntactically convenient shortcut.
Buffer-specific flush callbacks are implemented with a switch
statement instead of a function pointer, to avoid hardening issues
similar to those of libio vtables. struct __printf_buffer_as_file
is needed to support custom printf specifiers because the public
interface for that requires passing a FILE *, which is why there
is a trapdoor back from these buffers to FILE * streams.
Since the immediate user of these interfaces knows when processing
has finished, there is no flush callback for the end of processing,
only a flush callback for the intermediate buffer flush.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Current scheme only consideres the first argument for both --required
and --optional, where the idea is to append a new item.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The static linker might impose any order or internal function
position, so change the test to check if the audit prints the
symbol only once in any order.
Check the return value of malloc based on the function header comment of
_dl_make_tlsdesc_dynamic(). If the return value fails, NULL is returned.
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This makes it more likely that the compiler can compute the strlen
argument in _startup_fatal at compile time, which is required to
avoid a dependency on strlen this early during process startup.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The old exception handling implementation used function interposition
to replace the dynamic loader implementation (no TLS support) with the
libc implementation (TLS support). This results in problems if the
link order between the dynamic loader and libc is reversed (bug 25486).
The new implementation moves the entire implementation of the
exception handling functions back into the dynamic loader, using
THREAD_GETMEM and THREAD_SETMEM for thread-local data support.
These depends on Hurd support for these macros, added in commit
b65a82e4e7 ("hurd: Add THREAD_GET/SETMEM/_NC").
One small obstacle is that the exception handling facilities are used
before the TCB has been set up, so a check is needed if the TCB is
available. If not, a regular global variable is used to store the
exception handling information.
Also rename dl-error.c to dl-catch.c, to avoid confusion with the
dlerror function.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The maximum number of directives is already limited by the maximum
value of iovec, and current padding usage on _dl_map_object_from_fd
specifies a value of 16 (2 times sizeof (void *)) in hexa, which is
less than the INT_STRLEN_BOUND(void *) (20 for LP64).
This works if pointers are larger than 8 bytes, for instance 16.
In this case the maximum padding would be 32 and the IFMTSIZE would
be 40.
The resulting code does use a slightly larger static stack, the
output of -fstack-usage (for x86_64):
* master:
dl-printf.c:35:1:_dl_debug_vdprintf 1344 dynamic
* patch:
dl-printf.c:36:1:_dl_debug_vdprintf 2416 static
However, there is an improvement in code generation:
* master
text data bss dec hex filename
3309 0 0 3309 ced elf/dl-printf.os
* patch
text data bss dec hex filename
3151 0 0 3151 c4f elf/dl-printf.os
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
When --enable-hardcoded-path-in-tests is used only with DT_RUNPATH,
elf/tst-relr3 and elf/tst-relr4 failed to run. Their dependency
libraries, tst-relr-mod3a.so and tst-relr-mod4a.so, are failed to
load since DT_RUNPATH on executable doesn't apply to them. Build
tst-relr-mod3a.so and tst-relr-mod4a.so with $(LDFLAGS-rpath-ORIGIN)
to add DT_RUNPATH for their dependency libraries.
Reviewed-by: Fangrui Song <maskray@google.com>
By default glibc only allocates enough static TLS for 4 link namespaces
including the initial one. So only use 3 dlmopens in the test.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The alloca size did not consider the optional width parameter for
padding which could cause buffer underflow. The width is currently used
e.g. by _dl_map_object_from_fd which passes 2 * sizeof(void *) which
can be larger than the alloca buffer size on targets where
sizeof(void *) >= 2 * sizeof(unsigned long).
Even if large width is not used on existing targets it is better to fix
the formatting code to avoid surprises.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This consolidates the destructor invocations from _dl_fini and
dlclose. Remove the micro-optimization that avoids
calling _dl_call_fini if they are no destructors (as dlclose is quite
expensive anyway). The debug log message is now printed
unconditionally.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This allows the rest of dynamic loader to check whether the TCB
has been set up (and THREAD_GETMEM and THREAD_SETMEM will work).
Reviewed-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
The data in the _ns_debug member must be preserved, otherwise
_dl_debug_initialize enters an infinite loop. To be conservative,
only clear the libc_map member for now, to fix bug 29528.
Fixes commit d0e357ff45
("elf: Call __libc_early_init for reused namespaces (bug 29528)"),
by reverting most of it.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Besides the option being gcc specific, this approach is still fragile
and not future proof since we do not know if this will be the only
optimization option gcc will add that transforms loops to memset
(or any libcall).
This patch adds a new header, dl-symbol-redir-ifunc.h, that can b
used to redirect the compiler generated libcalls to port the generic
memset implementation if required.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The print_hwcap_1 machinery was useful for the legacy hwcaps
subdirectories, but it is not worth the trouble now that they
are gone.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
They were set to 0 on initialisation and never changed later after
last commit.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Last commit made it so that the value passed for that parameter was
always 0 at its only call site.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove support for the legacy hwcaps subdirectories from ldconfig.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove support for the legacy hwcaps subdirectories from the dynamic
loader.
Signed-off-by: Javier Pello <devel@otheo.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The Z modifier is a nonstandard synonymn for z (that predates z
itself) and compiler might issue an warning for in invalid
conversion specifier.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The need to maintain elf/elf.h and scripts/glibcelf.py in parallel
results in a backporting hazard: they need to be kept in sync to
avoid elf/tst-glibcelf consistency check failures. glibcelf (unlike
tst-glibcelf) does not use the C implementation to extract constants.
This applies the additional glibcpp syntax checks to <elf.h>.
This changereplaces the types derived from Python enum types with
custom types _TypedConstant, _IntConstant, and _FlagConstant. These
types have fewer safeguards, but this also allows incremental
construction and greater flexibility for grouping constants among
the types. Architectures-specific named constants are now added
as members into their superclasses (but value-based lookup is
still restricted to generic constants only).
Consequently, check_duplicates in elf/tst-glibcelf has been adjusted
to accept differently-named constants of the same value if their
subtypes are distinct. The ordering check for named constants
has been dropped because they are no longer strictly ordered.
Further test adjustments: Some of the type names are different.
The new types do not support iteration (because it is unclear
whether iteration should cover the all named values (including
architecture-specific constants), or only the generic named values),
so elf/tst-glibcelf now uses by_name explicit (to get all constants).
PF_HP_SBP and PF_PARISC_SBP are now of distinct types (PfHP and
PfPARISC), so they are how both present on the Python side. EM_NUM
and PT_NUM are filtered (which was an oversight in the old
conversion).
The new version of glibcelf should also be compatible with earlier
Python versions because it no longer depends on the enum module and its
advanced features.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The implementation in _dl_close_worker requires that the first
element of l_initfini is always this very map (“We are always the
zeroth entry, and since we don't include ourselves in the
dependency analysis start at 1.”). Rather than fixing that
assumption, this commit adds an implementation of the force_first
argument to the new dependency sorting algorithm. This also means
that the directly dlopen'ed shared object is always initialized last,
which is the least surprising behavior in the presence of cycles.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
make-4.4 will add long flags to MAKEFLAGS variable:
* WARNING: Backward-incompatibility!
Previously only simple (one-letter) options were added to the MAKEFLAGS
variable that was visible while parsing makefiles. Now, all options
are available in MAKEFLAGS.
This causes locale builds to fail when long options are used:
$ make --shuffle
...
make -C localedata install-locales
make: invalid shuffle mode: '1662724426r'
The change fixes it by passing eash option via whitespace and dashes.
That way option is appended to both single-word form and whitespace
separated form.
While at it fixed --silent mode detection in $(MAKEFLAGS) by filtering
out --long-options. Otherwise options like --shuffle flag enable silent
mode unintentionally. $(silent-make) variable consolidates the checks.
Resolves: BZ# 29564
CC: Paul Smith <psmith@gnu.org>
CC: Siddhesh Poyarekar <siddhesh@gotplt.org>
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Commit dad90d5282 added glibc-hwcaps
support for LD_LIBRARY_PATH and, for this, it adjusted the total
string size required in _dl_important_hwcaps. However, in doing so
it inadvertently altered the calculation of the size required for
the power set strings, as the computation of the power set string
size depended on the first value assigned to the total variable,
which is later shifted, resulting in overallocation of string
space. Fix this now by using a different variable to hold the
string size required for glibc-hwcaps.
Signed-off-by: Javier Pello <devel@otheo.eu>
This did not cause a warning before because the token sequence for
the two definitions was identical.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The d7703d3176 changed how vDSO like
dependencies are printed, instead of just the name and address it
follows other libraries mode and prints 'name => path'.
Unfortunately, this broke some ldd consumer that uses the output to
filter out the program's dependencies. For instance CMake
bundleutilities module [1], where GetPrequirite uses the regex to filter
out 'name => path' [2].
This patch restore the previous way to print just the name and the
mapping address.
Checked on x86_64-linux-gnu.
[1] https://github.com/Kitware/CMake/tree/master/Tests/BundleUtilities
[2] https://github.com/Kitware/CMake/blob/master/Modules/GetPrerequisites.cmake#L733
Reviewed-by: Florian Weimer <fweimer@redhat.com>
libc_map is never reset to NULL, neither during dlclose nor on a
dlopen call which reuses the namespace structure. As a result, if a
namespace is reused, its libc is not initialized properly. The most
visible result is a crash in the <ctype.h> functions.
To prevent similar bugs on namespace reuse from surfacing,
unconditionally initialize the chosen namespace to zero using memset.
This reverts commit 6f85dbf102.
Once this change hits the release branches, it will require relinking
of all statically linked applications before static dlopen works
again, for the majority of updates on release branches: The NEWS file
is regularly updated with bug references, so the __libc_early_init
suffix changes, and static dlopen cannot find the function anymore.
While this ABI check is still technically correct (we do require
rebuilding & relinking after glibc updates to keep static dlopen
working), it is too drastic for stable release branches.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
contribute to the version fingerprint used for detection. The
fingerprint can be further refined using the --with-extra-version-id
configure argument.
_dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init.
The new function is used store a pointer to libc.so's
__libc_early_init function in the libc_map_early_init member of the
ld.so namespace structure. This function pointer can then be called
directly, so the separate invocation function is no longer needed.
The versioned symbol lookup needs the symbol versioning data
structures, so the initialization of libc_map and libc_map_early_init
is now done from _dl_check_map_versions, after this information
becomes available. (_dl_map_object_from_fd does not set this up
in time, so the initialization code had to be moved from there.)
This means that the separate initialization code can be removed from
dl_main because _dl_check_map_versions covers all maps, including
the initial executable loaded by the kernel. The lookup still happens
before relocation and the invocation of IFUNC resolvers, so IFUNC
resolvers are protected from ABI mismatch.
The __libc_early_init function pointer is not protected because
so little code runs between the pointer write and the invocation
(only dynamic linker code and IFUNC resolvers).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
ELF and GNU hashes can now be computed using the elf_hash and
gnu_hash functions.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The test is valid for all TLS models, but we want to make a reasonable
effort to test the GNU2 model specifically. For example, aarch64
defaults to GNU2, but does not have -mtls-dialect=gnu2, and the test
was not run there.
Suggested-by: Martin Coufal <mcoufal@redhat.com>
GCC normally does this optimization for us in
strlen_pass::handle_builtin_strcpy but only for optimized
build. To avoid needing to include strcpy.S in the rtld build to
support the debug build, just do the optimization by hand.
The older libc versions are obsolete for over twenty years now.
This patch removes the special flags for libc5 and libc4 and assumes
that all libraries cached are libc6 compatible and use FLAG_ELF_LIBC6.
Checked with a build for all affected architectures.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Redirect internal assertion failures to __libc_assert_fail, based on
based on __libc_message, which writes directly to STDERR_FILENO
and calls abort. Also disable message translation and reword the
error message slightly (adjusting stdlib/tst-bz20544 accordingly).
As a result of these changes, malloc no longer needs its own
redefinition of __assert_fail.
__libc_assert_fail needs to be stubbed out during rtld dependency
analysis because the rtld rebuilds turn __libc_assert_fail into
__assert_fail, which is unconditionally provided by elf/dl-minimal.c.
This change is not possible for the public assert macro and its
__assert_fail function because POSIX requires that the diagnostic
is written to stderr.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The fix done b2cd93fce6 does not really
work since macro strification does not expand the sizeof nor the
arithmetic operation.
Checked on x86_64-linux-gnu.
NODELETE status is propagated from the referencing object to the
referenced object, not the other way round. The code is correct, only
the log message has the wrong direction.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Using -Werror and -DNDEBUG at the same time will trigger the
following compiler error:
cache.c: In function 'save_cache':
cache.c:758:15: error: unused variable 'old_offset' [-Werror=unused-variable]
758 | off64_t old_offset = lseek64 (fd, extension_offset, SEEK_SET);
| ^~~~~~~~~~
-DNDEBUG disables the assertion, making old_offset unused.
Use __attribute__ ((unused)) to disable this warning.
By adding an internal alias to avoid the GOT indirection.
On some architecture, __libc_single_thread may be accessed through
copy relocations and thus it requires to update also the copies
default copy.
This is done by adding a new internal macro,
libc_hidden_data_{proto,def}, which has an addition argument that
specifies the alias name (instead of default __GI_ one).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Fangrui Song <maskray@google.com>
If an executable has copy relocations for extern protected data, that
can only work if the library containing the definition is built with
assumptions (a) the compiler emits GOT-generating relocations (b) the
linker produces R_*_GLOB_DAT instead of R_*_RELATIVE. Otherwise the
library uses its own definition directly and the executable accesses a
stale copy. Note: the GOT relocations defeat the purpose of protected
visibility as an optimization, but allow rtld to make the executable and
library use the same copy when copy relocations are present, but it
turns out this never worked perfectly.
ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange semantics when both
a.so and b.so define protected var and the executable copy relocates
var: b.so accesses its own copy even with GLOB_DAT. The behavior change
is from commit 62da1e3b00 (x86) and then
copied to nios2 (ae5eae7cfc) and arc
(0e7d930c4c).
Without ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA, b.so accesses the copy
relocated data like a.so.
There is now a warning for copy relocation on protected symbol since
commit 7374c02b68. It's extremely
unlikely anyone relies on the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA
behavior, so let's remove it: this removes a check in the symbol lookup
code.
Newer versions of GNU grep (after grep 3.7, not inclusive) will warn on
'egrep' and 'fgrep' invocations.
Convert usages within the tree to their expanded non-aliased counterparts
to avoid irritating warnings during ./configure and the test suite.
Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Fangrui Song <maskray@google.com>
When the first object providing foo defines both foo@v1 and foo@@v2,
dlsym(RTLD_NEXT, "foo") returns foo@v1 while dlsym(RTLD_DEFAULT, "foo")
returns foo@@v2. The issue is that RTLD_DEFAULT uses the
DL_LOOKUP_RETURN_NEWEST flag while RTLD_NEXT doesn't. Fix the RTLD_NEXT
branch to use DL_LOOKUP_RETURN_NEWEST.
Note: the new behavior matches FreeBSD rtld. Future sanitizers will not
need to add versioned interceptors like https://reviews.llvm.org/D96348
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
An __always_inline static function is better to find where exactly a
crash happens, so one can step into the function with GDB.
Reviewed-by: Fangrui Song <maskray@google.com>
Unroll slightly and enforce good instruction scheduling. This improves
performance on out-of-order machines. The unrolling allows for
pipelined multiplies.
As well, as an optional sysdep, reorder the operations and prevent
reassosiation for better scheduling and higher ILP. This commit
only adds the barrier for x86, although it should be either no
change or a win for any architecture.
Unrolling further started to induce slowdowns for sizes [0, 4]
but can help the loop so if larger sizes are the target further
unrolling can be beneficial.
Results for _dl_new_hash
Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
Time as Geometric Mean of N=30 runs
Geometric of all benchmark New / Old: 0.674
type, length, New Time, Old Time, New Time / Old Time
fixed, 0, 2.865, 2.72, 1.053
fixed, 1, 3.567, 2.489, 1.433
fixed, 2, 2.577, 3.649, 0.706
fixed, 3, 3.644, 5.983, 0.609
fixed, 4, 4.211, 6.833, 0.616
fixed, 5, 4.741, 9.372, 0.506
fixed, 6, 5.415, 9.561, 0.566
fixed, 7, 6.649, 10.789, 0.616
fixed, 8, 8.081, 11.808, 0.684
fixed, 9, 8.427, 12.935, 0.651
fixed, 10, 8.673, 14.134, 0.614
fixed, 11, 10.69, 15.408, 0.694
fixed, 12, 10.789, 16.982, 0.635
fixed, 13, 12.169, 18.411, 0.661
fixed, 14, 12.659, 19.914, 0.636
fixed, 15, 13.526, 21.541, 0.628
fixed, 16, 14.211, 23.088, 0.616
fixed, 32, 29.412, 52.722, 0.558
fixed, 64, 65.41, 142.351, 0.459
fixed, 128, 138.505, 295.625, 0.469
fixed, 256, 291.707, 601.983, 0.485
random, 2, 12.698, 12.849, 0.988
random, 4, 16.065, 15.857, 1.013
random, 8, 19.564, 21.105, 0.927
random, 16, 23.919, 26.823, 0.892
random, 32, 31.987, 39.591, 0.808
random, 64, 49.282, 71.487, 0.689
random, 128, 82.23, 145.364, 0.566
random, 256, 152.209, 298.434, 0.51
Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
No change to the code other than moving the function to
dl-new-hash.h. Changed name so its now in the reserved namespace.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
_dl_skip_args is always 0, so the target specific code that modifies
argv after relro protection is applied is no longer used.
After the patch relro protection is applied to _dl_argv consistently
on all targets.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When an executable is invoked as
./ld.so [ld.so-args] ./exe [exe-args]
then the argv is adujusted in ld.so before calling the entry point of
the executable so ld.so args are not visible to it. On most targets
this requires moving argv, env and auxv on the stack to ensure correct
stack alignment at the entry point. This had several issues:
- The code for this adjustment on the stack is written in asm as part
of the target specific ld.so _start code which is hard to maintain.
- The adjustment is done after _dl_start returns, where it's too late
to update GLRO(dl_auxv), as it is already readonly, so it points to
memory that was clobbered by the adjustment. This is bug 23293.
- _environ is also wrong in ld.so after the adjustment, but it is
likely not used after _dl_start returns so this is not user visible.
- _dl_argv was updated, but for this it was moved out of relro, which
changes security properties across targets unnecessarily.
This patch introduces a generic _dl_start_args_adjust function that
handles the argument adjustments after ld.so processed its own args
and before relro protection is applied.
The same algorithm is used on all targets, _dl_skip_args is now 0, so
existing target specific adjustment code is no longer used. The bug
affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc,
other targets don't need the change in principle, only for consistency.
The GNU Hurd start code relied on _dl_skip_args after dl_main returned,
now it checks directly if args were adjusted and fixes the Hurd startup
data accordingly.
Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO.
Tested on aarch64-linux-gnu and cross tested on i686-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The Linux version used by i686 and m68k provide three overrrides for
generic code:
1. DISTINGUISH_LIB_VERSIONS to print additional information when
libc5 is used by a dependency.
2. EXTRA_LD_ENVVARS to that enabled LD_LIBRARY_VERSION environment
variable.
3. EXTRA_UNSECURE_ENVVARS to add two environment variables related
to aout support.
None are really requires, it has some decades since libc5 or aout
suppported was removed and Linux even remove support for aout files.
The LD_LIBRARY_VERSION is also dead code, dl_correct_cache_id is not
used anywhere.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The kernel version check is used to avoid glibc to run on older
kernels where some syscall are not available and fallback code are
not enabled to handle graciously fail. However, it does not prevent
if the kernel does not correctly advertise its version through
vDSO note, uname or procfs.
Also kernel version checks are sometime not desirable by users,
where they want to deploy on different system with different kernel
version knowing the minimum set of syscall is always presented on
such systems.
The kernel version check has been removed along with the
LD_ASSUME_KERNEL environment variable. The minimum kernel used to
built glibc is still provided through NT_GNU_ABI_TAG ELF note and
also printed when libc.so is issued.
Checked on x86_64-linux-gnu.
This implements mmap fallback for a brk failure during TLS
allocation.
scripts/tls-elf-edit.py is updated to support the new patching method.
The script no longer requires that in the input object is of ET_DYN
type.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When neither DT_HASH nor DT_GNU_HASH is present, the code scans
[DT_SYMTAB, DT_STRTAB). However, there is no guarantee that .dynstr
immediately follows .dynsym (e.g. lld typically places .gnu.version
after .dynsym).
In the absence of a hash table, symbol lookup will always fail
(map->l_nbuckets == 0 in dl-lookup.c) as if the object has no symbol, so
it seems fair for dladdr to do the same.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
__ehdr_start is already used in rltld.c:dl_main, and can serve the
same purpose as _begin. Besides tidying the code, using linker
defined section relative symbols rather than "-defsym _begin=0" better
reflects the intent of _dl_start_final use of _begin, which is to
refer to the load address of ld.so rather than absolute address zero.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
On _dl_map_object the underlying file is not opened in trace mode
(in other cases where the underlying file can't be opened,
_dl_map_object quits with an error). If there any missing libraries
being processed, they will not be considered on final nlist size
passed on _dl_sort_maps later in the function. And it is then used by
_dl_sort_maps_dfs on the stack allocated working maps:
222 /* Array to hold RPO sorting results, before we copy back to maps[]. */
223 struct link_map *rpo[nmaps];
224
225 /* The 'head' position during each DFS iteration. Note that we start at
226 one past the last element due to first-decrement-then-store (see the
227 bottom of above dfs_traversal() routine). */
228 struct link_map **rpo_head = &rpo[nmaps];
However while transversing the 'l_initfini' on dfs_traversal it will
still consider the l_faked maps and thus update rpo more times than the
allocated working 'rpo', overflowing the stack object.
As suggested in bugzilla, one option would be to avoid sorting the maps
for trace mode. However I think ignoring l_faked object does make
sense (there is one less constraint to call the sorting function), it
allows a slight less stack usage for trace, and it is slight simpler
solution.
The tests does trigger the stack overflow, however I tried to make
it more generic to check different scenarios or missing objects.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Verify that:
1. A DT_RELR shared library without DT_NEEDED works.
2. A DT_RELR shared library without DT_VERNEED works.
3. A DT_RELR shared library without libc.so on DT_NEEDED works.
With DT_RELR, there may be no relocations in DT_RELA/DT_REL and their
entry values are zero. Don't relocate DT_RELA/DT_REL and update the
combined relocation start address if their entry values are zero.
The EI_ABIVERSION field of the ELF header in executables and shared
libraries can be bumped to indicate the minimum ABI requirement on the
dynamic linker. However, EI_ABIVERSION in executables isn't checked by
the Linux kernel ELF loader nor the existing dynamic linker. Executables
will crash mysteriously if the dynamic linker doesn't support the ABI
features required by the EI_ABIVERSION field. The dynamic linker should
be changed to check EI_ABIVERSION in executables.
Add a glibc version, GLIBC_ABI_DT_RELR, to indicate DT_RELR support so
that the existing dynamic linkers will issue an error on executables with
GLIBC_ABI_DT_RELR dependency. When there is a DT_VERNEED entry with
libc.so on DT_NEEDED, issue an error if there is a DT_RELR entry without
GLIBC_ABI_DT_RELR dependency.
Support __placeholder_only_for_empty_version_map as the placeholder symbol
used only for empty version map to generate GLIBC_ABI_DT_RELR without any
symbols.
PI_STATIC_AND_HIDDEN indicates whether accesses to internal linkage
variables and hidden visibility variables in a shared object (ld.so)
need dynamic relocations (usually R_*_RELATIVE). PI (position
independent) in the macro name is a misnomer: a code sequence using GOT
is typically position-independent as well, but using dynamic relocations
does not meet the requirement.
Not defining PI_STATIC_AND_HIDDEN is legacy and we expect that all new
ports will define PI_STATIC_AND_HIDDEN. Current ports defining
PI_STATIC_AND_HIDDEN are more than the opposite. Change the configure
default.
No functional change.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When audit modules are loaded, ld.so initialization is not yet
complete, and rtld_active () returns false even though ld.so is
mostly working. Instead, the static dlopen hook is used, but that
does not work at all because this is not a static dlopen situation.
Commit 466c1ea15f ("dlfcn: Rework
static dlopen hooks") moved the hook pointer into _rtld_global_ro,
which means that separate protection is not needed anymore and the
hook pointer can be checked directly.
The guard for disabling libio vtable hardening in _IO_vtable_check
should stay for now.
Fixes commit 8e1472d2c1 ("ld.so:
Examine GLRO to detect inactive loader [BZ #20204]").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On non-PI_STATIC_AND_HIDDEN architectures, getting the address of
_rtld_local_ro (for GLRO (dl_final_object)) goes through a GOT entry.
The GOT load may be reordered before self relocation, leading to an
unrelocated/incorrect _rtld_local_ro address.
84e02af1eb tickled GCC powerpc32 to
reorder the GOT load before relative relocations, leading to ld.so
crash. This is similar to the m68k jump table reordering issue fixed by
a8e9b5b807.
Move code after self relocation into _dl_start_final to avoid the
reordering. This fixes powerpc32 and may help other architectures when
ELF_DYNAMIC_RELOCATE is simplified in the future.
This is necessary to place the libio vtables into the RELRO segment.
New tests elf/tst-relro-ldso and elf/tst-relro-libc are added to
verify that this is what actually happens.
The new tests fail on ia64 due to lack of (default) RELRO support
inbutils, so they are XFAILed there.
Hopefully, this will lead to tests that are easier to maintain. The
current approach of parsing readelf -W output using regular expressions
is not necessarily easier than parsing the ELF data directly.
This module is still somewhat incomplete (e.g., coverage of relocation
types and versioning information is missing), but it is sufficient to
perform basic symbol analysis or program header analysis.
The EM_* mapping for architecture-specific constant classes (e.g.,
SttX86_64) is not yet implemented. The classes are defined for the
benefit of elf/tst-glibcelf.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
elf_dynamic_do_Rel checks RTLD_BOOTSTRAP in several #ifdef branches.
Create an outside RTLD_BOOTSTRAP branch to simplify reasoning about the
function at the cost of a few duplicate lines.
Since dl_naudit is zero in RTLD_BOOTSTRAP code, the RTLD_BOOTSTRAP
branch can avoid _dl_audit_symbind calls to decrease code size.
Reviewed-by: Adheemrval Zanella <adhemerval.zanella@linaro.org>
After 73fc4e28b9,
__libc_enable_secure_decided is always 0 and a statically linked
executable may overwrite __libc_enable_secure without considering
AT_SECURE.
The __libc_enable_secure has been correctly initialized in _dl_aux_init,
so just remove __libc_enable_secure_decided and __libc_init_secure.
This allows us to remove some startup_get*id functions from
22b79ed7f4.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The new IBM z16 is added to platform string array.
The macro _DL_PLATFORMS_COUNT is incremented.
_dl_hwcaps_subdir is extended by "z16" if HWCAP_S390_VXRS_PDE2
is set. HWCAP_S390_NNPA is not tested in _dl_hwcaps_subdirs_active
as those instructions may be replaced or removed in future.
tst-glibc-hwcaps.c is extended in order to test z16 via new marker5.
A fatal glibc error is dumped if glibc was build with architecture
level set for z16, but run on an older machine. (See dl-hwcap-check.h)
On 32-bit machines this has no affect. On 64-bit machines
{u}int_fast{16|32} are set as {u}int64_t which is often not
ideal. Particularly x86_64 this change both saves code size and
may save instruction cost.
Full xcheck passes on x86_64.
The count can be zero if an object has already been loaded as
an indirect dependency (so that l_searchlist.r_list in its link
map is still NULL) is promoted to global scope via RTLD_GLOBAL.
Fixes commit 5d28a8962d ("elf: Add _dl_find_object function").
If glibc is configured with --disable-default-pie and build on
s390 with -O3, the tests elf/tst-audit25a and elf/tst-audit25b are
failing as there are additional la_symbind lines for free and malloc.
It turns out that those belong to the executable. In fact those are
the PLT-stubs. Furthermore la_symbind is also called for calloc and
realloc symbols, but those belong to libc.
Those functions are not called at all, but dlsym'ed in
elf/dl-minimal.c:
__rtld_malloc_init_real (struct link_map *main_map)
{
...
void *new_calloc = lookup_malloc_symbol (main_map, "calloc", &version);
void *new_free = lookup_malloc_symbol (main_map, "free", &version);
void *new_malloc = lookup_malloc_symbol (main_map, "malloc", &version);
void *new_realloc = lookup_malloc_symbol (main_map, "realloc", &version);
...
}
Therefore, this commit just ignored symbols with LA_SYMB_DLSYM flag.
Reviewed-by: Adheemrval Zanella <adhemerval.zanella@linaro.org>
If the build itself is run in a container, we may not be able to
fully set up a nested container for test-container testing.
Notably is the mounting of /proc, since it's critical that it
be mounted from within the same PID namespace as its users, and
thus cannot be bind mounted from outside the container like other
mounts.
This patch defaults to using the parent's PID namespace instead of
creating a new one, as this is more likely to be allowed.
If the test needs an isolated PID namespace, it should add the "pidns"
command to its init script.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
And optimize it slightly.
This is commit 8c8510ab27 revised.
In _dl_aux_init in elf/dl-support.c, use an explicit loop
and -fno-tree-loop-distribute-patterns to avoid memset.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
1. Also generate .d dependency files for $(tests-container) and
$(tests-printers).
2. elf: Add tst-auditmod17.os to extra-test-objs.
3. iconv: Add tst-gconv-init-failure-mod.os to extra-test-objs.
4. malloc: Rename extra-tests-objs to extra-test-objs.
5. linux: Add tst-sysconf-iov_max-uapi.o to extra-test-objs.
6. x86_64: Add tst-x86_64mod-1.o, tst-platformmod-2.o, test-libmvec.o,
test-libmvec-avx.o, test-libmvec-avx2.o and test-libmvec-avx512f.o to
extra-test-objs.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Changes in v2:
1. Update commit log.
commit 163f625cf9
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Dec 21 12:35:47 2021 -0800
elf: Remove excessive p_align check on PT_LOAD segments [BZ #28688]
removed the p_align check against the page size. It caused the loader
error or crash on elf/tst-p_align3 when loading elf/tst-p_alignmod3.so,
which has the invalid p_align in PT_LOAD segments, added by
commit d8d94863ef
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Dec 21 13:42:28 2021 -0800
The loader failure caused by a negative length passed to __mprotect is
random, depending on architecture and toolchain. Update _dl_map_segments
to detect invalid holes. This fixes BZ #28838.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This reverts commit 8c8510ab27. The
revert is not perfect because the commit included a bug fix for
_dl_sysdep_start with an empty argv, introduced in commit
2d47fa6862 ("Linux: Remove
DL_FIND_ARG_COMPONENTS"), and this bug fix is kept.
The revert is necessary because the reverted commit introduced an
early memset call on aarch64, which leads to crash due to lack of TCB
initialization.
The fix for BZ#22716 replacde LD_TRACE_LOADED_OBJECTS with
LD_TRACE_PRELINKING so mtrace could record executable address
position.
To provide the same information, LD_TRACE_LOADED_OBJECTS is
extended where a value or '2' also prints the executable address
as well. It avoid adding another loader environment variable
to be used solely for mtrace. The vDSO will be printed as
a default library (with '=>' pointing the same name), which is
ok since both mtrace and ldd already handles it.
The mtrace script is changed to also parse the new format. To
correctly support PIE and non-PIE executables, both the default
mtrace address and the one calculated as used (it fixes mtrace
for non-PIE exectuable as for BZ#22716 for PIE).
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Prelinked binaries and libraries still work, the dynamic tags
DT_GNU_PRELINKED, DT_GNU_LIBLIST, DT_GNU_CONFLICT just ignored
(meaning the process is reallocated as default).
The loader environment variable TRACE_PRELINKING is also removed,
since it used solely on prelink.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
And optimize it slightly.
The large switch statement in _dl_sysdep_start can be replaced with
a large array. This reduces source code and binary size. On
i686-linux-gnu:
Before:
text data bss dec hex filename
7791 12 0 7803 1e7b elf/dl-sysdep.os
After:
text data bss dec hex filename
7135 12 0 7147 1beb elf/dl-sysdep.os
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The generic version is the de-facto Linux implementation. It
requires an auxiliary vector, so Hurd does not use it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On hppa, a function pointer returned by la_symbind is actually a function
descriptor has the plabel bit set (bit 30). This must be cleared to get
the actual address of the descriptor. If the descriptor has been bound,
the first word of the descriptor is the physical address of theA function,
otherwise, the first word of the descriptor points to a trampoline in the
PLT.
This patch also adds a workaround on tests because on hppa (and it seems
to be the only ABI I have see it), some shared library adds a dynamic PLT
relocation to am empty symbol name:
$ readelf -r elf/tst-audit25mod1.so
[...]
Relocation section '.rela.plt' at offset 0x464 contains 6 entries:
Offset Info Type Sym.Value Sym. Name + Addend
00002008 00000081 R_PARISC_IPLT 508
[...]
It breaks some assumptions on the test, where a symbol with an empty
name ("") is passed on la_symbind.
Checked on x86_64-linux-gnu and hppa-linux-gnu.
Replace tst-audit24bmod2.so with tst-audit24bmod2 to silence:
make[2]: Entering directory '/export/gnu/import/git/gitlab/x86-glibc/elf'
Makefile:2201: warning: overriding recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so'
../Makerules:765: warning: ignoring old recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so'
The rtld audit support show two problems on aarch64:
1. _dl_runtime_resolve does not preserve x8, the indirect result
location register, which might generate wrong result calls
depending of the function signature.
2. The NEON Q registers pushed onto the stack by _dl_runtime_resolve
were twice the size of D registers extracted from the stack frame by
_dl_runtime_profile.
While 2. might result in wrong information passed on the PLT tracing,
1. generates wrong runtime behaviour.
The aarch64 rtld audit support is changed to:
* Both La_aarch64_regs and La_aarch64_retval are expanded to include
both x8 and the full sized NEON V registers, as defined by the
ABI.
* dl_runtime_profile needed to extract registers saved by
_dl_runtime_resolve and put them into the new correctly sized
La_aarch64_regs structure.
* The LAV_CURRENT check is change to only accept new audit modules
to avoid the undefined behavior of not save/restore x8.
* Different than other architectures, audit modules older than
LAV_CURRENT are rejected (both La_aarch64_regs and La_aarch64_retval
changed their layout and there are no requirements to support multiple
audit interface with the inherent aarch64 issues).
* A new field is also reserved on both La_aarch64_regs and
La_aarch64_retval to support variant pcs symbols.
Similar to x86, a new La_aarch64_vector type to represent the NEON
register is added on the La_aarch64_regs (so each type can be accessed
directly).
Since LAV_CURRENT was already bumped to support bind-now, there is
no need to increase it again.
Checked on aarch64-linux-gnu.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The audit symbind callback is not called for binaries built with
-Wl,-z,now or when LD_BIND_NOW=1 is used, nor the PLT tracking callbacks
(plt_enter and plt_exit) since this would change the expected
program semantics (where no PLT is expected) and would have performance
implications (such as for BZ#15533).
LAV_CURRENT is also bumped to indicate the audit ABI change (where
la_symbind flags are set by the loader to indicate no possible PLT
trace).
To handle powerpc64 ELFv1 function descriptor, _dl_audit_symbind
requires to know whether bind-now is used so the symbol value is
updated to function text segment instead of the OPD (for lazy binding
this is done by PPC64_LOAD_FUNCPTR on _dl_runtime_resolve).
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
powerpc64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
For audit modules and dependencies with initial-exec TLS, we can not
set the initial TLS image on default loader initialization because it
would already be set by the audit setup. However, subsequent thread
creation would need to follow the default behaviour.
This patch fixes it by setting l_auditing link_map field not only
for the audit modules, but also for all its dependencies. This is
used on _dl_allocate_tls_init to avoid the static TLS initialization
at load time.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
la_activity is not called during application exit, even though
la_objclose is.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Add <dl-r_debug.h> to get the adddress of the r_debug structure after
relocation and its offset before relocation from the PT_DYNAMIC segment
to support DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.
Co-developed-by: Xi Ruoyao <xry111@mengyan1223.wang>
There was no direct or indirect make dependency on testobj3.so so the
test could fail with
/B/elf/loadfail: failed to load shared object: testobj3.so: cannot open
shared object file: No such file or directory
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The glibc 2.34 release really should have added a GLIBC_2.34
symbol to the dynamic loader. With it, we could move functions such
as dlopen or pthread_key_create that work on process-global state
into the dynamic loader (once we have fixed a longstanding issue
with static linking). Without the GLIBC_2.34 symbol, yet another
new symbol version would be needed because old glibc will fail to
load binaries due to the missing symbol version in ld.so that newly
linked programs will require.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This avoid the cross-compiling breakage when the test should not run
($(run-built-tests) equal to no).
Checked on x86_64-linux-gnu and i686-linux-gnu as well with a cross
compile to aarch64-linux-gnu and powerpc64-linux-gnu.
Build tst-p_alignmod3.so with 256 byte page size and verify that it is
rejected with a proper error message.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add tst-p_alignmod2-edit to edit the copy of tst-p_alignmod-base.so to
set p_align of the first PT_LOAD segment to 1 and verify that the shared
library can be loaded normally.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add tst-p_alignmod1-edit to edit the copy of tst-p_alignmod-base.so to
reduce p_align of the first PT_LOAD segment by half and verify that the
shared library is mapped with the maximum p_align of all PT_LOAD segments.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
DT_RUNPATH is only used to find the immediate dependencies of the
executable or shared object containing the DT_RUNPATH entry:
1. Define link-test-modules-rpath-link if $(build-hardcoded-path-in-tests)
is yes.
2. Use $(link-test-modules-rpath-link) in build-module-helper so that
test modules can dlopen modules with DT_RUNPATH.
3. Add a test to show why link-test-modules-rpath-link is needed.
This partially fixes BZ #28455.
Check if whether valgrind is available in the test environment.
If not, skip the test. Run smoke tests with valgrind to verify dynamic loader.
First, check if algrind works with the system ld.so in the test
environment. Then run the actual test inside the test environment,
using the just build ld.so and new libraries.
Co-authored-by: Mark Wielaard <mark@klomp.org>
Linker may set p_align of a PT_LOAD segment larger than p_align of the
first PT_LOAD segment to satisfy a section alignment:
Elf file type is DYN (Shared object file)
Entry point 0x0
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000834 0x0000000000000834 R E 0x1000
LOAD 0x0000000000000e00 0x0000000000001e00 0x0000000000001e00
0x0000000000000230 0x0000000000000230 RW 0x1000
LOAD 0x0000000000400000 0x0000000000400000 0x0000000000400000
0x0000000000000004 0x0000000000000008 RW 0x400000
...
Section to Segment mapping:
Segment Sections...
00 .note.gnu.property .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
01 .init_array .fini_array .data.rel.ro .dynamic .got .got.plt
02 .data .bss
We should align the first PT_LOAD segment to the maximum p_align of all
PT_LOAD segments, similar to the kernel commit:
commit ce81bb256a224259ab686742a6284930cbe4f1fa
Author: Chris Kennelly <ckennelly@google.com>
Date: Thu Oct 15 20:12:32 2020 -0700
fs/binfmt_elf: use PT_LOAD p_align values for suitable start address
This fixes BZ #28676.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
And compile it with the early CFLAGS. _dl_setup_hash is called
very early for the ld.so link map, so it should be compiled
differently.
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
Tested-by: Stefan Liebler <stli@linux.ibm.com>
The usage of internal static symbol for statically linked binaries
does not work correctly for objects built with -D_TIME_BITS=64,
since the internal definition does not provide the expected aliases.
This patch makes it to use the default stat functions instead (which
uses the default 64 time_t alias and types).
Checked on i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
l_contiguous was not initialized at all for the main map and
always 0. This commit adds code to check if the LOAD segments
are adjacent to each other, and sets l_contiguous accordingly.
This helps _dl_find_object because it is more efficient if the
main mapping is contiguous.
Note that not all (PIE or non-PIE) binaries are contiguous in this
way because BFD ld creates executables with LOAD holes:
ELF LOAD segments creating holes in the process image on GNU/Linux
https://sourceware.org/pipermail/binutils/2022-January/119082.htmlhttps://sourceware.org/bugzilla/show_bug.cgi?id=28743
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The usage of internal static symbol for statically linked binaries
does not work correctly for objects built with -D_TIME_BITS=64,
since the internal definition does not provide the expected aliases.
This patch makes it to use the default stat functions instead (which
uses the default 64 time_t alias and types).
Checked on i686-linux-gnu.
With the current set of fences, the version update at the start
of the TM write operation is redundant, and the version update
at the end does not need to use an atomic read-modify-write
operation.
Also use relaxed MO stores during the dlclose update, and skip any
version changes there.
Suggested-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
As explained in Hans Boehm, Can Seqlocks Get Along with Programming
Language Memory Models?, an acquire fence is needed in
_dlfo_read_success. The lack of a fence resulted in an observable
bug on powerpc64le compile-time load reordering.
The fence in _dlfo_mappings_begin_update has been reordered, turning
the fence/store sequence into a release MO store equivalent.
Relaxed MO loads are used on the reader side, and relaxed MO stores
on the writer side for the shared data, to avoid formal data races.
This is just to be conservative; it should not actually be necessary
given how the data is used.
This commit also fixes the test run time. The intent was to run it
for 3 seconds, but 0.3 seconds was enough to uncover the bug very
occasionally (while 3 seconds did not reliably show the bug on every
test run).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
DT_RUNPATH is only used to find the immediate dependencies of the
executable or shared object containing the DT_RUNPATH entry. Update
LD_AUDIT dlopen call to try the DT_RUNPATH entry of the executable.
Add tst-audit14a, which is copied from tst-audit14, to DT_RUNPATH and
build tst-audit14 with -Wl,--disable-new-dtags to test DT_RPATH.
This partially fixes BZ #28455.
Add <dl-debug.h> to setup debugging entry in PT_DYNAMIC segment to support
DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.
Tested on x86-64, x32 and i686 as well as with build-many-glibcs.py.
I've updated copyright dates in glibc for 2022. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.
Please remember to include 2022 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
It can be used to speed up the libgcc unwinder, and the internal
_dl_find_dso_for_object function (which is used for caller
identification in dlopen and related functions, and in dladdr).
_dl_find_object is in the internal namespace due to bug 28503.
If libgcc switches to _dl_find_object, this namespace issue will
be fixed. It is located in libc for two reasons: it is necessary
to forward the call to the static libc after static dlopen, and
there is a link ordering issue with -static-libgcc and libgcc_eh.a
because libc.so is not a linker script that includes ld.so in the
glibc build tree (so that GCC's internal -lc after libgcc_eh.a does
not pick up ld.so).
It is necessary to do the i386 customization in the
sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because
otherwise, multilib installations are broken.
The implementation uses software transactional memory, as suggested
by Torvald Riegel. Two copies of the supporting data structures are
used, also achieving full async-signal-safety.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The dl_main sets the LM_ID_BASE to RT_ADD just before starting to
add load new shared objects. The state is set to RT_CONSISTENT just
after all objects are loaded.
However if a audit modules tries to dlmopen an inexistent module,
the _dl_open will assert that the namespace is in an inconsistent
state.
This is different than dlopen, since first it will not use
LM_ID_BASE and second _dl_map_object_from_fd is the sole responsible
to set and reset the r_state value.
So the assert on _dl_open can not really be seen if the state is
consistent, since _dt_main resets it. This patch removes the assert.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The vDSO is is listed in the link_map chain, but is never the subject of
an la_objopen call. A new internal flag __RTLD_VDSO is added that
acts as __RTLD_OPENEXEC to allocate the required 'struct auditstate'
extra space for the 'struct link_map'.
The return value from the callback is currently ignored, since there
is no PLT call involved by glibc when using the vDSO, neither the vDSO
are exported directly.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The rtld-audit interfaces introduces a slowdown due to enabling
profiling instrumentation (as if LD_AUDIT implied LD_PROFILE).
However, instrumenting is only necessary if one of audit libraries
provides PLT callbacks (la_pltenter or la_pltexit symbols). Otherwise,
the slowdown can be avoided.
The following patch adjusts the logic that enables profiling to iterate
over all audit modules and check if any of those provides a PLT hook.
To keep la_symbind to work even without PLT callbacks, _dl_fixup now
calls the audit callback if the modules implements it.
Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_pltexit audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_pltenter audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_preinit audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_symbind{32,64} audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objclose audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objsearch audit
callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_activity audit
callback.
Also for a new Lmid_t the namespace link_map list are empty, so it
requires to check if before using it. This can happen for when audit
module is used along with dlmopen.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It consolidates the code required to call la_objopen audit callback.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove AArch64 from comment for AT_MINSIGSTKSZ to match
commit 7cd60e43a6def40ecb75deb8decc677995970d0b
Author: Chang S. Bae <chang.seok.bae@intel.com>
Date: Tue May 18 13:03:15 2021 -0700
uapi/auxvec: Define the aux vector AT_MINSIGSTKSZ
Define AT_MINSIGSTKSZ in the generic uapi header. It is already used
as generic ABI in glibc's generic elf.h, and this define will prevent
future namespace conflicts. In particular, x86 is also using this
generic definition.
in Linux kernel 5.14.
p_align does not have to be a multiple of the page size. Only PT_LOAD
segment layout should be aligned to the page size.
1: Remove p_align check against the page size.
2. Use the page size, instead of p_align, to check PT_LOAD segment layout.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
With the morecore hook removed, there is not easy way to provide huge
pages support on with glibc allocator without resorting to transparent
huge pages. And some users and programs do prefer to use the huge pages
directly instead of THP for multiple reasons: no splitting, re-merging
by the VM, no TLB shootdowns for running processes, fast allocation
from the reserve pool, no competition with the rest of the processes
unlike THP, no swapping all, etc.
This patch extends the 'glibc.malloc.hugetlb' tunable: the value
'2' means to use huge pages directly with the system default size,
while a positive value means and specific page size that is matched
against the supported ones by the system.
Currently only memory allocated on sysmalloc() is handled, the arenas
still uses the default system page size.
To test is a new rule is added tests-malloc-hugetlb2, which run the
addes tests with the required GLIBC_TUNABLE setting. On systems without
a reserved huge pages pool, is just stress the mmap(MAP_HUGETLB)
allocation failure. To improve test coverage it is required to create
a pool with some allocated pages.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux Transparent Huge Pages (THP) current supports three different
states: 'never', 'madvise', and 'always'. The 'never' is
self-explanatory and 'always' will enable THP for all anonymous
pages. However, 'madvise' is still the default for some system and
for such case THP will be only used if the memory range is explicity
advertise by the program through a madvise(MADV_HUGEPAGE) call.
To enable it a new tunable is provided, 'glibc.malloc.hugetlb',
where setting to a value diffent than 0 enables the madvise call.
This patch issues the madvise(MADV_HUGEPAGE) call after a successful
mmap() call at sysmalloc() with sizes larger than the default huge
page size. The madvise() call is disable is system does not support
THP or if it has the mode set to "never" and on Linux only support
one page size for THP, even if the architecture supports multiple
sizes.
To test is a new rule is added tests-malloc-hugetlb1, which run the
addes tests with the required GLIBC_TUNABLE setting.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add <tst-file-align.h> to support target specific ALIGN for variable
alignment test:
1. Alpha: Use 0x10000.
2. MicroBlaze and Nios II: Use 0x8000.
3. All others: Use 0x200000.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On Linux/x86-64, for elf/tst-align3, we now get
munmap(0x7f88f9401000, 1126424) = 0
instead of
munmap(0x7f1615200018, 544768) = -1 EINVAL (Invalid argument)
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The default has to change eventually, and there are no known failures
that require a delay.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When PT_LOAD segment alignment > the page size, allocate enough space to
ensure that the segment can be properly aligned. This change helps code
segments use huge pages become simple and available.
This fixes [BZ #28676].
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
This makes ld.so features such as --preload, --audit,
and --list-diagnostics more accessible to end users because they
do not need to know the ABI name of the dynamic loader.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
TLS_INIT_TCB_ALIGN is not actually used. TLS_TCB_ALIGN was likely
introduced to support a configuration where the thread pointer
has not the same alignment as THREAD_SELF. Only ia64 seems to use
that, but for the stack/pointer guard, not for storing tcbhead_t.
Some ports use TLS_TCB_OFFSET and TLS_PRE_TCB_SIZE to shift
the thread pointer, potentially landing in a different residue class
modulo the alignment, but the changes should not impact that.
In general, given that TLS variables have their own alignment
requirements, having different alignment for the (unshifted) thread
pointer and struct pthread would potentially result in dynamic
offsets, leading to more complexity.
hppa had different values before: __alignof__ (tcbhead_t), which
seems to be 4, and __alignof__ (struct pthread), which was 8
(old default) and is now 32. However, it defines THREAD_SELF as:
/* Return the thread descriptor for the current thread. */
# define THREAD_SELF \
({ struct pthread *__self; \
__self = __get_cr27(); \
__self - 1; \
})
So the thread pointer points after struct pthread (hence __self - 1),
and they have to have the same alignment on hppa as well.
Similarly, on ia64, the definitions were different. We have:
# define TLS_PRE_TCB_SIZE \
(sizeof (struct pthread) \
+ (PTHREAD_STRUCT_END_PADDING < 2 * sizeof (uintptr_t) \
? ((2 * sizeof (uintptr_t) + __alignof__ (struct pthread) - 1) \
& ~(__alignof__ (struct pthread) - 1)) \
: 0))
# define THREAD_SELF \
((struct pthread *) ((char *) __thread_self - TLS_PRE_TCB_SIZE))
And TLS_PRE_TCB_SIZE is a multiple of the struct pthread alignment
(confirmed by the new _Static_assert in sysdeps/ia64/libc-tls.c).
On m68k, we have a larger gap between tcbhead_t and struct pthread.
But as far as I can tell, the port is fine with that. The definition
of TCB_OFFSET is sufficient to handle the shifted TCB scenario.
This fixes commit 23c77f6018
("nptl: Increase default TCB alignment to 32").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Programs without dynamic dependencies and without a program
interpreter are now run via execve.
Previously, the dynamic linker either crashed while attempting to
read a non-existing dynamic segment (looking for DT_AUDIT/DT_DEPAUDIT
data), or the self-relocated in the static PIE executable crashed
because the outer dynamic linker had already applied RELRO protection.
<dl-execve.h> is needed because execve is not available in the
dynamic loader on Hurd.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Since b05fae4d8e, __minimal malloc code is used during static
startup before PIE self-relocation (_dl_relocate_static_pie).
So it requires the same fix done for other objects by 47618209d0.
Checked on aarch64, x86_64, and i686 with and without static-pie.
1. Use a temporary file to generate Makefile fragments for DSO sorting
tests and use -include on them.
2. Add Makefile fragments to postclean-generated so that a "make clean"
removes the autogenerated fragments and a subsequent "make" regenerates
them.
This partially fixes BZ #28550.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The include cleanup on dl-minimal.c removed too much for some
targets.
Also for Hurd, __sbrk is removed from localplt.data now that
tunables allocated memory through mmap.
Checked with a build for all affected architectures.
The rtld_malloc functions are moved to its own file so it can be
used on csu code. Also, the functiosn are renamed to __minimal_*
(since there are now used not only on loader code).
Using the __minimal_malloc on tunables_strdup() avoids potential
issues with sbrk() calls while processing the tunables (I see
sporadic elf/tst-dso-ordering9 on powerpc64le with different
tests failing due ASLR).
Also, using __minimal_malloc over plain mmap optimizes the memory
allocation on both static and dynamic case (since it will any unused
space in either the last page of data segments, avoiding mmap() call,
or from the previous mmap() call).
Checked on x86_64-linux-gnu, i686-linux-gnu, and powerpc64le-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Separated debuginfo files have PT_DYNAMIC with p_filesz == 0. We
need to check for that before the _dl_map_segments call because
that could attempt to write to mappings that extend beyond the end
of the file, resulting in SIGBUS.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The patch removes the the ELF_DURING_STARTUP optimization and assume
both .rel.dyn and .rel.plt might not be subsequent. This allows some
code simplification since relocation will be handled independently
where it is done on bootstrap.
At least on x86_64_64, I can not measure any performance implications.
Running 10000 time the command
LD_DEBUG=statistics ./elf/ld.so ./libc.so
And filtering the "total startup time in dynamic loader" result,
the geometric mean is:
patched master
Ryzen 7 5900x 24140 24952
i7-4510U 45957 45982
(The results do show some variation, I did not make any statistical
analysis).
It also allows build arm with lld, since it inserts ".ARM.exidx"
between ".rel.dyn" and ".rel.plt" for the loader.
Checked on x86_64-linux-gnu and arm-linux-gnueabihf.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
These tests takes the address of a protected symbol (foo_protected)
and lld does not support copy relocations on protected data symbols.
Checked on x86_64-linux-gnu.
Reviewed-by: Fangrui Song <maskray@google.com>
The global test is linked with globalmod1.so which dlopens reldepmod4.so.
Make global.out depend on reldepmod4.so. This fixes BZ #28457.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This second patch contains the actual implementation of a new sorting algorithm
for shared objects in the dynamic loader, which solves the slow behavior that
the current "old" algorithm falls into when the DSO set contains circular
dependencies.
The new algorithm implemented here is simply depth-first search (DFS) to obtain
the Reverse-Post Order (RPO) sequence, a topological sort. A new l_visited:1
bitfield is added to struct link_map to more elegantly facilitate such a search.
The DFS algorithm is applied to the input maps[nmap-1] backwards towards
maps[0]. This has the effect of a more "shallow" recursion depth in general
since the input is in BFS. Also, when combined with the natural order of
processing l_initfini[] at each node, this creates a resulting output sorting
closer to the intuitive "left-to-right" order in most cases.
Another notable implementation adjustment related to this _dl_sort_maps change
is the removing of two char arrays 'used' and 'done' in _dl_close_worker to
represent two per-map attributes. This has been changed to simply use two new
bit-fields l_map_used:1, l_map_done:1 added to struct link_map. This also allows
discarding the clunky 'used' array sorting that _dl_sort_maps had to sometimes
do along the way.
Tunable support for switching between different sorting algorithms at runtime is
also added. A new tunable 'glibc.rtld.dynamic_sort' with current valid values 1
(old algorithm) and 2 (new DFS algorithm) has been added. At time of commit
of this patch, the default setting is 1 (old algorithm).
Signed-off-by: Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is the first of a 2-part patch set that fixes slow DSO sorting behavior in
the dynamic loader, as reported in BZ #17645. In order to facilitate such a
large modification to the dynamic loader, this first patch implements a testing
framework for validating shared object sorting behavior, to enable comparison
between old/new sorting algorithms, and any later enhancements.
This testing infrastructure consists of a Python script
scripts/dso-ordering-test.py' which takes in a description language, consisting
of strings that describe a set of link dependency relations between DSOs, and
generates testcase programs and Makefile fragments to automatically test the
described situation, for example:
a->b->c->d # four objects linked one after another
a->[bc]->d;b->c # a depends on b and c, which both depend on d,
# b depends on c (b,c linked to object a in fixed order)
a->b->c;{+a;%a;-a} # a, b, c serially dependent, main program uses
# dlopen/dlsym/dlclose on object a
a->b->c;{}!->[abc] # a, b, c serially dependent; multiple tests generated
# to test all permutations of a, b, c ordering linked
# to main program
(Above is just a short description of what the script can do, more
documentation is in the script comments.)
Two files containing several new tests, elf/dso-sort-tests-[12].def are added,
including test scenarios for BZ #15311 and Redhat issue #1162810 [1].
Due to the nature of dynamic loader tests, where the sorting behavior and test
output occurs before/after main(), generating testcases to use
support/test-driver.c does not suffice to control meaningful timeout for ld.so.
Therefore a new utility program 'support/test-run-command', based on
test-driver.c/support_test_main.c has been added. This does the same testcase
control, but for a program specified through a command-line rather than at the
source code level. This utility is used to run the dynamic loader testcases
generated by dso-ordering-test.py.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1162810
Signed-off-by: Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As noted in bug 28475, the access attribute on memfrob in <string.h>
is incorrect: the function both reads and writes the memory pointed to
by its argument, so it needs to use __read_write__, not
__write_only__. This incorrect attribute results in a build failure
for accessing uninitialized memory for s390x-linux-gnu-O3 with
build-many-glibcs.py using GCC mainline.
Correct the attribute. Fixing this shows up that some calls to
memfrob in elf/ tests are reading uninitialized memory; I'm not
entirely sure of the purpose of those calls, but guess they are about
ensuring that the stack space is indeed allocated at that point in the
function, and so it matters that they are calling a function whose
semantics are unknown to the compiler. Thus, change the first memfrob
call in those tests to use explicit_bzero instead, as suggested by
Florian in
<https://sourceware.org/pipermail/libc-alpha/2021-October/132119.html>,
to avoid the use of uninitialized memory.
Tested for x86_64, and with build-many-glibcs.py (GCC mainline) for
s390x-linux-gnu-O3.
1. Define DL_RO_DYN_SECTION to initalize bootstrap_map.l_ld_readonly
before calling elf_get_dynamic_info to get dynamic info in bootstrap_map,
2. Define a single
static inline bool
dl_relocate_ld (const struct link_map *l)
{
/* Don't relocate dynamic section if it is readonly */
return !(l->l_ld_readonly || DL_RO_DYN_SECTION);
}
This updates BZ #28340 fix.
THe d6d89608ac broke powerpc for --enable-bind-now because it turned
out that different than patch assumption rtld elf_get_dynamic_info()
does require to handle RTLD_BOOTSTRAP to avoid DT_FLAGS and
DT_RUNPATH (more specially the GLRO usage which is not reallocate
yet).
This patch fixes by passing two arguments to elf_get_dynamic_info()
to inform that by rtld (bootstrap) or static pie initialization
(static_pie_bootstrap). I think using explicit argument is way more
clear and burried C preprocessor, and compiler should remove the
dead code.
I checked on x86_64 and i686 with default options, --enable-bind-now,
and --enable-bind-now and --enable--static-pie. I also check on
aarch64, armhf, powerpc64, and powerpc with default and
--enable-bind-now.
The 4af6982e4c fix does not fully handle RTLD_BOOTSTRAP usage on
rtld.c due two issues:
1. RTLD_BOOTSTRAP is also used on dl-machine.h on various
architectures and it changes the semantics of various machine
relocation functions.
2. The elf_get_dynamic_info() change was done sideways, previously
to 490e6c62aa get-dynamic-info.h was included by the first
dynamic-link.h include *without* RTLD_BOOTSTRAP being defined.
It means that the code within elf_get_dynamic_info() that uses
RTLD_BOOTSTRAP is in fact unused.
To fix 1. this patch now includes dynamic-link.h only once with
RTLD_BOOTSTRAP defined. The ELF_DYNAMIC_RELOCATE call will now have
the relocation fnctions with the expected semantics for the loader.
And to fix 2. part of 4af6982e4c is reverted (the check argument
elf_get_dynamic_info() is not required) and the RTLD_BOOTSTRAP
pieces are removed.
To reorganize the includes the static TLS definition is moved to
its own header to avoid a circular dependency (it is defined on
dynamic-link.h and dl-machine.h requires it at same time other
dynamic-link.h definition requires dl-machine.h defitions).
Also ELF_MACHINE_NO_REL, ELF_MACHINE_NO_RELA, and ELF_MACHINE_PLT_REL
are moved to its own header. Only ancient ABIs need special values
(arm, i386, and mips), so a generic one is used as default.
The powerpc Elf64_FuncDesc is also moved to its own header, since
csu code required its definition (which would require either include
elf/ folder or add a full path with elf/).
Checked on x86_64, i686, aarch64, armhf, powerpc64, powerpc32,
and powerpc64le.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>