Glibc does not provide an interface for debugger to access libraries
loaded in multiple namespaces via dlmopen.
The current rtld-debugger interface is described in the file:
elf/rtld-debugger-interface.txt
under the "Standard debugger interface" heading. This interface only
provides access to the first link-map (LM_ID_BASE).
1. Bump r_version to 2 when multiple namespaces are used. This triggers
the GDB bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=28236
2. Add struct r_debug_extended to extend struct r_debug into a linked-list,
where each element correlates to an unique namespace.
3. Initialize the r_debug_extended structure. Bump r_version to 2 for
the new namespace and add the new namespace to the namespace linked list.
4. Add _dl_debug_update to return the address of struct r_debug' of a
namespace.
5. Add a hidden symbol, _r_debug_extended, for struct r_debug_extended.
6. Provide the symbol, _r_debug, with size of struct r_debug, as an alias
of _r_debug_extended, for programs which reference _r_debug.
This fixes BZ #15971.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
We can consider __ehdr_start (from binutils 2.23 onwards)
unconditionally supported, since configure.ac requires binutils>=2.25.
The configure.ac check is related to an ia64 bug fixed by binutils 2.24.
See https://sourceware.org/pipermail/libc-alpha/2014-August/053503.html
Tested on x86_64-linux-gnu. Tested build-many-glibcs.py with
aarch64-linux-gnu and s390x-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This is updated version of the 572bd547d5 (reverted by 40ebfd016a)
that fixes the _dl_next_tls_modid issues.
This issue with 572bd547d5 patch is the DTV entry will be only
update on dl_open_worker() with the update_tls_slotinfo() call after
all dependencies are being processed by _dl_map_object_deps(). However
_dl_map_object_deps() itself might call _dl_next_tls_modid(), and since
the _dl_tls_dtv_slotinfo_list::map is not yet set the entry will be
wrongly reused.
This patch fixes by renaming the _dl_next_tls_modid() function to
_dl_assign_tls_modid() and by passing the link_map so it can set
the slotinfo value so a subsequente _dl_next_tls_modid() call will
see the entry as allocated.
The intermediary value is cleared up on remove_slotinfo() for the case
a library fails to load with RTLD_NOW.
This patch fixes BZ #27135.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
As a result, is not necessary to specify __attribute__ ((nocommon))
on individual definitions.
GCC 10 defaults to -fno-common on all architectures except ARC,
but this change is compatible with older GCC versions and ARC, too.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All the stack lists are now in _rtld_global, so it is possible
to change stack permissions directly from there, instead of
calling into libpthread to do the change.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is an early variant of __tls_init_tp, primarily for initializing
thread-related elements of _rtld_global/GL.
Some existing initialization code not needed for NPTL is moved into
the generic version of this function.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
If libpthread is included in libc, it is not necessary to delay
initialization of the lock/unlock function pointers until libpthread
is loaded. This eliminates two unprotected function pointers
from _rtld_global and removes some initialization code from
libpthread.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The stack list is available in ld.so since commit
1daccf403b ("nptl: Move stack list
variables into _rtld_global"), so it's possible to walk the stack
list directly in ld.so and perform the initialization there.
This eliminates an unprotected function pointer from _rtld_global
and reduces the libpthread initialization code.
TLS_INIT_TP is processor-specific, so it is not a good place to
put thread library initialization code (it would have to be repeated
for all CPUs). Introduce __tls_init_tp as a separate function,
to be called immediately after TLS_INIT_TP. Move the existing
stack list setup code for NPTL to this function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Calling free directly may end up freeing a pointer allocated by the
dynamic loader using malloc from libc.so in the base namespace using
the allocator from libc.so in a secondary namespace, which results in
crashes.
This commit redirects the free call through GLRO and the dynamic
linker, to reach the correct namespace. It also cleans up the dlerror
handling along the way, so that pthread_setspecific is no longer
needed (which avoids triggering bug 24774).
Commit 9e78f6f6e7 ("Implement
_dl_catch_error, _dl_signal_error in libc.so [BZ #16628]") has the
side effect that distinct namespaces, as created by dlmopen, now have
separate implementations of the rtld exception mechanism. This means
that the call to _dl_catch_error from libdl in a secondary namespace
does not actually install an exception handler because the
thread-local variable catch_hook in the libc.so copy in the secondary
namespace is distinct from that of the base namepace. As a result, a
dlsym/dlopen/... failure in a secondary namespace terminates the process
with a dynamic linker error because it looks to the exception handler
mechanism as if no handler has been installed.
This commit restores GLRO (dl_catch_error) and uses it to set the
handler in the base namespace.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Otherwise, it will not participate in the dependency sorting.
Fixes commit 9ffa50b26b
("elf: Include libc.so.6 as main program in dependency sort
(bug 20972)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.
When the fd is not available (kernel loaded modules), then -1
is passed.
The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This hacks non-power-set processing into _dl_important_hwcaps.
Once the legacy hwcaps handling goes away, the subdirectory
handling needs to be reworked, but it is premature to do this
while both approaches are still supported.
ld.so supports two new arguments, --glibc-hwcaps-prepend and
--glibc-hwcaps-mask. Each accepts a colon-separated list of
glibc-hwcaps subdirectory names. The prepend option adds additional
subdirectories that are searched first, in the specified order. The
mask option restricts the automatically selected subdirectories to
those listed in the option argument. For example, on systems where
/usr/lib64 is on the library search path,
--glibc-hwcaps-prepend=valgrind:debug causes the dynamic loader to
search the directories /usr/lib64/glibc-hwcaps/valgrind and
/usr/lib64/glibc-hwcaps/debug just before /usr/lib64 is searched.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This prints out version information for the dynamic loader and
exits immediately, without further command line processing
(which seems to match what some GNU tools do).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
--help processing is deferred to the point where the executable has
been loaded, so that it is possible to eventually include information
from the main executable in the help output.
As suggested in the GNU command-line interface guidelines, the help
message is printed to standard output, and the exit status is
successful.
Handle usage errors closer to the GNU command-line interface
guidelines.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Also add a comment to elf/Makefile, explaining why we cannot use
config.status for autoconf template processing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Introduce struct dl_main_state and move it to <dl-main.h>. Rename
enum mode to enum rtld_mode and add the rtld_mode_ prefix to the enum
constants.
This avoids the need for putting state that is only needed during
startup into the ld.so data segment.
The new static TLS surplus size computation is
surplus_tls = 192 * (nns-1) + 144 * nns + 512
where nns is controlled via the rtld.nns tunable. This commit
accounts audit modules too so nns = rtld.nns + audit modules.
rtld.nns should only include the namespaces required by the
application, namespaces for audit modules are accounted on top
of that so audit modules don't use up the static TLS that is
reserved for the application. This allows loading many audit
modules without tuning rtld.nns or using up static TLS, and it
fixes
FAIL: elf/tst-auditmany
Note that DL_NNS is currently a hard upper limit for nns, and
if rtld.nns + audit modules go over the limit that's a fatal
error. By default rtld.nns is 4 which allows 12 audit modules.
Counting the audit modules is based on existing audit string
parsing code, we cannot use GLRO(dl_naudit) before the modules
are actually loaded.
TLS_STATIC_SURPLUS is 1664 bytes currently which is not enough to
support DL_NNS (== 16) number of dynamic link namespaces, if we
assume 192 bytes of TLS are reserved for libc use and 144 bytes
are reserved for other system libraries that use IE TLS.
A new tunable is introduced to control the number of supported
namespaces and to adjust the surplus static TLS size as follows:
surplus_tls = 192 * (rtld.nns-1) + 144 * rtld.nns + 512
The default is rtld.nns == 4 and then the surplus TLS size is the
same as before, so the behaviour is unchanged by default. If an
application creates more namespaces than the rtld.nns setting
allows, then it is not guaranteed to work, but the limit is not
checked. So existing usage will continue to work, but in the
future if an application creates more than 4 dynamic link
namespaces then the tunable will need to be set.
In this patch DL_NNS is a fixed value and provides a maximum to
the rtld.nns setting.
Static linking used fixed 2048 bytes surplus TLS, this is changed
so the same contract is used as for dynamic linking. With static
linking DL_NNS == 1 so rtld.nns tunable is forced to 1, so by
default the surplus TLS is reduced to 144 + 512 = 656 bytes. This
change is not expected to cause problems.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add generic code to handle PT_GNU_PROPERTY notes. Invalid
content is ignored, _dl_process_pt_gnu_property is always called
after PT_LOAD segments are mapped and it has no failure modes.
Currently only one NT_GNU_PROPERTY_TYPE_0 note is handled, which
contains target specific properties: the _dl_process_gnu_property
hook is called for each property.
The old _dl_process_pt_note and _rtld_process_pt_note differ in how
the program header is read. The old _dl_process_pt_note is called
before PT_LOAD segments are mapped and _rtld_process_pt_note is called
after PT_LOAD segments are mapped. The old _rtld_process_pt_note is
removed and _dl_process_pt_note is always called after PT_LOAD
segments are mapped and now it has no failure modes.
The program headers are scanned backwards so that PT_NOTE can be
skipped if PT_GNU_PROPERTY exists.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
1. Include <dl-procruntime.c> to get architecture specific initializer in
rtld_global.
2. Change _dl_x86_feature_1[2] to _dl_x86_feature_1.
3. Add _dl_x86_feature_control after _dl_x86_feature_1, which is a
struct of 2 bitfields for IBT and SHSTK control
This fixes [BZ #25887].
The rseq initialization should happen only for the libc in the base
namespace (in the dynamic case) or the statically linked libc. The
__libc_multiple_libcs flag does not quite cover this case at present,
so this commit introduces a flag argument to __libc_early_init,
indicating whether the libc being libc is the primary one (of the main
program).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This function is defined in libc.so, and the dynamic loader calls
right after relocation has been finished, before any ELF constructors
or the preinit function is invoked. It is also used in the static
build for initializing parts of the static libc.
To locate __libc_early_init, a direct symbol lookup function is used,
_dl_lookup_direct. It does not search the entire symbol scope and
consults merely a single link map. This function could also be used
to implement lookups in the vDSO (as an optimization).
A per-namespace variable (libc_map) is added for locating libc.so,
to avoid repeated traversals of the search scope. It is similar to
GL(dl_initfirst). An alternative would have been to thread a context
argument from _dl_open down to _dl_map_object_from_fd (where libc.so
is identified). This could have avoided the global variable, but
the change would be larger as a result. It would not have been
possible to use this to replace GL(dl_initfirst) because that global
variable is used to pass the function pointer past the stack switch
from dl_main to the main program. Replacing that requires adding
a new argument to _dl_init, which in turn needs changes to the
architecture-specific libc.so startup code written in assembler.
__libc_early_init should not be used to replace _dl_var_init (as
it exists today on some architectures). Instead, _dl_lookup_direct
should be used to look up a new variable symbol in libc.so, and
that should then be initialized from the dynamic loader, immediately
after the object has been loaded in _dl_map_object_from_fd (before
relocation is run). This way, more IFUNC resolvers which depend on
these variables will work.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
binutils ld has supported --audit, --depaudit for a long time,
only support in glibc has been missing.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All list elements are colon-separated strings, and there is a hard
upper limit for the number of audit modules, so it is possible to
pre-allocate a fixed-size array of strings to which the LD_AUDIT
environment variable and --audit arguments are added.
Also eliminate the global variables for the audit list because
the list is only needed briefly during startup.
There is a slight behavior change: All duplicate LD_AUDIT environment
variables are now processed, not just the last one as before. However,
such environment vectors are invalid anyway.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Exporting functions and relying on symbol interposition from libc.so
makes the choice of implementation dependent on DT_NEEDED order, which
is not what some compiler drivers expect.
This commit replaces one magic mechanism (symbol interposition) with
another one (preprocessor-/compiler-based redirection). This makes
the hand-over from the minimal malloc to the full malloc more
explicit.
Removing the ABI symbols is backwards-compatible because libc.so is
always in scope, and the dynamic loader will find the malloc-related
symbols there since commit f0b2132b35
("ld.so: Support moving versioned symbols between sonames
[BZ #24741]").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch moves the vDSO setup from libc to loader code, just after
the vDSO link_map setup. For static case the initialization
is moved to _dl_non_dynamic_init instead.
Instead of using the mangled pointer, the vDSO data is set as
attribute_relro (on _rtld_global_ro for shared or _dl_vdso_* for
static). It is read-only even with partial relro.
It fixes BZ#24967 now that the vDSO pointer is setup earlier than
malloc interposition is called.
Also, vDSO calls should not be a problem for static dlopen as
indicated by BZ#20802. The vDSO pointer would be zero-initialized
and the syscall will be issued instead.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, powerpc64-linux-gnu,
powerpc-linux-gnu, s390x-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu. I also run some tests on mips.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This change splits the scope and TLS slotinfo updates in dlopen into
two parts: one to resize the data structures, and one to actually apply
the update. The call to add_to_global_resize in dl_open_worker is moved
before the demarcation point at which no further memory allocations are
allowed.
_dl_add_to_slotinfo is adjusted to make the list update optional. There
is some optimization possibility here because we could grow the slotinfo
list of arrays in a single call, one the largest TLS modid is known.
This commit does not fix the fatal meory allocation failure in
_dl_update_slotinfo. Ideally, this error during dlopen should be
recoverable.
The update order of scopes and TLS data structures is retained, although
it appears to be more correct to fully initialize TLS first, and then
expose symbols in the newly loaded objects via the scope update.
Tested on x86_64-linux-gnu.
Change-Id: I240c58387dabda3ca1bcab48b02115175fa83d6c
To improve GCC 10 compatibility, it is necessary to remove the l_audit
zero-length array from the end of struct link_map. In preparation of
that, this commit introduces an accessor function for the audit state,
so that it is possible to change the representation of the audit state
without adjusting the code that accesses it.
Tested on x86_64-linux-gnu. Built on i686-gnu.
Change-Id: Id815673c29950fc011ae5301d7cde12624f658df
This patch refactor how hp-timing is used on loader code for statistics
report. The HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed and
HP_TIMING_INLINE is used instead to check for hp-timing avaliability.
For alpha, which only defines HP_SMALL_TIMING_AVAIL, the HP_TIMING_INLINE
is set iff for IS_IN(rtld).
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all afected ABIs.
* benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with
HP_TIMING_INLINE.
* nptl/descr.h: Likewise.
* elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF,
RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define.
(dl_start_final_info, _dl_start_final, dl_main, print_statistics):
Abstract hp-timing usage with RTLD_* macros.
* sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld).
(HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove.
* sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL,
HP_TIMING_NONAVAIL): Likewise.
* sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
Likewise.
* sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
Likewise.
* sysdeps/generic/hp-timing-common.h: Update comment with
HP_TIMING_AVAIL removal.
This patch removes CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID support
from clock_gettime and clock_settime generic implementation. For Linux, kernel
already provides supports through the syscall and Hurd HTL lacks
__pthread_clock_gettime and __pthread_clock_settime internal implementation.
As described in clock_gettime man-page [1] on 'Historical note for SMP
system', implementing CLOCK_{THREAD,PROCESS}_CPUTIME_ID with timer registers
is error-prone and susceptible to timing and accurary issues that the libc
can not deal without kernel support.
This allows removes unused code which, however, still incur in some runtime
overhead in thread creation (the struct pthread cpuclock_offset
initialization).
If hurd eventually wants to support them it should either either implement as
a kernel facility (or something related due its architecture) or in system
specific implementation.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked on a i686-gnu build.
* nptl/Makefile (libpthread-routines): Remove pthread_clock_gettime and
pthread_clock_settime.
* nptl/pthreadP.h (__find_thread_by_id): Remove prototype.
* elf/dl-support.c [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset): Remove.
(_dl_non_dynamic_init): Remove _dl_cpuclock_offset setting.
* elf/rtld.c (_dl_start_final): Likewise.
* nptl/allocatestack.c (__find_thread_by_id): Remove function.
* sysdeps/generic/ldsodefs.h [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset):
Remove.
* sysdeps/mach/hurd/dl-sysdep.c [!HP_TIMING_NOAVAIL]
(_dl_cpuclock_offset): Remove.
* nptl/descr.h (struct pthread): Rename cpuclock_offset to
cpuclock_offset_ununsed.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal): Remove
cpuclock_offset set.
* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* nptl/pthread_clock_gettime.c: Remove file.
* nptl/pthread_clock_settime.c: Likewise.
* sysdeps/unix/clock_gettime.c (hp_timing_gettime): Remove function.
[HP_TIMING_AVAIL] (realtime_gettime): Remove CLOCK_THREAD_CPUTIME_ID
and CLOCK_PROCESS_CPUTIME_ID support.
* sysdeps/unix/clock_settime.c (hp_timing_gettime): Likewise.
[HP_TIMING_AVAIL] (realtime_gettime): Likewise.
* sysdeps/posix/clock_getres.c (hp_timing_getres): Likewise.
[HP_TIMING_AVAIL] (__clock_getres): Likewise.
* sysdeps/unix/clock_nanosleep.c (CPUCLOCK_P, INVALID_CLOCK_P):
Likewise.
(__clock_nanosleep): Remove CPUCLOCK_P and INVALID_CLOCK_P usage.
[1] http://man7.org/linux/man-pages/man2/clock_gettime.2.html
This change moves the audit module loading and early notification into
separate functions out of dl_main.
It restores the bug fix from commit
8e889c5da3 ("elf: Fix LD_AUDIT for
modules with invalid version (BZ#24122)") which was reverted in commit
83e6b59625 ("[elf] Revert 8e889c5da3
(BZ#24122)").
The actual bug fix is the separate error message for the case when
la_version returns zero. The dynamic linker error message (which is
NULL in this case) is no longer used. Based on the intended use of
version zero (ignore this module due to explicit request), the message
is only printed if debugging is enabled.
This patch adds fall-through comments in some cases where -Wextra
produces implicit-fallthrough warnings.
The patch is non-exhaustive. Apart from architecture-specific code
for non-x86_64 architectures, it does not change sunrpc/xdr.c (legacy
code, probably should have such changes, but left to be dealt with
separately), or places that already had comments about the
fall-through but not matching the form expected by
-Wimplicit-fallthrough=3 (the default level with -Wextra; my
inclination is to adjust those comments to match rather than
downgrading to -Wimplicit-fallthrough=1 to allow any comment), or one
place where I thought the implicit fallthrough was not correct and so
should be handled separately as a bug fix. I think the key thing to
consider in review of this patch is whether the fall-through is indeed
intended and correct in each place where such a comment is added.
Tested for x86_64.
* elf/dl-exception.c (_dl_exception_create_format): Add
fall-through comments.
* elf/ldconfig.c (parse_conf_include): Likewise.
* elf/rtld.c (print_statistics): Likewise.
* locale/programs/charmap.c (parse_charmap): Likewise.
* misc/mntent_r.c (__getmntent_r): Likewise.
* posix/wordexp.c (parse_arith): Likewise.
(parse_backtick): Likewise.
* resolv/ns_ttl.c (ns_parse_ttl): Likewise.
* sysdeps/x86/cpu-features.c (init_cpu_features): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
Intel Control-flow Enforcement Technology (CET) instructions:
https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-en
forcement-technology-preview.pdf
includes Indirect Branch Tracking (IBT) and Shadow Stack (SHSTK).
GNU_PROPERTY_X86_FEATURE_1_IBT is added to GNU program property to
indicate that all executable sections are compatible with IBT when
ENDBR instruction starts each valid target where an indirect branch
instruction can land. Linker sets GNU_PROPERTY_X86_FEATURE_1_IBT on
output only if it is set on all relocatable inputs.
On an IBT capable processor, the following steps should be taken:
1. When loading an executable without an interpreter, enable IBT and
lock IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the executable.
2. When loading an executable with an interpreter, enable IBT if
GNU_PROPERTY_X86_FEATURE_1_IBT is set on the interpreter.
a. If GNU_PROPERTY_X86_FEATURE_1_IBT isn't set on the executable,
disable IBT.
b. Lock IBT.
3. If IBT is enabled, when loading a shared object without
GNU_PROPERTY_X86_FEATURE_1_IBT:
a. If legacy interwork is allowed, then mark all pages in executable
PT_LOAD segments in legacy code page bitmap. Failure of legacy code
page bitmap allocation causes an error.
b. If legacy interwork isn't allowed, it causes an error.
GNU_PROPERTY_X86_FEATURE_1_SHSTK is added to GNU program property to
indicate that all executable sections are compatible with SHSTK where
return address popped from shadow stack always matches return address
popped from normal stack. Linker sets GNU_PROPERTY_X86_FEATURE_1_SHSTK
on output only if it is set on all relocatable inputs.
On a SHSTK capable processor, the following steps should be taken:
1. When loading an executable without an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the executable.
2. When loading an executable with an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on interpreter.
a. If GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set on the executable
or any shared objects loaded via the DT_NEEDED tag, disable SHSTK.
b. Otherwise lock SHSTK.
3. After SHSTK is enabled, it is an error to load a shared object
without GNU_PROPERTY_X86_FEATURE_1_SHSTK.
To enable CET support in glibc, --enable-cet is required to configure
glibc. When CET is enabled, both compiler and assembler must support
CET. Otherwise, it is a configure-time error.
To support CET run-time control,
1. _dl_x86_feature_1 is added to the writable ld.so namespace to indicate
if IBT or SHSTK are enabled at run-time. It should be initialized by
init_cpu_features.
2. For dynamic executables:
a. A l_cet field is added to struct link_map to indicate if IBT or
SHSTK is enabled in an ELF module. _dl_process_pt_note or
_rtld_process_pt_note is called to process PT_NOTE segment for
GNU program property and set l_cet.
b. _dl_open_check is added to check IBT and SHSTK compatibilty when
dlopening a shared object.
3. Replace i386 _dl_runtime_resolve and _dl_runtime_profile with
_dl_runtime_resolve_shstk and _dl_runtime_profile_shstk, respectively if
SHSTK is enabled.
CET run-time control can be changed via GLIBC_TUNABLES with
$ export GLIBC_TUNABLES=glibc.tune.x86_shstk=[permissive|on|off]
$ export GLIBC_TUNABLES=glibc.tune.x86_ibt=[permissive|on|off]
1. permissive: SHSTK is disabled when dlopening a legacy ELF module.
2. on: IBT or SHSTK are always enabled, regardless if there are IBT or
SHSTK bits in GNU program property.
3. off: IBT or SHSTK are always disabled, regardless if there are IBT or
SHSTK bits in GNU program property.
<cet.h> from CET-enabled GCC is automatically included by assembly codes
to add GNU_PROPERTY_X86_FEATURE_1_IBT and GNU_PROPERTY_X86_FEATURE_1_SHSTK
to GNU program property. _CET_ENDBR is added at the entrance of all
assembly functions whose address may be taken. _CET_NOTRACK is used to
insert NOTRACK prefix with indirect jump table to support IBT. It is
defined as notrack when _CET_NOTRACK is defined in <cet.h>.
[BZ #21598]
* configure.ac: Add --enable-cet.
* configure: Regenerated.
* elf/Makefille (all-built-dso): Add a comment.
* elf/dl-load.c (filebuf): Moved before "dynamic-link.h".
Include <dl-prop.h>.
(_dl_map_object_from_fd): Call _dl_process_pt_note on PT_NOTE
segment.
* elf/dl-open.c: Include <dl-prop.h>.
(dl_open_worker): Call _dl_open_check.
* elf/rtld.c: Include <dl-prop.h>.
(dl_main): Call _rtld_process_pt_note on PT_NOTE segment. Call
_rtld_main_check.
* sysdeps/generic/dl-prop.h: New file.
* sysdeps/i386/dl-cet.c: Likewise.
* sysdeps/unix/sysv/linux/x86/cpu-features.c: Likewise.
* sysdeps/unix/sysv/linux/x86/dl-cet.h: Likewise.
* sysdeps/x86/cet-tunables.h: Likewise.
* sysdeps/x86/check-cet.awk: Likewise.
* sysdeps/x86/configure: Likewise.
* sysdeps/x86/configure.ac: Likewise.
* sysdeps/x86/dl-cet.c: Likewise.
* sysdeps/x86/dl-procruntime.c: Likewise.
* sysdeps/x86/dl-prop.h: Likewise.
* sysdeps/x86/libc-start.h: Likewise.
* sysdeps/x86/link_map.h: Likewise.
* sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve): Add
_CET_ENDBR.
(_dl_runtime_profile): Likewise.
(_dl_runtime_resolve_shstk): New.
(_dl_runtime_profile_shstk): Likewise.
* sysdeps/linux/x86/Makefile (sysdep-dl-routines): Add dl-cet
if CET is enabled.
(CFLAGS-.o): Add -fcf-protection if CET is enabled.
(CFLAGS-.os): Likewise.
(CFLAGS-.op): Likewise.
(CFLAGS-.oS): Likewise.
(asm-CPPFLAGS): Add -fcf-protection -include cet.h if CET
is enabled.
(tests-special): Add $(objpfx)check-cet.out.
(cet-built-dso): New.
(+$(cet-built-dso:=.note)): Likewise.
(common-generated): Add $(cet-built-dso:$(common-objpfx)%=%.note).
($(objpfx)check-cet.out): New.
(generated): Add check-cet.out.
* sysdeps/x86/cpu-features.c: Include <dl-cet.h> and
<cet-tunables.h>.
(TUNABLE_CALLBACK (set_x86_ibt)): New prototype.
(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
(init_cpu_features): Call get_cet_status to check CET status
and update dl_x86_feature_1 with CET status. Call
TUNABLE_CALLBACK (set_x86_ibt) and TUNABLE_CALLBACK
(set_x86_shstk). Disable and lock CET in libc.a.
* sysdeps/x86/cpu-tunables.c: Include <cet-tunables.h>.
(TUNABLE_CALLBACK (set_x86_ibt)): New function.
(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
* sysdeps/x86/sysdep.h (_CET_NOTRACK): New.
(_CET_ENDBR): Define if not defined.
(ENTRY): Add _CET_ENDBR.
* sysdeps/x86/dl-tunables.list (glibc.tune): Add x86_ibt and
x86_shstk.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve): Add
_CET_ENDBR.
(_dl_runtime_profile): Likewise.