Support usable check for all CPU features with the following changes:
1. Change struct cpu_features to
struct cpuid_features
{
struct cpuid_registers cpuid;
struct cpuid_registers usable;
};
struct cpu_features
{
struct cpu_features_basic basic;
struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
...
};
so that there is a usable bit for each cpuid bit.
2. After the cpuid bits have been initialized, copy the known bits to the
usable bits. EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for
CPU feature detection.
3. Clear the usable bits which require OS support.
4. If the feature is supported by OS, copy its cpuid bit to its usable
bit.
5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE
and CPU_FEATURE_USABLE_P to check if a feature is usable.
6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13.
7. Unset MPX feature since it has been deprecated.
The results are
1. If the feature is known and doesn't requre OS support, its usable bit
is copied from the cpuid bit.
2. Otherwise, its usable bit is copied from the cpuid bit only if the
feature is known to supported by OS.
3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the
feature can be used.
4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports
the feature.
Without msgfmt libc.mo files are not generated and its loading failure
is silent ignored with xsetlocale.
Also unset LANGUAGE environment variable to avoid it taking precedence
when loading the message catalog. Although not strictly required
(since the test is issued with test-container and it sets a strict
environment variable) it follows other tests that deal with
translation.
Checked on x86_64-linux-gnu.
__morecore, __after_morecore_hook, and __default_morecore had not
been deprecated in commit 7d17596c19
("Mark malloc hook variables as deprecated"), probably by accident.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Otherwise loading a dynamically linked libc with rseq support fails,
as result of the __rseq_abi TLS variable, which has an alignment
of 32 bytes.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Since
commit 430388d5dc
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Aug 3 08:04:49 2018 -0700
x86: Don't include <init-arch.h> in assembly codes
removed all usages of <init-arch.h> from assembly codes, we can remove
__ASSEMBLER__ check in init-arch.h.
Since
commit c867597bff
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Wed Jun 8 13:57:50 2016 -0700
X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove
removed the only usage of __x86_prefetchw, we can remove the unused
__x86_prefetchw.
A big shoutout to Cupertino Miranda <cmiranda@synopsys.com> for his
valuable contribution in initial bringup and debugging on Linux and
later in solving pesky unwinding/cancelation failures in testsuite.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This test fails intermittently in systems with heavy load as
CLOCK_PROCESS_CPUTIME_ID is subject to scheduler pressure. Thus the
test boundaries were relaxed to keep it from failing on such systems.
A refactor of the spent time checking was made with some support
functions. With the advantage to representing time jitter in percent
of the target.
The values used by the test boundaries are all empirical.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Teach the linker that __mcount_internal, __sigjmp_save_symbol,
__syscall_error and __GI_exit do not use r2, so that it does not need to
recover r2 after the call.
Test at configure time if the assembler supports @notoc and define
USE_PPC64_NOTOC.
Unicode 13.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 13.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Total added characters in newly generated CHARMAP: 5930
Total added characters in newly generated WIDTH: 5536
__printf_fp_l has a memory leak in the case of some I/O errors, where
both buffer and wbuffer have been malloced but the handling of I/O
errors only frees wbuffer. This patch fixes this by moving the
declaration of buffer to an outer scope and ensuring that it is freed
when wbuffer is freed.
Tested for x86_64 and x86.
__printf_fp_l has a double free bug in the case where it allocates
memory with malloc internally, then has an I/O error while outputting
trailing padding and tries to free that already-freed memory when the
error occurs. This patch fixes this by setting the relevant pointer
to NULL after the first free (the only free of this pointer that isn't
immediately followed by returning from the function).
Tested for x86_64 and x86.
Make the instructions for syscall list generation match Makefile and
refer to `update-syscall-lists'; there has been no `update-arch-syscall'
target. Also use single quotes around the command to stick to the ASCII
character set.
Fixes 4cf0d22305 ("Linux: Add tables with system call numbers").
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
To provide a y2038 safe interface a new symbol __shmctl64 is added
and __shmctl is change to call it instead (it adds some extra buffer
copying for the 32 bit time_t implementation).
Two new structures are added:
1. kernel_shmid64_ds: used internally only on 32-bit architectures
to issue the syscall. A handful of architectures (hppa, i386,
mips, powerpc32, and sparc32) require specific implementations
due to their kernel ABI.
2. shmid_ds64: this is only for __TIMESIZE != 64 to use along with
the 64-bit shmctl. It is different than the kernel struct because
the exported 64-bit time_t might require different alignment
depending on the architecture ABI.
So the resulting implementation does:
1. For 64-bit architectures it assumes shmid_ds already contains
64-bit time_t fields and will result in just the __shmctl symbol
using the __shmctl64 code. The shmid_ds argument is passed as-is
to the syscall.
2. For 32-bit architectures with default 64-bit time_t (newer ABIs
such riscv32 or arc), it will also result in only one exported
symbol but with the required high/low time handling.
3. Finally for 32-bit architecture with both 32-bit and 64-bit time_t
support we follow the already set way to provide one symbol with
64-bit time_t support and implement the 32-bit time_t support
using of the 64-bit one.
The default 32-bit symbol will allocate and copy the shmid_ds
over multiple buffers, but this should be deprecated in favor
of the __shmctl64 anyway.
Checked on i686-linux-gnu and x86_64-linux-gnu. I also did some sniff
tests on powerpc, powerpc64, mips, mips64, armhf, sparcv9, and
sparc64.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Each architecture overrides the struct msqid_ds which its required
kernel ABI one.
Checked on x86_64-linux-gnu and some bases sysvipc tests on hppa,
mips, mipsle, mips64, mips64le, sparc64, sparcv9, powerpc64le,
powerpc64, and powerpc.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This will allow us to have architectures specify their own version.
Not semantic changes expected. Checked with a build against the
all affected ABIs.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
To provide a y2038 safe interface a new symbol __msgctl64 is added
and __msgctl is change to call it instead (it adds some extra buffer
coping for the 32 bit time_t implementation).
Two new structures are added:
1. kernel_msqid64_ds: used internally only on 32-bit architectures
to issue the syscall. A handful of architectures (hppa, i386, mips,
powerpc32, and sparc32) require specific implementations due to
their kernel ABI.
2. msqid_ds64: this is only for __TIMESIZE != 64 to use along with
the 64-bit msgctl. It is different than the kernel struct because
the exported 64-bit time_t might require different alignment
depending on the architecture ABI.
So the resulting implementation does:
1. For 64-bit architectures it assumes msqid_ds already contains
64-bit time_t fields and will result in just the __msgctl symbol
using the __msgctl64 code. The msgid_ds argument is passed as-is
to the syscall.
2. For 32-bit architectures with default 64-bit time_t (newer ABIs
such riscv32 or arc), it will also result in only one exported
symbol but with the required high/low time handling.
3. Finally for 32-bit architecture with both 32-bit and 64-bit time_t
support we follow the already set way to provide one symbol with
64-bit time_t support and implement the 32-bit time_t support using
the 64-bit time_t.
The default 32-bit symbol will allocate and copy the msqid_ds
over multiple buffers, but this should be deprecated in favor
of the __msgctl64 anyway.
Checked on i686-linux-gnu and x86_64-linux-gnu. I also did some sniff
tests on powerpc, powerpc64, mips, mips64, armhf, sparcv9, and
sparc64.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Each architecture overrides the struct msqid_ds which its required
kernel ABI one.
Checked on x86_64-linux-gnu and some bases sysvipc tests on hppa,
mips, mipsle, mips64, mips64le, sparc64, sparcv9, powerpc64le,
powerpc64, and powerpc.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
This will allow us to have architectures specify their own version.
Not semantic changes expected. Checked with a build against the
all affected ABIs.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Different than others 64-bit time_t syscalls, the SysIPC interface
does not provide a new set of syscall for y2038 safeness. Instead it
uses unused fields in semid_ds structure to return the high bits for
the timestamps.
To provide a y2038 safe interface a new symbol __semctl64 is added
and __semctl is change to call it instead (it adds some extra buffer
copying for the 32 bit time_t implementation).
Two new structures are added:
1. kernel_semid64_ds: used internally only on 32-bit architectures
to issue the syscall. A handful of architectures (hppa, i386,
mips, powerpc32, sparc32) require specific implementations due
their kernel ABI.
2. semid_ds64: this is only for __TIMESIZE != 64 to use along with
the 64-bit semctl. It is different than the kernel struct because
the exported 64-bit time_t might require different alignment
depending on the architecture ABI.
So the resulting implementation does:
1. For 64-bit architectures it assumes semid_ds already contains
64-bit time_t fields and will result in just the __semctl symbol
using the __semctl64 code. The semid_ds argument is passed as-is
to the syscall.
2. For 32-bit architectures with default 64-bit time_t (newer ABIs
such riscv32 or arc), it will also result in only one exported
symbol but with the required high/low handling.
It might be possible to optimize it further to avoid the
kernel_semid64_ds to semun transformation if the exported ABI
for the architectures matches the expected kernel ABI, but the
implementation is already complex enough and don't think this
should be a hotspot in any case.
3. Finally for 32-bit architecture with both 32-bit and 64-bit time_t
support we follow the already set way to provide one symbol with
64-bit time_t support and implement the 32-bit time_t support
using the 64-bit one.
The default 32-bit symbol will allocate and copy the semid_ds
over multiple buffers, but this should be deprecated in favor
of the __semctl64 anyway.
Checked on i686-linux-gnu and x86_64-linux-gnu. I also did some sniff
tests on powerpc, powerpc64, mips, mips64, armhf, sparcv9, and
sparc64.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
On some targets static TLS surplus area can be used opportunistically
for dynamically loaded modules such that the TLS access then becomes
faster (TLSDESC and powerpc TLS optimization). However we don't want
all surplus TLS to be used for this optimization because dynamically
loaded modules with initial-exec model TLS can only use surplus TLS.
The new contract for surplus static TLS use is:
- libc.so can have up to 192 bytes of IE TLS,
- other system libraries together can have up to 144 bytes of IE TLS.
- Some "optional" static TLS is available for opportunistic use.
The optional TLS is now tunable: rtld.optional_static_tls, so users
can directly affect the allocated static TLS size. (Note that module
unloading with dlclose does not reclaim static TLS. After the optional
TLS runs out, TLS access is no longer optimized to use static TLS.)
The default setting of rtld.optional_static_tls is 512 so the surplus
TLS is 3*192 + 4*144 + 512 = 1664 by default, the same as before.
Fixes BZ #25051.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The new static TLS surplus size computation is
surplus_tls = 192 * (nns-1) + 144 * nns + 512
where nns is controlled via the rtld.nns tunable. This commit
accounts audit modules too so nns = rtld.nns + audit modules.
rtld.nns should only include the namespaces required by the
application, namespaces for audit modules are accounted on top
of that so audit modules don't use up the static TLS that is
reserved for the application. This allows loading many audit
modules without tuning rtld.nns or using up static TLS, and it
fixes
FAIL: elf/tst-auditmany
Note that DL_NNS is currently a hard upper limit for nns, and
if rtld.nns + audit modules go over the limit that's a fatal
error. By default rtld.nns is 4 which allows 12 audit modules.
Counting the audit modules is based on existing audit string
parsing code, we cannot use GLRO(dl_naudit) before the modules
are actually loaded.
TLS_STATIC_SURPLUS is 1664 bytes currently which is not enough to
support DL_NNS (== 16) number of dynamic link namespaces, if we
assume 192 bytes of TLS are reserved for libc use and 144 bytes
are reserved for other system libraries that use IE TLS.
A new tunable is introduced to control the number of supported
namespaces and to adjust the surplus static TLS size as follows:
surplus_tls = 192 * (rtld.nns-1) + 144 * rtld.nns + 512
The default is rtld.nns == 4 and then the surplus TLS size is the
same as before, so the behaviour is unchanged by default. If an
application creates more namespaces than the rtld.nns setting
allows, then it is not guaranteed to work, but the limit is not
checked. So existing usage will continue to work, but in the
future if an application creates more than 4 dynamic link
namespaces then the tunable will need to be set.
In this patch DL_NNS is a fixed value and provides a maximum to
the rtld.nns setting.
Static linking used fixed 2048 bytes surplus TLS, this is changed
so the same contract is used as for dynamic linking. With static
linking DL_NNS == 1 so rtld.nns tunable is forced to 1, so by
default the surplus TLS is reduced to 144 + 512 = 656 bytes. This
change is not expected to cause problems.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
this means that *always* libnsl is only built as shared library for
backward compatibility and the NSS modules libnss_nis and libnss_nisplus
are not built at all, libnsl's headers aren't installed.
This compatibility is kept only for architectures and ABIs that have
been added in or before version 2.28.
Replacement implementations based on TIRPC, which additionally support
IPv6, are available from <https://github.com/thkukuk/>.
This change does not affect libnss_compat which does not depended
on libnsl since 2.27 and thus can be used without NIS.
libnsl code depends on Sun RPC, e.g. on --enable-obsolete-rpc (installed
libnsl headers use installed Sun RPC headers), which will be removed in
the following commit.
This includes bindresvport and the NSS-related RPC functions. This will
simplify the removal of the sunrpc functionality because these functions
no longer have to be treated specially.
RETURN_ADDRESS is used at several places in glibc to mean a valid
code address of the call site, but with pac-ret it may contain a
pointer authentication code (PAC), so its definition is adjusted.
This is gcc PR target/94891: __builtin_return_address should not
expose signed pointers to user code where it can cause ABI issues.
In glibc RETURN_ADDRESS is only changed if it is built with pac-ret.
There is no detection for the specific gcc issue because it is
hard to test and the additional xpac does not cause problems.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Currently gcc -pg -mbranch-protection=pac-ret passes signed return
address to _mcount, so _mcount now has to always strip pac from the
frompc since that's from user code that may be built with pac-ret.
This is gcc PR target/94791: signed pointers should not escape and get
passed across extern call boundaries, since that's an ABI break, but
because existing gcc has this issue we work it around in glibc until
that is resolved. This is compatible with a fixed gcc and it is a nop
on systems without PAuth support. The bug was introduced in gcc-7 with
-msign-return-address=non-leaf|all support which in gcc-9 got renamed
to -mbranch-protection=pac-ret|pac-ret+leaf|standard.
strip_pac uses inline asm instead of __builtin_aarch64_xpaclri since
that is not a documented api and not available in all supported gccs.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use return address signing in assembly files for functions that save
LR when pac-ret is enabled in the compiler.
The GNU property note for PAC-RET is not meaningful to the dynamic
linker so it is not strictly required, but it may be used to track
the security property of binaries. (The PAC-RET property is only set
if BTI is set too because BTI implies working GNU property support.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Return address signing requires unwinder support, which is
present in libgcc since >=gcc-7, however due to bugs the
support may be broken in <gcc-10 (and similarly there may
be issues in custom unwinders), so pac-ret is not always
safe to use. So in assembly code glibc should only use
pac-ret if the compiler uses it too. Unfortunately there
is no predefined feature macro for it set by the compiler
so pac-ret is inferred from the code generation.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When glibc is built with branch protection (i.e. with a gcc configured
with --enable-standard-branch-protection), all glibc binaries should
be BTI compatible and marked as such.
It is easy to link BTI incompatible objects by accident and this is
silent currently which is usually not the expectation, so this is
changed into a link error. (There is no linker flag for failing on
BTI incompatible inputs so all warnings are turned into fatal errors
outside the test system when building glibc with branch protection.)
Unfortunately, outlined atomic functions are not BTI compatible in
libgcc (PR libgcc/96001), so to build glibc with current gcc use
'CC=gcc -mno-outline-atomics', this should be fixed in libgcc soon
and then glibc can be built and tested without such workarounds.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Binaries can opt-in to using BTI via an ELF object file marking.
The dynamic linker has to then mprotect the executable segments
with PROT_BTI. In case of static linked executables or in case
of the dynamic linker itself, PROT_BTI protection is done by the
operating system.
On AArch64 glibc uses PT_GNU_PROPERTY instead of PT_NOTE to check
the properties of a binary because PT_NOTE can be unreliable with
old linkers (old linkers just append the notes of input objects
together and add them to the output without checking them for
consistency which means multiple incompatible GNU property notes
can be present in PT_NOTE).
BTI property is handled in the loader even if glibc is not built
with BTI support, so in theory user code can be BTI protected
independently of glibc. In practice though user binaries are not
marked with the BTI property if glibc has no support because the
static linked libc objects (crt files, libc_nonshared.a) are
unmarked.
This patch relies on Linux userspace API that is not yet in a
linux release but in v5.8-rc1 so scheduled to be in Linux 5.8.
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Tailcalls must use x16 or x17 for the indirect branch instruction
to be compatible with code that uses BTI c at function entries.
(Other forms of indirect branches can only land on BTI j.)
Also added a BTI c at the ELF entry point of rtld, this is not
strictly necessary since the kernel does not use indirect branch
to get there, but it seems safest once building glibc itself with
BTI is supported.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
setcontext returns to the specified context via an indirect jump,
so there should be a BTI j.
In case of getcontext (and all other returns_twice functions) the
compiler adds BTI j at the call site, but swapcontext is a normal
c call that is currently not handled specially by the compiler.
So we change swapcontext such that the saved context returns to a
local address that has BTI j and then swapcontext returns to the
caller via a normal RET. For this we save the original return
address in the slot for x1 of the context because x1 need not be
preserved by swapcontext but it is restored when the context saved
by swapcontext is resumed.
The alternative fix (which is done on x86) would make swapcontext
special in the compiler so BTI j is emitted at call sites, on
x86 there is an indirect_return attribute for this, on AArch64
we would have to use returns_twice. It was decided against because
such fix may need user code updates: the attribute has to be added
when swapcontext is called via a function pointer and it breaks
always_inline functions with swapcontext.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To enable building glibc with branch protection, assembly code
needs BTI landing pads and ELF object file markings in the form
of a GNU property note.
The landing pads are unconditionally added to all functions that
may be indirectly called. When the code segment is not mapped
with PROT_BTI these instructions are nops. They are kept in the
code when BTI is not supported so that the layout of performance
critical code is unchanged across configurations.
The GNU property notes are only added when there is support for
BTI in the toolchain, because old binutils does not handle the
notes right. (Does not know how to merge them nor to put them in
PT_GNU_PROPERTY segment instead of PT_NOTE, and some versions
of binutils emit warnings about the unknown GNU property. In
such cases the produced libc binaries would not have valid
ELF marking so BTI would not be enabled.)
Note: functions using ENTRY or ENTRY_ALIGN now start with an
additional BTI c, so alignment of the following code changes,
but ENTRY_ALIGN_AND_PAD was fixed so there is no change to the
existing code layout. Some string functions may need to be
tuned for optimal performance after this commit.
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The compiler can add required elf markings based on CFLAGS
but the assembler cannot, so using C code for empty files
creates less of a maintenance problem.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Check BTI support in the compiler and linker. The check also
requires READELF that understands the BTI GNU property note.
It is expected to succeed with gcc >=gcc-9 configured with
--enable-standard-branch-protection and binutils >=binutils-2.33.
Note: passing -mbranch-protection=bti in CFLAGS when building glibc
may not be enough to get a glibc that supports BTI because crtbegin*
and crtend* provided by the compiler needs to be BTI compatible too.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>