Linux 5.6 has new openat2 and pidfd_getfd syscalls. This patch adds
them to syscall-names.list and regenerates the arch-syscall.h files.
Tested with build-many-glibcs.py.
binutils ld has supported --audit, --depaudit for a long time,
only support in glibc has been missing.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All list elements are colon-separated strings, and there is a hard
upper limit for the number of audit modules, so it is possible to
pre-allocate a fixed-size array of strings to which the LD_AUDIT
environment variable and --audit arguments are added.
Also eliminate the global variables for the audit list because
the list is only needed briefly during startup.
There is a slight behavior change: All duplicate LD_AUDIT environment
variables are now processed, not just the last one as before. However,
such environment vectors are invalid anyway.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The advantage is that the buffer will always contain the number
of characters as returned from the function, which allows one to use
a sequence like
/* No more audit module output. */
line_length = xgetline (&buffer, &buffer_length, fp);
TEST_COMPARE_BLOB ("", 0, buffer, line_length);
to check for an expected EOF, while also reporting any unexpected
extra data encountered.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All cancellable syscalls are done by C implementations, so there is no
no need to use a specialized implementation to optimize register usage.
It fixes BZ #25765.
Checked on x86_64-linux-gnu.
Adding the test "tst-safe-linking" for testing that Safe-Linking works
as expected. The test checks these 3 main flows:
* tcache protection
* fastbin protection
* malloc_consolidate() correctness
As there is a random chance of 1/16 that of the alignment will remain
correct, the test checks each flow up to 10 times, using different random
values for the pointer corruption. As a result, the chance for a false
failure of a given tested flow is 2**(-40), thus highly unlikely.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Now there is a generic __timeval32 and helpers we can use them for Alpha
instead of the Alpha specific ones.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The Linux kernel expects rusage to use a 32-bit time_t, even on archs
with a 64-bit time_t (like RV32). To address this let's convert
rusage to/from 32-bit and 64-bit to ensure the kernel always gets
a 32-bit time_t.
While we are converting these functions let's also convert them to be
the y2038 safe versions. This means there is a *64 function that is
called by a backwards compatible wrapper.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The Linux kernel expects itimerval to use a 32-bit time_t, even on archs
with a 64-bit time_t (like RV32). To address this let's convert
itimerval to/from 32-bit and 64-bit to ensure the kernel always gets
a 32-bit time_t.
While we are converting these functions let's also convert them to be
the y2038 safe versions. This means there is a *64 function that is
called by a backwards compatible wrapper.
Tested-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On y2038 safe 32-bit systems the Linux kernel expects itimerval
and rusage to use a 32-bit time_t, even though the other time_t's
are 64-bit. There are currently no plans to make 64-bit time_t versions
of these structs.
There are also other occurrences where the time passed to the kernel via
timeval doesn't match the wordsize.
To handle these cases let's define a new macro
__KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64. This macro specifies if the
kernel's old_timeval matches the new timeval64. This should be 1 for
64-bit architectures except for Alpha's osf syscalls. The define should
be 0 for 32-bit architectures and Alpha's osf syscalls.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The corner cases included were generated using exhaustive search
for all float/binary32 values on x86_64 (comparing to MPFR for
correct rounding to nearest).
For the j0/j1/y0 functions, only cases with ulp error <= 9 were
included.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Alignment checks should be performed on the user's buffer and NOT
on the mchunkptr as was done before. This caused bugs in 32 bit
versions, because: 2*sizeof(t) != MALLOC_ALIGNMENT.
As the tcache works on users' buffers it uses the aligned_OK()
check, and the rest work on mchunkptr and therefore check using
misaligned_chunk().
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Removed unneeded '\' chars from end of lines and fixed some
indentation issues that were introduced in the original
Safe-Linking patch.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch makes build-many-glibcs.py use the current versions of
Linux (5.6) and GMP (6.2.0).
Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
Adds a POWER9 version of fmaf128 that uses the xsmaddqp
instruction.
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This addresses an issue that is present mainly on SMP machines running
threaded code. In a typical indirect call or PLT import stub, the
target address is loaded first. Then the global pointer is loaded into
the PIC register in the delay slot of a branch to the target address.
During lazy binding, the target address is a trampoline which transfers
to _dl_runtime_resolve().
_dl_runtime_resolve() uses the relocation offset stored in the global
pointer and the linkage map stored in the trampoline to find the
relocation. Then, the function descriptor is updated.
In a multi-threaded application, it is possible for the global pointer
to be updated between the load of the target address and the global
pointer. When this happens, the relocation offset has been replaced
by the new global pointer. The function pointer has probably been
updated as well but there is no way to find the address of the function
descriptor and to transfer to the target. So, _dl_runtime_resolve()
typically crashes.
HP-UX addressed this problem by adding an extra pc-relative branch to
the trampoline. The descriptor is initially setup to point to the
branch. The branch then transfers to the trampoline. This allowed
the trampoline code to figure out which descriptor was being used
without any modification to user code. I didn't use this approach
as it is more complex and changes function pointer canonicalization.
The order of loading the target address and global pointer in
indirect calls was not consistent with the order used in import stubs.
In particular, $$dyncall and some inline versions of it loaded the
global pointer first. This was inconsistent with the global pointer
being updated first in dl-machine.h. Assuming the accesses are
ordered, we want elf_machine_fixup_plt() to store the global pointer
first and calls to load it last. Then, the global pointer will be
correct when the target function is entered.
However, just to make things more fun, HP added support for
out-of-order execution of accesses in PA 2.0. The accesses used by
calls are weakly ordered. So, it's possibly under some circumstances
that a function might be entered with the wrong global pointer.
However, HP uses weakly ordered accesses in 64-bit HP-UX, so I assume
that loading the global pointer in the delay slot of the branch must
work consistently.
The basic fix for the race is a combination of modifying user code to
preserve the address of the function descriptor in register %r22 and
setting the least-significant bit in the relocation offset. The
latter was suggested by Carlos as a way to distinguish relocation
offsets from global pointer values. Conventionally, %r22 is used
as the address of the function descriptor in calls to $$dyncall.
So, it wasn't hard to preserve the address in %r22.
I have updated gcc trunk and gcc-9 branch to not clobber %r22 in
$$dyncall and inline indirect calls. I have also modified the import
stubs in binutils trunk and the 2.33 branch to preserve %r22. This
required making the stubs one instruction longer but we save one
relocation. I also modified binutils to align the .plt section on
a 8-byte boundary. This allows descriptors to be updated atomically
with a floting-point store.
With these changes, _dl_runtime_resolve() can fallback to an alternate
mechanism to find the relocation offset when it has been clobbered.
There's just one additional instruction in the fast path. I tested
the fallback function, _dl_fix_reloc_arg(), by changing the branch to
always use the fallback. Old code still runs as it did before.
Fixes bug 23296.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Similar to fenvinline.h removal, this kind of optimization is better
implemented by the compiler. Also newer code avoid setting exceptions
directly (for instance the code to make new logf, log2f and powf
implementatation to now support SVID compat).
The BZ#94194 [1] the corresponding GCC bug for adding replacements
for these on x86.
Checked on x86_64-linux-gnu and i686-linux-gnu.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94194
Similar to string2.h (18b10de7ce) and string3.h (09a596cc2c) this
patch removes the fenvinline.h on all architectures. Currently
only powerpc implements some optimizations. This kind of optimization
is better implemented by the compiler (which handles the architecture
ISA transparently).
Also, for the specific optimized powerpc implementation the code is
becoming convoluted and these micro-optimization are hardly wildly
used, even more being a possible hotspot in realword cases
(non-default rounding are used only on specific cases and exception
handling are done most likely only on errors path). Only x86
implements similar optimization (on fenv.h) also indicates that
these should no be on libc.
The math/test-fenv already covers all math/test-fenvinline tests,
so it is safe to remove it.
The powerpc fegetround optimization is moved to internal
fenv_libc.h.
The BZ#94193 [1] the corresponding GCC bug for adding replacements
for these on powerpc.
Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94193
Safe-Linking is a security mechanism that protects single-linked
lists (such as the fastbin and tcache) from being tampered by attackers.
The mechanism makes use of randomness from ASLR (mmap_base), and when
combined with chunk alignment integrity checks, it protects the "next"
pointers from being hijacked by an attacker.
While Safe-Unlinking protects double-linked lists (such as the small
bins), there wasn't any similar protection for attacks against
single-linked lists. This solution protects against 3 common attacks:
* Partial pointer override: modifies the lower bytes (Little Endian)
* Full pointer override: hijacks the pointer to an attacker's location
* Unaligned chunks: pointing the list to an unaligned address
The design assumes an attacker doesn't know where the heap is located,
and uses the ASLR randomness to "sign" the single-linked pointers. We
mark the pointer as P and the location in which it is stored as L, and
the calculation will be:
* PROTECT(P) := (L >> PAGE_SHIFT) XOR (P)
* *L = PROTECT(P)
This way, the random bits from the address L (which start at the bit
in the PAGE_SHIFT position), will be merged with LSB of the stored
protected pointer. This protection layer prevents an attacker from
modifying the pointer into a controlled value.
An additional check that the chunks are MALLOC_ALIGNed adds an
important layer:
* Attackers can't point to illegal (unaligned) memory addresses
* Attackers must guess correctly the alignment bits
On standard 32 bit Linux machines, an attack will directly fail 7
out of 8 times, and on 64 bit machines it will fail 15 out of 16
times.
This proposed patch was benchmarked and it's effect on the overall
performance of the heap was negligible and couldn't be distinguished
from the default variance between tests on the vanilla version. A
similar protection was added to Chromium's version of TCMalloc
in 2012, and according to their documentation it had an overhead of
less than 2%.
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Adhemerval Zacnella <adhemerval.zanella@linaro.org>
On y2038 safe 32-bit systems the Linux kernel expects itimerval to
use a 32-bit time_t, even though the other time_t's are 64-bit. To
address this let's add a __timeval32 struct to be used internally.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These functions are alpha specifc, rename them to be clear.
Let's also rename the header file from tv32-compat.h to
alpha-tv32-compat.h. This is to avoid conflicts with the one we will
introduce later.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some of these files depend on the avoidance of using the various
register sets of POWER. When enabling the IEEE 128 long double,
we must be sure to disable this ABI as some compilers will
refuse to compile if -mno-vsx and -mabi=ieeelongdouble are both
present.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
In practice, this flag should be applied globally, but it makes a good
sanity check to ensure ibm128 and ieee128 long double files are not
getting mismatched. _Float128 files use no long double, thus are
always safe to use this option.
Similarly, when investigating the linker complaints, difftime
makes trivial, self contained, usage of long double, so thus it
is also explicitly marked as such.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
This better resembles the default linking process with the gnulibs,
and also resolves the increasingly difficult to maintain
f128-loader-link usage on powerpc64le as some libgcc symbols are
dependent on those found in the loader (ld).
Ensure the correct ldouble abi flags are applied to ibm128 files and
nldbl files. Remove the IEEE options if used, and apply the flags
used to build ldouble files which are ibm128 abi.
nldbl tests are a little tricky. To use the support, we must remove
all ldouble abi flags, and ensure -mlong-double-64 is used.
Co-authored-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Co-authored-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
The ldbl redirects for ieee128 have some jagged edges when
inspecting and manipulating symbols directly.
e.g asprintf is unconditionally redirected to __asprintfieee128
thus any tests relying on GCC's redirect behavior will encounter
problems if they inspect the symbol names too closely.
I've mitigated tests which expose the limitations of the
ldbl -> f128 redirects by giving them knowledge about the
redirected symbol names.
Hopefully there isn't much user code which depends on this
implementation specific behavior.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Tweak the PLT bypass magic when building glibc with long double
redirects. This is made more difficult by the fact we only get
one chance to redirect functions. This happens via the public
headers.
There are roughly three classes of redirect we need to attend to
today:
1. Simple redirects, redirected via cdef macro overrides and
and new libc_hidden_ldbl_proto macro.
2. Internal usage of internal API, e.g __snprintf, which has
no direct analogue. This is bypassed directly on case-by-
case basis.
3. Double redirects, e.g sscanf and related. These require
a heavier handed approach of macro renaming to existing
symbols.
Most simple redirects are handled via 1. Ideally, the libc_*
macro would live in libc-symbols.h, but in practice the macros
needed for it to do anything useful live in cdefs.h, so they
are defined in the local override.
Notably, the internal name of the asprintf generated for ieee ldbl
redirects is renamed to work with internal prefixed usage.
This resolves the local plt usage introduced when building glibc
with ldbl == ieee128 on ppc64le.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
During the conversion to support 64 bit time on some architectures with
__WORDSIZE == 32 && __TIMESIZE != 64 the libc_hidden_def attribute for
eligible functions was by mistake omitted.
This patch fixes this issue and exports (and allows using) those
functions when Y2038 support is enabled in glibc.
In the "Extended Char Intro" the example incorrectly uses a function
called wgetc which doesn't exist. The example is corrected to use
getwc, which is correct for the use in this case.
Reported-by: Toomas Rosin <toomas@rosin.ee>
This is an updated version of a previous patch [1] with the
following changes:
- Use compiler overflow builtins on done_add_func function.
- Define the scratch +utstring_converted_wide_string using
CHAR_T.
- Added a testcase and mention the bug report.
Both default and wide printf functions might leak memory when
manipulate multibyte characters conversion depending of the size
of the input (whether __libc_use_alloca trigger or not the fallback
heap allocation).
This patch fixes it by removing the extra memory allocation on
string formatting with conversion parts.
The testcase uses input argument size that trigger memory leaks
on unpatched code (using a scratch buffer the threashold to use
heap allocation is lower).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
[1] https://sourceware.org/pipermail/libc-alpha/2017-June/082098.html
With mathinline removal there is no need to keep building and testing
inline math tests.
The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries. The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.
Checked on x86_64-linux-gnu and i686-linux-gnu.