This patch reserves space for HWCAP3/HWCAP4 in the TCB of powerpc.
These hardware capability bits will be used by future Power
architectures.
Versioned symbol '__parse_hwcap_3_4_and_convert_at_platform' advertises
the availability of the new HWCAP3/HWCAP4 data in the TCB.
This is an ABI change for GLIBC 2.39.
Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The current implementation of strcmp for power10 has
performance regressions for multiple combinations of
small sizes and alignments.
Most of these performance issues are fixed by this
patch. The compare loop is unrolled and page crossings
of the unrolled loop are handled.
Thanks to Paul E. Murphy for helping in fixing the
performance issues.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Co-Authored-By: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
Optimized memchr for POWER10 based on existing rawmemchr and strlen.
Reordering instructions and loop unrolling helped in getting better performance.
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
The previous commit was incomplete: gettext() still returns a translation
if the file /usr/share/locale/C/LC_MESSAGES/<domain>.mo exists. This patch
prohibits the translation also in this case.
* gettext-runtime/intl/dcigettext.c (DCIGETTEXT): Treat C.<encoding> locale
like the C locale.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
After refactoring the alloca usage in 40c0add7d4 ("resolve: Remove
__res_context_query alloca usage") a few unaligned accesses to HEADER
fields surfaced. These unaligned accesses led to problems when running
the resolv test suite on sparc32-linux (leon) as many tests failed due to
SIGBUS crashes.
The issue(s) occurred during T_QUERY_A_AND_AAAA queries, as the second query
can now start on an unaligned address (previously it was explicitly aligned).
This patch fixes the unaligned accesses by using UHEADER instead,
which ensures that the fields are accessed with byte
loads/stores.
The patch has been verified by running the resolv test suite on sparc32
and x86_64.
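As a rough illustration of the byte-access idea (not the actual resolv code), the sketch below reads and writes a 16-bit network-order field one byte at a time, so the field address does not need to be aligned. The helper names toy_read_u16_be and toy_write_u16_be are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical helper: read a 16-bit network-order field using only
   byte loads, so P may sit at any address.  */
static uint16_t
toy_read_u16_be (const unsigned char *p)
{
  return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Hypothetical helper: store a 16-bit field using only byte stores.  */
static void
toy_write_u16_be (unsigned char *p, uint16_t v)
{
  p[0] = (unsigned char) (v >> 8);
  p[1] = (unsigned char) (v & 0xff);
}
```

Calling these helpers on an odd offset into a buffer is safe on strict-alignment targets, whereas dereferencing a uint16_t pointer at the same address may raise SIGBUS.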
Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com>
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The PT_GNU_PROPERTY segment is scanned before PT_NOTE. For binaries
with the PT_GNU_PROPERTY segment, we can check it to avoid scanning
the PT_NOTE segment.
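A minimal sketch of the idea, assuming a plain array of 64-bit program headers: when a PT_GNU_PROPERTY segment is present, no PT_NOTE segment needs to be scanned. The function toy_notes_to_scan is hypothetical and is not the loader's code.

```c
#include <elf.h>
#include <stddef.h>

#ifndef PT_GNU_PROPERTY
/* Fallback for older headers: PT_LOOS + 0x474e553.  */
# define PT_GNU_PROPERTY 0x6474e553
#endif

/* Hypothetical helper: return how many PT_NOTE segments still have to
   be scanned for properties.  */
static size_t
toy_notes_to_scan (const Elf64_Phdr *phdr, size_t phnum)
{
  /* Check for PT_GNU_PROPERTY first; if it exists, skip all notes.  */
  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_GNU_PROPERTY)
      return 0;

  /* Otherwise every PT_NOTE segment is a candidate.  */
  size_t n = 0;
  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_NOTE)
      ++n;
  return n;
}
```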
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
GLRO(dl_lazy) is used to set the parameters for the early
_dl_relocate_object call, so the consider_profiling setting has to
be applied before the call.
Fixes commit 78ca44da01 ("elf: Relocate
libc.so early during startup and dlmopen (bug 31083)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
One of the requirements to becoming a CVE Numbering Authority (CNA) is
to publish advisories. Do this by maintaining a file for each CVE fixed
in the advisories directory in the source tree. Links to the advisories
can then be shared as:
https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=advisories/GLIBC-SA-YYYY-NNNN
The file format at the moment is rudimentary and derives from the git
commit format, i.e. a subject line and a potentially multi-paragraph
description and then tags to describe some meta information. This is a
loose format at the moment and could change as we evolve this.
Also add a script process-fixed-cves.sh that processes these advisories
and generates a list to add to NEWS at release time.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch is based on __strcmp_power9 and __strlen_power10.
Improvements from __strcmp_power9:
1. Uses new POWER10 instructions
- This code uses lxvp to decrease contention on load
by loading 32 bytes per instruction.
2. Performance implication
- This version has around 30% better performance on average.
- Performance regressions are seen for specific combinations
of sizes and alignments. Some of them are also observed without
the changes, while the rest may be induced by the patch.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
The processing is split between process_envvars_secure and
process_envvars_default, with the former used to process arguments for
__libc_enable_secure. There is no semantic change; this just simplifies
the code so there is no need to handle __libc_enable_secure on each len
switch.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This prevents any environment variable from changing the semantics of
setuid binaries.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The loader already ignores LD_DEBUG, LD_DEBUG_OUTPUT, and
LD_TRACE_LOADED_OBJECTS. Both LD_WARN and LD_VERBOSE are similar to
LD_DEBUG, in the sense that they enable additional checks and debug
information, so it makes sense to disable them.
Also add both LD_VERBOSE and LD_WARN to the filtered environment variables
for setuid binaries.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Call the document a "Security Policy" to disambiguate it from the
security *process* documented in the security page. Also, point to the
security page for bug reporting and CVE assignment.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The .cfi_return_column directive changes the return column for the whole
FDE range. But the actual intent is to tell the unwinder that the value
in x30 (lr) now resides in x15 after the move, and that is expressed by
the .cfi_register directive.
New implementation is based on the existing exp/exp2, with different
reduction constants and polynomial. Worst-case error in round-to-
nearest is 0.513 ULP.
The exp/exp2 shared table is reused for exp10 - .rodata size of
e_exp_data increases by 64 bytes.
As for exp/exp2, targets with single-instruction rounding/conversion
intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the
necessary code to their math_private.h.
Improvements on Neoverse V1 compared to current GLIBC master:
exp10 thruput: 3.3x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
exp10 latency: 1.8x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
Tested on:
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction)
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The previous check did not do anything because tmp_ptr already
points before run_ptr due to the way it is initialized.
Fixes commit e4d8117b82
("stdlib: Avoid another self-comparison in qsort").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
For i686, this change is a no-op, but for x86_64 it forces all inlined port
rights to be 8 bytes long.
Message-ID: <20231124213041.952886-2-flaviocruz@gmail.com>
It is not strictly required by POSIX, since O_PATH is a Linux
extension, but it is QoI to fail early instead of at readdir. Also,
the check is free, since fdopendir already checks whether the file
descriptor is opened for read.
Checked on x86_64-linux-gnu.
The linker just concatenates the .init and .fini sections which
results in the complete _init and _fini functions. If needed the
linker adds padding bytes due to an alignment. GNU ld is adding
NOPs, which is fine. But e.g. mold is adding traps which results
in broken _init and _fini functions.
Thus this patch removes the alignment in .init and .fini sections
in crtn.S files.
We keep the 4-byte function alignment in crti.S files. As the
assembler now also places the start of the _init and _fini functions
on a multiple of 4 bytes, it may have to insert padding. Although GNU as
uses NOPs here, to be sure, we keep the alignment with
0x07 (=NOPs) at the end of crti.S.
In order to avoid an obvious NOP slide in _fini, this patch also
uses an lg instead of an lgr instruction. The emitted instructions
then take up a multiple of 4 bytes.
Even for explicit large page support, allocation might use mmap without
the hugepage bit set if the requested size is smaller than
mmap_threshold. For this case where mmap is issued, MAP_HUGETLB is set
iff the allocation size is larger than the used large page.
To force such allocations to use large pages, also tune the mmap_threshold
(if it is not explicitly set by a tunable). This forces allocations to
follow the sbrk path, which will fall back to mmap (which will try large
pages before falling back to default mmap).
Checked on x86_64-linux-gnu.
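The "try large pages, then fall back to default mmap" step can be sketched as below. This is only an illustration under stated assumptions, not malloc's implementation; the function name toy_map_prefer_hugetlb is hypothetical.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map LEN bytes of anonymous memory, preferring
   huge pages.  Returns NULL if both attempts fail.  */
static void *
toy_map_prefer_hugetlb (size_t len)
{
  const int prot = PROT_READ | PROT_WRITE;
  const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
  void *p = MAP_FAILED;

#ifdef MAP_HUGETLB
  /* First attempt: ask the kernel for huge pages.  This typically
     fails when no huge pages are available.  */
  p = mmap (NULL, len, prot, flags | MAP_HUGETLB, -1, 0);
#endif
  if (p == MAP_FAILED)
    /* Fallback: a regular anonymous mapping.  */
    p = mmap (NULL, len, prot, flags, -1, 0);

  return p == MAP_FAILED ? NULL : p;
}
```

If the huge page attempt succeeds, the length later passed to munmap must be a multiple of the huge page size, so a real caller would need to round the length accordingly.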
Reviewed-by: DJ Delorie <dj@redhat.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
The patch adds two new macros, TUNABLE_GET_DEFAULT and TUNABLE_IS_INITIALIZED.
The former gets the default value, with a signature similar to
TUNABLE_GET, while the latter returns whether the tunable was set by
the environment variable.
Checked on x86_64-linux-gnu.
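To show why the two queries are useful together, here is a toy model in which a caller overrides a value only when the user did not set it. The struct, its fields, and all toy_* functions are assumptions made for this sketch; they do not reflect the real tunables data structures or macro signatures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a tunable; field names are hypothetical.  */
struct toy_tunable
{
  size_t def;        /* Built-in default value.  */
  size_t val;        /* Current value.  */
  bool initialized;  /* True once set from the environment.  */
};

/* Counterpart of "get the default value".  */
static size_t
toy_tunable_default (const struct toy_tunable *t)
{
  return t->def;
}

/* Counterpart of "was the tunable set by the environment variable".  */
static bool
toy_tunable_is_initialized (const struct toy_tunable *t)
{
  return t->initialized;
}

/* Models a value arriving from the environment.  */
static void
toy_tunable_set_from_env (struct toy_tunable *t, size_t v)
{
  t->val = v;
  t->initialized = true;
}

/* Respect an explicit user setting; otherwise apply OVERRIDE.  */
static size_t
toy_effective_value (const struct toy_tunable *t, size_t override)
{
  return toy_tunable_is_initialized (t) ? t->val : override;
}
```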
Reviewed-by: DJ Delorie <dj@redhat.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
The current code aligns to 2x VEC_SIZE. Aligning to 2x has no effect on
performance other than potentially resulting in an additional
iteration of the loop.
1x maintains aligned stores (the only reason to align in this case)
and doesn't incur any unnecessary loop iterations.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
_dl_assign_tls_modid() assigns a slotinfo entry for a new module, but
does *not* do anything to the generation counter. The first time this
happens, the generation is zero and map_generation() returns the current
generation to be used during relocation processing. However, if
a slotinfo entry is later reused, it will already have a generation
assigned. If this generation has fallen behind the current global max
generation, then this causes an obsolete generation to be assigned
during relocation processing, as map_generation() returns this
generation if nonzero. _dl_add_to_slotinfo() eventually resets the
generation, but by then it is too late. This causes DTV updates to be
skipped, leading to NULL or broken TLS slot pointers and segfaults.
Fix this by resetting the generation to zero in _dl_assign_tls_modid(),
so it behaves the same as the first time a slot is assigned.
_dl_add_to_slotinfo() will still assign the correct static generation
later during module load, but relocation processing will no longer use
an obsolete generation.
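The failure mode and the fix can be modelled with a toy slot structure. All names below are hypothetical and only loosely mirror the loader; toy_map_generation follows the description above (a nonzero slot generation wins, otherwise the current global generation is used).

```c
#include <stddef.h>

/* Toy slotinfo entry; not the loader's real structure.  */
struct toy_slot
{
  int in_use;   /* Nonzero while a module owns the slot.  */
  size_t gen;   /* Generation recorded for the slot, 0 if unset.  */
};

static size_t toy_global_gen = 1;

/* A nonzero slot generation is returned as-is; otherwise the current
   global generation is used.  */
static size_t
toy_map_generation (const struct toy_slot *s)
{
  return s->gen != 0 ? s->gen : toy_global_gen;
}

/* Assign the slot to a new module.  RESET_GEN == 0 models the old
   behaviour, where a reused slot keeps its stale generation.  */
static void
toy_assign_modid (struct toy_slot *s, int reset_gen)
{
  s->in_use = 1;
  if (reset_gen)
    s->gen = 0;
}

/* Models the later module-load step that records the generation.  */
static void
toy_add_to_slotinfo (struct toy_slot *s)
{
  s->gen = toy_global_gen;
}

/* Models dlclose: free the slot and advance the global generation.  */
static void
toy_release (struct toy_slot *s)
{
  s->in_use = 0;
  ++toy_global_gen;
}
```

In this model, reusing a released slot without the reset makes toy_map_generation return the old generation, while resetting to zero makes it return the current one, as on first assignment.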
Note that slotinfo entry (aka modid) reuse typically happens after a
dlclose and only TLS access via dynamic tlsdesc is affected. Because
tlsdesc is optimized to use the optional part of static TLS, dynamic
tlsdesc can be avoided by increasing the glibc.rtld.optional_static_tls
tunable to a large enough value, or by LD_PRELOAD-ing the affected
modules.
Fixes bug 29039.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This patch adds the TCP_MD5SIG_FLAG_IFINDEX constant from Linux 5.6 to
sysdeps/gnu/netinet/tcp.h and updates struct tcp_md5sig accordingly to
contain the device index.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This is just a minor optimization. It also makes it more obvious that
_dl_relocate_object can be called multiple times.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
A recent commit, apparently commit
6c6fce572f "elf: Remove /etc/suid-debug
support", resulted in localplt failures for i686-gnu and x86_64-gnu:
Missing required PLT reference: ld.so: __access_noerrno
After that commit, __access_noerrno is actually no longer used at all.
So rather than just removing the localplt expectation for that symbol
for Hurd, completely remove all definitions of and references to that
symbol.
Tested for x86_64, and with build-many-glibcs.py for i686-gnu and
x86_64-gnu.
This restores the 2.33 semantics for arena_get2. It was changed by
11a02b035b to avoid arena_get2 calling malloc (back when __get_nproc
was refactored to use a scratch_buffer - 903bc7dcc2). The
__get_nproc function has since been refactored again and now also
avoids calling malloc.
Commit 11a02b035b did not take any performance implications into
consideration, which should have been discussed properly. The
__get_nprocs_sched function is still used as a fallback mechanism if
procfs and sysfs are not accessible.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
These were broken by the new atan2 functions, as they were only
set up for univariate functions. Arity is now detected from the
input file - this revealed a mistake that the double-precision
inputs were being used for both single- and double-precision
routines, which is now remedied.
Many applications still rely on this prototype. Rebuilds without
this prototype result in an implicit function declaration, which can
introduce security vulnerabilities due to 32-bit pointer truncation.
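The truncation can be modelled without relying on an actual implicit declaration. The sketch below assumes a 64-bit target with 32-bit int and passes an address through an int, as a caller would if it believed the function returned int; the function toy_through_int is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit address squeezed through a 32-bit int
   and widened again.  Only the low 32 bits survive, and they are
   sign-extended on the way back.  The narrowing conversion is
   implementation-defined; common compilers wrap modulo 2^32.  */
static uint64_t
toy_through_int (uint64_t addr)
{
  int as_int = (int) (uint32_t) addr;
  return (uint64_t) (int64_t) as_int;
}
```

An address such as 0x00007f1234567890 comes back as 0x34567890, which is a different and possibly valid-looking pointer value.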
The _dl_non_dynamic_init function does not parse LD_PROFILE, so
profiling is not enabled for dlopen objects. Since dlopen is deprecated
for static objects, it is better to remove the support.
It also allows trimming profile support from libc.a.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The loader does not ignore LD_PROFILE in secure-execution mode (contrary
to what the man page states [1]); rather, it uses a different path
(/var/profile) and ignores LD_PROFILE_OUTPUT.
Allowing secure-execution profiling is already a poor security
boundary, since it enables different code paths and extra OS access by
the process. But by ignoring LD_PROFILE_OUTPUT, the resulting profile
file might also be accessed in a racy manner, since the file name does not
use any process-specific information (such as pid, timing, etc.).
Another side effect is that it forces lazy binding even on libraries that
might be marked with DF_BIND_NOW.
[1] https://man7.org/linux/man-pages/man8/ld.so.8.html
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Using the memcmp symbol directly allows the compiler to inline the
memcmp calls (especially because _dl_tunable_set_hwcaps uses constant
values), generating better code.
Checked with tst-tunables on s390x-linux-gnu (qemu system).
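A small sketch of the pattern, assuming length-delimited input compared against string literals: because the length argument to memcmp is a compile-time constant, the compiler is free to expand the call inline. The macro TOY_STR_IS, the function toy_classify, and the names "alpha" and "beta" are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the LEN bytes at S equal the string
   literal LIT.  sizeof (LIT) - 1 is a compile-time constant.  */
#define TOY_STR_IS(s, len, lit) \
  ((len) == sizeof (lit) - 1 && memcmp ((s), (lit), sizeof (lit) - 1) == 0)

/* Classify a length-delimited token; the names are illustrative only.  */
static int
toy_classify (const char *s, size_t len)
{
  if (TOY_STR_IS (s, len, "alpha"))
    return 1;
  if (TOY_STR_IS (s, len, "beta"))
    return 2;
  return 0;
}
```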
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>