Commit Graph

16202 Commits

Author SHA1 Message Date
Stefan Liebler
cc1b91eabd
S390: Fix building with --disable-mutli-arch [BZ #31196]
Starting with commits
- 7ea510127e
string: Add libc_hidden_proto for strchrnul
- 22999b2f0f
string: Add libc_hidden_proto for memrchr

building glibc on s390x with --disable-multi-arch fails if only
the C-variant of strchrnul / memrchr is used.  This is the case
if gcc uses -march < z13.

The build fails with:
../sysdeps/s390/strchrnul-c.c:28:49: error: ‘__strchrnul_c’ undeclared here (not in a function); did you mean ‘__strchrnul’?
   28 | __hidden_ver1 (__strchrnul_c, __GI___strchrnul, __strchrnul_c);

With --disable-multi-arch, __strchrnul_c is not available as string/strchrnul.c
is just included without defining STRCHRNUL and thus we also don't have to create
the internal hidden symbol.

Tested-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-01-30 22:28:51 +01:00
Andreas Schwab
6edaa12b41 riscv: add support for static PIE
In order to support static PIE the startup code must avoid relocations
before __libc_start_main is called.
2024-01-22 14:58:23 +01:00
Adhemerval Zanella
bcf2abd43b sh: Fix static build with --enable-fortify
For static the internal symbols should not be prepended with the
internal __GI_.

Checked with a make check for sh4-linux-gnu.
2024-01-22 10:04:53 -03:00
Adhemerval Zanella
926a4bdbb5 sparc: Fix sparc64 memmove length comparison (BZ 31266)
The small counts copy bytes comparsion should be unsigned (as the
memmove size argument).  It fixes string/tst-memmove-overflow on
sparcv9, where the input size triggers an invalid code path.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
2024-01-22 09:34:50 -03:00
Adhemerval Zanella
369efd8177 sparc64: Remove unwind information from signal return stubs [BZ#31244]
Similar to sparc32 fix, remove the unwind information on the signal
return stubs.  This fixes the regressions:

FAIL: nptl/tst-cancel24-static
FAIL: nptl/tst-cond8-static
FAIL: nptl/tst-mutex8-static
FAIL: nptl/tst-mutexpi8-static
FAIL: nptl/tst-mutexpi9

On sparc64-linux-gnu.
2024-01-22 09:34:50 -03:00
Adhemerval Zanella
dd57f5e7b6 sparc: Remove 64 bit check on sparc32 wordsize (BZ 27574)
The sparc32 is always 32 bits.

Checked on sparcv9-linux-gnu.
2024-01-22 09:34:50 -03:00
Daniel Cederman
87d921e270 sparc: Do not test preservation of NaN payloads for LEON
The FPU used by LEON does not preserve NaN payload. This change allows
the math/test-*-canonicalize tests to pass on LEON.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
45f7ea26c1 sparc: Force calculation that raises exception
Use the math_force_eval() macro to force the calculation to complete and
raise the exception.

With this change the math/test-fenv test pass.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
a8f7c77970 sparc: Fix llrint and llround missing exceptions on SPARC V8
Conversions from a float to a long long on SPARC v8 uses a libgcc function
that may not raise the correct exceptions on overflow. It also may raise
spurious "inexact" exceptions on non overflow cases. This patch fixes the
problem in the same way as for RV32.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
7bd06985c0 sparc: Remove unwind information from signal return stubs [BZ #31244]
The functions were previously written in C, but were not compiled
with unwind information. The ENTRY/END macros includes .cfi_startproc
and .cfi_endproc which adds unwind information. This caused the
tests cleanup-8 and cleanup-10 in the GCC testsuite to fail.
This patch adds a version of the ENTRY/END macros without the
CFI instructions that can be used instead.

sigaction registers a restorer address that is located two instructions
before the stub function. This patch adds a two instruction padding to
avoid that the unwinder accesses the unwind information from the function
that the linker has placed right before it in memory. This fixes an issue
with pthread_cancel that caused tst-mutex8-static (and other tests) to fail.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
82a35070ec sparc: Prevent stfsr from directly following floating-point instruction
On LEON, if the stfsr instruction is immediately following a floating-point
operation instruction in a running program, with no other instruction in
between the two, the stfsr might behave as if the order was reversed
between the two instructions and the stfsr occurred before the
floating-point operation.

Add a nop instruction before the stfsr to prevent this from happening.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
3bb1350c36 sparc: Use existing macros to avoid code duplication
Macros for using inline assembly to access the fp state register exists
in both fenv_private.h and in fpu_control.h. Let fenv_private.h use the
macros from fpu_control.h

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:43 -03:00
Joseph Myers
6511b579a5 Update kernel version to 6.7 in header constant tests
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.7.  (There are no new
constants covered by these tests in 6.7 that need any other header
changes.)

Tested with build-many-glibcs.py.
2024-01-17 21:15:37 +00:00
Joseph Myers
df11c05be9 Update syscall lists for Linux 6.7
Linux 6.7 adds the futex_requeue, futex_wait and futex_wake syscalls,
and enables map_shadow_stack for architectures previously missing it.
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.

Tested with build-many-glibcs.py.
2024-01-17 15:38:54 +00:00
H.J. Lu
457bd9cf2e x86-64: Check if mprotect works before rewriting PLT
Systemd execution environment configuration may prohibit changing a memory
mapping to become executable:

MemoryDenyWriteExecute=
Takes a boolean argument. If set, attempts to create memory mappings
that are writable and executable at the same time, or to change existing
memory mappings to become executable, or mapping shared memory segments
as executable, are prohibited.

When it is set, systemd service stops working if PLT rewrite is enabled.
Check if mprotect works before rewriting PLT.  This fixes BZ #31230.
This also works with SELinux when deny_execmem is on.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-15 06:59:23 -08:00
Sunil K Pandey
9d94997b5f x86_64: Optimize ffsll function code size.
Ffsll function randomly regress by ~20%, depending on how code gets
aligned in memory.  Ffsll function code size is 17 bytes.  Since default
function alignment is 16 bytes, it can load on 16, 32, 48 or 64 bytes
aligned memory.  When ffsll function load at 16, 32 or 64 bytes aligned
memory, entire code fits in single 64 bytes cache line.  When ffsll
function load at 48 bytes aligned memory, it splits in two cache line,
hence random regression.

Ffsll function size reduction from 17 bytes to 12 bytes ensures that it
will always fit in single 64 bytes cache line.

This patch fixes ffsll function random performance regression.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-13 12:20:08 -08:00
Yanzhang Wang
e0590f41fe RISC-V: Enable static-pie.
This patch referents the commit 374cef3 to add static-pie support. And
because the dummy link map is used when relocating ourselves, so need
not to set __global_pointer$ at this time.

It will also check whether toolchain supports to build static-pie.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-12 15:11:45 -03:00
Adhemerval Zanella
061eaf0244 linux: Fix fstat64 on alpha and sparc64
The 551101e824 change is incorrect for
alpha and sparc, since __NR_stat is defined by both kABI.  Use
__NR_newfstat to check whether to fallback to __NR_fstat64 (similar
to what fstatat64 does).

Checked on sparc64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-12 15:11:11 -03:00
Wilco Dijkstra
08ddd26814 math: remove exp10 wrappers
Remove the error handling wrapper from exp10.  This is very similar to
the changes done to exp and exp2, except that we also need to handle
pow10 and pow10l.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-12 16:02:12 +00:00
Xi Ruoyao
5a85786a90
Make __getrandom_nocancel set errno and add a _nostatus version
The __getrandom_nocancel function returns errors as negative values
instead of errno.  This is inconsistent with other _nocancel functions
and it breaks "TEMP_FAILURE_RETRY (__getrandom_nocancel (p, n, 0))" in
__arc4random_buf.  Use INLINE_SYSCALL_CALL instead of
INTERNAL_SYSCALL_CALL to fix this issue.

But __getrandom_nocancel has been avoiding from touching errno for a
reason, see BZ 29624.  So add a __getrandom_nocancel_nostatus function
and use it in tcache_key_initialize.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-01-12 14:23:11 +01:00
H.J. Lu
f2b65a4471 x86-64/cet: Make CET feature check specific to Linux/x86
CET feature bits in TCB, which are Linux specific, are used to check if
CET features are active.  Move CET feature check to Linux/x86 directory.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-01-11 20:35:24 -08:00
H.J. Lu
874214db62 i386: Remove CET support bits
1. Remove _dl_runtime_resolve_shstk and _dl_runtime_profile_shstk.
2. Move CET offsets from x86 cpu-features-offsets.sym to x86-64
features-offsets.sym.
3. Rename x86 cet-control.h to x86-64 feature-control.h since it is only
for x86-64 and also used for PLT rewrite.
4. Add x86-64 ldsodefs.h to include feature-control.h.
5. Change TUNABLE_CALLBACK (set_plt_rewrite) to x86-64 only.
6. Move x86 dl-procruntime.c to x86-64.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-10 05:20:20 -08:00
H.J. Lu
7d544dd049 x86-64/cet: Move check-cet.awk to x86_64
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-10 05:20:16 -08:00
H.J. Lu
a1bbee9fd1 x86-64/cet: Move dl-cet.[ch] to x86_64 directories
Since CET is only enabled for x86-64, move dl-cet.[ch] to x86_64
directories.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-10 05:19:32 -08:00
H.J. Lu
b45115a666 x86: Move x86-64 shadow stack startup codes
Move sysdeps/x86/libc-start.h to sysdeps/x86_64/libc-start.h and use
sysdeps/generic/libc-start.h for i386.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-10 05:19:32 -08:00
Adhemerval Zanella
a0cfc48e8a i386: Fail if configured with --enable-cet
Since it is only supported for x86_64.

Checked on i686-linux-gnu.
2024-01-09 13:55:51 -03:00
Adhemerval Zanella
25f1e16ef0 i386: Remove CET support
CET is only support for x86_64, this patch reverts:

  - faaee1f07e x86: Support shadow stack pointer in setjmp/longjmp.
  - be9ccd27c0 i386: Add _CET_ENDBR to indirect jump targets in
    add_n.S/sub_n.S
  - c02695d776 x86/CET: Update vfork to prevent child return
  - 5d844e1b72 i386: Enable CET support in ucontext functions
  - 124bcde683 x86: Add _CET_ENDBR to functions in crti.S
  - 562837c002 x86: Add _CET_ENDBR to functions in dl-tlsdesc.S
  - f753fa7dea x86: Support IBT and SHSTK in Intel CET [BZ #21598]
  - 825b58f3fb i386-mcount.S: Add _CET_ENDBR to _mcount and __fentry__
  - 7e119cd582 i386: Use _CET_NOTRACK in i686/memcmp.S
  - 177824e232 i386: Use _CET_NOTRACK in memcmp-sse4.S
  - 0a899af097 i386: Use _CET_NOTRACK in memcpy-ssse3-rep.S
  - 7fb613361c i386: Use _CET_NOTRACK in memcpy-ssse3.S
  - 77a8ae0948 i386: Use _CET_NOTRACK in memset-sse2-rep.S
  - 00e7b76a8f i386: Use _CET_NOTRACK in memset-sse2.S
  - 90d15dc577 i386: Use _CET_NOTRACK in strcat-sse2.S
  - f1574581c7 i386: Use _CET_NOTRACK in strcpy-sse2.S
  - 4031d7484a i386/sub_n.S: Add a missing _CET_ENDBR to indirect jump
  - target
  -
Checked on i686-linux-gnu.
2024-01-09 13:55:51 -03:00
Adhemerval Zanella
b7fc4a07f2 x86: Move CET infrastructure to x86_64
The CET is only supported for x86_64 and there is no plan to add
kernel support for i386.  Move the Makefile rules and files from the
generic x86 folder to x86_64 one.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2024-01-09 13:55:51 -03:00
Adhemerval Zanella
460860f457 Remove ia64-linux-gnu
Linux 6.7 removed ia64 from the official tree [1], following the general
principle that a glibc port needs upstream support for the architecture
in all the components it depends on (binutils, GCC, and the Linux
kernel).

Apart from the removal of sysdeps/ia64 and sysdeps/unix/sysv/linux/ia64,
there are updates to various comments referencing ia64 for which removal
of those references seemed appropriate. The configuration is removed
from README and build-many-glibcs.py.

The CONTRIBUTED-BY, elf/elf.h, manual/contrib.texi (the porting
mention), *.po files, config.guess, and longlong.h are not changed.

For Linux it allows cleanup some clone2 support on multiple files.

The following bug can be closed as WONTFIX: BZ 22634 [2], BZ 14250 [3],
BZ 21634 [4], BZ 10163 [5], BZ 16401 [6], and BZ 11585 [7].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43ff221426d33db909f7159fdf620c3b052e2d1c
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=22634
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=14250
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=21634
[5] https://sourceware.org/bugzilla/show_bug.cgi?id=10163
[6] https://sourceware.org/bugzilla/show_bug.cgi?id=16401
[7] https://sourceware.org/bugzilla/show_bug.cgi?id=11585
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-08 17:09:36 -03:00
H.J. Lu
0f9afc265a x32: Handle displacement overflow in PLT rewrite [BZ #31218]
PLT rewrite calculated displacement with

ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE;

On x32, displacement from 0xf7fbe060 to 0x401030 was calculated as

unsigned int disp = 0x401030 - 0xf7fbe060 - 5;

with disp == 0x8442fcb and caused displacement overflow. The PLT entry
was changed to:

0xf7fbe060 <+0>:	e9 cb 2f 44 08     	jmp    0x401030
0xf7fbe065 <+5>:	cc                 	int3
0xf7fbe066 <+6>:	cc                 	int3
0xf7fbe067 <+7>:	cc                 	int3
0xf7fbe068 <+8>:	cc                 	int3
0xf7fbe069 <+9>:	cc                 	int3
0xf7fbe06a <+10>:	cc                 	int3
0xf7fbe06b <+11>:	cc                 	int3
0xf7fbe06c <+12>:	cc                 	int3
0xf7fbe06d <+13>:	cc                 	int3
0xf7fbe06e <+14>:	cc                 	int3
0xf7fbe06f <+15>:	cc                 	int3

x32 has 32-bit address range, but it doesn't wrap address around at 4GB,
JMP target was changed to 0x100401030 (0xf7fbe060LL + 0x8442fcbLL + 5),
which is above 4GB.

Always use uint64_t to calculate displacement.  This fixes BZ #31218.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-01-06 14:25:49 -08:00
Noah Goldstein
b96a2eba2f x86: Fixup some nits in longjmp asm implementation
Replace a stray `nop` with a `.p2align` directive.
2024-01-05 18:00:38 -08:00
H.J. Lu
848746e88e elf: Add ELF_DYNAMIC_AFTER_RELOC to rewrite PLT
Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after
relocation.

For x86-64, add

 #define DT_X86_64_PLT     (DT_LOPROC + 0)
 #define DT_X86_64_PLTSZ   (DT_LOPROC + 1)
 #define DT_X86_64_PLTENT  (DT_LOPROC + 3)

1. DT_X86_64_PLT: The address of the procedure linkage table.
2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage
table.
3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table
entry.

With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the
memory offset of the indirect branch instruction.

Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section
with direct branch after relocation when the lazy binding is disabled.

PLT rewrite is disabled by default since SELinux may disallow modifying
code pages and ld.so can't detect it in all cases.  Use

$ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1

to enable PLT rewrite with 32-bit direct jump at run-time or

$ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=2

to enable PLT rewrite with 32-bit direct jump and on APX processors with
64-bit absolute jump at run-time.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-01-05 05:49:49 -08:00
Sergey Bugaev
520b1df08d aarch64: Make cpu-features definitions not Linux-specific
These describe generic AArch64 CPU features, and are not tied to a
kernel-specific way of determining them. We can share them between
the Linux and Hurd AArch64 ports.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-13-bugaevc@gmail.com>
2024-01-04 23:48:54 +01:00
Sergey Bugaev
fbfe0b20ab hurd: Initializy _dl_pagesize early in static builds
We fetch __vm_page_size as the very first RPC that we do, inside
__mach_init (). Propagate that to _dl_pagesize ASAP after that,
before any other initialization.

In dynamic builds, this is already done immediately after
__mach_init (), inside _dl_sysdep_start ().

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-12-bugaevc@gmail.com>
2024-01-04 23:48:36 +01:00
Sergey Bugaev
4145de65f6 hurd: Only init early static TLS if it's used to store stack or pointer guards
This is the case on both x86 architectures, but not on AArch64.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-11-bugaevc@gmail.com>
2024-01-04 23:48:23 +01:00
Sergey Bugaev
9eaa0e1799 hurd: Make init-first.c no longer x86-specific
This will make it usable in other ports.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-10-bugaevc@gmail.com>
2024-01-04 23:48:07 +01:00
Sergey Bugaev
b44ad8944b hurd: Drop x86-specific assembly from init-first.c
We already have the RETURN_TO macro for this exact use case, and it's already
used in the non-static code path. Use it here too.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-9-bugaevc@gmail.com>
2024-01-04 23:47:23 +01:00
Sergey Bugaev
24b707c166 hurd: Pass the data pointer to _hurd_stack_setup explicitly
Instead of relying on the stack frame layout to figure out where the stack
pointer was prior to the _hurd_stack_setup () call, just pass the pointer
as an argument explicitly. This is less brittle and much more portable.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-8-bugaevc@gmail.com>
2024-01-04 23:47:03 +01:00
H.J. Lu
35694d3416 x86-64/cet: Check the restore token in longjmp
setcontext and swapcontext put a restore token on the old shadow stack
which is used to restore the target shadow stack when switching user
contexts.  When longjmp from a user context, the target shadow stack
can be different from the current shadow stack and INCSSP can't be
used to restore the shadow stack pointer to the target shadow stack.

Update longjmp to search for a restore token.  If found, use the token
to restore the shadow stack pointer before using INCSSP to pop the
shadow stack.  Stop the token search and use INCSSP if the shadow stack
entry value is the same as the current shadow stack pointer.

It is a user error if there is a shadow stack switch without leaving a
restore token on the old shadow stack.

The only difference between __longjmp.S and __longjmp_chk.S is that
__longjmp_chk.S has a check for invalid longjmp usages.  Merge
__longjmp.S and __longjmp_chk.S by adding the CHECK_INVALID_LONGJMP
macro.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-01-04 13:38:26 -08:00
H.J. Lu
bbfb54930c i386: Ignore --enable-cet
Since shadow stack is only supported for x86-64, ignore --enable-cet for
i386.  Always setting $(enable-cet) for i386 to "no" to support

ifneq ($(enable-cet),no)

in x86 Makefiles.  We can't use

ifeq ($(enable-cet),yes)

since $(enable-cet) can be "yes", "no" or "permissive".
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-04 06:08:55 -08:00
Sergey Bugaev
0d4a2f3576 mach: Drop SNARF_ARGS macro
We're obtaining arguments from the stack differently, see init-first.c.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2024-01-03 21:59:55 +01:00
Sergey Bugaev
114de961e0 mach: Drop some unnecessary vm_param.h includes
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2024-01-03 21:59:54 +01:00
Sergey Bugaev
dac7c64065 hurd: Add some missing includes
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2024-01-03 21:59:54 +01:00
Joseph Myers
b34b46b880 Implement C23 <stdbit.h>
C23 adds a header <stdbit.h> with various functions and type-generic
macros for bit-manipulation of unsigned integers (plus macro defines
related to endianness).  Implement this header for glibc.

The functions have both inline definitions in the header (referenced
by macros defined in the header) and copies with external linkage in
the library (which are implemented in terms of those macros to avoid
duplication).  They are documented in the glibc manual.  Tests, as
well as verifying results for various inputs (of both the macros and
the out-of-line functions), verify the types of those results (which
showed up a bug in an earlier version with the type-generic macro
stdc_has_single_bit wrongly returning a promoted type), that the
macros can be used at top level in a source file (so don't use ({})),
that they evaluate their arguments exactly once, and that the macros
for the type-specific functions have the expected implicit conversions
to the relevant argument type.

Jakub previously referred to -Wconversion warnings in type-generic
macros, so I've included a test with -Wconversion (but the only
warnings I saw and fixed from that test were actually in inline
functions in the <stdbit.h> header - not anything coming from use of
the type-generic macros themselves).

This implementation of the type-generic macros does not handle
unsigned __int128, or unsigned _BitInt types with a width other than
that of a standard integer type (and C23 doesn't require the header to
handle such types either).  Support for those types, using the new
type-generic built-in functions Jakub's added for GCC 14, can
reasonably be added in a followup (along of course with associated
tests).

This implementation doesn't do anything special to handle C++, or have
any tests of functionality in C++ beyond the existing tests that all
headers can be compiled in C++ code; it's not clear exactly what form
this header should take in C++, but probably not one using macros.

DIS ballot comment AT-107 asks for the word "count" to be added to the
names of the stdc_leading_zeros, stdc_leading_ones,
stdc_trailing_zeros and stdc_trailing_ones functions and macros.  I
don't think it's likely to be accepted (accepting any technical
comments would mean having an FDIS ballot), but if it is accepted at
the WG14 meeting (22-26 January in Strasbourg, starting with DIS
ballot comment handling) then there would still be time to update
glibc for the renaming before the 2.39 release.

The new functions and header are placed in the stdlib/ directory in
glibc, rather than creating a new toplevel stdbit/ or putting them in
string/ alongside ffs.

Tested for x86_64 and x86.
2024-01-03 12:07:14 +00:00
Szabolcs Nagy
0c12c8c0cb aarch64: Add longjmp test for SME
Includes test for setcontext too.

The test directly checks after longjmp if ZA got disabled and the
ZA contents got saved following the lazy saving scheme. It does not
use ACLE code to verify that gcc can interoperate with glibc.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-02 16:54:21 +00:00
Szabolcs Nagy
9d30e5cf96 aarch64: Add setcontext support for SME
For the ZA lazy saving scheme to work, setcontext has to call
__libc_arm_za_disable.

Also fixes swapcontext which uses setcontext internally.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-02 15:43:30 +00:00
Szabolcs Nagy
a7373e457f aarch64: Add longjmp support for SME
For the ZA lazy saving scheme to work, longjmp has to call
__libc_arm_za_disable.

In ld.so we assume ZA is not used so longjmp does not need
special support there.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-02 15:43:30 +00:00
Szabolcs Nagy
d3c32ae207 aarch64: Add SME runtime support
The runtime support routines for the call ABI of the Scalable Matrix
Extension (SME) are mostly in libgcc. Since libc.so cannot depend on
libgcc_s.so have an implementation of __arm_za_disable in libc for
libc internal use in longjmp and similar APIs.

__libc_arm_za_disable follows the same PCS rules as __arm_za_disable,
but it's a hidden symbol so it does not need variant PCS marking.

Using __libc_fatal instead of abort because it can print a message and
works in ld.so too. But for now we don't need SME routines in ld.so.

To check the SME HWCAP in asm, we need the _dl_hwcap2 member offset in
_rtld_global_ro in the shared libc.so, while in libc.a the _dl_hwcap2
object is accessed.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-02 15:43:30 +00:00
H.J. Lu
b5dcccfb12 x86/cet: Add -fcf-protection=none before -fcf-protection=branch
When shadow stack is enabled, some CET tests failed when compiled with
GCC 14:

FAIL: elf/tst-cet-legacy-4
FAIL: elf/tst-cet-legacy-5a
FAIL: elf/tst-cet-legacy-6a

which are caused by

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039

These tests use -fcf-protection -fcf-protection=branch and assume that
-fcf-protection=branch will override -fcf-protection.  But this GCC 14
commit:

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=1c6231c05bdcca

changed the -fcf-protection behavior such that

-fcf-protection -fcf-protection=branch

is treated the same as

-fcf-protection

Use

-fcf-protection -fcf-protection=none -fcf-protection=branch

as the workaround.  This fixes BZ #31187.

Tested with GCC 13 and GCC 14 on Intel Tiger Lake.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-01-01 15:53:52 -08:00
Paul Eggert
dff8da6b3e Update copyright dates with scripts/update-copyrights 2024-01-01 10:53:40 -08:00
H.J. Lu
cf9481724b x86/cet: Run some CET tests with shadow stack
When CET is disabled by default, run some CET tests with shadow stack
enabled using

$ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
2024-01-01 05:22:48 -08:00
H.J. Lu
55d63e7312 x86/cet: Don't set CET active by default
Not all CET enabled applications and libraries have been properly tested
in CET enabled environments.  Some CET enabled applications or libraries
will crash or misbehave when CET is enabled.  Don't set CET active by
default so that all applications and libraries will run normally regardless
of whether CET is active or not.  Shadow stack can be enabled by

$ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK

at run-time if shadow stack can be enabled by kernel.

NB: This commit can be reverted if it is OK to enable CET by default for
all applications and libraries.
2024-01-01 05:22:48 -08:00
H.J. Lu
d360dcc001 x86/cet: Check feature_1 in TCB for active IBT and SHSTK
Initially, IBT and SHSTK are marked as active when CPU supports them
and CET are enabled in glibc.  They can be disabled early by tunables
before relocation.  Since after relocation, GLRO(dl_x86_cpu_features)
becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark
IBT and SHSTK as inactive.  Instead, check the feature_1 field in TCB
to decide if IBT and SHST are active.
2024-01-01 05:22:48 -08:00
H.J. Lu
541641a3de x86/cet: Enable shadow stack during startup
Previously, CET was enabled by kernel before passing control to user
space and the startup code must disable CET if applications or shared
libraries aren't CET enabled.  Since the current kernel only supports
shadow stack and won't enable shadow stack before passing control to
user space, we need to enable shadow stack during startup if the
application and all shared library are shadow stack enabled.  There
is no need to disable shadow stack at startup.  Shadow stack can only
be enabled in a function which will never return.  Otherwise, shadow
stack will underflow at the function return.

1. GL(dl_x86_feature_1) is set to the CET features which are supported
by the processor and are not disabled by the tunable.  Only non-zero
features in GL(dl_x86_feature_1) should be enabled.  After enabling
shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check
if shadow stack is really enabled.
2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable.  It is
safe since RTLD_START never returns.
3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static
executable.  Since the start function using ARCH_SETUP_TLS never returns,
it is safe to enable shadow stack in ARCH_SETUP_TLS.
2024-01-01 05:22:48 -08:00
H.J. Lu
8d9f9c4460 elf: Always provide _dl_get_dl_main_map in libc.a
Always provide _dl_get_dl_main_map in libc.a.  It will be used by x86
to process PT_GNU_PROPERTY segment.
2024-01-01 05:22:48 -08:00
H.J. Lu
edb5e0c8f9 x86/cet: Sync with Linux kernel 6.6 shadow stack interface
Sync with Linux kernel 6.6 shadow stack interface.  Since only x86-64 is
supported, i386 shadow stack codes are unchanged and CET shouldn't be
enabled for i386.

1. When the shadow stack base in TCB is unset, the default shadow stack
is in use.  Use the current shadow stack pointer as the marker for the
default shadow stack. It is used to identify if the current shadow stack
is the same as the target shadow stack when switching ucontexts.  If yes,
INCSSP will be used to unwind shadow stack.  Otherwise, shadow stack
restore token will be used.
2. Allocate shadow stack with the map_shadow_stack syscall.  Since there
is no function to explicitly release ucontext, there is no place to
release shadow stack allocated by map_shadow_stack in ucontext functions.
Such shadow stacks will be leaked.
3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX.
4. Rewrite the CET control functions with the current kernel shadow stack
interface.

Since CET is no longer enabled by kernel, a separate patch will enable
shadow stack during startup.
2024-01-01 05:22:48 -08:00
Aurelien Jarno
6b32696116 RISC-V: Add support for dl_runtime_profile (BZ #31151)
Code is mostly inspired from the LoongArch one, which has a similar ABI,
with minor changes to support riscv32 and register differences.

This fixes elf/tst-sprof-basic. This also fixes elf/tst-audit1,
elf/tst-audit2 and elf/tst-audit8 with recent binutils snapshots when
--enable-bind-now is used.

Resolves: BZ #31151

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-12-30 11:00:10 +01:00
H.J. Lu
81be2a61da x86-64: Fix the tcb field load for x32 [BZ #31185]
_dl_tlsdesc_undefweak and _dl_tlsdesc_dynamic access the thread pointer
via the tcb field in TCB:

_dl_tlsdesc_undefweak:
        _CET_ENDBR
        movq    8(%rax), %rax
        subq    %fs:0, %rax
        ret

_dl_tlsdesc_dynamic:
	...
        subq    %fs:0, %rax
        movq    -8(%rsp), %rdi
        ret

Since the tcb field in TCB is a pointer, %fs:0 is a 32-bit location,
not 64-bit. It should use "sub %fs:0, %RAX_LP" instead.  Since
_dl_tlsdesc_undefweak returns ptrdiff_t and _dl_make_tlsdesc_dynamic
returns void *, RAX_LP is appropriate here for x32 and x86-64.  This
fixes BZ #31185.
2023-12-22 05:37:17 -08:00
H.J. Lu
3502440397 x86-64: Fix the dtv field load for x32 [BZ #31184]
On x32, I got

FAIL: elf/tst-tlsgap

$ gdb elf/tst-tlsgap
...
open tst-tlsgap-mod1.so

Thread 2 "tst-tlsgap" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 2268754]
_dl_tlsdesc_dynamic () at ../sysdeps/x86_64/dl-tlsdesc.S:108
108		movq	(%rsi), %rax
(gdb) p/x $rsi
$4 = 0xf7dbf9005655fb18
(gdb)

This is caused by

_dl_tlsdesc_dynamic:
        _CET_ENDBR
        /* Preserve call-clobbered registers that we modify.
           We need two scratch regs anyway.  */
        movq    %rsi, -16(%rsp)
        movq    %fs:DTV_OFFSET, %rsi

Since the dtv field in TCB is a pointer, %fs:DTV_OFFSET is a 32-bit
location, not 64-bit.  Load the dtv field to RSI_LP instead of rsi.
This fixes BZ #31184.
2023-12-22 05:37:00 -08:00
H.J. Lu
41560a9312 x86/cet: Don't disable CET if not single threaded
In permissive mode, don't disable IBT nor SHSTK when dlopening a legacy
shared library if not single threaded since IBT and SHSTK may be still
enabled in other threads.  Other threads with IBT or SHSTK enabled will
crash when calling functions in the legacy shared library.  Instead, an
error will be issued.
2023-12-20 05:03:37 -08:00
H.J. Lu
c04035809a x86: Modularize sysdeps/x86/dl-cet.c
Improve readability and make maintenance easier for dl-feature.c by
modularizing sysdeps/x86/dl-cet.c:
1. Support processors with:
   a. Only IBT.  Or
   b. Only SHSTK.  Or
   c. Both IBT and SHSTK.
2. Lock CET features only if IBT or SHSTK are enabled and are not
enabled permissively.
2023-12-20 04:57:55 -08:00
H.J. Lu
1a23b39f9d x86/cet: Update tst-cet-vfork-1
Change tst-cet-vfork-1.c to verify that vfork child return triggers
SIGSEGV due to shadow stack mismatch.
2023-12-20 04:57:21 -08:00
Joe Ramsay
667f277c78 aarch64: Add SIMD attributes to math functions with vector versions
Added annotations for autovec by GCC and GFortran - this enables GCC
>= 9 to autovectorise math calls at -Ofast.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-12-20 08:41:25 +00:00
Joe Ramsay
cc0d77ba94 aarch64: Add half-width versions of AdvSIMD f32 libmvec routines
Compilers may emit calls to 'half-width' routines (two-lane
single-precision variants). These have been added in the form of
wrappers around the full-width versions, where the low half of the
vector is simply duplicated. This will perform poorly when one lane
triggers the special-case handler, as there will be a redundant call
to the scalar version, however this is expected to be rare at Ofast.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-12-20 08:41:25 +00:00
H.J. Lu
50bef9bd63 Fix elf: Do not duplicate the GLIBC_TUNABLES string
commit 2a969b53c0
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Wed Dec 6 10:24:01 2023 -0300

    elf: Do not duplicate the GLIBC_TUNABLES string

has

@@ -38,7 +39,7 @@
    which isn't available.  */
 #define CHECK_GLIBC_IFUNC_PREFERRED_OFF(f, cpu_features, name, len) \
   _Static_assert (sizeof (#name) - 1 == len, #name " != " #len); \
-  if (memcmp (f, #name, len) == 0)             \
+  if (tunable_str_comma_strcmp_cte (&f, #name) == 0)       \
     {                           \
       cpu_features->preferred[index_arch_##name]        \
   &= ~bit_arch_##name;                \
@@ -46,12 +47,11 @@

Fix it by removing "== 0" after tunable_str_comma_strcmp_cte.
2023-12-19 16:01:33 -08:00
H.J. Lu
cad5703e4f Fix elf: Do not duplicate the GLIBC_TUNABLES string
Fix issues in sysdeps/x86/tst-hwcap-tunables.c added by

Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Wed Dec 6 10:24:01 2023 -0300

    elf: Do not duplicate the GLIBC_TUNABLES string

1. -AVX,-AVX2,-AVX512F should be used to disable AVX, AVX2 and AVX512.
2. AVX512 IFUNC functions check AVX512VL.  -AVX512VL should be added
to disable these functions.

This fixed:

FAIL: elf/tst-hwcap-tunables
...
[0] Spawned test for -Prefer_ERMS,-Prefer_FSRM,-AVX,-AVX2,-AVX_Usable,-AVX2_Usable,-AVX512F_Usable,-SSE4_1,-SSE4_2,-SSSE3,-Fast_Unaligned_Load,-ERMS,-AVX_Fast_Unaligned_Load
error: subprocess failed: tst-tunables
error:   unexpected output from subprocess
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false

[1] Spawned test for ,-,-Prefer_ERMS,-Prefer_FSRM,-AVX,-AVX2,-AVX_Usable,-AVX2_Usable,-AVX512F_Usable,-SSE4_1,-SSE4_2,,-SSSE3,-Fast_Unaligned_Load,,-,-ERMS,-AVX_Fast_Unaligned_Load,-,
error: subprocess failed: tst-tunables
error:   unexpected output from subprocess
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false
../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure
   left: 1 (0x1); from: impls[i].usable
  right: 0 (0x0); from: false

error: 2 test failures

on Intel Tiger Lake.
2023-12-19 13:34:14 -08:00
Bruno Haible
d082930272 hppa: Fix undefined behaviour in feclearexcept (BZ 30983)
The expression

  (excepts & FE_ALL_EXCEPT) << 27

produces a signed integer overflow when 'excepts' is specified as
FE_INVALID (= 0x10), because
  - excepts is of type 'int',
  - FE_ALL_EXCEPT is of type 'int',
  - thus (excepts & FE_ALL_EXCEPT) is (int) 0x10,
  - 'int' is 32 bits wide.

The patched code produces the same instruction sequence as
previosuly.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Bruno Haible
80a40a9e14 alpha: Fix fesetexceptflag (BZ 30998)
It clears some exception flags that are outside the EXCEPTS argument.

It fixes math/test-fexcept on qemu-user.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Adhemerval Zanella
802aef27b2 riscv: Fix feenvupdate with FE_DFL_ENV (BZ 31022)
libc_feupdateenv_riscv should check for FE_DFL_ENV, similar to
libc_fesetenv_riscv.

Also extend the test-fenv.c to test fenvupdate.

Checked on riscv under qemu-system.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Bruno Haible
787282dede x86: Do not raises floating-point exception traps on fesetexceptflag (BZ 30990)
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).

The flags can be set in the 387 unit or in the SSE unit.  When we need
to clear a flag, we need to do so in both units, due to the way
fetestexcept is implemented.

When we need to set a flag, it is sufficient to do it in the SSE unit,
because that is guaranteed to not trap.  However, on i386 CPUs that have
only a 387 unit, set the flags in the 387, as long as this cannot trap.

Co-authored-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Adhemerval Zanella
47a9eeb9ba i686: Do not raise exception traps on fesetexcept (BZ 30989)
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).

The flags can be set in the 387 unit or in the SSE unit.  To set
a flag, it is sufficient to do it in the SSE unit, because that is
guaranteed to not trap.  However, on i386 CPUs that have only a
387 unit, set the flags in the 387, as long as this cannot trap.

Checked on i686-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Adhemerval Zanella
ecb1e7220d powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988)
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept was
called with the appropriate argument).

This is a side-effect of how we implement the GNU extension
feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv
might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending of the
argument.  And on PR_FP_EXC_PRECISE, setting a floating-point exception
flag triggers a trap.

To make the both functions follow the C23, fesetexcept and
fesetexceptflag now fail if the argument may trigger a trap.

The math tests now check for an value different than 0, instead
of bail out as unsupported for EXCEPTION_SET_FORCES_TRAP.

Checked on powerpc64le-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:34 -03:00
Adhemerval Zanella
2a969b53c0 elf: Do not duplicate the GLIBC_TUNABLES string
The tunable parsing duplicates the tunable environment variable so it
null-terminates each one since it simplifies the later parsing. It has
the drawback of adding another point of failure (__minimal_malloc
failing), and the memory copy requires tuning the compiler to avoid mem
operations calls.

The parsing now tracks the tunable start and its size. The
dl-tunable-parse.h adds helper functions to help parsing, like a strcmp
that also checks for size and an iterator for suboptions that are
comma-separated (used on hwcap parsing by x86, powerpc, and s390x).

Since the environment variable is allocated on the stack by the kernel,
it is safe to keep the references to the suboptions for later parsing
of string tunables (as done by set_hwcaps by multiple architectures).

Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, and
aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-12-19 13:25:45 -03:00
Joseph Myers
5275fc784c Do not build sparc32 libgcc functions into static libc
Since GCC commit f31a019d1161ec78846473da743aedf49cca8c27 "Emit
funcall external declarations only if actually used.", the glibc
testsuite has failed to build for 32-bit SPARC with GCC mainline.

/scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/../../../../sparc64-glibc-linux-gnu/bin/ld: /scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/32/libgcc.a(_divsi3.o): in function `.div':
/scratch/jmyers/glibc-bot/src/gcc/libgcc/config/sparc/lb1spc.S:138: multiple definition of `.div'; /scratch/jmyers/glibc-bot/build/glibcs/sparcv9-linux-gnu/glibc/libc.a(sdiv.o):/scratch/jmyers/glibc-bot/src/glibc/gnulib/../sysdeps/sparc/sparc32/sparcv9/sdiv.S:13: first defined here
/scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/../../../../sparc64-glibc-linux-gnu/bin/ld: disabling relaxation; it will not work with multiple definitions
collect2: error: ld returned 1 exit status
make[3]: *** [../Rules:298: /scratch/jmyers/glibc-bot/build/glibcs/sparcv9-linux-gnu/glibc/nptl/tst-cancel24-static] Error 1

https://sourceware.org/pipermail/libc-testresults/2023q4/012154.html

I'm not sure of the exact sequence of undefined references that cause
first the glibc object file defining .div and then the libgcc object
file defining both .div and .udiv to be pulled in (which must have
been perturbed by that GCC change in a way that introduced the build
failure), but I think the failure illustrates that it's inherently
fragile for glibc to define symbols in separate object files that
libgcc defines in the same object file - and indeed for glibc to
redefine libgcc symbols at all, since the division into object files
shouldn't really be part of the interface between libgcc and libc.

These symbols appear to be in libc only for compatibility, maybe one
of the cases where they were accidentally exported from shared libc in
glibc 2.0 before the introduction of symbol versioning and so programs
started expecting shared libc to provide them.  Thus, there is no need
to have them in static libc.  Add this set of libgcc functions to
shared-only-routines so they are no longer provided in static libc.
(No change is made regarding .mul - dotmul source file - since unlike
the other symbols in this grouping, it doesn't actually appear to be a
libgcc symbol, at least in current GCC.)

Tested with build-many-glibcs.py for sparcv9-linux-gnu with GCC
mainline.
2023-12-19 16:00:11 +00:00
H.J. Lu
4d8a01d2b0 x86/cet: Check CPU_FEATURE_ACTIVE in permissive mode
Verify that CPU_FEATURE_ACTIVE works properly in permissive mode.
2023-12-19 06:58:05 -08:00
H.J. Lu
28bd6f832d x86/cet: Check legacy shadow stack code in .init_array section
Verify that legacy shadow stack code in .init_array section in application
and shared library, which are marked as shadow stack enabled, will trigger
segfault.
2023-12-19 06:57:47 -08:00
H.J. Lu
9424ce80c2 x86/cet: Add tests for GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK
Verify that GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK turns off shadow
stack properly.
2023-12-19 06:57:39 -08:00
H.J. Lu
71c0cc3357 x86/cet: Check CPU_FEATURE_ACTIVE when CET is disabled
Verify that CPU_FEATURE_ACTIVE (SHSTK) works properly when CET is
disabled.
2023-12-19 06:57:32 -08:00
H.J. Lu
f418fe6f97 x86/cet: Check legacy shadow stack applications
Add tests to verify that legacy shadow stack applications run properly
when shadow stack is enabled in Linux kernel.
2023-12-19 06:57:27 -08:00
Stefan Liebler
664f565f9c s390: Set psw addr field in getcontext and friends.
So far if the ucontext structure was obtained by getcontext and co,
the return address was stored in general purpose register 14 as
it is defined as return address in the ABI.

In contrast, the context passed to a signal handler contains the address
in psw.addr field.

If somebody e.g. wants to dump the address of the context, the origin
needs to be known.

Now this patch adjusts getcontext and friends and stores the return address
also in psw.addr field.

Note that setcontext isn't adjusted and it is not supported to pass a
ucontext structure from signal-handler to setcontext.  We are not able to
restore all registers and branching to psw.addr without clobbering one
register.
2023-12-19 11:00:19 +01:00
Matthew Sterrett
e957308723 x86: Unifies 'strlen-evex' and 'strlen-evex512' implementations.
This commit uses a common implementation 'strlen-evex-base.S' for both
'strlen-evex' and 'strlen-evex512'

The motivation is to reduce the number of implementations to maintain.
This incidentally gives a small performance improvement.

All tests pass on x86.

Benchmarks were taken on SKX.
https://www.intel.com/content/www/us/en/products/sku/123613/intel-core-i97900x-xseries-processor-13-75m-cache-up-to-4-30-ghz/specifications.html

Geometric mean for strlen-evex512 over all benchmarks (N=10) was (new/old) 0.939
Geometric mean for wcslen-evex512 over all benchmarks (N=10) was (new/old) 0.965

Code Size Changes:
    strlen-evex512.S    :  +24 bytes
    wcslen-evex512.S    :  +54 bytes

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-12-18 12:38:01 -06:00
H.J. Lu
442983319b x86/cet: Don't assume that SHSTK implies IBT
Since shadow stack (SHSTK) is enabled in the Linux kernel without
enabling indirect branch tracking (IBT), don't assume that SHSTK
implies IBT.  Use "CPU_FEATURE_ACTIVE (IBT)" to check if IBT is active
and "CPU_FEATURE_ACTIVE (SHSTK)" to check if SHSTK is active.
2023-12-18 07:04:18 -08:00
H.J. Lu
0b850186fd x86/cet: Check user_shstk in /proc/cpuinfo
Linux kernel reports CPU shadow stack feature in /proc/cpuinfo as
user_shstk, instead of shstk.
2023-12-17 10:42:06 -08:00
Manjunath Matti
93a739d4a1 powerpc: Add space for HWCAP3/HWCAP4 in the TCB for future Power.
This patch reserves space for HWCAP3/HWCAP4 in the TCB of powerpc.
These hardware capabilities bits will be used by future Power
architectures.

Versioned symbol '__parse_hwcap_3_4_and_convert_at_platform' advertises
the availability of the new HWCAP3/HWCAP4 data in the TCB.

This is an ABI change for GLIBC 2.39.

Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2023-12-15 20:20:14 -06:00
Amrita H S
90bcc8721e powerpc: Fix performance issues of strcmp power10
Current implementation of strcmp for power10 has
performance regression for multiple small sizes
and alignment combination.

Most of these performance issues are fixed by this
patch. The compare loop is unrolled and page crosses
of unrolled loop is handled.

Thanks to Paul E. Murphy for helping in fixing the
performance issues.

Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Co-Authored-By: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2023-12-15 16:42:40 -06:00
MAHESH BODAPATI
b9182c793c powerpc : Add optimized memchr for POWER10
Optimized memchr for POWER10 based on existing rawmemchr and strlen.
Reordering instructions and loop unrolling helped in getting better performance.
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2023-12-14 14:40:14 -06:00
H.J. Lu
4753e92868 x86: Check PT_GNU_PROPERTY early
The PT_GNU_PROPERTY segment is scanned before PT_NOTE.  For binaries
with the PT_GNU_PROPERTY segment, we can check it to avoid scan of
the PT_NOTE segment.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-12-11 11:05:46 -08:00
H.J. Lu
7e03e0de7e sysdeps/x86/Makefile: Split and sort tests
Put each test on a separate line and sort tests.
2023-12-11 08:49:57 -08:00
Amrita H S
3367d8e180 powerpc: Optimized strcmp for power10
This patch is based on __strcmp_power9 and __strlen_power10.

Improvements from __strcmp_power9:

    1. Uses new POWER10 instructions
       - This code uses lxvp to decrease contention on load
         by loading 32 bytes per instruction.

    2. Performance implication
       - This version has around 30% better performance on average.
       - Performance regression is seen for a specific combination
         of sizes and alignments. Some of them is observed without
         changes also, while rest may be induced by the patch.

Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
2023-12-07 11:10:40 -06:00
Adhemerval Zanella
61d848b554 elf: Ignore LD_BIND_NOW and LD_BIND_NOT for setuid binaries
To avoid any environment variable to change setuid binaries
semantics.

Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-12-05 13:21:36 -03:00
Adhemerval Zanella
876a12e513 elf: Ignore loader debug env vars for setuid
Loader already ignores LD_DEBUG, LD_DEBUG_OUTPUT, and
LD_TRACE_LOADED_OBJECTS. Both LD_WARN and LD_VERBOSE are similar to
LD_DEBUG, in the sense they enable additional checks and debug
information, so it makes sense to disable them.

Also add both LD_VERBOSE and LD_WARN on filtered environment variables
for setuid binaries.

Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-12-05 13:21:36 -03:00
Andreas Schwab
3f79842788 aarch64: correct CFI in rawmemchr (bug 31113)
The .cfi_return_column directive changes the return column for the whole
FDE range.  But the actual intent is to tell the unwinder that the value
in x30 (lr) now resides in x15 after the move, and that is expressed by
the .cfi_register directive.
2023-12-05 12:49:37 +01:00
Joe Ramsay
63d0a35d5f math: Add new exp10 implementation
New implementation is based on the existing exp/exp2, with different
reduction constants and polynomial. Worst-case error in round-to-
nearest is 0.513 ULP.

The exp/exp2 shared table is reused for exp10 - .rodata size of
e_exp_data increases by 64 bytes.

As for exp/exp2, targets with single-instruction rounding/conversion
intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the
necessary code to their math_private.h.

Improvements on Neoverse V1 compared to current GLIBC master:
exp10 thruput: 3.3x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
exp10 latency: 1.8x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]

Tested on:
    aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
    x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction)

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-12-04 15:52:11 +00:00
Szabolcs Nagy
8e755f5bc8 aarch64: fix tested ifunc variants
Don't test a64fx string functions when BTI is enabled since they are
not BTI compatible.
2023-12-04 14:41:26 +00:00
Samuel Thibault
2fb85a3787 hurd: [!__USE_MISC] Do not #undef BSD macros in ioctls
When e.g. including termios.h first and then sys/ioctl.h, without e.g.
_BSD_SOURCE, the latter would #undef e.g. ECHO, without defining it.
2023-12-02 21:26:50 +01:00
Adhemerval Zanella
4e16d89866 linux: Make fdopendir fail with O_PATH (BZ 30373)
It is not strictly required by the POSIX, since O_PATH is a Linux
extension, but it is QoI to fail early instead of at readdir.  Also
the check is free, since fdopendir already checks if the file
descriptor is opened for read.

Checked on x86_64-linux-gnu.
2023-11-30 13:37:04 -03:00
Stefan Liebler
807849965b Avoid padding in _init and _fini. [BZ #31042]
The linker just concatenates the .init and .fini sections which
results in the complete _init and _fini functions. If needed the
linker adds padding bytes due to an alignment. GNU ld is adding
NOPs, which is fine.  But e.g. mold is adding traps which results
in broken _init and _fini functions.

Thus this patch removes the alignment in .init and .fini sections
in crtn.S files.

We keep the 4 byte function alignment in crti.S files. As the
assembler now also outputs the start of _init and _fini functions
as multiples of 4 byte, it perhaps has to fill it. Although GNU as
is using NOPs here, to be sure, we just keep the alignment with
0x07 (=NOPs) at the end of crti.S.

In order to avoid an obvious NOP slide in _fini, this patch also
uses an lg instead of lgr instruction. Then the emitted instructions
needs a multiple of 4 bytes.
2023-11-30 13:31:23 +01:00
Joe Ramsay
7b12776584 aarch64: Improve special-case handling in AdvSIMD double-precision libmvec routines
Avoids emitting many saves/restores of vector registers, reduces the
amount of code generated around the scalar fallback.
2023-11-29 15:03:36 +00:00
Noah Goldstein
9469261cf1 x86: Only align destination to 1x VEC_SIZE in memset 4x loop
Current code aligns to 2x VEC_SIZE. Aligning to 2x has no affect on
performance other than potentially resulting in an additional
iteration of the loop.
1x maintains aligned stores (the only reason to align in this case)
and doesn't incur any unnecessary loop iterations.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2023-11-28 12:06:19 -06:00
Tobias Klauser
06bbe63e36 Add TCP_MD5SIG_FLAG_IFINDEX from Linux 5.6 to netinet/tcp.h.
This patch adds the TCP_MD5SIG_FLAG_IFINDEX constant from Linux 5.6 to
sysdeps/gnu/netinet/tcp.h and updates struct tcp_md5sig accordingly to
contain the device index.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-11-28 13:44:47 +01:00
Joseph Myers
2e0c0ff95c Remove __access_noerrno
A recent commit, apparently commit
6c6fce572f "elf: Remove /etc/suid-debug
support", resulted in localplt failures for i686-gnu and x86_64-gnu:

Missing required PLT reference: ld.so: __access_noerrno

After that commit, __access_noerrno is actually no longer used at all.
So rather than just removing the localplt expectation for that symbol
for Hurd, completely remove all definitions of and references to that
symbol.

Tested for x86_64, and with build-many-glibcs.py for i686-gnu and
x86_64-gnu.
2023-11-23 19:01:32 +00:00
Adhemerval Zanella
472894d2cf malloc: Use __get_nprocs on arena_get2 (BZ 30945)
This restore the 2.33 semantic for arena_get2.  It was changed by
11a02b035b to avoid arena_get2 call malloc (back when __get_nproc
was refactored to use an scratch_buffer - 903bc7dcc2).  The
__get_nproc was refactored over then and now it also avoid to call
malloc.

The 11a02b035b did not take in consideration any performance
implication, which should have been discussed properly.  The
__get_nprocs_sched is still used as a fallback mechanism if procfs
and sysfs is not acessible.

Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-22 09:39:29 -03:00
Joe Ramsay
bd70d3bacf aarch64: Fix libmvec benchmarks
These were broken by the new atan2 functions, as they were only
set up for univariate functions. Arity is now detected from the
input file - this revealed a mistake that the double-precision
inputs were being used for both single- and double-precision
routines, which is now remedied.
2023-11-22 09:10:43 +00:00
Adhemerval Zanella
55f41ef8de elf: Remove LD_PROFILE for static binaries
The _dl_non_dynamic_init does not parse LD_PROFILE, which does not
enable profile for dlopen objects.  Since dlopen is deprecated for
static objects, it is better to remove the support.

It also allows to trim down libc.a of profile support.

Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-11-21 16:15:42 -03:00
Adhemerval Zanella
1c87f71a36 s390: Use dl-symbol-redir-ifunc.h on cpu-tunables
Using the memcmp symbol directly allows the compile to inline the
memcmp calls (especially because _dl_tunable_set_hwcaps uses constants
values), generating better code.

Checked with tst-tunables on s390x-linux-gnu (qemu system).
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-11-21 16:15:42 -03:00
Adhemerval Zanella
4862d546c0 x86: Use dl-symbol-redir-ifunc.h on cpu-tunables
The dl-symbol-redir-ifunc.h redirects compiler-generated libcalls to
arch-specific memory implementations to avoid ifunc calls where it is not
yet possible. The memcmp-isa-default-impl.h aims to fix the same issue
by calling the specific memset implementation directly.

Using the memcmp symbol directly allows the compiler to inline the memset
calls (especially because _dl_tunable_set_hwcaps uses constants values),
generating better code.

Checked on x86_64-linux-gnu.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-11-21 16:15:42 -03:00
Adhemerval Zanella
434eca873f elf: Fix _dl_debug_vdprintf to work before self-relocation
The strlen might trigger and invalid GOT entry if it used before
the process is self-relocated (for instance on dl-tunables if any
error occurs).

For i386, _dl_writev with PIE requires to use the old 'int $0x80'
syscall mode because the calling the TLS register (gs) is not yet
initialized.

Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-11-21 16:15:42 -03:00
Adhemerval Zanella
11f7e3dd8f elf: Add all malloc tunable to unsecvars
Some environment variables allow alteration of allocator behavior
across setuid boundaries, where a setuid program may ignore the
tunable, but its non-setuid child can read it and adjust the memory
allocator behavior accordingly.

Most library behavior tunings is limited to the current process and does
not bleed in scope; so it is unclear how pratical this misfeature is.
If behavior change across privilege boundaries is desirable, it would be
better done with a wrapper program around the non-setuid child that sets
these envvars, instead of using the setuid process as the messenger.

The patch as fixes tst-env-setuid, where it fail if any unsecvars is
set.  It also adds a dynamic test, although it requires
--enable-hardcoded-path-in-tests so kernel correctly sets the setuid
bit (using the loader command directly would require to set the
setuid bit on the loader itself, which is not a usual deployment).

Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-21 16:15:42 -03:00
Adhemerval Zanella
9c96c87d60 elf: Ignore GLIBC_TUNABLES for setuid/setgid binaries
The tunable privilege levels were a retrofit to try and keep the malloc
tunable environment variables' behavior unchanged across security
boundaries.  However, CVE-2023-4911 shows how tricky can be
tunable parsing in a security-sensitive environment.

Not only parsing, but the malloc tunable essentially changes some
semantics on setuid/setgid processes.  Although it is not a direct
security issue, allowing users to change setuid/setgid semantics is not
a good security practice, and requires extra code and analysis to check
if each tunable is safe to use on all security boundaries.

It also means that security opt-in features, like aarch64 MTE, would
need to be explicit enabled by an administrator with a wrapper script
or with a possible future system-wide tunable setting.

Co-authored-by: Siddhesh Poyarekar  <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-21 16:15:42 -03:00
Adhemerval Zanella
a72a4eb10b elf: Add GLIBC_TUNABLES to unsecvars
setuid/setgid process now ignores any glibc tunables, and filters out
all environment variables that might changes its behavior. This patch
also adds GLIBC_TUNABLES, so any spawned process by setuid/setgid
processes should set tunable explicitly.

Checked on x86_64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-11-21 16:15:42 -03:00
Samuel Thibault
49b308a26e hurd: Prevent the final file_exec_paths call from signals
Otherwise if the exec server started thrashing the old task,
we won't be able to restart the exec.

This notably fixes building ghc.
2023-11-20 23:28:16 +01:00
Joe Ramsay
a8830c9285 aarch64: Add vector implementations of expm1 routines
May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.
2023-11-20 17:53:14 +00:00
Adhemerval Zanella
65341f7bbe linux: Use fchmodat2 on fchmod for flags different than 0 (BZ 26401)
Linux 6.6 (09da082b07bbae1c) added support for fchmodat2, which has
similar semantics as fchmodat with an extra flag argument.  This
allows fchmodat to implement AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH
without the need for procfs.

The syscall is registered on all architectures (with value of 452
except on alpha which is 562, commit 78252deb023cf087).

The tst-lchmod.c requires a small fix where fchmodat checks two
contradictory assertions ('(st.st_mode & 0777) == 2' and
'(st.st_mode & 0777) == 3').

Checked on x86_64-linux-gnu on a 6.6 kernel.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-11-20 13:15:24 -03:00
Noah Goldstein
b7f8b6b64b x86: Fix unchecked AVX512-VBMI2 usage in strrchr-evex-base.S
strrchr-evex-base used `vpcompress{b|d}` in the page cross logic but
was missing the CPU_FEATURE checks for VBMI2 in the
ifunc/ifunc-impl-list.

The fix is either to add those checks or change the logic to not use
`vpcompress{b|d}`. Choosing the latter here so that the strrchr-evex
implementation is usable on SKX.

New implementation is a bit slower, but this is in a cold path so its
probably okay.
2023-11-15 11:09:44 -06:00
Andreas Larsson
578190b7e4 sparc: Fix broken memset for sparc32 [BZ #31068]
Fixes commit a61933fe27 ("sparc: Remove bzero optimization") that
after moving code jumped to the wrong label 4.

Verfied by successfully running string/test-memset on sparc32.

Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-11-15 10:26:37 -03:00
Samuel Thibault
323f367cc4 hurd: Fix spawni returning allocation errors. 2023-11-14 23:55:35 +01:00
Wilco Dijkstra
2f5524cc53 AArch64: Remove Falkor memcpy
The latest implementations of memcpy are actually faster than the Falkor
implementations [1], so remove the falkor/phecda ifuncs for memcpy and
the now unused IS_FALKOR/IS_PHECDA defines.

[1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-11-13 16:52:50 +00:00
Wilco Dijkstra
3d7090f14b AArch64: Add memset_zva64
Add a specialized memset for the common ZVA size of 64 to avoid the
overhead of reading the ZVA size.  Since the code is identical to
__memset_falkor, remove the latter.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-11-13 16:50:44 +00:00
Wilco Dijkstra
9627ab99b5 AArch64: Cleanup emag memset
Cleanup emag memset - merge the memset_base64.S file, remove
the unused ZVA code (since it is disabled on emag).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-11-13 16:45:47 +00:00
Joe Ramsay
3548a4f087 aarch64: Add vector implementations of log1p routines
May discard sign of zero.
2023-11-10 17:07:43 +00:00
Joe Ramsay
b07038c5d3 aarch64: Add vector implementations of atan2 routines 2023-11-10 17:07:43 +00:00
Joe Ramsay
d30c39f80d aarch64: Add vector implementations of atan routines 2023-11-10 17:07:42 +00:00
Joe Ramsay
b5d23367a8 aarch64: Add vector implementations of acos routines 2023-11-10 17:07:42 +00:00
Joe Ramsay
9bed498418 aarch64: Add vector implementations of asin routines 2023-11-10 17:07:42 +00:00
Adhemerval Zanella
bf033c0072 elf: Add glibc.mem.decorate_maps tunable
The PR_SET_VMA_ANON_NAME support is only enabled through a configurable
kernel switch, mainly because assigning a name to a
anonymous virtual memory area might prevent that area from being
merged with adjacent virtual memory areas.

For instance, with the following code:

   void *p1 = mmap (NULL,
                    1024 * 4096,
                    PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS,
                    -1,
                    0);

   void *p2 = mmap (p1 + (1024 * 4096),
                    1024 * 4096,
                    PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS,
                    -1,
                    0);

The kernel will potentially merge both mappings resulting in only one
segment of size 0x800000.  If the segment is names with
PR_SET_VMA_ANON_NAME with different names, it results in two mappings.

Although this will unlikely be an issue for pthread stacks and malloc
arenas (since for pthread stacks the guard page will result in
a PROT_NONE segment, similar to the alignment requirement for the arena
block), it still might prevent the mmap memory allocated for detail
malloc.

There is also another potential scalability issue, where the prctl
requires
to take the mmap global lock which is still not fully fixed in Linux
[1] (for pthread stacks and arenas, it is mitigated by the stack
cached and the arena reuse).

So this patch disables anonymous mapping annotations as default and
add a new tunable, glibc.mem.decorate_maps, can be used to enable
it.

[1] https://lwn.net/Articles/906852/

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07 10:27:57 -03:00
Adhemerval Zanella
f10ba2ab25 linux: Decorate __libc_fatal error buffer
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07 10:27:53 -03:00
Adhemerval Zanella
78ed8bdf4f linux: Add PR_SET_VMA_ANON_NAME support
Linux 5.17 added support to naming anonymous virtual memory areas
through the prctl syscall.  The __set_vma_name is a wrapper to avoid
optimizing the prctl call if the kernel does not support it.

If the kernel does not support PR_SET_VMA_ANON_NAME, prctl returns
EINVAL. And it also returns the same error for an invalid argument.
Since it is an internal-only API, it assumes well-formatted input:
aligned START, with (START, START+LEN) being a valid memory range,
and NAME with a limit of 80 characters without an invalid one
("\\`$[]").
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07 10:27:20 -03:00
Samuel Thibault
091ee2190d hurd: statfsconv: Add missing f_ffree conversion 2023-11-07 12:51:25 +01:00
Flavio Cruz
5dd3bda59c Update BAD_TYPECHECK to work on x86_64
Message-ID: <ZUhn7LOcgLOJjKZr@jupiter.tail36e24.ts.net>
2023-11-06 23:24:48 +01:00
Sergio Durigan Junior
f957f47df7 sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789]
When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.

In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist.  After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".

The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.

This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.

Fix the bug by simplifying the logic when choosing open(2) flags and
making sure O_CREAT is not set when the semaphore file is expected to
exist.

A regression test for this issue would require a complex and cpu time
consuming logic, since to trigger the wrong code path is not
straightforward due the racy condition.  There is a somewhat reliable
reproducer in the bug, but it requires using OpenMPI.

This resolves BZ #30789.

See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912

Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Fixes: 533deafbdf ("Use O_CLOEXEC in more places (BZ #15722)")
2023-11-03 15:19:38 -03:00
Joseph Myers
ac79930498 Add SEGV_CPERR from Linux 6.6 to bits/siginfo-consts.h
Linux 6.6 adds the constant SEGV_CPERR.  Add it to glibc's
bits/siginfo-consts.h.

Tested for x86_64.
2023-11-03 16:36:35 +00:00
Adhemerval Zanella
9b3cb0277e linux: Add HWCAP2_HBC from Linux 6.6 to AArch64 bits/hwcap.h 2023-11-03 10:01:46 -03:00
Adhemerval Zanella
10b4c8b96f linux: Add FSCONFIG_CMD_CREATE_EXCL from Linux 6.6 to sys/mount.h
The tst-mount-consts.py does not need to be updated because kernel
exports it as an enum (compare_macro_consts can not parse it).
2023-11-03 10:01:46 -03:00
Adhemerval Zanella
cb8c78b2ff linux: Add MMAP_ABOVE4G from Linux 6.6 to sys/mman.h
x86 added the flag (29f890d1050fc099f) for CET enabled.

Also update tst-mman-consts.py test.
2023-11-03 10:01:46 -03:00
Adhemerval Zanella
f680063f30 Update kernel version to 6.6 in header constant tests
There are no new constants covered, the tst-mman-consts.py is
updated separately along with a header constant addition.
2023-11-03 10:01:46 -03:00
Adhemerval Zanella
582383b37d Update syscall lists for Linux 6.6
Linux 6.6 has one new syscall for all architectures, fchmodat2, and
the map_shadow_stack on x86_64.
2023-11-03 10:01:46 -03:00
Wilco Dijkstra
9fd3409842 AArch64: Cleanup ifuncs
Cleanup ifuncs.  Remove uses of libc_hidden_builtin_def, use ENTRY rather than
ENTRY_ALIGN, remove unnecessary defines and conditional compilation.  Rename
strlen_mte to strlen_generic.  Remove rtld-memset.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-11-01 13:41:59 +00:00
Arjun Shankar
9db31d7456 Use correct subdir when building tst-rfc3484* for mach and arm
Commit 7f602256ab moved the tst-rfc3484*
tests from posix/ to nss/, but didn't correct references to point to
their new subdir when building for mach and arm.  This commit fixes
that.

Tested with build-many-glibcs.sh for i686-gnu.
2023-11-01 11:53:03 +01:00
Adhemerval Zanella
fccf38c517 string: Add internal memswap implementation
The prototype is:

  void __memswap (void *restrict p1, void *restrict p2, size_t n)

The function swaps the content of two memory blocks P1 and P2 of
len N.  Memory overlap is NOT handled.

It will be used on qsort optimization.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-10-31 14:17:33 -03:00
Adhemerval Zanella
e6e3c66688 crypt: Remove libcrypt support
All the crypt related functions, cryptographic algorithms, and
make requirements are removed,  with only the exception of md5
implementation which is moved to locale folder since it is
required by localedef for integrity protection (libc's
locale-reading code does not check these, but localedef does
generate them).

Besides thec code itself, both internal documentation and the
manual is also adjusted.  This allows to remove both --enable-crypt
and --enable-nss-crypt configure options.

Checked with a build for all affected ABIs.

Co-authored-by: Zack Weinberg <zack@owlfolio.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-10-30 13:03:59 -03:00
Adhemerval Zanella
bb2ff12abd sparc: Remove optimize md5, sha256, and sha512
The libcrypt was maked to be phase out on 2.38, and a better project
already exist that provide both compatibility and better API
(libxcrypt).  The sparc optimizations add the burden to extra
build-many-glibcs.py configurations.

Checked on sparc64 and sparcv9.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-10-30 13:03:59 -03:00
caiyinyu
dd53a60282 LoongArch: Delete excessively allocated memory. 2023-10-26 17:29:55 +08:00
caiyinyu
83c081f73e LoongArch: Update hwcap.h to sync with LoongArch kernel. 2023-10-26 17:23:47 +08:00
caiyinyu
83e9576d41 LoongArch: Unify Register Names. 2023-10-26 17:23:47 +08:00
Wilco Dijkstra
2bd0017988 AArch64: Add support for MOPS memcpy/memmove/memset
Add support for MOPS in cpu_features and INIT_ARCH.  Add ifuncs using MOPS for
memcpy, memmove and memset (use .inst for now so it works with all binutils
versions without needing complex configure and conditional compilation).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-10-24 13:37:48 +01:00
Arjun Shankar
7f602256ab Move getaddrinfo from 'posix' into 'nss'
getaddrinfo is an entry point for nss functionality.  This commit moves
it from 'sysdeps/posix' to 'nss', gets rid of the stub in 'posix', and
moves all associated tests as well.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-10-24 12:31:00 +02:00
Joe Ramsay
31aaf6fed9 aarch64: Add vector implementations of exp10 routines
Double-precision routines either reuse the exp table (AdvSIMD) or use
SVE FEXPA intruction.
2023-10-23 15:00:45 +01:00
Joe Ramsay
067a34156c aarch64: Add vector implementations of log10 routines
A table is also added, which is shared between AdvSIMD and SVE log10.
2023-10-23 15:00:45 +01:00
Joe Ramsay
a8e3ab3074 aarch64: Add vector implementations of log2 routines
A table is also added, which is shared between AdvSIMD and SVE log2.
2023-10-23 15:00:45 +01:00
Joe Ramsay
b39e9db5e3 aarch64: Add vector implementations of exp2 routines
Some routines reuse table from v_exp_data.c
2023-10-23 15:00:45 +01:00
Joe Ramsay
f554334c05 aarch64: Add vector implementations of tan routines
This includes some utility headers for evaluating polynomials using
various schemes.
2023-10-23 15:00:44 +01:00
Stefan Liebler
f5677d9ceb tst-spawn-cgroup.c: Fix argument order of UNSUPPORTED message.
The arguments for "expected" and "got" are mismatched.  Furthermore
this patch is dumping both values as hex.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-10-20 08:46:09 +02:00
Stefan Liebler
97a58d885b s390: Fix undefined behaviour in feenableexcept, fedisableexcept [BZ #30960]
If feenableexcept or fedisableexcept gets excepts=FE_INVALID=0x80
as input, we have a signed left shift: 0x80 << 24 which is not
representable as int and thus is undefined behaviour according to
C standard.

This patch casts excepts as unsigned int before shifting, which is
defined.

For me, the observed undefined behaviour is that the shift is done
with "unsigned"-instructions, which is exactly what we want.
Furthermore, I don't get any exception-flags.

After the fix, the code is using the same instruction sequence as
before.
2023-10-19 14:28:22 +02:00
Florian Weimer
dd32e1db38 Revert "elf: Always call destructors in reverse constructor order (bug 30785)"
This reverts commit 6985865bc3.

Reason for revert:

The commit changes the order of ELF destructor calls too much relative
to what applications expect or can handle.  In particular, during
process exit and _dl_fini, after the revert commit, we no longer call
the destructors of the main program first; that only happens after
some dlopen'ed objects have been destructed.  This robs applications
of an opportunity to influence destructor order by calling dlclose
explicitly from the main program's ELF destructors.  A couple of
different approaches involving reverse constructor order were tried,
and none of them worked really well.  It seems we need to keep the
dependency sorting in _dl_fini.

There is also an ambiguity regarding nested dlopen calls from ELF
constructors: Should those destructors run before or after the object
that called dlopen?  Commit 6985865bc3 used reverse order
of the start of ELF constructor calls for destructors, but arguably
using completion of constructors is more correct.  However, that alone
is not sufficient to address application compatibility issues (it
does not change _dl_fini ordering at all).
2023-10-18 11:30:38 +02:00
Bruno Victal
3333eb55b7 Add LE DSCP code point from RFC-8622.
Signed-off-by: Bruno Victal <mirai@makinata.eu>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-10-17 19:00:27 +02:00
Joseph Myers
ff5d2abd18 Add HWCAP2_MOPS from Linux 6.5 to AArch64 bits/hwcap.h
Linux 6.5 adds a new AArch64 HWCAP2 value, HWCAP2_MOPS.  Add it to
glibc's bits/hwcap.h.

Tested with build-many-glibcs.py for aarch64-linux-gnu.
2023-10-17 13:13:27 +00:00
Joseph Myers
5ef608f364 Add SCM_SECURITY, SCM_PIDFD to bits/socket.h
Linux 6.5 adds a constant SCM_PIDFD (recall that the non-uapi
linux/socket.h, where this constant is added, is in fact a header
providing many constants that are part of the kernel/userspace
interface).  This shows up that SCM_SECURITY, from the same set of
definitions and added in Linux 2.6.17, is also missing from glibc,
although glibc has the first two constants from this set, SCM_RIGHTS
and SCM_CREDENTIALS; add both missing constants to glibc.

Tested for x86_64.
2023-10-16 13:19:26 +00:00
Joseph Myers
2399ab0d20 Add AT_HANDLE_FID from Linux 6.5 to bits/fcntl-linux.h
Linux 6.5 adds a constant AT_HANDLE_FID; add it to glibc.  Because
this is a flag for the function name_to_handle_at declared in
bits/fcntl-linux.h, put the flag there rather than alongside other
AT_* flags in (OS-independent) fcntl.h.

Tested for x86_64.
2023-10-16 13:18:51 +00:00
Andreas Schwab
5aa1ddfcb3 Avoid maybe-uninitialized warning in __kernel_rem_pio2
With GCC 14 on 32-bit x86 the compiler emits a maybe-uninitialized
warning:

../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function '__kernel_rem_pio2':
../sysdeps/ieee754/dbl-64/k_rem_pio2.c:364:20: error: 'fq' may be used uninitialized [-Werror=maybe-uninitialized]
  364 |           y[0] = fq[0]; y[1] = fq[1]; y[2] = fw;
      |                  ~~^~~

This is similar to the warning that is suppressed in the other branch of
the switch.  Help the compiler knowing that the variable is always
initialized, which also makes the suppression obsolete.
2023-10-16 09:59:32 +02:00
Noah Goldstein
a3c50bf46a x86: Prepare strrchr-evex and strrchr-evex512 for AVX10
This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
common implementation: `strrchr-evex-base.S`.

The motivation is `strrchr-evex` needed to be refactored to not use
64-bit masked registers in preperation for AVX10.

Once vec-width masked register combining was removed, the EVEX and
EVEX512 implementations can easily be implemented in the same file
without any major overhead.

The net result is performance improvements (measured on TGL) for both
`strrchr-evex` and `strrchr-evex512`. Although, note there are some
regressions in the test suite and it may be many of the cases that
make the total-geomean of improvement/regression across bench-strrchr
are cold. The point of the performance measurement is to show there
are no major regressions, but the primary motivation is preperation
for AVX10.

Benchmarks where taken on TGL:
https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html

EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87

Full check passes on x86.
2023-10-06 00:18:55 -05:00
Joe Ramsay
5a4b6f8e4b aarch64: Optimise vecmath logs
* Transpose table layout for improved memory access
* Use half-vector special comparisons for AdvSIMD
* Improve register use near special-case branches
  - Due to the presence of a function call, return value would get
    mov-d out of x0 in order to facilitate PCS. By moving the final
    computation after the branch this can be avoided

Also change SVE routines to use overloaded intrinsics for readability.
2023-10-05 16:54:16 +01:00
Joe Ramsay
480a0dfe1a aarch64: Cosmetic change in SVE exp routines
Use overloaded intrinsics for readability. Codegen does not
change, however while we're bringing the routines up-to-date with
recent improvements to other routines in AOR it is worth copying
this change over as well.
2023-10-05 16:54:00 +01:00
Joe Ramsay
9180160e08 aarch64: Optimize SVE cos & cosf
Saves a mov by ensuring return value does not need to be moved out of
the way before special-case branch. Also change to use overloaded
intrinsics.
2023-10-05 16:53:38 +01:00
Joe Ramsay
8014d1e832 aarch64: Improve vecmath sin routines
* Update ULP comment reflecting a new observed max in [-pi/2, pi/2]
* Use the same polynomial in AdvSIMD and SVE, rather than FTRIG instructions
* Improve register use near special-case branch

Also use overloaded intrinsics for SVE.
2023-10-05 16:53:06 +01:00
Volker Weißmann
7bb8045ec0 Fix FORTIFY_SOURCE false positive
When -D_FORTIFY_SOURCE=2 was given during compilation,
sprintf and similar functions will check if their
first argument is in read-only memory and exit with
*** %n in writable segment detected ***
otherwise. To check if the memory is read-only, glibc
reads frpm the file "/proc/self/maps". If opening this
file fails due to too many open files (EMFILE), glibc
will now ignore this error.

Fixes [BZ #30932]

Signed-off-by: Volker Weißmann <volker.weissmann@gmx.de>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-10-04 08:07:43 -03:00
Siddhesh Poyarekar
0d5f9ea97f Propagate GLIBC_TUNABLES in setxid binaries
GLIBC_TUNABLES scrubbing happens earlier than envvar scrubbing and some
tunables are required to propagate past setxid boundary, like their
env_alias.  Rely on tunable scrubbing to clean out GLIBC_TUNABLES like
before, restoring behaviour in glibc 2.37 and earlier.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-10-02 15:35:05 -04:00
Kir Kolyshkin
9e4e896f0f Linux: add ST_NOSYMFOLLOW
Linux v5.10 added a mount option MS_NOSYMFOLLOW, which was added to
glibc in commit 0ca21427d9.

Add the corresponding statfs/statvfs flag bit, ST_NOSYMFOLLOW.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-10-02 10:54:27 -03:00
Joe Simmons-Talbott
08e9a60a1a mips: dl-machine-reject-phdr: Get rid of alloca.
Read directly into the mips_abiflags struct rather than reading the
entire segment and using alloca when the passed buffer is not big enough.

Checked with build-many-glibcs.py on mips-linux-gnu

Tested-by: Ying Huang <ying.huang@oss.cipunited.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-10-02 12:55:27 +00:00
Noah Goldstein
d90b43a4ed x86: Add support for AVX10 preset and vec size in cpu-features
This commit add support for the new AVX10 cpu features:
https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf

We add checks for:
    - `AVX10`: Check if AVX10 is present.
    - `AVX10_{X,Y,Z}MM`: Check if a given vec class has AVX10 support.

`make check` passes and cpuid output was checked against GNR/DMR on an
emulator.
2023-09-29 14:18:42 -05:00
Samuel Thibault
29d4591b07 hurd: Drop REG_GSFS and REG_ESDS from x86_64's ucontext
These are useless on x86_64, and __NGREG was actually wrong with them.
2023-09-28 00:10:13 +02:00
Manjunath Matti
4eac1825ed fegetenv_and_set_rn now uses the builtins provided by GCC.
On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the
prologue get/save/set rounding mode operations for POWER9 and
later by using 'mffscrn' where possible, this was introduced by
commit f1c56cdff0.

GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn
which now returns the FPSCR fields in a double. This feature is
available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro
is defined.
GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds
this feature.

Changes are done to use __builtin_set_fpscr_rn instead of mffscrn
or mffscrni in __fe_mffscrn(rn).

Suggested-by: Carl Love <cel@us.ibm.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-27 13:55:36 -03:00
Adhemerval Zanella
551101e824 io: Do not implement fstat with fstatat
AT_EMPTY_PATH is a requirement to implement fstat over fstatat,
however it does not prevent the kernel to read the path argument.
It is not an issue, but on x86-64 with SMAP-capable CPUs the kernel is
forced to perform expensive user memory access.  After that regular
lookup is performed which adds even more overhead.

Instead, issue the fstat syscall directly on LFS fstat implementation
(32 bit architectures will still continue to use statx, which is
required to have 64 bit time_t support).  it should be even a
small performance gain on non x86_64, since there is no need
to handle the path argument.

Checked on x86_64-linux-gnu.
2023-09-27 09:30:24 -03:00
Wilco Dijkstra
6b695e5c62 AArch64: Remove -0.0 check from vector sin
Remove the unnecessary extra checks for sin (-0.0) from vector sin/sinf,
improving performance.  Passes regress.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-09-26 13:40:07 +01:00
Florian Weimer
f563971b5b elf: Add dummy declaration of _dl_audit_objclose for !SHARED
This allows us to avoid some #ifdef SHARED conditionals.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-26 11:40:12 +02:00
Romain Geissler
ec6b95c330 Fix leak in getaddrinfo introduced by the fix for CVE-2023-4806 [BZ #30843]
This patch fixes a very recently added leak in getaddrinfo.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-25 01:21:51 +01:00
caiyinyu
672b91ba10 Revert "LoongArch: Add glibc.cpu.hwcap support."
This reverts commit a53451559d.
2023-09-21 09:10:11 +08:00
Joseph Myers
457bb77255 Update kernel version to 6.5 in header constant tests
This patch updates the kernel version in the tests tst-mman-consts.py
and tst-pidfd-consts.py to 6.5.  (There are no new constants covered
by these tests in 6.5 that need any other header changes;
tst-mount-consts.py was updated separately along with a header
constant addition.)

Tested with build-many-glibcs.py.
2023-09-20 13:36:46 +00:00
caiyinyu
a53451559d LoongArch: Add glibc.cpu.hwcap support.
Key Points:
1. On lasx & lsx platforms, We must use _dl_runtime_{profile, resolve}_{lsx, lasx}
   to save vector registers.
2. Via "tunables", users can choose str/mem_{lasx,lsx,unaligned} functions with
   `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,...`.
   Note: glibc.cpu.hwcaps doesn't affect _dl_runtime_{profile, resolve}_{lsx, lasx}
   selection.

Usage Notes:
1. Only valid inputs: LASX, LSX, UAL. Case-sensitive, comma-separated, no spaces.
2. Example: `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL` turns on LASX & UAL.
   Unmentioned features turn off. With default ifunc: lasx > lsx > unaligned >
   aligned > generic, effect is: lasx > unaligned > aligned > generic; lsx off.
3. Incorrect GLIBC_TUNABLES settings will show error messages.
   For example: On lsx platforms, you cannot enable lasx features. If you do
   that, you will get error messages.
4. Valid input examples:
   - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX: lasx > aligned > generic.
   - GLIBC_TUNABLES=glibc.cpu.hwcaps=LSX,UAL: lsx > unaligned > aligned > generic.
   - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL,LASX,UAL,LSX,LASX,UAL: Repetitions
     allowed but not recommended. Results in: lasx > lsx > unaligned > aligned >
     generic.
2023-09-19 09:11:49 +08:00
Siddhesh Poyarekar
973fe93a56 getaddrinfo: Fix use after free in getcanonname (CVE-2023-4806)
When an NSS plugin only implements the _gethostbyname2_r and
_getcanonname_r callbacks, getaddrinfo could use memory that was freed
during tmpbuf resizing, through h_name in a previous query response.

The backing store for res->at->name when doing a query with
gethostbyname3_r or gethostbyname2_r is tmpbuf, which is reallocated in
gethosts during the query.  For AF_INET6 lookup with AI_ALL |
AI_V4MAPPED, gethosts gets called twice, once for a v6 lookup and second
for a v4 lookup.  In this case, if the first call reallocates tmpbuf
enough number of times, resulting in a malloc, th->h_name (that
res->at->name refers to) ends up on a heap allocated storage in tmpbuf.
Now if the second call to gethosts also causes the plugin callback to
return NSS_STATUS_TRYAGAIN, tmpbuf will get freed, resulting in a UAF
reference in res->at->name.  This then gets dereferenced in the
getcanonname_r plugin call, resulting in the use after free.

Fix this by copying h_name over and freeing it at the end.  This
resolves BZ #30843, which is assigned CVE-2023-4806.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-15 14:38:28 -04:00
dengjianbo
780adf7aea LoongArch: Change to put magic number to .rodata section
Change to put magic number to .rodata section in memmove-lsx, and use
pcalau12i and %pc_lo12 with vld to get the data.
2023-09-15 09:07:47 +08:00
dengjianbo
24279aecf3 LoongArch: Add ifunc support for strrchr{aligned, lsx, lasx}
According to glibc strrchr microbenchmark test results, this implementation
could reduce the runtime time as following:

Name                Percent of rutime reduced
strrchr-lasx        10%-50%
strrchr-lsx         0%-50%
strrchr-aligned     5%-50%

Generic strrchr is implemented by function strlen + memrchr, the lasx version
will compare with generic strrchr implemented by strlen-lasx + memrchr-lasx,
the lsx version will compare with generic strrchr implemented by strlen-lsx +
memrchr-lsx, the aligned version will compare with generic strrchr implemented
by strlen-aligned + memrchr-generic.
2023-09-15 09:07:47 +08:00
dengjianbo
06251002d4 LoongArch: Add ifunc support for strcpy, stpcpy{aligned, unaligned, lsx, lasx}
According to glibc strcpy and stpcpy microbenchmark test results(changed
to use generic_strcpy and generic_stpcpy instead of strlen + memcpy),
comparing with the generic version, this implementation could reduce the
runtime as following:

Name              Percent of rutime reduced
strcpy-aligned    8%-45%
strcpy-unaligned  8%-48%, comparing with the aligned version, unaligned
                  version takes less instructions to copy the tail of data
		  which length is less than 8. it also has better performance
		  in case src and dest cannot be both aligned with 8bytes
strcpy-lsx        20%-80%
strcpy-lasx       15%-86%
stpcpy-aligned    6%-43%
stpcpy-unaligned  8%-48%
stpcpy-lsx        10%-80%
stpcpy-lasx       10%-87%
2023-09-15 09:07:47 +08:00
caiyinyu
c6c73e136a LoongArch: Replace deprecated $v0 with $a0 to eliminate 'as' Warnings. 2023-09-15 09:07:47 +08:00
caiyinyu
f5242db159 LoongArch: Add lasx/lsx support for _dl_runtime_profile. 2023-09-15 09:07:42 +08:00
Joseph Myers
803f4073cc Add MOVE_MOUNT_BENEATH from Linux 6.5 to sys/mount.h
This patch adds the MOVE_MOUNT_BENEATH constant from Linux 6.5 to
glibc's sys/mount.h and updates tst-mount-consts.py to reflect these
constants being up to date with that Linux kernel version.

Tested with build-many-glibcs.py.
2023-09-14 14:58:15 +00:00
Joseph Myers
72511f539c Update syscall lists for Linux 6.5
Linux 6.5 has one new syscall, cachestat, and also enables the
cacheflush syscall for hppa.  Update syscall-names.list and regenerate
the arch-syscall.h headers with build-many-glibcs.py update-syscalls.

Tested with build-many-glibcs.py.
2023-09-12 14:08:53 +00:00
Sergei Trofimovich
073edbdfab
ia64: Work around miscompilation and fix build on ia64's gcc-10 and later
Needed since gcc-10 enabled -fno-common by default.

[In use in Gentoo since gcc-10, no problems observed.
Also discussed with and reviewed by Jessica Clarke from
Debian. Andreas]

Bug: https://bugs.gentoo.org/723268
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2023-09-11 19:19:46 +02:00
Joe Simmons-Talbott
5f798d38e9 stdio: Remove __libc_message alloca usage
Use a fixed size array instead.  The maximum number of arguments
is set by macro tricks.

Co-authored-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-11 16:16:49 +00:00
Samuel Thibault
a43003ebf6 htl: avoid exposing the vm_region symbol 2023-09-09 10:07:39 +02:00
Florian Weimer
6985865bc3 elf: Always call destructors in reverse constructor order (bug 30785)
The current implementation of dlclose (and process exit) re-sorts the
link maps before calling ELF destructors.  Destructor order is not the
reverse of the constructor order as a result: The second sort takes
relocation dependencies into account, and other differences can result
from ambiguous inputs, such as cycles.  (The force_first handling in
_dl_sort_maps is not effective for dlclose.)  After the changes in
this commit, there is still a required difference due to
dlopen/dlclose ordering by the application, but the previous
discrepancies went beyond that.

A new global (namespace-spanning) list of link maps,
_dl_init_called_list, is updated right before ELF constructors are
called from _dl_init.

In dl_close_worker, the maps variable, an on-stack variable length
array, is eliminated.  (VLAs are problematic, and dlclose should not
call malloc because it cannot readily deal with malloc failure.)
Marking still-used objects uses the namespace list directly, with
next and next_idx replacing the done_index variable.

After marking, _dl_init_called_list is used to call the destructors
of now-unused maps in reverse destructor order.  These destructors
can call dlopen.  Previously, new objects do not have l_map_used set.
This had to change: There is no copy of the link map list anymore,
so processing would cover newly opened (and unmarked) mappings,
unloading them.  Now, _dl_init (indirectly) sets l_map_used, too.
(dlclose is handled by the existing reentrancy guard.)

After _dl_init_called_list traversal, two more loops follow.  The
processing order changes to the original link map order in the
namespace.  Previously, dependency order was used.  The difference
should not matter because relocation dependencies could already
reorder link maps in the old code.

The changes to _dl_fini remove the sorting step and replace it with
a traversal of _dl_init_called_list.  The l_direct_opencount
decrement outside the loader lock is removed because it appears
incorrect: the counter manipulation could race with other dynamic
loader operations.

tst-audit23 needs adjustments to the changes in LA_ACT_DELETE
notifications.  The new approach for checking la_activity should
make it clearer that la_activty calls come in pairs around namespace
updates.

The dependency sorting test cases need updates because the destructor
order is always the opposite order of constructor order, even with
relocation dependencies or cycles present.

There is a future cleanup opportunity to remove the now-constant
force_first and for_fini arguments from the _dl_sort_maps function.

Fixes commit 1df71d32fe ("elf: Implement
force_first handling in _dl_sort_maps_dfs (bug 28937)").

Reviewed-by: DJ Delorie <dj@redhat.com>
2023-09-08 12:34:27 +02:00
Aurelien Jarno
434bf72a94 io: Fix record locking contants for powerpc64 with __USE_FILE_OFFSET64
Commit 5f828ff824 ("io: Fix F_GETLK, F_SETLK, and F_SETLKW for
powerpc64") fixed an issue with the value of the lock constants on
powerpc64 when not using __USE_FILE_OFFSET64, but it ended-up also
changing the value when using __USE_FILE_OFFSET64 causing an API change.

Fix that by also checking that define, restoring the pre
4d0fe291ae commit values:

Default values:
- F_GETLK: 5
- F_SETLK: 6
- F_SETLKW: 7

With -D_FILE_OFFSET_BITS=64:
- F_GETLK: 12
- F_SETLK: 13
- F_SETLKW: 14

At the same time, it has been noticed that there was no test for io lock
with __USE_FILE_OFFSET64, so just add one.

Tested on x86_64-linux-gnu, i686-linux-gnu and
powerpc64le-unknown-linux-gnu.

Resolves: BZ #30804.
Co-authored-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2023-09-07 21:56:31 +02:00
Joe Simmons-Talbott
955a47a4bf getaddrinfo: Get rid of alloca
Use a scratch_buffer rather than alloca to avoid potential stack
overflow.
2023-09-06 13:33:02 +00:00
Christoph Müllner
3d6fcf1bd7 riscv: Add support for XTheadBb in string-fz[a,i].h
XTheadBb has similar instructions like Zbb, which allow optimized
string processing:
* th.ff0: find-first zero is a CLZ instruction.
* th.tstnbz: Similar like orc.b, but with a bit-inverted result.

The instructions are documented here:
  https://github.com/T-head-Semi/thead-extension-spec/tree/master/xtheadbb

These instructions can be found in the T-Head C906 and the C910.

Tested with the string tests.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-09-06 09:27:43 -03:00
Siddhesh Poyarekar
3bf7bab88b getcanonname: Fix a typo
This code is generally unused in practice since there don't seem to be
any NSS modules that only implement _nss_MOD_gethostbyname2_r and not
_nss_MOD_gethostbyname3_r.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-09-05 17:04:05 -04:00
Adhemerval Zanella Netto
e7190fc73d linux: Add pidfd_getpid
This interface allows to obtain the associated process ID from the
process file descriptor.  It is done by parsing the procps fdinfo
information.  Its prototype is:

   pid_t pidfd_getpid (int fd)

It returns the associated pid or -1 in case of an error and sets the
errno accordingly.  The possible errno values are those from open, read,
and close (used on procps parsing), along with:

   - EBADF if the FD is negative, does not have a PID associated, or if
     the fdinfo fields contain a value larger than pid_t.

   - EREMOTE if the PID is in a separate namespace.

   - ESRCH if the process is already terminated.

Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
support), Linux 5.4 (full support), and Linux 6.2.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-09-05 13:08:59 -03:00
Adhemerval Zanella Netto
0d6f9f6265 posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
Returning a pidfd allows a process to keep a race-free handle for a
child process, otherwise, the caller will need to either use pidfd_open
(which still might be subject to TOCTOU) or keep the old racy interface
base on pid_t.

To correct use pifd_spawn, the kernel must support not only returning
the pidfd with clone/clone3 but also waitid (P_PIDFD) (added on Linux
5.4).  If kernel does not support the waitid, pidfd return ENOSYS.
It avoids the need to racy workarounds, such as reading the procfs
fdinfo to get the pid to use along with other wait interfaces.

These interfaces are similar to the posix_spawn and posix_spawnp, with
the only difference being it returns a process file descriptor (int)
instead of a process ID (pid_t).  Their prototypes are:

  int pidfd_spawn (int *restrict pidfd,
                   const char *restrict file,
                   const posix_spawn_file_actions_t *restrict facts,
                   const posix_spawnattr_t *restrict attrp,
                   char *const argv[restrict],
                   char *const envp[restrict])

  int pidfd_spawnp (int *restrict pidfd,
                    const char *restrict path,
                    const posix_spawn_file_actions_t *restrict facts,
                    const posix_spawnattr_t *restrict attrp,
                    char *const argv[restrict_arr],
                    char *const envp[restrict_arr]);

A new symbol is used instead of a posix_spawn extension to avoid
possible issues with language bindings that might track the return
argument lifetime.  Although on Linux pid_t and int are interchangeable,
POSIX only states that pid_t should be a signed integer.

Both symbols reuse the posix_spawn posix_spawn_file_actions_t and
posix_spawnattr_t, to void rehash posix_spawn API or add a new one. It
also means that both interfaces support the same attribute and file
actions, and a new flag or file action on posix_spawn is also added
automatically for pidfd_spawn.

Also, using posix_spawn plumbing allows the reusing of most of the
current testing with some changes:

  - waitid is used instead of waitpid since it is a more generic
    interface.

  - tst-posix_spawn-setsid.c is adapted to take into consideration that
    the caller can check for session id directly.  The test now spawns
itself and writes the session id as a file instead.

  - tst-spawn3.c need to know where pidfd_spawn is used so it keeps an
    extra file description unused.

Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
support), Linux 5.4 (full support), and Linux 6.2.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-09-05 13:08:59 -03:00
Adhemerval Zanella Netto
ce2bfb8569 linux: Add posix_spawnattr_{get, set}cgroup_np (BZ 26371)
These functions allow to posix_spawn and posix_spawnp to use
CLONE_INTO_CGROUP with clone3, allowing the child process to
be created in a different cgroup version 2.  These are GNU
extensions that are available only for Linux, and also only
for the architectures that implement clone3 wrapper
(HAVE_CLONE3_WRAPPER).

To create a process on a different cgroupv2, one can use the:

  posix_spawnattr_t attr;
  posix_spawnattr_init (&attr);
  posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP);
  posix_spawnattr_setcgroup_np (&attr, cgroup);
  posix_spawn (...)

Similar to other posix_spawn flags, POSIX_SPAWN_SETCGROUP control
whether the cgroup file descriptor will be used or not with
clone3.

There is no fallback if either clone3 does not support the flag
or if the architecture does not provide the clone3 wrapper, in
this case posix_spawn returns EOPNOTSUPP.

Checked on x86_64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-09-05 13:08:48 -03:00
Adhemerval Zanella Netto
ad77b1bcca linux: Define __ASSUME_CLONE3 to 0 for alpha, ia64, nios2, sh, and sparc
Not all architectures added clone3 syscall.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-09-05 10:15:48 -03:00
Adhemerval Zanella Netto
e7d1c58664 mips: Add the clone3 wrapper
It follows the internal signature:

extern int clone3 (struct clone_args *__cl_args, size_t __size,
                   int (*__func) (void *__arg), void *__arg);

Checked on mips64el-linux-gnueabihf, mips64el-n32-linux-gnu, and
mipsel-linux-gnu.
2023-09-05 10:15:48 -03:00
Adhemerval Zanella Netto
b56f7fe79e arm: Add the clone3 wrapper
It follows the internal signature:

  extern int clone3 (struct clone_args *__cl_args, size_t __size,
		    int (*__func) (void *__arg), void *__arg);

Checked on arm-linux-gnueabihf.
2023-09-05 10:15:48 -03:00