Commit Graph

16092 Commits

Author SHA1 Message Date
Adhemerval Zanella
71149c2a2e elf: Only process multiple tunable once (BZ 31686)
The 680c597e9c commit made loader reject ill-formatted strings by
first tracking all set tunables and then applying them. However, it does
not take into consideration if the same tunable is set multiple times,
where parse_tunables_string appends the found tunable without checking
if it was already in the list. It leads to a stack-based buffer overflow
if the tunable is specified more than the total number of tunables.  For
instance:

  GLIBC_TUNABLES=glibc.malloc.check=2:... (repeat over the number of
  total support for different tunable).

Instead, use the index of the tunable list to get the expected tunable
entry.  Since now the initial list is zero-initialized, the compiler
might emit an extra memset and this requires some minor adjustment
on some ports.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Reported-by: Yuto Maeda <maeda@cyberdefense.jp>
Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit bcae44ea85)
2024-05-07 14:06:56 -03:00
Gabi Falk
8b005d7869
i686: Fix multiple definitions of __memmove_chk and __memset_chk
Commit c73c96a4a1 updated memcpy.S and
mempcpy.S, but omitted memmove.S and memset.S.  As a result, the static
library built as PIC, whether with or without multiarch support,
contains two definitions for each of the __memmove_chk and __memset_chk
symbols.

/usr/lib/gcc/i686-pc-linux-gnu/14/../../../../i686-pc-linux-gnu/bin/ld: /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset-ia32.o): in function `__memset_chk':
/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/string/../sysdeps/i386/i686/memset.S:32: multiple definition of `__memset_chk'; /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset_chk.o):/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/debug/../sysdeps/i386/i686/multiarch/memset_chk.c:24: first defined here

After this change, regardless of PIC options, the static library, built
for i686 with multiarch contains implementations of these functions
respectively from debug/memmove_chk.c and debug/memset_chk.c, and
without multiarch contains implementations of these functions
respectively from sysdeps/i386/memmove_chk.S and
sysdeps/i386/memset_chk.S.  This ensures that memmove and memset won't
pull in __chk_fail and the routines it calls.

Reported-by: Sam James <sam@gentoo.org>
Tested-by: Sam James <sam@gentoo.org>
Fixes: c73c96a4a1 ("i686: Fix build with --disable-multiarch")
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
(cherry picked from commit 5a2cf833f5)
2024-05-04 13:29:48 +01:00
Gabi Falk
8323a83abd
i586: Fix multiple definitions of __memcpy_chk and __mempcpy_chk
/home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy_chk.o): in function `__memcpy_chk':
/home/bmg/src/glibc/debug/../sysdeps/i386/memcpy_chk.S:29: multiple definition of `__memcpy_chk';/home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy_chk.o): in function `__mempcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/mempcpy_chk.S:28: multiple definition of `__mempcpy_chk'; /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here

After this change, the static library built for i586, regardless of PIC
options, contains implementations of these functions respectively from
sysdeps/i386/memcpy_chk.S and sysdeps/i386/mempcpy_chk.S.  This ensures
that memcpy and mempcpy won't pull in __chk_fail and the routines it
calls.

Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
(cherry picked from commit 0fdf4ba48c)
2024-05-04 13:29:05 +01:00
Sam James
5141d4d83c
Revert "i586: Fix multiple definitions of __memcpy_chk and __mempcpy_chk"
This reverts commit 3148714ab6.

I had the wrong cherry-pick reference (the commit content is right; it's
just referring to a base that isn't upstream), but let's revert and reapply
for clarity.

Signed-off-by: Sam James <sam@gentoo.org>
2024-05-04 13:28:54 +01:00
Sam James
c16871e662
Revert "i686: Fix multiple definitions of __memmove_chk and __memset_chk"
This reverts commit ad92c483a4.

I had the wrong cherry-pick reference (the commit content is right; it's
just referring to a base that isn't upstream), but let's revert and reapply
for clarity.

Signed-off-by: Sam James <sam@gentoo.org>
2024-05-04 13:28:51 +01:00
Gabi Falk
ad92c483a4
i686: Fix multiple definitions of __memmove_chk and __memset_chk
Commit c73c96a4a1 updated memcpy.S and
mempcpy.S, but omitted memmove.S and memset.S.  As a result, the static
library built as PIC, whether with or without multiarch support,
contains two definitions for each of the __memmove_chk and __memset_chk
symbols.

/usr/lib/gcc/i686-pc-linux-gnu/14/../../../../i686-pc-linux-gnu/bin/ld: /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset-ia32.o): in function `__memset_chk':
/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/string/../sysdeps/i386/i686/memset.S:32: multiple definition of `__memset_chk'; /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset_chk.o):/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/debug/../sysdeps/i386/i686/multiarch/memset_chk.c:24: first defined here

After this change, regardless of PIC options, the static library, built
for i686 with multiarch contains implementations of these functions
respectively from debug/memmove_chk.c and debug/memset_chk.c, and
without multiarch contains implementations of these functions
respectively from sysdeps/i386/memmove_chk.S and
sysdeps/i386/memset_chk.S.  This ensures that memmove and memset won't
pull in __chk_fail and the routines it calls.

Reported-by: Sam James <sam@gentoo.org>
Tested-by: Sam James <sam@gentoo.org>
Fixes: c73c96a4a1 ("i686: Fix build with --disable-multiarch")
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
(cherry picked from commit 5a2cf833f5)
2024-05-04 13:23:24 +01:00
Gabi Falk
3148714ab6
i586: Fix multiple definitions of __memcpy_chk and __mempcpy_chk
/home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy_chk.o): in function `__memcpy_chk':
/home/bmg/src/glibc/debug/../sysdeps/i386/memcpy_chk.S:29: multiple definition of `__memcpy_chk';/home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy_chk.o): in function `__mempcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/mempcpy_chk.S:28: multiple definition of `__mempcpy_chk'; /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here

After this change, the static library built for i586, regardless of PIC
options, contains implementations of these functions respectively from
sysdeps/i386/memcpy_chk.S and sysdeps/i386/mempcpy_chk.S.  This ensures
that memcpy and mempcpy won't pull in __chk_fail and the routines it
calls.

Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
(cherry picked from commit 789894a2f554d4503ecb2f13b2b4e93e43414f33)
2024-05-04 13:23:15 +01:00
Carlos O'Donell
273a835fe7 time: Allow later version licensing.
The FSF's Licensing and Compliance Lab noted a discrepancy in the
licensing of several files in the glibc package.

When timespect_get.c was impelemented the license did not include
the standard ", or (at your option) any later version." text.

Change the license in timespec_get.c and all copied files to match
the expected license.

This change was previously approved in principle by the FSF in
RT ticket #1316403. And a similar instance was fixed in
commit 46703efa02.

(cherry picked from commit 91695ee459)
2024-05-03 10:15:11 +02:00
Florian Weimer
836d43b989 login: structs utmp, utmpx, lastlog _TIME_BITS independence (bug 30701)
These structs describe file formats under /var/log, and should not
depend on the definition of _TIME_BITS.  This is achieved by
defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that
support 32-bit time_t values (where __time_t is 32 bits).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
(cherry picked from commit 9abdae94c7)
2024-05-02 13:20:27 +02:00
Florian Weimer
9831f98c26 login: Check default sizes of structs utmp, utmpx, lastlog
The default <utmp-size.h> is for ports with a 64-bit time_t.
Ports with a 32-bit time_t or with __WORDSIZE_TIME64_COMPAT32=1
need to override it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
(cherry picked from commit 4d4da5aab9)
2024-05-02 13:20:27 +02:00
H.J. Lu
2f8f157eb0 x86: Define MINIMUM_X86_ISA_LEVEL in config.h [BZ #31676]
Define MINIMUM_X86_ISA_LEVEL at configure time to avoid

/usr/bin/ld: …/build/elf/librtld.os: in function `init_cpu_features':
…/git/elf/../sysdeps/x86/cpu-features.c:1202: undefined reference to `_dl_runtime_resolve_fxsave'
/usr/bin/ld: …/build/elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

when glibc is built with -march=x86-64-v3 and configured with
--with-rtld-early-cflags=-march=x86-64, which is used to allow ld.so to
print an error message on unsupported CPUs:

Fatal glibc error: CPU does not support x86-64-v3

This fixes BZ #31676.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

(cherry picked from commit 46c9997413)
2024-04-25 13:16:51 +02:00
Florian Weimer
e701c7d761 i386: ulp update for SSE2 --disable-multi-arch configurations
(cherry picked from commit 3a3a449742)
2024-04-25 12:58:21 +02:00
Florian Weimer
e828914cf9 nptl: Fix tst-cancel30 on kernels without ppoll_time64 support
Fall back to ppoll if ppoll_time64 fails with ENOSYS.
Fixes commit 370da8a121 ("nptl: Fix
tst-cancel30 on sparc64").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit f4724843ad)
2024-04-25 12:57:32 +02:00
Sunil K Pandey
423099a032 x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch
When glibc is built with ISA level 3 or higher by default, the resulting
glibc binaries won't run on SSE or FMA4 processors.  Exclude SSE, AVX and
FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by
default.

When glibc is built with ISA level 2 enabled by default, only keep SSE4.1
variant.

Fixes BZ 31335.

NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind
doesn't support AVX512 instructions:

https://bugs.kde.org/show_bug.cgi?id=383010

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 9f78a7c1d0)
2024-04-14 05:41:02 -07:00
H.J. Lu
04df8652eb Apply the Makefile sorting fix
Apply the Makefile sorting fix generated by sort-makefile-lines.py.

(cherry picked from commit ef7f4b1fef)
2024-04-14 05:41:02 -07:00
Florian Weimer
edb9a76e30 powerpc: Fix ld.so address determination for PCREL mode (bug 31640)
This seems to have stopped working with some GCC 14 versions,
which clobber r2.  With other compilers, the kernel-provided
r2 value is still available at this point.

Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
(cherry picked from commit 14e56bd4ce)
2024-04-14 08:25:43 +02:00
Sunil K Pandey
7b92f46f04 x86-64: Simplify minimum ISA check ifdef conditional with if
Replace minimum ISA check ifdef conditional with if.  Since
MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants,
compiler will perform constant folding optimization, getting same
results.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit b6e3898194)
2024-04-13 20:22:28 +02:00
H.J. Lu
9883f4304c x86-64: Don't use SSE resolvers for ISA level 3 or above
When glibc is built with ISA level 3 or above enabled, SSE resolvers
aren't available and glibc fails to build:

ld: .../elf/librtld.os: in function `init_cpu_features':
.../elf/../sysdeps/x86/cpu-features.c:1200:(.text+0x1445f): undefined reference to `_dl_runtime_resolve_fxsave'
ld: .../elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object
/usr/local/bin/ld: final link failed: bad value

For ISA level 3 or above, don't use _dl_runtime_resolve_fxsave nor
_dl_tlsdesc_dynamic_fxsave.

This fixes BZ #31429.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>

(cherry picked from commit befe2d3c4d)
2024-04-13 17:55:54 +02:00
Wilco Dijkstra
9d92452c70 AArch64: Check kernel version for SVE ifuncs
Old Linux kernels disable SVE after every system call.  Calling the
SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
As a result, applications with a high use of syscalls may run slower with
the SVE memcpy.  This is true for kernels between 4.15.0 and before 6.2.0,
except for 5.14.0 which was patched.  Avoid this by checking the kernel
version and selecting the SVE ifunc on modern kernels.

Parse the kernel version reported by uname() into a 24-bit kernel.major.minor
value without calling any library functions.  If uname() is not supported or
if the version format is not recognized, assume the kernel is modern.

Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 2e94e2f5d2)
2024-04-09 19:58:25 +01:00
Szabolcs Nagy
395a89f61e aarch64: fix check for SVE support in assembler
Due to GCC bug 110901 -mcpu can override -march setting when compiling
asm code and thus a compiler targetting a specific cpu can fail the
configure check even when binutils gas supports SVE.

The workaround is that explicit .arch directive overrides both -mcpu
and -march, and since that's what the actual SVE memcpy uses the
configure check should use that too even if the GCC issue is fixed
independently.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 73c26018ed)
2024-04-09 19:58:15 +01:00
Joe Ramsay
b0e0a07018 aarch64/fpu: Sync libmvec routines from 2.39 and before with AOR
This includes a fix for big-endian in AdvSIMD log, some cosmetic
changes, and numerous small optimisations mainly around inlining and
using indexed variants of MLA intrinsics.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

(cherry picked from commit e302e10213)
2024-04-09 19:58:04 +01:00
Florian Weimer
31c7d69af5
i386: Use generic memrchr in libc (bug 31316)
Before this change, we incorrectly used the SSE2 variant in the
implementation, without checking that the system actually supports
SSE2.

Tested-by: Sam James <sam@gentoo.org>
(cherry picked from commit 0d9166c224)
2024-04-07 00:41:35 +02:00
Adhemerval Zanella
5d070d12b3 x86: Expand the comment on when REP STOSB is used on memset
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 491e55beab)
2024-04-04 12:12:19 +02:00
Adhemerval Zanella
6484a92698 x86: Do not prefer ERMS for memset on Zen3+
For AMD Zen3+ architecture, the performance of the vectorized loop is
slightly better than ERMS.

Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

(cherry picked from commit 272708884c)
2024-04-04 12:12:11 +02:00
Adhemerval Zanella
aa4249266e x86: Fix Zen3/Zen4 ERMS selection (BZ 30994)
The REP MOVSB usage on memcpy/memmove does not show much performance
improvement on Zen3/Zen4 cores compared to the vectorized loops.  Also,
as from BZ 30994, if the source is aligned and the destination is not
the performance can be 20x slower.

The performance difference is noticeable with small buffer sizes, closer
to the lower bounds limits when memcpy/memmove starts to use ERMS.  The
performance of REP MOVSB is similar to vectorized instruction on the
size limit (the L2 cache).  Also, there is no drawback to multiple cores
sharing the cache.

Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

(cherry picked from commit 0c0d39fe4a)
2024-04-04 12:11:58 +02:00
Andreas Schwab
5a461f2949 Add tst-gnu2-tls2mod1 to test-internal-extras
That allows sysdeps/x86_64/tst-gnu2-tls2mod1.S to use internal headers.

Fixes: 717ebfa85c ("x86-64: Allocate state buffer space for RDI, RSI and RBX")
(cherry picked from commit fd7ee2e6c5)
2024-04-02 06:22:06 -07:00
Adhemerval Zanella
aded2fc004 elf: Enable TLS descriptor tests on aarch64
The aarch64 uses 'trad' for traditional tls and 'desc' for tls
descriptors, but unlike other targets it defaults to 'desc'.  The
gnutls2 configure check does not set aarch64 as an ABI that uses
TLS descriptors, which then disable somes stests.

Also rename the internal machinery fron gnu2 to tls descriptors.

Checked on aarch64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

(cherry picked from commit 3d53d18fc7)
2024-04-01 11:16:32 -07:00
Adhemerval Zanella
a8ba52bde5 arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372)
ARM _dl_tlsdesc_dynamic slow path has two issues:

  * The ip/r12 is defined by AAPCS as a scratch register, and gcc is
    used to save the stack pointer before on some function calls.  So it
    should also be saved/restored as well.  It fixes the tst-gnu2-tls2.

  * None of the possible VFP registers are saved/restored.  ARM has the
    additional complexity to have different VFP bank sizes (depending of
    VFP support by the chip).

The tst-gnu2-tls2 test is extended to check for VFP registers, although
only for hardfp builds.  Different than setcontext, _dl_tlsdesc_dynamic
does not have  HWCAP_ARM_IWMMXT (I don't have a way to properly test
it and it is almost a decade since newer hardware was released).

With this patch there is no need to mark tst-gnu2-tls2 as XFAIL.

Checked on arm-linux-gnueabihf.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

(cherry picked from commit 64c7e34428)
2024-04-01 11:16:27 -07:00
H.J. Lu
354cabcb26 x86-64: Allocate state buffer space for RDI, RSI and RBX
_dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack.
After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11.  Define
TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX
to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to
STATE_SAVE_OFFSET(%rsp).

   +==================+<- stack frame start aligned at 8 or 16 bytes
   |                  |<- RDI saved in the red zone
   |                  |<- RSI saved in the red zone
   |                  |<- RBX saved in the red zone
   |                  |<- paddings for stack realignment of 64 bytes
   |------------------|<- xsave buffer end aligned at 64 bytes
   |                  |<-
   |                  |<-
   |                  |<-
   |------------------|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp)
   |                  |<- 8-byte padding for 64-byte alignment
   |                  |<- 8-byte padding for 64-byte alignment
   |                  |<- R11
   |                  |<- R10
   |                  |<- R9
   |                  |<- R8
   |                  |<- RDX
   |                  |<- RCX
   +==================+<- RSP aligned at 64 bytes

Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size
for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI
and RBX are saved onto stack without adjusting stack pointer first, using
the red-zone.  This fixes BZ #31501.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

(cherry picked from commit 717ebfa85c)
2024-04-01 11:16:14 -07:00
H.J. Lu
853e915fdd x86-64: Update _dl_tlsdesc_dynamic to preserve AMX registers
_dl_tlsdesc_dynamic should also preserve AMX registers which are
caller-saved.  Add X86_XSTATE_TILECFG_ID and X86_XSTATE_TILEDATA_ID
to x86-64 TLSDESC_CALL_STATE_SAVE_MASK.  Compute the AMX state size
and save it in xsave_state_full_size which is only used by
_dl_tlsdesc_dynamic_xsave and _dl_tlsdesc_dynamic_xsavec.  This fixes
the AMX part of BZ #31372.  Tested on AMX processor.

AMX test is enabled only for compilers with the fix for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

GCC 14 and GCC 11/12/13 branches have the bug fix.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

(cherry picked from commit 9b7091415a)
2024-04-01 10:55:17 -07:00
H.J. Lu
a364304718 x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers
Compiler generates the following instruction sequence for GNU2 dynamic
TLS access:

	leaq	tls_var@TLSDESC(%rip), %rax
	call	*tls_var@TLSCALL(%rax)

or

	leal	tls_var@TLSDESC(%ebx), %eax
	call	*tls_var@TLSCALL(%eax)

CALL instruction is transparent to compiler which assumes all registers,
except for EFLAGS and RAX/EAX, are unchanged after CALL.  When
_dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow
path.  __tls_get_addr is a normal function which doesn't preserve any
caller-saved registers.  _dl_tlsdesc_dynamic saved and restored integer
caller-saved registers, but didn't preserve any other caller-saved
registers.  Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE,
XSAVE and XSAVEC to save and restore all caller-saved registers.  This
fixes BZ #31372.

Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic)
to optimize elf_machine_runtime_setup.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>

(cherry picked from commit 0aac205a81)
2024-04-01 10:42:25 -07:00
H.J. Lu
7fc8242bf8 x86-64: Save APX registers in ld.so trampoline
Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
ld.so trampoline.  This fixes BZ #31371.

Also update STATE_SAVE_OFFSET and STATE_SAVE_MASK for i386 which will
be used by i386 _dl_tlsdesc_dynamic.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>

(cherry picked from commit dfb05f8e70)
2024-04-01 10:38:15 -07:00
caiyinyu
983f34a125 LoongArch: Correct {__ieee754, _}_scalb -> {__ieee754, _}_scalbf 2024-03-22 09:29:44 +08:00
Amrita H S
aad45c8ac3 powerpc: Placeholder and infrastructure/build support to add Power11 related changes.
The following three changes have been added to provide initial Power11 support.
    1. Add the directories to hold Power11 files.
    2. Add support to select Power11 libraries based on AT_PLATFORM.
    3. Let submachine=power11 be set automatically.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
(cherry picked from commit 1ea0511456)
2024-03-20 19:43:40 -05:00
Manjunath Matti
ee7f4c54e1 powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture.
This patch adds a new feature for powerpc.  In order to get faster
access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for
implementing __builtin_cpu_supports() in GCC) without the overhead of
reading them from the auxiliary vector, we now reserve space for them
in the TCB.

Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
(cherry picked from commit 3ab9b88e2a)
2024-03-20 18:09:32 -05:00
Florian Weimer
71fcdba577 linux: Use rseq area unconditionally in sched_getcpu (bug 31479)
Originally, nptl/descr.h included <sys/rseq.h>, but we removed that
in commit 2c6b4b272e ("nptl:
Unconditionally use a 32-byte rseq area").  After that, it was
not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c
compilation that provided a definition.  This commit always checks
the rseq area for CPU number information before using the other
approaches.

This adds an unnecessary (but well-predictable) branch on
architectures which do not define RSEQ_SIG, but its cost is small
compared to the system call.  Most architectures that have vDSO
acceleration for getcpu also have rseq support.

Fixes: 2c6b4b272e
Fixes: 1d350aa060
Reviewed-by: Arjun Shankar <arjun@redhat.com>
(cherry picked from commit 7a76f21867)
2024-03-18 11:28:19 +01:00
Stefan Liebler
e0910f1d32 S390: Do not clobber r7 in clone [BZ #31402]
Starting with commit e57d8fc97b
"S390: Always use svc 0"
clone clobbers the call-saved register r7 in error case:
function or stack is NULL.

This patch restores the saved registers also in the error case.
Furthermore the existing test misc/tst-clone is extended to check
all error cases and that clone does not clobber registers in this
error case.

(cherry picked from commit 02782fd128)
2024-02-27 11:08:08 +01:00
Xi Ruoyao
d0724994de
math: Update mips64 ulps
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
(cherry picked from commit e2a65ecc4b)
2024-02-22 21:29:26 +01:00
Adhemerval Zanella
312e159626 mips: FIx clone3 implementation (BZ 31325)
For o32 we need to setup a minimal stack frame to allow cprestore
on __thread_start_clone3 (which instruct the linker to save the
gp for PIC).  Also, there is no guarantee by kABI that $8 will be
preserved after syscall execution, so we need to save it on the
provided stack.

Checked on mipsel-linux-gnu.

Reported-by: Khem Raj <raj.khem@gmail.com>
Tested-by: Khem Raj <raj.khem@gmail.com>
(cherry picked from commit bbd248ac0d)
2024-02-12 11:45:12 -03:00
Adhemerval Zanella
63295e4fda arm: Remove wrong ldr from _dl_start_user (BZ 31339)
The commit 49d877a80b (arm: Remove
_dl_skip_args usage) removed the _SKIP_ARGS literal, which was
previously loader to r4 on loader _start.  However, the cleanup did not
remove the following 'ldr r4, [sl, r4]' on _dl_start_user, used to check
to skip the arguments after ld self-relocations.

In my testing, the kernel initially set r4 to 0, which makes the
ldr instruction just read the _GLOBAL_OFFSET_TABLE_.  However, since r4
is a callee-saved register; a different runtime might not zero
initialize it and thus trigger an invalid memory access.

Checked on arm-linux-gnu.

Reported-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 1e25112dc0)
2024-02-05 15:43:39 -03:00
Stefan Liebler
cc1b91eabd
S390: Fix building with --disable-mutli-arch [BZ #31196]
Starting with commits
- 7ea510127e
string: Add libc_hidden_proto for strchrnul
- 22999b2f0f
string: Add libc_hidden_proto for memrchr

building glibc on s390x with --disable-multi-arch fails if only
the C-variant of strchrnul / memrchr is used.  This is the case
if gcc uses -march < z13.

The build fails with:
../sysdeps/s390/strchrnul-c.c:28:49: error: ‘__strchrnul_c’ undeclared here (not in a function); did you mean ‘__strchrnul’?
   28 | __hidden_ver1 (__strchrnul_c, __GI___strchrnul, __strchrnul_c);

With --disable-multi-arch, __strchrnul_c is not available as string/strchrnul.c
is just included without defining STRCHRNUL and thus we also don't have to create
the internal hidden symbol.

Tested-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-01-30 22:28:51 +01:00
Andreas Schwab
6edaa12b41 riscv: add support for static PIE
In order to support static PIE the startup code must avoid relocations
before __libc_start_main is called.
2024-01-22 14:58:23 +01:00
Adhemerval Zanella
bcf2abd43b sh: Fix static build with --enable-fortify
For static the internal symbols should not be prepended with the
internal __GI_.

Checked with a make check for sh4-linux-gnu.
2024-01-22 10:04:53 -03:00
Adhemerval Zanella
926a4bdbb5 sparc: Fix sparc64 memmove length comparison (BZ 31266)
The small counts copy bytes comparsion should be unsigned (as the
memmove size argument).  It fixes string/tst-memmove-overflow on
sparcv9, where the input size triggers an invalid code path.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
2024-01-22 09:34:50 -03:00
Adhemerval Zanella
369efd8177 sparc64: Remove unwind information from signal return stubs [BZ#31244]
Similar to sparc32 fix, remove the unwind information on the signal
return stubs.  This fixes the regressions:

FAIL: nptl/tst-cancel24-static
FAIL: nptl/tst-cond8-static
FAIL: nptl/tst-mutex8-static
FAIL: nptl/tst-mutexpi8-static
FAIL: nptl/tst-mutexpi9

On sparc64-linux-gnu.
2024-01-22 09:34:50 -03:00
Adhemerval Zanella
dd57f5e7b6 sparc: Remove 64 bit check on sparc32 wordsize (BZ 27574)
The sparc32 is always 32 bits.

Checked on sparcv9-linux-gnu.
2024-01-22 09:34:50 -03:00
Daniel Cederman
87d921e270 sparc: Do not test preservation of NaN payloads for LEON
The FPU used by LEON does not preserve NaN payload. This change allows
the math/test-*-canonicalize tests to pass on LEON.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
45f7ea26c1 sparc: Force calculation that raises exception
Use the math_force_eval() macro to force the calculation to complete and
raise the exception.

With this change the math/test-fenv test pass.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
a8f7c77970 sparc: Fix llrint and llround missing exceptions on SPARC V8
Conversions from a float to a long long on SPARC v8 uses a libgcc function
that may not raise the correct exceptions on overflow. It also may raise
spurious "inexact" exceptions on non overflow cases. This patch fixes the
problem in the same way as for RV32.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00
Daniel Cederman
7bd06985c0 sparc: Remove unwind information from signal return stubs [BZ #31244]
The functions were previously written in C, but were not compiled
with unwind information. The ENTRY/END macros includes .cfi_startproc
and .cfi_endproc which adds unwind information. This caused the
tests cleanup-8 and cleanup-10 in the GCC testsuite to fail.
This patch adds a version of the ENTRY/END macros without the
CFI instructions that can be used instead.

sigaction registers a restorer address that is located two instructions
before the stub function. This patch adds a two instruction padding to
avoid that the unwinder accesses the unwind information from the function
that the linker has placed right before it in memory. This fixes an issue
with pthread_cancel that caused tst-mutex8-static (and other tests) to fail.

Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-18 08:27:44 -03:00