Commit Graph

16220 Commits

Author SHA1 Message Date
Xi Ruoyao
97aa7b7346 LoongArch: Ensure sp 16-byte aligned for tlsdesc
"ADDI sp, sp, 24" and "ADDI sp, sp, SZFCSREG" (SZFCSREG = 4) are
misaligning the stack: the ABI mandates a 16-byte alignment.  Fix it
by changing the first one to "ADDI sp, sp, 32", and reuse the spare 4th
slot for saving fcsr.

Reported-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
2024-06-14 10:14:54 +08:00
H.J. Lu
29807a271e x86: Properly set x86 minimum ISA level [BZ #31883]
Properly set libc_cv_have_x86_isa_level in shell for MINIMUM_X86_ISA_LEVEL
defined as

(__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4)

Also set __X86_ISA_V2 to 1 for i386 if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
is defined.  There are no changes in config.h nor in config.make on x86-64.
On i386, -march=x86-64-v2 with GCC generates

 #define MINIMUM_X86_ISA_LEVEL 2

in config.h and

have-x86-isa-level = 2

in config.make.  This fixes BZ #31883.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-06-12 14:27:54 -07:00
Adhemerval Zanella
7edd3814b0 linux: Remove __stack_prot
The __stack_prot is used by Linux to make the stack executable if
a modules requires it.  It is also marked as RELRO, which requires
to change the segment permission to RW to update it.

Also, there is no need to keep track of the flags: either the stack
will have the default permission of the ABI or should be change to
PROT_READ | PROT_WRITE | PROT_EXEC.  The only additional flag,
PROT_GROWSDOWN or PROT_GROWSUP, is Linux only and can be deducted
from _STACK_GROWS_DOWN/_STACK_GROWS_UP.

Also, the check_consistency function was already removed some time
ago.

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-06-12 15:25:54 -03:00
H.J. Lu
09bc68b0ac x86: Properly set MINIMUM_X86_ISA_LEVEL for i386 [BZ #31867]
On i386, set the default minimum ISA level to 0, not 1 (baseline which
includes SSE2).  There are no changes in config.h nor in config.make on
x86-64.  This fixes BZ #31867.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Tested-by: Ian Jordan <immoloism@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-06-11 00:10:08 -07:00
Joe Damato
bef2a827a5 x86: Enable non-temporal memset tunable for AMD
In commit 46b5e98ef6 ("x86: Add seperate non-temporal tunable for
memset") a tunable threshold for enabling non-temporal memset was added,
but only for Intel hardware.

Since that commit, new benchmark results suggest that non-temporal
memset is beneficial on AMD, as well, so allow this tunable to be set
for AMD.

See:
https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing
which has been updated to include data using different stategies for
large memset on AMD Zen2, Zen3, and Zen4.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-06-10 16:18:18 -05:00
Samuel Thibault
74f9ee3b91 hurd: Fix lsetxattr return value
The manpage says that lsetxattr returns 0 on success, like setxattr.
2024-06-10 21:56:13 +02:00
Joe Damato
92c270d32c Linux: Add epoll ioctls
As of Linux kernel 6.9, some ioctls and a parameters structure have been
introduced which allow user programs to control whether a particular
epoll context will busy poll.

Update the headers to include these for the convenience of user apps.

The ioctls were added in Linux kernel 6.9 commit 18e2bf0edf4dd
("eventpoll: Add epoll ioctl for epoll_params") [1] to
include/uapi/linux/eventpoll.h.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/?h=v6.9&id=18e2bf0edf4dd

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-06-04 12:09:15 -05:00
Szabolcs Nagy
2a9943b4a0 math: Fix exp10 undefined left shift
Left shift of ki is undefined when ki<0, copy the logic from exp,
which uses unsigned arithmetics, to fix it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-06-04 15:33:26 +01:00
Joseph Myers
1d441791cb Add new AArch64 HWCAP2 definitions from Linux 6.9 to bits/hwcap.h
Linux 6.9 adds 15 new HWCAP2_* values for AArch64; add them to
bits/hwcap.h in glibc.

Tested with build-many-glibcs.py for aarch64-linux-gnu.
2024-06-04 12:25:05 +00:00
Noah Goldstein
46b5e98ef6 x86: Add seperate non-temporal tunable for memset
The tuning for non-temporal stores for memset vs memcpy is not always
the same. This includes both the exact value and whether non-temporal
stores are profitable at all for a given arch.

This patch add `x86_memset_non_temporal_threshold`. Currently we
disable non-temporal stores for non Intel vendors as the only
benchmarks showing its benefit have been on Intel hardware.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-30 12:36:09 -05:00
Noah Goldstein
5bf0ab8057 x86: Improve large memset perf with non-temporal stores [RHEL-29312]
Previously we use `rep stosb` for all medium/large memsets. This is
notably worse than non-temporal stores for large (above a
few MBs) memsets.
See:
https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing
For data using different stategies for large memset on ICX and SKX.

Using non-temporal stores can be up to 3x faster on ICX and 2x faster
on SKX. Historically, these numbers would not have been so good
because of the zero-over-zero writeback optimization that `rep stosb`
is able to do. But, the zero-over-zero writeback optimization has been
removed as a potential side-channel attack, so there is no longer any
good reason to only rely on `rep stosb` for large memsets. On the flip
size, non-temporal writes can avoid data in their RFO requests saving
memory bandwidth.

All of the other changes to the file are to re-organize the
code-blocks to maintain "good" alignment given the new code added in
the `L(stosb_local)` case.

The results from running the GLIBC memset benchmarks on TGL-client for
N=20 runs:

Geometric Mean across the suite New / Old EXEX256: 0.979
Geometric Mean across the suite New / Old EXEX512: 0.979
Geometric Mean across the suite New / Old AVX2   : 0.986
Geometric Mean across the suite New / Old SSE2   : 0.979

Most of the cases are essentially unchanged, this is mostly to show
that adding the non-temporal case didn't add any regressions to the
other cases.

The results on the memset-large benchmark suite on TGL-client for N=20
runs:

Geometric Mean across the suite New / Old EXEX256: 0.926
Geometric Mean across the suite New / Old EXEX512: 0.925
Geometric Mean across the suite New / Old AVX2   : 0.928
Geometric Mean across the suite New / Old SSE2   : 0.924

So roughly a 7.5% speedup. This is lower than what we see on servers
(likely because clients typically have faster single-core bandwidth so
saving bandwidth on RFOs is less impactful), but still advantageous.

Full test-suite passes on x86_64 w/ and w/o multiarch.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-30 12:36:09 -05:00
Xi Ruoyao
0c1d2c277a LoongArch: Use "$fcsr0" instead of "$r0" in _FPU_{GET,SET}CW
Clang inline-asm parser does not allow using "$r0" in
movfcsr2gr/movgr2fcsr, so everything using _FPU_{GET,SET}CW is now
failing to build with Clang on LoongArch.  As we now requires Binutils
>= 2.41 which supports using "$fcsr0" here, use it instead of "$r0" to
fix the issue.

Link: https://github.com/loongson-community/discussions/issues/53#issuecomment-2081507390
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4142b2368353
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
2024-05-28 09:17:05 +08:00
Xin Wang
e0f7f1808f x86_64: Reformat elf_machine_rela
A space is added before the left bracket of the x86_64 elf_machine_rela
function, in order to harmonize with the rest of the implementation of
the function and to make it easier to retrieve the function. The lines
where the function definition is located has been re-indented, as well
as its left curly bracket placed in the correct position.

Signed-off-by: Xin Wang <yw987194828@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-27 13:46:45 -07:00
Sunil K Pandey
1b713c9a53 i386: Disable Intel Xeon Phi tests for GCC 15 and above (BZ 31782)
This patch disables Intel Xeon Phi tests for GCC 15 and above.

GCC 15 removed Intel Xeon Phi ISA support.
commit e1a7e2c54d52d0ba374735e285b617af44841ace
Author: Haochen Jiang <haochen.jiang@intel.com>
Date:   Mon May 20 10:43:44 2024 +0800

    i386: Remove Xeon Phi ISA support

Fixes BZ 31782.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-27 12:28:13 -07:00
H.J. Lu
f981bf6b9d parse_fdinfo: Don't advance pointer twice [BZ #31798]
pidfd_getpid.c has

      /* Ignore invalid large values.  */
      if (INT_MULTIPLY_WRAPV (10, n, &n)
          || INT_ADD_WRAPV (n, *l++ - '0', &n))
        return -1;

For GCC older than GCC 7, INT_ADD_WRAPV(a, b, r) is defined as

   _GL_INT_OP_WRAPV (a, b, r, +, _GL_INT_ADD_RANGE_OVERFLOW)

and *l++ - '0' is evaluated twice.  Fix BZ #31798 by moving "l++" out of
the if statement.  Tested with GCC 6.4 and GCC 14.1.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-05-27 06:52:45 -07:00
H.J. Lu
23c60af6dc sysdeps/ieee754/ldbl-opt/Makefile: Split and sort libnldbl-calls
Put each item on a separate line and sort libnldbl-calls.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-24 10:25:40 -07:00
H.J. Lu
639c143db3 sysdeps/ieee754/ldbl-opt/Makefile: Remove test-nldbl-redirect-static
Remove $(objpfx)test-nldbl-redirect-static checked in by accident.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-24 06:36:18 -07:00
H.J. Lu
acfb169b3c sysdeps/ieee754/ldbl-opt/Makefile: Split and sort tests
Put each test on a separate line and sort tests.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-24 06:31:49 -07:00
Stefan Liebler
4af49c60a1 s390x: Regenerate ULPs.
Needed due to:
"Implement C23 log2p1"
commit ID 79c52daf47
2024-05-24 09:53:49 +02:00
Joseph Myers
84d2762922 Update kernel version to 6.9 in header constant tests
This patch updates the kernel version in the tests tst-mman-consts.py
and tst-mount-consts.py to 6.9.  (There are no new constants covered
by these tests in 6.9 that need any other header changes;
tst-pidfd-consts.py was updated separately along with adding new
constants relevant to that test.)

Tested with build-many-glibcs.py.
2024-05-23 14:04:48 +00:00
Adhemerval Zanella
eaa8113bf0 math: Provide missing math symbols on libc.a (BZ 31781)
The libc.a for alpha, s390, and sparcv9 does not provide
copysignf64x, copysignf128, frexpf64x, frexpf128, modff64x, and
modff128.

Checked with a static build for the affected ABIs.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Adhemerval Zanella
1664bbf238 s390: Make utmp32, utmpx32, and login32 shared only (BZ 31790)
The function that work with 'struct utmp32' and 'struct utmpx32'
are only for compat symbols.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Adhemerval Zanella
18dbe27847 microblaze: Remove cacheflush from libc.a (BZ 31788)
microblaze does not export it in libc.so nor the kernel provides the
cacheflush syscall.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Adhemerval Zanella
d8ebde14fb powerpc: Remove duplicated llrintf and llrintf32 from libm.a (BZ 31787)
Both the generic and POWER6 versions provide definitions of the
symbol, which are already provided by the ifunc resolver.

Checked on powerpc-linux-gnu-power4.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Adhemerval Zanella
5fededd825 powerpc: Remove duplicate strchrnul and strncasecmp_l libc.a (BZ 31786)
For powerpc64 the generic version provides a weak definition of
strchrnul, which are already provided by the ifunc resolver.  The
powerpc32 version is slight different, where for static case there
is no iFUNC support.

The strncasecmp_l is provided ifunc resolver.

Checked on powerpc-linux-gnu-power4 and powerpc64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Adhemerval Zanella
62eaa46739 loongarch: Remove duplicate strnlen in libc.a (BZ 31785)
The generic version provides weak definitions of strnlen,
which are already provided by the ifunc resolver.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Adhemerval Zanella
ef9596352b aarch64: Remove duplicate memchr/strlen in libc.a (BZ 31777)
The generic version provides weak definitions of memchr/strlen,
which are already provided by the ifunc resolvers.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-23 09:36:08 -03:00
Joseph Myers
e9a37242f9 Update PIDFD_* constants for Linux 6.9
Linux 6.9 adds some more PIDFD_* constants.  Add them to glibc's
sys/pidfd.h, including updating comments that said FLAGS was reserved
and must be 0, along with updating tst-pidfd-consts.py.

Tested with build-many-glibcs.py.
2024-05-23 12:22:40 +00:00
H.J. Lu
43d41ae6d7 Don't provide XXXf128_do_not_use aliases [BZ #31757]
Don't provide __nexttowardf128_do_not_use, nexttowardf128_do_not_use,
finitef128_do_not_use, isinff128_do_not_use and isnanf128_do_not_use.
This fixes BZ #31757.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-05-22 06:12:17 -07:00
Adhemerval Zanella
5d4999e519 math: Fix isnanf128 static build (BZ 31774)
Some static implementation of float128 routines might call __isnanf128,
which is not provided by the static object.

Checked on x86_64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-21 16:53:27 -03:00
Adhemerval Zanella
1f09aae36a math: Fix i386 and m68k exp10 on static build (BZ 31775)
The commit 08ddd26814 removed the static exp10 on i386 and m68k with an
empty w_exp10.c (required for the ABIs that uses the newly
implementation).  This patch fixes by adding the required symbols on the
arch-specific w_exp{f}_compat.c implementation.

Checked on i686-linux-gnu and with a build for m68k-linux-gnu.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2024-05-21 13:44:22 -03:00
Adhemerval Zanella
0b716305df math: Fix i386 and m68k fmod/fmodf on static build (BZ 31488)
The commit 16439f419b removed the static fmod/fmodf on i386 and m68k
with and empty w_fmod.c (required for the ABIs that uses the newly
implementation).  This patch fixes by adding the required symbols on
the arch-specific w_fmod{f}_compat.c implementation.

To statically build fmod fails on some ABI (alpha, s390, sparc) because
it does not export the ldexpf128, this is also fixed by this patch.

Checked on i686-linux-gnu and with a build for m68k-linux-gnu.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
2024-05-21 13:43:39 -03:00
H.J. Lu
437c94e04b Remove the clone3 symbol from libc.a [BZ #31770]
clone3 isn't exported from glibc and is hidden in libc.so.  Fix BZ #31770
by removing clone3 alias.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-05-21 07:05:08 -07:00
Joe Ramsay
0fed0b250f aarch64/fpu: Add vector variants of pow
Plus a small amount of moving includes around in order to be able to
remove duplicate definition of asuint64.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-21 14:38:49 +01:00
caiyinyu
3c1e22372d LoongArch: Update ulps
For the log2p1 implementation.
2024-05-21 12:08:25 +08:00
mengqinggang
16d47c1594 LoongArch: Fix tst-gnu2-tls2 compiler error
Add -mno-lsx to tst-gnu2-tlsmod*.c if gcc support -mno-lsx.
Add escape character '\' in vector support test function.
2024-05-21 11:23:03 +08:00
H.J. Lu
8428278b5f i386: Don't define stpncpy alias when used in IFUNC [BZ #31768]
Fix BZ #31768 by not defining stpncpy alias when used in IFUNC.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2024-05-20 19:35:00 -07:00
Adhemerval Zanella
f83e461f10 powerpc: Update ulps
For the log2p1 implementation.
2024-05-20 13:12:23 -03:00
Adhemerval Zanella
32b2aa59da arm: Update ulps
For the log2p1 implementation.
2024-05-20 13:12:23 -03:00
Adhemerval Zanella
241338bd6f aarch64: Update ulps
For the log2p1 implementation.
2024-05-20 13:12:23 -03:00
Joseph Myers
79c52daf47 Implement C23 log2p1
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the log2p1 functions (log2(1+x): like log1p, but for
base-2 logarithms).

This illustrates the intended structure of implementations of all
these function families: define them initially with a type-generic
template implementation.  If someone wishes to add type-specific
implementations, it is likely such implementations can be both faster
and more accurate than the type-generic one and can then override it
for types for which they are implemented (adding benchmarks would be
desirable in such cases to demonstrate that a new implementation is
indeed faster).

The test inputs are copied from those for log1p.  Note that these
changes make gen-auto-libm-tests depend on MPFR 4.2 (or later).

The bulk of the changes are fairly generic for any such new function.
(sysdeps/powerpc/nofpu/Makefile only needs changing for those
type-generic templates that use fabs.)

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-05-20 13:41:39 +00:00
Joseph Myers
cf0ca8d52e Update syscall lists for Linux 6.9
Linux 6.9 has no new syscalls.  Update the version number in
syscall-names.list to reflect that it is still current for 6.9.

Tested with build-many-glibcs.py.
2024-05-20 13:10:31 +00:00
H.J. Lu
7935e7a537 Rename procutils_read_file to __libc_procutils_read_file [BZ #31755]
Fix BZ #31755 by renaming the internal function procutils_read_file to
__libc_procutils_read_file.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-05-20 05:22:43 -07:00
H.J. Lu
4e21cb95e2 nearbyint: Don't define alias when used in IFUNC [BZ #31759]
Fix BZ #31759 by not defining nearbyint aliases when used in IFUNC.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-05-20 05:21:41 -07:00
Florian Weimer
8d7b6b4cb2 socket: Use may_alias on sockaddr structs (bug 19622)
This supports common coding patterns.  The GCC C front end before
version 7 rejects the may_alias attribute on a struct definition
if it was not present in a previous forward declaration, so this
attribute can only be conditionally applied.

This implements the spirit of the change in Austin Group issue 1641.

Suggested-by: Marek Polacek <polacek@redhat.com>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-05-18 09:33:19 +02:00
Manjunath Matti
a81cdde1cb powerpc64: Fix by using the configure value $libc_cv_cc_submachine [BZ #31629]
This patch ensures that $libc_cv_cc_submachine, which is set from
"--with-cpu", overrides $CFLAGS for configure time tests.

Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-05-16 17:31:45 -05:00
Joe Ramsay
75207bde68 aarch64/fpu: Add vector variants of cbrt
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-16 14:35:06 +01:00
Joe Ramsay
157f89fa3d aarch64/fpu: Add vector variants of hypot
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-16 14:34:43 +01:00
mengqinggang
1dbf2bef79 LoongArch: Add support for TLS Descriptors
This is mostly based on AArch64 and RISC-V implementation.

Add R_LARCH_TLS_DESC32 and R_LARCH_TLS_DESC64 relocations.

For _dl_tlsdesc_dynamic function slow path, temporarily save and restore
all vector registers.
2024-05-15 10:31:53 +08:00
Joe Ramsay
90a6ca8b28 aarch64: Fix AdvSIMD libmvec routines for big-endian
Previously many routines used * to load from vector types stored
in the data table. This is emitted as ldr, which byte-swaps the
entire vector register, and causes bugs for big-endian when not
all lanes contain the same value. When a vector is to be used
this way, it has been replaced with an array and the load with an
explicit ld1 intrinsic, which byte-swaps only within lanes.

As well, many routines previously used non-standard GCC syntax
for vector operations such as indexing into vectors types with []
and assembling vectors using {}. This syntax should not be mixed
with ACLE, as the former does not respect endianness whereas the
latter does. Such examples have been replaced with, for instance,
vcombine_* and vgetq_lane* intrinsics. Helpers which only use the
GCC syntax, such as the v_call helpers, do not need changing as
they do not use intrinsics.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-14 13:10:33 +01:00