Commit Graph

16171 Commits

Author SHA1 Message Date
Joe Ramsay
90a6ca8b28 aarch64: Fix AdvSIMD libmvec routines for big-endian
Previously many routines used * to load from vector types stored
in the data table. This is emitted as ldr, which byte-swaps the
entire vector register, and causes bugs for big-endian when not
all lanes contain the same value. When a vector is to be used
this way, it has been replaced with an array and the load with an
explicit ld1 intrinsic, which byte-swaps only within lanes.

As well, many routines previously used non-standard GCC syntax
for vector operations such as indexing into vectors types with []
and assembling vectors using {}. This syntax should not be mixed
with ACLE, as the former does not respect endianness whereas the
latter does. Such examples have been replaced with, for instance,
vcombine_* and vgetq_lane* intrinsics. Helpers which only use the
GCC syntax, such as the v_call helpers, do not need changing as
they do not use intrinsics.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-14 13:10:33 +01:00
Adhemerval Zanella
ae515ba530 powerpc: Fix __fesetround_inline_nocheck on POWER9+ (BZ 31682)
The e68b1151f7 commit changed the
__fesetround_inline_nocheck implementation to use mffscrni
(through __fe_mffscrn) instead of mtfsfi.  For generic powerpc
ceil/floor/trunc, the function is supposed to disable the
floating-point inexact exception enable bit, however mffscrni
does not change any exception enable bits.

This patch fixes by reverting the optimization for the
__fesetround_inline_nocheck.

Checked on powerpc-linux-gnu.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
2024-05-09 08:59:30 -03:00
Gabi Falk
dd5f891c1a x86_64: Fix missing wcsncat function definition without multiarch (x86-64-v4)
This code expects the WCSCAT preprocessor macro to be predefined in case
the evex implementation of the function should be defined with a name
different from __wcsncat_evex.  However, when glibc is built for
x86-64-v4 without multiarch support, sysdeps/x86_64/wcsncat.S defines
WCSNCAT variable instead of WCSCAT to build it as wcsncat.  Rename the
variable to WCSNCAT, as it is actually a better naming choice for the
variable in this case.

Reported-by: Kenton Groombridge
Link: https://bugs.gentoo.org/921945
Fixes: 64b8b6516b ("x86: Add evex optimized functions for the wchar_t strcpy family")
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2024-05-08 07:37:59 -07:00
Adhemerval Zanella
1e1ad714ee support: Add envp argument to support_capture_subprogram
So tests can specify a list of environment variables.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2024-05-07 12:16:36 -03:00
Adhemerval Zanella
bcae44ea85 elf: Only process multiple tunable once (BZ 31686)
The 680c597e9c commit made loader reject ill-formatted strings by
first tracking all set tunables and then applying them. However, it does
not take into consideration if the same tunable is set multiple times,
where parse_tunables_string appends the found tunable without checking
if it was already in the list. It leads to a stack-based buffer overflow
if the tunable is specified more than the total number of tunables.  For
instance:

  GLIBC_TUNABLES=glibc.malloc.check=2:... (repeat over the number of
  total support for different tunable).

Instead, use the index of the tunable list to get the expected tunable
entry.  Since now the initial list is zero-initialized, the compiler
might emit an extra memset and this requires some minor adjustment
on some ports.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Reported-by: Yuto Maeda <maeda@cyberdefense.jp>
Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2024-05-07 12:16:36 -03:00
H.J. Lu
5f245f3bfb Add crt1-2.0.o for glibc 2.0 compatibility tests
Starting from glibc 2.1, crt1.o contains _IO_stdin_used which is checked
by _IO_check_libio to provide binary compatibility for glibc 2.0.  Add
crt1-2.0.o for tests against glibc 2.0.  Define tests-2.0 for glibc 2.0
compatibility tests.  Add and update glibc 2.0 compatibility tests for
stderr, matherr and pthread_kill.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-05-06 07:49:40 -07:00
Amrita H S
23f0d81608 powerpc: Optimized strncmp for power10
This patch is based on __strcmp_power10.

Improvements from __strncmp_power9:

    1. Uses new POWER10 instructions
       - This code uses lxvp to decrease contention on load
	 by loading 32 bytes per instruction.

    2. Performance implication
       - This version has around 38% better performance on average.
       - Minor performance regression is seen for few small sizes
	 and specific combination of alignments.

Signed-off-by: Amrita H S <amritahs@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-05-06 09:01:29 -05:00
Stafford Horne
643d9d38d5 or1k: Add hard float support
This patch adds hardware floating point support to OpenRISC.  Hardware
floating point toolchain builds are enabled by passing the machine
specific argument -mhard-float to gcc via CFLAGS.  With this enabled GCC
generates floating point instructions for single-precision operations
and exports __or1k_hard_float__.

There are 2 main parts to this patch.

 - Implement fenv functions to update the FPCSR flags keeping it in sync
   with sfp (software floating point).
 - Update machine context functions to store and restore the FPCSR
   state.

*On mcontext_t ABI*

This patch adds __fpcsr to mcontext_t.  This is an ABI change, but also
an ABI fix.  The Linux kernel has always defined padding in mcontext_t
that space was missing from the glibc ABI.  In Linux this unused space
has now been re-purposed for storing the FPCSR.  This patch brings
OpenRISC glibc in line with the Linux kernel and other libc
implementation (musl).

Compatibility getcontext, setcontext, etc symbols have been added to
allow for binaries expecting the old ABI to continue to work.

*Hard float ABI*

The calling conventions and types do not change with OpenRISC hard-float
so glibc hard-float builds continue to use dynamic linker
/lib/ld-linux-or1k.so.1.

*Testing*

I have tested this patch both with hard-float and soft-float builds and
the test results look fine to me.  Results are as follows:

Hard Float

    # failures
    FAIL: elf/tst-sprof-basic		(Haven't figured out yet, not related to hard-float)
    FAIL: gmon/tst-gmon-pie		(PIE bug in or1k toolchain)
    FAIL: gmon/tst-gmon-pie-gprof	(PIE bug in or1k toolchain)
    FAIL: iconvdata/iconv-test		(timeout, passed when run manually)
    FAIL: nptl/tst-cond24		(Timeout)
    FAIL: nptl/tst-mutex10		(Timeout)

    # summary
	  6 FAIL
       4289 PASS
	 86 UNSUPPORTED
	 16 XFAIL
	  2 XPASS

    # versions
    Toolchain: or1k-smhfpu-linux-gnu
    Compiler:  gcc version 14.0.1 20240324 (experimental) [master r14-9649-gbb04a11418f] (GCC)
    Binutils:  GNU assembler version 2.42.0 (or1k-smhfpu-linux-gnu) using BFD version (GNU Binutils) 2.42.0.20240324
    Linux:     Linux buildroot 6.9.0-rc1-00008-g4dc70e1aadfa #112 SMP Sat Apr 27 06:43:11 BST 2024 openrisc GNU/Linux
    Tester:    shorne
    Glibc:     2024-04-25 b62928f907 Florian Weimer   x86: In ld.so, diagnose missing APX support in APX-only builds  (origin/master, origin/HEAD)

Soft Float

    # failures
    FAIL: elf/tst-sprof-basic
    FAIL: gmon/tst-gmon-pie
    FAIL: gmon/tst-gmon-pie-gprof
    FAIL: nptl/tst-cond24
    FAIL: nptl/tst-mutex10

    # summary
	  5 FAIL
       4295 PASS
	 81 UNSUPPORTED
	 16 XFAIL
	  2 XPASS

    # versions
    Toolchain: or1k-smh-linux-gnu
    Compiler:  gcc version 14.0.1 20240324 (experimental) [master r14-9649-gbb04a11418f] (GCC)
    Binutils:  GNU assembler version 2.42.0 (or1k-smh-linux-gnu) using BFD version (GNU Binutils) 2.42.0.20240324
    Linux:     Linux buildroot 6.9.0-rc1-00008-g4dc70e1aadfa #112 SMP Sat Apr 27 06:43:11 BST 2024 openrisc GNU/Linux
    Tester:    shorne
    Glibc:     2024-04-25 b62928f907 Florian Weimer   x86: In ld.so, diagnose missing APX support in APX-only builds  (origin/master, origin/HEAD)

Documentation: https://raw.githubusercontent.com/openrisc/doc/master/openrisc-arch-1.4-rev0.pdf
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-05-03 18:28:18 +01:00
Stafford Horne
b57adfa49b or1k: Add hard float libm-test-ulps
This patch adds the ulps test file to prepare for the upcoming
hard float patch.  This is separated out to make the hard float patch
smaller.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-05-03 18:28:18 +01:00
Gabi Falk
5a2cf833f5
i686: Fix multiple definitions of __memmove_chk and __memset_chk
Commit c73c96a4a1 updated memcpy.S and
mempcpy.S, but omitted memmove.S and memset.S.  As a result, the static
library built as PIC, whether with or without multiarch support,
contains two definitions for each of the __memmove_chk and __memset_chk
symbols.

/usr/lib/gcc/i686-pc-linux-gnu/14/../../../../i686-pc-linux-gnu/bin/ld: /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset-ia32.o): in function `__memset_chk':
/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/string/../sysdeps/i386/i686/memset.S:32: multiple definition of `__memset_chk'; /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset_chk.o):/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/debug/../sysdeps/i386/i686/multiarch/memset_chk.c:24: first defined here

After this change, regardless of PIC options, the static library, built
for i686 with multiarch contains implementations of these functions
respectively from debug/memmove_chk.c and debug/memset_chk.c, and
without multiarch contains implementations of these functions
respectively from sysdeps/i386/memmove_chk.S and
sysdeps/i386/memset_chk.S.  This ensures that memmove and memset won't
pull in __chk_fail and the routines it calls.

Reported-by: Sam James <sam@gentoo.org>
Tested-by: Sam James <sam@gentoo.org>
Fixes: c73c96a4a1 ("i686: Fix build with --disable-multiarch")
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
2024-05-02 11:51:10 +01:00
Gabi Falk
0fdf4ba48c
i586: Fix multiple definitions of __memcpy_chk and __mempcpy_chk
/home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy_chk.o): in function `__memcpy_chk':
/home/bmg/src/glibc/debug/../sysdeps/i386/memcpy_chk.S:29: multiple definition of `__memcpy_chk';/home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy_chk.o): in function `__mempcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/mempcpy_chk.S:28: multiple definition of `__mempcpy_chk'; /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here

After this change, the static library built for i586, regardless of PIC
options, contains implementations of these functions respectively from
sysdeps/i386/memcpy_chk.S and sysdeps/i386/mempcpy_chk.S.  This ensures
that memcpy and mempcpy won't pull in __chk_fail and the routines it
calls.

Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
2024-05-02 11:50:21 +01:00
Carlos O'Donell
91695ee459 time: Allow later version licensing.
The FSF's Licensing and Compliance Lab noted a discrepancy in the
licensing of several files in the glibc package.

When timespect_get.c was impelemented the license did not include
the standard ", or (at your option) any later version." text.

Change the license in timespec_get.c and all copied files to match
the expected license.

This change was previously approved in principle by the FSF in
RT ticket #1316403. And a similar instance was fixed in
commit 46703efa02.
2024-05-01 09:03:26 -04:00
Wilco Dijkstra
6dae61567f AArch64: Remove unused defines of CPU names
Remove unused defines of CPU names in cpu-features.h.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-04-30 13:32:29 +01:00
Florian Weimer
b62928f907 x86: In ld.so, diagnose missing APX support in APX-only builds
At this point, this is mainly a tool for testing the early ld.so
CPU compatibility diagnostics: GCC uses the new instructions in most
functions, so it's easy to spot if some of the early code is not
built correctly.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-25 17:20:28 +02:00
Florian Weimer
3a3a449742 i386: ulp update for SSE2 --disable-multi-arch configurations 2024-04-25 12:56:48 +02:00
H.J. Lu
46c9997413 x86: Define MINIMUM_X86_ISA_LEVEL in config.h [BZ #31676]
Define MINIMUM_X86_ISA_LEVEL at configure time to avoid

/usr/bin/ld: …/build/elf/librtld.os: in function `init_cpu_features':
…/git/elf/../sysdeps/x86/cpu-features.c:1202: undefined reference to `_dl_runtime_resolve_fxsave'
/usr/bin/ld: …/build/elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

when glibc is built with -march=x86-64-v3 and configured with
--with-rtld-early-cflags=-march=x86-64, which is used to allow ld.so to
print an error message on unsupported CPUs:

Fatal glibc error: CPU does not support x86-64-v3

This fixes BZ #31676.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2024-04-24 04:50:56 -07:00
caiyinyu
095067efdf LoongArch: Add glibc.cpu.hwcap support.
The current IFUNC selection is always using the most recent
features which are available via AT_HWCAP.  But in
some scenarios it is useful to adjust this selection.

The environment variable:

GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,zzz,....

can be used to enable HWCAP feature yyy, disable HWCAP feature xxx,
where the feature name is case-sensitive and has to match the ones
used in sysdeps/loongarch/cpu-tunables.c.

Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
2024-04-24 18:22:38 +08:00
Florian Weimer
f4724843ad nptl: Fix tst-cancel30 on kernels without ppoll_time64 support
Fall back to ppoll if ppoll_time64 fails with ENOSYS.
Fixes commit 370da8a121 ("nptl: Fix
tst-cancel30 on sparc64").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-04-23 21:16:32 +02:00
Florian Weimer
5361ad3910 login: Use unsigned 32-bit types for seconds-since-epoch
These fields store timestamps when the system was running.  No Linux
systems existed before 1970, so these values are unused.  Switching
to unsigned types allows continued use of the existing struct layouts
beyond the year 2038.

The intent is to give distributions more time to switch to improved
interfaces that also avoid locking/data corruption issues.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-04-19 14:38:17 +02:00
Florian Weimer
9abdae94c7 login: structs utmp, utmpx, lastlog _TIME_BITS independence (bug 30701)
These structs describe file formats under /var/log, and should not
depend on the definition of _TIME_BITS.  This is achieved by
defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that
support 32-bit time_t values (where __time_t is 32 bits).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-04-19 14:38:17 +02:00
Florian Weimer
4d4da5aab9 login: Check default sizes of structs utmp, utmpx, lastlog
The default <utmp-size.h> is for ports with a 64-bit time_t.
Ports with a 32-bit time_t or with __WORDSIZE_TIME64_COMPAT32=1
need to override it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-04-19 14:38:17 +02:00
Florian Weimer
14e56bd4ce powerpc: Fix ld.so address determination for PCREL mode (bug 31640)
This seems to have stopped working with some GCC 14 versions,
which clobber r2.  With other compilers, the kernel-provided
r2 value is still available at this point.

Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-04-14 08:24:51 +02:00
Florian Weimer
aea52e3d2b Revert "x86_64: Suppress false positive valgrind error"
This reverts commit a1735e0aa8.

The test failure is a real valgrind bug that needs to be fixed before
valgrind is usable with a glibc that has been built with
CC="gcc -march=x86-64-v3".  The proposed valgrind patch teaches
valgrind to replace ld.so strcmp with an unoptimized scalar
implementation, thus avoiding any AVX2-related problems.

Valgrind bug: <https://bugs.kde.org/show_bug.cgi?id=485487>

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-13 17:42:13 +02:00
Adhemerval Zanella
686d542025 posix: Sync tempname with gnulib
The gnulib version contains an important change (9ce573cde), which
fixes some problems with multithreading, entropy loss, and ASLR leak
nfo.  It also fixes an issue where getrandom is not being used
on some new files generation (only for __GT_NOCREATE on first try).

The 044bf893ac removed __path_search, which is now moved to another
gnulib shared files (stdio-common/tmpdir.{c,h}).  Tthis patch
also fixes direxists to use __stat64_time64 instead of __xstat64,
and move the include of pathmax.h for !_LIBC (since it is not used
by glibc).  The license is also changed from GPL 3.0 to 2.1, with
permission from the authors (Bruno Haible and Paul Eggert).

The sync also removed the clock fallback, since clock_gettime
with CLOCK_REALTIME is expected to always succeed.

It syncs with gnulib commit 323834962817af7b115187e8c9a833437f8d20ec.

Checked on x86_64-linux-gnu.

Co-authored-by: Bruno Haible <bruno@clisp.org>
Co-authored-by: Paul Eggert <eggert@cs.ucla.edu>
Reviewed-by: Bruno Haible <bruno@clisp.org>
2024-04-10 14:53:39 -03:00
Florian Weimer
f8d8b1b1e6 aarch64: Enhanced CPU diagnostics for ld.so
This prints some information from struct cpu_features, and the midr_el1
and dczid_el0 system register contents on every CPU.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-08 16:48:55 +02:00
Florian Weimer
7a430f40c4 x86: Add generic CPUID data dumper to ld.so --list-diagnostics
This is surprisingly difficult to implement if the goal is to produce
reasonably sized output.  With the current approaches to output
compression (suppressing zeros and repeated results between CPUs,
folding ranges of identical subleaves, dealing with the %ecx
reflection issue), the output is less than 600 KiB even for systems
with 256 logical CPUs.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-08 16:48:55 +02:00
Florian Weimer
5653ccd847 elf: Add CPU iteration support for future use in ld.so diagnostics
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-08 16:48:55 +02:00
H.J. Lu
9e1f4aef86 x86-64: Exclude FMA4 IFUNC functions for -mapxf
When -mapxf is used to build glibc, the resulting glibc will never run
on FMA4 machines.  Exclude FMA4 IFUNC functions when -mapxf is used.
This requires GCC which defines __APX_F__ for -mapxf with commit:

1df56719bd8 x86: Define __APX_F__ for -mapxf

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2024-04-06 05:03:55 -07:00
Adhemerval Zanella
c27f8763cf Reinstate generic features-time64.h
The a4ed0471d7 removed the generic version which is included by
features.h and used by Hurd.

Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.
2024-04-05 09:02:36 -03:00
Adhemerval Zanella
460d9e2dfe Cleanup __tls_get_addr on alpha/microblaze localplt.data
They are not required.

Checked with a make check for both ABIs.
2024-04-04 17:20:33 -03:00
Adhemerval Zanella
95700e7998 arm: Remove ld.so __tls_get_addr plt usage
Use the hidden alias instead.

Checked on arm-linux-gnueabihf.
2024-04-04 17:03:32 -03:00
Adhemerval Zanella
50c2be2390 aarch64: Remove ld.so __tls_get_addr plt usage
Use the hidden alias instead.

Checked on aarch64-linux-gnu.
2024-04-04 17:02:32 -03:00
Adhemerval Zanella
44ccc2465c math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603)
The implementations of trunc functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled.  Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.

The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04 14:29:28 -03:00
Adhemerval Zanella
932544efa4 math: x86 floor traps when FE_INEXACT is enabled (BZ 31601)
The implementations of floor functions using x87 floating point (i386 and
86_64 long double only) traps when FE_INEXACT is enabled.  Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.

The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04 14:29:28 -03:00
Adhemerval Zanella
637bfc392f math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600)
The implementations of ceil functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled.  Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.

The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04 14:29:28 -03:00
Joe Ramsay
87cb1dfcd6 aarch64/fpu: Add vector variants of erfc
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:24 +01:00
Joe Ramsay
3d3a4fb8e4 aarch64/fpu: Add vector variants of tanh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:20 +01:00
Joe Ramsay
eedbbca0bf aarch64/fpu: Add vector variants of sinh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:16 +01:00
Joe Ramsay
8b67920528 aarch64/fpu: Add vector variants of atanh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:12 +01:00
Joe Ramsay
81406ea3c5 aarch64/fpu: Add vector variants of asinh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:02 +01:00
Joe Ramsay
b09fee1d21 aarch64/fpu: Add vector variants of acosh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:32:58 +01:00
Joe Ramsay
bdb5705b7b aarch64/fpu: Add vector variants of cosh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:32:52 +01:00
Joe Ramsay
cb5d84f1f8 aarch64/fpu: Add vector variants of erf
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:32:48 +01:00
Stafford Horne
3db9d208dd misc: Add support for Linux uio.h RWF_NOAPPEND flag
In Linux 6.9 a new flag is added to allow for Per-io operations to
disable append mode even if a file was opened with the flag O_APPEND.
This is done with the new RWF_NOAPPEND flag.

This caused two test failures as these tests expected the flag 0x00000020
to be unused.  Adding the flag definition now fixes these tests on Linux
6.9 (v6.9-rc1).

  FAIL: misc/tst-preadvwritev2
  FAIL: misc/tst-preadvwritev64v2

This patch adds the flag, adjusts the test and adds details to
documentation.

Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-04-04 09:41:27 +01:00
Adhemerval Zanella
4dcd674b66 powerpc: Add missing arch flags on rounding ifunc variants
The ifunc variants now uses the powerpc implementation which in turn
uses the compiler builtin.  Without the proper -mcpu switch the builtin
does not generate the expected optimization.

Checked on powerpc-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-04-02 15:49:31 -03:00
Adhemerval Zanella
a4ed0471d7 Always define __USE_TIME_BITS64 when 64 bit time_t is used
It was raised on libc-help [1] that some Linux kernel interfaces expect
the libc to define __USE_TIME_BITS64 to indicate the time_t size for the
kABI.  Different than defined by the initial y2038 design document [2],
the __USE_TIME_BITS64 is only defined for ABIs that support more than
one time_t size (by defining the _TIME_BITS for each module).

The 64 bit time_t redirects are now enabled using a different internal
define (__USE_TIME64_REDIRECTS). There is no expected change in semantic
or code generation.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
arm-linux-gnueabi

[1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html
[2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign

Reviewed-by: DJ Delorie <dj@redhat.com>
2024-04-02 15:28:36 -03:00
Adhemerval Zanella
721314c980 x86_64: Remove avx512 strstr implementation
As indicated in a recent thread, this it is a simple brute-force
algorithm that checks the whole needle at a matching character pair
(and does so 1 byte at a time after the first 64 bytes of a needle).
Also it never skips ahead and thus can match at every haystack
position after trying to match all of the needle, which generic
implementation avoids.

As indicated by Wilco, a 4x larger needle and 16x larger haystack gives
a clear 65x slowdown both basic_strstr and __strstr_avx512:

  "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512",
"__strstr_sse2_unaligned", "__strstr_generic"],

    {
     "len_haystack": 65536,
     "len_needle": 1024,
     "align_haystack": 0,
     "align_needle": 0,
     "fail": 1,
     "desc": "Difficult bruteforce needle",
     "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]
    },
    {
     "len_haystack": 1048576,
     "len_needle": 4096,
     "align_haystack": 0,
     "align_needle": 0,
     "fail": 1,
     "desc": "Difficult bruteforce needle",
     "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]
    }

PS: I don't have an AVX512 capable machine to verify this issues, but
    skimming through the code it does seems to follow what Wilco has
    described.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-03-27 13:48:16 -03:00
Adhemerval Zanella
2e53eb9234 signal: Avoid system signal disposition to interfere with tests
Both tst-sigset2 and tst-signal1 expectes that SIGINT disposition
is set to SIG_DFL.
2024-03-27 13:47:09 -03:00
Palmer Dabbelt
96d1b9ac23 RISC-V: Fix the static-PIE non-relocated object check
The value of l_scope is only valid post relocation, so this original
check was triggering undefined behavior.  Instead just directly check to
see if the object has been relocated, at which point using l_scope is
safe.

Reported-by: Andreas Schwab <schwab@suse.de>
Closes: BZ #31317
Fixes: e0590f41fe ("RISC-V: Enable static-pie.")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-25 15:17:13 +01:00
Sergey Bugaev
dc1a77269c htl: Implement some support for TLS_DTV_AT_TP
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-19-bugaevc@gmail.com>
2024-03-23 23:00:30 +01:00