Commit Graph

41304 Commits

Author SHA1 Message Date
Adhemerval Zanella
6b7e2e1d61
linux: Also check pkey_get for ENOSYS on tst-pkey (BZ 31996)
The powerpc pkey_get/pkey_set support was only added for 64-bit [1],
and tst-pkey only checks if the support was present with pkey_alloc
(which does not fail on powerpc32, at least running a 64-bit kernel).

Checked on powerpc-linux-gnu.

[1] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a803367bab167f5ec4fde1f0d0ec447707c29520
Reviewed-By: Andreas K. Huettel <dilfridge@gentoo.org>
2024-07-19 22:39:44 +02:00
Adhemerval Zanella
e0f7da7235
powerpc: Update soft-fp ulps
Results based on regen-ulps using gcc 11.2.1 on a POWER8 machine.
2024-07-19 19:29:35 +02:00
John David Anglin
8cfa4ecff2 Fix usage of _STACK_GROWS_DOWN and _STACK_GROWS_UP defines [BZ 31989]
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Reviewed-By: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-19 10:10:17 -04:00
Florian Weimer
91eb62d638 Adjust check-local-headers test for libaudit 4.0
The new version introduces /usr/include/audit_logging.h and
/usr/include/audit-records.h.
2024-07-19 15:57:46 +02:00
Adhemerval Zanella
3c354d62f5 elf: Parse the auxv values as unsigned on tst-tunables-enable_secure-env.c (BZ 31890)
AT_HWCAP on some architecture can indeed use all bits.

Checked on x86_64-linux-gnu and powerpc-linux-gnu.
Reviewed-By: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-19 08:50:38 -03:00
H.J. Lu
66f2cd6e1a
x32: xfail elf/tst-platform-1 [BZ #22363]
Xfail elf/tst-platform-1 on x32 since kernel passes i686 in AT_PLATFORM.
See https://sourceware.org/bugzilla/show_bug.cgi?id=22363

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
2024-07-19 10:34:38 +02:00
Xi Ruoyao
d905183f0b elf/tst-rtld-does-not-exist: Pass --inhibit-cache to rtld
This avoids a test failure when the system has no /etc/ld.so.cache.

Tested on x86_64-linux-gnu.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-19 01:15:53 -07:00
Andreas K. Hüttel
910aae6e5a
Revert "LoongArch: Add cfi instructions for _dl_tlsdesc_dynamic"
We're in freeze for the 2.40 release.

This reverts commit 43224b1379.

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-17 15:24:51 +02:00
Samuel Thibault
6ed76f4efc htl: Fix __pthread_init_thread declaration and definition
0e75c4a463 ("hurd: Fix pthread_self() without libpthread") added a
declaration for ___pthread_init_thread instead of __pthread_init_thread,
and missed defining the external hidden symbol.
2024-07-17 15:04:25 +02:00
Samuel Thibault
0e75c4a463 hurd: Fix pthread_self() without libpthread
5476f8cd2e ("htl: move pthread_self info libc.") moved the htl
pthread_self() function from libpthread to libc, replacing the previous libc
stub that just returns 0. And 53da64d1cf ("htl: Initialize ___pthread_self
early") added initialization code which is needed before being able to
call pthread_self. It is currently in libpthread, and thus never called
before programs can call pthread_self from libc, which then segfaults
when accessing _pthread_self()->thread.

This moves the initialization to libc itself, as initialized variables, so
pthread_self can always be called fine.
2024-07-17 14:14:21 +02:00
mengqinggang
43224b1379 LoongArch: Add cfi instructions for _dl_tlsdesc_dynamic
In _dl_tlsdesc_dynamic, there are three 'addi.d sp, sp, -size'
instructions to allocate stack size for Float/LSX/LASX registers.
Every 'addi.d sp, sp, -size' needs a cfi_adjust_cfa_offset because
of sp is used to compute CFA. But only one 'addi.d sp, sp, -size'
will be run according to HWCAP value. And all cfi_adjust_cfa_offset
will be executed in stack unwinding, it result in incorrect CFA.

Change _dl_tlsdesc_dynamic to _dl_tlsdesc_dynamic,
_dl_tlsdesc_dynamic_lsx and _dl_tlsdesc_dynamic_lasx.
Conflicting cfi instructions can be distributed to the three functions.
And cfi instructions can correspond to stack down instructions.
2024-07-17 09:32:25 +08:00
Noah Goldstein
5bcf6265f2 x86: Disable non-temporal memset on Skylake Server
The original commit enabling non-temporal memset on Skylake Server had
erroneous benchmarks (actually done on ICX).

Further benchmarks indicate non-temporal stores may in fact by a
regression on Skylake Server.

This commit may be over-cautious in some cases, but should avoid any
regressions for 2.40.

Tested using qemu on all x86_64 cpu arch supported by both qemu +
GLIBC.

Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-07-16 17:20:18 +08:00
Flavio Cruz
2dcc908538 Add pthread_getname_np and pthread_setname_np for Hurd
We use thread_get_name and thread_set_name to get and set the thread
name, so nothing is stored in the thread structure since these functions
are supposed to be called sparingly.

One notable difference with Linux is that the thread name is up to 32
chars, whereas Linux's is 16.

Also added a mach_RPC_CHECK to check for the existing of gnumach RPCs.
2024-07-16 09:21:52 +02:00
Andreas K. Hüttel
a11e15ea0a
math: Update alpha ulps
Linux alphadev 6.9.8-gentoo-alpha #1 Sun Jul  7 00:45:49 EDT 2024 alpha EV68CB Titan GNU/Linux
gcc (Gentoo 14.1.1_p20240622 p2) 14.1.1 20240622
GNU ld (Gentoo 2.42 p6) 2.42.0

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-14 12:44:15 +02:00
Samuel Thibault
c8b4ce0b36 hurd: Fix restoring message to be retried
save_data stores the start of the original message to be retried,
overwritten by the EINTR reply. In 64b builds the overwrite is however
rounded up to the 64b pointer size, so we have to save more than just
the 32b err.

Thanks a lot to Luca Dariz for the investigation!
2024-07-13 17:05:13 +02:00
Maciej W. Rozycki
4b2a1b602f
nptl: Convert tst-sem11 and tst-sem12 tests to use the test driver
Fix an issue with commit 2af4e3e566 ("Test of semaphores.") by making
the tst-sem11 and tst-sem12 tests use the test driver, preventing them
from ever causing testing to hang forever and never complete, such as
currently happening with the 'mips-linux-gnu' (o32 ABI) target.  Adjust
the name of the PREPARE macro, which clashes with the interpretation of
its presence by the test driver, by using a TF_ prefix in reference to
the name of the 'tf' function.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-12 20:41:08 +02:00
Maciej W. Rozycki
9d8995833e
nptl: Add copyright notice tst-sem11 and tst-sem12 tests
Add a copyright notice to the tst-sem11 and tst-sem12 tests, observing
that they have been originally contributed back in 2007, with commit
2af4e3e566 ("Test of semaphores.").
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-12 20:40:36 +02:00
Andreas K. Hüttel
ef7005628f
tests: XFAIL audit tests failing on all mips configurations, bug 29404
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-12 18:49:42 +02:00
Samuel Dobron
255df9299f time/Makefile: Split and sort tests
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-12 17:33:28 +02:00
Stefan Liebler
9b76514103 s390x: Fix segfault in wcsncmp [BZ #31934]
The z13/vector-optimized wcsncmp implementation segfaults if n=1
and there is only one character (equal on both strings) before
the page end.  Then it loads and compares one character and misses
to check n again.  The following load fails.

This patch removes the extra load and compare of the first character
and just start with the loop which uses vector-load-to-block-boundary.
This code-path also checks n.

With this patch both tests are passing:
- the simplified one mentioned in the bugzilla 31934
- the full one in Florian Weimer's patch:
"manual: Document a GNU extension for strncmp/wcsncmp"
(https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-07-11 15:08:57 +02:00
Florian Weimer
2e456ccf0c Linux: Make __rseq_size useful for feature detection (bug 31965)
The __rseq_size value is now the active area of struct rseq
(so 20 initially), not the full struct size including padding
at the end (32 initially).

Update misc/tst-rseq to print some additional diagnostics.

Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2024-07-09 19:33:37 +02:00
Andreas K. Hüttel
7e7f35278c
po: incorporate translations (bg)
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-09 13:35:12 +02:00
DJ Delorie
6c0be74305
manual: add syscalls
The purpose of this patch is to add some system calls that (1) aren't
otherwise documented, and (2) are merely redirected to the kernel, so
can refer to their documentation; and define a standard way of doing
so in the future.  A more detailed explaination of how system calls
are wrapped is added along with reference to the Linux Man-Pages
project.

Default version of man-pages is in configure.ac but can be overridden
by --with-man-pages=X.Y

Reviewed-by: Alejandro Colomar <alx@kernel.org>
2024-07-09 11:54:29 +02:00
Andreas Schwab
2213b37b70 libio: handle opening a file when all files are closed (bug 31963)
_IO_list_all becomes NULL when all files (including standard files) are
closed.
2024-07-09 10:12:36 +02:00
Adam Sampson
895294e51d
ldconfig: Ignore all GDB extension files
ldconfig already ignores files with the -gdb.py suffix, but GDB also
looks for -gdb.gdb and -gdb.scm files. These aren't as widely used, but
libguile at least comes with a -gdb.scm file.

Rename is_gdb_python_file to is_gdb_extension_file, and make it
recognise all three types of GDB extension.

Signed-off-by: Adam Sampson <ats@offog.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-08 22:15:34 +02:00
Adam Sampson
ed2b8d3a86
ldconfig: Move endswithn into a new header file
is_gdb_python_file is doing a similar test, so it can use this helper
function as well.

Signed-off-by: Adam Sampson <ats@offog.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-08 22:14:22 +02:00
Andreas K. Hüttel
ab6045728f
math: Update m68k ULPs
This hasn't been looked at for a loong time (already guessing from
the number of missing entries), and it ain't pretty.
There are some 9-ulps results for float.

- ZaZaZebra (qemu-system-m68k clone of PowerBook 190 system)
- GCC 13.3.1 20240614 (Gentoo 13.3.1_p20240614 p17)
- ld GNU ld (Gentoo 2.42 p6) 2.42.0
- Linux ZaZaZebra  4.19.0-5-m68k #1 Gentoo 4.19.37-5 (2019-06-19) m68k 68040 68040 GNU/Linux
- manual build
- ../glibc/configure --enable-fortify-source --prefix=/usr
- Tested by Immolo (via Andreas K. Hüttel)

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-07-08 21:51:03 +02:00
Adhemerval Zanella
184b9e530e stdlib: fix arc4random fallback to /dev/urandom (BZ 31612)
The __getrandom_nocancel used by __arc4random_buf uses
INLINE_SYSCALL_CALL (which returns -1/errno) and the loop checks for
the return value instead of errno to fallback to /dev/urandom.

The malloc code now uses __getrandom_nocancel_nostatus, which uses
INTERNAL_SYSCALL_CALL, so there is no need to use the variant that does
not set errno (BZ#29624).

Checked on x86_64-linux-gnu.

Reviewed-by: Xi Ruoyao <xry111@xry111.site>
2024-07-08 10:05:10 -03:00
Adhemerval Zanella
9fc639f654 elf: Make dl-rseq-symbols Linux only
And avoid a Hurd build failures.

Checked on x86_64-linux-gnu.
2024-07-04 10:09:07 -03:00
Michael Jeanson
2b92982e23 nptl: fix potential merge of __rseq_* relro symbols
While working on a patch to add support for the extensible rseq ABI, we
came across an issue where a new 'const' variable would be merged with
the existing '__rseq_size' variable. We tracked this to the use of
'-fmerge-all-constants' which allows the compiler to merge identical
constant variables. This means that all 'const' variables in a compile
unit that are of the same size and are initialized to the same value can
be merged.

In this specific case, on 32 bit systems 'unsigned int' and 'ptrdiff_t'
are both 4 bytes and initialized to 0 which should trigger the merge.
However for reasons we haven't delved into when the attribute 'section
(".data.rel.ro")' is added to the mix, only variables of the same exact
types are merged. As far as we know this behavior is not specified
anywhere and could change with a new compiler version, hence this patch.

Move the definitions of these variables into an assembler file and add
hidden writable aliases for internal use. This has the added bonus of
removing the asm workaround to set the values on rseq registration.

Tested on Debian 12 with GCC 12.2.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-07-03 21:40:30 +02:00
Darius Rad
b85a23d736
riscv: Update nofpu libm test ulps
Fixes 32 test failures.
2024-07-03 21:05:34 +02:00
Florian Weimer
7dde7f82d9 manual: Recommendations for dynamic linker hardening
This new section in the manual provides recommendations for
use of glibc in environments with higher integrity requirements.
It's reflecting both current implementation shortcomings, and
challenges we inherit from ELF and psABI requirements.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-07-03 19:26:14 +02:00
Sergey Kolosov
50f5a09e68 socket: Add new test for shutdown
This commit adds shutdown test with SHUT_RD, SHUT_WR, SHUT_RDWR for an
UNIX socket connection.

Reviewed-by: DJ Delorie <dj@redhat.com>
2024-07-03 12:55:44 -04:00
Stefan Liebler
d2f6ceaccb elf/rtld: Fix auxiliary vector for enable_secure
Starting with commit
59974938fe
elf/rtld: Count skipped environment variables for enable_secure

The new testcase elf/tst-tunables-enable_secure-env segfaults on s390 (31bit).
There _start parses the auxiliary vector for some additional checks.

Therefore it skips over the zeros after the environment variables ...
0x7fffac20:     0x7fffbd17      0x7fffbd32      0x7fffbd69      0x00000000
------------------------------------------------^^^last environment variable

... and then it parses the auxiliary vector and stops at AT_NULL.
0x7fffac30:     0x00000000      0x00000021      0x00000000      0x00000000
--------------------------------^^^AT_SYSINFO_EHDR--------------^^^AT_NULL
----------------^^^newp-----------------------------------------^^^oldp
Afterwards it tries to access AT_PHDR which points to somewhere and segfaults.

Due to not incorporating the skip_env variable in the computation of oldp
when shuffling down the auxv in rtld.c, it just copies one entry with AT_NULL
and value 0x00000021 and stops the loop.  In reality we have skipped
GLIBC_TUNABLES environment variable (=> skip_env=1). Thus we should copy from
here:
0x7fffac40:     0x00000021      0x7ffff000      0x00000010      0x007fffff
----------------^^^fixed-oldp

This patch fixes the computation of oldp when shuffling down auxiliary vector.
It also adds some checks in the testcase.  Those checks also fail on
s390x (64bit) and x86_64 without the fix.

Co-authored-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-03 13:01:44 +02:00
John David Anglin
4737e6a7a3 hppa/vdso: Provide 64-bit clock_gettime() vDSO only
Adhemerval noticed that the gettimeofday() and 32-bit clock_gettime()
vDSO calls won't be used by glibc on hppa, so there is no need to
declare them.  Both syscalls will be emulated by utilizing return values
of the 64-bit clock_gettime() vDSO instead.

Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
2024-07-02 16:26:32 -04:00
Adhemerval Zanella
9f80d8134a debug: Fix clang open fortify wrapper (BZ 31927)
The clang open fortify wrapper from 4228baef1a added
a restriction where open with 3 arguments where flags do not
contain O_CREAT or O_TMPFILE are handled as invalid.  They are
not invalid, since the third argument is ignored, and the gcc
wrapper also allows it.

Checked x86_64-linux-gnu and with a yocto build for some affected
packages.
Tested-by: “Khem Raj <raj.khen@gmail.com>”
2024-07-02 10:19:44 -03:00
H.J. Lu
ba144c179e Add --disable-static-c++-tests option [BZ #31797]
By default, if the C++ toolchain lacks support for static linking,
configure fails to find the C++ header files and the glibc build fails.
The --disable-static-c++-link-check option allows the glibc build to
finish, but static C++ tests will fail if the C++ toolchain doesn't
have the necessary static C++ libraries which may not be easily installed.
Add --disable-static-c++-tests option to skip the static C++ link check
and tests.  This fixes BZ #31797.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-07-02 00:51:34 -07:00
H.J. Lu
23f12e6e0c Add --disable-static-c++-link-check option [BZ #31412]
The current minimum GCC version of glibc build is GCC 6.2 or newer. But
building i686 glibc with GCC 6.4 on Fedora 40 failed since the C++ header
files couldn't be found which was caused by the static C++ link check
failure due to missing __divmoddi4 which was referenced in i686 libc.a
and added to GCC 7.  Add --disable-static-c++-link-check configure option
to disable the static C++ link test.  The newly built i686 libc.a can be
used by GCC 6.4 to create static C++ tests.  This fixes BZ #31412.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-07-02 00:51:34 -07:00
DJ Delorie
dce754b155 Update mmap() flags and errors lists
Extend the list of MAP_* macros to include all macros available
to the average program (gcc -E -dM | grep MAP_*)

Extend the list of errno codes.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-07-01 16:44:55 -04:00
YunQiang Su
9d0e9c8a13 MIPSr6/math: Use builtin fma and fmaf
MIPSr6 has MADDF.s/MADDF.d instructions, which are fused.

In MIPS ISA, double support can be subsetted.  Only FMAF is enabled
for this case.

	* sysdeps/mips/fpu/math-use-builtins-fma.h

Signed-off-by: YunQiang Su <syq@gcc.gnu.org>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-01 14:52:30 -03:00
Florian Weimer
018f0fc3b8 elf: Support recursive use of dynamic TLS in interposed malloc
It turns out that quite a few applications use bundled mallocs that
have been built to use global-dynamic TLS (instead of the recommended
initial-exec TLS).  The previous workaround from
commit afe42e935b ("elf: Avoid some
free (NULL) calls in _dl_update_slotinfo") does not fix all
encountered cases unfortunatelly.

This change avoids the TLS generation update for recursive use
of TLS from a malloc that was called during a TLS update.  This
is possible because an interposed malloc has a fixed module ID and
TLS slot.  (It cannot be unloaded.)  If an initially-loaded module ID
is encountered in __tls_get_addr and the dynamic linker is already
in the middle of a TLS update, use the outdated DTV, thus avoiding
another call into malloc.  It's still necessary to update the
DTV to the most recent generation, to get out of the slow path,
which is why the check for recursion is needed.

The bookkeeping is done using a global counter instead of per-thread
flag because TLS access in the dynamic linker is tricky.

All this will go away once the dynamic linker stops using malloc
for TLS, likely as part of a change that pre-allocates all TLS
during pthread_create/dlopen.

Fixes commit d2123d6827 ("elf: Fix slow
tls access after dlopen [BZ #19924]").

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-07-01 19:02:11 +02:00
Carlos O'Donell
a7fe3e805d
Fix conditionals on mtrace-based tests (bug 31892)
The conditionals for several mtrace-based tests in catgets, elf, libio,
malloc, misc, nptl, posix, and stdio-common were incorrect leading to
test failures when bootstrapping glibc without perl.

The correct conditional for mtrace-based tests requires three checks:
first checking for run-built-tests, then build-shared, and lastly that
PERL is not equal to "no" (missing perl).
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-01 17:20:30 +02:00
Michel Lind
4f7eb238d0 signal/Makefile: Split and sort tests
This will make it easier to add tests later; see also the test section
in malloc/Makefile that inspires this.

Signed-off-by: Michel Lind <salimma@fedoraproject.org>
Suggested-by: Arjun Shankar <arjun@redhat.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
2024-07-01 13:47:27 +02:00
MayShao-oc
9dc645cb56 x86: Set default non_temporal_threshold for Zhaoxin processors
Current 'non_temporal_threshold' set to 'non_temporal_threshold_lowbound'
on Zhaoxin processors without ERMS. The default
'non_temporal_threshold_lowbound' is too small for the KH-40000 and KX-7000
Zhaoxin processors, this patch updates the value to
'shared / cachesize_non_temporal_divisor'.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-06-30 06:26:43 -07:00
MayShao-oc
c19457aec6 x86_64: Optimize large size copy in memmove-ssse3
This patch optimizes large size copy using normal store when src > dst
and overlap.  Make it the same as the logic in memmove-vec-unaligned-erms.S.

Current memmove-ssse3 use '__x86_shared_cache_size_half' as the non-
temporal threshold, this patch updates that value to
'__x86_shared_non_temporal_threshold'.  Currently, the
__x86_shared_non_temporal_threshold is cpu-specific, and different CPUs
will have different values based on the related nt-benchmark results.
However, in memmove-ssse3, the nontemporal threshold uses
'__x86_shared_cache_size_half', which sounds unreasonable.

The performance is not changed drastically although shows overall
improvements without any major regressions or gains.

Results on Zhaoxin KX-7000:
bench-memcpy geometric_mean(N=20) New / Original: 0.999

bench-memcpy-random geometric_mean(N=20) New / Original: 0.999

bench-memcpy-large geometric_mean(N=20) New / Original: 0.978

bench-memmove geometric_mean(N=20) New / Original: 1.000

bench-memmmove-large geometric_mean(N=20) New / Original: 0.962

Results on Intel Core i5-6600K:
bench-memcpy geometric_mean(N=20) New / Original: 1.001

bench-memcpy-random geometric_mean(N=20) New / Original: 0.999

bench-memcpy-large geometric_mean(N=20) New / Original: 1.001

bench-memmove geometric_mean(N=20) New / Original: 0.995

bench-memmmove-large geometric_mean(N=20) New / Original: 0.936
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-06-30 06:26:43 -07:00
MayShao-oc
44d757eb9f x86: Set preferred CPU features on the KH-40000 and KX-7000 Zhaoxin processors
Fix code formatting under the Zhaoxin branch and add comments for
different Zhaoxin models.

Unaligned AVX load are slower on KH-40000 and KX-7000, so disable
the AVX_Fast_Unaligned_Load.

Enable Prefer_No_VZEROUPPER and Fast_Unaligned_Load features to
use sse2_unaligned version of memset,strcpy and strcat.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-06-30 06:26:43 -07:00
Andrew Pinski
2f1f7a5f8a
Aarch64: Add new memset for Qualcomm's oryon-1 core
Qualcom's new core, oryon-1, has a different characteristics for
memset than the current versions of memset. For non-zero, larger
sizes, using GPRs rather than the SIMD stores is ~30% faster.
For even larger sizes, using the nontemporal stores is needed
not to polute the L1/L2 caches.

For zero values, using `dc zva` should be used. Since we
know the size will always be 64 bytes, we don't need to figure
out the size there.

I started with the emag memset and added back the `dc zva` code.

Changes since v1:
* v3: Fix comment formating

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-06-30 13:47:17 +02:00
Andrew Pinski
4dc83cac78
Aarch64: Add memcpy for qualcomm's oryon-1 core
Qualcomm's new core (oryon-1) has a different performance characteristic
than other cores. For memcpy, it is faster to use the GPRs to
do the copy for large sizes (2x faster). For even larger sizes,
it is better to use the nontemporal load/store instructions so
we don't pollute the L1/L2 caches.

For smaller sizes, the characteristic are very similar to
other cores.
I used the thunderx memcpy as a starting point and expanded from there.

Changes since v1:
* v2: Fix ordering in Makefile.
* v3: Fix comment grammar about the ldnp/stnp instructions.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-06-30 13:46:33 +02:00
Adhemerval Zanella
4228baef1a debug: Fix clang open fortify wrapper (BZ 31927)
The fcntl.h fortify wrapper for clang added by 86889e22db
missed the __fortify_clang_overload_arg and and also added the
mode argument for the __fortify_function_error_function function,
which leads clang to be able to correct resolve which overloaded
function it should emit.

Checked on x86_64-linux-gnu.

Reported-by: Khem Raj <raj.khem@gmail.com>
Tested-by: Khem Raj <raj.khem@gmail.com>
2024-06-27 13:32:48 -03:00
Adhemerval Zanella
c5579f3a71 debug: Fix clang mq_open fortify wrapper (BZ 31917)
The mqueue.h fortify wrapper for clang added by c23107effb
is not fully correct, where correct 4 argument usage are not
being correctly handled.  For instance, while building socat 1.8
with a yocto clang based system shows:

  ./socat-1.8.0.0/xio-posixmq.c:119:8: error: 'mq_open' is unavailable: mq_open can be called either with 2 or 4 arguments
    119 |         mqd = mq_open(name, oflag, opt_mode, NULL);
        |               ^
  [...] /usr/include/bits/mqueue2.h:66:8: note: 'mq_open' has been explicitly marked unavailable here
     66 | __NTH (mq_open (const char *__name, int __oflag, mode_t mode,
        |        ^
  1 error generated.

The correct way to define the wrapper is to set invalid usage
with __fortify_clang_unavailable (for the case with 5 or more
arguments), followed by the expected ones.  This fix make mq_open
similar to current open wrappers.

[1] http://www.dest-unreach.org/socat/

Reported-by: Khem Raj <raj.khem@gmail.com>
Acked-by: Khem Raj <raj.khem@gmail.com>
2024-06-27 13:32:48 -03:00