Commit Graph

39175 Commits

Author SHA1 Message Date
Wilco Dijkstra
9f298bfe1f AArch64: Add SVE memcpy
Add an initial SVE memcpy implementation.  Copies up to 32 bytes use SVE
vectors which improves the random memcpy benchmark significantly.
Cleanup the memcpy and memmove ifunc selectors.
2022-06-07 16:58:03 +01:00
Raghuveer Devulapalli
5082a287d5 x86_64: Add strstr function with 512-bit EVEX
Adding a 512-bit EVEX version of strstr. The algorithm works as follows:

(1) We spend a few cycles at the begining to peek into the needle. We
locate an edge in the needle (first occurance of 2 consequent distinct
characters) and also store the first 64-bytes into a zmm register.

(2) We search for the edge in the haystack by looking into one cache
line of the haystack at a time. This avoids having to read past a page
boundary which can cause a seg fault.

(3) If an edge is found in the haystack we first compare the first
64-bytes of the needle (already stored in a zmm register) before we
proceed with a full string compare performed byte by byte.

Benchmarking results: (old = strstr_sse2_unaligned, new = strstr_avx512)

Geometric mean of all benchmarks: new / old =  0.66

Difficult skiptable(0) : new / old =  0.02
Difficult skiptable(1) : new / old =  0.01
Difficult 2-way : new / old =  0.25
Difficult testing first 2 : new / old =  1.26
Difficult skiptable(0) : new / old =  0.05
Difficult skiptable(1) : new / old =  0.06
Difficult 2-way : new / old =  0.26
Difficult testing first 2 : new / old =  1.05
Difficult skiptable(0) : new / old =  0.42
Difficult skiptable(1) : new / old =  0.24
Difficult 2-way : new / old =  0.21
Difficult testing first 2 : new / old =  1.04
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-06 19:46:55 -07:00
Adhemerval Zanella
8521001731 scripts/glibcelf.py: Add PT_AARCH64_MEMTAG_MTE constant
It was added in commit 603e5c8ba7.
This caused the elf/tst-glibcelf consistency check to fail.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-06-06 15:56:48 -03:00
Dmitriy Fedchenko
999835533b socket: Fix mistyped define statement in socket/sys/socket.h (BZ #29225) 2022-06-06 12:46:14 -03:00
Joseph Myers
828c72519f Declare timegm for ISO C2X
The next revision of the ISO C standard has added the timegm function
(that was already supported in glibc).  Update the feature test
conditionals on its declaration in <time.h> accordingly.

Tested for x86_64.
2022-06-06 14:47:03 +00:00
Joseph Myers
603e5c8ba7 Add PT_AARCH64_MEMTAG_MTE from Linux 5.18 to elf.h
Linux 5.18 defines a new AArch64 ELF segment type
PT_AARCH64_MEMTAG_MTE; add it to elf.h.

Tested with build-many-glibcs.py for aarch64-linux-gnu.
2022-06-06 14:45:34 +00:00
Sam James
7df596a58c grep: egrep -> grep -E, fgrep -> grep -F
Newer versions of GNU grep (after grep 3.7, not inclusive) will warn on
'egrep' and 'fgrep' invocations.

Convert usages within the tree to their expanded non-aliased counterparts
to avoid irritating warnings during ./configure and the test suite.

Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Fangrui Song <maskray@google.com>
2022-06-05 12:09:02 -07:00
H.J. Lu
3c23fa9f44 string.h: Fix boolean spelling in comments 2022-06-03 10:22:38 -07:00
Carlos O'Donell
48f4b30780 elf: Add #include <errno.h> for use of E* constants.
In __strerror_r we use errno constants and must include errno.h.

Tested on x86_64 and i686 without regression.
2022-06-02 15:20:36 -04:00
Carlos O'Donell
62c888b337 elf: Add #include <sys/param.h> for MAX usage.
In _dl_audit_pltenter we use MAX and so need to include param.h.

Tested on x86_64 and i686 without regression.
2022-06-02 15:20:36 -04:00
Adhemerval Zanella
1002f1af1c linux: Add process_mrelease
Added in Linux 5.15 (884a7e5964e06ed93c7771c0d7cf19c09a8946f1), the new
syscalls allows a caller to free the memory of a dying target process.

Checked on x86_64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-06-02 15:43:28 -03:00
Adhemerval Zanella
d19ee3473d linux: Add process_madvise
It was added on Linux 5.10 (ecb8ac8b1f146915aa6b96449b66dd48984caacc)
with the same functionality as madvise but using a pidfd of the target
process.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-06-02 15:43:28 -03:00
Adhemerval Zanella
7d3e91ba19 linux: Set tst-pidfd-consts unsupported for kernels headers older than 5.10
Instead of fail trying to build the compare source file.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
2022-06-02 15:43:25 -03:00
Florian Weimer
bb8887379f testrun.sh: Support passing strace and valgrind arguments
This is a bit of a hack, but it works quite well in practice.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-06-02 18:37:30 +02:00
Florian Weimer
4b527650e0 Linux: Adjust struct rseq definition to current kernel version
This definition is only used as a fallback with old kernel headers.
The change follows kernel commit bfdf4e6208051ed7165b2e92035b4bf11
("rseq: Remove broken uapi field layout on 32-bit little endian").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-06-02 16:29:59 +02:00
Adhemerval Zanella
c789e6e409 iconv: Use 64 bit stat for gconv_parseconfdir (BZ# 29213)
The issue is only when used within libc.so (iconvconfig already builds
with _TIME_SIZE=64).

This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
634f566c3e catgets: Use 64 bit stat for __open_catalog (BZ# 29211)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
3cd4785ea0 inet: Use 64 bit stat for ruserpass (BZ# 29210)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
87f1ec12e7 socket: Use 64 bit stat for isfdtype (BZ# 29209)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
6e7137f28c posix: Use 64 bit stat for fpathconf (_PC_ASYNC_IO) (BZ# 29208)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
574ba60fc8 posix: Use 64 bit stat for posix_fallocate fallback (BZ# 29207)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
ec995fb215 misc: Use 64 bit stat for getusershell (BZ# 29203)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:16 -03:00
Adhemerval Zanella
3fbc33010c misc: Use 64 bit stat for daemon (BZ# 29203)
This is a missing spot initially from 52a5fe70a2.

Checked on i686-linux-gnu.
2022-06-01 13:23:13 -03:00
WANG Xuerui
e6547d635b linux: use statx for fstat if neither newfstatat nor fstatat64 is present
LoongArch is going to be the first architecture supported by Linux that
has neither fstat* nor newfstatat [1], instead exclusively relying on
statx. So in fstatat64's implementation, we need to also enable statx
usage if neither fstatat64 nor newfstatat is present, to prepare for
this new case of kernel ABI.

[1]: https://lore.kernel.org/all/20220518092619.1269111-1-chenhuacai@loongson.cn/

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-06-01 12:29:01 -03:00
Joseph Myers
de3501d60f Add MADV_DONTNEED_LOCKED from Linux 5.18 to bits/mman-linux.h
Linux 5.18 adds a constant MADV_DONTNEED_LOCKED (defined in multiple
header files, but with the same value on all architectures).  Add this
constant to bits/mman-linux.h.

Tested for x86_64.
2022-06-01 14:45:48 +00:00
Joseph Myers
9d03bac7e7 Add HWCAP2_MTE3 from Linux 5.18 to AArch64 bits/hwcap.h
Linux 5.18 defines a new AArch64 HWCAP value HWCAP2_MTE3; add it to
glibc's sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h.

Tested with build-many-glibcs.py for aarch64-linux-gnu.
2022-06-01 14:43:06 +00:00
Adhemerval Zanella
5a6f2cabb6 i686: Use generic sincosf implementation for SSE2 version
The generic implementation shows slight better performance
(gcc 11.2.1 on a Ryzen 9 5900X):

* s_sincosf-sse2.S:
  "sincosf": {
   "workload-random": {
    "duration": 3.89961e+09,
    "iterations": 9.5472e+07,
    "reciprocal-throughput": 40.8429,
    "latency": 40.8483,
    "max-throughput": 2.4484e+07,
    "min-throughput": 2.44808e+07
   }
  }

* generic s_cossinf.c:
  "sincosf": {
   "workload-random": {
    "duration": 3.71953e+09,
    "iterations": 1.48512e+08,
    "reciprocal-throughput": 25.0515,
    "latency": 25.0391,
    "max-throughput": 3.99177e+07,
    "min-throughput": 3.99375e+07
   }
  }

Checked on i686-linux-gnu.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
Adhemerval Zanella
dc208f4a53 benchtests: Add workload name for sincosf
So it can show both reciprocal-throughput and latency.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
Adhemerval Zanella
3323476641 i686: Use generic sinf implementation for SSE2 version
Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X),
the generic algorithm shows slight better performance for
the 'workload-huge.wrf' input set.

* s_sinf-sse2.S:
  "sinf": {
   "": {
    "duration": 3.72405e+09,
    "iterations": 2.38374e+08,
    "max": 63.973,
    "min": 11.211,
    "mean": 15.6227
   },
   "workload-random.wrf": {
    "duration": 3.76923e+09,
    "iterations": 8.4e+07,
    "reciprocal-throughput": 17.6355,
    "latency": 72.108,
    "max-throughput": 5.67037e+07,
    "min-throughput": 1.38681e+07
   },
   "workload-huge.wrf": {
    "duration": 3.76943e+09,
    "iterations": 6e+07,
    "reciprocal-throughput": 29.3493,
    "latency": 96.2985,
    "max-throughput": 3.40724e+07,
    "min-throughput": 1.03844e+07
   }
  }

* generic s_sinf.c:
  "sinf": {
   "": {
    "duration": 3.70989e+09,
    "iterations": 2.18025e+08,
    "max": 69.782,
    "min": 11.1,
    "mean": 17.0159
   },
   "workload-random.wrf": {
    "duration": 3.77213e+09,
    "iterations": 9.6e+07,
    "reciprocal-throughput": 17.5402,
    "latency": 61.0459,
    "max-throughput": 5.70119e+07,
    "min-throughput": 1.63811e+07
   },
   "workload-huge.wrf": {
    "duration": 3.81576e+09,
    "iterations": 5.6e+07,
    "reciprocal-throughput": 38.2111,
    "latency": 98.0659,
    "max-throughput": 2.61704e+07,
    "min-throughput": 1.01972e+07
   }
  }

Checked on i686-linux-gnu.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
Adhemerval Zanella
da39afa4ff i686: Use generic cosf implementation for SSE2 version
Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X):

* s_cosf-sse2.S:
  "cosf": {
   "workload-random": {
    "duration": 3.74987e+09,
    "iterations": 9.616e+07,
    "reciprocal-throughput": 15.8141,
    "latency": 62.1782,
    "max-throughput": 6.32346e+07,
    "min-throughput": 1.60828e+07
   }
  }

* generic s_cosf.c:
  "cosf": {
   "workload-random": {
    "duration": 3.87298e+09,
    "iterations": 1.00968e+08,
    "reciprocal-throughput": 18.3448,
    "latency": 58.3722,
    "max-throughput": 5.45113e+07,
    "min-throughput": 1.71314e+07
   }
  }

Checked on i686-linux-gnu.
2022-06-01 10:47:44 -03:00
Adhemerval Zanella
c1176b62a9 benchtests: Add workload name for cosf
So it can show both reciprocal-throughput and latency.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
Andreas Schwab
dc1e5eeb25 x86_64: Optimize sincos where sin/cos is optimized (bug 29193)
The compiler may substitute calls to sin or cos with calls to sincos, thus
we should have the same optimized implementations for sincos.  The
optimized implementations may produce results that differ, that also makes
sure that the sincos call aggrees with the sin and cos calls.
2022-06-01 10:29:52 +02:00
Andreas Schwab
d976d44a89 manual: fix reference to source file 2022-05-31 16:22:49 +02:00
Joseph Myers
6488f4d006 Add SOL_SMC from Linux 5.18 to bits/socket.h
Linux 5.18 adds a constant SOL_SMC to the getsockopt / setsockopt
levels; add this constant to bits/socket.h.

Tested for x86_64.
2022-05-31 13:49:53 +00:00
Adhemerval Zanella
81e7fdd7cc elf: Remove _dl_skip_args
Now that no architecture uses it anymore.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:54 -03:00
Adhemerval Zanella
ec7bc492b6 x86_64: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:34 -03:00
Adhemerval Zanella
b6712b137f sparc: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:31 -03:00
Adhemerval Zanella
4dc1f6530e sh: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:28 -03:00
Adhemerval Zanella
22d8935d1d s390: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked on s390x-linux-gnu and s390-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:25 -03:00
Adhemerval Zanella
d62123c1ed riscv: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:22 -03:00
Adhemerval Zanella
4868ba5d25 nios2: Remove _dl_skip_args usage (BZ# 29187)
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:20 -03:00
Adhemerval Zanella
44fc092c0d mips: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:16 -03:00
Adhemerval Zanella
90cf8e6f0a microblaze: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.   So there is no need to adjust the argc or argv.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:14 -03:00
Adhemerval Zanella
ee39fafa98 m68k: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.  So there is no need to adjust the argc or argv.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:11 -03:00
Adhemerval Zanella
57bb1e5b9f ia64: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.

The startup code is changed to read the _dl_argc and _dl_argv values,
and envp is calculated from argc and argv.

Checked on ia64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:33:08 -03:00
Adhemerval Zanella
1b7f05d11e i686: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.  So there is no need to adjust the argc or argv.

Checked on i686-linux-gnu.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-05-30 16:32:57 -03:00
Adhemerval Zanella
6242602273 hppa: Remove _dl_skip_args usage (BZ# 29165)
Different than other architectures, hppa creates an unrelated stack
frame where ld.so argc/argv adjustments done by ad43cac44a
is not done on the argc/argv saved/restore by _dl_start_user.

Instead load _dl_argc and _dl_argv directlty instead of adjust them
using _dl_skip_args value.

Checked on hppa-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:32:35 -03:00
Adhemerval Zanella
00477963c6 csky: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.  It makes the fixup_stack branch ununsed.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:32:33 -03:00
Adhemerval Zanella
f20464e9e4 arc: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.  So there is no need to adjust the argc or argv.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:32:29 -03:00
Adhemerval Zanella
49d877a80b arm: Remove _dl_skip_args usage
Since ad43cac44a the generic code already shuffles the argv/envp/auxv
on the stack to remove the ld.so own arguments and thus _dl_skip_args
is always 0.  It makes the _fixup_stack branch ununsed.

Checked with qemu-user that arguments are correctly passed on both
constructors and main program.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-05-30 16:32:26 -03:00