Commit Graph

14924 Commits

Author SHA1 Message Date
Alan Modra
d6efcc118e powerpc64: Use medium model toc accesses throughout
The PowerPC64 linker edits medium model toc-indirect code to toc-pointer
relative:
	addis r9,r2,tc_entry_for_var@toc@ha
	ld r9,tc_entry_for_var@toc@l(r9)
becomes
	addis r9,r2,(var-.TOC.)@ha
	addi r9,r9,(var-.TOC.)@l
when "var" is known to be local to the binary.  This isn't done for
small-model toc-indirect code, because "var" is almost guaranteed to
be too far away from .TOC. for a 16-bit signed offset.  And, because
the analysis of which .toc entry can be removed becomes much more
complicated in objects that mix code models, they aren't removed if
any small-model toc sequence appears in an object file.

Unfortunately, glibc's build of ld.so smashes the needed objects
together in a ld -r linking stage.  This means the GOT/TOC is left
with a whole lot of relative relocations which is untidy, but in
itself is not a serious problem.  However, static-pie on powerpc64
bombs due to a segfault caused by one of the small-model accesses
before _dl_relocate_static_pie.  (The very first one in rcrt1.o
passing start_addresses in r8 to __libc_start_main.)

So this patch makes all the toc/got accesses in assembly medium code
model, and a couple of functions hidden.  By itself this is not
enough to give us working static-pie, but it is useful in isolation to
enable better linker optimisation.

There's a serious problem in libgcc too.  libgcc ifuncs access the
AT_HWCAP words stored in the tcb with an offset from the thread
pointer (r13), but r13 isn't set at the time _dl_relocate_static_pie.
A followup patch will fix that.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2022-04-10 08:33:06 +09:30
Alan Modra
30afd8c44d linux: Constify rfv variable in dl_vdso_vsym
Compilers may decide to put the rfv variable in .data rather than on
the stack.  It's slightly better to put it in .data.rel.ro.local
instead.  Regardles of that, making it const may enable further
optimisations.  Found when examining relative relocations (GOT ones
in particular) as part of enabling static-pie for PowerPC64.
2022-04-10 08:33:02 +09:30
Adhemerval Zanella
4f2146c4f4 sparc64: Remove fcopysign{f} implementation
The builtin from generic code generates similar compliant sequence.

Checked on sparc64-linux-gnu.
2022-04-07 15:11:56 -03:00
Adhemerval Zanella
0753be0c8a alpha: Remove fcopysign{f} implementation
The generic code already uses builtins.
2022-04-07 14:56:26 -03:00
Adhemerval Zanella
0a4ae090e0 math: Use builtin for ldbl-96 copysign
All architectures that uses it (x86, ia64, m68k) implement the
builtin.

Checked on x86_64-linux-gnu and ia64-linux-gnu.
2022-04-07 14:54:14 -03:00
Adhemerval Zanella
a085346267 ia64: Remove fcopysign{f} implementation
The builtin used by generic code generates similar code.

Checked on ia64-linux-gnu.
2022-04-07 12:27:00 -03:00
Adhemerval Zanella
13d45cf9a7 x86: Remove fcopysign{f} implementation
The builtin used by generic code generates similar code.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2022-04-07 12:17:15 -03:00
Adhemerval Zanella
2a45807e73 powerpc: Remove fcopysign{f} implementation
The builtin and generic implementation from generic files are suffice.

Checked on powerpc64-linux-gnu and powerpc-linux-gnu.
2022-04-07 12:00:16 -03:00
Adhemerval Zanella
cbc2c56bab benchtests: Only build libmvec benchmarks iff $(build-mathvec) is set
Checked on x86_64-linux-gnu.
2022-04-05 12:01:10 -03:00
Adhemerval Zanella
053fe27343 linux: Fix __closefrom_fallback iterates until max int (BZ#28993)
The __closefrom_fallback tries to get a available file descriptor
if the initial open ("/proc/self/fd/", ...) fails.  It assumes the
failure would be only if procfs is not mount (ENOENT), however if
the the proc file is not accessible (due some other kernel filtering
such apparmor) it will iterate over a potentially large file set
issuing close calls.

It should only try the close fallback if open returns EMFILE,
ENFILE, or ENOMEM.

Checked on x86_64-linux-gnu.
2022-04-05 08:08:19 -03:00
Fangrui Song
3ee318c923 Remove -z combreloc and HAVE_Z_COMBRELOC
-z combreloc has been the default regadless of the architecture since
binutils commit f4d733664aabd7bd78c82895e030ec9779a92809 (2002). The
configure check added in commit fdde83499a (2001) has long been
unneeded.

We can therefore treat HAVE_Z_COMBRELOC as always 1 and delete dead code
paths in dl-machine.h files (many were copied from commit a711b01d34
and ee0cb67ec2).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-04-04 17:19:07 -07:00
Adhemerval Zanella
1c225a2dd1 sparc: Remove s_abs implementations
For sparc64 is the same as the generic implementation, while for
sparc32 the builtin generates the same code.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
2022-04-04 16:53:24 -03:00
Adhemerval Zanella
caee5be74b ia64: Remove fabs implementations
The generic implementation fixes 5 fabs tests on ia64-linux-gnu:

  math/test-double-fabs
  math/test-float-fabs
  math/test-float32-fabs
  math/test-float32x-fabs
  math/test-float64-fabs

Checked on ia64-linux-gnu.
2022-04-04 16:30:38 -03:00
Adhemerval Zanella
7eed708edf x86: Remove fabs{f} implementation
For x86_64 is the same as the generic implementation, while for i686
the builtin generates the same code.
2022-04-04 16:23:11 -03:00
Adhemerval Zanella
dc2cfd6a87 alpha: Remove s_abs implementations
The generic implementation already uses builtins.
2022-04-04 16:22:15 -03:00
Noah Goldstein
244b415d38 x86: Small improvements for wcslen
Just a few QOL changes.
    1. Prefer `add` > `lea` as it has high execution units it can run
       on.
    2. Don't break macro-fusion between `test` and `jcc`
    3. Reduce code size by removing gratuitous padding bytes (-90
       bytes).

geometric_mean(N=20) of all benchmarks New / Original: 0.959

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-28 15:00:03 -05:00
Noah Goldstein
f5bff979d0 x86: Small improvements for wcscpy-ssse3
Just a few small QOL changes.
    1. Prefer `add` > `lea` as it has high execution units it can run
       on.
    2. Don't break macro-fusion between `test` and `jcc`

geometric_mean(N=20) of all benchmarks New / Original: 0.973

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-28 15:00:03 -05:00
Joseph Myers
866c599182 Add HWCAP2_AFP, HWCAP2_RPRES from Linux 5.17 to AArch64 bits/hwcap.h
Add the new HWCAP2_AFP and HWCAP2_RPRES constants from Linux 5.17.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
2022-03-28 13:16:48 +00:00
Noah Goldstein
305769b2a1 x86: Remove AVX str{n}casecmp
The rational is:

1. SSE42 has nearly identical logic so any benefit is minimal (3.4%
   regression on Tigerlake using SSE42 versus AVX across the
   benchtest suite).
2. AVX2 version covers the majority of targets that previously
   prefered it.
3. The targets where AVX would still be best (SnB and IVB) are
   becoming outdated.

All in all the saving the code size is worth it.

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 13:16:51 -05:00
Noah Goldstein
84e7c46df4 x86: Add EVEX optimized str{n}casecmp
geometric_mean(N=40) of all benchmarks EVEX / SSE42: .621

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 13:16:50 -05:00
Noah Goldstein
bbf8122234 x86: Add AVX2 optimized str{n}casecmp
geometric_mean(N=40) of all benchmarks AVX2 / SSE42: .702

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 13:16:43 -05:00
Noah Goldstein
d154758e61 x86: Optimize str{n}casecmp TOLOWER logic in strcmp-sse42.S
Slightly faster method of doing TOLOWER that saves an
instruction.

Also replace the hard coded 5-byte no with .p2align 4. On builds with
CET enabled this misaligned entry to strcasecmp.

geometric_mean(N=40) of all benchmarks New / Original: .920

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
670b54bc58 x86: Optimize str{n}casecmp TOLOWER logic in strcmp.S
Slightly faster method of doing TOLOWER that saves an
instruction.

Also replace the hard coded 5-byte no with .p2align 4. On builds with
CET enabled this misaligned entry to strcasecmp.

geometric_mean(N=40) of all benchmarks New / Original: .894

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
9fef7039a7 x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896]
Overflow case for __wcsncmp_avx2_rtm should be __wcscmp_avx2_rtm not
__wcscmp_avx2.

commit ddf0992cf5
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Sun Jan 9 16:02:21 2022 -0600

    x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]

Set the wrong fallback function for `__wcsncmp_avx2_rtm`. It was set
to fallback on to `__wcscmp_avx2` instead of `__wcscmp_avx2_rtm` which
can cause spurious aborts.

This change will need to be backported.

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
9c8a6ad620 x86: Remove strspn-sse2.S and use the generic implementation
The generic implementation is faster.

geometric_mean(N=20) of all benchmarks New / Original: .710

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
6533585352 x86: Remove strpbrk-sse2.S and use the generic implementation
The generic implementation is faster (see strcspn commit).

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
fe28e7d9d9 x86: Remove strcspn-sse2.S and use the generic implementation
The generic implementation is faster.

geometric_mean(N=20) of all benchmarks New / Original: .678

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
412d103431 x86: Optimize strspn in strspn-c.c
Use _mm_cmpeq_epi8 and _mm_movemask_epi8 to get strlen instead of
_mm_cmpistri. Also change offset to unsigned to avoid unnecessary
sign extensions.

geometric_mean(N=20) of all benchmarks that dont fallback on
sse2; New / Original: .901

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
30d627d477 x86: Optimize strcspn and strpbrk in strcspn-c.c
Use _mm_cmpeq_epi8 and _mm_movemask_epi8 to get strlen instead of
_mm_cmpistri. Also change offset to unsigned to avoid unnecessary
sign extensions.

geometric_mean(N=20) of all benchmarks that dont fallback on
sse2/strlen; New / Original: .928

All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
ec285ea904 x86: Code cleanup in strchr-evex and comment justifying branch
Small code cleanup for size: -81 bytes.

Add comment justifying using a branch to do NULL/non-null return.

All string/memory tests pass and no regressions in benchtests.

geometric_mean(N=20) of all benchmarks New / Original: .985
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein
a6fbf4d51e x86: Code cleanup in strchr-avx2 and comment justifying branch
Small code cleanup for size: -53 bytes.

Add comment justifying using a branch to do NULL/non-null return.

All string/memory tests pass and no regressions in benchtests.

geometric_mean(N=20) of all benchmarks Original / New: 1.00
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Joseph Myers
23808a422e Update kernel version to 5.17 in tst-mman-consts.py
This patch updates the kernel version in the test tst-mman-consts.py
to 5.17.  (There are no new MAP_* constants covered by this test in
5.17 that need any other header changes.)

Tested with build-many-glibcs.py.
2022-03-24 15:35:27 +00:00
Adhemerval Zanella
33f4d09bdc gmon: Remove unused sprofil.c functions 2022-03-23 14:29:25 -03:00
Joseph Myers
8ef9196b26 Update syscall lists for Linux 5.17
Linux 5.17 has one new syscall, set_mempolicy_home_node.  Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.

Tested with build-many-glibcs.py.
2022-03-23 17:11:56 +00:00
Adhemerval Zanella
c7f05bd534 Fix ununsed fstatat64_time64_statx
It is only called for legacy ABIs.
2022-03-23 13:33:16 -03:00
Adhemerval Zanella
f60e45ba10 elf: Remove inline _dl_dprintf
It is not used on rtld and ldsodef interfaces are meant to be used
solely on loader.  It also removes the only usage of gcc extension
__builtin_va_arg_pack.
2022-03-23 10:42:01 -03:00
Sam James
cb7b1c9014 configure.ac: fix bashisms in configure.ac
configure scripts need to be runnable with a POSIX-compliant /bin/sh.

On many (but not all!) systems, /bin/sh is provided by Bash, so errors
like this aren't spotted. Notably Debian defaults to /bin/sh provided
by dash which doesn't tolerate such bashisms as '=='.

This retains compatibility with bash.

Fixes configure warnings/errors like:
```
checking if compiler warns about alias for function with incompatible types... yes
/var/tmp/portage/sys-libs/glibc-2.34-r10/work/glibc-2.34/configure: 4209: test: xyes: unexpected operator
```

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sam James <sam@gentoo.org>
2022-03-22 21:53:43 -04:00
Siddhesh Poyarekar
d3f2c2c8b5 getaddrinfo: Refactor code for readability
The close_retry goto jump is confusing and clumsy to read, so refactor
the code a bit to make it easier to follow.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-23 06:55:39 +05:30
Siddhesh Poyarekar
bc0d18d873 gai_init: Avoid jumping from if condition to its else counterpart
Clean up another antipattern where code flows from an if condition to
its else counterpart with a goto.

Most of the change in this patch is whitespace-only; a `git diff -b`
ought to show the actual logic changes.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:41:46 +05:30
Siddhesh Poyarekar
06890c7ba5 gaiconf_init: Refactor some bits for readability
Split out line processing for `label`, `precedence` and `scopev4` into
separate functions instead of the gotos.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:41:42 +05:30
Siddhesh Poyarekar
b587456c0e gethosts: Return EAI_MEMORY on allocation failure
All other cases of failures due to lack of memory return EAI_MEMORY, so
it seems wrong to return EAI_SYSTEM here.  The only reason
convert_hostent_to_gaih_addrtuple could fail is on calloc failure.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:46 +05:30
Siddhesh Poyarekar
ac4653ef50 gaih_inet: Split result generation into its own function
Simplify the loop a wee bit and clean up variable names too.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:45 +05:30
Siddhesh Poyarekar
657472b2a5 gaih_inet: split loopback lookup into its own function
Flatten the condition nesting and replace the alloca for RET.AT/ATR with
a single array LOCAL_AT[2].  This gets rid of alloca and alloca
accounting.

`git diff -b` is probably the best way to view this change since much of
the diff is whitespace changes.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:44 +05:30
Siddhesh Poyarekar
cfa3bd48cb gaih_inet: make gethosts into a function
The macro is quite a pain to debug, so make gethosts into a function to
make it easier to maintain.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:42 +05:30
Siddhesh Poyarekar
906cecbe08 gaih_inet: separate nss lookup loop into its own function
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:33 +05:30
Siddhesh Poyarekar
e7e5315b7f gaih_inet: Split nscd lookup code into its own function.
Add a new member got_ipv6 to indicate if the results have an IPv6
result and use it instead of the local got_ipv6.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:20 +05:30
Siddhesh Poyarekar
b44389cb7f gaih_inet: Split simple gethostbyname into its own function
Add a free_at flag in gaih_result to indicate if res.at needs to be
freed by the caller.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:18 +05:30
Siddhesh Poyarekar
26dea46119 gaih_inet: make numeric lookup a separate routine
Introduce the gaih_result structure and general paradigm for cleanups
that follow to process the lookup request and return a result.  A lookup
function (like text_to_binary_address), should return an integer error
code and set members of gaih_result based on what it finds.  If the
function does not have a result and no errors have occurred during the
lookup, it should return 0 and res.at should be set to NULL, allowing a
subsequent function to do the lookup until we run out of options.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:17 +05:30
Siddhesh Poyarekar
8d6cf99f2f gaih_inet: Simplify service resolution
Refactor the code to split out the service resolution code into a
separate function.  Allocate the service tuples array just once to the
size of the typeproto array, thus avoiding the unnecessary pointer
chasing and stack allocations.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:16 +05:30
Siddhesh Poyarekar
3004604607 getaddrinfo: Fix leak with AI_ALL [BZ #28852]
Use realloc in convert_hostent_to_gaih_addrtuple and fix up pointers in
the result list so that a single block is maintained for
hostbyname3_r/hostbyname2_r and freed in gaih_inet.  This result is
never merged with any other results, since the hosts database does not
permit merging.

Resolves BZ #28852.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2022-03-22 19:39:14 +05:30