Commit Graph

38003 Commits

Author SHA1 Message Date
H.J. Lu
a586fe9c80 Configure GCC with --enable-initfini-array [BZ #27945]
Starting from GCC 12, the .init_array and .fini_array sections are enabled
unconditionally by

commit 13a39886940331149173b25d6ebde0850668d8b9
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Jun 8 16:09:24 2021 -0700

    Always enable DT_INIT_ARRAY/DT_FINI_ARRAY on Linux

configure GCC with --enable-initfini-array to enable them when using GCC
release branches.

Fixes BZ #27945.
2021-11-05 15:30:02 -07:00
Florian Weimer
ea32ec354c elf: Earlier missing dynamic segment check in _dl_map_object_from_fd
Separated debuginfo files have PT_DYNAMIC with p_filesz == 0.  We
need to check for that before the _dl_map_segments call because
that could attempt to write to mappings that extend beyond the end
of the file, resulting in SIGBUS.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-05 19:34:16 +01:00
Nikita Popov
ff012870b2 gconv: Do not emit spurious NUL character in ISO-2022-JP-3 (bug 28524)
Bugfix 27256 has introduced another issue:
In conversion from ISO-2022-JP-3 encoding, it is possible
to force iconv to emit extra NUL character on internal state reset.
To do this, it is sufficient to feed iconv with escape sequence
which switches active character set.
The simplified check 'data->__statep->__count != ASCII_set'
introduced by the aforementioned bugfix picks that case and
behaves as if '\0' character has been queued thus emitting it.

To eliminate this issue, these steps are taken:
* Restore original condition
'(data->__statep->__count & ~7) != ASCII_set'.
It is necessary since bits 0-2 may contain
number of buffered input characters.
* Check that queued character is not NUL.
Similar step is taken for main conversion loop.

Bundled test case follows following logic:
* Try to convert ISO-2022-JP-3 escape sequence
switching active character set
* Reset internal state by providing NULL as input buffer
* Ensure that nothing has been converted.

Signed-off-by: Nikita Popov <npv1310@gmail.com>
2021-11-04 19:59:42 +01:00
Paul A. Clarke
9fea0f1a2a [powerpc] Tighten contraints for asm constant parameters
There are a few places where only known numeric values are acceptable for
`asm` parameters, yet the constraint "i" is used.  "i" can include
"symbolic constants whose values will be known only at assembly time or
later."

Use "n" instead of "i" where known numeric values are required.

Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2021-11-03 09:17:28 -05:00
Adhemerval Zanella
d3bf2f5927 elf: Do not run DSO sorting if tunables is not enabled
Since the argorithm selection requires tunables.

Checked on x86_64-linux-gnu with --enable-tunables=no.
2021-11-03 09:25:06 -03:00
Adhemerval Zanella
09f214528c riscv: Build with -mno-relax if linker does not support R_RISCV_ALIGN
It allows build both glibc and tests with lld (Since lld does not
support R_RISCV_ALIGN linker relaxation).

Checked with a build for riscv32-linux-gnu-rv32imafdc-ilp32d and
riscv64-linux-gnu-rv64imafdc-lp64d.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Fangrui Song <maskray@google.com>
2021-11-03 09:25:06 -03:00
Fangrui Song
6720d36b66 x86-64: Replace movzx with movzbl
Clang cannot assemble movzx in the AT&T dialect mode.

../sysdeps/x86_64/strcmp.S:2232:16: error: invalid operand for instruction
 movzx (%rsi), %ecx
               ^~~~

Change movzx to movzbl, which follows the AT&T dialect and is used
elsewhere in the file.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-02 20:59:52 -07:00
Fangrui Song
fdcd177fd3 regex: Unnest nested functions in regcomp.c
This refactor moves four functions out of a nested scope and converts
them into static always_inline functions. collseqwc, table_size,
symb_table, extra are now initialized to zero because they are passed as
function arguments.

On x86-64, .text is 16 byte larger likely due to the 4 stores.
This is nothing compared to the amount of work that regcomp has to do
looking up the collation weights, or other functions.

If the non-buildable `sysdeps/generic/dl-machine.h` doesn't count,
this patch removes the last `auto inline` usage from glibc.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-11-02 10:07:59 -07:00
Joseph Myers
db432f033d Use Linux 5.15 in build-many-glibcs.py
This patch makes build-many-glibcs.py use Linux 5.15.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
2021-11-02 16:54:59 +00:00
Adhemerval Zanella
f64f4ce069 elf: Assume disjointed .rela.dyn and .rela.plt for loader
The patch removes the the ELF_DURING_STARTUP optimization and assume
both .rel.dyn and .rel.plt might not be subsequent.  This allows some
code simplification since relocation will be handled independently
where it is done on bootstrap.

At least on x86_64_64, I can not measure any performance implications.
Running 10000 time the command

  LD_DEBUG=statistics ./elf/ld.so ./libc.so

And filtering the "total startup time in dynamic loader" result,
the geometric mean is:

                  patched       master
  Ryzen 7 5900x     24140        24952
  i7-4510U          45957        45982

(The results do show some variation, I did not make any statistical
analysis).

It also allows build arm with lld, since it inserts ".ARM.exidx"
between ".rel.dyn" and ".rel.plt" for the loader.

Checked on x86_64-linux-gnu and arm-linux-gnueabihf.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-02 11:21:57 -03:00
Florian Weimer
cca75bd8b5 i386: Explain why __HAVE_64B_ATOMICS has to be 0 2021-11-02 10:26:23 +01:00
Adhemerval Zanella
b8a6ee43bb benchtests: Add hypotf
Based on random input arguments.  About 85% tuples have exponents
of the two arguments close together (+-1 range).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-01 16:23:39 -03:00
Adhemerval Zanella
dba44dbe54 benchtests: Make hypot input random
Instead of inputs based on the algorithm implementation details.
About 85% tuples have exponents of the two arguments close
together (+-1 range).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-01 16:23:22 -03:00
Adhemerval Zanella
613cb5c7b1 arm: Use have-mtls-dialect-gnu2 to check for ARM TLS descriptors support
The lld linker does not support TLSDESC for arm.  The have-arm-tls-desc
is a leftover of 56583289b1 to support NaCL.

Reviewed-by: Fangrui Song <maskray@google.com>
2021-11-01 16:23:15 -03:00
Adhemerval Zanella
d6dea8c847 arm: Use internal symbol for _dl_argv on _dl_start_user
The lld does not support R_ARM_GOTOFF32 to preemptible symbol (_dl_argv
has default visibility).  Use the internal alias instead (one option
would to use HIDDEN_JUMPTARGET, bu the macro is not defined for
!__ASSEMBLER__ and I made this patch arm-specific to avoid require to
check extensivelly on other architecture it this might break something).

Checked on arm-linux-gnueabihf.

Reviewed-by: Fangrui Song <maskray@google.com>
2021-11-01 16:21:53 -03:00
H.J. Lu
14dbbf46a0 x86-64: Remove Prefer_AVX2_STRCMP
Remove Prefer_AVX2_STRCMP to enable EVEX strcmp.  When comparing 2 32-byte
strings, EVEX strcmp has been improved to require 1 load, 1 VPTESTM, 1
VPCMP, 1 KMOVD and 1 INCL instead of 2 loads, 3 VPCMPs, 2 KORDs, 1 KMOVD
and 1 TESTL while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU, 1
VPMOVMSKB and 1 TESTL.  EVEX strcmp is now faster than AVX2 strcmp by up
to 40% on Tiger Lake and Ice Lake.
2021-11-01 07:53:04 -07:00
H.J. Lu
c46e9afb2d x86-64: Improve EVEX strcmp with masked load
In strcmp-evex.S, to compare 2 32-byte strings, replace

        VMOVU   (%rdi, %rdx), %YMM0
        VMOVU   (%rsi, %rdx), %YMM1
        /* Each bit in K0 represents a mismatch in YMM0 and YMM1.  */
        VPCMP   $4, %YMM0, %YMM1, %k0
        VPCMP   $0, %YMMZERO, %YMM0, %k1
        VPCMP   $0, %YMMZERO, %YMM1, %k2
        /* Each bit in K1 represents a NULL in YMM0 or YMM1.  */
        kord    %k1, %k2, %k1
        /* Each bit in K1 represents a NULL or a mismatch.  */
        kord    %k0, %k1, %k1
        kmovd   %k1, %ecx
        testl   %ecx, %ecx
        jne     L(last_vector)

with

        VMOVU   (%rdi, %rdx), %YMM0
        VPTESTM %YMM0, %YMM0, %k2
        /* Each bit cleared in K1 represents a mismatch or a null CHAR
           in YMM0 and 32 bytes at (%rsi, %rdx).  */
        VPCMP   $0, (%rsi, %rdx), %YMM0, %k1{%k2}
        kmovd   %k1, %ecx
        incl    %ecx
        jne     L(last_vector)

It makes EVEX strcmp faster than AVX2 strcmp by up to 40% on Tiger Lake
and Ice Lake.

Co-Authored-By: Noah Goldstein <goldstein.w.n@gmail.com>
2021-11-01 07:52:56 -07:00
Sunil K Pandey
79d0fc6539 benchtests: Add acosf function to bench-math
Add acosf function to bench-math and copy acosf-inputs to benchtests.
Motivation for this patch is to prepare for upcoming libmvec new
functions.  Float and double version of libmvec functions stays
together.

acosf-inputs file generated from acos-inputs file using following
scaling formula:

f = d * (FLT_MAX/DBL_MAX)

Where d is input(double) and f is output(float).  If scaled float value
is duplicate in new input file, nextafterf() function used to find next
float value, ensuring no duplicates.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-10-29 08:52:30 -07:00
Wilco Dijkstra
f392915d1e benchtests: Improve bench-memcpy-random
Improve the random memcpy benchmark. Double the number of tests and increase
the size of the memory region to test between 32KB and 1024KB. This improves
accuracy on modern cores. Clean up formatting of the frequency array.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-10-29 15:45:53 +01:00
Joseph Myers
7ca9377bab Disable -Waggressive-loop-optimizations warnings in tst-dynarray.c
My build-many-glibcs.py bot shows -Waggressive-loop-optimizations
errors building the glibc testsuite for 32-bit architectures with GCC
mainline, which seem to have appeared between GCC commits
4abc0c196b10251dc80d0743ba9e8ab3e56c61ed and
d8edfadfc7a9795b65177a50ce44fd348858e844:

In function 'dynarray_long_noscratch_resize',
    inlined from 'test_long_overflow' at tst-dynarray.c:489:5,
    inlined from 'do_test' at tst-dynarray.c:571:3:
../malloc/dynarray-skeleton.c:391:36: error: iteration 1073741823 invokes undefined behavior [-Werror=aggressive-loop-optimizations]
  391 |             DYNARRAY_ELEMENT_INIT (&list->u.dynarray_header.array[i]);
tst-dynarray.c:39:37: note: in definition of macro 'DYNARRAY_ELEMENT_INIT'
   39 | #define DYNARRAY_ELEMENT_INIT(e) (*(e) = 23)
      |                                     ^
In file included from tst-dynarray.c:42:
../malloc/dynarray-skeleton.c:389:37: note: within this loop
  389 |         for (size_t i = old_size; i < size; ++i)
      |                                   ~~^~~~~~
In function 'dynarray_long_resize',
    inlined from 'test_long_overflow' at tst-dynarray.c:479:5,
    inlined from 'do_test' at tst-dynarray.c:571:3:
../malloc/dynarray-skeleton.c:391:36: error: iteration 1073741823 invokes undefined behavior [-Werror=aggressive-loop-optimizations]
  391 |             DYNARRAY_ELEMENT_INIT (&list->u.dynarray_header.array[i]);
tst-dynarray.c:27:37: note: in definition of macro 'DYNARRAY_ELEMENT_INIT'
   27 | #define DYNARRAY_ELEMENT_INIT(e) (*(e) = 17)
      |                                     ^
In file included from tst-dynarray.c:28:
../malloc/dynarray-skeleton.c:389:37: note: within this loop
  389 |         for (size_t i = old_size; i < size; ++i)
      |                                   ~~^~~~~~

I don't know what GCC change made these errors appear, or why they
only appear for 32-bit architectures.  However, the warnings appear to
be both true (that iteration would indeed involve undefined behavior
if executed) and useless in this particular case (that iteration is
never executed, because the allocation size overflows and so the
allocation fails - but the check for allocation size overflow is in a
separate source file and so can't be seen by the compiler when
compiling this test).  So use the DIAG_* macros to disable
-Waggressive-loop-optimizations around the calls in question to
dynarray_long_resize and dynarray_long_noscratch_resize in this test.

Tested with build-many-glibcs.py (GCC mainline) for arm-linux-gnueabi,
where it restores a clean testsuite build.
2021-10-29 14:40:45 +00:00
Stafford Horne
6446c725d4 Fix compiler issue with mmap_internal
Compiling mmap_internal fails to compile when we use -1 for MMAP2_PAGE_UNIT
on 32 bit architectures.  The error is as follows:

    ../sysdeps/unix/sysv/linux/mmap_internal.h:30:8: error: unknown type
    name 'uint64_t'
    |
       30 | static uint64_t page_unit;
	  |
	  |        ^~~~~~~~

Fix by adding including stdint.h.
2021-10-29 09:21:37 -03:00
Adhemerval Zanella
04e8169f1d Check if linker also support -mtls-dialect=gnu2
Since some linkers (for instance lld for i386) does not support it
for all architectures.

Checked on i686-linux-gnu.

Reviewed-by: Fangrui Song <maskray@google.com>
2021-10-29 09:21:37 -03:00
Adhemerval Zanella
3d5ecb6246 Fix LIBC_PROG_BINUTILS for -fuse-ld=lld
GCC does not print the correct linker when -fuse-ld=lld is used with
the -print-prog-name=ld:

  $ gcc -v 2>&1 | tail -n 1
  gcc version 11.2.0 (Ubuntu 11.2.0-7ubuntu2)
  $ gcc
  ld

This is different than for gold:

  $ gcc -fuse-ld=gold -print-prog-name=ld
  ld.gold

Using ld.lld as the static linker name prints the expected result.

This is only required when -fuse-ld=lld is used, if lld is used as
the 'ld' programs (through a symlink) LIBC_PROG_BINUTILS works
as expected.

Checked on x86_64-linux-gnu.

Reviewed-by: Fangrui Song <maskray@google.com>
2021-10-29 09:21:37 -03:00
Adhemerval Zanella
66a273d16a elf: Disable ifuncmain{1,5,5pic,5pie} when using LLD
These tests takes the address of a protected symbol (foo_protected)
and lld does not support copy relocations on protected data symbols.

Checked on x86_64-linux-gnu.

Reviewed-by: Fangrui Song <maskray@google.com>
2021-10-29 09:21:37 -03:00
Siddhesh Poyarekar
88e316b064 Handle NULL input to malloc_usable_size [BZ #28506]
Hoist the NULL check for malloc_usable_size into its entry points in
malloc-debug and malloc and assume non-NULL in all callees.  This fixes
BZ #28506

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2021-10-29 14:53:55 +05:30
Noah Goldstein
1d56fd3bae x86_64: Add memcmpeq.S to fix disable-multi-arch build
The following commit:

    commit cf4fd28ea4
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Tue Oct 26 19:43:18 2021 -0500

Broke --disable-multi-arch build for x86_64 because x86_64/memcmpeq.S
was not defined outside of multiarch and the alias for __memcmpeq in
x86_64/memcmp.S was removed.

This commit fixes that issue by adding x86_64/memcmpeq.S.

make xcheck passes on x86_64 with and without --disable-multi-arch
2021-10-28 16:35:50 -05:00
Stafford Horne
b3cf94ef15 login: Add back libutil as an empty library
There are several packages like sysvinit and buildroot that expect
-lutil to work.  Rather than impacting them with having to change
the linker flags provide an empty libutil.a.
2021-10-29 06:18:55 +09:00
Fangrui Song
6838920383 riscv: Fix incorrect jal with HIDDEN_JUMPTARGET
A non-local STV_DEFAULT defined symbol is by default preemptible in a
shared object. j/jal cannot target a preemptible symbol. On other
architectures, such a jump instruction either causes PLT [BZ #18822], or
if short-ranged, sometimes rejected by the linker (but not by GNU ld's
riscv port [ld PR/28509]).

Use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead.

With this patch, ld.so and libc.so can be linked with LLD if source
files are compiled/assembled with -mno-relax/-Wa,-mno-relax.

Acked-by: Palmer Dabbelt <palmer@dabbelt.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-28 11:39:49 -07:00
Noah Goldstein
9b7cfab180 x86_64: Add evex optimized __memcmpeq in memcmpeq-evex.S
No bug. This commit adds new optimized __memcmpeq implementation for
evex.

The primary optimizations are:

1) skipping the logic to find the difference of the first mismatched
byte.

2) not updating src/dst addresses as the non-equals logic does not
need to be reused by different areas.
2021-10-27 13:03:46 -05:00
Noah Goldstein
b4ed69ba16 x86_64: Add avx2 optimized __memcmpeq in memcmpeq-avx2.S
No bug. This commit adds new optimized __memcmpeq implementation for
avx2.

The primary optimizations are:

1) skipping the logic to find the difference of the first mismatched
byte.

2) not updating src/dst addresses as the non-equals logic does not
need to be reused by different areas.
2021-10-27 13:03:46 -05:00
Noah Goldstein
fa7f63d8d6 x86_64: Add sse2 optimized __memcmpeq in memcmp-sse2.S
No bug. This commit does not modify any of the memcmp
implementation. It just adds __memcmpeq ifdefs to skip obvious cases
where computing the proper 1/-1 required by memcmp is not needed.
2021-10-27 13:03:46 -05:00
Noah Goldstein
cf4fd28ea4 x86_64: Add support for __memcmpeq using sse2, avx2, and evex
No bug. This commit adds support for __memcmpeq to be implemented
seperately from memcmp. Support is added for versions optimized with
sse2, avx2, and evex.
2021-10-27 13:03:46 -05:00
Noah Goldstein
cf3acd774f Benchtests: Add benchtests for __memcmpeq
No bug. This commit adds __memcmpeq benchmarks. The benchmarks just
use the existing ones in memcmp. This will be useful for testing
implementations of __memcmpeq that do not just alias memcmp.
2021-10-27 13:03:46 -05:00
Noah Goldstein
3592ccd472 String: Add __memcmpeq as build target
No bug. This commit just adds __memcmpeq as a build target so that
implementations for __memcmpeq that are not just aliases to memcmp can
be supported.
2021-10-27 13:03:46 -05:00
Noah Goldstein
11c88336e3 NEWS: Add item for __memcmpeq 2021-10-26 16:51:29 -05:00
Noah Goldstein
d9283b71ac String: Add tests for __memcmpeq
No bug.

This commit adds tests for the new function __memcmpeq. The new tests
use the existing tests in 'test-memcmp.c' but relax the result
requirement to only check for zero or non-zero returns.

All string tests include test-memcmpeq are passing.
2021-10-26 16:51:29 -05:00
Noah Goldstein
9894127d20 String: Add hidden defs for __memcmpeq() to enable internal usage
No bug.

This commit adds hidden defs for all declarations of __memcmpeq. This
enables usage of __memcmpeq without the PLT for usage internal to
GLIBC.
2021-10-26 16:51:29 -05:00
Noah Goldstein
44829b3ddb String: Add support for __memcmpeq() ABI on all targets
No bug.

This commit adds support for __memcmpeq() as a new ABI for all
targets. In this commit __memcmpeq() is implemented only as an alias
to the corresponding targets memcmp() implementation. __memcmpeq() is
added as a new symbol starting with GLIBC_2.35 and defined in string.h
with comments explaining its behavior. Basic tests that it is callable
and works where added in string/tester.c

As discussed in the proposal "Add new ABI '__memcmpeq()' to libc"
__memcmpeq() is essentially a reserved namespace for bcmp(). The means
is shares the same specifications as memcmp() except the return value
for non-equal byte sequences is any non-zero value. This is less
strict than memcmp()'s return value specification and can be better
optimized when a boolean return is all that is needed.

__memcmpeq() is meant to only be called by compilers if they can prove
that the return value of a memcmp() call is only used for its boolean
value.

All tests in string/tester.c passed. As well build succeeds on
x86_64-linux-gnu target.
2021-10-26 16:51:29 -05:00
Fangrui Song
8438135d34 configure: Don't check LD -v --help for LIBC_LINKER_FEATURE
When LIBC_LINKER_FEATURE is used to check a linker option with the equal
sign, it will likely fail because the LD -v --help output may look like
`-z lam-report=[none|warning|error]` while the needle is something like
`-z lam-report=warning`.

The LD -v --help filter doesn't save much time, so just remove it.
2021-10-25 13:17:44 -07:00
H.J. Lu
f9b152c83f elf: Make global.out depend on reldepmod4.so [BZ #28457]
The global test is linked with globalmod1.so which dlopens reldepmod4.so.
Make global.out depend on reldepmod4.so.  This fixes BZ #28457.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-10-25 07:13:54 -07:00
Noah Goldstein
bad852b61b x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S
This commit replaces two usages of SSE2 'movups' with AVX 'vmovdqu'.

it could potentially be dangerous to use SSE2 if this function is ever
called without using 'vzeroupper' beforehand. While compilers appear
to use 'vzeroupper' before function calls if AVX2 has been used, using
SSE2 here is more brittle. Since it is not absolutely necessary it
should be avoided.

It costs 2-extra bytes but the extra bytes should only eat into
alignment padding.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-10-23 13:02:42 -05:00
H.J. Lu
d8e7d06381 bench-math: Sort and put each bench per line
Sort and put each math bench per line to prepare for new math benches.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-10-23 05:20:25 -07:00
Sunil K Pandey
4f690aad9e x86_64: Add missing libmvec ABI tests
Add vector ABI tests for cos, exp, log, pow and sin functions.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-10-22 06:46:49 -07:00
Adhemerval Zanella
927246e188 elf: Fix e6fd79f379 build with --enable-tunables=no
The _dl_sort_maps_init() is not defined when tunables is not enabled.

Checked on x86_64-linux-gnu.
2021-10-21 17:26:32 -03:00
Chung-Lin Tang
15a0c5730d elf: Fix slow DSO sorting behavior in dynamic loader (BZ #17645)
This second patch contains the actual implementation of a new sorting algorithm
for shared objects in the dynamic loader, which solves the slow behavior that
the current "old" algorithm falls into when the DSO set contains circular
dependencies.

The new algorithm implemented here is simply depth-first search (DFS) to obtain
the Reverse-Post Order (RPO) sequence, a topological sort. A new l_visited:1
bitfield is added to struct link_map to more elegantly facilitate such a search.

The DFS algorithm is applied to the input maps[nmap-1] backwards towards
maps[0]. This has the effect of a more "shallow" recursion depth in general
since the input is in BFS. Also, when combined with the natural order of
processing l_initfini[] at each node, this creates a resulting output sorting
closer to the intuitive "left-to-right" order in most cases.

Another notable implementation adjustment related to this _dl_sort_maps change
is the removing of two char arrays 'used' and 'done' in _dl_close_worker to
represent two per-map attributes. This has been changed to simply use two new
bit-fields l_map_used:1, l_map_done:1 added to struct link_map. This also allows
discarding the clunky 'used' array sorting that _dl_sort_maps had to sometimes
do along the way.

Tunable support for switching between different sorting algorithms at runtime is
also added. A new tunable 'glibc.rtld.dynamic_sort' with current valid values 1
(old algorithm) and 2 (new DFS algorithm) has been added. At time of commit
of this patch, the default setting is 1 (old algorithm).

Signed-off-by: Chung-Lin Tang  <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-21 11:23:53 -03:00
Chung-Lin Tang
e6fd79f379 elf: Testing infrastructure for ld.so DSO sorting (BZ #17645)
This is the first of a 2-part patch set that fixes slow DSO sorting behavior in
the dynamic loader, as reported in BZ #17645. In order to facilitate such a
large modification to the dynamic loader, this first patch implements a testing
framework for validating shared object sorting behavior, to enable comparison
between old/new sorting algorithms, and any later enhancements.

This testing infrastructure consists of a Python script
scripts/dso-ordering-test.py' which takes in a description language, consisting
of strings that describe a set of link dependency relations between DSOs, and
generates testcase programs and Makefile fragments to automatically test the
described situation, for example:

  a->b->c->d          # four objects linked one after another

  a->[bc]->d;b->c     # a depends on b and c, which both depend on d,
                      # b depends on c (b,c linked to object a in fixed order)

  a->b->c;{+a;%a;-a}  # a, b, c serially dependent, main program uses
                      # dlopen/dlsym/dlclose on object a

  a->b->c;{}!->[abc]  # a, b, c serially dependent; multiple tests generated
                      # to test all permutations of a, b, c ordering linked
                      # to main program

 (Above is just a short description of what the script can do, more
  documentation is in the script comments.)

Two files containing several new tests, elf/dso-sort-tests-[12].def are added,
including test scenarios for BZ #15311 and Redhat issue #1162810 [1].

Due to the nature of dynamic loader tests, where the sorting behavior and test
output occurs before/after main(), generating testcases to use
support/test-driver.c does not suffice to control meaningful timeout for ld.so.
Therefore a new utility program 'support/test-run-command', based on
test-driver.c/support_test_main.c has been added. This does the same testcase
control, but for a program specified through a command-line rather than at the
source code level. This utility is used to run the dynamic loader testcases
generated by dso-ordering-test.py.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1162810

Signed-off-by: Chung-Lin Tang  <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-21 11:23:53 -03:00
Stafford Horne
0ff2d30dae iconv: Use TIMEOUTFACTOR for iconv test timeout
Currently the timeout for each iconv test is hard coded to 3 seconds.
On my OpenRISC test platform this is too slow and the test fails with a
HANG error.

This change uses the available TIMEOUTFACTOR to compute the timeout.
The default value is still 3.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-21 11:11:07 -03:00
Adhemerval Zanella
4e32c8f568 posix: Remove alloca usage for internal fnmatch implementation
This patch replaces the internal fnmatch pattern list generation
to use a dynamic array.

Checked on x86_64-linux-gnu.
2021-10-21 10:30:31 -03:00
Jonathan Wakely
8a9a593115 Add alloc_align attribute to memalign et al
GCC 4.9.0 added the alloc_align attribute to say that a function
argument specifies the alignment of the returned pointer. Clang supports
the attribute too. Using the attribute can allow a compiler to generate
better code if it knows the returned pointer has a minimum alignment.
See https://gcc.gnu.org/PR60092 for more details.

GCC implicitly knows the semantics of aligned_alloc and posix_memalign,
but not the obsolete memalign. As a result, GCC generates worse code
when memalign is used, compared to aligned_alloc.  Clang knows about
aligned_alloc and memalign, but not posix_memalign.

This change adds a new __attribute_alloc_align__ macro to <sys/cdefs.h>
and then uses it on memalign (where it helps GCC) and aligned_alloc
(where GCC and Clang already know the semantics, but it doesn't hurt)
and xposix_memalign. It can't be used on posix_memalign because that
doesn't return a pointer (the allocated pointer is returned via a void**
parameter instead).

Unlike the alloc_size attribute, alloc_align only allows a single
argument. That means the new __attribute_alloc_align__ macro doesn't
really need to be used with double parentheses to protect a comma
between its arguments. For consistency with __attribute_alloc_size__
this patch defines it the same way, so that double parentheses are
required.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-10-21 00:19:20 +01:00
Fangrui Song
aa783f9a7b linux: Fix a possibly non-constant expression in _Static_assert
According to C11 6.6p6, `const int` as an operand may not make up a
constant expression. GCC -O0 errors:

../sysdeps/unix/sysv/linux/opendir.c:107:19: error: static_assert expression is not an integral constant expression
  _Static_assert (allocation_size >= sizeof (struct dirent64),

-O2 -Wpedantic has a similar warning.
See https://gcc.gnu.org/PR102502 for GCC's inconsistency.

Use enum which is guaranteed to be a constant expression.
This also makes the file compilable with Clang.

Fixes: 4b962c9e85 ("linux: Simplify opendir buffer allocation")
2021-10-20 14:22:43 -07:00