Commit Graph

37269 Commits

Author SHA1 Message Date
Florian Weimer
1c75f89613 Linux: Explicitly disable cancellation checking in the dynamic loader
Historically, SINGLE_THREAD_P is defined to 1 in the dynamic loader.
This has the side effect of disabling cancellation points.  In order
to enable future use of SINGLE_THREAD_P for single-thread
optimizations in the dynamic loader (which becomes important once
more code is moved from libpthread), introduce a new
NO_SYSCALL_CANCEL_CHECKING macro which is always 1 for IS_IN (rtld),
indepdently of the actual SINGLE_THREAD_P value.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-10 10:31:41 +02:00
Florian Weimer
321789f61a nptl: Export __libc_multiple_threads from libc as an internal symbol
This allows the elimination of the __libc_multiple_threads_ptr
variable in libpthread and its initialization procedure.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-10 10:31:41 +02:00
Florian Weimer
d6163dfd38 elf, nptl: Resolve recursive lock implementation early
If libpthread is included in libc, it is not necessary to delay
initialization of the lock/unlock function pointers until libpthread
is loaded.  This eliminates two unprotected function pointers
from _rtld_global and removes some initialization code from
libpthread.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-10 10:31:41 +02:00
Florian Weimer
a64af8c9b6 scripts/versions.awk: Add strings and hashes to <first-versions.h>
This generates new macros of this from:

They are useful for symbol lookups using _dl_lookup_direct.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-10 10:31:41 +02:00
Florian Weimer
9637e5669b Hurd: Add missing hidden proto definition for __ttyname_r 2021-05-10 10:29:36 +02:00
Noah Goldstein
104c7b1967 x86: Add EVEX optimized memchr family not safe for RTM
No bug.

This commit adds a new implementation for EVEX memchr that is not safe
for RTM because it uses vzeroupper. The benefit is that by using
ymm0-ymm15 it can use vpcmpeq and vpternlogd in the 4x loop which is
faster than the RTM safe version which cannot use vpcmpeq because
there is no EVEX encoding for the instruction. All parts of the
implementation aside from the 4x loop are the same for the two
versions and the optimization is only relevant for large sizes.

Tigerlake:
size  , algn  , Pos   , Cur T , New T , Win     , Dif
512   , 6     , 192   , 9.2   , 9.04  , no-RTM  , 0.16
512   , 7     , 224   , 9.19  , 8.98  , no-RTM  , 0.21
2048  , 0     , 256   , 10.74 , 10.54 , no-RTM  , 0.2
2048  , 0     , 512   , 14.81 , 14.87 , RTM     , 0.06
2048  , 0     , 1024  , 22.97 , 22.57 , no-RTM  , 0.4
2048  , 0     , 2048  , 37.49 , 34.51 , no-RTM  , 2.98   <--

Icelake:
size  , algn  , Pos   , Cur T , New T , Win     , Dif
512   , 6     , 192   , 7.6   , 7.3   , no-RTM  , 0.3
512   , 7     , 224   , 7.63  , 7.27  , no-RTM  , 0.36
2048  , 0     , 256   , 8.48  , 8.38  , no-RTM  , 0.1
2048  , 0     , 512   , 11.57 , 11.42 , no-RTM  , 0.15
2048  , 0     , 1024  , 17.92 , 17.38 , no-RTM  , 0.54
2048  , 0     , 2048  , 30.37 , 27.34 , no-RTM  , 3.03   <--

test-memchr, test-wmemchr, and test-rawmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-08 16:26:30 -04:00
Alice Xu
6ea916adfa x86-64: Fix an unknown vector operation in memchr-evex.S
An unknown vector operation occurred in commit 2a76821c30. Fixed it
by using "ymm{k1}{z}" but not "ymm {k1} {z}".

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-07 19:03:21 -07:00
Raoni Fassina Firmino
17a73a6d8b powerpc64le: Fix ifunc selection for memset, memmove, bzero and bcopy
The hwcap2 check for the aforementioned functions should check for
both PPC_FEATURE2_ARCH_3_1 and PPC_FEATURE2_HAS_ISEL but was
mistakenly checking for any one of them, enabling isa 3.1 version of
the functions in incompatible processors, like POWER8.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2021-05-07 15:52:23 -03:00
H.J. Lu
310be3cc09 malloc: Make tunable callback functions static
Since malloc tunable callback functions are only used within the same
file, we should make them static.
2021-05-07 11:11:46 -07:00
Érico Nogueira
05ae46ee7a linux: implement ttyname as a wrapper around ttyname_r.
Big win in binary size and avoids duplicating the logic in multiple
places.

On x86_64, dropped from 1883206 to 1881790, a 1416 byte decrease.

Also changed logic to track if ttyname_buf has been allocated by
checking if it's NULL instead of tracking buflen as an additional
variable.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-07 13:56:02 -03:00
Érico Nogueira
0fb3dadca2 linux: use fd_to_filename instead of _fitoa_word in ttyname_r.
Simplifies the logic and makes intent clearer, while at the same time
decreasing binary size.

On x86_64, dropped from 1883270 to 1883206, a 64 byte decrease.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-07 13:54:43 -03:00
Érico Nogueira
330001202a misc: use _fitoa_word to implement __fd_to_filename.
In a default build for x86_64, size decreased by 24 bytes:
1883294 to 1883270.

Aditionally, avoids repeating the number printing logic in multiple
places.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-07 13:54:36 -03:00
Adhemerval Zanella
f13fb81ad3 linux: Remove /proc/cpuinfo fallback on alpha and sparc
There is no much gain in fallback to cpuinfo if sysfs is no present,
usually on restricted environment neither will be present.  It also
simplifies the code and make all architecture use the sched_getaffinity
as the sysfs fallback.

Checked on sparc64-linux-gnu.
2021-05-07 13:54:11 -03:00
Adhemerval Zanella
903bc7dcc2 linux: Use sched_getaffinity for __get_nprocs (BZ #27645)
Both the sysfs and procfs parsing (through GET_NPROCS_PARSER) are
removed in favor the syscall.  The initial scratch buffer should
fit to most of the common usage (1024 bytes with maps to 8192 CPUs).

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
2021-05-07 13:54:09 -03:00
Adhemerval Zanella
db373e4c57 Remove architecture specific sched_cpucount optimizations
And replace the generic algorithm with the Brian Kernighan's one.
GCC optimize it with popcnt if the architecture supports, so there
is no need to add the extra POPCNT define to enable it.

This is really a micro-optimization that only adds complexity:
recent ABIs already support it (x86-64-v2 or power64le) and it
simplifies the code for internal usage, since i686 does not allow an
internal iFUNC call.

Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.
2021-05-07 13:35:29 -03:00
H.J. Lu
69e0a5eb0d Run $(objpfx)iconvconfig with $(run-program-prefix) [BZ #27477]
When glibc is configured with --enable-hardcoded-path-in-tests,
"make xcheck" failed with

...
env GCONV_PATH=/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconvdata LOCPATH=/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/localedata LC_ALL=C  /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig  --output=$tmp --nostdlib /usr/lib64/gconv;
...
/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig)
...
FAIL: iconv/test-iconvconfig

Since $(objpfx)iconvconfig is an installed program, run it with
$(run-program-prefix).
2021-05-07 04:38:44 -07:00
Martin Sebor
3bf0b4f2cd Use the correct diagnostic macro. 2021-05-06 13:38:44 -06:00
Martin Sebor
26492c0a14 Annotate additional APIs with GCC attribute access.
This change continues the improvements to compile-time out of bounds
checking by decorating more APIs with either attribute access, or by
explicitly providing the array bound in APIs such as tmpnam() that
expect arrays of some minimum size as arguments.  (The latter feature
is new in GCC 11.)

The only effects of the attribute and/or the array bound is to check
and diagnose calls to the functions that fail to provide a sufficient
number of elements, and the definitions of the functions that access
elements outside the specified bounds.  (There is no interplay with
_FORTIFY_SOURCE here yet.)

Tested with GCC 7 through 11 on x86_64-linux.
2021-05-06 11:01:05 -06:00
Florian Weimer
3f0808ef4c nptl: Move pthread_barrierattr_setpshared into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
39e74af22e nptl: Move pthread_barrierattr_getpshared into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
e731212bc3 nptl: Move pthread_barrierattr_init into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
bbacf0f56c nptl: Move pthread_barrierattr_destroy into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
b9aec0dd9f nptl: Move pthread_barrier_wait into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
f1af331c4e nptl: Move pthread_barrier_init into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
43b3746aff nptl: Move pthread_barrier_destroy into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-06 15:56:37 +02:00
Florian Weimer
5633541d3b nptl: Move sem_trywait, sem_wait into libc
The symbols were moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:53 +02:00
Florian Weimer
990c8ffd3a nptl: Move sem_unlink into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

A small adjust to the sem_unlink implementation is necessary to avoid
a check-localplt failure.

A placeholder symbol to keep the GLIBC_2.1.1 version alive in
libpthread is added with this commit.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:50 +02:00
Florian Weimer
018c75dcb1 nptl: Move sem_timedwait into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:48 +02:00
Florian Weimer
793042c63c nptl: Move sem_post into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:47 +02:00
Florian Weimer
1ae60ae74f nptl: Move sem_init into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:45 +02:00
Florian Weimer
61878689c2 nptl: Move sem_getvalue into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:43 +02:00
Florian Weimer
4b729cca87 nptl: Move sem_destroy into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:40 +02:00
Florian Weimer
0b7d48d106 nptl: Move sem_close, sem_open into libc
The symbols were moved using move-symbol-to-libc.py.

Both functions are moved at the same time because they depend
on internal functions in sysdeps/pthread/sem_routines.c, which
are moved in this commit as well.  Additional hidden prototypes
are required to avoid check-localplt failures.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:38 +02:00
Florian Weimer
19cc20ef2e nptl: Move sem_clockwait into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

A new placeholder version is added at version GLIBC_2.30, to
preserve that version in libpthread.so.0.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:35 +02:00
Florian Weimer
ce0b7961ae nptl: Consolidate async cancel enable/disable implementation in libc
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt.  This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.

The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros.  They call the
__pthread_* functions unconditionally now.  The macros are still
needed because shared code uses them; Hurd has different definitions.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:32 +02:00
Florian Weimer
0197c1bc60 nptl: Move pthread_testcancel into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

A temporary __pthread_testcancel@@GLIBC_PRIVATE export is created
because it is needed by the semaphore implementation.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:20 +02:00
Florian Weimer
7cbf1c8416 elf, nptl: Initialize static TLS directly in ld.so
The stack list is available in ld.so since commit
1daccf403b ("nptl: Move stack list
variables into _rtld_global"), so it's possible to walk the stack
list directly in ld.so and perform the initialization there.

This eliminates an unprotected function pointer from _rtld_global
and reduces the libpthread initialization code.
2021-05-05 06:20:31 +02:00
Florian Weimer
2c71177309 posix: Fix Hurd build failure in tst-execveat
This avoids a -Werror compilation failure due to unused local
variables.
2021-05-04 16:03:07 +02:00
Noah Goldstein
2a76821c30 x86: Optimize memchr-evex.S
No bug. This commit optimizes memchr-evex.S. The optimizations include
replacing some branches with cmovcc, avoiding some branches entirely
in the less_4x_vec case, making the page cross logic less strict,
saving some ALU in the alignment process, and most importantly
increasing ILP in the 4x loop. test-memchr, test-rawmemchr, and
test-wmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-03 21:18:03 -04:00
Noah Goldstein
acfd088a19 x86: Optimize memchr-avx2.S
No bug. This commit optimizes memchr-avx2.S. The optimizations include
replacing some branches with cmovcc, avoiding some branches entirely
in the less_4x_vec case, making the page cross logic less strict,
asaving a few instructions the in loop return loop. test-memchr,
test-rawmemchr, and test-wmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-03 21:17:21 -04:00
Érico Nogueira
77c1573dbc linux: use __fd_to_filename helper function instead of snprintf.
Change made to fchmodat and fexecve. There are tests using xasprintf
instead of this helper as well, but this commit doesn't touch them.
2021-05-03 16:46:10 -03:00
Alexandra Hájková
19d83270fc linux: Add execveat system call wrapper
It operates similar to execve and it is is already used to implement
fexecve without requiring /proc to be mounted.  However, different
than fexecve, if the syscall is not supported by the kernel an error
is returned instead of trying a fallback.

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-03 16:46:06 -03:00
Noah Goldstein
1427d28e30 Bench: Expand bench-memchr.c
No bug. This commit adds some additional cases for bench-memchr.c
including testing medium sizes and testing short length with both an
inbound match and out of bound match.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-05-03 10:18:11 -07:00
Lirong Yuan
7b414d6e7b locale: Align _nl_C_LC_CTYPE_class and _nl_C_LC_CTYPE_class32
Otherwise, programs that use character classification macros such as
isspace may observe unaligned pointers.
2021-05-03 16:10:18 +02:00
Florian Weimer
dde76856ba nptl: Re-sort Versions file
Due to an incorrect conflict resolution, libc/GLIBC_2.2 section was no
longer sorted lexicographically.
2021-05-03 14:12:55 +02:00
H.J. Lu
cf2c57526b x86: Set rep_movsb_threshold to 2112 on processors with FSRM
The glibc memcpy benchmark on Intel Core i7-1065G7 (Ice Lake) showed
that REP MOVSB became faster after 2112 bytes:

                                      Vector Move       REP MOVSB
length=2112, align1=0, align2=0:        24.20             24.40
length=2112, align1=1, align2=0:        26.07             23.13
length=2112, align1=0, align2=1:        27.18             28.13
length=2112, align1=1, align2=1:        26.23             25.16
length=2176, align1=0, align2=0:        23.18             22.52
length=2176, align1=2, align2=0:        25.45             22.52
length=2176, align1=0, align2=2:        27.14             27.82
length=2176, align1=2, align2=2:        22.73             25.56
length=2240, align1=0, align2=0:        24.62             24.25
length=2240, align1=3, align2=0:        29.77             27.15
length=2240, align1=0, align2=3:        35.55             29.93
length=2240, align1=3, align2=3:        34.49             25.15
length=2304, align1=0, align2=0:        34.75             26.64
length=2304, align1=4, align2=0:        32.09             22.63
length=2304, align1=0, align2=4:        28.43             31.24

Use REP MOVSB for data size > 2112 bytes in memcpy on processors with
fast short REP MOVSB (FSRM).

	* sysdeps/x86/dl-cacheinfo.h (dl_init_cacheinfo): Set
	rep_movsb_threshold to 2112 on processors with fast short REP
	MOVSB (FSRM).
2021-05-03 05:08:22 -07:00
H.J. Lu
98544f5bcf bench-memcpy: Collect data from 2KB to 4KB
Collect data on memcpy from 2KB to 4KB with the 64-byte increment value.
2021-05-03 05:08:22 -07:00
Alyssa Ross
b03e4d7bd2 stdio: fix vfscanf with matches longer than INT_MAX (bug 27650)
Patterns like %*[ can safely be used to match a great many characters,
and it's quite realisitic to use them for more than INT_MAX characters
from an IO stream.

With the previous approach, after INT_MAX characters (v)fscanf would
return successfully, indicating an end to the match, even though there
wasn't one.
2021-05-03 10:34:11 +02:00
Florian Weimer
c2fd60a586 nptl: Move pthread_yield into libc, as a compatibility symbol
And deprecate it in <pthread.h>, redirecting it to sched_yield
for the time being.

The symbol was moved using scripts/move-symbol-to-libc.py.

No GLIBC_2.34 symbol version is added because of the compatibility
symbol status.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-05-03 09:23:44 +02:00
Florian Weimer
0505ae4e3b nptl: Move pthread_rwlockattr_setpshared into libc
The symbol was moved using scripts/move-symbol-to-libc.py.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-05-03 09:18:45 +02:00