Commit Graph

2821 Commits

Adhemerval Zanella
89b53077d2 nptl: Fix Race conditions in pthread cancellation [BZ#12683]
The current racy approach is to enable asynchronous cancellation
before making the syscall, restore the previous cancellation type once
the syscall returns, and check whether cancellation has happened
during the cancellation entrypoint.
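
For illustration, a minimal sketch of that pattern (not the actual
macro expansion; LIBC_CANCEL_ASYNC, LIBC_CANCEL_RESET, and
INLINE_SYSCALL_CALL are the glibc-internal names, used here loosely):

  /* Old, racy shape of a cancellable syscall wrapper: asynchronous
     cancellation is enabled around the kernel entry, so a cancellation
     signal that lands after the syscall has returned but before
     LIBC_CANCEL_RESET runs can discard the result (problem 1), and any
     signal handler that runs while blocked executes with asynchronous
     cancellation enabled (problem 2).  */
  ssize_t
  __read_cancellable (int fd, void *buf, size_t nbytes)
  {
    int oldtype = LIBC_CANCEL_ASYNC ();
    ssize_t result = INLINE_SYSCALL_CALL (read, fd, buf, nbytes);
    LIBC_CANCEL_RESET (oldtype);
    return result;
  }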

As described in BZ#12683, this approach shows 2 problems:

  1. Cancellation can act after the syscall has returned from the
     kernel, but before userspace saves the return value.  This might
     result in a resource leak if the syscall allocated a resource or
     had a side effect (partial read/write), and there is no way for
     the program to handle it with cancellation handlers.

  2. If a signal is handled while the thread is blocked at a cancellable
     syscall, the entire signal handler runs with asynchronous
     cancellation enabled.  This can lead to issues if the signal
     handler calls functions which are async-signal-safe but not
     async-cancel-safe.

For the cancellation to work correctly, there are 5 points at which the
cancellation signal could arrive:

	[ ... )[ ... )[ syscall ]( ...
	   1      2        3    4   5

  1. Before initial testcancel, e.g. [*... testcancel)
  2. Between testcancel and syscall start, e.g. [testcancel...syscall start)
  3. While syscall is blocked and no side effects have yet taken
     place, e.g. [ syscall ]
  4. Same as 3 but with side-effects having occurred (e.g. a partial
     read or write).
  5. After the syscall ends, e.g. (syscall end...*]

libc wants to act on cancellation in cases 1, 2, and 3, but not in
cases 4 or 5.  For cases 4 and 5, the cancellation will eventually
happen at the next cancellable entrypoint without any further external
event.

The proposed solution for each case is:

  1. Do a conditional branch based on whether the thread has received
     a cancellation request;

  2. It can be caught by the signal handler determining that the saved
     program counter (from the ucontext_t) is in some address range
     beginning just before the "testcancel" and ending with the
     syscall instruction (see the handler sketch after this list).

  3. SIGCANCEL can be caught by the signal handler and determine that
     the saved program counter (from the ucontext_t) is in the address
     range beginning just before "testcancel" and ending with the first
     uninterruptible (via a signal) syscall instruction that enters the
     kernel.

  4. In this case, except for certain syscalls that ALWAYS fail with
     EINTR even for non-interrupting signals, the kernel will reset
     the program counter to point at the syscall instruction during
     signal handling, so that the syscall is restarted when the signal
     handler returns.  So, from the signal handler's standpoint, this
     looks the same as case 2, and thus it's taken care of.

  5. For syscalls with side-effects, the kernel cannot restart the
     syscall; when it's interrupted by a signal, the kernel must cause
     the syscall to return with whatever partial result is obtained
     (e.g. partial read or write).

  6. The saved program counter points just after the syscall
     instruction, so the signal handler won't act on cancellation.
     This is similar to case 4, since the program counter is past the
     syscall instruction.
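
A rough sketch of the program-counter test described in items 2-4
(names such as __syscall_cancel_arch_start/_end, ucontext_get_pc, and
cancel_enabled_and_canceled are placeholders following this
description; the real handler is part of the fixes listed next):

  #include <signal.h>
  #include <stdint.h>
  #include <ucontext.h>

  /* Markers delimiting the arch syscall-cancel window (assumed names).  */
  extern const char __syscall_cancel_arch_start[];
  extern const char __syscall_cancel_arch_end[];

  static void
  sigcancel_handler (int sig, siginfo_t *si, void *ctx)
  {
    ucontext_t *uc = ctx;
    /* ucontext_get_pc () is a placeholder for the arch-specific way of
       reading the saved program counter from the ucontext_t.  */
    uintptr_t pc = ucontext_get_pc (uc);

    /* Act on the cancellation only if the thread has cancellation
       enabled and pending, and it was interrupted before the syscall
       instruction could commit any work (cases 1-3).  For cases 4 and 5
       the PC is already past the marked window, so nothing is done here
       and the cancellation is left for the next cancellable
       entrypoint.  */
    if (cancel_enabled_and_canceled (THREAD_SELF)  /* cancelhandling bits */
        && pc >= (uintptr_t) __syscall_cancel_arch_start
        && pc < (uintptr_t) __syscall_cancel_arch_end)
      __syscall_do_cancel ();   /* acts on the cancellation; does not return */
  }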

So the proposed fixes are:

  1. Remove the enable_asynccancel/disable_asynccancel function usage
     from the cancellable syscall definitions and instead make them call
     a common symbol that will check whether cancellation is enabled
     (__syscall_cancel at nptl/cancellation.c), call the arch-specific
     cancellable entry-point (__syscall_cancel_arch), and cancel the
     thread when required.

  2. Provide an arch-specific generic system call wrapper function
     that contains global markers.  These markers will be used by the
     SIGCANCEL signal handler to check whether the interruption
     happened inside a valid syscall and whether the syscall has
     side-effects.

     A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
     is provided (a sketch modeled on it follows this list).  However,
     the markers may not end up at the expected places depending on how
     INTERNAL_SYSCALL_NCS is implemented by the architecture.  It is
     expected that all architectures add an arch-specific
     implementation.

  3. Rewrite the SIGCANCEL asynchronous handler to check both the
     cancellation type and whether the current IP from the signal
     handler's context falls between the global markers, and act
     accordingly.

  4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET
     with calls to the appropriate cancellable syscalls.

  5. Adjust 'lowlevellock-futex.h' arch-specific implementations to
     provide cancelable futex calls.
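
A condensed C sketch of the generic wrapper from fix 2, modeled on the
sysdeps/unix/sysv/linux/syscall_cancel.c reference implementation
(marker emission, the cancelhandling bit test, and the helper names
are simplified and partly assumed here):

  long int
  __syscall_cancel_arch (volatile int *cancelhandling, long int nr,
                         long int a1, long int a2, long int a3,
                         long int a4, long int a5, long int a6)
  {
    /* Global marker: the SIGCANCEL handler may act on cancellation
       from here on.  */
    asm volatile (".globl __syscall_cancel_arch_start\n\t"
                  "__syscall_cancel_arch_start:");

    /* Cases 1/2: the cancellation bit is already set, so cancel
       before entering the kernel.  */
    if (*cancelhandling & CANCELED_BITMASK)
      __syscall_do_cancel ();

    long int result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3,
                                                 a4, a5, a6);

    /* Global marker: past this point the syscall has returned
       (cases 4/5) and the handler must no longer act.  As noted in
       fix 2, the compiler may not place the markers exactly where
       expected depending on how INTERNAL_SYSCALL_NCS is implemented,
       which is why arch-specific assembly versions are expected.  */
    asm volatile (".globl __syscall_cancel_arch_end\n\t"
                  "__syscall_cancel_arch_end:");
    return result;
  }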

Some architectures require specific support on syscall handling:

  * On i386 the syscall cancel bridge needs to use the old int80
    instruction because, with the optimized vDSO symbol, the resulting
    PC value for an interrupted syscall points to an address outside
    the expected markers in __syscall_cancel_arch.  It has been
    discussed on LKML [1] how the kernel could help userland accomplish
    this, but as far as I know the discussion has stalled.

    Also, sysenter should not be used directly by libc since its calling
    convention is set by the kernel depending on the underlying x86 chip
    (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60).

  * mips o32 is the only kABI that requires 7-argument syscalls, and to
    avoid adding a requirement on all architectures to support them,
    mips support is added with extra internal defines.

Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu,
powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and
x86_64-linux-gnu.

[1] https://lkml.org/lkml/2016/3/8/1105
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-08-23 14:27:43 -03:00
Maciej W. Rozycki
bea2ad022d nptl: Fix stray process left by tst-cancel7 blocking testing
Fix an issue with commit b74121ae4b ("Update.") and prevent a stray
process from being left behind by tst-cancel7 (and also tst-cancelx7,
which is the same test built with '-fexceptions' additionally supplied
to the compiler), which then blocks remote testing until the process has
been killed by hand.

This test case creates a thread that runs an extra copy of the test via
system(3) and using the '--direct' option so that the test wrapper does
not interfere with this instance.  This extra copy executes its business
and calls sigsuspend(2) and then never terminates by itself.  Instead it
relies on being killed by the main test process directly via a thread
cancellation request or, should that fail, by issuing SIGKILL either at
the conclusion of 'do_test' or by the test driver via 'do_cleanup' where
the test timeout has been hit or the test driver interrupted.

However if the main test process has been instead killed by a signal,
such as due to incorrect execution, before it had a chance to kill the
extra copy of the test case, then the test wrapper will terminate
without running 'do_cleanup' and consequently the extra copy of the test
case will remain forever in its suspended state, and in the remote case
in particular it means that the remote test wrapper will wait forever
for the SSH command to complete.

This has been observed with the 'alpha-linux-gnu' target, where the main
test process triggers SIGSEGV and the test wrapper correctly records:

Didn't expect signal from child: got `Segmentation fault'

in nptl/tst-cancel7.out and terminates, but then the calling SSH command
continues waiting for the remaining process started in the same session
on the remote target to complete.

Address this problem by also registering 'do_cleanup' via atexit(3),
observing that 'support_delete_temp_files' is registered by the test
wrapper before the test initializing function 'do_prepare' is called and
that exit(3) calls the functions registered with atexit(3) in the
reverse of the order in which they were registered, so it is safe to
refer to 'pidfilename' in 'do_cleanup' invoked by exit(3) because by
that time the temporary files have not yet been deleted.

A minor inconvenience is that if 'signal_handler' is invoked in the test
wrapper as a result of SIGALRM rather than SIGINT, then 'do_cleanup'
will be called twice, once as a cleanup handler and again by exit(3).
In reality it is harmless though, because issuing SIGKILL is guarded by
a record lock, so if the first call has succeeded in killing the extra
copy of the test case, then the subsequent call will do nothing.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-08-07 19:46:21 +01:00
Maciej W. Rozycki
934ba77add nptl: Reorder semaphore release in tst-cancel7
Move the release of the semaphore used to synchronize between an extra
copy of the test run as a separate process and the main test process
until after the PID file has been locked.  It is so that if the cleanup
function gets called by the test driver due to premature termination of
the main test process, then the function does not get at the PID file
before it has been locked and conclude that the extra copy of the test
has already terminated.  This does not usually happen, because a
relatively long time has to elapse before the timeout triggers in the
test driver, but it will change with the next commit.

There is still a small time window remaining with this change in place
where the main test process gets killed for some reason between the
point where the extra copy of the test has already been started by
pthread_create(3) and a successful return from the call to sem_wait(3),
in which case the
cleanup function can be reached before PID has been written to the PID
file and the file locked.  It seems that with the test case structured
as it is now and PID-based process management we have no means to avoid
it.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-08-07 19:46:21 +01:00
Florian Weimer
fe06fb313b elf: Clarify and invert second argument of _dl_allocate_tls_init
Also remove an outdated comment: _dl_allocate_tls_init is
called as part of pthread_create.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-08-05 18:26:42 +02:00
Maciej W. Rozycki
4b2a1b602f
nptl: Convert tst-sem11 and tst-sem12 tests to use the test driver
Fix an issue with commit 2af4e3e566 ("Test of semaphores.") by making
the tst-sem11 and tst-sem12 tests use the test driver, preventing them
from ever causing testing to hang forever and never complete, such as
currently happening with the 'mips-linux-gnu' (o32 ABI) target.  Adjust
the name of the PREPARE macro, which clashes with the interpretation of
its presence by the test driver, by using a TF_ prefix in reference to
the name of the 'tf' function.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-12 20:41:08 +02:00
Maciej W. Rozycki
9d8995833e
nptl: Add copyright notice tst-sem11 and tst-sem12 tests
Add a copyright notice to the tst-sem11 and tst-sem12 tests, observing
that they have been originally contributed back in 2007, with commit
2af4e3e566 ("Test of semaphores.").
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-12 20:40:36 +02:00
H.J. Lu
ba144c179e Add --disable-static-c++-tests option [BZ #31797]
By default, if the C++ toolchain lacks support for static linking,
configure fails to find the C++ header files and the glibc build fails.
The --disable-static-c++-link-check option allows the glibc build to
finish, but static C++ tests will fail if the C++ toolchain doesn't
have the necessary static C++ libraries which may not be easily installed.
Add --disable-static-c++-tests option to skip the static C++ link check
and tests.  This fixes BZ #31797.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-07-02 00:51:34 -07:00
Carlos O'Donell
a7fe3e805d
Fix conditionals on mtrace-based tests (bug 31892)
The conditionals for several mtrace-based tests in catgets, elf, libio,
malloc, misc, nptl, posix, and stdio-common were incorrect leading to
test failures when bootstrapping glibc without perl.

The correct conditional for mtrace-based tests requires three checks:
first checking for run-built-tests, then build-shared, and lastly that
PERL is not equal to "no" (missing perl).
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-07-01 17:20:30 +02:00
H.J. Lu
42e48e720c nptl: Add tst-pthread-key1-static for BZ #21777
Add a static pthread test to verify that BZ #21777 is fixed.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-04-09 05:27:03 -07:00
Konstantin Kharlamov
fe00366b63 treewide: python-scripts: use is None for none-equality
Testing for `None`-ness with the `==` operator is frowned upon and
causes warnings in at least the "LGTM" python linter.  Fix that.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-02-23 08:50:00 -03:00
Adhemerval Zanella
460860f457 Remove ia64-linux-gnu
Linux 6.7 removed ia64 from the official tree [1], following the general
principle that a glibc port needs upstream support for the architecture
in all the components it depends on (binutils, GCC, and the Linux
kernel).

Apart from the removal of sysdeps/ia64 and sysdeps/unix/sysv/linux/ia64,
there are updates to various comments referencing ia64 for which removal
of those references seemed appropriate. The configuration is removed
from README and build-many-glibcs.py.

The CONTRIBUTED-BY, elf/elf.h, manual/contrib.texi (the porting
mention), *.po files, config.guess, and longlong.h are not changed.

For Linux, it allows cleaning up some clone2 support in multiple files.

The following bugs can be closed as WONTFIX: BZ 22634 [2], BZ 14250 [3],
BZ 21634 [4], BZ 10163 [5], BZ 16401 [6], and BZ 11585 [7].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43ff221426d33db909f7159fdf620c3b052e2d1c
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=22634
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=14250
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=21634
[5] https://sourceware.org/bugzilla/show_bug.cgi?id=10163
[6] https://sourceware.org/bugzilla/show_bug.cgi?id=16401
[7] https://sourceware.org/bugzilla/show_bug.cgi?id=11585
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-08 17:09:36 -03:00
Paul Eggert
dff8da6b3e Update copyright dates with scripts/update-copyrights 2024-01-01 10:53:40 -08:00
Florian Weimer
e21aa9b9cc nptl: Link tst-execstack-threads-mod.so with -z execstack
This ensures that the test still links with a linker that refuses
to create an executable stack marker automatically.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-11-20 09:22:25 +01:00
Florian Weimer
8c8eff33e4 nptl: Rename tst-execstack to tst-execstack-threads
So that the test is harder to confuse with elf/tst-execstack
(although the tests are supposed to be the same).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-11-20 09:22:21 +01:00
Adhemerval Zanella
fee9e40a8d malloc: Decorate malloc maps
Add anonymous mmap annotations to the loader malloc, to malloc when it
allocates memory with mmap, and to the malloc arenas.  /proc/self/maps
will now print:

   [anon: glibc: malloc arena]
   [anon: glibc: malloc]
   [anon: glibc: loader malloc]

On arena allocation, glibc annotates only the read/write mapping.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07 10:27:20 -03:00
Adhemerval Zanella
6afce56c19 nptl: Decorate thread stack on pthread_create
Linux 4.5 removed thread stack annotations due to the complexity of
computing them [1], and Linux added PR_SET_VMA_ANON_NAME on 5.17
as a way to name anonymous virtual memory areas.

This patch adds decoration to the stack created and used by
pthread_create; for a glibc-created thread stack, /proc/self/maps will
now show:

  [anon: glibc: pthread stack: <tid>]

And for user-provided stacks:

  [anon: glibc: pthread user stack: <tid>]

The guard page is not decorated, and the mapping name is cleared when
the thread finishes its execution (so the cached stack does not have any
name associated).
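
For reference, the kernel interface used for the decoration looks
roughly like this (standalone illustrative snippet, not the
glibc-internal code; it needs Linux 5.17+ and a kernel built with
anonymous-VMA naming):

  #include <sys/mman.h>
  #include <sys/prctl.h>
  #include <linux/prctl.h>   /* PR_SET_VMA, PR_SET_VMA_ANON_NAME */

  /* Attach a name to an anonymous mapping so it shows up in
     /proc/self/maps, e.g. as "[anon: glibc: pthread stack: <tid>]".
     On kernels without the feature the call fails and can be
     ignored.  */
  static void
  name_anonymous_mapping (void *addr, size_t len, const char *name)
  {
    (void) prctl (PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                  (unsigned long) addr, len, (unsigned long) name);
  }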

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

[1] 65376df582

Co-authored-by: Ian Rogers <irogers@google.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07 10:27:20 -03:00
Samuel Thibault
6333a6014f __call_tls_dtors: Use call_function_static_weak 2023-09-04 20:03:37 +02:00
Florian Weimer
2c6b4b272e nptl: Unconditionally use a 32-byte rseq area
If the kernel headers provide a larger struct rseq, we used that
size as the argument to the rseq system call.  As a result,
rseq registration would fail on older kernels which only accept
size 32.
2023-07-21 16:18:18 +02:00
Arsen Arsenović
3edca7f545
nptl: Make tst-tls3mod.so explicitly lazy
Fixes the following test-time errors, which lead to FAILs, on toolchains
that set -z now out of the box, such as the one used on Gentoo Hardened:

  .../build-x86-x86_64-pc-linux-gnu-nptl $ grep '' nptl/tst-tls3*.out
  nptl/tst-tls3.out:dlopen failed
  nptl/tst-tls3-malloc.out:dlopen failed

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2023-07-20 12:24:28 +02:00
Paul Eggert
3edc4ff2ce make ‘struct pthread’ a complete type
* nptl/descr.h (struct pthread): Remove end_padding member, which
made this type incomplete.
(PTHREAD_STRUCT_END_PADDING): Stop using end_padding.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-07-19 14:16:04 -07:00
Frédéric Bérat
8022fc7d51 tests: replace system by xsystem
With fortification enabled, the return value of system() calls needs to
be checked, as it gets the __wur macro applied.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-06-19 09:15:05 -04:00
Frédéric Bérat
20b6b8e8a5 tests: replace read by xread
With fortification enabled, the return value of read() calls needs to
be checked, as it gets the __wur macro applied.

Note on read call removal from  sysdeps/pthread/tst-cancel20.c and
sysdeps/pthread/tst-cancel21.c:
It is assumed that this second read call was there to overcome the race
condition between pipe closure and thread cancellation that could happen
in the original code. Since this race condition got fixed by
d0e3ffb7a5 the second call seems
superfluous. Hence, instead of checking for the return value of read, it
looks reasonable to simply remove it.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-06-19 09:14:56 -04:00
Paul Pluzhnikov
7f0d9e61f4 Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
Frédéric Bérat
026a84a54d tests: replace write by xwrite
Using write without checks leads to a warn-unused-result warning when
__wur is enabled.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-06-01 12:40:05 -04:00
Carlos O'Donell
b600f47758 nptl: Reformat Makefile.
Reflow all long lines adding comment terminators.
Rename files that cause inconsistent ordering.
Sort all reflowed text using scripts/sort-makefile-lines.py.

No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
2023-05-18 12:39:47 -04:00
Cupertino Miranda
b630be0922 Created tunable to force small pages on stack allocation.
Created tunable glibc.pthread.stack_hugetlb to control when hugepages
can be used for stack allocation.
In case THP is enabled and glibc.pthread.stack_hugetlb is set to 0,
glibc will madvise the kernel not to use hugepages for stack
allocations.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-04-20 13:54:24 -03:00
Adhemerval Zanella Netto
33237fe83d Remove --enable-tunables configure option
And make tunables always supported.  The configure option was added in
glibc 2.25 and some features require it (such as the hwcap mask, huge
pages support, and lock elision tuning).  It also simplifies the build
permutations.

Changes from v1:
 * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs
   more discussion.
 * Cleanup more code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-03-29 14:33:06 -03:00
Adhemerval Zanella Netto
88677348b4 Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions
They are both used by __libc_freeres to free all library malloc
allocated resources to help tooling like mtrace or valgrind with
memory leak tracking.

The current scheme uses assembly markers and linker script entries
to consolidate the free routine function pointers in the RELRO segment
and the to-be-freed buffers in BSS.

This patch changes it to use specific free functions for
libc_freeres_ptrs buffers and call the function pointer array directly
with call_function_static_weak.

It allows the removal of both the internal macros and the linker
script sections.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-27 13:57:55 -03:00
Andreas Schwab
359a0b9dbc Remove pthread-pi-defines.sym
It became unused with the removal of the assembler implementation of the
pthread functions.
2023-02-03 17:59:55 +01:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
YunQiang Su
a9acb7b39e Define in_int32_t_range to check if the 64 bit time_t syscall should be used
Currently glibc uses in_time_t_range to detect time_t overflow,
and if it occurs, falls back to the 64-bit syscall version.

The function name is confusing because internally time_t might be
either 32 bits or 64 bits (depending on __TIMESIZE).

This patch refactors this usage, replacing in_time_t_range with
in_int32_t_range for the case of checking whether the 64-bit time_t
syscall should be used.

The in_time_t_range check is still used to detect overflow of the
syscall return value.
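
A sketch of the renamed check (illustrative; the real helper operates
on the internal __time64_t type):

  #include <stdbool.h>
  #include <stdint.h>

  /* Return true if the 64-bit time value fits in a 32-bit time_t,
     i.e. the legacy (non-time64) syscall can be used for it.  */
  static inline bool
  in_int32_t_range (int64_t t)
  {
    int32_t s = t;
    return s == t;
  }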

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-11-17 14:35:13 -03:00
Florian Weimer
ee1ada1bdb elf: Rework exception handling in the dynamic loader [BZ #25486]
The old exception handling implementation used function interposition
to replace the dynamic loader implementation (no TLS support) with the
libc implementation (TLS support).  This results in problems if the
link order between the dynamic loader and libc is reversed (bug 25486).

The new implementation moves the entire implementation of the
exception handling functions back into the dynamic loader, using
THREAD_GETMEM and THREAD_SETMEM for thread-local data support.
This depends on Hurd support for these macros, added in commit
b65a82e4e7 ("hurd: Add THREAD_GET/SETMEM/_NC").

One small obstacle is that the exception handling facilities are used
before the TCB has been set up, so a check is needed if the TCB is
available.  If not, a regular global variable is used to store the
exception handling information.

Also rename dl-error.c to dl-catch.c, to avoid confusion with the
dlerror function.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-11-03 09:39:31 +01:00
Adhemerval Zanella
3d8b5dde87 nptl: Fix pthread_create.c build with clang
clang complains that libc_hidden_data_def (__nptl_threads_events)
creates an invalid alias:

  pthread_create.c:50:1: error: alias must point to a defined variable or function
  libc_hidden_data_def (__nptl_threads_events)
  ^
  ../include/libc-symbols.h:621:37: note: expanded from macro
  'libc_hidden_data_def'

It seems that clang requires a proper prototype to be defined prior to
the hidden alias creation.

Reviewed-by: Fangrui Song <maskray@google.com>
2022-11-01 09:51:10 -03:00
Adhemerval Zanella
9b5e138f2b linux: Avoid shifting a negative signed on POSIX timer interface
The current macros use pid as a signed value, which triggers a compiler
warning for process and thread timers.  Replace MAKE_PROCESS_CPUCLOCK
with a static inline function that expects the pid as unsigned.  This
is similar to what Linux does internally.
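
A sketch of the replacement (illustrative; the ((~pid) << 3) | clock
encoding mirrors the kernel's internal CPU-clock id layout):

  #include <time.h>

  /* Build a POSIX CPU clock id without shifting a negative signed
     value: the pid is taken as unsigned, so (~pid) << 3 is well
     defined.  */
  static inline clockid_t
  make_process_cpuclock (unsigned int pid, clockid_t clock)
  {
    return ((~pid) << 3) | clock;
  }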

Checked on x86_64-linux-gnu.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
2022-10-20 10:19:08 -03:00
Yu Chien Peter Lin
365b3af67e nptl: Convert tst-setuid2 to test-driver
Use <support/test-driver.c> and replace pthread calls with their
xpthread equivalents.

Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-03 11:19:36 -03:00
Wilco Dijkstra
22f4ab2d20 Use atomic_exchange_release/acquire
Rename atomic_exchange_rel/acq to use atomic_exchange_release/acquire
since these map to the standard C11 atomic builtins.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-26 16:58:08 +01:00
Wilco Dijkstra
4a07fbb689 Use C11 atomics instead of atomic_decrement_and_test
Replace atomic_decrement_and_test with atomic_fetch_add_relaxed.
These are simple counters which do not protect any shared data from
concurrent accesses. Also remove the unused file cond-perf.c.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Wilco Dijkstra
d1babeb32d Use C11 atomics instead of atomic_increment(_val)
Replace atomic_increment and atomic_increment_val with atomic_fetch_add_relaxed.
One case in sem_post.c uses release semantics (see comment above it).
The others are simple counters and do not protect any shared data from
concurrent accesses.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Wilco Dijkstra
8114b95cef Use C11 atomics instead of atomic_and/or
Replace the 4 uses of atomic_and and atomic_or with
atomic_fetch_and_acquire and atomic_fetch_or_acquire.  This preserves
the existing implied semantics; however, relaxed MO on FUTEX_OWNER_DIED
accesses may be correct.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Adhemerval Zanella Netto
de477abcaa Use '%z' instead of '%Z' on printf functions
The Z modifier is a nonstandard synonym for z (and predates z itself),
and the compiler might issue a warning for an invalid conversion
specifier.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-22 08:48:04 -03:00
Wilco Dijkstra
a30e960328 Use relaxed atomics since there is no MO dependence
Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
atomic_fetch_or_relaxed.  Using relaxed MO is correct since the
atomics are used to ensure memory is released only once.
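
An illustration of the idiom in standard C11 atomics (made-up names;
the relaxed order is enough because the fetch-or only decides which
thread performs the one-time release):

  #include <stdatomic.h>
  #include <stdlib.h>

  #define FREED_BITMASK 0x1u

  static _Atomic unsigned int state;

  /* Whichever caller observes the bit still clear in the old value is
     the one (and only one) that frees the resource.  */
  static void
  release_once (void *resource)
  {
    unsigned int old = atomic_fetch_or_explicit (&state, FREED_BITMASK,
                                                 memory_order_relaxed);
    if ((old & FREED_BITMASK) == 0)
      free (resource);
  }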

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-13 11:58:07 +01:00
Wilco Dijkstra
a364a3a709 Use C11 atomics instead of atomic_decrement(_val)
Replace atomic_decrement and atomic_decrement_val with
atomic_fetch_add_relaxed.

Reviewed-by: DJ Delorie <dj@redhat.com>
2022-09-09 14:22:26 +01:00
Adhemerval Zanella Netto
6f4e0fcfa2 stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417)
The implementation is based on scalar Chacha20 with per-thread cache.
It uses getrandom or /dev/urandom as fallback to get the initial entropy,
and reseeds the internal state on every 16MB of consumed buffer.

To improve performance and lower memory consumption, the per-thread
cache is allocated lazily on the first call to an arc4random function,
and if the memory allocation fails, getentropy or /dev/urandom is used
as a fallback.
The cache is also cleared on thread exit iff it was initialized (so if
arc4random is not called it is not touched).

Although it is lock-free, arc4random is still not async-signal-safe
(the per thread state is not updated atomically).

The ChaCha20 implementation is based on RFC8439 [1], omitting the final
XOR of the keystream with the plaintext because the plaintext is a
stream of zeros.  This strategy is similar to what OpenBSD arc4random
does.

The arc4random_uniform is based on previous work by Florian Weimer,
where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete
Uniform Generation from Coin Flips, and Applications (2013) [2], who
credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform
random number generation (1976), for solving the general case.

The main advantage of this method is that the unit of randomness is not
the uniform random variable (uint32_t), but a random bit.  It optimizes
the internal buffer sampling by initially consuming a 32-bit random
variable and then sampling byte by byte.  Depending on the upper bound
requested, it might lead to better CPU utilization.
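
A short usage example of the new interfaces (illustrative;
arc4random_uniform returns a value in [0, upper_bound) without modulo
bias):

  #include <inttypes.h>
  #include <stdio.h>
  #include <stdlib.h>

  int
  main (void)
  {
    unsigned char key[16];
    arc4random_buf (key, sizeof key);            /* fill a buffer with random bytes */

    uint32_t r = arc4random ();                  /* uniform 32-bit value */
    uint32_t die = arc4random_uniform (6) + 1;   /* unbiased result in 1..6 */

    printf ("r=%" PRIu32 " die=%" PRIu32 "\n", r, die);
    return 0;
  }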

Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.

Co-authored-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>

[1] https://datatracker.ietf.org/doc/html/rfc8439
[2] https://arxiv.org/pdf/1304.1916.pdf
2022-07-22 11:58:27 -03:00
Adhemerval Zanella
f27e5e2178 nptl: Fix ___pthread_unregister_cancel_restore asynchronous restore
This was due to a wrong revert done in 404656009b.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2022-07-13 10:44:13 -03:00
Adhemerval Zanella
e070501d12 Replace __libc_multiple_threads with __libc_single_threaded
This also fixes the SINGLE_THREAD_P macro for SINGLE_THREAD_BY_GLOBAL:
since the inclusion of single-thread.h is in the wrong order, the
define needs to come before including sysdeps/unix/sysdep.h.  The macro
is now moved to a per-arch single-thread.h header.

SINGLE_THREAD_P is now used in some more places.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.
2022-07-05 10:14:47 -03:00
Adhemerval Zanella
a1bdd81664 Refactor internal-signals.h
The main driver is to optimize the internal usage and the required
size when a sigset_t is embedded in other data structures.  On Linux,
the currently supported signal set requires up to 8 bytes (16 on mips),
which is lower than the user-visible sigset_t (128 bytes).

A new internal type internal_sigset_t is added, along with the
functions to operate on it similar to the ones for sigset_t.  The
internal-signals.h header is also refactored to remove unused
functions.
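
A sketch of the idea (illustrative; the real type and its helpers live
in the internal-signals.h header):

  #include <signal.h>   /* _NSIG */

  #define INTERNAL_NSIG_WORDS (_NSIG / (8 * sizeof (unsigned long int)))

  /* Signal set sized for the kernel's signal mask (8 bytes on most
     ABIs, 16 on mips) instead of the 128-byte userspace sigset_t.  */
  typedef struct
  {
    unsigned long int __val[INTERNAL_NSIG_WORDS];
  } internal_sigset_t;

  static inline void
  internal_sigaddset (internal_sigset_t *set, int sig)
  {
    unsigned int w = (sig - 1) / (8 * sizeof (unsigned long int));
    unsigned int b = (sig - 1) % (8 * sizeof (unsigned long int));
    set->__val[w] |= 1UL << b;
  }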

Besides smaller stack usage in some functions (posix_spawn, abort), it
lowers the size of struct pthread by about 120 bytes (112 on mips).

Checked on x86_64-linux-gnu.

Reviewed-by: Arjun Shankar <arjun@redhat.com>
2022-06-30 14:56:21 -03:00
Adhemerval Zanella
d55df811e9 nptl: Remove unused members from struct pthread
It removes both pid_ununsed and cpuclock_offset_ununsed, saving about
12 bytes from struct pthread.

Reviewed-by: Arjun Shankar <arjun@redhat.com>
2022-06-29 16:58:26 -03:00
Adhemerval Zanella
baf2a265c7 misc: Optimize internal usage of __libc_single_threaded
By adding an internal alias to avoid the GOT indirection.  On some
architectures, __libc_single_threaded may be accessed through copy
relocations, and thus the copies also need to be updated.

This is done by adding a new internal macro,
libc_hidden_data_{proto,def}, which has an additional argument that
specifies the alias name (instead of the default __GI_ one).

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Fangrui Song <maskray@google.com>
2022-06-24 17:45:58 -03:00
Adhemerval Zanella
c7d36dcecc nptl: Fix __libc_cleanup_pop_restore asynchronous restore (BZ#29214)
This was due to a wrong revert done in 404656009b.

Checked on x86_64-linux-gnu.
2022-06-08 09:23:02 -03:00
Wangyang Guo
8162147872 nptl: Add backoff mechanism to spinlock loop
When multiple threads are waiting for a lock at the same time, then
once the lock owner releases it, all waiters see the lock as available
and try to acquire it, which may cause an expensive CAS storm.

Binary exponential backoff with random jitter is introduced.  As the
number of try-lock attempts increases, it becomes more likely that many
threads are competing for the adaptive mutex lock, so the wait time
grows exponentially.  A random jitter is also added to avoid
synchronized try-lock attempts from other threads.
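
A standalone sketch of the scheme (illustrative; the glibc version is
per-arch in pthread_mutex_backoff.h and derives its jitter
differently):

  #include <stdint.h>

  /* Seed must be nonzero; one state per thread.  */
  static inline uint32_t
  xorshift32 (uint32_t *state)
  {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
  }

  /* Binary exponential backoff with random jitter: every failed round
     of try-locks doubles the spin cap (up to a limit), and a random
     count below the cap desynchronizes competing threads.  */
  static inline void
  backoff_spin (uint32_t *prng, unsigned int *rounds)
  {
    unsigned int cap = 1u << (*rounds < 10 ? *rounds : 10);
    unsigned int spins = xorshift32 (prng) % cap;
    for (unsigned int i = 0; i < spins; i++)
      __asm__ __volatile__ ("" ::: "memory");   /* stand-in for cpu_relax/pause */
    if (*rounds < 10)
      ++*rounds;
  }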

v2: Remove read-check before try-lock for performance.

v3:
1. Restore read-check since it works well in some platform.
2. Make backoff arch dependent, and enable it for x86_64.
3. Limit max backoff to reduce latency in large critical section.

v4: Fix strict-prototypes error in sysdeps/nptl/pthread_mutex_backoff.h

v5: Commit log updated for regression in large critical section.

Result of pthread-mutex-locks bench

Test Platform: Xeon 8280L (2 socket, 112 CPUs in total)
First Row: thread number
First Col: critical section length
Values: backoff vs upstream, time based, low is better

non-critical-length: 1
	1	2	4	8	16	32	64	112	140
0	0.99	0.58	0.52	0.49	0.43	0.44	0.46	0.52	0.54
1	0.98	0.43	0.56	0.50	0.44	0.45	0.50	0.56	0.57
2	0.99	0.41	0.57	0.51	0.45	0.47	0.48	0.60	0.61
4	0.99	0.45	0.59	0.53	0.48	0.49	0.52	0.64	0.65
8	1.00	0.66	0.71	0.63	0.56	0.59	0.66	0.72	0.71
16	0.97	0.78	0.91	0.73	0.67	0.70	0.79	0.80	0.80
32	0.95	1.17	0.98	0.87	0.82	0.86	0.89	0.90	0.90
64	0.96	0.95	1.01	1.01	0.98	1.00	1.03	0.99	0.99
128	0.99	1.01	1.01	1.17	1.08	1.12	1.02	0.97	1.02

non-critical-length: 32
	1	2	4	8	16	32	64	112	140
0	1.03	0.97	0.75	0.65	0.58	0.58	0.56	0.70	0.70
1	0.94	0.95	0.76	0.65	0.58	0.58	0.61	0.71	0.72
2	0.97	0.96	0.77	0.66	0.58	0.59	0.62	0.74	0.74
4	0.99	0.96	0.78	0.66	0.60	0.61	0.66	0.76	0.77
8	0.99	0.99	0.84	0.70	0.64	0.66	0.71	0.80	0.80
16	0.98	0.97	0.95	0.76	0.70	0.73	0.81	0.85	0.84
32	1.04	1.12	1.04	0.89	0.82	0.86	0.93	0.91	0.91
64	0.99	1.15	1.07	1.00	0.99	1.01	1.05	0.99	0.99
128	1.00	1.21	1.20	1.22	1.25	1.31	1.12	1.10	0.99

non-critical-length: 128
	1	2	4	8	16	32	64	112	140
0	1.02	1.00	0.99	0.67	0.61	0.61	0.61	0.74	0.73
1	0.95	0.99	1.00	0.68	0.61	0.60	0.60	0.74	0.74
2	1.00	1.04	1.00	0.68	0.59	0.61	0.65	0.76	0.76
4	1.00	0.96	0.98	0.70	0.63	0.63	0.67	0.78	0.77
8	1.01	1.02	0.89	0.73	0.65	0.67	0.71	0.81	0.80
16	0.99	0.96	0.96	0.79	0.71	0.73	0.80	0.84	0.84
32	0.99	0.95	1.05	0.89	0.84	0.85	0.94	0.92	0.91
64	1.00	0.99	1.16	1.04	1.00	1.02	1.06	0.99	0.99
128	1.00	1.06	0.98	1.14	1.39	1.26	1.08	1.02	0.98

There is a regression for large critical sections, but the adaptive
mutex is aimed at "quick" locks.  Small critical sections are more
common when users choose to use an adaptive pthread_mutex.

Signed-off-by: Wangyang Guo <wangyang.guo@intel.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-05-09 14:38:40 -07:00