glibc/sysdeps/generic/tls-internal.c

/* Per-thread state.  Generic version.
   Copyright (C) 2020-2024 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <https://www.gnu.org/licenses/>.  */

#include <string.h>
#include <tls-internal.h>

__thread struct tls_internal_t __tls_internal;

void
__glibc_tls_internal_free (void)
{
  free (__tls_internal.strsignal_buf);
  free (__tls_internal.strerror_l_buf);
}
string: Remove old TLS usage on strsignal The per-thread state is refactored two use two strategies: 1. The default one uses a TLS structure, which will be placed in the static TLS space (using __thread keyword). 2. Linux allocates via struct pthread and access it through THREAD_* macros. The default strategy has the disadvantage of increasing libc.so static TLS consumption and thus decreasing the possible surplus used in some scenarios (which might be mitigated by BZ#25051 fix). It is used only on Hurd, where accessing the thread storage in the in single thread case is not straightforward (afaiu, Hurd developers could correct me here). The fallback static allocation used for allocation failure is also removed: defining its size is problematic without synchronizing with translated messages (to avoid partial translation) and the resulting usage is not thread-safe. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> 2020-05-14 20:02:38 +00:00			`/* Per-thread state. Generic version.`
Update copyright dates with scripts/update-copyrights 2024-01-01 18:12:26 +00:00			`Copyright (C) 2020-2024 Free Software Foundation, Inc.`
string: Remove old TLS usage on strsignal The per-thread state is refactored two use two strategies: 1. The default one uses a TLS structure, which will be placed in the static TLS space (using __thread keyword). 2. Linux allocates via struct pthread and access it through THREAD_* macros. The default strategy has the disadvantage of increasing libc.so static TLS consumption and thus decreasing the possible surplus used in some scenarios (which might be mitigated by BZ#25051 fix). It is used only on Hurd, where accessing the thread storage in the in single thread case is not straightforward (afaiu, Hurd developers could correct me here). The fallback static allocation used for allocation failure is also removed: defining its size is problematic without synchronizing with translated messages (to avoid partial translation) and the resulting usage is not thread-safe. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> 2020-05-14 20:02:38 +00:00			`This file is part of the GNU C Library.`

			`The GNU C Library is free software; you can redistribute it and/or`
			`modify it under the terms of the GNU Lesser General Public`
			`License as published by the Free Software Foundation; either`
			`version 2.1 of the License, or (at your option) any later version.`

			`The GNU C Library is distributed in the hope that it will be useful,`
			`but WITHOUT ANY WARRANTY; without even the implied warranty of`
			`MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU`
			`Lesser General Public License for more details.`

			`You should have received a copy of the GNU Lesser General Public`
			`License along with the GNU C Library; if not, see`
			`<https://www.gnu.org/licenses/>. */`

stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) The implementation is based on scalar Chacha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy, and reseeds the internal state on every 16MB of consumed buffer. To improve performance and lower memory consumption the per-thread cache is allocated lazily on first arc4random functions call, and if the memory allocation fails getentropy or /dev/urandom is used as fallback. The cache is also cleared on thread exit iff it was initialized (so if arc4random is not called it is not touched). Although it is lock-free, arc4random is still not async-signal-safe (the per thread state is not updated atomically). The ChaCha20 implementation is based on RFC8439 [1], omitting the final XOR of the keystream with the plaintext because the plaintext is a stream of zeros. This strategy is similar to what OpenBSD arc4random does. The arc4random_uniform is based on previous work by Florian Weimer, where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete Uniform Generation from Coin Flips, and Applications (2013) [2], who credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform random number generation (1976), for solving the general case. The main advantage of this method is the that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending of the upper bound requested, it might lead to better CPU utilization. Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. Co-authored-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> [1] https://datatracker.ietf.org/doc/html/rfc8439 [2] https://arxiv.org/pdf/1304.1916.pdf 2022-07-21 13:04:59 +00:00			`#include <string.h>`
string: Remove old TLS usage on strsignal The per-thread state is refactored two use two strategies: 1. The default one uses a TLS structure, which will be placed in the static TLS space (using __thread keyword). 2. Linux allocates via struct pthread and access it through THREAD_* macros. The default strategy has the disadvantage of increasing libc.so static TLS consumption and thus decreasing the possible surplus used in some scenarios (which might be mitigated by BZ#25051 fix). It is used only on Hurd, where accessing the thread storage in the in single thread case is not straightforward (afaiu, Hurd developers could correct me here). The fallback static allocation used for allocation failure is also removed: defining its size is problematic without synchronizing with translated messages (to avoid partial translation) and the resulting usage is not thread-safe. Checked on x86-64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu, and s390x-linux-gnu. Tested-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> 2020-05-14 20:02:38 +00:00			`#include <tls-internal.h>`

			`__thread struct tls_internal_t __tls_internal;`
stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) The implementation is based on scalar Chacha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy, and reseeds the internal state on every 16MB of consumed buffer. To improve performance and lower memory consumption the per-thread cache is allocated lazily on first arc4random functions call, and if the memory allocation fails getentropy or /dev/urandom is used as fallback. The cache is also cleared on thread exit iff it was initialized (so if arc4random is not called it is not touched). Although it is lock-free, arc4random is still not async-signal-safe (the per thread state is not updated atomically). The ChaCha20 implementation is based on RFC8439 [1], omitting the final XOR of the keystream with the plaintext because the plaintext is a stream of zeros. This strategy is similar to what OpenBSD arc4random does. The arc4random_uniform is based on previous work by Florian Weimer, where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete Uniform Generation from Coin Flips, and Applications (2013) [2], who credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform random number generation (1976), for solving the general case. The main advantage of this method is the that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending of the upper bound requested, it might lead to better CPU utilization. Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. Co-authored-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> [1] https://datatracker.ietf.org/doc/html/rfc8439 [2] https://arxiv.org/pdf/1304.1916.pdf 2022-07-21 13:04:59 +00:00
			`void`
			`__glibc_tls_internal_free (void)`
			`{`
			`free (__tls_internal.strsignal_buf);`
			`free (__tls_internal.strerror_l_buf);`
			`}`