Previously, once we set up TLS, we would implicitly switch from using
__hurd_reply_port0 to reply_port inside the TCB, leaving the former
unused. But we never deallocated it, so it got leaked.
Instead, migrate the port into the new TCB's reply_port slot. This
avoids both the port leak and an extra syscall to create a new reply
port for the TCB.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-28-bugaevc@gmail.com>
If we're doing signals, that means we've already got the signal thread
running, and that implies TLS having been set up. So we know that
__hurd_local_reply_port will resolve to THREAD_SELF->reply_port, and can
access that directly using the THREAD_GETMEM and THREAD_SETMEM macros.
This avoids potential miscompilations, and should also be a tiny bit
faster.
Also, use mach_port_mod_refs () and not mach_port_destroy () to destroy
the receive right. mach_port_destroy () should *never* be used on
mach_task_self (); this can easily lead to port use-after-free
vulnerabilities if the task has any other references to the same port.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-26-bugaevc@gmail.com>
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
These are just regular local variables that are not accessed in any
funny ways, not even though a pointer. There's absolutely no reason to
declare them volatile. It only ends up hurting the quality of the
generated machine code.
If anything, it would make sense to decalre sigsp as *pointing* to
volatile memory (volatile void *sigsp), but evidently that's not needed
either.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230403115621.258636-2-bugaevc@gmail.com>
This is based on the Linux port's version, but laid out to match Mach's
struct i386_thread_state, much like the i386 version does.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
sysdeps/mach/hurd/htl/pt-pthread_self.c: New file.
htl/Makefile: .. Add it to libc routine.
sysdeps/mach/hurd/htl/pt-sysdep.c(__pthread_self): Remove it.
sysdeps/mach/hurd/htl/pt-sysdep.h(__pthread_self): Add hidden propertie.
htl/Versions(__pthread_self) Version it as private symbol.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-3-gfleury@disroot.org>
These do not need any changes to be used on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-20-bugaevc@gmail.com>
This is more correct, if only because these fields are defined as having
the type unsigned int in the Mach headers, so casting them to a signed
int and then back is suboptimal.
Also, remove an extra reassignment of uesp -- this is another remnant of
the ecx kludge.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-16-bugaevc@gmail.com>
There's nothing Mach- or Hurd-specific about it; any port that ends
up with rtld pulling in strncpy will need this.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-15-bugaevc@gmail.com>
This was used for the value of libc-lock's owner when TLS is not yet set
up, so THREAD_SELF can not be used. Since the value need not be anything
specific -- it just has to be non-NULL -- we can just use a plain
constant, such as (void *) 1, for this. This avoids accessing the symbol
through GOT, and exporting it from libc.so in the first place.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-12-bugaevc@gmail.com>
Noone is or should be using __hurd_threadvar_stack_{offset,mask}, we
have proper TLS now. These two remaining variables are never set to
anything other than zero, so any code that would try to use them as
described would just dereference a zero pointer and crash. So remove
them entirely.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-6-bugaevc@gmail.com>
Instead define the required fields in system dependend files. The only
system dependent definition is FILENAME_MAX, which should match POSIX
PATH_MAX, and it is obtained from either kernel UAPI or mach headers.
Currently set pre-defined value from current kernels.
It avoids a circular dependendy when including stdio.h in
gen-as-const-headers files.
Checked on x86_64-linux-gnu and i686-linux-gnu
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
They are both used by __libc_freeres to free all library malloc
allocated resources to help tooling like mtrace or valgrind with
memory leak tracking.
The current scheme uses assembly markers and linker script entries
to consolidate the free routine function pointers in the RELRO segment
and to be freed buffers in BSS.
This patch changes it to use specific free functions for
libc_freeres_ptrs buffers and call the function pointer array directly
with call_function_static_weak.
It allows the removal of both the internal macros and the linker
script sections.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants for the %i scanf format (in addition to the %b format,
which isn't yet implemented for scanf in glibc). Implement that scanf
support for glibc.
As with the strtol support, this is incompatible with previous C
standard versions, in that such an input string starting with 0b or 0B
was previously required to be parsed as 0 (with the rest of the input
potentially matching subsequent parts of the scanf format string).
Thus this patch adds 12 new __isoc23_* functions per long double
format (12, 24 or 36 depending on how many long double formats the
glibc configuration supports), with appropriate header redirection
support (generally very closely following that for the __isoc99_*
scanf functions - note that __GLIBC_USE (DEPRECATED_SCANF) takes
precedence over __GLIBC_USE (C2X_STRTOL), so the case of GNU
extensions to C89 continues to get old-style GNU %a and does not get
this new feature). The function names would remain as __isoc23_* even
if C2x ends up published in 2024 rather than 2023.
When scanf %b support is added, I think it will be appropriate for all
versions of scanf to follow C2x rules for inputs to the %b format
(given that there are no compatibility concerns for a new format).
Tested for x86_64 (full glibc testsuite). The first version was also
tested for powerpc (32-bit) and powerpc64le (stdio-common/ and wcsmbs/
tests), and with build-many-glibcs.py.
"We don't need it any more"
The INTR_MSG_TRAP macro in intr-msg.h used to play little trick with
the stack pointer: it would temporarily save the "real" stack pointer
into ecx, while setting esp to point to just before the message buffer,
and then invoke the mach_msg trap. This way, INTR_MSG_TRAP reused the
on-stack arguments laid out for the containing call of
_hurd_intr_rpc_mach_msg (), passing them to the mach_msg trap directly.
This, however, required special support in hurdsig.c and trampoline.c,
since they now had to recognize when a thread is inside the piece of
code where esp doesn't point to the real tip of the stack, and handle
this situation specially.
Commit 1d20f33ff4 has removed the actual
temporary change of esp by actually re-pushing mach_msg arguments onto
the stack, and popping them back at end. It did not, however, deal with
the rest of "the ecx kludge" code in other files, resulting in potential
crashes if a signal arrives in the middle of pushing arguments onto the
stack.
Fix that by removing "the ecx kludge". Instead, when we want a thread
to skip the RPC, but cannot make just make it jump to after the trap
since it's not done adjusting the stack yet, set the SYSRETURN register
to MACH_SEND_INTERRUPTED (as we do anyway), and rely on the thread
itself for detecting this case and skipping the RPC.
This simplifies things somewhat and paves the way for a future x86_64
port of this code.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230301162355.426887-1-bugaevc@gmail.com>
This is for future-proofing. On i386, it is 4-byte aligned anyway, but
on x86_64, we want it 8-byte aligned, not 4-byte aligned.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230214173722.428140-4-bugaevc@gmail.com>
This drops all of the return address rewriting kludges. The only
remaining hack is the jump out of a call stack while adjusting the
stack pointer.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
And make it a bit more 64-bit ready. This is in preparation to moving this
file into x86/
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-6-bugaevc@gmail.com>
This ensures that a timer_t value can be cast to struct timer_node *
and back.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-5-bugaevc@gmail.com>
Fix a few more cases of build errors caused by mismatched types. This is a
continuation of f4315054b4.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-3-bugaevc@gmail.com>
This is going to be done differently on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-2-bugaevc@gmail.com>
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants in strtol-family functions when the base passed is 0
or 2. Implement that strtol support for glibc.
As discussed at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
this is incompatible with previous C standard versions, in that such
an input string starting with 0b or 0B was previously required to be
parsed as 0 (with the rest of the string unprocessed). Thus, as
proposed there, this patch adds 20 new __isoc23_* functions with
appropriate header redirection support. This patch does *not* do
anything about scanf %i (which will need 12 new functions per long
double variant, so 12, 24 or 36 depending on the glibc configuration),
instead leaving that for a future patch. The function names would
remain as __isoc23_* even if C2x ends up published in 2024 rather than
2023.
Making this change leads to the question of what should happen to
internal uses of these functions in glibc and its tests. The header
redirection (which applies for _GNU_SOURCE or any other feature test
macros enabling C2x features) has the effect of redirecting internal
uses but without those uses then ending up at a hidden alias (see the
comment in include/stdio.h about interaction with libc_hidden_proto).
It seems desirable for the default for internal uses to be the same
versions used by normal code using _GNU_SOURCE, so rather than doing
anything to disable that redirection, similar macro definitions to
those in include/stdio.h are added to the include/ headers for the new
functions.
Given that the default for uses in glibc is for the redirections to
apply, the next question is whether the C2x semantics are correct for
all those uses. Uses with the base fixed to 10, 16 or any other value
other than 0 or 2 can be ignored. I think this leaves the following
internal uses to consider (an important consideration for review of
this patch will be both whether this list is complete and whether my
conclusions on all entries in it are correct):
benchtests/bench-malloc-simple.c
benchtests/bench-string.h
elf/sotruss-lib.c
math/libm-test-support.c
nptl/perf.c
nscd/nscd_conf.c
nss/nss_files/files-parse.c
posix/tst-fnmatch.c
posix/wordexp.c
resolv/inet_addr.c
rt/tst-mqueue7.c
soft-fp/testit.c
stdlib/fmtmsg.c
support/support_test_main.c
support/test-container.c
sysdeps/pthread/tst-mutex10.c
I think all of these places are OK with the new semantics, except for
resolv/inet_addr.c, where the POSIX semantics of inet_addr do not
allow for binary constants; thus, I changed that file (to use
__strtoul_internal, whose semantics are unchanged) and added a test
for this case. In the case of posix/wordexp.c I think accepting
binary constants is OK since POSIX explicitly allows additional forms
of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is
not in POSIX so again I think accepting binary constants is OK.
Functions such as __strtol_internal, which are only exported for
compatibility with old binaries from when those were used in inline
functions in headers, have unchanged semantics; the __*_l_internal
versions (purely internal to libc and not exported) have a new
argument to specify whether to accept binary constants.
As well as for the standard functions, the header redirection also
applies to the *_l versions (GNU extensions), and to legacy functions
such as strtoq, to avoid confusing inconsistency (the *q functions
redirect to __isoc23_*ll rather than needing their own __isoc23_*
entry points). For the functions that are only declared with
_GNU_SOURCE, this means the old versions are no longer available for
normal user programs at all. An internal __GLIBC_USE_C2X_STRTOL macro
is used to control the redirections in the headers, and cases in glibc
that wish to avoid the redirections - the function implementations
themselves and the tests of the old versions of the GNU functions -
then undefine and redefine that macro to allow the old versions to be
accessed. (There would of course be greater complexity should we wish
to make any of the old versions into compat symbols / avoid them being
defined at all for new glibc ABIs.)
strtol_l.c has some similarity to strtol.c in gnulib, but has already
diverged some way (and isn't listed at all at
https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c
and strtoul.c); I haven't made any attempts at gnulib compatibility in
the changes to that file.
I note incidentally that inttypes.h and wchar.h are missing the
__nonnull present on declarations of this family of functions in
stdlib.h; I didn't make any changes in that regard for the new
declarations added.
This macro from Mach headers conflicts with how
sysdeps/x86_64/multiarch/strcmp-sse2.S expects it to be defined.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230214173722.428140-3-bugaevc@gmail.com>
* Micro-optimize TLS access using GCC's native support for gs-based
addressing when available;
* Just use THREAD_GETMEM and THREAD_SETMEM instead of more inline
assembly;
* Sync tcbhead_t layout with NPTL, in particular update/fix __private_ss
offset;
* Statically assert that the two offsets that are a part of ABI are what
we expect them to be.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230214173722.428140-2-bugaevc@gmail.com>
It has been decided that on x86_64, mach_msg_type_number_t stays 32-bit.
Therefore, it's not possible to use mach_msg_type_number_t
interchangeably with size_t, in particular this breaks when a pointer to
a variable is passed to a MIG routine.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-3-bugaevc@gmail.com>
Make the code flow more linear using early returns where possible. This
makes it so much easier to reason about what runs on error / successful
code paths.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-2-bugaevc@gmail.com>
We used to use .cfi_adjust_cfa_offset around %esp manipulation
asm instructions to fix unwinding, but when building glibc with
-fno-omit-frame-pointer this is bogus since in that case %ebp is the CFA and
does not move.
Instead, let's force -fno-omit-frame-pointer when building intr-msg.c so
that %ebp can always be used and no .cfi_adjust_cfa_offset is needed.
This file is not used today since we end up using
sysdeps/i386/htl/machine-sp.h. Getting the stack pointer does not need
to be hurd specific and can go into sysdeps/<arch>.
Message-Id: <Y9tpWs2WOgE/Duiq@jupiter.tail36e24.ts.net>
This adds a special SHM_ANON value that can be passed into shm_open ()
in place of a name. When called in this way, shm_open () will create a
new anonymous shared memory file. The file will be created in the same
way that other shared memory files are created (i.e., under /dev/shm/),
except that it is not given a name and therefore cannot be reached from
the file system, nor by other calls to shm_open (). This is accomplished
by utilizing O_TMPFILE.
This is intended to be compatible with FreeBSD's API of the same name.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230130125216.6254-4-bugaevc@gmail.com>
This is a flag that causes open () to create a new, unnamed file in the
same filesystem as the given directory. The file descriptor can be
simply used in the creating process as a temporary file, or shared with
children processes via fork (), or sent over a Unix socket. The file can
be left anonymous, in which case it will be deleted from the backing
file system once all copies of the file descriptor are closed, or given
a permanent name with a linkat () call, such as the following:
int fd = open ("/tmp", O_TMPFILE | O_RDWR, 0700);
/* Do something with the file... */
linkat (fd, "", AT_FDCWD, "/tmp/filename", AT_EMPTY_PATH);
In between creating the file and linking it to the file system, it is
possible to set the file content, mode, ownership, author, and other
attributes, so that the file visibly appears in the file system (perhaps
replacing another file) atomically, with all of its attributes already
set up.
The Hurd support for O_TMPFILE directly exposes the dir_mkfile RPC to
user programs. Previously, dir_mkfile was used by glibc internally, in
particular for implementing tmpfile (), but not exposed to user programs
through a Unix-level API.
O_TMPFILE was initially introduced by Linux. This implementation is
intended to be compatible with the Linux implementation, except that the
O_EXCL flag is not given the special meaning when used together with
O_TMPFILE, unlike on Linux.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230130125216.6254-3-bugaevc@gmail.com>
8b8c768e3c ("Force use of -ffreestanding when checking for gnumach
headers") was passing -ffreestanding to CFLAGS only, but headers checks are
performed with the preprocessor, so we rather need to pass it to CPPFLAGS.
Without this ./configure assumes that we are in a fully hosted
environment, which might not be the case. After this patch, we can rely on
the freestanding header files provided by GCC such as stdint.h.
Message-Id: <Y5+0V9osFc/zXMq0@mars>
Previously, getrandom would, each time it's called, traverse the file
system to find /dev/urandom, fetch some random data from it, then throw
away that port. This is quite slow, while calls to getrandom are
genrally expected to be fast.
Additionally, this means that getrandom can not work when /dev/urandom
is unavailable, such as inside a chroot that lacks one. User programs
expect calls to getrandom to work inside a chroot if they first call
getrandom outside of the chroot.
In particular, this is known to break the OpenSSH server, and in that
case the issue is exacerbated by the API of arc4random, which prevents
it from properly reporting errors, forcing glibc to abort on failure.
This causes sshd to just die once it tries to generate a random number.
Caching the random server port, in a manner similar to how socket
server ports are cached, both improves the performance and works around
the chroot issue.
Tested on i686-gnu with the following program:
pthread_barrier_t barrier;
void *worker(void*) {
pthread_barrier_wait(&barrier);
uint32_t sum = 0;
for (int i = 0; i < 10000; i++) {
sum += arc4random();
}
return (void *)(uintptr_t) sum;
}
int main() {
pthread_t threads[THREAD_COUNT];
pthread_barrier_init(&barrier, NULL, THREAD_COUNT);
for (int i = 0; i < THREAD_COUNT; i++) {
pthread_create(&threads[i], NULL, worker, NULL);
}
for (int i = 0; i < THREAD_COUNT; i++) {
void *retval;
pthread_join(threads[i], &retval);
printf("Thread %i: %lu\n", i, (unsigned long)(uintptr_t) retval);
}
In my totally unscientific benchmark, with this patch, this completes
in about 7 seconds, whereas previously it took about 50 seconds. This
program was also used to test that getrandom () doesn't explode if the
random server dies, but instead reopens the /dev/urandom anew. I have
also verified that with this patch, OpenSSH can once again accept
connections properly.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20221202135558.23781-1-bugaevc@gmail.com>
This makes it more likely that the compiler can compute the strlen
argument in _startup_fatal at compile time, which is required to
avoid a dependency on strlen this early during process startup.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The old exception handling implementation used function interposition
to replace the dynamic loader implementation (no TLS support) with the
libc implementation (TLS support). This results in problems if the
link order between the dynamic loader and libc is reversed (bug 25486).
The new implementation moves the entire implementation of the
exception handling functions back into the dynamic loader, using
THREAD_GETMEM and THREAD_SETMEM for thread-local data support.
These depends on Hurd support for these macros, added in commit
b65a82e4e7 ("hurd: Add THREAD_GET/SETMEM/_NC").
One small obstacle is that the exception handling facilities are used
before the TCB has been set up, so a check is needed if the TCB is
available. If not, a regular global variable is used to store the
exception handling information.
Also rename dl-error.c to dl-catch.c, to avoid confusion with the
dlerror function.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
In the future, this will result in a compilation failure if the
macros are unexpectedly undefined (due to header inclusion ordering
or header inclusion missing altogether).
Assembler sources are more difficult to convert. In many cases,
they are hand-optimized for the mangling and no-mangling variants,
which is why they are not converted.
sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c
are special: These are C sources, but most of the implementation is
in assembler, so the PTR_DEMANGLE macro has to be undefined in some
cases, to match the assembler style.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This allows us to define a generic no-op version of PTR_MANGLE and
PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE
unconditionally in C sources, avoiding an unintended loss of hardening
due to missing include files or unlucky header inclusion ordering.
In i386 and x86_64, we can avoid a <tls.h> dependency in the C
code by using the computed constant from <tcb-offsets.h>. <sysdep.h>
no longer includes these definitions, so there is no cyclic dependency
anymore when computing the <tcb-offsets.h> constants.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This addition to the list of source headers in
sysdeps/mach/hurd/bits/errno.h appears in the source tree after
build-many-glibcs.py runs, I'm guessing resulting from gnumach commit
c566ad85a2d6728ebc8ec0f461a3b35df300e96e.
Use INTERNAL_SYSCALL_CALL instead of INLINE_SYSCALL_CALL. This
requires emulate the semantic for hurd call (so __arc4random_buf
uses the fallback).
Checked on x86_64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Non-at functions can be implemented by just calling the corresponding at
function with AT_FDCWD and zero at_flags.
In the linkat case, the at behavior is different (O_NOLINK), so this introduces
__linkat_common to pass O_NOLINK as appropriate.
lstat functions can also be implemented with fstatat by adding
__fstatat64_common which takes a flags parameter in addition to the at_flags
parameter,
In the end this factorizes chmod, chown, link, lstat64, mkdir, readlink,
rename, stat64, symlink, unlink, utimes.
This also makes __lstat, __lxstat64, __stat and __xstat64 directly use
__fstatat64_common instead of __lstat64 or __stat64.
9e5c991106 ("hurd: Fix readlink() hanging on fifo") separated opening
the file for the stat call from opening the file for the read call. That
however opened a small window for the file to change. Better make this
atomic by reopening the file with O_READ.
readlink() opens the target with O_READ to be able to read the symlink
content. When the target is actually a fifo, that would hang waiting for a
writer (caught in the coreutils testsuite). We thus have to first lookup the
target without O_READ to perform io_stat and lookout for fifos, and only
after checking the symlink type, we can re-lookup with O_READ.
In gnumach, 3e1702a65fb3 ("add rpc_versions for vm types") changed the type
of vm_size_t, making it always a unsigned long. This made it incompatible on
x86 with size_t. Even if we may want to revert it to unsigned int, it's
better to fix the types of parameters according to the .defs files.
posix advises to have strerror_r fill a message even when we are returning
an error.
This makes mach's xpg_strerror_r do this, like the generic version does.
Spotted by the libunistring testsuite test-strerror_r
08d2024b41 ("string: Simplify strerror_r") inadvertently made
__strerror_r print unknown error system in decimal while the original
code was printing it in hexadecimal. perror was kept printing in
hexadecimal in 725eeb4af1 ("string: Use tls-internal on strerror_l"),
let us keep both coherent.
This also fixes a duplicate ':'
Spotted by the libunistring testsuite test-perror2
gcc introduces gs:0x14 accesses in most functions, so we need some tcbhead
to be ready very early during initialization. This configures a static area
which can be referenced by various protected functions, until proper TLS is
set up.
We do not have a hurd data block only when bootstrapping the system, in
which case we don't have a notion of suid yet anyway.
This is needed, otherwise init_standard_fds would check that standard
file descriptors are allocated, which is meaningless during bootstrap.
The inline and library functions that the CMSG_NXTHDR macro may expand
to increment the pointer to the header before checking the stride of
the increment against available space. Since C only allows incrementing
pointers to one past the end of an array, the increment must be done
after a length check. This commit fixes that and includes a regression
test for CMSG_FIRSTHDR and CMSG_NXTHDR.
The Linux, Hurd, and generic headers are all changed.
Tested on Linux on armv7hl, i686, x86_64, aarch64, ppc64le, and s390x.
[BZ #28846]
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Rather than buffering 16 MiB of entropy in userspace (by way of
chacha20), simply call getrandom() every time.
This approach is doubtlessly slower, for now, but trying to prematurely
optimize arc4random appears to be leading toward all sorts of nasty
properties and gotchas. Instead, this patch takes a much more
conservative approach. The interface is added as a basic loop wrapper
around getrandom(), and then later, the kernel and libc together can
work together on optimizing that.
This prevents numerous issues in which userspace is unaware of when it
really must throw away its buffer, since we avoid buffering all
together. Future improvements may include userspace learning more from
the kernel about when to do that, which might make these sorts of
chacha20-based optimizations more possible. The current heuristic of 16
MiB is meaningless garbage that doesn't correspond to anything the
kernel might know about. So for now, let's just do something
conservative that we know is correct and won't lead to cryptographic
issues for users of this function.
This patch might be considered along the lines of, "optimization is the
root of all evil," in that the much more complex implementation it
replaces moves too fast without considering security implications,
whereas the incremental approach done here is a much safer way of going
about things. Once this lands, we can take our time in optimizing this
properly using new interplay between the kernel and userspace.
getrandom(0) is used, since that's the one that ensures the bytes
returned are cryptographically secure. But on systems without it, we
fallback to using /dev/urandom. This is unfortunate because it means
opening a file descriptor, but there's not much of a choice. Secondly,
as part of the fallback, in order to get more or less the same
properties of getrandom(0), we poll on /dev/random, and if the poll
succeeds at least once, then we assume the RNG is initialized. This is a
rough approximation, as the ancient "non-blocking pool" initialized
after the "blocking pool", not before, and it may not port back to all
ancient kernels, though it does to all kernels supported by glibc
(≥3.2), so generally it's the best approximation we can do.
The motivation for including arc4random, in the first place, is to have
source-level compatibility with existing code. That means this patch
doesn't attempt to litigate the interface itself. It does, however,
choose a conservative approach for implementing it.
Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
Cc: Paul Eggert <eggert@cs.ucla.edu>
Cc: Mark Harris <mark.hsj@gmail.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The implementation is based on scalar Chacha20 with per-thread cache.
It uses getrandom or /dev/urandom as fallback to get the initial entropy,
and reseeds the internal state on every 16MB of consumed buffer.
To improve performance and lower memory consumption the per-thread cache
is allocated lazily on first arc4random functions call, and if the
memory allocation fails getentropy or /dev/urandom is used as fallback.
The cache is also cleared on thread exit iff it was initialized (so if
arc4random is not called it is not touched).
Although it is lock-free, arc4random is still not async-signal-safe
(the per thread state is not updated atomically).
The ChaCha20 implementation is based on RFC8439 [1], omitting the final
XOR of the keystream with the plaintext because the plaintext is a
stream of zeros. This strategy is similar to what OpenBSD arc4random
does.
The arc4random_uniform is based on previous work by Florian Weimer,
where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete
Uniform Generation from Coin Flips, and Applications (2013) [2], who
credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform
random number generation (1976), for solving the general case.
The main advantage of this method is the that the unit of randomness is not
the uniform random variable (uint32_t), but a random bit. It optimizes the
internal buffer sampling by initially consuming a 32-bit random variable
and then sampling byte per byte. Depending of the upper bound requested,
it might lead to better CPU utilization.
Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.
Co-authored-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
[1] https://datatracker.ietf.org/doc/html/rfc8439
[2] https://arxiv.org/pdf/1304.1916.pdf
This change provides implementations for the mbrtoc8 and c8rtomb
functions adopted for C++20 via WG21 P0482R6 and for C2X via WG14
N2653. It also provides the char8_t typedef from WG14 N2653.
The mbrtoc8 and c8rtomb functions are declared in uchar.h in C2X
mode or when the _GNU_SOURCE macro or C++20 __cpp_char8_t feature
test macro is defined.
The char8_t typedef is declared in uchar.h in C2X mode or when the
_GNU_SOURCE macro is defined and the C++20 __cpp_char8_t feature
test macro is not defined (if __cpp_char8_t is defined, then char8_t
is a builtin type).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It was added on Linux 5.4 (3695eae5fee0605f316fbaad0b9e3de791d7dfaf)
to extend waitid to wait on pidfd.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
When an executable is invoked as
./ld.so [ld.so-args] ./exe [exe-args]
then the argv is adujusted in ld.so before calling the entry point of
the executable so ld.so args are not visible to it. On most targets
this requires moving argv, env and auxv on the stack to ensure correct
stack alignment at the entry point. This had several issues:
- The code for this adjustment on the stack is written in asm as part
of the target specific ld.so _start code which is hard to maintain.
- The adjustment is done after _dl_start returns, where it's too late
to update GLRO(dl_auxv), as it is already readonly, so it points to
memory that was clobbered by the adjustment. This is bug 23293.
- _environ is also wrong in ld.so after the adjustment, but it is
likely not used after _dl_start returns so this is not user visible.
- _dl_argv was updated, but for this it was moved out of relro, which
changes security properties across targets unnecessarily.
This patch introduces a generic _dl_start_args_adjust function that
handles the argument adjustments after ld.so processed its own args
and before relro protection is applied.
The same algorithm is used on all targets, _dl_skip_args is now 0, so
existing target specific adjustment code is no longer used. The bug
affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc,
other targets don't need the change in principle, only for consistency.
The GNU Hurd start code relied on _dl_skip_args after dl_main returned,
now it checks directly if args were adjusted and fixes the Hurd startup
data accordingly.
Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO.
Tested on aarch64-linux-gnu and cross tested on i686-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The siglist.c is built with -fno-toplevel-reorder to avoid compiler
to reorder the compat assembly directives due an assembler
issue [1] (fixed on 2.39).
This patch removes the compiler flags by split the compat symbol
generation in two phases. First the __sys_siglist and __sys_sigabbrev
without any compat symbol directive is preprocessed to generate an
assembly source code. This generate assembly is then used as input
on a platform agnostic siglist.S which then creates the compat
definitions. This prevents compiler to move any compat directive
prior the _sys_errlist definition itself.
Checked on a make check run-built-tests=no on all affected ABIs.
Reviewed-by: Fangrui Song <maskray@google.com>
The errlist.c is built with -fno-toplevel-reorder to avoid compiler to
reorder the compat assembly directives due an assembler issue [1]
(fixed on 2.39).
This patch removes the compiler flags by split the compat symbol
generation in two phases. First the _sys_errlist_internal internal
without any compat symbol directive is preprocessed to generate an
assembly source code. This generate assembly is then used as input
on a platform agnostic errlist-data.S which then creates the compat
definitions. This prevents compiler to move any compat directive
prior the _sys_errlist_internal definition itself.
Checked on a make check run-built-tests=no on all affected ABIs.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=29012
After 73fc4e28b9,
__libc_enable_secure_decided is always 0 and a statically linked
executable may overwrite __libc_enable_secure without considering
AT_SECURE.
The __libc_enable_secure has been correctly initialized in _dl_aux_init,
so just remove __libc_enable_secure_decided and __libc_init_secure.
This allows us to remove some startup_get*id functions from
22b79ed7f4.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The loader does not need to pull all __get_errlist definitions
and its size is decreased:
Before:
$ size elf/ld.so
text data bss dec hex filename
197774 11024 456 209254 33166 elf/ld.so
After:
$ size elf/ld.so
text data bss dec hex filename
191510 9936 456 201902 314ae elf/ld.so
Checked on x86_64-linux-gnu.
The posix_spawnattr_tcsetpgrp_np works on a file descriptor (the
controlling terminal), so it would make more sense to actually fit
it on the file actions API.
Also, POSIX_SPAWN_TCSETPGROUP is not really required since it is
implicit by the presence of tcsetpgrp file action.
The posix/tst-spawn6.c is also fixed when TTY can is not present.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It is not Hurd-specific, but H.J. Lu wants it there.
Also, dc.a can be used to avoid hardcoding .long vs .quad and thus use
the same implementation for i386 and x86_64.
The glibc 2.34 release really should have added a GLIBC_2.34
symbol to the dynamic loader. With it, we could move functions such
as dlopen or pthread_key_create that work on process-global state
into the dynamic loader (once we have fixed a longstanding issue
with static linking). Without the GLIBC_2.34 symbol, yet another
new symbol version would be needed because old glibc will fail to
load binaries due to the missing symbol version in ld.so that newly
linked programs will require.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Currently there is no proper way to set the controlling terminal through
posix_spawn in race free manner [1]. This forces shell implementations
to keep using fork+exec when launching background process groups,
even when using posix_spawn yields better performance.
This patch adds a new GNU extension so the creating process can
configure the created process terminal group. This is done with a new
flag, POSIX_SPAWN_TCSETPGROUP, along with two new attribute functions:
posix_spawnattr_tcsetpgrp_np, and posix_spawnattr_tcgetpgrp_np.
The function sets a new attribute, spawn-tcgroupfd, that references to
the controlling terminal.
The controlling terminal is set after the spawn-pgroup attribute, and
uses the spawn-tcgroupfd along with current creating process group
(so it is composable with POSIX_SPAWN_SETPGROUP).
To create a process and set the controlling terminal, one can use the
following sequence:
posix_spawnattr_t attr;
posix_spawnattr_init (&attr);
posix_spawnattr_setflags (&attr, POSIX_SPAWN_TCSETPGROUP);
posix_spawnattr_tcsetpgrp_np (&attr, tcfd);
If the idea is also to create a new process groups:
posix_spawnattr_t attr;
posix_spawnattr_init (&attr);
posix_spawnattr_setflags (&attr, POSIX_SPAWN_TCSETPGROUP
| POSIX_SPAWN_SETPGROUP);
posix_spawnattr_tcsetpgrp_np (&attr, tcfd);
posix_spawnattr_setpgroup (&attr, 0);
The controlling terminal file descriptor is ignored if the new flag is
not set.
This interface is slight different than the one provided by QNX [2],
which only provides the POSIX_SPAWN_TCSETPGROUP flag. The QNX
documentation does not specify how the controlling terminal is obtained
nor how it iteracts with POSIX_SPAWN_SETPGROUP. Since a glibc
implementation is library based, it is more straightforward and avoid
requires additional file descriptor operations to request the caller
to setup the controlling terminal file descriptor (and it also allows
a bit less error handling by posix_spawn).
Checked on x86_64-linux-gnu and i686-linux-gnu.
[1] https://github.com/ksh93/ksh/issues/79
[2] https://www.qnx.com/developers/docs/7.0.0/index.html#com.qnx.doc.neutrino.lib_ref/topic/p/posix_spawn.html
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
If any RPC fails, the reply port will already be deallocated.
__pthread_thread_terminate thus has to defer taking its name until the very last
__thread_terminate_release which doesn't reply a message. But then we
have to read from the pthread structure.
This introduces __pthread_dealloc_finish() which does the recording of
the thread termination, so the slot can be reused really only just before
the __thread_terminate_release call. Only the real thread can set it, so
let's decouple this from the pthread_state by just removing the
PTHREAD_TERMINATED state and add a terminated field.
The content of the structure is only used internally, so we can make
__pthread_attr_getschedparam and __pthread_attr_setschedparam convert
between the public sched_param type and an internal __sched_param.
This allows to avoid to spuriously expose the sched_param type.
This fixes BZ #23088.
We have to drop the kernel_thread port from the thread structure, to
avoid pthread_kill's call to _hurd_thread_sigstate trying to reference
it and fail.
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
407765e9f2 ("hurd: Fix ELF_MACHINE_USER_ADDRESS_MASK value") switched
ELF_MACHINE_USER_ADDRESS_MASK from 0xf8000000UL to 0xf0000000UL to let
libraries etc. get loaded at 0x08000000. But
ELF_MACHINE_USER_ADDRESS_MASK is actually only meaningful for the main
program anyway, so keep it at 0xf8000000UL to prevent the program loader
from putting ld.so beyond 0x08000000. And conversely, drop the use of
ELF_MACHINE_USER_ADDRESS_MASK for shared objects, which don't need any
constraints since the program will have already be loaded by then.
glibc uses /dev/urandom for getrandom(), and from version 2.34 malloc
initialization uses it. We have to detect when we are running the random
translator itself, in which case we can't read ourself.
It can be used to speed up the libgcc unwinder, and the internal
_dl_find_dso_for_object function (which is used for caller
identification in dlopen and related functions, and in dladdr).
_dl_find_object is in the internal namespace due to bug 28503.
If libgcc switches to _dl_find_object, this namespace issue will
be fixed. It is located in libc for two reasons: it is necessary
to forward the call to the static libc after static dlopen, and
there is a link ordering issue with -static-libgcc and libgcc_eh.a
because libc.so is not a linker script that includes ld.so in the
glibc build tree (so that GCC's internal -lc after libgcc_eh.a does
not pick up ld.so).
It is necessary to do the i386 customization in the
sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because
otherwise, multilib installations are broken.
The implementation uses software transactional memory, as suggested
by Torvald Riegel. Two copies of the supporting data structures are
used, also achieving full async-signal-safety.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
And use machine-sp.h instead. The Linux implementation is based on
already provided CURRENT_STACK_FRAME (used on nptl code) and
STACK_GROWS_UPWARD is replaced with _STACK_GROWS_UP.
hurd initialization stages use RUN_HOOK to run various initialization
functions. That is however using absolute addresses which need to be
relocated, which is done later by csu. We can however easily make the
linker compute relative addresses which thus don't need a relocation.
The new SET_RELHOOK and RUN_RELHOOK macros implement this.
Since 9cec82de71 ("htl: Initialize later"), we let csu initialize
pthreads. We can thus let it initialize tls later too, to better align
with the generic order. Initialization however accesses ports which
links/unlinks into the sigstate for unwinding. We can however easily
skip that during initialization.
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper with
SVID error handling around the new code. There is no new symbol version
nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
Only ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Build glibc programs and tests as PIE by default and enable static-pie
automatically if the architecture and toolchain supports it.
Also add a new configuration option --disable-default-pie to prevent
building programs as PIE.
Only the following architectures now have PIE disabled by default
because they do not work at the moment. hppa, ia64, alpha and csky
don't work because the linker is unable to handle a pcrel relocation
generated from PIE objects. The microblaze compiler is currently
failing with an ICE. GNU hurd tries to enable static-pie, which does
not work and hence fails. All these targets have default PIE disabled
at the moment and I have left it to the target maintainers to enable PIE
on their targets.
build-many-glibcs runs clean for all targets. I also tested x86_64 on
Fedora and Ubuntu, to verify that the default build as well as
--disable-default-pie work as expected with both system toolchains.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
TLS_INIT_TCB_ALIGN is not actually used. TLS_TCB_ALIGN was likely
introduced to support a configuration where the thread pointer
has not the same alignment as THREAD_SELF. Only ia64 seems to use
that, but for the stack/pointer guard, not for storing tcbhead_t.
Some ports use TLS_TCB_OFFSET and TLS_PRE_TCB_SIZE to shift
the thread pointer, potentially landing in a different residue class
modulo the alignment, but the changes should not impact that.
In general, given that TLS variables have their own alignment
requirements, having different alignment for the (unshifted) thread
pointer and struct pthread would potentially result in dynamic
offsets, leading to more complexity.
hppa had different values before: __alignof__ (tcbhead_t), which
seems to be 4, and __alignof__ (struct pthread), which was 8
(old default) and is now 32. However, it defines THREAD_SELF as:
/* Return the thread descriptor for the current thread. */
# define THREAD_SELF \
({ struct pthread *__self; \
__self = __get_cr27(); \
__self - 1; \
})
So the thread pointer points after struct pthread (hence __self - 1),
and they have to have the same alignment on hppa as well.
Similarly, on ia64, the definitions were different. We have:
# define TLS_PRE_TCB_SIZE \
(sizeof (struct pthread) \
+ (PTHREAD_STRUCT_END_PADDING < 2 * sizeof (uintptr_t) \
? ((2 * sizeof (uintptr_t) + __alignof__ (struct pthread) - 1) \
& ~(__alignof__ (struct pthread) - 1)) \
: 0))
# define THREAD_SELF \
((struct pthread *) ((char *) __thread_self - TLS_PRE_TCB_SIZE))
And TLS_PRE_TCB_SIZE is a multiple of the struct pthread alignment
(confirmed by the new _Static_assert in sysdeps/ia64/libc-tls.c).
On m68k, we have a larger gap between tcbhead_t and struct pthread.
But as far as I can tell, the port is fine with that. The definition
of TCB_OFFSET is sufficient to handle the shifted TCB scenario.
This fixes commit 23c77f6018
("nptl: Increase default TCB alignment to 32").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Now that Hurd implementis both close_range and closefrom (f2c996597d),
we can make close_range() a base ABI, and make the default closefrom()
implementation on top of close_range().
The generic closefrom() implementation based on __getdtablesize() is
moved to generic close_range(). On Linux it will be overriden by
the auto-generation syscall while on Hurd it will be a system specific
implementation.
The closefrom() now calls close_range() and __closefrom_fallback().
Since on Hurd close_range() does not fail, __closefrom_fallback() is an
empty static inline function set by__ASSUME_CLOSE_RANGE.
The __ASSUME_CLOSE_RANGE also allows optimize Linux
__closefrom_fallback() implementation when --enable-kernel=5.9 or
higher is used.
Finally the Linux specific tst-close_range.c is moved to io and
enabled as default. The Linuxism and CLOSE_RANGE_UNSHARE are
guarded so it can be built for Hurd (I have not actually test it).
Checked on x86_64-linux-gnu, i686-linux-gnu, and with a i686-gnu
build.
It requires less boilerplate code for newer ports. The _Static_assert
checks from internal setjmp are moved to its own internal test since
setjmp.h is included early by multiple headers (to generate
rtld-sizes.sym).
The riscv jmp_buf-macros.h check is also redundant, it is already
done by riscv configure.ac.
Checked with a build for the affected architectures.
The include cleanup on dl-minimal.c removed too much for some
targets.
Also for Hurd, __sbrk is removed from localplt.data now that
tunables allocated memory through mmap.
Checked with a build for all affected architectures.
The close_range () function implements the same API as the Linux and
FreeBSD syscalls. It operates atomically and reliably. The specified
upper bound is clamped to the actual size of the file descriptor table;
it is expected that the most common use case is with last = UINT_MAX.
Like in the Linux syscall, it is also possible to pass the
CLOSE_RANGE_CLOEXEC flag to mark the file descriptors in the range
cloexec instead of acually closing them.
Also, add a Hurd version of the closefrom () function. Since unlike on
Linux, close_range () cannot fail due to being unuspported by the
running kernel, a fallback implementation is never necessary.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20211106153524.82700-1-bugaevc@gmail.com>
No bug.
This commit adds support for __memcmpeq() as a new ABI for all
targets. In this commit __memcmpeq() is implemented only as an alias
to the corresponding targets memcmp() implementation. __memcmpeq() is
added as a new symbol starting with GLIBC_2.35 and defined in string.h
with comments explaining its behavior. Basic tests that it is callable
and works where added in string/tester.c
As discussed in the proposal "Add new ABI '__memcmpeq()' to libc"
__memcmpeq() is essentially a reserved namespace for bcmp(). The means
is shares the same specifications as memcmp() except the return value
for non-equal byte sequences is any non-zero value. This is less
strict than memcmp()'s return value specification and can be better
optimized when a boolean return is all that is needed.
__memcmpeq() is meant to only be called by compilers if they can prove
that the return value of a memcmp() call is only used for its boolean
value.
All tests in string/tester.c passed. As well build succeeds on
x86_64-linux-gnu target.
5bf07e1b3a ("Linux: Simplify __opensock and fix race condition [BZ #28353]")
made __opensock try NETLINK then UNIX then INET. On the Hurd, only INET
knows about network interfaces, so better actually specify that in
if_index.
INTR_MSG_TRAP was tinkering with esp to make it point to
_hurd_intr_rpc_mach_msg's parameters, and notably use (&msg)[-1] which is
meaningless in C.
Instead, just push the parameters on the stack, which also avoids leaving
local variables of _hurd_intr_rpc_mach_msg below esp. We now also
properly express that OPTION and TIMEOUT may be updated during the trap
call.
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs. fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN). fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN. fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value. All these functions also treat +0 as greater than -0. There
are also corresponding <tgmath.h> type-generic macros.
Add these functions to glibc. The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results. The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.
Tested for x86_64 and x86, and with build-many-glibcs.py.
This is an internal function meant to return the number of avaliable
processor where the process can scheduled, different than the
__get_nprocs which returns a the system available online CPU.
The Linux implementation currently only calls __get_nprocs(), which
in tuns calls sched_getaffinity.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
All the ports now have THREAD_GSCOPE_IN_TCB set to 1. Remove all
support for !THREAD_GSCOPE_IN_TCB, along with the definition itself.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-4-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
This is a new implementation of GSCOPE which largely mirrors its NPTL
counterpart. Same as in NPTL, instead of a global flag shared between
threads, there is now a per-thread GSCOPE flag stored in each thread's
TCB. This makes entering and exiting a GSCOPE faster at the expense of
making THREAD_GSCOPE_WAIT () slower.
The largest win is the elimination of many redundant gsync_wake () RPC
calls; previously, even simplest programs would make dozens of fully
redundant gsync_wake () calls.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-3-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
The next commit is going to introduce a new implementation of
THREAD_GSCOPE_WAIT which needs to access the list of threads.
Since it must be usable from the dynamic laoder, we have to move
the symbols for the list of threads into the loader.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-2-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well. However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div. Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.
The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.
I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dchttps://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>