Commit Graph

15703 Commits

Author SHA1 Message Date
Szabolcs Nagy
642f1b9b3d aarch64: More configure checks for libmvec
Check assembler and linker support too, not just SVE ACLE in the
compiler, since variant PCS requires at least binutils 2.32.1.
2023-05-05 11:34:44 +01:00
Szabolcs Nagy
ee68e9cba4 aarch64: SVE ACLE configure test cleanups
Use more idiomatic configure test for better autoconf cache and logs.
2023-05-05 10:28:29 +01:00
Sam James
c8bd171caf hppa: Fix 'concurrency' typo in comment
Signed-off-by: Sam James <sam@gentoo.org>
2023-05-05 10:12:39 +01:00
Flavio Cruz
3f433cb895 Update sysdeps/mach/hurd/ioctl.c to make it more portable
Summary of the changes:
- Update msg_align to use ALIGN_UP like we have done in previous
  patches. Use it below whenever necessary to avoid repeating the same
  alignment logic.
- Define BAD_TYPECHECK to make it easier to do type checking in a few
  places below.
- Update io2mach_type to use designated initializers.
- Make RetCodeType use mach_msg_type_t. mach_msg_type_t is 8 byte for
  x86_64, so this make it portable.
- Also call msg_align for _IOT_COUNT2/_IOT_TYPE2 since it is more
  correct.
Message-Id: <ZFMvVsuFKwIy2dUS@jupiter.tail36e24.ts.net>
2023-05-05 02:22:31 +02:00
Szabolcs Nagy
1a62d7e5c3 aarch64: fix SVE ACLE check for bootstrap glibc builds
arm_sve.h depends on stdint.h but that relies on libc headers unless
compiled in freestanding mode.  Without this change a bootstrap glibc
build (that uses a compiler without installed libc headers) failed with

checking for availability of SVE ACLE... In file included from [...]/arm_sve.h:28,
                 from conftest.c:1:
[...]/stdint.h:9:16: fatal error: stdint.h: No such file or directory
    9 | # include_next <stdint.h>
      |                ^~~~~~~~~~
compilation terminated.
configure: error: mathvec is enabled but compiler does not have SVE ACLE. [...]
2023-05-04 10:19:11 +01:00
Joe Ramsay
cd94326a13 Enable libmvec support for AArch64
This patch enables libmvec on AArch64. The proposed change is mainly
implementing build infrastructure to add the new routines to ABI,
tests and benchmarks. I have demonstrated how this all fits together
by adding implementations for vector cos, in both single and double
precision, targeting both Advanced SIMD and SVE.

The implementations of the routines themselves are just loops over the
scalar routine from libm for now, as we are more concerned with
getting the plumbing right at this point. We plan to contribute vector
routines from the Arm Optimized Routines repo that are compliant with
requirements described in the libmvec wiki.

Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising
the minimum GCC by such a big jump, we allow users to disable libmvec
if their compiler is too old.

Note that at this point users have to manually call the vector math
functions. This seems to be acceptable to some downstream users.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-05-03 12:09:49 +01:00
Samuel Thibault
0ec48e3337 hurd 64bit: Make dev_t word type
dev_t are 64bit on Linux ports, so better increase their size on 64bit
Hurd. It happens that this helps with BZ 23084 there: st_dev has type fsid_t
(quad) and is specified by POSIX to have type dev_t. Making dev_t 64bit
makes these match.
2023-05-02 21:29:26 +02:00
Samuel Thibault
e2b3d7f485 hurd 64bit: Fix struct msqid_ds and shmid_ds fields
The standards want msg_lspid/msg_lrpid/shm_cpid/shm_lpid to be pid_t, see BZ
23083 and 23085.

We can leave them __rpc_pid_t on i386 for ABI compatibility, but avoid
hitting the issue on 64bit.
2023-05-01 15:07:51 +02:00
Samuel Thibault
e3a3616dbf hurd 64bit: Fix ipc_perm fields types
The standards want uid/cuid to be uid_t, gid/cgid to be gid_t and mode to be
mode_t, see BZ 23082.

We can leave them short ints on i386 for ABI compatibility, but avoid
hitting the issue on 64bit.

bits/ipc.h ends up being exactly the same in sysdeps/gnu/ and
sysdeps/unix/sysv/linux/, so remove the latter.
2023-05-01 15:05:09 +02:00
Samuel Thibault
d5e2f9eaf7 hurd 64bit: Fix flock fields types
The standards want l_type and l_whence to be short ints, see BZ 23081.

We can leave them ints on i386 for ABI compatibility, but avoid hitting the
issue on 64bit.
2023-05-01 15:05:09 +02:00
Samuel Thibault
90604f670c hurd 64bit: Add data for check-c++-types 2023-05-01 15:05:09 +02:00
Samuel Thibault
65d1407d55 hurd 64bit: Fix pthread_t/thread_t type to long
So that they can be trivially cast to pointer type, like with nptl.
2023-05-01 15:05:09 +02:00
Samuel Thibault
e11a6734c4 hurd 64bit: Add missing data file for check-localplt test 2023-05-01 13:38:57 +02:00
Samuel Thibault
d44995a4b3 hurd 64bit: Add missing libanl
The move of libanl to libc was in glibc 2.34 for nptl only.
2023-05-01 13:36:14 +02:00
Samuel Thibault
d90470a37e hurd: Also XFAIL missing SA_NOCLDWAIT on 64bit 2023-05-01 13:28:53 +02:00
Samuel Thibault
14f16bd482 hurd: Fix tst-writev test
There is no compile-time IOV_MAX constraint on GNU/Hurd
2023-05-01 13:01:30 +02:00
Samuel Thibault
6d4f183495 nptl: move tst-x86-64-tls-1 to nptl-only tests
It is essentially nptl-only.
2023-05-01 12:59:33 +02:00
Sergey Bugaev
adca662202 hurd: Add expected abilist files for x86_64
These were created by creating stub files, running 'make update-abi',
and reviewing the results.

Also, set baseline ABI to GLIBC_2.38, the (upcoming) first glibc
release to first have x86_64-gnu support.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-05-01 12:10:20 +02:00
Sergey Bugaev
4e506f67cb hurd: Replace reply port with a dead name on failed interruption
If we're trying to interrupt an interruptible RPC, but the server fails
to respond to our __interrupt_operation () call, we instead destroy the
reply port we were expecting the reply to the RPC on.

Instead of deallocating the name completely, replace it with a dead
name, so the name won't get reused for some other right, and deallocate
it in _hurd_intr_rpc_mach_msg once we return from the signal handler.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-4-bugaevc@gmail.com>
2023-05-01 03:18:48 +02:00
Flavio Cruz
eb14819c14 Define __mig_strlen to support dynamically sized strings in hurd RPCs
We make lib{mach,hurd}user.so only call __mig_strlen which can be
relocated before libc.so is relocated, similar to what is done with
__mig_memcpy.
Message-Id: <ZE8DTRDpY2hpPZlJ@jupiter.tail36e24.ts.net>
2023-05-01 02:24:04 +02:00
Sergey Bugaev
2bc516020f hurd: Make it possible to call memcpy very early
Normally, in static builds, the first code that runs is _start, in e.g.
sysdeps/x86_64/start.S, which quickly calls __libc_start_main, passing
it the argv etc. Among the first things __libc_start_main does is
initializing the tunables (based on env), then CPU features, and then
calls _dl_relocate_static_pie (). Specifically, this runs ifunc
resolvers to pick, based on the CPU features discovered earlier, the
most suitable implementation of "string" functions such as memcpy.

Before that point, calling memcpy (or other ifunc-resolved functions)
will not work.

In the Hurd port, things are more complex. In order to get argv/env for
our process, glibc normally needs to do an RPC to the exec server,
unless our args/env are already located on the stack (which is what
happens to bootstrap processes spawned by GNU Mach). Fetching our
argv/env from the exec server has to be done before the call to
__libc_start_main, since we need to know what our argv/env are to pass
them to __libc_start_main.

On the other hand, the implementation of the RPC (and other initial
setup needed on the Hurd before __libc_start_main can be run) is not
very trivial. In particular, it may (and on x86_64, will) use memcpy.
But as described above, calling memcpy before __libc_start_main can not
work, since the GOT entry for it is not yet initialized at that point.

Work around this by pre-filling the GOT entry with the baseline version
of memcpy, __memcpy_sse2_unaligned. This makes it possible for early
calls to memcpy to just work. The initial value of the GOT entry is
unused on x86_64, and changing it won't interfere with the relocation
being performed later: once _dl_relocate_static_pie () is called, the
baseline version will get replaced with the most suitable one, and that
is what subsequent calls of memcpy are going to call.

Checked on x86_64-gnu.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-6-bugaevc@gmail.com>
2023-05-01 01:21:23 +02:00
Sergey Bugaev
e6136c6939 hurd: Implement longjmp for x86_64
Checked on x86_64-gnu.

[samuel.thibault@ens-lyon.org: Restored same comments as on i386]
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-3-bugaevc@gmail.com>
2023-05-01 01:13:59 +02:00
Sergey Bugaev
b574ae0a28 hurd: Implement sigreturn for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-2-bugaevc@gmail.com>
2023-05-01 01:06:17 +02:00
Sergey Bugaev
41aac87234 hurd: Make _exit work during early boot-up
If any of the early boot-up tasks calls exit () or returns from main (),
terminate it properly instead of crashing on trying to dereference
_hurd_ports and getting forcibly terminated by the kernel.

We sadly cannot make the __USEPORT macro do the check for _hurd_ports
being unset, because it evaluates to the value of the expression
provided as the second argument, and that can be of any type; so there
is no single suitable fallback value for the macro to evaluate to in
case _hurd_ports is unset. Instead, each use site that wants to care for
this case will have to do its own checking.

Checked on x86_64-gnu.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131354.2507443-4-bugaevc@gmail.com>
2023-04-29 16:53:47 +02:00
H.J. Lu
a443bd3fb2 __check_pf: Add a cancellation cleanup handler [BZ #20975]
There are reports for hang in __check_pf:

https://github.com/JoeDog/siege/issues/4

It is reproducible only under specific configurations:

1. Large number of cores (>= 64) and large number of threads (> 3X of
the number of cores) with long lived socket connection.
2. Low power (frequency) mode.
3. Power management is enabled.

While holding lock, __check_pf calls make_request which calls __sendto
and __recvmsg.  Since __sendto and __recvmsg are cancellation points,
lock held by __check_pf won't be released and can cause deadlock when
thread cancellation happens in __sendto or __recvmsg.  Add a cancellation
cleanup handler for __check_pf to unlock the lock when cancelled by
another thread.  This fixes BZ #20975 and the siege hang issue.
2023-04-28 13:38:38 -07:00
Hsiangkai Wang
117e8b341c
riscv: Resolve symbols directly for symbols with STO_RISCV_VARIANT_CC.
In some cases, we do not want to go through the resolver for function
calls. For example, functions with vector arguments will use vector
registers to pass arguments. In the resolver, we do not save/restore the
vector argument registers for lazy binding efficiency. To avoid ruining
the vector arguments, functions with vector arguments will not go
through the resolver.

To achieve the goal, we will annotate the function symbols with
STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
section. In the first pass on PLT relocations, we do not set up to call
_dl_runtime_resolve. Instead, we resolve the functions directly.

Signed-off-by: Hsiangkai Wang <kai.wang@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://inbox.sourceware.org/libc-alpha/20230314162512.35802-1-kito.cheng@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-28 07:02:42 -07:00
Joseph Myers
af16a59ee1 Fix Hurd getcwd build with GCC >= 13
The build of glibc for i686-gnu has been failing for a while with GCC
mainline / GCC 13:

../sysdeps/mach/hurd/getcwd.c: In function '__hurd_canonicalize_directory_name_internal':
../sysdeps/mach/hurd/getcwd.c:242:48: error: pointer 'file_name' may be used after 'realloc' [-Werror=use-after-free]
  242 |                   file_namep = &buf[file_namep - file_name + size / 2];
      |                                     ~~~~~~~~~~~^~~~~~~~~~~
../sysdeps/mach/hurd/getcwd.c:236:25: note: call to 'realloc' here
  236 |                   buf = realloc (file_name, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~

Fix by doing the subtraction before the reallocation.

Tested with build-many-glibcs.py for i686-gnu.

[samuel.thibault@ens-lyon.rg: Removed mention of this being a bug]

Message-Id: <18587337-7815-4056-ebd0-724df262d591@codesourcery.com>
2023-04-27 01:27:28 +02:00
Joseph Myers
bcca5ae804 Regenerate sysdeps/mach/hurd/bits/errno.h
This file was out of date, as shown by build-many-glibcs.py runs
resulting in a modified source directory.
2023-04-26 17:11:41 +00:00
Joe Simmons-Talbott
a3461d4923 if_index: Remove unneeded alloca.h include
Nothing is being used from this header.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-04-26 08:06:49 -04:00
Joe Simmons-Talbott
19fdc3542b gethostid: Do not include alloca.h
Nothing from alloca.h is being used here.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-04-26 08:06:44 -04:00
Samuel Thibault
323fe6a1a9 hurd: Do not take any flag from the CMSG_DATA
As fixed in 0822e3552a ("hurd: Don't pass FD_CLOEXEC in CMSG_DATA"),
senders currently don't have any flag to pass.  We shouldn't blindly take
random flags that senders could be erroneously giving us.
2023-04-25 00:14:58 +02:00
Sergey Bugaev
5fa8945605 hurd: Implement MSG_CMSG_CLOEXEC
This is a new flag that can be passed to recvmsg () to make it
atomically set the CLOEXEC flag on all the file descriptors received
using the SCM_RIGHTS mechanism. This is useful for all the same reasons
that the other XXX_CLOEXEC flags are useful: namely, it provides
atomicity with respect to another thread of the same process calling
(fork and then) exec at the same time.

This flag is already supported on Linux and FreeBSD. The flag's value,
0x40000, is choosen to match FreeBSD's.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423160548.126576-2-bugaevc@gmail.com>
2023-04-24 23:09:50 +02:00
Samuel Thibault
0822e3552a hurd: Don't pass FD_CLOEXEC in CMSG_DATA
The flags are used by _hurd_intern_fd, which takes O_* flags, not FD_*.

Also, it is of no concern to the receiving process whether or not
the sender process wants to close its copy of sent file descriptor
upon exec, and it should not influence whether or not the received
file descriptor gets the FD_CLOEXEC flag set in the receiving process.

The latter should in fact be dependent on the MSG_CMSG_CLOEXEC flag
being passed to the recvmsg () call, which is going to be implemented
in the following commit.

Fixes 344e755248
"hurd: Support sending file descriptors over Unix sockets"

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-04-24 23:05:15 +02:00
Sergey Bugaev
c02b26455b hurd: Implement prefer_map_32bit_exec tunable
This makes the prefer_map_32bit_exec tunable no longer Linux-specific.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423215526.346009-4-bugaevc@gmail.com>
2023-04-24 22:48:35 +02:00
Sergey Bugaev
35b7bf2fe0 hurd: Don't attempt to deallocate MACH_PORT_DEAD
...in some more places.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423215526.346009-2-bugaevc@gmail.com>
2023-04-24 22:44:53 +02:00
Sergey Bugaev
4c39333050 hurd: Only deallocate addrport when it's valid
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423160548.126576-3-bugaevc@gmail.com>
2023-04-24 22:44:18 +02:00
Sergey Bugaev
70b9173caa hurd: Implement MAP_32BIT
This is a flag that can be passed to mmap () to request that the mapping
being established should be located in the lower 2 GB area of the
address space, so only the lower 31 (not 32) bits can be set in its
address, and the address can be represented as a 32-bit integer without
truncating it.

This flag is intended to be compatible with Linux, FreeBSD, and Darwin
flags of the same name. Out of those systems, it appears Linux and
FreeBSD take MAP_32BIT to mean "map 31 bit", whereas Darwin allows the
32nd bit to be set in the address as well. The Hurd follows Linux and
FreeBSD behavior.

Unlike on those systems, on the Hurd MAP_32BIT is defined on all
supported architectures (which currently are only i386 and x86_64).

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423215526.346009-1-bugaevc@gmail.com>
2023-04-24 22:42:12 +02:00
Sergey Bugaev
533deafbdf Use O_CLOEXEC in more places (BZ #15722)
When opening a temporary file without O_CLOEXEC we risk leaking the
file descriptor if another thread calls (fork and then) exec while we
have the fd open. Fix this by consistently passing O_CLOEXEC everywhere
where we open a file for internal use (and not to return it to the user,
in which case the API defines whether or not the close-on-exec flag
shall be set on the returned fd).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230419160207.65988-4-bugaevc@gmail.com>
2023-04-22 13:50:14 +02:00
Sergey Bugaev
8e78a2e1d1 hurd: Don't migrate reply port into __init1_tcbhead
Properly differentiate between setting up the real TLS with
TLS_INIT_TP, and setting up the early TLS (__init1_tcbhead) in static
builds. In the latter case, don't yet migrate the reply port into the
TCB, and don't yet set __libc_tls_initialized to 1.

This also lets us move the __init1_desc assignment inside
_hurd_tls_init ().

Fixes cd019ddd89
"hurd: Don't leak __hurd_reply_port0"

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-04-21 03:02:04 +02:00
Sergey Bugaev
88cc282a9a hurd: Make dl-sysdep's open () cope with O_IGNORE_CTTY
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230419160207.65988-6-bugaevc@gmail.com>
2023-04-20 23:05:54 +02:00
Cupertino Miranda
b630be0922 Created tunable to force small pages on stack allocation.
Created tunable glibc.pthread.stack_hugetlb to control when hugepages
can be used for stack allocation.
In case THP are enabled and glibc.pthread.stack_hugetlb is set to
0, glibc will madvise the kernel not to use allow hugepages for stack
allocations.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-04-20 13:54:24 -03:00
Adhemerval Zanella
320768a664 linux: Re-flow and sort multiline Makefile definitions 2023-04-20 10:40:54 -03:00
Sergey Bugaev
8895a99c10 hurd: Microoptimize sigreturn
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-04-18 16:20:09 +02:00
Sergey Bugaev
e411e31b7b hurd: Fix restoring reply port in sigreturn
We must not use the user's reply port (scp->sc_reply_port) for any of
our own RPCs, otherwise various things break. So, use MACH_PORT_DEAD as
a reply port when destroying our reply port, and make sure to do this
after _hurd_sigstate_unlock (), which may do a gsync_wake () RPC.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-04-17 21:00:02 +02:00
Wilco Dijkstra
76d0f094dd math: Improve fmod(f) performance
Optimize the fast paths (x < y) and (x/y < 2^12).  Delay handling of special
cases to reduce the number of instructions executed before the fast paths.
Performance improvements for fmod:

		Skylake	Zen2	Neoverse V1
subnormals	11.8%	4.2%	11.5%
normal		3.9%	0.01%	-0.5%
close-exponents	6.3%	5.6%	19.4%

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-04-17 13:03:10 +01:00
Sergey Bugaev
e275690332 hurd: Only check for TLS initialization inside rtld or in static builds
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.

Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.

Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
2023-04-14 10:31:22 +00:00
Sergey Bugaev
ba00d787f3 hurd: Remove __hurd_local_reply_port
Now that the signal code no longer accesses it, the only real user of it
was mig-reply.c, so move the logic for managing the port there.

If we're in SHARED and outside of rtld, we know that __LIBC_NO_TLS ()
always evaluates to 0, and a TLS reply port will always be used, not
__hurd_reply_port0. Still, the compiler does not see that
__hurd_reply_port0 is never used due to its address being taken. To deal
with this, explicitly compile out __hurd_reply_port0 when we know we
won't use it.

Also, instead of accessing the port via THREAD_SELF->reply_port, this
uses THREAD_GETMEM and THREAD_SETMEM directly, avoiding possible
miscompilations.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-04-14 10:31:22 +00:00
Adhemerval Zanella
05fe3ecfff malloc: Assure that THP mode read do write OOB end of stringt 2023-04-14 08:22:40 -03:00
Adhemerval Zanella
801deb07f6 malloc: Assure that THP mode is always null terminated 2023-04-13 17:18:04 -03:00
Samuel Thibault
decf02d382 hurd: Mark two tests as unsupported
They make the whole testsuite hang/crash.
2023-04-13 02:02:38 +02:00
Samuel Thibault
6538a288be hurd: Restore destroying receive rights on sigreturn
Just subtracting a ref is making signal/tst-signal signal/tst-raise
signal/tst-minsigstksz-5 htl/tst-raise1 fail.
2023-04-13 00:49:16 +02:00
Samuel Thibault
5473a1747a Revert "hurd: Only check for TLS initialization inside rtld or in static builds"
This reverts commit b37899d34d.

Apparently we load libc.so (and thus start using its functions) before
calling TLS_INIT_TP, so libc.so functions should not actually assume
that TLS is always set up.
2023-04-11 18:45:47 +00:00
Sergey Bugaev
cd019ddd89 hurd: Don't leak __hurd_reply_port0
Previously, once we set up TLS, we would implicitly switch from using
__hurd_reply_port0 to reply_port inside the TCB, leaving the former
unused. But we never deallocated it, so it got leaked.

Instead, migrate the port into the new TCB's reply_port slot. This
avoids both the port leak and an extra syscall to create a new reply
port for the TCB.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-28-bugaevc@gmail.com>
2023-04-11 00:24:40 +02:00
Sergey Bugaev
747812349d hurd: Improve reply port handling when exiting signal handlers
If we're doing signals, that means we've already got the signal thread
running, and that implies TLS having been set up. So we know that
__hurd_local_reply_port will resolve to THREAD_SELF->reply_port, and can
access that directly using the THREAD_GETMEM and THREAD_SETMEM macros.
This avoids potential miscompilations, and should also be a tiny bit
faster.

Also, use mach_port_mod_refs () and not mach_port_destroy () to destroy
the receive right. mach_port_destroy () should *never* be used on
mach_task_self (); this can easily lead to port use-after-free
vulnerabilities if the task has any other references to the same port.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-26-bugaevc@gmail.com>
2023-04-10 23:54:28 +02:00
Sergey Bugaev
b37899d34d hurd: Only check for TLS initialization inside rtld or in static builds
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.

Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.

Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
2023-04-10 23:33:30 +02:00
Sergey Bugaev
4644fb9c4c elf: Stop including tls.h in ldsodefs.h
Nothing in there needs tls.h

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-24-bugaevc@gmail.com>
2023-04-10 23:26:28 +02:00
Sergey Bugaev
60f9bf9746 hurd: Port trampoline.c to x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230403115621.258636-3-bugaevc@gmail.com>
2023-04-10 20:44:43 +02:00
Sergey Bugaev
645da826bb hurd: Do not declare local variables volatile
These are just regular local variables that are not accessed in any
funny ways, not even though a pointer. There's absolutely no reason to
declare them volatile. It only ends up hurting the quality of the
generated machine code.

If anything, it would make sense to decalre sigsp as *pointing* to
volatile memory (volatile void *sigsp), but evidently that's not needed
either.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230403115621.258636-2-bugaevc@gmail.com>
2023-04-10 20:42:28 +02:00
Sergey Bugaev
892f702827 hurd: Implement x86_64/intr-msg.h
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-18-bugaevc@gmail.com>
2023-04-10 20:39:28 +02:00
Sergey Bugaev
57df0f16b4 hurd: Add sys/ucontext.h and sigcontext.h for x86_64
This is based on the Linux port's version, but laid out to match Mach's
struct i386_thread_state, much like the i386 version does.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
2023-04-10 20:11:43 +02:00
Flavio Cruz
f7f7dd8009 hurd: Stop depending on the default_pager stubs provided by gnumach
The hurd source tree already provides the same stubs and they are only
needed there.
Message-Id: <ZDN3rDdjMowtUWf7@jupiter.tail36e24.ts.net>
2023-04-10 19:01:52 +02:00
H.J. Lu
81a3cc956e <sys/platform/x86.h>: Add PREFETCHI support
Add PREFETCHI support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
b05521c916 <sys/platform/x86.h>: Add AMX-COMPLEX support
Add AMX-COMPLEX support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
609b7b2d3c <sys/platform/x86.h>: Add AVX-NE-CONVERT support
Add AVX-NE-CONVERT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
4c120c88a6 <sys/platform/x86.h>: Add AVX-VNNI-INT8 support
Add AVX-VNNI-INT8 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
b39741b45f <sys/platform/x86.h>: Add MSRLIST support
Add MSRLIST support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
96037c697d <sys/platform/x86.h>: Add AVX-IFMA support
Add AVX-IFMA support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
8b4cc05eab <sys/platform/x86.h>: Add AMX-FP16 support
Add AMX-FP16 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
227983551d <sys/platform/x86.h>: Add WRMSRNS support
Add WRMSRNS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
a00db8305d <sys/platform/x86.h>: Add ArchPerfmonExt support
Add Architectural Performance Monitoring Extended Leaf (EAX = 23H)
support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
2f02d0d8e1 <sys/platform/x86.h>: Add CMPCCXADD support
Add CMPCCXADD support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
aa528a579b <sys/platform/x86.h>: Add LASS support
Add Linear Address Space Separation (LASS) support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
231bf916ce <sys/platform/x86.h>: Add RAO-INT support
Add RAO-INT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
fb90dc8513 <sys/platform/x86.h>: Add LBR support
Add architectural LBR support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
f47b7d96fb <sys/platform/x86.h>: Add RTM_FORCE_ABORT support
Add RTM_FORCE_ABORT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
f6790a489d <sys/platform/x86.h>: Add SGX-KEYS support
Add SGX-KEYS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
09cc5fee21 <sys/platform/x86.h>: Add BUS_LOCK_DETECT support
Add Bus lock debug exceptions (BUS_LOCK_DETECT) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
8c8e391166 <sys/platform/x86.h>: Add LA57 support
Add 57-bit linear addresses and five-level paging (LA57) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
2d8c590a5e <bits/platform/x86.h>: Rename to x86_cpu_INDEX_7_ECX_15
Rename x86_cpu_INDEX_7_ECX_1 to x86_cpu_INDEX_7_ECX_15 for the unused bit
15 in ECX from CPUID with EAX == 0x7 and ECX == 0.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
John David Anglin
c4468cd399 hppa: Update struct __pthread_rwlock_arch_t comment.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
2023-04-05 18:54:47 +00:00
John David Anglin
e9327e8584 hppa: Revise __TIMESIZE define to use __WORDSIZE
Handle both 32 and 64-bit ABIs.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
2023-04-05 18:35:38 +00:00
Guy-Fleury Iteriteka
5476f8cd2e htl: move pthread_self info libc.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-4-gfleury@disroot.org>
2023-04-05 01:26:36 +02:00
Guy-Fleury Iteriteka
f987e9b7a3 htl: move ___pthread_self into libc.
sysdeps/mach/hurd/htl/pt-pthread_self.c: New file.
htl/Makefile: .. Add it to libc routine.
sysdeps/mach/hurd/htl/pt-sysdep.c(__pthread_self): Remove it.
sysdeps/mach/hurd/htl/pt-sysdep.h(__pthread_self): Add hidden propertie.
htl/Versions(__pthread_self) Version it as private symbol.

Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-3-gfleury@disroot.org>
2023-04-05 01:26:34 +02:00
Andreas Schwab
856bab7717 x86/dl-cacheinfo: remove unsused parameter from handle_amd
Also replace an unreachable assert with __builtin_unreachable.
2023-04-04 16:16:21 +02:00
Adhemerval Zanella
59db5735e6 powerpc: Disable stack protector in early static initialization
Similar to fb95c31638, also disable
for string-ppc64.c (pulled on rltd as the default string
implementation).

Checked on powerpc64-linux-gnu.
2023-04-03 17:42:08 -03:00
Adhemerval Zanella
370da8a121 nptl: Fix tst-cancel30 on sparc64
As indicated by sparc kernel-features.h, even though sparc64 defines
__NR_pause,  it is not supported (ENOSYS).  Always use ppoll or the
64 bit time_t variant instead.
2023-04-03 17:41:59 -03:00
Adhemerval Zanella Netto
16439f419b math: Remove the error handling wrapper from fmod and fmodf
The error handling is moved to sysdeps/ieee754 version with no SVID
support.  The compatibility symbol versions still use the wrapper
with SVID error handling around the new code.  There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).

The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.  For both i686 and m68k,
which provive arch specific implementation, wrappers are added so
no new symbol are added (which would require to change the
implementations).

It shows an small improvement, the results for fmod:

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  x86_64 (Ryzen 9) | subnormals      | 12.5049  | 9.40992
  x86_64 (Ryzen 9) | normal          | 296.939  | 296.738
  x86_64 (Ryzen 9) | close-exponents | 16.0244  | 13.119
  aarch64 (N1)     | subnormal       | 6.81778  | 4.33313
  aarch64 (N1)     | normal          | 155.620  | 152.915
  aarch64 (N1)     | close-exponents | 8.21306  | 5.76138
  armhf (N1)       | subnormal       | 15.1083  | 14.5746
  armhf (N1)       | normal          | 244.833  | 241.738
  armhf (N1)       | close-exponents | 21.8182  | 22.457

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2023-04-03 16:45:27 -03:00
Adhemerval Zanella Netto
cf9cf33199 math: Improve fmodf
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:

   mx * 2^ex == 2 * mx * 2^(ex - 1)

   while (ex > ey)
     {
       mx *= 2;
       --ex;
       mx %= my;
     }

With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 8 (which is sizeof of uint32_t minus
MANTISSA_WIDTH plus the signal bit):

   while (ex > ey)
     {
       mx << 8;
       ex -= 8;
       mx %= my;
     }  */

The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles.  Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.

I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  x86_64 (Ryzen 9) | subnormals      | 17.2549  | 12.0318
  x86_64 (Ryzen 9) | normal          | 85.4096  | 49.9641
  x86_64 (Ryzen 9) | close-exponents | 19.1072  | 15.8224
  aarch64 (N1)     | subnormal       | 10.2182  | 6.81778
  aarch64 (N1)     | normal          | 60.0616  | 20.3667
  aarch64 (N1)     | close-exponents | 11.5256  | 8.39685

I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, and multiplication):

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  armhf (N1)       | subnormal       | 11.6662  | 10.8955
  armhf (N1)       | normal          | 69.2759  | 34.1524
  armhf (N1)       | close-exponents | 13.6472  | 18.2131

Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.

Co-authored-by: kirill <kirill.okhotnikov@gmail.com>

[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2023-04-03 16:45:18 -03:00
Adhemerval Zanella Netto
34b9f8bc17 math: Improve fmod
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:

   mx * 2^ex == 2 * mx * 2^(ex - 1)

   while (ex > ey)
     {
       mx *= 2;
       --ex;
       mx %= my;
     }

With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 11 (which is sizeo of uint64_t minus
MANTISSA_WIDTH plus the signal bit):

   while (ex > ey)
     {
       mx << 11;
       ex -= 11;
       mx %= my;
     }  */

The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles.  Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.

I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  x86_64 (Ryzen 9) | subnormals      | 19.1584  | 12.5049
  x86_64 (Ryzen 9) | normal          | 1016.51  | 296.939
  x86_64 (Ryzen 9) | close-exponents | 18.4428  | 16.0244
  aarch64 (N1)     | subnormal       | 11.153   | 6.81778
  aarch64 (N1)     | normal          | 528.649  | 155.62
  aarch64 (N1)     | close-exponents | 11.4517  | 8.21306

I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, clz, ctz, and multiplication):

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  armhf (N1)       | subnormal       | 15.908   | 15.1083
  armhf (N1)       | normal          | 837.525  | 244.833
  armhf (N1)       | close-exponents | 16.2111  | 21.8182

Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.

Co-authored-by: kirill <kirill.okhotnikov@gmail.com>

[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2023-04-03 16:36:24 -03:00
H.J. Lu
743113d42e x86: Set FSGSBASE to active if enabled by kernel
Linux kernel uses AT_HWCAP2 to indicate if FSGSBASE instructions are
enabled.  If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE
instructions can be used in user space.  Define dl_check_hwcap2 to set
the FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is
set.

Add a test to verify that FSGSBASE is active on current kernels.
NB: This test will fail if the kernel doesn't set the HWCAP2_FSGSBASE
bit in AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-04-03 11:36:48 -07:00
Florian Weimer
5d1ccdda7b x86_64: Fix asm constraints in feraiseexcept (bug 30305)
The divss instruction clobbers its first argument, and the constraints
need to reflect that.  Fortunately, with GCC 12, generated code does
not actually change, so there is no externally visible bug.

Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-03 18:40:52 +02:00
Sergey Bugaev
17841fa7d4 hurd: Add vm_param.h for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-30-bugaevc@gmail.com>
2023-04-03 01:24:13 +02:00
Sergey Bugaev
20427b8f23 hurd: Implement _hurd_longjmp_thread_state for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-29-bugaevc@gmail.com>
2023-04-03 01:23:30 +02:00
Sergey Bugaev
e0bbae0062 htl: Implement thread_set_pcsptp for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-23-bugaevc@gmail.com>
2023-04-03 01:18:27 +02:00
Sergey Bugaev
8d873a4904 x86_64: Add rtld-stpncpy & rtld-strncpy
Just like the other existing rtld-str* files, this provides rtld with
usable versions of stpncpy and strncpy.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-22-bugaevc@gmail.com>
2023-04-03 01:17:56 +02:00
Sergey Bugaev
fb9e7f6732 htl: Add tcb-offsets.sym for x86_64
The source code is the same as sysdeps/i386/htl/tcb-offsets.sym, but of
course the produced tcb-offsets.h will be different.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-21-bugaevc@gmail.com>
2023-04-03 01:15:30 +02:00
Sergey Bugaev
d8b69e89d8 hurd: Move a couple of signal-related files to x86
These do not need any changes to be used on x86_64.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-20-bugaevc@gmail.com>
2023-04-03 01:14:51 +02:00
Sergey Bugaev
a1fbae7527 hurd: Use uintptr_t for register values in trampoline.c
This is more correct, if only because these fields are defined as having
the type unsigned int in the Mach headers, so casting them to a signed
int and then back is suboptimal.

Also, remove an extra reassignment of uesp -- this is another remnant of
the ecx kludge.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-16-bugaevc@gmail.com>
2023-04-03 01:13:28 +02:00
Sergey Bugaev
b43cb67457 hurd: Move rtld-strncpy-c.c out of mach/hurd/
There's nothing Mach- or Hurd-specific about it; any port that ends
up with rtld pulling in strncpy will need this.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-15-bugaevc@gmail.com>
2023-04-03 01:10:23 +02:00
Sergey Bugaev
0001a23f7a hurd: More 64-bit integer casting fixes
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-13-bugaevc@gmail.com>
2023-04-03 01:03:06 +02:00