AArch64 always uses pc relative access to static and hidden object
symbols, but the config setting was previously missing.
This affects ld.so start up code.
GCC 11 supports -march=x86-64-v[234] to enable x86 micro-architecture ISA
levels:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250
and -mneeded to emit GNU_PROPERTY_X86_ISA_1_NEEDED property with
GNU_PROPERTY_X86_ISA_1_V[234] marker:
https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/13
Binutils support for GNU_PROPERTY_X86_ISA_1_V[234] marker were added by
commit b0ab06937385e0ae25cebf1991787d64f439bf12
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Oct 30 06:49:57 2020 -0700
x86: Support GNU_PROPERTY_X86_ISA_1_BASELINE marker
and
commit 32930e4edbc06bc6f10c435dbcc63131715df678
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Oct 9 05:05:57 2020 -0700
x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker
GNU_PROPERTY_X86_ISA_1_NEEDED property in x86 ELF binaries indicate the
micro-architecture ISA level required to execute the binary. The marker
must be added by programmers explicitly in one of 3 ways:
1. Pass -mneeded to GCC.
2. Add the marker in the linker inputs as this patch does.
3. Pass -z x86-64-v[234] to the linker.
Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker support to ld.so if binutils 2.32 or newer is used to build glibc:
1. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
markers to elf.h.
2. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker to abi-note.o based on the ISA level used to compile abi-note.o,
assuming that the same ISA level is used to compile the whole glibc.
3. Add isa_1 to cpu_features to record the supported x86 ISA level.
4. Rename _dl_process_cet_property_note to _dl_process_property_note and
add GNU_PROPERTY_X86_ISA_1_V[234] marker detection.
5. Update _rtld_main_check and _dl_open_check to check loaded objects
with the incompatible ISA level.
6. Add a testcase to verify that dlopen an x86-64-v4 shared object fails
on lesser platforms.
7. Use <get-isa-level.h> in dl-hwcaps-subdirs.c and tst-glibc-hwcaps.c.
Tested under i686, x32 and x86-64 modes on x86-64-v2, x86-64-v3 and
x86-64-v4 machines.
Marked elf/tst-isa-level-1 with x86-64-v4, ran it on x86-64-v3 machine
and got:
[hjl@gnu-cfl-2 build-x86_64-linux]$ ./elf/tst-isa-level-1
./elf/tst-isa-level-1: CPU ISA level is lower than required
[hjl@gnu-cfl-2 build-x86_64-linux]$
Remove the wordsize-64 implementations by merging them into the main dbl-64
directory. The second patch just moves all wordsize-64 files and removes a
few wordsize-64 uses in comments and Implies files.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the wordsize-64 implementations by merging them into the main dbl-64
directory. The first patch adds special cases needed for 32-bit targets
(FIX_INT_FP_CONVERT_ZERO and FIX_DBL_LONG_CONVERT_OVERFLOW) to the
wordsize-64 versions. This has no effect on 64-bit targets since they don't
define these macros.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It sync with gnulib version ae9fb3d66. The testcase for BZ#23741
(stdlib/test-bz22786.c) is adjusted to check also for ENOMEM.
The patch fixes multiple realpath issues:
- Portability fixes for errno clobbering on free (BZ#10635). The
function does not call free directly anymore, although it might be
done through scratch_buffer_free. The free errno clobbering is
being tracked by BZ#17924.
- Pointer arithmetic overflows in realpath (BZ#26592).
- Realpath cyclically call __alloca(path_max) to consume too much
stack space (BZ#26341).
- Realpath mishandles EOVERFLOW; stat not needed anyway (BZ#24970).
The check is done through faccessat now.
Checked on x86_64-linux-gnu and i686-linux-gnu.
This ia regression from 09153638cf, versioned_symbol acts as
weak_alias for !SHARED but it is undefined to avoid non versioned
alias from the generic implementation.
Checked with a build for alpha-linux-gnu.
Calling an IFUNC function defined in unrelocated executable also leads to
segfault. Issue a fatal error message when calling IFUNC function defined
in the unrelocated executable from a shared library.
In the !MAP_FIXED case, when a bogus address is given mmap should pick up a
valide address rather than returning EINVAL: Posix only talks about
EINVAL for the MAP_FIXED case.
This fixes long-running ghc processes.
When copying with "rep movsb", if the distance between source and
destination is N*4GB + [1..63] with N >= 0, performance may be very
slow. This patch updates memmove-vec-unaligned-erms.S for AVX and
AVX512 versions with the distance in RCX:
cmpl $63, %ecx
// Don't use "rep movsb" if ECX <= 63
jbe L(Don't use rep movsb")
Use "rep movsb"
Benchtests data with bench-memcpy, bench-memcpy-large, bench-memcpy-random
and bench-memcpy-walk on Skylake, Ice Lake and Tiger Lake show that its
performance impact is within noise range as "rep movsb" is only used for
data size >= 4KB.
After sp is updated, the CFA offset should be set before next instruction.
Tested in glibc-2.28:
Thread 2 "xxxxxxx" hit Breakpoint 1, _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149
149 stp x1, x2, [sp, #-32]!
Missing separate debuginfos, use: dnf debuginfo-install libgcc-7.3.0-20190804.h24.aarch64
(gdb) bt
#0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149
#1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
at /home/test/test_function.c:30
#2 0x0000000000400c08 in initaaa () at thread.c:58
#3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71
#4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486
#5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) ni
_dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150
150 stp x3, x4, [sp, #16]
(gdb) bt
#0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150
#1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
at /home/test/test_function.c:30
#2 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) ni
_dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157
157 mrs x4, tpidr_el0
(gdb) bt
#0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157
#1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
at /home/test/test_function.c:30
#2 0x0000000000400c08 in initaaa () at thread.c:58
#3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71
#4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486
#5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
Signed-off-by: liqingqing <liqingqing3@huawei.com>
Signed-off-by: Shuo Wang <wangshuo47@huawei.com>
Make the tests use TEST_COND_intel96 to decide on whether to build the
unnormal tests instead of the macro in nan-pseudo-number.h and then
drop the header inclusion. This unbreaks test runs on all
architectures that do not have ldbl-96.
Also drop the HANDLE_PSEUDO_NUMBERS macro since it is not used
anywhere.
I've updated copyright dates in glibc for 2021. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.
Please remember to include 2021 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
DELOUSE was added to asm code to make them compatible with non-LP64
ABIs, but it is an unfortunate name and the code was not compatible
with ABIs where pointer and size_t are different. Glibc currently
only supports the LP64 ABI so these macros are not really needed or
tested, but for now the name is changed to be more meaningful instead
of removing them completely.
Some DELOUSE macros were dropped: clone, strlen and strnlen used it
unnecessarily.
The out of tree ILP32 patches are currently not maintained and will
likely need a rework to rebase them on top of the time64 changes.
clone already uses r31 to temporarily save input arguments before doing the
syscall, so we use a different register to read from the TCB. We can also avoid
allocating another stack frame, which is not needed since we can simply extend
the usage of the red zone.
Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later. The new codepath provides better
performance (see below) if compared to using sc. For the
foreseeable future, both sc and scv mechanisms will co-exist, so this
patch enables glibc to do a runtime check and use scv when it is
available.
Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel. If not,
we fallback to sc and keep the old behavior.
The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.
For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone). To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from
hwcap2 is cached on a non-volatile register.
This is not needed when using INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.
The dynamic loader may issue syscalls before the TCB has been setup
so it always uses sc with no extra checks. For the static case, there
is no compile-time way to determine if we are inside startup code,
so we also check the value of the thread pointer before effectively
accessing the TCB. For such situations in which the availability of
scv cannot be determined, sc is always used.
Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later. For now simply use sc as before.
Average performance over 1M calls for each syscall "type":
- stat: C wrapper calling INTERNAL_SYSCALL
- getpid: templated ASM syscall
- syscall: call to gettid using syscall function
Standard:
stat : 1.573445 us / ~3619 cycles
getpid : 0.164986 us / ~379 cycles
syscall : 0.162743 us / ~374 cycles
With scv:
stat : 1.537049 us / ~3535 cycles <~ -84 cycles / -2.32%
getpid : 0.109923 us / ~253 cycles <~ -126 cycles / -33.25%
syscall : 0.116410 us / ~268 cycles <~ -106 cycles / -28.34%
Tested on powerpc, powerpc64, powerpc64le (with and without scv)
Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Add support to treat pseudo-numbers specially and implement x86
version to consider all of them as signaling.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
With xmknod wrapper functions removed (589260cef8), the mknod functions
are now properly exported, and version is done using symbols versioning
instead of the extra _MKNOD_* argument.
It also allows us to consolidate Linux and Hurd mknod implementation.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
With xstat wrapper functions removed (8ed005daf0), the stat functions
are now properly exported, and version is done using symbols versioning
instead of the extra _STAT_* argument.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
The new __proc_waitid RPC now expects WEXITED to be passed, allowing to
properly implement waitid, and thus define the missing W* macros
(according to FreeBSD values).
Instead of having the arch-specific trampoline setup code detect whether
preemption happened or not, we'd rather pass it the sigaction. In the
future, this may also allow to change sa_flags from post_signal().
Do not attempt to fix the significand top bit in long double input
received in printf. The code should never reach here because isnan
should now detect unnormals as NaN. This is already a NOP for glibc
since it uses the gcc __builtin_isnan, which detects unnormals as NaN.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This syncs up isnanl behaviour with gcc. Also move the isnanl
implementation to sysdeps/x86 and remove the sysdeps/x86_64 version.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Also move sysdeps/i386/fpu/s_fpclassifyl.c to
sysdeps/x86/fpu/s_fpclassifyl.c and remove
sysdeps/x86_64/fpu/s_fpclassifyl.c
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add Intel Linear Address Masking (LAM) support to <sys/platform/x86.h>.
HAS_CPU_FEATURE (LAM) can be used to detect if LAM is enabled in CPU.
LAM modifies the checking that is applied to 64-bit linear addresses,
allowing software to use of the untranslated address bits for metadata.
Add various defines and stubs for enabling MTE on AArch64 sysv-like
systems such as Linux. The HWCAP feature bit is copied over in the
same way as other feature bits. Similarly we add a new wrapper header
for mman.h to define the PROT_MTE flag that can be used with mmap and
related functions.
We add a new field to struct cpu_features that can be used, for
example, to check whether or not certain ifunc'd routines should be
bound to MTE-safe versions.
Finally, if we detect that MTE should be enabled (ie via the glibc
tunable); we enable MTE during startup as required.
Support in the Linux kernel was added in version 5.10.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Older versions of the Linux kernel headers obviously lack support for
memory tagging, but we still want to be able to build in support when
using those (obviously it can't be enabled on such systems).
The linux kernel extensions are made to the platform-independent
header (linux/prctl.h), so this patch takes a similar approach.
This patch adds the basic support for memory tagging.
Various flavours are supported, particularly being able to turn on
tagged memory at run-time: this allows the same code to be used on
systems where memory tagging support is not present without neededing
a separate build of glibc. Also, depending on whether the kernel
supports it, the code will use mmap for the default arena if morecore
does not, or cannot support tagged memory (on AArch64 it is not
available).
All the hooks use function pointers to allow this to work without
needing ifuncs.
Reviewed-by: DJ Delorie <dj@redhat.com>
This is clever, but it confuses downstream detection in at least zstd
and GNOME's glib. zstd has preprocessor tests for the 'st_mtime' macro,
which is not provided by the path using the anonymous union; glib checks
for the presence of 'st_mtimensec' in struct stat but then tries to
access that field in struct statx (which might be a bug on its own).
Checked with a build for alpha-linux-gnu.
setjmp() uses C code to store current registers into jmp_buf
environment. -fstack-protector-all places canary into setjmp()
prologue and clobbers 'a5' before it gets saved.
The change inhibits stack canary injection to avoid clobber.
When SA_SIGINFO is available, sysdeps/posix/s?profil.c use it, so we have to
fix the __profil_counter function accordingly, using sigcontextinfo.h's
sigcontext_get_pc.
SA_SIGINFO is actually just another way of expressing what we were
already passing over with struct sigcontext. This just introduces the
SIGINFO interface and fixes the posix values when that interface is
requested by the application.
asin and acos have slow paths for rounding the last bit that cause some
calls to be 500-1500x slower than average calls.
These slow paths are rare, a test of a trillion (1.000.000.000.000)
random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.
The slow paths claim correct rounding and use __sin32() and __cos32()
(which compare two result candidates and return the closest one) as the
final step, with the second result candidate (res1) having a small offset
applied from res. This suggests that res and res1 are intended to be 1
ULP apart (which makes sense for rounding), barring bugs, allowing us to
pick either one and still remain within 1 ULP of the exact result.
Remove the slow paths as the accuracy is better than 1 ULP even without
them, which is enough for glibc.
Also remove code comments claiming correctly rounded results.
After slow path removal, checking the accuracy of 14.400.000.000 random
asin() and acos() inputs showed only three incorrectly rounded
(error > 0.5 ULP) results:
- asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
- asin(-0x1.f692ba202abcp-4) (0.500003 ULP, same as before)
- asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
The first two had the same error even before this commit, and they did
not use the slow path at all.
Checking 4934 known randomly found previously-slow-path asin inputs
shows 25 calls with incorrectly rounded results, with a maximum error of
0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
rounded all these inputs correctly (error < 0.5 ULP).
The observed average speed increase was 130x.
Checking 36240 known randomly found previously-slow-path acos inputs
shows 42 calls with incorrectly rounded results, with a maximum error of
0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
slow-path code showed 34 calls with incorrectly rounded results, with the
same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
The observed average speed increase was 130x.
The functions could likely be trimmed more while keeping acceptable
accuracy, but this at least gets rid of the egregiously slow cases.
Tested on x86_64.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.10. (There are no new MAP_* constants covered by this test in
5.10 that need any other header changes.)
Tested with build-many-glibcs.py.
GCC 6.5 fails to correctly build ldconfig with recent ld.so.cache
commits, e.g.:
785969a047
elf: Implement a string table for ldconfig, with tail merging
If glibc is build with gcc 6.5.0:
__builtin_add_overflow is used in
<glibc>/elf/stringtable.c:stringtable_finalize()
which leads to ldconfig failing with "String table is too large".
This is also recognizable in following tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable
See gcc "Bug 98269 - gcc 6.5.0 __builtin_add_overflow() with small
uint32_t values incorrectly detects overflow"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269)
Change sbrk to fail for !__libc_initial (in the generic
implementation). As a result, sbrk is (relatively) safe to use
for the __libc_initial case (from the main libc). It is therefore
no longer necessary to avoid using it in that case (or updating the
brk cache), and the __libc_initial flag does not need to be updated
as part of dlmopen or static dlopen.
As before, direct brk system calls on Linux may lead to memory
corruption.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 5.10 has one new syscall, process_madvise. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
This symbol is not in the implementation reserved namespace for static
linking and it was never used: it seems it was mistakenly added in the
orignal strlen_asimd commit 436e4d5b96
Since we can't tell if the tunable value is set by user or not:
https://sourceware.org/bugzilla/show_bug.cgi?id=27069
remove the default REP MOVSB threshold tunable value so that the correct
default value will be set correctly by init_cacheinfo ().
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Re-mmap executable segments if possible instead of using mprotect
to add PROT_BTI. This allows using BTI protection with security
policies that prevent mprotect with PROT_EXEC.
If the fd of the ELF module is not available because it was kernel
mapped then mprotect is used and failures are ignored. To protect
the main executable even when mprotect is filtered the linux kernel
will have to be changed to add PROT_BTI to it.
The delayed failure reporting is mainly needed because currently
_dl_process_gnu_properties does not propagate failures such that
the required cleanups happen. Using the link_map_machine struct for
error propagation is not ideal, but this seemed to be the least
intrusive solution.
Fixes bug 26831.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.
When the fd is not available (kernel loaded modules), then -1
is passed.
The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Handle unaligned executable load segments (the bfd linker is not
expected to produce such binaries, but other linkers may).
Computing the mapping bounds follows _dl_map_object_from_fd more
closely now.
Fixes bug 26988.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The _dl_open_check and _rtld_main_check hooks are not called on the
dependencies of a loaded module, so BTI protection was missed on
every module other than the main executable and directly dlopened
libraries.
The fix just iterates over dependencies to enable BTI.
Fixes bug 26926.
It removes all the arch-specific assembly implementation. The
outliers are alpha, where its kernel ABI explict return -ENOMEM
in case of failure; and i686, where it can't use
"call *%gs:SYSINFO_OFFSET" during statup in static PIE.
Also some ABIs exports an additional ___brk_addr symbol and to
handle it an internal HAVE_INTERNAL_BRK_ADDR_SYMBOL is added.
Checked on x86_64-linux-gnu, i686-linux-gnu, adn with builsd for
the affected ABIs.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Subdirectories z13, z14, z15 can be selected, mostly based on the
level of support for vector instructions.
Co-Authored-By: Stefan Liebler <stli@linux.ibm.com>
float_t supposedly represents the type that is used to evaluate float
expressions internally. While the isa supports single-precision float
operations, the port of glibc to s390 incorrectly deferred to the
generic definitions which, back then, tied float_t to double. gcc by
default evaluates float in single precision, so that scenario violates
the C standard (sections 5.2.4.2.2 and 7.12 in C11/C17). With
-fexcess-precision=standard, gcc evaluates float in double precision,
which aligns with the standard yet at the cost of added conversion
instructions.
With this patch, we drop the s390-specific definition of float_t and
defer to the default behavior, which aligns float_t with the
compiler-defined FLT_EVAL_METHOD in a standard-compliant way.
Checked on s390x-linux-gnu with 31-bit and 64-bit builds.
The functions strtoimax, strtoumax, wcstoimax, wcstoumax currently
have three implementations each (wordsize-32, wordsize-64 and dummy
implementation in stdlib/ using #error), defining the functions as
thin wrappers round corresponding *_internal functions. Simplify the
code by changing them into aliases of functions such as strtol and
wcstoull. This is more consistent with how e.g. imaxdiv is handled.
Tested for x86_64 and x86.
This code manages the mappings of the available databases in NSS
(i.e. passwd, hosts, netgroup, etc) with the actions that should
be taken to do a query on those databases.
This is the main API between query functions scattered throughout
glibc and the underlying code (actions, modules, etc).
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Check HAS_CPU_FEATURE instead of CPU_FEATURE_USABLE for FSGSBASE, IBT,
LM, SHSTK and XSAVES since FSGSBASE requires kernel support, IBT/SHSTK/LM
require OS support and XSAVES is supervisor-mode only.
Following macros: lll_futex_timed_lock_pi, lll_futex_clock_wait_bitset,
lll_futex_wait_requeue_pi, lll_futex_timed_wait_requeue_pi are not
used anymore so are eligible for removal.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
After gai_suspend and aio_suspend conversion to support 64 bit time and
hence rewriting the code to use only absolute variants of futex wait
functions (i.e. __futex_abstimed_wait64 and __futex_abstimed_wait_cancelable64)
futex_reltimed_wait{_cancelable} are not needed anymore and can be removed.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
This change uses (in gai_misc.h):
- __futex_abstimed_wait64 (instead of futex_reltimed_wait)
- __futex_abstimed_wait_cancellable64
(instead of futex_reltimed_wait_cancellable)
from ./sysdeps/nptl/futex-helpers.h
The gai_suspend() accepts relative timeout, which then is converted to
absolute one.
The i686-gnu port (HURD) do not define DONT_NEED_GAI_MISC_COND and as it
doesn't (yet) support 64 bit time it uses not converted
pthread_cond_timedwait().
The __gai_suspend() is supposed to be run on ports with __TIMESIZE !=64 and
__WORDSIZE==32. It internally utilizes __gai_suspend_time64() and hence the
conversion from 32 bit struct timespec to 64 bit one is required.
For ports supporting 64 bit time the __gai_suspend_time64() will be used
either via alias (to __gai_suspend when __TIMESIZE==64) or redirection
(when -D_TIME_BITS=64 is passed).
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Libraries from these subdirectories are added to the cache
with a special hwcap bit DL_CACHE_HWCAP_EXTENSION, so that
they are ignored by older dynamic loaders.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
A previously unused new-format header field is used to record
the address of an extension directory.
This change adds a demo extension which records the version of
ldconfig which builds a file.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use a reserved byte in the new format cache header to indicate whether
the file is in little endian or big endian format. Eventually, this
information could be used to provide a unified cache for qemu-user
and similiar scenarios.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This hacks non-power-set processing into _dl_important_hwcaps.
Once the legacy hwcaps handling goes away, the subdirectory
handling needs to be reworked, but it is premature to do this
while both approaches are still supported.
ld.so supports two new arguments, --glibc-hwcaps-prepend and
--glibc-hwcaps-mask. Each accepts a colon-separated list of
glibc-hwcaps subdirectory names. The prepend option adds additional
subdirectories that are searched first, in the specified order. The
mask option restricts the automatically selected subdirectories to
those listed in the option argument. For example, on systems where
/usr/lib64 is on the library search path,
--glibc-hwcaps-prepend=valgrind:debug causes the dynamic loader to
search the directories /usr/lib64/glibc-hwcaps/valgrind and
/usr/lib64/glibc-hwcaps/debug just before /usr/lib64 is searched.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The previous definition of THREAD_SELF did not tell the compiler
that %fs (or %gs) usage is invalid for the !DL_LOOKUP_GSCOPE_LOCK
case in _dl_lookup_symbol_x. As a result, ld.so could try to use the
TCB before it was initialized.
As the comment in tls.h explains, asm volatile is undesirable here.
Using the __seg_fs (or __seg_gs) namespace does not interfere with
optimization, and expresses that THREAD_SELF is potentially trapping.
This reverts commit 81b83ff61f to move
__xmknod{at} back to default symbols. ABIs with default symbol version
of 2.33 or newer (such as riscv32) continue to just provide the mknod*
symbols.
The idea is to not force static libraries built against old glibc
to update against new glibcs (since they reference the the
xmknod{at} symbols).
Checked on x86_64-linux-gnu and i686-linux-gnu.
This reverts commit 20b39d5946 to move
{f}xstat{at} back to default symbols. ABIs with default symbol version
of 2.33 or newer (such as riscv32) continue to just provide the stat
symbols.
The idea is to not force static libraries built against old glibc
to update against new glibcs (since they reference the old
{f}xstat{at} symbols).
Checked on x86_64-linux-gnu and i686-linux-gnu.
The earlier implementation of this, __lll_clocklock, calls lll_clockwait
that doesn't return the futex syscall error codes. It always tries again
if that fails.
However in the current implementation, when the futex returns EAGAIN,
__futex_clocklock64 will also return EGAIN, even if the futex is taken.
This patch fixes the EAGAIN issue and also adds a check for EINTR. As
futex syscall can return EINTR if the thread is interrupted by a signal.
In this case I'm assuming the function should continue trying to lock as
there is no mention to about it on POSIX. Also add a test for both
scenarios.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Programatically generate simple wrappers for interesting libm *f128
objects. Selected functions are transcendental functions or
those with trivial compiler builtins. This can result in a 2-3x
speedup (e.g logf128 and expf128).
A second set of implementation files are generated which include
the first implementation encountered along the search path. This
usually works, except when a wrapper is overriden and makefile
search order slightly diverges from include order. Likewise,
wrapper object files are created for each generated file. These
hold the ifunc selection routines which export ABI.
Next, several shared headers are intercepted to control renaming of
asm function redirects are used first, and sometimes macro renames
if the former is impractical.
Notably, if the request machine supports hardware IEEE128 (i.e POWER9
and newer) this ifunc machinery is disabled. Likewise existing
ifunc support for float128 is consolidated into this (e.g sqrtf128
and fmaf128).
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
The aio_suspend function has been converted to support 64 bit time.
This change uses (in aio_misc.h):
- __futex_abstimed_wait64 (instead of futex_reltimed_wait)
- __futex_abstimed_wait_cancellable64
(instead of futex_reltimed_wait_cancellable)
from ./sysdeps/nptl/futex-helpers.h
The aio_suspend() accepts relative timeout, which then is converted to
absolute one.
The i686-gnu port (HURD) do not define DONT_NEED_AIO_MISC_COND and as it
doesn't (yet) support 64 bit time it uses not converted
pthread_cond_timedwait().
The __aio_suspend() is supposed to be run on ports with __TIMESIZE !=64 and
__WORDSIZE==32. It internally utilizes __aio_suspend_time64() and hence the
conversion from 32 bit struct timespec to 64 bit one is required.
For ports supporting 64 bit time the __aio_suspend_time64() will be used
either via alias (to __aio_suspend when __TIMESIZE==64) or redirection
(when -D_TIME_BITS=64 is passed).
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The 878fe624d4 changed lll_futex_timed_wait, which expects a relative
timeout, with a __futex_abstimed_wait64, which expects an absolute
timeout. However the code still passes a relative timeout.
Also, the PTHREAD_PRIO_PROTECT support for clocks different than
CLOCK_REALTIME was broken since the inclusion of
pthread_mutex_clocklock (9d20e22e46) since lll_futex_timed_wait
always use CLOCK_REALTIME.
This patch fixes by removing the relative time calculation. It
also adds some xtests that tests both thread and inter-process
usage.
Checked on x86_64-linux-gnu.
The commit 605f38177d (sh: Split BE/LE abilist) did not take in
consideration the SH4 fpu support.
Checked with a build for sh4-linux-gnu and manually checked that
the implementations at sysdeps/sh/sh4/fpu/ are selected.
John Paul Adrian Glaubitz also confirmed it fixes the build issues
he encontered.
The align the GNU extension with the others one that accept specify
which clock to wait for (such as pthread_mutex_clocklock).
Check on x86_64-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Linux futex FUTEX_LOCK_PI operation only supports CLOCK_REALTIME,
so pthread_mutex_clocklock operation with priority aware mutexes
may fail depending of the input timeout.
Also, it is not possible to convert a CLOCK_MONOTONIC to a
CLOCK_REALTIME due the possible wall clock time change which might
invalid the requested timeout.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
For non null timeouts, the __futex_clocklock_wait64 creates an a
relative timeout by subtracting the current time from the input
argument. The same behavior can be obtained with FUTEX_WAIT_BITSET
without the need to calculate the relative timeout. Besides consolidate
the code it also avoid the possible relative timeout issues [1].
The __futex_abstimed_wait64 needs also to return EINVAL syscall
errors.
Checked on x86_64-linux-gnu and i686-linux-gnu.
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119881.html
Reviewed-by: Lukasz Majewski <lukma@denx.de>
And add a small optimization to avoid setting the operation for the
32-bit time fallback operation.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
It can be replaced with a __futex_abstimed_wait_cancelable64 call,
with the advantage that there is no need to further clock adjustments
to create a absolute timeout. It allows to remove the now ununsed
futex_timed_wait_cancel64 internal function.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
It is used solely on __pthread_cond_wait_common and the call can be
replaced by a __futex_abstimed_wait_cancelable64 one.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
The __futex_abstimed_wait usage was remove with 3102e28bd1 and the
__futex_abstimed_wait_cancelable by 323592fdc9 and b8d3e8fbaa.
The futex_lock_pi can be replaced by a futex_lock_pi64.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
PT_THREAD_POINTER is currenty defined inside a #ifndef __ASSEMBLER__ block, but
its usage should not be limited to C code, as it can be useful when accessing
the TLS from assembly code as well.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
On GNU/Hurd we not only need $(common-objpfx) in LD_LIBRARY_PATH when loading
dynamic objects, but also $(common-objpfx)/mach and $(common-objpfx)/hurd. This
adds an ld-library-path variable to be used as LD_LIBRARY_PATH basis in
Makefiles, and a sysdep-ld-library-path variable for sysdeps to add some
more paths, here mach/ and hurd/.
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On ports with __TIMESIZE != 64 the remaining time argument always receives
pointer to struct __timespec64 instance. This is the different behavior
when compared to 64 bit versions of clock_nanosleep and nanosleep
functions, which receive NULL.
To avoid any potential issues, we also pass NULL when *rem pointer is
NULL.
Reported-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
The thrd_sleep function has been converted to support 64 bit time.
It was also necessary to provide Linux specific copy of it to avoid
problems on i686-gnu (i.e. HURD) port, which is not providing
clock_nanosleep() supporting 64 bit time.
The thrd_sleep is a wrapper on POSIX threads to provide C11 standard
threads interface. It directly calls __clock_nanosleep64().
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
The mtx_timedlock function has been converted to support 64 bit time.
It was also necessary to provide Linux specific copy of it to avoid
problems on i686-gnu (i.e. HURD) port, which is not providing
pthread_mutex_timedlock() supporting 64 bit time.
The mtx_timedlock is a wrapper on POSIX threads to provide C11 standard
threads interface. It directly calls __pthread_mutex_timedlock64().
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
The cnd_timedwait function has been converted to support 64 bit time.
It was also necessary to provide Linux specific copy of it to avoid
problems on i686-gnu (i.e. HURD) port, which is not providing
pthread_cond_timedwait() supporting 64 bit time.
Moreover, a linux specific copy of thrd_priv.h header file has been
added as well.
The cnd_timedwait is a wrapper on POSIX threads to provide C11 standard
threads interface. It directly calls __pthread_cond_timedwait64().
Build tests:
./src/scripts/build-many-glibcs.py glibcs
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
See
https://sourceware.org/pipermail/libc-alpha/2020-November/119575.html
lib{mach,hurd}user.so gets relocated before libc.so, but its references
to strpcpy and memcpy would need an ifunc decision, which e.g. on
x86 relies on cpu_features, but libc.so's _rtld_global_ro is not
relocated yet.
We can however just make lib{mach,hurd}user.so only call non-ifunc
functions, which can be relocated before libc.so is relocated.
The tls.h inclusion is not really required and limits possible
definition on more arch specific headers.
This is a cleanup to allow inline functions on sysdep.h, more
specifically on i386 and ia64 which requires to access some tls
definitions its own.
No semantic changes expected, checked with a build against all
affected ABIs.
Most systems are SMP, so optimizing for the UP case is no longer
approriate. A dynamic check based on the kernel identification
has been only implemented for i386 anyway.
To disable adaptive mutexes on sh, define DEFAULT_ADAPTIVE_COUNT
as zero for this architecture.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The UP macro is never defined. Also define LOCK_PREFIX
unconditionally, to the same string.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Now that _hurd_libc_proc_init is idempotent, we can always call it,
independently of the __libc_multiple_libcs test which may not match
whether signals should be started or not.
Add stpncpy support into the POWER9 strncpy.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Similar to the strcpy P9 optimization, this version uses VSX to improve
performance.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Notifying the proc server is an involved task, and unleashes various signal
handling etc. so we have to do this after e.g. ifunc relocations are
completed.
Since htl does not actually need a stack switch, we can initialize it
like nptl is, avoiding all sorts of startup issues with ifunc.
More precisely, htl defines __pthread_initialize_minimal instead of the
elder _cthread_init_routine. We can then drop the stack switching dances.
__attribute__((__aligned__)) selects an alignment that depends on
the micro-architecture selected by GCC flags. Enabling vector
extensions may increase the allignment. This is a problem when
building glibc as a collection of ELF multilibs with different
GCC flags because ld.so and libc.so/libpthread.so/&c may end up
with a different layout of struct pthread because of the
changing offset of its struct _Unwind_Exception field.
Tested-By: Matheus Castanho <msc@linux.ibm.com>
We need NO_RTLD_HIDDEN because of the need for PLT calls in ld.so.
See Roland's comment in
https://sourceware.org/bugzilla/show_bug.cgi?id=15605
"in the Hurd it's crucial that calls like __mmap be the libc ones
instead of the rtld-local ones after the bootstrap phase, when the
dynamic linker is being used for dlopen and the like."
We used to just avoid all hidden use in the rtld ; this commit switches to
keeping only those that should use PLT calls, i.e. essentially those defined in
sysdeps/mach/hurd/dl-sysdep.c:
__assert_fail
__assert_perror_fail
__*stat64
_exit
This fixes a few startup issues, notably the call to __tunable_get_val that is
made before PLTs are set up.
DL_SYSDEP_INIT and DL_PLATFORM_INIT were not getting called, leading to
missing x86 platform tuning, now mandatory with 0f09154c64
("x86: Initialize CPU info via IFUNC relocation [BZ 26203]")
Add support to query cache information on RISC-V through sysconf()
function. The cache information had been added in AUX vector of RISC-V
architecture in Linux kernel v.5.10-rc1.
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
This is required for the debugglibc.sh script to work. Tested by
successfully using this patched script, and a riscv64-linux testsuite
run.
We could perhaps call RTLD_EPILOGUE for ENTRY_POINT before calling
RTLD_PROLOGUE for _dl_start_user, but I don't think it matters.
OK?
Jim
The adjtime interface allows return the amount of time remaining
from any previous adjustment that has not yet been completed by
passing a NULL as first argument. This was introduced with y2038
support 0308077e3a.
Checked on i686-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
* sysdeps/unix/bsd/getpt.c (__getpt): Add oflag parameter, pass
it to the _open call and rename to...
(__bsd_openpt): ... new function.
(__getpt): Reimplement on top of __bsd_openpt.
(__posix_openpt): Replace stub with implementation on top of __bsd_openpt.
(posix_openpt): Remove stub warning.
The #include <sys/msg.h> is redundant as we do not use message specific
types for issuing syscalls to handle msg and shm. Only msgctl requires
this header.
Build tests:
./src/scripts/build-many-glibcs.py glibcs
This test fails without bug 26798 fixed because some integer registers
likely get clobbered by lazy binding and variant PCS only allows x16
and x17 to be clobbered at call time.
The test requires binutils 2.32.1 or newer for handling variant PCS
symbols. SVE registers are not covered by this test, to avoid the
complexity of handling multiple compile- and runtime feature support
cases.
The variant PCS support was ineffective because in the common case
linkmap->l_mach.plt == 0 but then the symbol table flags were ignored
and normal lazy binding was used instead of resolving the relocs early.
(This was a misunderstanding about how GOT[1] is setup by the linker.)
In practice this mainly affects SVE calls when the vector length is
more than 128 bits, then the top bits of the argument registers get
clobbered during lazy binding.
Fixes bug 26798.
GCC 11 introduces a -Wstringop-overflow warning for calls to functions
with an array argument passed as a pointer to memory not large enough
for that array. This includes the __sigsetjmp calls from
pthread_cleanup_push macros, because those use a structure in
__pthread_unwind_buf_t, which has a common initial subsequence with
jmp_buf but does not include the saved signal mask; this is OK in this
case because the second argument to __sigsetjmp is 0 so the signal
mask is not accessed.
To avoid this warning, use a function alias __sigsetjmp_cancel with
first argument an array of exactly the type used in the calls to the
function, if using GCC 11 or later. With older compilers, continue to
use __sigsetjmp with a cast, to avoid any issues with compilers
predating the returns_twice attribute not applying the same special
handling to __sigsetjmp_cancel as to __sigsetjmp.
Tested with build-many-glibcs.py for arm-linux-gnueabi that this fixes
the testsuite build failures.