This reverts commit 20b39d5946 for static
library. This avoids the need to rebuild the world for the case where
libstdc++ (and potentially other libraries) are linked to a old glibc.
To avoid requering to provide xstat symbols for newer ABIs (such as
riscv32) a new LIB_COMPAT macro is added. It is similar to SHLIB_COMPAT
but also works for static case (thus evaluating similar to SHLIB_COMPAT
for both shared and static case).
Checked with a check-abi on all affected ABIs. I also check if the
static library does contains the xstat symbols.
In commit 863d775c48, kunpeng920 is added to default memcpy version,
however, there is performance degradation when the copy size is some large bytes, eg: 100k.
This is the result, tested in glibc-2.28:
before backport after backport Performance improvement
memcpy_1k 0.005 0.005 0.00%
memcpy_10k 0.032 0.029 10.34%
memcpy_100k 0.356 0.429 -17.02%
memcpy_1m 7.470 11.153 -33.02%
This is the demo
#include "stdio.h"
#include "string.h"
#include "stdlib.h"
char a[1024*1024] = {12};
char b[1024*1024] = {13};
int main(int argc, char *argv[])
{
int i = atoi(argv[1]);
int j;
int size = atoi(argv[2]);
for (j = 0; j < i; j++)
memcpy(b, a, size*1024);
return 0;
}
# gcc -g -O0 memcpy.c -o memcpy
# time taskset -c 10 ./memcpy 100000 1024
Co-authored-by: liqingqing <liqingqing3@huawei.com>
Extern symbol access in position independent code usually involves GOT
indirection which needs RELATIVE reloc in a static linked PIE. (On
some targets this is avoided e.g. because the linker can relax a GOT
access to a pc-relative access, but this is not generally true.) Code
that runs before static PIE self relocation must avoid relying on
dynamic relocations which can be ensured by using hidden visibility.
However we cannot just make all symbols hidden:
On i386, all calls to IFUNC functions must go through PLT and calls to
hidden functions CANNOT go through PLT in PIE since EBX used in PIE PLT
may not be set up for local calls to hidden IFUNC functions.
This patch aims to make symbol references hidden in code that is used
before and by _dl_relocate_static_pie when building a static PIE libc.
Note: for an object that is used in the startup code, its references
and definition may not have consistent visibility: it is only forced
hidden in the startup code.
This is needed for fixing bug 27072.
Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add SUPPORT_STATIC_PIE that targets can define if they support
static PIE. This requires PI_STATIC_AND_HIDDEN support and various
linker features as described in
commit 9d7a3741c9
Add --enable-static-pie configure option to build static PIE [BZ #19574]
Currently defined on x86_64, i386 and aarch64 where static PIE is
known to work.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In <sys/platform/x86.h>, define CPU features as enum instead of using
the C preprocessor magic to make it easier to wrap this functionality
in other languages. Move the C preprocessor magic to internal header
for better GCC codegen when more than one features are checked in a
single expression as in x86-64 dl-hwcaps-subdirs.c.
1. Rename COMMON_CPUID_INDEX_XXX to CPUID_INDEX_XXX.
2. Move CPUID_INDEX_MAX to sysdeps/x86/include/cpu-features.h.
3. Remove struct cpu_features and __x86_get_cpu_features from
<sys/platform/x86.h>.
4. Add __x86_get_cpuid_feature_leaf to <sys/platform/x86.h> and put it
in libc.
5. Make __get_cpu_features() private to glibc.
6. Replace __x86_get_cpu_features(N) with __get_cpu_features().
7. Add _dl_x86_get_cpu_features to GLIBC_PRIVATE.
8. Use a single enum index for each CPU feature detection.
9. Pass the CPUID feature leaf to __x86_get_cpuid_feature_leaf.
10. Return zero struct cpuid_feature for the older glibc binary with a
smaller CPUID_INDEX_MAX [BZ #27104].
11. Inside glibc, use the C preprocessor magic so that cpu_features data
can be loaded just once leading to more compact code for glibc.
256 bits are used for each CPUID leaf. Some leaves only contain a few
features. We can add exceptions to such leaves. But it will increase
code sizes and it is harder to provide backward/forward compatibilities
when new features are added to such leaves in the future.
When new leaves are added, _rtld_global_ro offsets will change which
leads to race condition during in-place updates. We may avoid in-place
updates by
1. Rename the old glibc.
2. Install the new glibc.
3. Remove the old glibc.
NB: A function, __x86_get_cpuid_feature_leaf , is used to avoid the copy
relocation issue with IFUNC resolver as shown in IFUNC resolver tests.
Since __libc_init_secure is called before ARCH_SETUP_TLS, it must use
"int $0x80" for system calls in i386 static PIE. Add startup_getuid,
startup_geteuid, startup_getgid and startup_getegid to <startup.h>.
Update __libc_init_secure to use them.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add extra-test-objs to test-extras so that they are compiled with
-DMODULE_NAME=testsuite instead of -DMODULE_NAME=libc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
1. Move x86 processor cache info to _dl_x86_cpu_features in ld.so.
2. Update tunable bounds with TUNABLE_SET_WITH_BOUNDS.
3. Move x86 cache info initialization to dl-cacheinfo.h and initialize
x86 cache info in init_cpu_features ().
4. Put x86 cache info for libc in cacheinfo.h, which is included in
libc-start.c in libc.a and is included in cacheinfo.c in libc.so.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Store ISA level in the portion of the unused upper 32 bits of the hwcaps
field in cache and the unused pad field in aux cache. ISA level is stored
and checked only for shared objects in glibc-hwcaps subdirectories. The
shared objects in the default directories aren't checked since there are
no fallbacks for these shared objects.
Tested on x86-64-v2, x86-64-v3 and x86-64-v4 machines with
--disable-hardcoded-path-in-tests and --enable-hardcoded-path-in-tests.
The first getrandom is used only for __GT_NOCREATE, which is inherently
insecure and can use the entropy as a small improvement. On the
second and later attempts it might help against DoS attacks.
It sync with gnulib commit 854fbb81d91f7a0f2b463e7ace2499dee2f380f2.
Checked on x86_64-linux-gnu.
It syncs with gnulib commit b1268f22f443e8e4b9e. The try_tempname_len
now uses getrandom on each iteration to get entropy and only uses the
clock plus ASLR as source of entropy if getrandom fails.
Checked on x86_64-linux-gnu and i686-linux-gnu.
POSIX states that system returned code for failure to execute the shell
shall be as if the shell had terminated using _exit(127). This
behaviour was removed with 5fb7fc9635.
Checked on x86_64-linux-gnu.
The $gp register may be used to access the global variable in
the PDE program, so the $gp register should be initialized before
executing the IFUNC resolver of PDE program to avoid unexpected
error occurs.
AArch64 always uses pc relative access to static and hidden object
symbols, but the config setting was previously missing.
This affects ld.so start up code.
GCC 11 supports -march=x86-64-v[234] to enable x86 micro-architecture ISA
levels:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250
and -mneeded to emit GNU_PROPERTY_X86_ISA_1_NEEDED property with
GNU_PROPERTY_X86_ISA_1_V[234] marker:
https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/13
Binutils support for GNU_PROPERTY_X86_ISA_1_V[234] marker were added by
commit b0ab06937385e0ae25cebf1991787d64f439bf12
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Oct 30 06:49:57 2020 -0700
x86: Support GNU_PROPERTY_X86_ISA_1_BASELINE marker
and
commit 32930e4edbc06bc6f10c435dbcc63131715df678
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Oct 9 05:05:57 2020 -0700
x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker
GNU_PROPERTY_X86_ISA_1_NEEDED property in x86 ELF binaries indicate the
micro-architecture ISA level required to execute the binary. The marker
must be added by programmers explicitly in one of 3 ways:
1. Pass -mneeded to GCC.
2. Add the marker in the linker inputs as this patch does.
3. Pass -z x86-64-v[234] to the linker.
Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker support to ld.so if binutils 2.32 or newer is used to build glibc:
1. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
markers to elf.h.
2. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker to abi-note.o based on the ISA level used to compile abi-note.o,
assuming that the same ISA level is used to compile the whole glibc.
3. Add isa_1 to cpu_features to record the supported x86 ISA level.
4. Rename _dl_process_cet_property_note to _dl_process_property_note and
add GNU_PROPERTY_X86_ISA_1_V[234] marker detection.
5. Update _rtld_main_check and _dl_open_check to check loaded objects
with the incompatible ISA level.
6. Add a testcase to verify that dlopen an x86-64-v4 shared object fails
on lesser platforms.
7. Use <get-isa-level.h> in dl-hwcaps-subdirs.c and tst-glibc-hwcaps.c.
Tested under i686, x32 and x86-64 modes on x86-64-v2, x86-64-v3 and
x86-64-v4 machines.
Marked elf/tst-isa-level-1 with x86-64-v4, ran it on x86-64-v3 machine
and got:
[hjl@gnu-cfl-2 build-x86_64-linux]$ ./elf/tst-isa-level-1
./elf/tst-isa-level-1: CPU ISA level is lower than required
[hjl@gnu-cfl-2 build-x86_64-linux]$
Remove the wordsize-64 implementations by merging them into the main dbl-64
directory. The second patch just moves all wordsize-64 files and removes a
few wordsize-64 uses in comments and Implies files.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the wordsize-64 implementations by merging them into the main dbl-64
directory. The first patch adds special cases needed for 32-bit targets
(FIX_INT_FP_CONVERT_ZERO and FIX_DBL_LONG_CONVERT_OVERFLOW) to the
wordsize-64 versions. This has no effect on 64-bit targets since they don't
define these macros.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It sync with gnulib version ae9fb3d66. The testcase for BZ#23741
(stdlib/test-bz22786.c) is adjusted to check also for ENOMEM.
The patch fixes multiple realpath issues:
- Portability fixes for errno clobbering on free (BZ#10635). The
function does not call free directly anymore, although it might be
done through scratch_buffer_free. The free errno clobbering is
being tracked by BZ#17924.
- Pointer arithmetic overflows in realpath (BZ#26592).
- Realpath cyclically call __alloca(path_max) to consume too much
stack space (BZ#26341).
- Realpath mishandles EOVERFLOW; stat not needed anyway (BZ#24970).
The check is done through faccessat now.
Checked on x86_64-linux-gnu and i686-linux-gnu.
This ia regression from 09153638cf, versioned_symbol acts as
weak_alias for !SHARED but it is undefined to avoid non versioned
alias from the generic implementation.
Checked with a build for alpha-linux-gnu.
Calling an IFUNC function defined in unrelocated executable also leads to
segfault. Issue a fatal error message when calling IFUNC function defined
in the unrelocated executable from a shared library.
In the !MAP_FIXED case, when a bogus address is given mmap should pick up a
valide address rather than returning EINVAL: Posix only talks about
EINVAL for the MAP_FIXED case.
This fixes long-running ghc processes.
When copying with "rep movsb", if the distance between source and
destination is N*4GB + [1..63] with N >= 0, performance may be very
slow. This patch updates memmove-vec-unaligned-erms.S for AVX and
AVX512 versions with the distance in RCX:
cmpl $63, %ecx
// Don't use "rep movsb" if ECX <= 63
jbe L(Don't use rep movsb")
Use "rep movsb"
Benchtests data with bench-memcpy, bench-memcpy-large, bench-memcpy-random
and bench-memcpy-walk on Skylake, Ice Lake and Tiger Lake show that its
performance impact is within noise range as "rep movsb" is only used for
data size >= 4KB.
After sp is updated, the CFA offset should be set before next instruction.
Tested in glibc-2.28:
Thread 2 "xxxxxxx" hit Breakpoint 1, _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149
149 stp x1, x2, [sp, #-32]!
Missing separate debuginfos, use: dnf debuginfo-install libgcc-7.3.0-20190804.h24.aarch64
(gdb) bt
#0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149
#1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
at /home/test/test_function.c:30
#2 0x0000000000400c08 in initaaa () at thread.c:58
#3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71
#4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486
#5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) ni
_dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150
150 stp x3, x4, [sp, #16]
(gdb) bt
#0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150
#1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
at /home/test/test_function.c:30
#2 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) ni
_dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157
157 mrs x4, tpidr_el0
(gdb) bt
#0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157
#1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
at /home/test/test_function.c:30
#2 0x0000000000400c08 in initaaa () at thread.c:58
#3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71
#4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486
#5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
Signed-off-by: liqingqing <liqingqing3@huawei.com>
Signed-off-by: Shuo Wang <wangshuo47@huawei.com>
Make the tests use TEST_COND_intel96 to decide on whether to build the
unnormal tests instead of the macro in nan-pseudo-number.h and then
drop the header inclusion. This unbreaks test runs on all
architectures that do not have ldbl-96.
Also drop the HANDLE_PSEUDO_NUMBERS macro since it is not used
anywhere.
I've updated copyright dates in glibc for 2021. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.
Please remember to include 2021 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
DELOUSE was added to asm code to make them compatible with non-LP64
ABIs, but it is an unfortunate name and the code was not compatible
with ABIs where pointer and size_t are different. Glibc currently
only supports the LP64 ABI so these macros are not really needed or
tested, but for now the name is changed to be more meaningful instead
of removing them completely.
Some DELOUSE macros were dropped: clone, strlen and strnlen used it
unnecessarily.
The out of tree ILP32 patches are currently not maintained and will
likely need a rework to rebase them on top of the time64 changes.
clone already uses r31 to temporarily save input arguments before doing the
syscall, so we use a different register to read from the TCB. We can also avoid
allocating another stack frame, which is not needed since we can simply extend
the usage of the red zone.
Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later. The new codepath provides better
performance (see below) if compared to using sc. For the
foreseeable future, both sc and scv mechanisms will co-exist, so this
patch enables glibc to do a runtime check and use scv when it is
available.
Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel. If not,
we fallback to sc and keep the old behavior.
The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.
For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone). To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from
hwcap2 is cached on a non-volatile register.
This is not needed when using INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.
The dynamic loader may issue syscalls before the TCB has been setup
so it always uses sc with no extra checks. For the static case, there
is no compile-time way to determine if we are inside startup code,
so we also check the value of the thread pointer before effectively
accessing the TCB. For such situations in which the availability of
scv cannot be determined, sc is always used.
Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later. For now simply use sc as before.
Average performance over 1M calls for each syscall "type":
- stat: C wrapper calling INTERNAL_SYSCALL
- getpid: templated ASM syscall
- syscall: call to gettid using syscall function
Standard:
stat : 1.573445 us / ~3619 cycles
getpid : 0.164986 us / ~379 cycles
syscall : 0.162743 us / ~374 cycles
With scv:
stat : 1.537049 us / ~3535 cycles <~ -84 cycles / -2.32%
getpid : 0.109923 us / ~253 cycles <~ -126 cycles / -33.25%
syscall : 0.116410 us / ~268 cycles <~ -106 cycles / -28.34%
Tested on powerpc, powerpc64, powerpc64le (with and without scv)
Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Add support to treat pseudo-numbers specially and implement x86
version to consider all of them as signaling.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
With xmknod wrapper functions removed (589260cef8), the mknod functions
are now properly exported, and version is done using symbols versioning
instead of the extra _MKNOD_* argument.
It also allows us to consolidate Linux and Hurd mknod implementation.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
With xstat wrapper functions removed (8ed005daf0), the stat functions
are now properly exported, and version is done using symbols versioning
instead of the extra _STAT_* argument.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
The new __proc_waitid RPC now expects WEXITED to be passed, allowing to
properly implement waitid, and thus define the missing W* macros
(according to FreeBSD values).
Instead of having the arch-specific trampoline setup code detect whether
preemption happened or not, we'd rather pass it the sigaction. In the
future, this may also allow to change sa_flags from post_signal().
Do not attempt to fix the significand top bit in long double input
received in printf. The code should never reach here because isnan
should now detect unnormals as NaN. This is already a NOP for glibc
since it uses the gcc __builtin_isnan, which detects unnormals as NaN.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This syncs up isnanl behaviour with gcc. Also move the isnanl
implementation to sysdeps/x86 and remove the sysdeps/x86_64 version.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Also move sysdeps/i386/fpu/s_fpclassifyl.c to
sysdeps/x86/fpu/s_fpclassifyl.c and remove
sysdeps/x86_64/fpu/s_fpclassifyl.c
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add Intel Linear Address Masking (LAM) support to <sys/platform/x86.h>.
HAS_CPU_FEATURE (LAM) can be used to detect if LAM is enabled in CPU.
LAM modifies the checking that is applied to 64-bit linear addresses,
allowing software to use of the untranslated address bits for metadata.
Add various defines and stubs for enabling MTE on AArch64 sysv-like
systems such as Linux. The HWCAP feature bit is copied over in the
same way as other feature bits. Similarly we add a new wrapper header
for mman.h to define the PROT_MTE flag that can be used with mmap and
related functions.
We add a new field to struct cpu_features that can be used, for
example, to check whether or not certain ifunc'd routines should be
bound to MTE-safe versions.
Finally, if we detect that MTE should be enabled (ie via the glibc
tunable); we enable MTE during startup as required.
Support in the Linux kernel was added in version 5.10.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Older versions of the Linux kernel headers obviously lack support for
memory tagging, but we still want to be able to build in support when
using those (obviously it can't be enabled on such systems).
The linux kernel extensions are made to the platform-independent
header (linux/prctl.h), so this patch takes a similar approach.
This patch adds the basic support for memory tagging.
Various flavours are supported, particularly being able to turn on
tagged memory at run-time: this allows the same code to be used on
systems where memory tagging support is not present without neededing
a separate build of glibc. Also, depending on whether the kernel
supports it, the code will use mmap for the default arena if morecore
does not, or cannot support tagged memory (on AArch64 it is not
available).
All the hooks use function pointers to allow this to work without
needing ifuncs.
Reviewed-by: DJ Delorie <dj@redhat.com>
This is clever, but it confuses downstream detection in at least zstd
and GNOME's glib. zstd has preprocessor tests for the 'st_mtime' macro,
which is not provided by the path using the anonymous union; glib checks
for the presence of 'st_mtimensec' in struct stat but then tries to
access that field in struct statx (which might be a bug on its own).
Checked with a build for alpha-linux-gnu.
setjmp() uses C code to store current registers into jmp_buf
environment. -fstack-protector-all places canary into setjmp()
prologue and clobbers 'a5' before it gets saved.
The change inhibits stack canary injection to avoid clobber.
When SA_SIGINFO is available, sysdeps/posix/s?profil.c use it, so we have to
fix the __profil_counter function accordingly, using sigcontextinfo.h's
sigcontext_get_pc.
SA_SIGINFO is actually just another way of expressing what we were
already passing over with struct sigcontext. This just introduces the
SIGINFO interface and fixes the posix values when that interface is
requested by the application.
asin and acos have slow paths for rounding the last bit that cause some
calls to be 500-1500x slower than average calls.
These slow paths are rare, a test of a trillion (1.000.000.000.000)
random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.
The slow paths claim correct rounding and use __sin32() and __cos32()
(which compare two result candidates and return the closest one) as the
final step, with the second result candidate (res1) having a small offset
applied from res. This suggests that res and res1 are intended to be 1
ULP apart (which makes sense for rounding), barring bugs, allowing us to
pick either one and still remain within 1 ULP of the exact result.
Remove the slow paths as the accuracy is better than 1 ULP even without
them, which is enough for glibc.
Also remove code comments claiming correctly rounded results.
After slow path removal, checking the accuracy of 14.400.000.000 random
asin() and acos() inputs showed only three incorrectly rounded
(error > 0.5 ULP) results:
- asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
- asin(-0x1.f692ba202abcp-4) (0.500003 ULP, same as before)
- asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
The first two had the same error even before this commit, and they did
not use the slow path at all.
Checking 4934 known randomly found previously-slow-path asin inputs
shows 25 calls with incorrectly rounded results, with a maximum error of
0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
rounded all these inputs correctly (error < 0.5 ULP).
The observed average speed increase was 130x.
Checking 36240 known randomly found previously-slow-path acos inputs
shows 42 calls with incorrectly rounded results, with a maximum error of
0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
slow-path code showed 34 calls with incorrectly rounded results, with the
same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
The observed average speed increase was 130x.
The functions could likely be trimmed more while keeping acceptable
accuracy, but this at least gets rid of the egregiously slow cases.
Tested on x86_64.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.10. (There are no new MAP_* constants covered by this test in
5.10 that need any other header changes.)
Tested with build-many-glibcs.py.
GCC 6.5 fails to correctly build ldconfig with recent ld.so.cache
commits, e.g.:
785969a047
elf: Implement a string table for ldconfig, with tail merging
If glibc is build with gcc 6.5.0:
__builtin_add_overflow is used in
<glibc>/elf/stringtable.c:stringtable_finalize()
which leads to ldconfig failing with "String table is too large".
This is also recognizable in following tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable
See gcc "Bug 98269 - gcc 6.5.0 __builtin_add_overflow() with small
uint32_t values incorrectly detects overflow"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269)
Change sbrk to fail for !__libc_initial (in the generic
implementation). As a result, sbrk is (relatively) safe to use
for the __libc_initial case (from the main libc). It is therefore
no longer necessary to avoid using it in that case (or updating the
brk cache), and the __libc_initial flag does not need to be updated
as part of dlmopen or static dlopen.
As before, direct brk system calls on Linux may lead to memory
corruption.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 5.10 has one new syscall, process_madvise. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
This symbol is not in the implementation reserved namespace for static
linking and it was never used: it seems it was mistakenly added in the
orignal strlen_asimd commit 436e4d5b96
Since we can't tell if the tunable value is set by user or not:
https://sourceware.org/bugzilla/show_bug.cgi?id=27069
remove the default REP MOVSB threshold tunable value so that the correct
default value will be set correctly by init_cacheinfo ().
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Re-mmap executable segments if possible instead of using mprotect
to add PROT_BTI. This allows using BTI protection with security
policies that prevent mprotect with PROT_EXEC.
If the fd of the ELF module is not available because it was kernel
mapped then mprotect is used and failures are ignored. To protect
the main executable even when mprotect is filtered the linux kernel
will have to be changed to add PROT_BTI to it.
The delayed failure reporting is mainly needed because currently
_dl_process_gnu_properties does not propagate failures such that
the required cleanups happen. Using the link_map_machine struct for
error propagation is not ideal, but this seemed to be the least
intrusive solution.
Fixes bug 26831.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.
When the fd is not available (kernel loaded modules), then -1
is passed.
The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Handle unaligned executable load segments (the bfd linker is not
expected to produce such binaries, but other linkers may).
Computing the mapping bounds follows _dl_map_object_from_fd more
closely now.
Fixes bug 26988.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The _dl_open_check and _rtld_main_check hooks are not called on the
dependencies of a loaded module, so BTI protection was missed on
every module other than the main executable and directly dlopened
libraries.
The fix just iterates over dependencies to enable BTI.
Fixes bug 26926.
It removes all the arch-specific assembly implementation. The
outliers are alpha, where its kernel ABI explict return -ENOMEM
in case of failure; and i686, where it can't use
"call *%gs:SYSINFO_OFFSET" during statup in static PIE.
Also some ABIs exports an additional ___brk_addr symbol and to
handle it an internal HAVE_INTERNAL_BRK_ADDR_SYMBOL is added.
Checked on x86_64-linux-gnu, i686-linux-gnu, adn with builsd for
the affected ABIs.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Subdirectories z13, z14, z15 can be selected, mostly based on the
level of support for vector instructions.
Co-Authored-By: Stefan Liebler <stli@linux.ibm.com>
float_t supposedly represents the type that is used to evaluate float
expressions internally. While the isa supports single-precision float
operations, the port of glibc to s390 incorrectly deferred to the
generic definitions which, back then, tied float_t to double. gcc by
default evaluates float in single precision, so that scenario violates
the C standard (sections 5.2.4.2.2 and 7.12 in C11/C17). With
-fexcess-precision=standard, gcc evaluates float in double precision,
which aligns with the standard yet at the cost of added conversion
instructions.
With this patch, we drop the s390-specific definition of float_t and
defer to the default behavior, which aligns float_t with the
compiler-defined FLT_EVAL_METHOD in a standard-compliant way.
Checked on s390x-linux-gnu with 31-bit and 64-bit builds.
The functions strtoimax, strtoumax, wcstoimax, wcstoumax currently
have three implementations each (wordsize-32, wordsize-64 and dummy
implementation in stdlib/ using #error), defining the functions as
thin wrappers round corresponding *_internal functions. Simplify the
code by changing them into aliases of functions such as strtol and
wcstoull. This is more consistent with how e.g. imaxdiv is handled.
Tested for x86_64 and x86.
This code manages the mappings of the available databases in NSS
(i.e. passwd, hosts, netgroup, etc) with the actions that should
be taken to do a query on those databases.
This is the main API between query functions scattered throughout
glibc and the underlying code (actions, modules, etc).
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>