Commit Graph

13648 Commits

Author SHA1 Message Date
H.J. Lu
3ec5d83d2a x86-64: Avoid rep movsb with short distance [BZ #27130]
When copying with "rep movsb", if the distance between source and
destination is N*4GB + [1..63] with N >= 0, performance may be very
slow.  This patch updates memmove-vec-unaligned-erms.S for AVX and
AVX512 versions with the distance in RCX:

	cmpl	$63, %ecx
	// Don't use "rep movsb" if ECX <= 63
	jbe	L(Don't use rep movsb")
	Use "rep movsb"

Benchtests data with bench-memcpy, bench-memcpy-large, bench-memcpy-random
and bench-memcpy-walk on Skylake, Ice Lake and Tiger Lake show that its
performance impact is within noise range as "rep movsb" is only used for
data size >= 4KB.
2021-01-04 07:58:57 -08:00
Shuo Wang
cd6274089f aarch64: fix stack missing after sp is updated
After sp is updated, the CFA offset should be set before next instruction.
Tested in glibc-2.28:
Thread 2 "xxxxxxx" hit Breakpoint 1, _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149
149		stp	x1,  x2, [sp, #-32]!
Missing separate debuginfos, use: dnf debuginfo-install libgcc-7.3.0-20190804.h24.aarch64
(gdb) bt
#0  _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149
#1  0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
    at /home/test/test_function.c:30
#2  0x0000000000400c08 in initaaa () at thread.c:58
#3  0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71
#4  0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486
#5  0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) ni
_dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150
150		stp	x3,  x4, [sp, #16]
(gdb) bt
#0  _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150
#1  0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
    at /home/test/test_function.c:30
#2  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) ni
_dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157
157		mrs	x4, tpidr_el0
(gdb) bt
#0  _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157
#1  0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184)
    at /home/test/test_function.c:30
#2  0x0000000000400c08 in initaaa () at thread.c:58
#3  0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71
#4  0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486
#5  0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Signed-off-by: liqingqing <liqingqing3@huawei.com>
Signed-off-by: Shuo Wang <wangshuo47@huawei.com>
2021-01-04 15:37:06 +00:00
Siddhesh Poyarekar
8cc1e39a36 Drop nan-pseudo-number.h usage from tests
Make the tests use TEST_COND_intel96 to decide on whether to build the
unnormal tests instead of the macro in nan-pseudo-number.h and then
drop the header inclusion.  This unbreaks test runs on all
architectures that do not have ldbl-96.

Also drop the HANDLE_PSEUDO_NUMBERS macro since it is not used
anywhere.
2021-01-04 20:49:56 +05:30
Siddhesh Poyarekar
fee3b889d8 Move generic nan-pseudo-number.h to ldbl-96
The concept of pseudo number formats only exists in the realm of the
96 bit long double format.
2021-01-04 14:51:52 +05:30
Paul Eggert
9fcdec7386 Update copyright dates not handled by scripts/update-copyrights.
I've updated copyright dates in glibc for 2021.  This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.  As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.

Please remember to include 2021 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).
2021-01-02 12:17:34 -08:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Szabolcs Nagy
45b1e17e91 aarch64: use PTR_ARG and SIZE_ARG instead of DELOUSE
DELOUSE was added to asm code to make them compatible with non-LP64
ABIs, but it is an unfortunate name and the code was not compatible
with ABIs where pointer and size_t are different. Glibc currently
only supports the LP64 ABI so these macros are not really needed or
tested, but for now the name is changed to be more meaningful instead
of removing them completely.

Some DELOUSE macros were dropped: clone, strlen and strnlen used it
unnecessarily.

The out of tree ILP32 patches are currently not maintained and will
likely need a rework to rebase them on top of the time64 changes.
2020-12-31 16:50:58 +00:00
Matheus Castanho
41f013cef2 powerpc: Use scv instruction on clone when available
clone already uses r31 to temporarily save input arguments before doing the
syscall, so we use a different register to read from the TCB. We can also avoid
allocating another stack frame, which is not needed since we can simply extend
the usage of the red zone.

Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-12-30 18:26:33 -03:00
Matheus Castanho
68ab82f566 powerpc: Runtime selection between sc and scv for syscalls
Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later.  The new codepath provides better
performance (see below) if compared to using sc.  For the
foreseeable future, both sc and scv mechanisms will co-exist, so this
patch enables glibc to do a runtime check and use scv when it is
available.

Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel.  If not,
we fallback to sc and keep the old behavior.

The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.

For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone).  To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from
hwcap2 is cached on a non-volatile register.

This is not needed when using INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.

The dynamic loader may issue syscalls before the TCB has been setup
so it always uses sc with no extra checks.  For the static case, there
is no compile-time way to determine if we are inside startup code,
so we also check the value of the thread pointer before effectively
accessing the TCB.  For such situations in which the availability of
scv cannot be determined, sc is always used.

Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later. For now simply use sc as before.

Average performance over 1M calls for each syscall "type":
  - stat: C wrapper calling INTERNAL_SYSCALL
  - getpid: templated ASM syscall
  - syscall: call to gettid using syscall function

  Standard:
     stat : 1.573445 us / ~3619 cycles
   getpid : 0.164986 us / ~379 cycles
  syscall : 0.162743 us / ~374 cycles

  With scv:
     stat : 1.537049 us / ~3535 cycles <~ -84 cycles  / -2.32%
   getpid : 0.109923 us / ~253 cycles  <~ -126 cycles / -33.25%
  syscall : 0.116410 us / ~268 cycles  <~ -106 cycles / -28.34%

Tested on powerpc, powerpc64, powerpc64le (with and without scv)

Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-12-30 18:26:25 -03:00
Siddhesh Poyarekar
7525c1c71d x86 long double: Consider pseudo numbers as signaling
Add support to treat pseudo-numbers specially and implement x86
version to consider all of them as signaling.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-12-30 10:52:45 +05:30
Adhemerval Zanella
99468ed45f io: Remove xmknod{at} implementations
With xmknod wrapper functions removed (589260cef8), the mknod functions
are now properly exported, and version is done using symbols versioning
instead of the extra _MKNOD_* argument.

It also allows us to consolidate Linux and Hurd mknod implementation.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2020-12-29 16:44:16 -03:00
Adhemerval Zanella
4d97cc8cf3 io: Remove xstat implementations
With xstat wrapper functions removed (8ed005daf0), the stat functions
are now properly exported, and version is done using symbols versioning
instead of the extra _STAT_* argument.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2020-12-29 16:44:05 -03:00
Samuel Thibault
f6abd97028 hurd: Add WSTOPPED/WCONTINUED/WEXITED/WNOWAIT support [BZ #23091]
The new __proc_waitid RPC now expects WEXITED to be passed, allowing to
properly implement waitid, and thus define the missing W* macros
(according to FreeBSD values).
2020-12-28 23:37:04 +01:00
Samuel Thibault
e42efa01c9 hurd: set sigaction for signal preemptors in arch-independent file
Instead of having the arch-specific trampoline setup code detect whether
preemption happened or not, we'd rather pass it the sigaction. In the
future, this may also allow to change sa_flags from post_signal().
2020-12-26 18:03:31 +01:00
Samuel Thibault
a39b95b975 hurd: Fix spawni SPAWN_XFLAGS_TRY_SHELL with empty argv
When argv is empty, we need to add the original script to be run on the
shell command line.
2020-12-26 16:39:40 +01:00
Samuel Thibault
13adfa34af hurd: Try shell in posix_spawn* only in compat mode
Reported by Bruno Haible <bruno@clisp.org>
2020-12-26 15:12:04 +01:00
H.J. Lu
f380868f6d Remove _ISOMAC check from <cpu-features.h>
Remove _ISOMAC check from <cpu-features.h> since it isn't an installer
header file.
2020-12-24 15:43:34 -08:00
H.J. Lu
45dcd1af09 x86: Remove the duplicated CPU_FEATURE_CPU_P
CPU_FEATURE_CPU_P is defined in sysdeps/x86/sys/platform/x86.h.  Remove
the duplicated CPU_FEATURE_CPU_P in sysdeps/x86/include/cpu-features.h.
2020-12-24 04:39:08 -08:00
Siddhesh Poyarekar
41290b6e84 Partially revert 681900d296
Do not attempt to fix the significand top bit in long double input
received in printf.  The code should never reach here because isnan
should now detect unnormals as NaN.  This is already a NOP for glibc
since it uses the gcc __builtin_isnan, which detects unnormals as NaN.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2020-12-24 06:05:46 +05:30
Siddhesh Poyarekar
94547d9209 x86 long double: Support pseudo numbers in isnanl
This syncs up isnanl behaviour with gcc.  Also move the isnanl
implementation to sysdeps/x86 and remove the sysdeps/x86_64 version.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-24 06:05:40 +05:30
Siddhesh Poyarekar
b7f8815617 x86 long double: Support pseudo numbers in fpclassifyl
Also move sysdeps/i386/fpu/s_fpclassifyl.c to
sysdeps/x86/fpu/s_fpclassifyl.c and remove
sysdeps/x86_64/fpu/s_fpclassifyl.c

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-24 06:05:26 +05:30
Florian Weimer
0e981d3524 s390x: Regenerate ulps
For new inputs added in commit cad5ad81d2,
as seen on a z13 system.
2020-12-22 19:27:38 +01:00
Florian Weimer
2aa8ec7dd7 powerpc: Regenerate ulps
For new inputs added in commit cad5ad81d2,
as seen on a POWER8 system.
2020-12-22 19:22:44 +01:00
H.J. Lu
a2e5da2cf4 <sys/platform/x86.h>: Add Intel LAM support
Add Intel Linear Address Masking (LAM) support to <sys/platform/x86.h>.
HAS_CPU_FEATURE (LAM) can be used to detect if LAM is enabled in CPU.

LAM modifies the checking that is applied to 64-bit linear addresses,
allowing software to use of the untranslated address bits for metadata.
2020-12-22 03:45:47 -08:00
Florian Weimer
bca0283815 i386: Regenerate ulps
For new inputs added in commit cad5ad81d2.
2020-12-21 18:19:03 +01:00
Szabolcs Nagy
682cdd6e1a aarch64: update ulps.
For new test cases in
commit cad5ad81d2
2020-12-21 16:40:34 +00:00
Richard Earnshaw
d27f0e5d88 aarch64: Add aarch64-specific files for memory tagging support
This final patch provides the architecture-specific implementation of
the memory-tagging support hooks for aarch64.
2020-12-21 15:25:25 +00:00
Richard Earnshaw
bde4949b6b aarch64: Add sysv specific enabling code for memory tagging
Add various defines and stubs for enabling MTE on AArch64 sysv-like
systems such as Linux.  The HWCAP feature bit is copied over in the
same way as other feature bits.  Similarly we add a new wrapper header
for mman.h to define the PROT_MTE flag that can be used with mmap and
related functions.

We add a new field to struct cpu_features that can be used, for
example, to check whether or not certain ifunc'd routines should be
bound to MTE-safe versions.

Finally, if we detect that MTE should be enabled (ie via the glibc
tunable); we enable MTE during startup as required.

Support in the Linux kernel was added in version 5.10.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2020-12-21 15:25:25 +00:00
Richard Earnshaw
0d1bafdcb6 linux: Add compatibility definitions to sys/prctl.h for MTE
Older versions of the Linux kernel headers obviously lack support for
memory tagging, but we still want to be able to build in support when
using those (obviously it can't be enabled on such systems).

The linux kernel extensions are made to the platform-independent
header (linux/prctl.h), so this patch takes a similar approach.
2020-12-21 15:25:25 +00:00
Richard Earnshaw
3784dfc098 malloc: Basic support for memory tagging in the malloc() family
This patch adds the basic support for memory tagging.

Various flavours are supported, particularly being able to turn on
tagged memory at run-time: this allows the same code to be used on
systems where memory tagging support is not present without neededing
a separate build of glibc.  Also, depending on whether the kernel
supports it, the code will use mmap for the default arena if morecore
does not, or cannot support tagged memory (on AArch64 it is not
available).

All the hooks use function pointers to allow this to work without
needing ifuncs.

Reviewed-by: DJ Delorie <dj@redhat.com>
2020-12-21 15:25:25 +00:00
Matt Turner
d552058570 alpha: Remove anonymous union in struct stat [BZ #27042]
This is clever, but it confuses downstream detection in at least zstd
and GNOME's glib. zstd has preprocessor tests for the 'st_mtime' macro,
which is not provided by the path using the anonymous union; glib checks
for the presence of 'st_mtimensec' in struct stat but then tries to
access that field in struct statx (which might be a bug on its own).

Checked with a build for alpha-linux-gnu.
2020-12-21 09:09:43 -03:00
Paul Zimmermann
cad5ad81d2 add inputs to auto-libm-test-in yielding larger errors (binary64, x86_64) 2020-12-21 10:35:20 +05:30
Sergei Trofimovich
6eb7e1da0e m68k: fix clobbering a5 in setjmp() [BZ #24202]
setjmp() uses C code to store current registers into jmp_buf
environment. -fstack-protector-all places canary into setjmp()
prologue and clobbers 'a5' before it gets saved.

The change inhibits stack canary injection to avoid clobber.
2020-12-21 10:24:34 +05:30
Samuel Thibault
e0aec6c833 hurd: Make trampoline fill siginfo ss_sp from sc_uesp
Mach actually rather fills the uesp field, not esp.
2020-12-21 03:17:00 +01:00
Samuel Thibault
53432762ac profil-counter: Add missing SIGINFO case
When SA_SIGINFO is available, sysdeps/posix/s?profil.c use it, so we have to
fix the __profil_counter function accordingly, using sigcontextinfo.h's
sigcontext_get_pc.
2020-12-21 02:08:33 +01:00
Jeremie Koenig
d865ff74ba hurd: implement SA_SIGINFO signal handlers.
SA_SIGINFO is actually just another way of expressing what we were
already passing over with struct sigcontext. This just introduces the
SIGINFO interface and fixes the posix values when that interface is
requested by the application.
2020-12-21 01:44:20 +01:00
Samuel Thibault
407765e9f2 hurd: Fix ELF_MACHINE_USER_ADDRESS_MASK value
x86 binaries are linked at 0x08000000, so we need to let them get mapped
there.
2020-12-20 01:47:47 +01:00
Samuel Thibault
e94b01393e hurd: Note when the vm_map kernel bug was fixed
dl-sysdep has been wanting to use high bits in the vm_map mask for decades,
but that was only implemented lately.
2020-12-20 01:46:11 +01:00
Anssi Hannula
69a7ca7705 ieee754: Remove unused __sin32 and __cos32
The __sin32 and __cos32 functions were only used in the now removed slow
path of asin and acos.
2020-12-18 12:10:31 +05:30
Anssi Hannula
f67f9c9af2 ieee754: Remove slow paths from asin and acos
asin and acos have slow paths for rounding the last bit that cause some
calls to be 500-1500x slower than average calls.

These slow paths are rare, a test of a trillion (1.000.000.000.000)
random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.

The slow paths claim correct rounding and use __sin32() and __cos32()
(which compare two result candidates and return the closest one) as the
final step, with the second result candidate (res1) having a small offset
applied from res. This suggests that res and res1 are intended to be 1
ULP apart (which makes sense for rounding), barring bugs, allowing us to
pick either one and still remain within 1 ULP of the exact result.

Remove the slow paths as the accuracy is better than 1 ULP even without
them, which is enough for glibc.

Also remove code comments claiming correctly rounded results.

After slow path removal, checking the accuracy of 14.400.000.000 random
asin() and acos() inputs showed only three incorrectly rounded
(error > 0.5 ULP) results:
- asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
- asin(-0x1.f692ba202abcp-4)  (0.500003 ULP, same as before)
- asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
The first two had the same error even before this commit, and they did
not use the slow path at all.

Checking 4934 known randomly found previously-slow-path asin inputs
shows 25 calls with incorrectly rounded results, with a maximum error of
0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
rounded all these inputs correctly (error < 0.5 ULP).
The observed average speed increase was 130x.

Checking 36240 known randomly found previously-slow-path acos inputs
shows 42 calls with incorrectly rounded results, with a maximum error of
0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
slow-path code showed 34 calls with incorrectly rounded results, with the
same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
The observed average speed increase was 130x.

The functions could likely be trimmed more while keeping acceptable
accuracy, but this at least gets rid of the egregiously slow cases.

Tested on x86_64.
2020-12-18 12:09:23 +05:30
Joseph Myers
2ec40e66ad Update kernel version to 5.10 in tst-mman-consts.py.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.10.  (There are no new MAP_* constants covered by this test in
5.10 that need any other header changes.)

Tested with build-many-glibcs.py.
2020-12-17 16:17:59 +00:00
Stefan Liebler
844b4d8b4b s390x: Require GCC 7.1 or later to build glibc.
GCC 6.5 fails to correctly build ldconfig with recent ld.so.cache
commits, e.g.:
785969a047
elf: Implement a string table for ldconfig, with tail merging

If glibc is build with gcc 6.5.0:
__builtin_add_overflow is used in
<glibc>/elf/stringtable.c:stringtable_finalize()
which leads to ldconfig failing with "String table is too large".
This is also recognizable in following tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable

See gcc "Bug 98269 - gcc 6.5.0 __builtin_add_overflow() with small
uint32_t values incorrectly detects overflow"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269)
2020-12-17 16:18:04 +01:00
Florian Weimer
e7570f4131 Replace __libc_multiple_libcs with __libc_initial flag
Change sbrk to fail for !__libc_initial (in the generic
implementation).  As a result, sbrk is (relatively) safe to use
for the __libc_initial case (from the main libc).  It is therefore
no longer necessary to avoid using it in that case (or updating the
brk cache), and the __libc_initial flag does not need to be updated
as part of dlmopen or static dlopen.

As before, direct brk system calls on Linux may lead to memory
corruption.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-16 15:13:40 +01:00
Samuel Thibault
749cd2ca78 htl: Get sem_open/sem_close/sem_unlink support [BZ #25524]
This just moves the existing nptl implementation to reuse as it is in
htl.
2020-12-16 14:27:25 +01:00
Joseph Myers
bcf47eb0fb Update syscall lists for Linux 5.10.
Linux 5.10 has one new syscall, process_madvise.  Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.

Tested with build-many-glibcs.py.
2020-12-16 02:08:52 +00:00
Samuel Thibault
c8f9421298 htl: Add pshared semaphore support
The implementation is extremely similar to the nptl implementation, but
with slight differences in the futex interface. This fixes some of BZ
25521.
2020-12-16 01:58:33 +01:00
Samuel Thibault
f26f0d766b hurd: Add __libc_open and __libc_close
Needed by libpthread for sem_open and sem_close
2020-12-16 01:58:33 +01:00
Samuel Thibault
6e411b42f8 htl: Add futex-internal.h
That provides futex_supports_pshared
2020-12-16 01:58:33 +01:00
Samuel Thibault
bec412424e hurd: make lll_* take a variable instead of a ptr
To be coherent with other ports, let's make lll_* take a variable, and
rename those that keep taking a ptr into __lll_*.
2020-12-16 01:58:33 +01:00
Samuel Thibault
18c2ab9a09 hurd: Rename LLL_INITIALIZER to LLL_LOCK_INITIALIZER
To get coherent with other ports.
2020-12-16 01:58:33 +01:00