Commit Graph

36005 Commits

Author SHA1 Message Date
H.J. Lu
f8b4630ef6 x86: Correct bit_cpu_CLFSH [BZ #26208]
bit_cpu_CLFSH should be (1u << 19), not (1u << 20).
2020-07-06 06:38:05 -07:00
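
Illustration for commit f8b4630ef6 (not part of the commit): a minimal sketch of the corrected mask, assuming the usual CPUID layout in which CLFSH (the CLFLUSH instruction) is reported in bit 19 of EDX for leaf 1. Only `bit_cpu_CLFSH` comes from the commit; the helper `has_clfsh` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Corrected mask from the commit: CLFSH is bit 19 of CPUID.01H:EDX.  */
#define bit_cpu_CLFSH (1u << 19)

/* Hypothetical helper: test the CLFSH bit in a leaf-1 EDX value that
   the caller has already obtained from CPUID.  */
static bool
has_clfsh (uint32_t leaf1_edx)
{
  return (leaf1_edx & bit_cpu_CLFSH) != 0;
}
```

With the old mask `(1u << 20)`, a CPU that reports CLFSH would have been checked against the wrong bit.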
Florian Weimer
01ffa6002e manual: Document __libc_single_threaded
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
2020-07-06 11:17:53 +02:00
Florian Weimer
706ad1e7af Add the __libc_single_threaded variable
The variable is placed in libc.so, and it can be true only in
an outer libc, not libcs loaded via dlmopen or static dlopen.
Since thread creation from inner namespaces does not work,
pthread_create can update __libc_single_threaded directly.

Using __libc_early_init and its initial flag, implementation of this
variable is very straightforward.  A future version may reset the flag
during fork (but not in an inner namespace), or after joining all
threads except one.

Reviewed-by: DJ Delorie <dj@redhat.com>
2020-07-06 11:15:58 +02:00
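
Illustration for commit 706ad1e7af (not part of the commit): a sketch of how a program might consult `__libc_single_threaded` to pick a cheaper atomic operation. It assumes the variable is declared in `<sys/single_threaded.h>`; when that header is missing the sketch conservatively behaves as if threads exist. The counter `reference_count`, the macro `PROCESS_IS_SINGLE_THREADED`, and the function `acquire_reference` are hypothetical names.

```c
#include <stdatomic.h>

/* Use the real variable when the C library provides the header;
   otherwise assume that other threads may exist.  */
#ifdef __has_include
# if __has_include (<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define PROCESS_IS_SINGLE_THREADED (__libc_single_threaded)
# endif
#endif
#ifndef PROCESS_IS_SINGLE_THREADED
# define PROCESS_IS_SINGLE_THREADED 0
#endif

/* Hypothetical shared reference counter.  */
static atomic_uint reference_count;

/* Increment the counter and return its new value.  */
static unsigned int
acquire_reference (void)
{
  if (PROCESS_IS_SINGLE_THREADED)
    /* No other thread can observe the counter, so relaxed ordering
       is sufficient.  */
    return atomic_fetch_add_explicit (&reference_count, 1,
                                      memory_order_relaxed) + 1;
  /* Other threads may exist: use the sequentially consistent form.  */
  return atomic_fetch_add (&reference_count, 1) + 1;
}
```

The flag has to be re-read on every call, because it can change from true to false once a thread has been created.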
Mathieu Desnoyers
8f4632deb3 Linux: rseq registration tests
These tests validate that rseq is registered from various execution
contexts (main thread, destructor, other threads, other threads created
from destructor, forked process (without exec), pthread_atfork handlers,
pthread setspecific destructors, signal handlers, atexit handlers).

tst-rseq.c only links against libc.so, testing registration of rseq in
a non-multithreaded environment.

tst-rseq-nptl.c also links against libpthread.so, testing registration
of rseq in a multithreaded environment.

See the Linux kernel selftests for extensive rseq stress-tests.
2020-07-06 10:21:35 +02:00
Mathieu Desnoyers
6e29cb3f61 Linux: Use rseq in sched_getcpu if available
When available, use the cpu_id field from __rseq_abi on Linux to
implement sched_getcpu().  Fall back on the vgetcpu vDSO if unavailable.

Benchmarks:

x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading

glibc sched_getcpu():                     13.7 ns (baseline)
glibc sched_getcpu() using rseq:           2.5 ns (speedup:  5.5x)
inline load cpuid from __rseq_abi TLS:     0.8 ns (speedup: 17.1x)
2020-07-06 10:21:32 +02:00
Mathieu Desnoyers
0c76fc3c2b Linux: Perform rseq registration at C startup and thread creation
Register rseq TLS for each thread (including main), and unregister for
each thread (excluding main).  "rseq" stands for Restartable Sequences.

See the rseq(2) man page proposed here:
  https://lkml.org/lkml/2018/9/19/647

Those are based on glibc master branch commit 3ee1e0ec5c.
The rseq system call was merged into Linux 4.18.

The TLS_STATIC_SURPLUS define is increased to leave additional room for
dlopen'd initial-exec TLS, which keeps elf/tst-auditmany working.

The increase (76 bytes) is larger than 32 bytes because it has not been
increased in quite a while.  The cost in terms of additional TLS storage
is quite significant, but it will also obscure some initial-exec-related
dlopen failures.
2020-07-06 10:21:16 +02:00
Samuel Thibault
f9cf873537 tst-cancel4: deal with ENOSYS errors
The Hurd port doesn't have support for sigwaitinfo, sigtimedwait, and msgget
yet, so let us ignore the test for these when they return ENOSYS.

* nptl/tst-cancel4.c (tf_sigwaitinfo): Fall back on sigwait when
sigwaitinfo returns ENOSYS.
(tf_sigtimedwait): Likewise with sigtimedwait.
(tf_msgrcv, tf_msgsnd): Fall back on tf_usleep when msgget returns ENOSYS.
2020-07-05 19:21:45 +02:00
Florian Weimer
a3f747a912 manual: Show copyright information not just in the printed manual
@insertcopying was not used at all in the Info and HTML versions.
As a result, the notices that need to be present according to the
GNU Free Documentation License were missing.

This commit shows these notices above the table of contents in the
HTML version, and as part of the Main Menu node in the Info version.

Remove the "This file documents" line because it is redundant with the
following line.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-03 10:06:24 +02:00
Joseph Myers
c6aac3bf36 Fix typo in comment in bug 26137 fix.
2020-07-01 14:53:30 +00:00
Joseph Myers
09555b9721 Fix strtod multiple-precision division bug (bug 26137).
Bug 26137 reports spurious "inexact" exceptions from strtod, on 32-bit
systems only, for a decimal argument that is exactly 1 + 2^-32.  In
fact the same issue also appears for 1 + 2^-64 and 1 + 2^-96 as
arguments to strtof128 on 32-bit systems, and 1 + 2^-64 as an argument
to strtof128 on 64-bit systems.  In FE_DOWNWARD or FE_TOWARDZERO mode,
the return value is also incorrect.

The problem is in the multiple-precision division logic used in the
case of dividing by a denominator that occupies at least three GMP
limbs.  There was a comment "The division does not work if the upper
limb of the two-limb mumerator is greater than the denominator.", but
in fact there were problems for the case of equality (that is, where
the high limbs are equal, offset by some multiple of the GMP limb
size) as well.  In such cases, the code used "quot = ~(mp_limb_t) 0;"
(with subsequent correction if that is an overestimate), because
udiv_qrnnd does not support the case of equality, but it's possible
for the shifted numerator to be greater than or equal to the
denominator, in which case that is an underestimate.  To avoid that,
this patch changes the ">" condition to ">=", meaning the first
division is done with a zero high word.

The tests added are all 1 + 2^-n for n from 1 to 113 except for those
that were already present in tst-strtod-round-data.

Tested for x86_64 and x86.
2020-06-30 23:04:06 +00:00
Florian Weimer
5f40e4b1ba Linux: Fix UTC offset setting in settimeofday for __TIMESIZE != 64
The time argument is NULL in this case, and an attempt to convert it
leads to a null pointer dereference.

This fixes commit d2e3b697da
("y2038: linux: Provide __settimeofday64 implementation").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-06-30 21:20:20 +02:00
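
Illustration for commit 5f40e4b1ba (not part of the commit): a sketch of the kind of guard the fix implies, namely that a NULL time argument must be passed through rather than converted. The structure and function names below are hypothetical stand-ins, not the glibc internals.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 32-bit and 64-bit time structures.  */
struct timeval32_sketch { int32_t tv_sec; int32_t tv_usec; };
struct timeval64_sketch { int64_t tv_sec; int64_t tv_usec; };

/* Widen *in into *out and return out.  When the caller passed no time
   value (in == NULL), return NULL without dereferencing anything.  */
static const struct timeval64_sketch *
widen_timeval_or_null (const struct timeval32_sketch *in,
                       struct timeval64_sketch *out)
{
  if (in == NULL)
    return NULL;
  out->tv_sec = in->tv_sec;
  out->tv_usec = in->tv_usec;
  return out;
}
```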
John Marshall
354b98cdfd random: range is not portably RAND_MAX [BZ #7003]
On other platforms, RAND_MAX (which is the range of rand(3))
may differ from 2^31-1 (which is the range of random(3)).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-30 14:20:17 -04:00
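
Illustration for commit 354b98cdfd (not part of the commit): a sketch of scaling a value returned by random(3) using its own documented range, 2^31-1, instead of RAND_MAX. The macro `RANDOM_RANGE_MAX` and the helper `random_to_unit` are hypothetical names.

```c
/* random(3) returns values in [0, 2^31 - 1], independent of RAND_MAX.  */
#define RANDOM_RANGE_MAX 2147483647L

/* Hypothetical helper: map a value returned by random() to [0, 1).  */
static double
random_to_unit (long value)
{
  return (double) value / ((double) RANDOM_RANGE_MAX + 1.0);
}
```

A caller would write `random_to_unit (random ())` after including `<stdlib.h>`. Dividing by RAND_MAX instead would only give the same result on platforms where the two ranges happen to coincide.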
Joseph Myers
3ee1e0ec5c Update kernel version to 5.7 in tst-mman-consts.py.
This patch updates the kernel version in the test tst-mman-consts.py
to 5.7.  (There are no new constants covered by this test in 5.7 that
need any other header changes; there's a new MREMAP_DONTUNMAP, but
this test doesn't yet cover MREMAP_*.)

Tested with build-many-glibcs.py.
2020-06-29 14:06:32 +00:00
Tulio Magno Quites Machado Filho
d2ba3677da powerpc: Add support for POWER10
1. Add the directories to hold POWER10 files.

2. Add support to select POWER10 libraries based on AT_PLATFORM.

3. Let submachine=power10 be set automatically.
2020-06-29 10:08:38 -03:00
Samuel Thibault
81b1c8cbb5 hurd: Simplify usleep timeout computation
as suggested by Andreas Schwab

* sysdeps/mach/usleep.c (usleep): Divide timeout in an overflow-safe way.
2020-06-29 10:10:32 +02:00
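
Illustration for commit 81b1c8cbb5 (not part of the commit): a sketch of one overflow-safe way to round a microsecond count up to a coarser unit. The target unit (milliseconds), the fixed 32-bit type, and the function name are assumptions made for the example; the actual code in sysdeps/mach/usleep.c may differ.

```c
#include <stdint.h>

/* Round microseconds up to milliseconds.  The naive form
   (usec + 999) / 1000 wraps around for values close to UINT32_MAX;
   dividing first and then adding the carry cannot overflow.  */
static uint32_t
usec_to_msec_round_up (uint32_t usec)
{
  return usec / 1000 + (usec % 1000 != 0);
}
```

For `UINT32_MAX` the naive expression wraps to 998 before the division and yields 0, while the form above yields 4294968.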
Samuel Thibault
269e4c17cd htl: Enable cancel*16 and cancel*20 tests
* nptl/tst-cancel16.c, tst-cancel20.c, tst-cancelx16.c, tst-cancelx20.c:
Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
* sysdeps/mach/hurd/i386/Makefile: Xfail tst-cancel*16 for now: missing
barrier pshared support, but test should be working otherwise.
2020-06-29 00:16:33 +00:00
Samuel Thibault
f512321130 hurd: Add remaining cancelation points
* hurd/hurdselect.c: Include <sysdep-cancel.h>.
(_hurd_select): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/hurd/accept4.c: Include <sysdep-cancel.h>.
(__libc_accept4): Surround call to __socket_accept with enabling async cancel,
and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/connect.c: Include <sysdep-cancel.h>.
(__connect): Surround call to __file_name_lookup and __socket_connect
with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of
HURD_DPORT_USE.
* sysdeps/mach/hurd/fdatasync.c: Include <sysdep-cancel.h>.
(fdatasync): Surround call to __file_sync with enabling async cancel, and use
HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/fsync.c: Include <sysdep-cancel.h>.
(fsync): Surround call to __file_sync with enabling async cancel, and use
HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/ioctl.c: Include <sysdep-cancel.h>.
(__ioctl): When request is TIOCDRAIN, surround call to send_rpc with enabling
async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/msync.c: Include <sysdep-cancel.h>.
(msync): Surround call to __vm_object_sync with enabling async cancel.
* sysdeps/mach/hurd/sigsuspend.c: Include <sysdep-cancel.h>.
(__sigsuspend): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/hurd/sigwait.c: Include <sysdep-cancel.h>.
(__sigwait): Surround wait code with enabling async cancel.
* sysdeps/mach/msync.c: Include <sysdep-cancel.h>.
(msync): Surround call to __vm_msync with enabling async cancel.
* sysdeps/mach/sleep.c: Include <sysdep-cancel.h>.
(__sleep): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/usleep.c: Include <sysdep-cancel.h>.
(usleep): Surround call to __vm_msync with enabling async cancel.
2020-06-28 22:46:21 +00:00
Samuel Thibault
1f3413338e hurd: fix usleep(ULONG_MAX)
* sysdeps/mach/usleep.c (usleep): Clamp timeout when rounding up.
2020-06-28 22:39:03 +00:00
Samuel Thibault
3c9f67e7a5 hurd: Make fcntl(F_SETLKW*) cancellation points
and add _nocancel variant.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add fcntl_nocancel.
* sysdeps/mach/hurd/fcntl.c [NOCANCEL]: Include <not-cancel.h>.
[!NOCANCEL]: Include <sysdep-cancel.h>.
(__libc_fcntl) [!NOCANCEL]: Surround __file_record_lock call with enabling async cancel, and use HURD_FD_PORT_USE_CANCEL instead of HURD_FD_PORT_USE.
* sysdeps/mach/hurd/fcntl_nocancel.c: New file, defines __fcntl_nocancel by including fcntl.c.
* sysdeps/mach/hurd/not-cancel.h (__fcntl64_nocancel): Replace macro with
    __fcntl_nocancel declaration with hidden proto, and make
    __fcntl64_nocancel call __fcntl_nocancel.
2020-06-28 18:24:37 +00:00
Samuel Thibault
09effdc9b0 hurd: make wait4 a cancellation point
and add _nocancel variant.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add wait4_nocancel.
* sysdeps/mach/hurd/wait4.c: Include <sysdep-cancel.h>
(__wait4): Surround __proc_wait with enabling async cancel, and use
__USEPORT_CANCEL instead of __USEPORT.
* sysdeps/mach/hurd/wait4_nocancel.c: New file, contains previous
implementation of __wait4.
* sysdeps/mach/hurd/not-cancel.h (__waitpid_nocancel): Replace macro with
__wait4_nocancel declaration with hidden proto, and make
__waitpid_nocancel call __wait4_nocancel.
2020-06-28 18:04:27 +00:00
Samuel Thibault
d60fdd480d hurd: Fix port definition in HURD_PORT_USE_CANCEL
* sysdeps/hurd/include/hurd/port.h: Include <libc-lock.h>.
(HURD_PORT_USE_CANCEL): Add local port variable.
2020-06-28 18:04:26 +00:00
Samuel Thibault
fd3df63fb6 hurd: make close a cancellation point
and add _nocancel variant.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add close_nocancel.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE, ld.GLIBC_PRIVATE): Add
__close_nocancel.
* sysdeps/mach/hurd/i386/localplt.data (__close_nocancel): Allow PLT.
* sysdeps/mach/hurd/close.c: Include <sysdep-cancel.h>
(__libc_close): Surround _hurd_fd_close with enabling async cancel.
* sysdeps/mach/hurd/close_nocancel.c: New file.
* sysdeps/mach/hurd/not-cancel.h (__close_nocancel): Replace macro with
declaration with hidden proto.
2020-06-28 16:34:14 +00:00
Samuel Thibault
4cafcd839f hurd: make open and openat cancellation points
and add _nocancel variants.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add open_nocancel
openat_nocancel.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE, ld.GLIBC_PRIVATE): Add
__open_nocancel.
* sysdeps/mach/hurd/dl-sysdep.c (__open_nocancel): Add alias, check it
is not hidden.
* sysdeps/mach/hurd/i386/localplt.data (__open_nocancel): Allow PLT.
* sysdeps/mach/hurd/not-cancel.h (__open_nocancel, __openat_nocancel):
Replace macros with declarations with hidden proto.
(__open64_nocancel, __openat64_nocancel): Call __open_nocancel and
__openat_nocancel instead of __open64 and __openat64.
* sysdeps/mach/hurd/open.c: Include <sysdep-cancel.h>
(__libc_open): Surround __file_name_lookup with enabling async cancel.
* sysdeps/mach/hurd/openat.c: Likewise.
* sysdeps/mach/hurd/open_nocancel.c,
sysdeps/mach/hurd/openat_nocancel.c: New files.
2020-06-28 15:11:23 +00:00
Samuel Thibault
67a78072e2 hurd: clean fd and port on thread cancel
HURD_*PORT_USE link fd and port with a stack-stored structure, so on
thread cancel we need to clean this up.

* hurd/fd-cleanup.c: New file.
* hurd/port-cleanup.c (_hurd_port_use_cleanup): New function.
* hurd/Makefile (routines): Add fd-cleanup.
* sysdeps/hurd/include/hurd.h (__USEPORT_CANCEL): New macro.
* sysdeps/hurd/include/hurd/fd.h (_hurd_fd_port_use_data): New
structure.
(_hurd_fd_port_use_cleanup): New prototype.
(HURD_DPORT_USE_CANCEL, HURD_FD_PORT_USE_CANCEL): New macros.
* sysdeps/hurd/include/hurd/port.h (_hurd_port_use_data): New structure.
(_hurd_port_use_cleanup): New prototype.
(HURD_PORT_USE_CANCEL): New macro.
* hurd/hurd/fd.h (HURD_FD_PORT_USE): Also refer to HURD_FD_PORT_USE_CANCEL.
* hurd/hurd.h (__USEPORT): Also refer to __USEPORT_CANCEL.
* hurd/hurd/port.h (HURD_PORT_USE): Also refer to HURD_PORT_USE_CANCEL.

* hurd/fd-read.c (_hurd_fd_read): Call HURD_FD_PORT_USE_CANCEL instead
of HURD_FD_PORT_USE.
* hurd/fd-write.c (_hurd_fd_write): Likewise.
* sysdeps/mach/hurd/send.c (__send): Call HURD_DPORT_USE_CANCEL instead
of HURD_DPORT_USE.
* sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise.
* sysdeps/mach/hurd/sendto.c (__sendto): Likewise.
* sysdeps/mach/hurd/recv.c (__recv): Likewise.
* sysdeps/mach/hurd/recvfrom.c (__recvfrom): Likewise.
* sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Call __USEPORT_CANCEL
instead of __USEPORT, and HURD_DPORT_USE_CANCEL instead of
HURD_DPORT_USE.
2020-06-28 00:38:46 +00:00
Samuel Thibault
6414eef6e0 htl: Move cleanup handling to non-private libc-lock
This adds sysdeps/htl/libc-lock.h which augments sysdeps/mach/libc-lock.h with
the htl-aware cleanup handling. Otherwise inclusion of libc-lock.h
without libc-lockP.h would keep only the mach-aware handling.

This also fixes cleanup getting called when the binary is
statically-linked without libpthread.

* sysdeps/htl/libc-lockP.h (__libc_cleanup_region_start,
__libc_cleanup_end, __libc_cleanup_region_end,
__pthread_get_cleanup_stack): Move to...
* sysdeps/htl/libc-lock.h: ... new file.
(__libc_cleanup_region_start): Always set handler and arg.
(__libc_cleanup_end): Always call the cleanup handler.
(__libc_cleanup_push, __libc_cleanup_pop): New macros.
2020-06-28 00:13:57 +00:00
Samuel Thibault
cf2c8cc2c6 htl: Fix includes for lockfile
These files only need what is required to use __libc_ptf_call.

* sysdeps/htl/flockfile.c: Include <libc-lockP.h> instead of
<libc-lock.h>
* sysdeps/htl/ftrylockfile.c: Include <libc-lockP.h> instead of
<errno.h>, <pthread.h>, <stdio-lock.h>
* sysdeps/htl/funlockfile.c: Include <libc-lockP.h> instead of
<pthread.h> and <stdio-lock.h>
2020-06-28 00:13:57 +00:00
Samuel Thibault
726117e01b htl: avoid cancelling threads inside critical sections
Like hurd_thread_cancel does.

* sysdeps/mach/hurd/htl/pt-docancel.c: Include <hurd/signal.h>
(__pthread_do_cancel): Lock target thread's critical_section_lock and ss
lock around thread mangling.
2020-06-27 02:34:18 +02:00
Samuel Thibault
b9ca3f3efb tst-cancel4-common.c: fix calling socketpair
PF_UNIX was actually never intended to be passed as protocol parameter to
socket() calls: it is a protocol family, not a protocol.  It happens that
Linux introduced accepting it during its 2.0 development, but it shouldn't.
OpenBSD kernels accept it as well, but FreeBSD and NetBSD rightfully do not.
GNU/Hurd does not either.

* nptl/tst-cancel4-common.c (do_test): Pass 0 instead of PF_UNIX as
protocol.
2020-06-26 23:51:52 +02:00
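
Illustration for commit b9ca3f3efb (not part of the commit): a sketch of a socketpair call that passes the address family as the first argument and 0 (the family's default protocol) as the third. The function name `unix_pair_roundtrip` is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Create a connected AF_UNIX pair with protocol 0 and send one byte
   across it.  Returns 0 on success, -1 on any failure.  */
static int
unix_pair_roundtrip (void)
{
  int fds[2];
  char sent = 'x';
  char received = 0;
  int result = -1;

  /* The family goes in the first argument; the protocol is 0.  */
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return -1;
  if (write (fds[0], &sent, 1) == 1
      && read (fds[1], &received, 1) == 1
      && received == sent)
    result = 0;
  close (fds[0]);
  close (fds[1]);
  return result;
}
```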
H.J. Lu
4fdd4d41a1 x86: Detect Intel Advanced Matrix Extensions
Intel Advanced Matrix Extensions (Intel AMX) is a new programming
paradigm consisting of two components: a set of 2-dimensional registers
(tiles) representing sub-arrays from a larger 2-dimensional memory image,
and accelerators able to operate on tiles.  Intel AMX is an extensible
architecture.  New accelerators can be added and the existing accelerator
may be enhanced to provide higher performance.  The initial features are
AMX-BF16, AMX-TILE and AMX-INT8, which are usable only if the operating
system supports both XTILECFG state and XTILEDATA state.

Add AMX-BF16, AMX-TILE and AMX-INT8 support to HAS_CPU_FEATURE and
CPU_FEATURE_USABLE.
2020-06-26 06:53:05 -07:00
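
Illustration for commit 4fdd4d41a1 (not part of the commit): a sketch of the "present and usable" distinction described above. It assumes the AMX feature bits are reported in EDX of CPUID leaf 7, subleaf 0 (bits 22, 24 and 25), and that XTILECFG and XTILEDATA are XCR0 bits 17 and 18. The macro and function names are hypothetical and are not the glibc HAS_CPU_FEATURE or CPU_FEATURE_USABLE implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CPUID leaf 7, subleaf 0, EDX feature bits.  */
#define AMX_BF16_BIT (1u << 22)
#define AMX_TILE_BIT (1u << 24)
#define AMX_INT8_BIT (1u << 25)

/* Assumed XCR0 state-component bits the operating system must enable.  */
#define XCR0_XTILECFG  (1ull << 17)
#define XCR0_XTILEDATA (1ull << 18)

/* A feature is usable only if the CPU reports it and the operating
   system has enabled both tile state components.  */
static bool
amx_feature_usable (uint32_t leaf7_edx, uint64_t xcr0, uint32_t feature_bit)
{
  const uint64_t tile_state = XCR0_XTILECFG | XCR0_XTILEDATA;
  bool present = (leaf7_edx & feature_bit) != 0;
  bool os_enabled = (xcr0 & tile_state) == tile_state;
  return present && os_enabled;
}
```

The caller is expected to supply register values it has already read with CPUID and XGETBV.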
Mike FABIAN
6e540caa21 Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-26 09:54:43 +02:00
Stefan Liebler
1d21fb1061 S390: Optimize __memset_z196.
It turned out that a 256b-mvc instruction which depends on the
result of a previous 256b-mvc instruction is counterproductive.
Therefore this patch adjusts the 256b-loop by storing the
first byte with stc and setting the remaining 255b with mvc.
Now the 255b-mvc instruction depends on the stc instruction.
2020-06-26 09:45:11 +02:00
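
Illustration for commit 1d21fb1061 (not part of the commit): a C sketch of the "store one byte, then propagate it" idea. It assumes that mvc copies bytes left to right, so an overlapping copy from dst to dst + 1 replicates the first byte; that is a reading of the commit message, not something confirmed by the source. The function name is hypothetical, and the loop models the semantics only, not the performance.

```c
#include <stddef.h>

/* Fill n bytes at dst with value by storing the first byte and then
   copying each byte from its left neighbour.  */
static void
fill_by_propagation (unsigned char *dst, unsigned char value, size_t n)
{
  if (n == 0)
    return;
  dst[0] = value;               /* Corresponds to the single stc store.  */
  for (size_t i = 1; i < n; i++)
    dst[i] = dst[i - 1];        /* Left-to-right overlapping copy.  */
}
```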
Stefan Liebler
0792c8ae1a S390: Optimize __memcpy_z196.
This patch introduces an extra loop without pfd instructions
as it turned out that the pfd instructions are useful
for copies >=64KB but are counterproductive for smaller copies.
2020-06-26 09:45:11 +02:00
Florian Weimer
2034c70e64 elf: Include <stddef.h> (for size_t), <sys/stat.h> in <ldconfig.h>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-25 16:51:03 +02:00
Szabolcs Nagy
087942251f nptl: Don't madvise user provided stack
User provided stack should not be released nor madvised at
thread exit because it's owned by the user.

If the memory is shared or file based then MADV_DONTNEED
can have unwanted effects. With memory tagging on aarch64
linux the tags are dropped and thus it may invalidate
pointers.

Tested on aarch64-linux-gnu with MTE, it fixes

FAIL: nptl/tst-stack3
FAIL: nptl/tst-stack3-mem
2020-06-25 14:19:16 +01:00
Stefan Liebler
f6b955e8ba S390: Regenerate ULPs.
Updates needed after recent exp10f commits.
2020-06-24 14:51:06 +02:00
Florian Weimer
1fb7dc751e htl: Add wrapper header for <semaphore.h> with hidden __sem_post
This is required to avoid a check-localplt failure due to a
sem_post call through the PLT.

Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2020-06-24 13:38:08 +02:00
Florian Weimer
6f3331f26d elf: Include <stdbool.h> in <dl-tunables.h> because bool is used
2020-06-24 11:02:34 +02:00
Samuel Thibault
1b90d52df9 htl: Fix case when sem_*wait is canceled while holding a token
* sysdeps/htl/sem-timedwait.c (struct cancel_ctx): Add cancel_wake
field.
(cancel_hook): When unblocking thread, set cancel_wake field to 1.
(__sem_timedwait_internal): Set cancel_wake field to 0 by default.
On cancellation exit, check whether we hold a token, to be put back.
2020-06-24 02:20:42 +02:00
Samuel Thibault
eca16db02d htl: Make sem_*wait cancellation points
By aligning its implementation on pthread_cond_wait.

* sysdeps/htl/sem-timedwait.c (cancel_ctx): New structure.
(cancel_hook): New function.
(__sem_timedwait_internal): Check for cancellation and register
cancellation hook that wakes the thread up, and check again for
cancellation on exit.
* nptl/tst-cancel13.c, nptl/tst-cancelx13.c: Move to...
* sysdeps/pthread/: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
2020-06-24 01:19:49 +02:00
Samuel Thibault
3513d5af3d htl: Simplify non-cancel path of __pthread_cond_timedwait_internal
Since __pthread_exit does not return, we do not need to indent the
non-cancel path.

* sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal):
Move cancelled path before non-cancelled path, to avoid "else"
indentation.
2020-06-24 01:19:48 +02:00
Samuel Thibault
9f6e508b42 htl: Enable tst-cancel25 test
* nptl/tst-cancel25.c: Move to...
* sysdeps/pthread/tst-cancel25.c: ... here.
(tf2): Do not test for SIGCANCEL when it is not defined.
* nptl/Makefile: Move corresponding reference to...
* sysdeps/pthread/Makefile: ... here.
2020-06-24 00:02:31 +02:00
Tulio Magno Quites Machado Filho
ae725e3f9c powerpc: Add new hwcap values
Linux commit ID ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 reserved 2 new
bits in AT_HWCAP2:
 - PPC_FEATURE2_ARCH_3_1 indicates the availability of the POWER ISA
   3.1;
 - PPC_FEATURE2_MMA indicates the availability of the Matrix-Multiply
   Assist facility.
2020-06-23 18:15:06 -03:00
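
Illustration for commit ae725e3f9c (not part of the commit): a sketch of testing the two new AT_HWCAP2 bits. The numeric values below are an assumption about the kernel's definitions and should be checked against the installed headers; the `_SKETCH` macros and the helper names are hypothetical.

```c
#include <stdbool.h>

/* Assumed bit values; verify against the kernel and glibc headers.  */
#define PPC_FEATURE2_ARCH_3_1_SKETCH 0x00040000ul
#define PPC_FEATURE2_MMA_SKETCH      0x00020000ul

/* True if the AT_HWCAP2 word advertises POWER ISA 3.1.  */
static bool
has_isa_3_1 (unsigned long hwcap2)
{
  return (hwcap2 & PPC_FEATURE2_ARCH_3_1_SKETCH) != 0;
}

/* True if the AT_HWCAP2 word advertises Matrix-Multiply Assist.  */
static bool
has_mma (unsigned long hwcap2)
{
  return (hwcap2 & PPC_FEATURE2_MMA_SKETCH) != 0;
}
```

On a powerpc Linux system the argument would typically come from `getauxval (AT_HWCAP2)`, declared in `<sys/auxv.h>`.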
Alex Butler
03e1378f94 aarch64: MTE compatible strncmp
Add support for MTE to strncmp. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Branislav Rankov <branislav.rankov@arm.com>
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
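
Illustration for the MTE-compatible string commits in this series (not part of any commit): a sketch of the bound the messages describe. If a byte at a given address is known to be accessible, a wider load is only safe while it stays inside the same 16-byte tag granule. The helper names are hypothetical and the sketch works on plain integer addresses rather than performing real loads.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of one MTE tag granule, as stated in the commit messages.  */
#define MTE_GRANULE_SIZE 16u

/* Number of bytes from addr up to the end of the granule containing it.  */
static size_t
bytes_left_in_granule (uintptr_t addr)
{
  return MTE_GRANULE_SIZE - (size_t) (addr & (MTE_GRANULE_SIZE - 1));
}

/* Nonzero if the access [addr, addr + len) stays inside one granule.  */
static int
access_stays_in_granule (uintptr_t addr, size_t len)
{
  return len <= bytes_left_in_granule (addr);
}
```

Under this model an aligned 16-byte load never crosses a granule boundary, whereas an unaligned 8-byte load starting at offset 9 does.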
Alex Butler
adac54ffc5 aarch64: MTE compatible strcmp
Add support for MTE to strcmp. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Branislav Rankov <branislav.rankov@arm.com>
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
Alex Butler
79160c06c7 aarch64: MTE compatible strrchr
Add support for MTE to strrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
Alex Butler
df06b0d90f aarch64: MTE compatible memrchr
Add support for MTE to memrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
Alex Butler
7ff899969f aarch64: MTE compatible memchr
Add support for MTE to memchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Gabor Kertesz <gabor.kertesz@arm.com>
2020-06-23 17:55:39 +01:00
Alex Butler
bb2c12aecb aarch64: MTE compatible strcpy
Add support for MTE to strcpy. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
2020-06-23 17:55:39 +01:00
Joseph Myers
8ec13b4639 Add MREMAP_DONTUNMAP from Linux 5.7
Add the new constant MREMAP_DONTUNMAP from Linux 5.7 to
bits/mman-shared.h.

Tested with build-many-glibcs.py.
2020-06-23 14:42:45 +00:00
H.J. Lu
ecbbadbf10 x86: Update CPU feature detection [BZ #26149]
1. Divide architecture features into the usable features and the preferred
features.  The usable features are for correctness and can be exported in
a stable ABI.  The preferred features are for performance and only for
glibc internal use.
2. Change struct cpu_features to

struct cpu_features
{
  struct cpu_features_basic basic;
  unsigned int *usable_p;
  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
  unsigned int usable[USABLE_FEATURE_INDEX_MAX];
  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
  ...
};

and initialize usable_p to point to the usable array so that

struct cpu_features
{
  struct cpu_features_basic basic;
  unsigned int *usable_p;
  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
};

can be exported via a stable ABI.  The cpuid and usable arrays can be
expanded with backward binary compatibility for both .o and .so files.
3. Add COMMON_CPUID_INDEX_7_ECX_1 for AVX512_BF16.
4. Detect ENQCMD, PKS, AVX512_VP2INTERSECT, MD_CLEAR, SERIALIZE, HYBRID,
TSXLDTRK, L1D_FLUSH, CORE_CAPABILITIES and AVX512_BF16.
5. Rename CAPABILITIES to ARCH_CAPABILITIES.
6. Check if AVX512_VP2INTERSECT, AVX512_BF16 and PKU are usable.
7. Update CPU feature detection test.
2020-06-22 13:09:33 -07:00
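
Illustration for commit ecbbadbf10 (not part of the commit): a sketch of why reaching the usable array through `usable_p` keeps the exported layout stable. All `sketch_` names and the array sizes are hypothetical, the `basic` member is omitted, and the exported prefix is embedded as a member (a simplification of the flat layout shown in the message) so that the example stays within well-defined C.

```c
#include <stddef.h>

/* Hypothetical sizes and a simplified register type.  */
enum { SKETCH_CPUID_INDEX_MAX = 2, SKETCH_USABLE_INDEX_MAX = 2 };
struct sketch_cpuid_registers { unsigned int eax, ebx, ecx, edx; };

/* Stable prefix that could be exported to users.  */
struct sketch_cpu_features_public
{
  unsigned int *usable_p;
  struct sketch_cpuid_registers cpuid[SKETCH_CPUID_INDEX_MAX];
};

/* Full internal layout: exported prefix first, private arrays after.  */
struct sketch_cpu_features
{
  struct sketch_cpu_features_public pub;
  unsigned int usable[SKETCH_USABLE_INDEX_MAX];
  unsigned int preferred[1];
};

/* Point usable_p at the private array, wherever it ends up.  */
static void
sketch_init (struct sketch_cpu_features *features)
{
  features->pub.usable_p = features->usable;
}

/* Users only follow the pointer, so the offset of the usable array
   may change when the cpuid array grows.  */
static int
sketch_feature_usable (const struct sketch_cpu_features_public *pub,
                       unsigned int index, unsigned int bit)
{
  return (pub->usable_p[index] & bit) != 0;
}
```

Because callers never compute the address of the usable array themselves, adding entries to either array moves data without changing anything they depend on.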