Commit Graph

1674 Commits

Author SHA1 Message Date
H.J. Lu
609b7b2d3c <sys/platform/x86.h>: Add AVX-NE-CONVERT support
Add AVX-NE-CONVERT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
4c120c88a6 <sys/platform/x86.h>: Add AVX-VNNI-INT8 support
Add AVX-VNNI-INT8 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
b39741b45f <sys/platform/x86.h>: Add MSRLIST support
Add MSRLIST support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
96037c697d <sys/platform/x86.h>: Add AVX-IFMA support
Add AVX-IFMA support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
8b4cc05eab <sys/platform/x86.h>: Add AMX-FP16 support
Add AMX-FP16 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
227983551d <sys/platform/x86.h>: Add WRMSRNS support
Add WRMSRNS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
a00db8305d <sys/platform/x86.h>: Add ArchPerfmonExt support
Add Architectural Performance Monitoring Extended Leaf (EAX = 23H)
support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
2f02d0d8e1 <sys/platform/x86.h>: Add CMPCCXADD support
Add CMPCCXADD support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
aa528a579b <sys/platform/x86.h>: Add LASS support
Add Linear Address Space Separation (LASS) support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
231bf916ce <sys/platform/x86.h>: Add RAO-INT support
Add RAO-INT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
fb90dc8513 <sys/platform/x86.h>: Add LBR support
Add architectural LBR support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
f47b7d96fb <sys/platform/x86.h>: Add RTM_FORCE_ABORT support
Add RTM_FORCE_ABORT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
f6790a489d <sys/platform/x86.h>: Add SGX-KEYS support
Add SGX-KEYS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
09cc5fee21 <sys/platform/x86.h>: Add BUS_LOCK_DETECT support
Add Bus lock debug exceptions (BUS_LOCK_DETECT) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
8c8e391166 <sys/platform/x86.h>: Add LA57 support
Add 57-bit linear addresses and five-level paging (LA57) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
H.J. Lu
083204a0e2 platform.texi: Move LAM after LAHF64_SAHF64
Move LAM after LAHF64_SAHF64 to sort x86 features.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2023-04-05 14:46:10 -07:00
Siddhesh Poyarekar
ac2a14343e manual: Document __wur usage under _FORTIFY_SOURCE
The __warn_unused_result__ attribute is only enabled when fortification
is enabled.  Mention that in the document.  The rationale for this is
essentially to mitigate against CWE-252:

[1] https://cwe.mitre.org/data/definitions/252.html

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-04-03 10:20:04 -04:00
Adhemerval Zanella Netto
33237fe83d Remove --enable-tunables configure option
And make always supported.  The configure option was added on glibc 2.25
and some features require it (such as hwcap mask, huge pages support, and
lock elisition tuning).  It also simplifies the build permutations.

Changes from v1:
 * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs
   more discussion.
 * Cleanup more code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-03-29 14:33:06 -03:00
Adhemerval Zanella
6384171fa0 Remove --disable-experimental-malloc option
It is the default since 2.26 and it has bitrotten over the years,
By using it multiple malloc tests fails:

  FAIL: malloc/tst-memalign-2
  FAIL: malloc/tst-memalign-2-malloc-hugetlb1
  FAIL: malloc/tst-memalign-2-malloc-hugetlb2
  FAIL: malloc/tst-memalign-2-mcheck
  FAIL: malloc/tst-mxfast-malloc-hugetlb1
  FAIL: malloc/tst-mxfast-malloc-hugetlb2
  FAIL: malloc/tst-tcfree2
  FAIL: malloc/tst-tcfree2-malloc-hugetlb1
  FAIL: malloc/tst-tcfree2-malloc-hugetlb2

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2023-03-29 14:33:06 -03:00
Adhemerval Zanella Netto
91fc5b9990 Remove --with-default-link configure option
Now that there is no need to use a special linker script to hardening
internal data structures, remove the --with-default-link configure
option and associated definitions.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-27 13:57:55 -03:00
Joseph Myers
2d4728e606 Update printf %b/%B C2x support
WG14 recently accepted two additions to the printf/scanf %b/%B
support: there are now PRIb* and SCNb* macros in <inttypes.h>, and
printf %B is now an optional feature defined in normative text,
instead of recommended practice, with corresponding PRIB* macros that
can also be used to test whether that optional feature is supported.
See N3072 items 14 and 15 for details (those changes were accepted,
some other changes in that paper weren't).

Add the corresponding PRI* macros to glibc and update one place in the
manual referring to %B as recommended.  (SCNb* should naturally be
added at the same time as the corresponding scanf %b support.)

Tested for x86_64 and x86.
2023-03-14 16:58:35 +00:00
Joseph Myers
dee2bea048 C2x scanf binary constant handling
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants for the %i scanf format (in addition to the %b format,
which isn't yet implemented for scanf in glibc).  Implement that scanf
support for glibc.

As with the strtol support, this is incompatible with previous C
standard versions, in that such an input string starting with 0b or 0B
was previously required to be parsed as 0 (with the rest of the input
potentially matching subsequent parts of the scanf format string).
Thus this patch adds 12 new __isoc23_* functions per long double
format (12, 24 or 36 depending on how many long double formats the
glibc configuration supports), with appropriate header redirection
support (generally very closely following that for the __isoc99_*
scanf functions - note that __GLIBC_USE (DEPRECATED_SCANF) takes
precedence over __GLIBC_USE (C2X_STRTOL), so the case of GNU
extensions to C89 continues to get old-style GNU %a and does not get
this new feature).  The function names would remain as __isoc23_* even
if C2x ends up published in 2024 rather than 2023.

When scanf %b support is added, I think it will be appropriate for all
versions of scanf to follow C2x rules for inputs to the %b format
(given that there are no compatibility concerns for a new format).

Tested for x86_64 (full glibc testsuite).  The first version was also
tested for powerpc (32-bit) and powerpc64le (stdio-common/ and wcsmbs/
tests), and with build-many-glibcs.py.
2023-03-02 19:10:37 +00:00
H.J. Lu
188ecdb777 tunables.texi: Change \code{1} to @code{1}
Update

317f1c0a8a x86-64: Add glibc.cpu.prefer_map_32bit_exec [BZ #28656]
2023-02-23 08:50:19 -08:00
H.J. Lu
317f1c0a8a x86-64: Add glibc.cpu.prefer_map_32bit_exec [BZ #28656]
Crossing 2GB boundaries with indirect calls and jumps can use more
branch prediction resources on Intel Golden Cove CPU (see the
"Misprediction for Branches >2GB" section in Intel 64 and IA-32
Architectures Optimization Reference Manual.)  There is visible
performance improvement on workloads with many PLT calls when executable
and shared libraries are mmapped below 2GB.  Add the Prefer_MAP_32BIT_EXEC
bit so that mmap will try to map executable or denywrite pages in shared
libraries with MAP_32BIT first.

NB: Prefer_MAP_32BIT_EXEC reduces bits available for address space
layout randomization (ASLR), which is always disabled for SUID programs
and can only be enabled by the tunable, glibc.cpu.prefer_map_32bit_exec,
or the environment variable, LD_PREFER_MAP_32BIT_EXEC.  This works only
between shared libraries or between shared libraries and executables with
addresses below 2GB.  PIEs are usually loaded at a random address above
4GB by the kernel.
2023-02-22 18:28:37 -08:00
Simon Kissane
31be941e43 gmon: improve mcount overflow handling [BZ# 27576]
When mcount overflows, no gmon.out file is generated, but no message is printed
to the user, leaving the user with no idea why, and thinking maybe there is
some bug - which is how BZ 27576 ended up being logged. Print a message to
stderr in this case so the user knows what is going on.

As a comment in sys/gmon.h acknowledges, the hardcoded MAXARCS value is too
small for some large applications, including the test case in that BZ. Rather
than increase it, add tunables to enable MINARCS and MAXARCS to be overridden
at runtime (glibc.gmon.minarcs and glibc.gmon.maxarcs). So if a user gets the
mcount overflow error, they can try increasing maxarcs (they might need to
increase minarcs too if the heuristic is wrong in their case.)

Note setting minarcs/maxarcs too large can cause monstartup to fail with an
out of memory error. If you set them large enough, it can cause an integer
overflow in calculating the buffer size. I haven't done anything to defend
against that - it would not generally be a security vulnerability, since these
tunables will be ignored in suid/sgid programs (due to the SXID_ERASE default),
and if you can set GLIBC_TUNABLES in the environment of a process, you can take
it over anyway (LD_PRELOAD, LD_LIBRARY_PATH, etc). I thought about modifying
the code of monstartup to defend against integer overflows, but doing so is
complicated, and I realise the existing code is susceptible to them even prior
to this change (e.g. try passing a pathologically large highpc argument to
monstartup), so I decided just to leave that possibility in-place.

Add a test case which demonstrates mcount overflow and the tunables.

Document the new tunables in the manual.

Signed-off-by: Simon Kissane <skissane@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-02-22 21:00:14 -05:00
Joseph Myers
64924422a9 C2x strtol binary constant handling
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants in strtol-family functions when the base passed is 0
or 2.  Implement that strtol support for glibc.

As discussed at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
this is incompatible with previous C standard versions, in that such
an input string starting with 0b or 0B was previously required to be
parsed as 0 (with the rest of the string unprocessed).  Thus, as
proposed there, this patch adds 20 new __isoc23_* functions with
appropriate header redirection support.  This patch does *not* do
anything about scanf %i (which will need 12 new functions per long
double variant, so 12, 24 or 36 depending on the glibc configuration),
instead leaving that for a future patch.  The function names would
remain as __isoc23_* even if C2x ends up published in 2024 rather than
2023.

Making this change leads to the question of what should happen to
internal uses of these functions in glibc and its tests.  The header
redirection (which applies for _GNU_SOURCE or any other feature test
macros enabling C2x features) has the effect of redirecting internal
uses but without those uses then ending up at a hidden alias (see the
comment in include/stdio.h about interaction with libc_hidden_proto).
It seems desirable for the default for internal uses to be the same
versions used by normal code using _GNU_SOURCE, so rather than doing
anything to disable that redirection, similar macro definitions to
those in include/stdio.h are added to the include/ headers for the new
functions.

Given that the default for uses in glibc is for the redirections to
apply, the next question is whether the C2x semantics are correct for
all those uses.  Uses with the base fixed to 10, 16 or any other value
other than 0 or 2 can be ignored.  I think this leaves the following
internal uses to consider (an important consideration for review of
this patch will be both whether this list is complete and whether my
conclusions on all entries in it are correct):

benchtests/bench-malloc-simple.c
benchtests/bench-string.h
elf/sotruss-lib.c
math/libm-test-support.c
nptl/perf.c
nscd/nscd_conf.c
nss/nss_files/files-parse.c
posix/tst-fnmatch.c
posix/wordexp.c
resolv/inet_addr.c
rt/tst-mqueue7.c
soft-fp/testit.c
stdlib/fmtmsg.c
support/support_test_main.c
support/test-container.c
sysdeps/pthread/tst-mutex10.c

I think all of these places are OK with the new semantics, except for
resolv/inet_addr.c, where the POSIX semantics of inet_addr do not
allow for binary constants; thus, I changed that file (to use
__strtoul_internal, whose semantics are unchanged) and added a test
for this case.  In the case of posix/wordexp.c I think accepting
binary constants is OK since POSIX explicitly allows additional forms
of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is
not in POSIX so again I think accepting binary constants is OK.

Functions such as __strtol_internal, which are only exported for
compatibility with old binaries from when those were used in inline
functions in headers, have unchanged semantics; the __*_l_internal
versions (purely internal to libc and not exported) have a new
argument to specify whether to accept binary constants.

As well as for the standard functions, the header redirection also
applies to the *_l versions (GNU extensions), and to legacy functions
such as strtoq, to avoid confusing inconsistency (the *q functions
redirect to __isoc23_*ll rather than needing their own __isoc23_*
entry points).  For the functions that are only declared with
_GNU_SOURCE, this means the old versions are no longer available for
normal user programs at all.  An internal __GLIBC_USE_C2X_STRTOL macro
is used to control the redirections in the headers, and cases in glibc
that wish to avoid the redirections - the function implementations
themselves and the tests of the old versions of the GNU functions -
then undefine and redefine that macro to allow the old versions to be
accessed.  (There would of course be greater complexity should we wish
to make any of the old versions into compat symbols / avoid them being
defined at all for new glibc ABIs.)

strtol_l.c has some similarity to strtol.c in gnulib, but has already
diverged some way (and isn't listed at all at
https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c
and strtoul.c); I haven't made any attempts at gnulib compatibility in
the changes to that file.

I note incidentally that inttypes.h and wchar.h are missing the
__nonnull present on declarations of this family of functions in
stdlib.h; I didn't make any changes in that regard for the new
declarations added.
2023-02-16 23:02:40 +00:00
Stefan Liebler
41f67ccbe9 S390: Influence hwcaps/stfle via GLIBC_TUNABLES.
This patch enables the option to influence hwcaps and stfle bits used
by the s390 specific ifunc-resolvers.  The currently x86-specific
tunable glibc.cpu.hwcaps is also used on s390x to achieve the task. In
addition the user can also set a CPU arch-level like z13 instead of
single HWCAP and STFLE features.

Note that the tunable only handles the features which are really used
in the IFUNC-resolvers.  All others are ignored as the values are only
used inside glibc.  Thus we can influence:
- HWCAP_S390_VXRS (z13)
- HWCAP_S390_VXRS_EXT (z14)
- HWCAP_S390_VXRS_EXT2 (z15)
- STFLE_MIE3 (z15)

The influenced hwcap/stfle-bits are stored in the s390-specific
cpu_features struct which also contains reserved fields for future
usage.

The ifunc-resolvers and users of stfle bits are adjusted to use the
information from cpu_features struct.

On 31bit, the ELF_MACHINE_IRELATIVE macro is now also defined.
Otherwise the new ifunc-resolvers segfaults as they depend on
the not yet processed_rtld_global_ro@GLIBC_PRIVATE relocation.
2023-02-07 09:19:27 +01:00
Wilco Dijkstra
32c7acd464 Replace rawmemchr (s, '\0') with strchr
Almost all uses of rawmemchr find the end of a string.  Since most targets use
a generic implementation, replacing it with strchr is better since that is
optimized by compilers into strlen (s) + s.  Also fix the generic rawmemchr
implementation to use a cast to unsigned char in the if statement.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-02-06 16:16:19 +00:00
Carlos O'Donell
5199024232 Update install.texi, and regenerate INSTALL. 2023-01-31 17:51:40 -05:00
Carlos O'Donell
1bcbb25882 Update manual/contrib.texi.
Thank Yinyu Cai for their maintainership of the LoongArch port.

Thank Vineet Gupta for their maintainership of the ARC port.

Thank Tulio Magno Quites Machado Filho for their past maintainership
of the PowerPC port.

Thank Rajalakshmi Srinivasaraghavan for their current maintainership
of the PowerPC port.
2023-01-31 17:51:40 -05:00
Paul Pluzhnikov
0674613e66 Document '%F' format specifier
The '%F' format specifier was implemented in commit 6c46718f9f on
2000-08-23, but remains undocumented in the manual.
https://stackoverflow.com/questions/75157669/format-specifier-f-missing-from-glibcs-documentation

Fix that.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-01-25 00:39:31 +00:00
Martin Joerg
8394b8c461 manual: Fix typo
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-01-18 18:34:08 +01:00
Siddhesh Poyarekar
3d3a2911ba Add _FORTIFY_SOURCE implementation documentation [BZ #28998]
There have been multiple requests to provide more detail on how the
_FORTIFY_SOURCE macro works, so this patch adds a new node in the
Library Maintenance section that does this.  A lot of the description is
implementation detail, which is why I put this in the appendix and not
in the main documentation.

Resolves: BZ #28998.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-01-10 10:22:38 -05:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Florian Weimer
118816de33 libio: Convert __vswprintf_internal to buffers (bug 27857)
Always null-terminate the buffer and set E2BIG if the buffer is too
small.  This fixes bug 27857.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-12-19 18:56:55 +01:00
Jakub Wilk
a35c960dbb manual: Add missing % in int conversion list
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
2022-10-25 09:12:30 +02:00
Wilco Dijkstra
22f4ab2d20 Use atomic_exchange_release/acquire
Rename atomic_exchange_rel/acq to use atomic_exchange_release/acquire
since these map to the standard C11 atomic builtins.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-26 16:58:08 +01:00
Wilco Dijkstra
4a07fbb689 Use C11 atomics instead of atomic_decrement_and_test
Replace atomic_decrement_and_test with atomic_fetch_add_relaxed.
These are simple counters which do not protect any shared data from
concurrent accesses. Also remove the unused file cond-perf.c.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Wilco Dijkstra
d1babeb32d Use C11 atomics instead of atomic_increment(_val)
Replace atomic_increment and atomic_increment_val with atomic_fetch_add_relaxed.
One case in sem_post.c uses release semantics (see comment above it).
The others are simple counters and do not protect any shared data from
concurrent accesses.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Paul Eggert
05967faf0e Fix BRE typos in check-safety.sh
* manual/check-safety.sh: Fix BRE portability typos.
POSIX says \] produces undefined results.
2022-09-14 00:12:20 -05:00
Wilco Dijkstra
a364a3a709 Use C11 atomics instead of atomic_decrement(_val)
Replace atomic_decrement and atomic_decrement_val with
atomic_fetch_add_relaxed.

Reviewed-by: DJ Delorie <dj@redhat.com>
2022-09-09 14:22:26 +01:00
Florian Weimer
89baed0b93 Revert "Detect ld.so and libc.so version inconsistency during startup"
This reverts commit 6f85dbf102.

Once this change hits the release branches, it will require relinking
of all statically linked applications before static dlopen works
again, for the majority of updates on release branches: The NEWS file
is regularly updated with bug references, so the __libc_early_init
suffix changes, and static dlopen cannot find the function anymore.

While this ABI check is still technically correct (we do require
rebuilding & relinking after glibc updates to keep static dlopen
working), it is too drastic for stable release branches.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-08-25 18:46:43 +02:00
Florian Weimer
6f85dbf102 Detect ld.so and libc.so version inconsistency during startup
The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
contribute to the version fingerprint used for detection.  The
fingerprint can be further refined using the --with-extra-version-id
configure argument.

_dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init.
The new function is used store a pointer to libc.so's
__libc_early_init function in the libc_map_early_init member of the
ld.so namespace structure.  This function pointer can then be called
directly, so the separate invocation function is no longer needed.

The versioned symbol lookup needs the symbol versioning data
structures, so the initialization of libc_map and libc_map_early_init
is now done from _dl_check_map_versions, after this information
becomes available.  (_dl_map_object_from_fd does not set this up
in time, so the initialization code had to be moved from there.)
This means that the separate initialization code can be removed from
dl_main because _dl_check_map_versions covers all maps, including
the initial executable loaded by the kernel.  The lookup still happens
before relocation and the invocation of IFUNC resolvers, so IFUNC
resolvers are protected from ABI mismatch.

The __libc_early_init function pointer is not protected because
so little code runs between the pointer write and the invocation
(only dynamic linker code and IFUNC resolvers).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-08-24 17:35:57 +02:00
Florian Weimer
6c93af6b45 malloc: Correct the documentation of the top_pad default
DEFAULT_TOP_PAD is defined as 131072 in
sysdeps/generic/malloc-machine.h.
2022-08-04 17:20:48 +02:00
Carlos O'Donell
7a52dfab02 Update install.texi, and regenerate INSTALL. 2022-07-29 17:51:16 -04:00
Jason A. Donenfeld
eaad4f9e8f arc4random: simplify design for better safety
Rather than buffering 16 MiB of entropy in userspace (by way of
chacha20), simply call getrandom() every time.

This approach is doubtlessly slower, for now, but trying to prematurely
optimize arc4random appears to be leading toward all sorts of nasty
properties and gotchas. Instead, this patch takes a much more
conservative approach. The interface is added as a basic loop wrapper
around getrandom(), and then later, the kernel and libc together can
work together on optimizing that.

This prevents numerous issues in which userspace is unaware of when it
really must throw away its buffer, since we avoid buffering all
together. Future improvements may include userspace learning more from
the kernel about when to do that, which might make these sorts of
chacha20-based optimizations more possible. The current heuristic of 16
MiB is meaningless garbage that doesn't correspond to anything the
kernel might know about. So for now, let's just do something
conservative that we know is correct and won't lead to cryptographic
issues for users of this function.

This patch might be considered along the lines of, "optimization is the
root of all evil," in that the much more complex implementation it
replaces moves too fast without considering security implications,
whereas the incremental approach done here is a much safer way of going
about things. Once this lands, we can take our time in optimizing this
properly using new interplay between the kernel and userspace.

getrandom(0) is used, since that's the one that ensures the bytes
returned are cryptographically secure. But on systems without it, we
fallback to using /dev/urandom. This is unfortunate because it means
opening a file descriptor, but there's not much of a choice. Secondly,
as part of the fallback, in order to get more or less the same
properties of getrandom(0), we poll on /dev/random, and if the poll
succeeds at least once, then we assume the RNG is initialized. This is a
rough approximation, as the ancient "non-blocking pool" initialized
after the "blocking pool", not before, and it may not port back to all
ancient kernels, though it does to all kernels supported by glibc
(≥3.2), so generally it's the best approximation we can do.

The motivation for including arc4random, in the first place, is to have
source-level compatibility with existing code. That means this patch
doesn't attempt to litigate the interface itself. It does, however,
choose a conservative approach for implementing it.

Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
Cc: Paul Eggert <eggert@cs.ucla.edu>
Cc: Mark Harris <mark.hsj@gmail.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-07-27 08:58:27 -03:00
caiyinyu
3a38045820 LoongArch: Update NEWS and README for the LoongArch port. 2022-07-26 12:35:12 -03:00
Adhemerval Zanella Netto
ca4d3ea513 manual: Add documentation for arc4random functions 2022-07-22 11:58:27 -03:00
Tejas Belagod
e9dd368296 AArch64: Add asymmetric faulting mode for tag violations in mem.tagging tunable
The new asymmetric mode is available when HWCAP2_MTE3 is set (support is
available), bit2 is set in the tunable (user request per application),
and the system is configured such that the asymmetric mode is preferred over
sync or async (per-cpu system-wide setting).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2022-06-30 14:01:08 +01:00
Noah Goldstein
b446822b6a x86: Add bounds x86_non_temporal_threshold
The lower-bound (16448) and upper-bound (SIZE_MAX / 16) are assumed
by memmove-vec-unaligned-erms.

The lower-bound is needed because memmove-vec-unaligned-erms unrolls
the loop aggressively in the L(large_memset_4x) case.

The upper-bound is needed because memmove-vec-unaligned-erms
right-shifts the value of `x86_non_temporal_threshold` by
LOG_4X_MEMCPY_THRESH (4) which without a bound may overflow.

The lack of lower-bound can be a correctness issue. The lack of
upper-bound cannot.
2022-06-15 14:25:55 -07:00
Sam James
7df596a58c grep: egrep -> grep -E, fgrep -> grep -F
Newer versions of GNU grep (after grep 3.7, not inclusive) will warn on
'egrep' and 'fgrep' invocations.

Convert usages within the tree to their expanded non-aliased counterparts
to avoid irritating warnings during ./configure and the test suite.

Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Fangrui Song <maskray@google.com>
2022-06-05 12:09:02 -07:00
Andreas Schwab
d976d44a89 manual: fix reference to source file 2022-05-31 16:22:49 +02:00
Siddhesh Poyarekar
9bcd12d223 wcrtomb: Make behavior POSIX compliant
The GNU implementation of wcrtomb assumes that there are at least
MB_CUR_MAX bytes available in the destination buffer passed to wcrtomb
as the first argument.  This is not compatible with the POSIX
definition, which only requires enough space for the input wide
character.

This does not break much in practice because when users supply buffers
smaller than MB_CUR_MAX (e.g. in ncurses), they compute and dynamically
allocate the buffer, which results in enough spare space (thanks to
usable_size in malloc and padding in alloca) that no actual buffer
overflow occurs.  However when the code is built with _FORTIFY_SOURCE,
it runs into the hard check against MB_CUR_MAX in __wcrtomb_chk and
hence fails.  It wasn't evident until now since dynamic allocations
would result in wcrtomb not being fortified but since _FORTIFY_SOURCE=3,
that limitation is gone, resulting in such code failing.

To fix this problem, introduce an internal buffer that is MB_LEN_MAX
long and use that to perform the conversion and then copy the resultant
bytes into the destination buffer.  Also move the fortification check
into the main implementation, which checks the result after conversion
and aborts if the resultant byte count is greater than the destination
buffer size.

One complication is that applications that assume the MB_CUR_MAX
limitation to be gone may not be able to run safely on older glibcs if
they use static destination buffers smaller than MB_CUR_MAX; dynamic
allocations will always have enough spare space that no actual overruns
will occur.  One alternative to fixing this is to bump symbol version to
prevent them from running on older glibcs but that seems too strict a
constraint.  Instead, since these users will only have made this
decision on reading the manual, I have put a note in the manual warning
them about the pitfalls of having static buffers smaller than
MB_CUR_MAX and running them on older glibc.

Benchmarking:

The wcrtomb microbenchmark shows significant increases in maximum
execution time for all locales, ranging from 10x for ar_SA.UTF-8 to
1.5x-2x for nearly everything else.  The mean execution time however saw
practically no impact, with some results even being quicker, indicating
that cache locality has a much bigger role in the overhead.

Given that the additional copy uses a temporary buffer inside wcrtomb,
it's likely that a hot path will end up putting that buffer (which is
responsible for the additional overhead) in a similar place on stack,
giving the necessary cache locality to negate the overhead.  However in
situations where wcrtomb ends up getting called at wildly different
spots on the call stack (or is on different call stacks, e.g. with
threads or different execution contexts) and is still a hotspot, the
performance lag will be visible.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-05-13 19:15:46 +05:30
Siddhesh Poyarekar
db1efe02c9 manual: Clarify that abbreviations of long options are allowed
The man page and code comments clearly state that abbreviations of long
option names are recognized correctly as long as they are unique.
Document this fact in the glibc manual as well.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
2022-05-04 15:56:47 +05:30
Florian Weimer
d056c21213 dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo
The information is theoretically available via dl_iterate_phdr as
well, but that approach is very slow if there are many shared
objects.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@rehdat.com>
2022-04-29 17:00:53 +02:00
Florian Weimer
93804a1ee0 manual: Document the dlinfo function
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@rehdat.com>
2022-04-29 17:00:48 +02:00
Florian Weimer
c935789bdf INSTALL: Rephrase -with-default-link documentation
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-04-26 14:22:23 +02:00
Florian Weimer
198abcbb94 Default to --with-default-link=no (bug 25812)
This is necessary to place the libio vtables into the RELRO segment.
New tests elf/tst-relro-ldso and elf/tst-relro-libc are added to
verify that this is what actually happens.

The new tests fail on ia64 due to lack of (default) RELRO support
inbutils, so they are XFAILed there.
2022-04-22 10:59:03 +02:00
Adhemerval Zanella
404656009b nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029)
Some Linux interfaces never restart after being interrupted by a signal
handler, regardless of the use of SA_RESTART [1].  It means that for
pthread cancellation, if the target thread disables cancellation with
pthread_setcancelstate and calls such interfaces (like poll or select),
it should not see spurious EINTR failures due the internal SIGCANCEL.

However recent changes made pthread_cancel to always sent the internal
signal, regardless of the target thread cancellation status or type.
To fix it, the previous semantic is restored, where the cancel signal
is only sent if the target thread has cancelation enabled in
asynchronous mode.

The cancel state and cancel type is moved back to cancelhandling
and atomic operation are used to synchronize between threads.  The
patch essentially revert the following commits:

  8c1c0aae20 nptl: Move cancel type out of cancelhandling
  2b51742531 nptl: Move cancel state out of cancelhandling
  26cfbb7162 nptl: Remove CANCELING_BITMASK

However I changed the atomic operation to follow the internal C11
semantic and removed the MACRO usage, it simplifies a bit the
resulting code (and removes another usage of the old atomic macros).

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
and powerpc64-linux-gnu.

[1] https://man7.org/linux/man-pages/man7/signal.7.html

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
2022-04-14 12:48:31 -03:00
Samuel Thibault
45a8e05785 hurd: Define ELIBEXEC
So we can implement it in the exec server.
2022-04-12 22:16:40 +02:00
Florian Weimer
ca7334d34b manual: SA_ONSTACK is ignored without alternate stack
The current stack is used.  No SIGILL is generated.
2022-02-28 11:50:41 +01:00
Carlos O'Donell
6415fd2ddc Update install.texi, and regenerate INSTALL. 2022-02-03 00:06:38 -05:00
Florian Weimer
6c33b01843 Linux: Use ptrdiff_t for __rseq_offset
This matches the data size initial-exec relocations use on most
targets.

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-02-02 22:37:20 +01:00
Sunil K Pandey
3e63b15d43 x86_64: Document libmvec vector functions accuracy [BZ #28766]
Document maximum 4 ulps accuracy for x86_64 libmvec functions.
This fixes BZ #28766.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2022-01-19 07:57:11 -08:00
Florian Weimer
9ba202c78f Add --with-rtld-early-cflags configure option
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-01-14 20:17:15 +01:00
Siddhesh Poyarekar
0005e54f76 manual: Drop obsolete @refill
The @refill command has been obsolete for a while and now texinfo has
started warning about it.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-01-12 14:28:44 +05:30
Szabolcs Nagy
347a5b592c math: Fix float conversion regressions with gcc-12 [BZ #28713]
Converting double precision constants to float is now affected by the
runtime dynamic rounding mode instead of being evaluated at compile
time with default rounding mode (except static object initializers).

This can change the computed result and cause performance regression.
The known correctness issues (increased ulp errors) are already fixed,
this patch fixes remaining cases of unnecessary runtime conversions.

Add float M_* macros to math.h as new GNU extension API.  To avoid
conversions the new M_* macros are used and instead of casting double
literals to float, use float literals (only required if the conversion
is inexact).

The patch was tested on aarch64 where the following symbols had new
spurious conversion instructions that got fixed:

  __clog10f
  __gammaf_r_finite@GLIBC_2.17
  __j0f_finite@GLIBC_2.17
  __j1f_finite@GLIBC_2.17
  __jnf_finite@GLIBC_2.17
  __kernel_casinhf
  __lgamma_negf
  __log1pf
  __y0f_finite@GLIBC_2.17
  __y1f_finite@GLIBC_2.17
  cacosf
  cacoshf
  casinhf
  catanf
  catanhf
  clogf
  gammaf_positive

Fixes bug 28713.

Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2022-01-10 14:27:17 +00:00
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Florian Weimer
5d28a8962d elf: Add _dl_find_object function
It can be used to speed up the libgcc unwinder, and the internal
_dl_find_dso_for_object function (which is used for caller
identification in dlopen and related functions, and in dladdr).

_dl_find_object is in the internal namespace due to bug 28503.
If libgcc switches to _dl_find_object, this namespace issue will
be fixed.  It is located in libc for two reasons: it is necessary
to forward the call to the static libc after static dlopen, and
there is a link ordering issue with -static-libgcc and libgcc_eh.a
because libc.so is not a linker script that includes ld.so in the
glibc build tree (so that GCC's internal -lc after libgcc_eh.a does
not pick up ld.so).

It is necessary to do the i386 customization in the
sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because
otherwise, multilib installations are broken.

The implementation uses software transactional memory, as suggested
by Torvald Riegel.  Two copies of the supporting data structures are
used, also achieving full async-signal-safety.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-28 22:52:56 +01:00
Florian Weimer
9702a7901e stdio: Implement %#m for vfprintf and related functions
%#m prints errno as an error constant if one is available, or
a decimal number as a fallback.  This intends to address the gap
that strerrorname_np does not work well with printf for unknown
error codes due to its NULL return values in those cases.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-23 15:02:50 +01:00
Florian Weimer
b99b0f93ee nss: Use "files dns" as the default for the hosts database (bug 28700)
This matches what is currently in nss/nsswitch.conf.  The new ordering
matches what most distributions use in their installed configuration
files.

It is common to add localhost to /etc/hosts because the name does not
exist in the DNS, but is commonly used as a host name.

With the built-in "dns [!UNAVAIL=return] files" default, dns is
searched first and provides an answer for "localhost" (NXDOMAIN).
We never look at the files database as a result, so the contents of
/etc/hosts is ignored.  This means that "getent hosts localhost"
fail without a /etc/nsswitch.conf file, even though the host name
is listed in /etc/hosts.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-12-17 12:01:25 +01:00
Adhemerval Zanella
98d5fcb8d0 malloc: Add Huge Page support for mmap
With the morecore hook removed, there is not easy way to provide huge
pages support on with glibc allocator without resorting to transparent
huge pages.  And some users and programs do prefer to use the huge pages
directly instead of THP for multiple reasons: no splitting, re-merging
by the VM, no TLB shootdowns for running processes, fast allocation
from the reserve pool, no competition with the rest of the processes
unlike THP, no swapping all, etc.

This patch extends the 'glibc.malloc.hugetlb' tunable: the value
'2' means to use huge pages directly with the system default size,
while a positive value means and specific page size that is matched
against the supported ones by the system.

Currently only memory allocated on sysmalloc() is handled, the arenas
still uses the default system page size.

To test is a new rule is added tests-malloc-hugetlb2, which run the
addes tests with the required GLIBC_TUNABLE setting.  On systems without
a reserved huge pages pool, is just stress the mmap(MAP_HUGETLB)
allocation failure.  To improve test coverage it is required to create
a pool with some allocated pages.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:38 -03:00
Adhemerval Zanella
5f6d8d97c6 malloc: Add madvise support for Transparent Huge Pages
Linux Transparent Huge Pages (THP) current supports three different
states: 'never', 'madvise', and 'always'.  The 'never' is
self-explanatory and 'always' will enable THP for all anonymous
pages.  However, 'madvise' is still the default for some system and
for such case THP will be only used if the memory range is explicity
advertise by the program through a madvise(MADV_HUGEPAGE) call.

To enable it a new tunable is provided, 'glibc.malloc.hugetlb',
where setting to a value diffent than 0 enables the madvise call.

This patch issues the madvise(MADV_HUGEPAGE) call after a successful
mmap() call at sysmalloc() with sizes larger than the default huge
page size.  The madvise() call is disable is system does not support
THP or if it has the mode set to "never" and on Linux only support
one page size for THP, even if the architecture supports multiple
sizes.

To test is a new rule is added tests-malloc-hugetlb1, which run the
addes tests with the required GLIBC_TUNABLE setting.

Checked on x86_64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-12-15 17:35:14 -03:00
Florian Weimer
0884724a95 elf: Use new dependency sorting algorithm by default
The default has to change eventually, and there are no known failures
that require a delay.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-12-14 14:44:04 +01:00
Siddhesh Poyarekar
23645707f1 Replace --enable-static-pie with --disable-default-pie
Build glibc programs and tests as PIE by default and enable static-pie
automatically if the architecture and toolchain supports it.

Also add a new configuration option --disable-default-pie to prevent
building programs as PIE.

Only the following architectures now have PIE disabled by default
because they do not work at the moment.  hppa, ia64, alpha and csky
don't work because the linker is unable to handle a pcrel relocation
generated from PIE objects.  The microblaze compiler is currently
failing with an ICE.  GNU hurd tries to enable static-pie, which does
not work and hence fails.  All these targets have default PIE disabled
at the moment and I have left it to the target maintainers to enable PIE
on their targets.

build-many-glibcs runs clean for all targets.  I also tested x86_64 on
Fedora and Ubuntu, to verify that the default build as well as
--disable-default-pie work as expected with both system toolchains.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-12-13 08:08:59 +05:30
Florian Weimer
c901c3e764 nptl: Add public rseq symbols and <sys/rseq.h>
The relationship between the thread pointer and the rseq area
is made explicit.  The constant offset can be used by JIT compilers
to optimize rseq access (e.g., for really fast sched_getcpu).

Extensibility is provided through __rseq_size and __rseq_flags.
(In the future, the kernel could request a different rseq size
via the auxiliary vector.)

Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-12-09 09:49:32 +01:00
Florian Weimer
e3e589829d nptl: Add glibc.pthread.rseq tunable to control rseq registration
This tunable allows applications to register the rseq area instead
of glibc.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-09 09:49:32 +01:00
H.J. Lu
bada2e312a Add --with-timeoutfactor=NUM to specify TIMEOUTFACTOR
On Ice Lake and Tiger Lake laptops, some test programs timeout when there
are 3 "make check -j8" runs in parallel.  Add --with-timeoutfactor=NUM to
specify an integer to scale the timeout of test programs, which can be
overriden by TIMEOUTFACTOR environment variable.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-12-04 12:58:28 -08:00
Joseph Myers
309548bec3 Support C2X printf %b, %B
C2X adds a printf %b format (see
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf>, accepted
for C2X), for outputting integers in binary.  It also has recommended
practice for a corresponding %B format (like %b, but %#B starts the
output with 0B instead of 0b).  Add support for these formats to
glibc.

One existing test uses %b as an example of an unknown format, to test
how glibc printf handles unknown formats; change that to %v.  Use of
%b and %B as user-registered format specifiers continues to work (and
we already have a test that covers that, tst-printfsz.c).

Note that C2X also has scanf %b support, plus support for binary
constants starting 0b in strtol (base 0 and 2) and scanf %i (strtol
base 0 and scanf %i coming from a previous paper that added binary
integer literals).  I intend to implement those features in a separate
patch or patches; as discussed in the thread starting at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
they will be more complicated because they involve adding extra public
symbols to ensure compatibility with existing code that might not
expect 0b constants to be handled by strtol base 0 and 2 and scanf %i,
whereas simply adding a new format specifier poses no such
compatibility concerns.

Note that the actual conversion from integer to string uses existing
code in _itoa.c.  That code has special cases for bases 8, 10 and 16,
probably so that the compiler can optimize division by an integer
constant in the code for those bases.  If desired such special cases
could easily be added for base 2 as well, but that would be an
optimization, not actually needed for these printf formats to work.

Tested for x86_64 and x86.  Also tested with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline to make sure that the test does
indeed build with GCC 12 (where format checking warnings are enabled
for most of the test).
2021-11-10 15:52:21 +00:00
Chung-Lin Tang
15a0c5730d elf: Fix slow DSO sorting behavior in dynamic loader (BZ #17645)
This second patch contains the actual implementation of a new sorting algorithm
for shared objects in the dynamic loader, which solves the slow behavior that
the current "old" algorithm falls into when the DSO set contains circular
dependencies.

The new algorithm implemented here is simply depth-first search (DFS) to obtain
the Reverse-Post Order (RPO) sequence, a topological sort. A new l_visited:1
bitfield is added to struct link_map to more elegantly facilitate such a search.

The DFS algorithm is applied to the input maps[nmap-1] backwards towards
maps[0]. This has the effect of a more "shallow" recursion depth in general
since the input is in BFS. Also, when combined with the natural order of
processing l_initfini[] at each node, this creates a resulting output sorting
closer to the intuitive "left-to-right" order in most cases.

Another notable implementation adjustment related to this _dl_sort_maps change
is the removing of two char arrays 'used' and 'done' in _dl_close_worker to
represent two per-map attributes. This has been changed to simply use two new
bit-fields l_map_used:1, l_map_done:1 added to struct link_map. This also allows
discarding the clunky 'used' array sorting that _dl_sort_maps had to sometimes
do along the way.

Tunable support for switching between different sorting algorithms at runtime is
also added. A new tunable 'glibc.rtld.dynamic_sort' with current valid values 1
(old algorithm) and 2 (new DFS algorithm) has been added. At time of commit
of this patch, the default setting is 1 (old algorithm).

Signed-off-by: Chung-Lin Tang  <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-10-21 11:23:53 -03:00
Stafford Horne
ad6feef1b0 manual: Update _TIME_BITS to clarify it's user defined
The current language reads "This macro determines...", changing to
"Define this macro...".  This is consistent with other feature macro
documentation language.

When I first read the previous language it seems to indicate that the
macro is already defined.  By changing the language to "Define this
macro..." it's clear that its the user's responsibility to define it.
2021-10-18 13:31:15 -03:00
Joseph Myers
01d34e934a Add C2X _PRINTF_NAN_LEN_MAX
C2X adds a macro _PRINTF_NAN_LEN_MAX to <stdio.h>, giving the maximum
length of printf output for a NaN.  glibc never includes an
n-char-sequence in its printf output for NaNs, so the correct value
for glibc is 4 ("-nan" or "-NAN"); define the macro accordingly.

This patch makes the macro definition conditional on __GLIBC_USE
(ISOC2X), as is generally done with features from new standard
versions.  The name is in the implementation namespace for older
standards, so it would also be possible to define it unconditionally.

Tested for x86_64.
2021-09-30 20:53:34 +00:00
Joseph Myers
90f0ac10a7 Add fmaximum, fminimum functions
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs.  fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN).  fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN.  fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value.  All these functions also treat +0 as greater than -0.  There
are also corresponding <tgmath.h> type-generic macros.

Add these functions to glibc.  The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results.  The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.

Tested for x86_64 and x86, and with build-many-glibcs.py.
2021-09-28 23:31:35 +00:00
Joseph Myers
b3f27d8150 Add narrowing fma functions
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.

The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well.  As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma.  The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.

The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf).  Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.

This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done).  (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)

Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float).  The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-22 21:25:31 +00:00
Joseph Myers
abd383584b Add narrowing square root functions
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.

The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well.  However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div.  Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.

The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32.  The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.

I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt.  The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.

Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float).  The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-10 20:56:22 +00:00
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Michael Kerrisk
5aa359d331 llio.texi: Wording fixes in description of closefrom()
Fix two problems.

Rather than "larger than", better English is "greater than".

Then there is a wordinig error on the following line: "then lowfd"
appears to be cruft.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-26 15:23:07 -03:00
Carlos O'Donell
06eae99ab4 Update install.texi, and regenerate INSTALL. 2021-08-01 16:48:43 -04:00
Siddhesh Poyarekar
fb1621a886 manual: Drop the .so suffix in libc_malloc_debug description
All references to libraries in the manual are without the .so prefix,
so do the same for libc_malloc_debug.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-07-27 07:54:46 +05:30
Siddhesh Poyarekar
d34ed66f96 manual: Document unsupported cases for interposition
These functions call the core allocator functions (realloc and malloc
respectively) and are hence guaranteed to allocate memory using the
correct functions when multiple allocators are interposed.  Having
these functions interposed in one allocator and not another may result
in confusion, hence discourage interposing them altogether.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-07-23 20:27:10 +05:30
H.J. Lu
7c124e3714 x86: Install <bits/platform/x86.h> [BZ #27958]
1. Install <bits/platform/x86.h> for <sys/platform/x86.h> which includes
<bits/platform/x86.h>.
2. Rename HAS_CPU_FEATURE to CPU_FEATURE_PRESENT which checks if the
processor has the feature.
3. Rename CPU_FEATURE_USABLE to CPU_FEATURE_ACTIVE which checks if the
feature is active.  There may be other preconditions, like sufficient
stack space or further setup for AMX, which must be satisfied before the
feature can be used.

This fixes BZ #27958.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-07-23 05:12:51 -07:00
Siddhesh Poyarekar
1e5a5866cb Remove malloc hooks [BZ #23328]
Make malloc hooks symbols compat-only so that new applications cannot
link against them and remove the declarations from the API.  Also
remove the unused malloc-hooks.h.

Finally, mark all symbols in libc_malloc_debug.so as compat so that
the library cannot be linked against.

Add a note about the deprecation in NEWS.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-07-22 18:38:12 +05:30
Siddhesh Poyarekar
2d2d9f2b48 Move malloc hooks into a compat DSO
Remove all malloc hook uses from core malloc functions and move it
into a new library libc_malloc_debug.so.  With this, the hooks now no
longer have any effect on the core library.

libc_malloc_debug.so is a malloc interposer that needs to be preloaded
to get hooks functionality back so that the debugging features that
depend on the hooks, i.e. malloc-check, mcheck and mtrace work again.
Without the preloaded DSO these debugging features will be nops.
These features will be ported away from hooks in subsequent patches.

Similarly, legacy applications that need hooks functionality need to
preload libc_malloc_debug.so.

The symbols exported by libc_malloc_debug.so are maintained at exactly
the same version as libc.so.

Finally, static binaries will no longer be able to use malloc
debugging features since they cannot preload the debugging DSO.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-07-22 18:37:59 +05:30
Adhemerval Zanella
469761eac8 elf: Fix tst-cpu-features-cpuinfo on some AMD systems (BZ #28090)
The SSBD feature is implemented in 2 different ways on AMD processors:
newer systems (Zen3) provides AMD_SSBD (function 8000_0008, EBX[24]),
while older system provides AMD_VIRT_SSBD (function 8000_0008, EBX[25]).
However for AMD_VIRT_SSBD, kernel shows both 'ssdb' and 'virt_ssdb' on
/proc/cpuinfo; while for AMD_SSBD only 'ssdb' is provided.

This now check is AMD_SSBD is set to check for 'ssbd', otherwise check
if AMD_VIRT_SSDB is set to check for 'virt_ssbd'.

Checked on x86_64-linux-gnu on a Ryzen 9 5900x.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-07-19 14:12:29 -03:00
H.J. Lu
5d98a7dae9 Define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN)
The constant PTHREAD_STACK_MIN may be too small for some processors.
Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE.  When
_DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).

Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
provide a constant target specific PTHREAD_STACK_MIN value.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-07-09 15:10:35 -07:00
Adhemerval Zanella
607449506f io: Add closefrom [BZ #10353]
The function closes all open file descriptors greater than or equal to
input argument.  Negative values are clamped to 0, i.e, it will close
all file descriptors.

As indicated by the bug report, this is a common symbol provided by
different systems (Solaris, OpenBSD, NetBSD, FreeBSD) and, although
its has inherent issues with not taking in consideration internal libc
file descriptors (such as syslog), this is also a common feature used
in multiple projects [1][2][3][4][5].

The Linux fallback implementation iterates over /proc and close all
file descriptors sequentially.  Although it was raised the questioning
whether getdents on /proc/self/fd might return disjointed entries
when file descriptor are closed; it does not seems the case on my
testing on multiple kernel (v4.18, v5.4, v5.9) and the same strategy
is used on different projects [1][2][3][5].

Also, the interface is set a fail-safe meaning that a failure in the
fallback results in a process abort.

Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.

[1] 5238e95759/src/basic/fd-util.c (L217)
[2] ddf4b77e11/src/lxc/start.c (L236)
[3] 9e4f2f3a6b/Modules/_posixsubprocess.c (L220)
[4] 5f47c0613e/src/libstd/sys/unix/process2.rs (L303-L308)
[5] https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L82
2021-07-08 14:08:14 -03:00
Adhemerval Zanella
286286283e linux: Add close_range
It was added on Linux 5.9 (278a5fbaed89) with CLOSE_RANGE_CLOEXEC
added on 5.11 (582f1fb6b721f).  Although FreeBSD has added the same
syscall, this only adds the symbol on Linux ports.  This syscall is
required to provided a fail-safe way to implement the closefrom
symbol (BZ #10353).

Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
2021-07-08 14:08:13 -03:00
Siddhesh Poyarekar
83e55c982f glibc.malloc.check: Fix nit in documentation
The tunable will not work with *any* non-zero tunable value since its
list of allowed values is 0-3.  Fix the documentation to reflect that.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-07-07 07:02:13 +05:30
Armin Brauns
b156c5f0a7 manual: fix description for preadv() 2021-07-06 16:23:15 +02:00
H.J. Lu
ea8e465a6b x86: Check RTM_ALWAYS_ABORT for RTM [BZ #28033]
From

https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html

* Intel TSX will be disabled by default.
* The processor will force abort all Restricted Transactional Memory (RTM)
  transactions by default.
* A new CPUID bit CPUID.07H.0H.EDX[11](RTM_ALWAYS_ABORT) will be enumerated,
  which is set to indicate to updated software that the loaded microcode is
  forcing RTM abort.
* On processors that enumerate support for RTM, the CPUID enumeration bits
  for Intel TSX (CPUID.07H.0H.EBX[11] and CPUID.07H.0H.EBX[4]) continue to
  be set by default after microcode update.
* Workloads that were benefited from Intel TSX might experience a change
  in performance.
* System software may use a new bit in Model-Specific Register (MSR) 0x10F
  TSX_FORCE_ABORT[TSX_CPUID_CLEAR] functionality to clear the Hardware Lock
  Elision (HLE) and RTM bits to indicate to software that Intel TSX is
  disabled.

1. Add RTM_ALWAYS_ABORT to CPUID features.
2. Set RTM usable only if RTM_ALWAYS_ABORT isn't set.  This skips the
string/tst-memchr-rtm etc. testcases on the affected processors, which
always fail after a microcde update.
3. Check RTM feature, instead of usability, against /proc/cpuinfo.

This fixes BZ #28033.
2021-07-01 10:47:35 -07:00
Adhemerval Zanella
c32c868ab8 posix: Add _Fork [BZ #4737]
Austin Group issue 62 [1] dropped the async-signal-safe requirement
for fork and provided a async-signal-safe _Fork replacement that
does not run the atfork handlers.  It will be included in the next
POSIX standard.

It allow to close a long standing issue to make fork AS-safe (BZ#4737).
As indicated on the bug, besides the internal lock for the atfork
handlers itself; there is no guarantee that the handlers itself will
not introduce more AS-safe issues.

The idea is synchronize fork with the required internal locks to allow
children in multithread processes to use mostly of standard function
(even though POSIX states only AS-safe function should be used).  On
signal handles, _Fork should be used intead and only AS-safe functions
should be used.

For testing, the new tst-_Fork only check basic usage.  I also added
a new tst-mallocfork3 which uses the same strategy to check for
deadlock of tst-mallocfork2 but using threads instead of subprocesses
(and it does deadlock if it replaces _Fork with fork).

[1] https://austingroupbugs.net/view.php?id=62
2021-06-28 15:55:56 -03:00
Florian Weimer
dd45734e32 nptl: Add glibc.pthread.stack_cache_size tunable
The valgrind/helgrind test suite needs a way to make stack dealloction
more prompt, and this feature seems to be generally useful.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-06-28 16:41:58 +02:00
Adhemerval Zanella
e3e3eb0a2e x86: Fix tst-cpu-features-cpuinfo on Ryzen 9 (BZ #27873)
AMD define different flags for IRPB, IBRS, and STIPBP [1], so new
x86_64_cpu are added and IBRS_IBPB is only tested for Intel.

The SSDB is also defined and implemented different on AMD [2],
and also a new AMD_SSDB flag is added.  It should map to the
cpuinfo 'ssdb' on recent AMD cpus.

It fixes tst-cpu-features-cpuinfo and tst-cpu-features-cpuinfo-static
on recent AMD cpus.

Checked on x86_64-linux-gnu on AMD Ryzen 9 5900X.

[1] https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf
[2] https://bugzilla.kernel.org/show_bug.cgi?id=199889

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-06-24 09:57:46 -03:00
Paul Eggert
03caacbc7f doc: _TIME_BITS defaults may change
* NEWS: Don't imply the default will always be 32-bit.
* manual/creature.texi (Feature Test Macros):
Say that _TIME_BITS and _FILE_OFFSET_BITS defaults
may change in future releases.
2021-06-23 09:04:22 -07:00
Adhemerval Zanella
47f24c21ee y2038: Add support for 64-bit time on legacy ABIs
A new build flag, _TIME_BITS, enables the usage of the newer 64-bit
time symbols for legacy ABI (where 32-bit time_t is default).  The 64
bit time support is only enabled if LFS (_FILE_OFFSET_BITS=64) is
also used.

Different than LFS support, the y2038 symbols are added only for the
required ABIs (armhf, csky, hppa, i386, m68k, microblaze, mips32,
mips64-n32, nios2, powerpc32, sparc32, s390-32, and sh).  The ABIs with
64-bit time support are unchanged, both for symbol and types
redirection.

On Linux the full 64-bit time support requires a minimum of kernel
version v5.1.  Otherwise, the 32-bit fallbacks are used and might
results in error with overflow return code (EOVERFLOW).

The i686-gnu does not yet support 64-bit time.

This patch exports following rediretions to support 64-bit time:

  * libc:
    adjtime
    adjtimex
    clock_adjtime
    clock_getres
    clock_gettime
    clock_nanosleep
    clock_settime
    cnd_timedwait
    ctime
    ctime_r
    difftime
    fstat
    fstatat
    futimens
    futimes
    futimesat
    getitimer
    getrusage
    gettimeofday
    gmtime
    gmtime_r
    localtime
    localtime_r
    lstat_time
    lutimes
    mktime
    msgctl
    mtx_timedlock
    nanosleep
    nanosleep
    ntp_gettime
    ntp_gettimex
    ppoll
    pselec
    pselect
    pthread_clockjoin_np
    pthread_cond_clockwait
    pthread_cond_timedwait
    pthread_mutex_clocklock
    pthread_mutex_timedlock
    pthread_rwlock_clockrdlock
    pthread_rwlock_clockwrlock
    pthread_rwlock_timedrdlock
    pthread_rwlock_timedwrlock
    pthread_timedjoin_np
    recvmmsg
    sched_rr_get_interval
    select
    sem_clockwait
    semctl
    semtimedop
    sem_timedwait
    setitimer
    settimeofday
    shmctl
    sigtimedwait
    stat
    thrd_sleep
    time
    timegm
    timerfd_gettime
    timerfd_settime
    timespec_get
    utime
    utimensat
    utimes
    utimes
    wait3
    wait4

  * librt:
    aio_suspend
    mq_timedreceive
    mq_timedsend
    timer_gettime
    timer_settime

  * libanl:
    gai_suspend

Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-06-15 10:42:11 -03:00
Matheus Castanho
ebae2f5a6f Add build option to disable usage of scv on powerpc
Commit 68ab82f566 added support for the scv
syscall ABI on powerpc.  Since then systems that have kernel and processor
support started using scv.  However adding the proper support for a new syscall
ABI requires changes to several other projects (e.g. qemu, valgrind, strace,
kernel), which are gradually receiving support.

Meanwhile, having a way to disable scv on glibc at build time can be useful for
distros that may encounter conflicts with projects that still do not support the
scv ABI, buying time until proper support is added.

This commit adds a --disable-scv option that disables scv support and uses sc
for all syscalls, like before commit 68ab82f566.

Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
2021-06-10 16:23:25 -03:00
Adhemerval Zanella
2b51742531 nptl: Move cancel state out of cancelhandling
Now that thread cancellation state is not accessed concurrently anymore,
it is possible to move it out the 'cancelhandling'.

The code is also simplified: CANCELLATION_P is replaced with a
internal pthread_testcancel call and the CANCELSTATE_BIT{MASK} is
removed.

With this behavior pthread_setcancelstate does not require to act on
cancellation if cancel type is asynchronous (is already handled either
by pthread_setcanceltype or by the signal handler).

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
2021-06-09 15:16:45 -03:00
Xeonacid
5295172e20 fix typo
"accomodate" should be "accommodate"
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-06-02 12:16:49 +02:00
Joseph Myers
858045ad1c Update floating-point feature test macro handling for C2X
ISO C2X has made some changes to the handling of feature test macros
related to features from the floating-point TSes, and to exactly what
such features are present in what headers, that require corresponding
changes in glibc.

* For the few features that were controlled by
  __STDC_WANT_IEC_60559_BFP_EXT__ (and the corresponding DFP macro) in
  C2X, there is now instead a new feature test macro
  __STDC_WANT_IEC_60559_EXT__ covering both binary and decimal FP.
  This controls CR_DECIMAL_DIG in <float.h> (provided by GCC; I
  implemented support for the new feature test macro for GCC 11) and
  the totalorder and payload functions in <math.h>.  C2X no longer
  says anything about __STDC_WANT_IEC_60559_BFP_EXT__ (so it's
  appropriate for that macro to continue to enable exactly the
  features from TS 18661-1).

* The SNAN macros for each floating-point type have moved to <float.h>
  (and been renamed in the process).  Thus, the copies in <math.h>
  should only be defined for __STDC_WANT_IEC_60559_BFP_EXT__, not for
  C2X.

* The fmaxmag and fminmag functions have been removed (replaced by new
  functions for the new min/max operations in IEEE 754-2019).  Thus
  those should also only be declared for
  __STDC_WANT_IEC_60559_BFP_EXT__.

* The _FloatN / _FloatNx handling for the last two points in glibc is
  trickier, since __STDC_WANT_IEC_60559_TYPES_EXT__ is still in C2X
  (the integration of TS 18661-3 as an Annex, that is, which hasn't
  yet been merged into the C standard git repository but has been
  accepted by WG14), so C2X with that macro should not declare some
  things that are declared for older standards with that macro.  The
  approach taken here is to provide the declarations (when
  __STDC_WANT_IEC_60559_TYPES_EXT__ is enabled) only when (defined
  __USE_GNU || !__GLIBC_USE (ISOC2X)), so if C2X features are enabled
  then those declarations (that are only in TS 18661-3 and not in C2X)
  will only be provided if _GNU_SOURCE is defined as well.  Thus
  _GNU_SOURCE remains a superset of the TS features as well as of C2X.

Some other somewhat related changes in C2X are not addressed here.
There's an open proposal not to include the fmin and fmax functions
for the _FloatN / _FloatNx types, given the new min/max operations,
which could be handled like the previous point if adopted.  And the
fromfp functions have been changed to return a result in floating type
rather than intmax_t / uintmax_t; my inclination there is to treat
that like that change of totalorder type (new symbol versions etc. for
the ABI change; old versions become compat symbols and are no longer
supported as an API).

Tested for x86_64 and x86.
2021-06-01 14:22:06 +00:00
Naohiro Tamura
fa527f345c aarch64: Added optimized memcpy and memmove for A64FX
This patch optimizes the performance of memcpy/memmove for A64FX [1]
which implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB
cache per NUMA node.

The performance optimization makes use of Scalable Vector Register
with several techniques such as loop unrolling, memory access
alignment, cache zero fill, and software pipelining.

SVE assembler code for memcpy/memmove is implemented as Vector Length
Agnostic code so theoretically it can be run on any SOC which supports
ARMv8-A SVE standard.

We confirmed that all testcases have been passed by running 'make
check' and 'make xcheck' not only on A64FX but also on ThunderX2.

And also we confirmed that the SVE 512 bit vector register performance
is roughly 4 times better than Advanced SIMD 128 bit register and 8
times better than scalar 64 bit register by running 'make bench'.

[1] https://github.com/fujitsu/A64FX

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
2021-05-27 09:47:53 +01:00
Naohiro Tamura
3856056358 aarch64: Added Vector Length Set test helper script
This patch is a test helper script to change Vector Length for child
process. This script can be used as test-wrapper for 'make check'.

Usage examples:

~/build$ make check subdirs=string \
test-wrapper='~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16'

~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16 \
make test t=string/test-memcpy

~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 32 \
./debugglibc.sh string/test-memmove

~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 64 \
./testrun.sh string/test-memset
2021-05-26 12:01:06 +01:00
Florian Weimer
ce0b7961ae nptl: Consolidate async cancel enable/disable implementation in libc
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt.  This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.

The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros.  They call the
__pthread_* functions unconditionally now.  The macros are still
needed because shared code uses them; Hurd has different definitions.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-05-05 17:19:32 +02:00
Paul Eggert
bdc674d97b Improve documentation for malloc etc. (BZ#27719)
Cover key corner cases (e.g., whether errno is set) that are well
settled in glibc, fix some examples to avoid integer overflow, and
update some other dated examples (code needed for K&R C, e.g.).
* manual/charset.texi (Non-reentrant String Conversion):
* manual/filesys.texi (Symbolic Links):
* manual/memory.texi (Allocating Cleared Space):
* manual/socket.texi (Host Names):
* manual/string.texi (Concatenating Strings):
* manual/users.texi (Setting Groups):
Use reallocarray instead of realloc, to avoid integer overflow issues.
* manual/filesys.texi (Scanning Directory Content):
* manual/memory.texi (The GNU Allocator, Hooks for Malloc):
* manual/tunables.texi:
Use code font for 'malloc' instead of roman font.
(Symbolic Links): Don't assume readlink return value fits in 'int'.
* manual/memory.texi (Memory Allocation and C, Basic Allocation)
(Malloc Examples, Alloca Example):
* manual/stdio.texi (Formatted Output Functions):
* manual/string.texi (Concatenating Strings, Collation Functions):
Omit pointer casts that are needed only in ancient K&R C.
* manual/memory.texi (Basic Allocation):
Say that malloc sets errno on failure.
Say "convert" rather than "cast", since casts are no longer needed.
* manual/memory.texi (Basic Allocation):
* manual/string.texi (Concatenating Strings):
In examples, use C99 declarations after statements for brevity.
* manual/memory.texi (Malloc Examples): Add portability notes for
malloc (0), errno setting, and PTRDIFF_MAX.
(Changing Block Size): Say that realloc (p, 0) acts like
(p ? (free (p), NULL) : malloc (0)).
Add xreallocarray example, since other examples can use it.
Add portability notes for realloc (0, 0), realloc (p, 0),
PTRDIFF_MAX, and improve notes for reallocating to the same size.
(Allocating Cleared Space): Reword now-confusing discussion
about replacement, and xref "Replacing malloc".
* manual/stdio.texi (Formatted Output Functions):
Don't assume message size fits in 'int'.
* manual/string.texi (Concatenating Strings):
Fix undefined behavior involving arithmetic on a freed pointer.
2021-04-13 12:17:56 -07:00
Alyssa Ross
4d8d70d301 manual: clarify that scanf %n supports type modifiers
My initial reading of the %n documentation was that it didn't support
type conversions, because it only mentioned int*.

Corresponding man-pages patch:
https://lore.kernel.org/linux-man/20210328215509.31666-1-hi@alyssa.is/

Reviewed-by: Arjun Shankar <arjun@redhat.com>
2021-03-30 20:40:39 +02:00
Wilco Dijkstra
47ad14d789 math: Remove mpa files [BZ #15267]
Finally remove all mpa related files, headers, declarations, probes, unused
tables and update makefiles.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 14:26:36 +00:00
Lukasz Majewski
496e36f225 tst: Extend cross-test-ssh.sh to specify if target date can be altered
This code adds new flag - '--allow-time-setting' to cross-test-ssh.sh
script to indicate if it is allowed to alter the date on the system
on which tests are executed. This change is supposed to be used with
test systems, which use virtual machines for testing.

The GLIBC_TEST_ALLOW_TIME_SETTING env variable is exported to the
remote environment on which the eligible test is run and brings no
functional change when it is not.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-03-08 22:37:16 +01:00
Siddhesh Poyarekar
61117bfa1b tunables: Simplify TUNABLE_SET interface
The TUNABLE_SET interface took a primitive C type argument, which
resulted in inconsistent type conversions internally due to incorrect
dereferencing of types, especialy on 32-bit architectures.  This
change simplifies the TUNABLE setting logic along with the interfaces.

Now all numeric tunable values are stored as signed numbers in
tunable_num_t, which is intmax_t.  All calls to set tunables cast the
input value to its primitive type and then to tunable_num_t for
storage.  This relies on gcc-specific (although I suspect other
compilers woul also do the same) unsigned to signed integer conversion
semantics, i.e. the bit pattern is conserved.  The reverse conversion
is guaranteed by the standard.
2021-02-10 19:08:33 +05:30
H.J. Lu
5ab25c8875 x86: Add PTWRITE feature detection [BZ #27346]
1. Add CPUID_INDEX_14_ECX_0 for CPUID leaf 0x14 to detect PTWRITE feature
in EBX of CPUID leaf 0x14 with ECX == 0.
2. Add PTWRITE detection to CPU feature tests.
3. Add 2 static CPU feature tests.
2021-02-07 08:01:14 -08:00
Florian Weimer
2d8a22cdec manual: Correct description of ENTRY [BZ #17183]
The struct tag is actually entry (not ENTRY).  The data member has
type void *, and it can point to binary data.  Only the key member is
required to be a null-terminated string.

Reviewed-by: Arjun Shankar <arjun@redhat.com>
2021-02-04 15:22:12 +01:00
H.J. Lu
6c57d32048 sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305]
Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from
AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack
space required in order to gurantee successful, non-nested handling
of a single signal whose handler is an empty function, and _SC_SIGSTKSZ
which is the suggested minimum number of bytes of stack space required
for a signal stack.

If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns
MINSIGSTKSZ.  On Linux/x86 with XSAVE, the signal frame used by kernel
is composed of the following areas and laid out as:

 ------------------------------
 | alignment padding          |
 ------------------------------
 | xsave buffer               |
 ------------------------------
 | fsave header (32-bit only) |
 ------------------------------
 | siginfo + ucontext         |
 ------------------------------

Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave
header (32-bit only) + size of siginfo and ucontext + alignment padding.

If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ
are redefined as

/* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ).  */
 # undef SIGSTKSZ
 # define SIGSTKSZ sysconf (_SC_SIGSTKSZ)

/* Minimum stack size for a signal handler: SIGSTKSZ.  */
 # undef MINSIGSTKSZ
 # define MINSIGSTKSZ SIGSTKSZ

Compilation will fail if the source assumes constant MINSIGSTKSZ or
SIGSTKSZ.

The reason for not simply increasing the kernel's MINSIGSTKSZ #define
(apart from the fact that it is rarely used, due to glibc's shadowing
definitions) was that userspace binaries will have baked in the old
value of the constant and may be making assumptions about it.

For example, the type (char [MINSIGSTKSZ]) changes if this #define
changes.  This could be a problem if an newly built library tries to
memcpy() or dump such an object defined by and old binary.
Bounds-checking and the stack sizes passed to things like sigaltstack()
and makecontext() could similarly go wrong.
2021-02-01 11:00:52 -08:00
Tulio Magno Quites Machado Filho
ad47748992 Update INSTALL with package versions that are known to work
Most packages have been tested with their latest releases, except for
Python, whose latest version is 3.9.1.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-25 13:13:17 -03:00
John McCabe
56ef6ab0cd manual: Correct argument order in mount examples [BZ #27207]
Reviewed-by: DJ Delorie <dj@redhat.com>
2021-01-22 14:22:41 -05:00
H.J. Lu
ff6d62e9ed <sys/platform/x86.h>: Remove the C preprocessor magic
In <sys/platform/x86.h>, define CPU features as enum instead of using
the C preprocessor magic to make it easier to wrap this functionality
in other languages.  Move the C preprocessor magic to internal header
for better GCC codegen when more than one features are checked in a
single expression as in x86-64 dl-hwcaps-subdirs.c.

1. Rename COMMON_CPUID_INDEX_XXX to CPUID_INDEX_XXX.
2. Move CPUID_INDEX_MAX to sysdeps/x86/include/cpu-features.h.
3. Remove struct cpu_features and __x86_get_cpu_features from
<sys/platform/x86.h>.
4. Add __x86_get_cpuid_feature_leaf to <sys/platform/x86.h> and put it
in libc.
5. Make __get_cpu_features() private to glibc.
6. Replace __x86_get_cpu_features(N) with __get_cpu_features().
7. Add _dl_x86_get_cpu_features to GLIBC_PRIVATE.
8. Use a single enum index for each CPU feature detection.
9. Pass the CPUID feature leaf to __x86_get_cpuid_feature_leaf.
10. Return zero struct cpuid_feature for the older glibc binary with a
smaller CPUID_INDEX_MAX [BZ #27104].
11. Inside glibc, use the C preprocessor magic so that cpu_features data
can be loaded just once leading to more compact code for glibc.

256 bits are used for each CPUID leaf.  Some leaves only contain a few
features.  We can add exceptions to such leaves.  But it will increase
code sizes and it is harder to provide backward/forward compatibilities
when new features are added to such leaves in the future.

When new leaves are added, _rtld_global_ro offsets will change which
leads to race condition during in-place updates. We may avoid in-place
updates by

1. Rename the old glibc.
2. Install the new glibc.
3. Remove the old glibc.

NB: A function, __x86_get_cpuid_feature_leaf , is used to avoid the copy
relocation issue with IFUNC resolver as shown in IFUNC resolver tests.
2021-01-21 05:58:17 -08:00
H.J. Lu
86f65dffc2 ld.so: Add --list-tunables to print tunable values
Pass --list-tunables to ld.so to print tunables with min and max values.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-15 05:59:10 -08:00
Paul Eggert
21c3f4b536 Sync FDL from https://www.gnu.org/licenses/fdl-1.3.texi 2021-01-02 12:46:25 -08:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Siddhesh Poyarekar
c43c579612 Introduce _FORTIFY_SOURCE=3
Introduce a new _FORTIFY_SOURCE level of 3 to enable additional
fortifications that may have a noticeable performance impact, allowing
more fortification coverage at the cost of some performance.

With llvm 9.0 or later, this will replace the use of
__builtin_object_size with __builtin_dynamic_object_size.

__builtin_dynamic_object_size
-----------------------------

__builtin_dynamic_object_size is an LLVM builtin that is similar to
__builtin_object_size.  In addition to what __builtin_object_size
does, i.e. replace the builtin call with a constant object size,
__builtin_dynamic_object_size will replace the call site with an
expression that evaluates to the object size, thus expanding its
applicability.  In practice, __builtin_dynamic_object_size evaluates
these expressions through malloc/calloc calls that it can associate
with the object being evaluated.

A simple motivating example is below; -D_FORTIFY_SOURCE=2 would miss
this and emit memcpy, but -D_FORTIFY_SOURCE=3 with the help of
__builtin_dynamic_object_size is able to emit __memcpy_chk with the
allocation size expression passed into the function:

void *copy_obj (const void *src, size_t alloc, size_t copysize)
{
  void *obj = malloc (alloc);
  memcpy (obj, src, copysize);
  return obj;
}

Limitations
-----------

If the object was allocated elsewhere that the compiler cannot see, or
if it was allocated in the function with a function that the compiler
does not recognize as an allocator then __builtin_dynamic_object_size
also returns -1.

Further, the expression used to compute object size may be non-trivial
and may potentially incur a noticeable performance impact.  These
fortifications are hence enabled at a new _FORTIFY_SOURCE level to
allow developers to make a choice on the tradeoff according to their
environment.
2020-12-31 16:55:21 +05:30
Paul Eggert
69fda43b8d free: preserve errno [BZ#17924]
In the next release of POSIX, free must preserve errno
<https://www.austingroupbugs.net/view.php?id=385>.
Modify __libc_free to save and restore errno, so that
any internal munmap etc. syscalls do not disturb the caller's errno.
Add a test malloc/tst-free-errno.c (almost all by Bruno Haible),
and document that free preserves errno.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-29 00:46:46 -08:00
H.J. Lu
a2e5da2cf4 <sys/platform/x86.h>: Add Intel LAM support
Add Intel Linear Address Masking (LAM) support to <sys/platform/x86.h>.
HAS_CPU_FEATURE (LAM) can be used to detect if LAM is enabled in CPU.

LAM modifies the checking that is applied to 64-bit linear addresses,
allowing software to use of the untranslated address bits for metadata.
2020-12-22 03:45:47 -08:00
Richard Earnshaw
26450d04d3 elf: Add a tunable to control use of tagged memory
Add a new glibc tunable: mem.tagging.  This is a decimal constant in
the range 0-255 but used as a bit-field.

Bit 0 enables use of tagged memory in the malloc family of functions.
Bit 1 enables precise faulting of tag failure on platforms where this
can be controlled.
Other bits are currently unused, but if set will cause memory tag
checking for the current process to be enabled in the kernel.
2020-12-21 15:25:25 +00:00
Richard Earnshaw
3378408987 config: Allow memory tagging to be enabled when configuring glibc
This patch adds the configuration machinery to allow memory tagging to be
enabled from the command line via the configure option --enable-memory-tagging.

The current default is off, though in time we may change that once the API
is more stable.
2020-12-21 15:25:25 +00:00
Anssi Hannula
69a7ca7705 ieee754: Remove unused __sin32 and __cos32
The __sin32 and __cos32 functions were only used in the now removed slow
path of asin and acos.
2020-12-18 12:10:31 +05:30
Stefan Liebler
844b4d8b4b s390x: Require GCC 7.1 or later to build glibc.
GCC 6.5 fails to correctly build ldconfig with recent ld.so.cache
commits, e.g.:
785969a047
elf: Implement a string table for ldconfig, with tail merging

If glibc is build with gcc 6.5.0:
__builtin_add_overflow is used in
<glibc>/elf/stringtable.c:stringtable_finalize()
which leads to ldconfig failing with "String table is too large".
This is also recognizable in following tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable

See gcc "Bug 98269 - gcc 6.5.0 __builtin_add_overflow() with small
uint32_t values incorrectly detects overflow"
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269)
2020-12-17 16:18:04 +01:00
Florian Weimer
e960d8313d manual: Clarify File Access Modes section and add O_PATH
Kees Cook reported that the current text is misleading:

  <https://lore.kernel.org/lkml/202005150847.2B1ED8F81@keescook/>
2020-12-03 10:59:50 +01:00
Adhemerval Zanella
b4c3446836 nptl: Return EINVAL for invalid clock for pthread_clockjoin_np
The align the GNU extension with the others one that accept specify
which clock to wait for (such as pthread_mutex_clocklock).

Check on x86_64-linux-gnu.

Reviewed-by: Lukasz Majewski <lukma@denx.de>
2020-11-25 10:46:25 -03:00
Carlos O'Donell
d598134bfb Argument Syntax: Use "option", @option, and @command.
Suggested-by: David O'Brien <daobrien@redhat.com>
2020-10-30 13:08:38 -04:00
Siddhesh Poyarekar
6c2b579962 Reword description of SXID_* tunable properties
The SXID_* tunable properties only influence processes that are
AT_SECURE, so make that a bit more explicit in the documentation and
comment.

Revisiting the code after a few years I managed to confuse myself, so
I imagine there could be others who may have incorrectly assumed like
I did that the SXID_ERASE tunables are not inherited by children of
non-AT_SECURE processes.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2020-10-22 13:52:38 +05:30
Adhemerval Zanella
ab5ee31e14 Move vtimes to a compatibility symbol
I couldn't pinpoint which standard has added it, but no other POSIX
system supports it and/or no longer provide it.  The 'struct vtimes'
also has a lot of drawbacks due its limited internal type size.

I couldn't also see find any project that actually uses this symbol,
either in some dignostic way (such as sanitizer).  So I think it should
be safer to just move to compat symbol, instead of deprecated.  The
idea it to avoid new ports to export such broken interface (riscv32
for instance).

Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-10-19 16:44:20 -03:00
Benno Schulenberg
af548086ed manual: correct the spelling of "MALLOC_PERTURB_" [BZ #23015]
Reported-by: Martin Dorey <martin.dorey@hds.com>
2020-10-13 14:58:16 +02:00
Benno Schulenberg
a5177499e4 manual: replace an obsolete collation example with a valid one
In the Spanish language, the digraph "ll" has not been considered a
separate letter since 1994:
  https://www.rae.es/consultas/exclusion-de-ch-y-ll-del-abecedario

Since January 1998 (commit 49891c1062),
glibc's locale data no longer specifies "ch" and "ll" as separate
collation elements.  So, it's better to not use "ll" in an example.

Also, the Czech "ch" is a better example as it collates in a more
surprising place.
2020-10-13 14:58:16 +02:00
H.J. Lu
428985c436 <sys/platform/x86.h>: Add FSRCS/FSRS/FZLRM support
Add Fast Short REP CMP and SCA (FSRCS), Fast Short REP STO (FSRS) and
Fast Zero-Length REP MOV (FZLRM) support to <sys/platform/x86.h>.
2020-10-09 11:52:30 -07:00
H.J. Lu
c712401bc6 <sys/platform/x86.h>: Add Intel HRESET support
Add Intel HRESET support to <sys/platform/x86.h>.
2020-10-09 11:52:30 -07:00
H.J. Lu
875a50ff63 <sys/platform/x86.h>: Add AVX-VNNI support
Add AVX-VNNI support to <sys/platform/x86.h>.
2020-10-09 11:52:30 -07:00
H.J. Lu
ebe454bcca <sys/platform/x86.h>: Add AVX512_FP16 support
Add AVX512_FP16 support to <sys/platform/x86.h>.
2020-10-09 11:52:30 -07:00
H.J. Lu
7674695cf7 <sys/platform/x86.h>: Add Intel UINTR support
Add Intel UINTR support to <sys/platform/x86.h>.
2020-10-09 11:52:30 -07:00
Florian Weimer
27fe5f2e67 Linux: Require properly configured /dev/pts for PTYs
Current systems do not have BSD terminals, so the fallback code in
posix_openpt/getpt does not do anything.  Also remove the file system
check for /dev/pts.  Current systems always have a devpts file system
mounted there if /dev/ptmx exists.

grantpt is now essentially a no-op.  It only verifies that the
argument is a ptmx-descriptor.  Therefore, this change indirectly
addresses bug 24941.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-10-07 14:55:50 +02:00
Jonathan Wakely
5bb2e5300b manual: Fix typo 2020-10-05 17:29:46 +01:00
H.J. Lu
dfb8e514cf Set tunable value as well as min/max values
Some tunable values and their minimum/maximum values must be determinted
at run-time.  Add TUNABLE_SET_WITH_BOUNDS and TUNABLE_SET_WITH_BOUNDS_FULL
to update tunable value together with minimum and maximum values.
__tunable_set_val is updated to set tunable value as well as min/max
values.
2020-09-29 09:03:47 -07:00
Patrick McGehearty
d3c5702747 Reversing calculation of __x86_shared_non_temporal_threshold
The __x86_shared_non_temporal_threshold determines when memcpy on x86
uses non_temporal stores to avoid pushing other data out of the last
level cache.

This patch proposes to revert the calculation change made by H.J. Lu's
patch of June 2, 2017.

H.J. Lu's patch selected a threshold suitable for a single thread
getting maximum performance. It was tuned using the single threaded
large memcpy micro benchmark on an 8 core processor. The last change
changes the threshold from using 3/4 of one thread's share of the
cache to using 3/4 of the entire cache of a multi-threaded system
before switching to non-temporal stores. Multi-threaded systems with
more than a few threads are server-class and typically have many
active threads. If one thread consumes 3/4 of the available cache for
all threads, it will cause other active threads to have data removed
from the cache. Two examples show the range of the effect. John
McCalpin's widely parallel Stream benchmark, which runs in parallel
and fetches data sequentially, saw a 20% slowdown with this patch on
an internal system test of 128 threads. This regression was discovered
when comparing OL8 performance to OL7.  An example that compares
normal stores to non-temporal stores may be found at
https://vgatherps.github.io/2018-09-02-nontemporal/.  A simple test
shows performance loss of 400 to 500% due to a failure to use
nontemporal stores. These performance losses are most likely to occur
when the system load is heaviest and good performance is critical.

The tunable x86_non_temporal_threshold can be used to override the
default for the knowledgable user who really wants maximum cache
allocation to a single thread in a multi-threaded system.
The manual entry for the tunable has been expanded to provide
more information about its purpose.

	modified: sysdeps/x86/cacheinfo.c
	modified: manual/tunables.texi
2020-09-28 22:10:39 +00:00
H.J. Lu
f2c679d4b2 <sys/platform/x86.h>: Add Intel Key Locker support
Add Intel Key Locker:

https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html

support to <sys/platform/x86.h>.  Intel Key Locker has

1. KL: AES Key Locker instructions.
2. WIDE_KL: AES wide Key Locker instructions.
3. AESKLE: AES Key Locker instructions are enabled by OS.

Applications should use

if (CPU_FEATURE_USABLE (KL))

and

if (CPU_FEATURE_USABLE (WIDE_KL))

to check if AES Key Locker instructions and AES wide Key Locker
instructions are usable.
2020-09-16 05:56:10 -07:00