C2X adds a printf %b format (see
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf>, accepted
for C2X), for outputting integers in binary. It also has recommended
practice for a corresponding %B format (like %b, but %#B starts the
output with 0B instead of 0b). Add support for these formats to
glibc.
One existing test uses %b as an example of an unknown format, to test
how glibc printf handles unknown formats; change that to %v. Use of
%b and %B as user-registered format specifiers continues to work (and
we already have a test that covers that, tst-printfsz.c).
Note that C2X also has scanf %b support, plus support for binary
constants starting 0b in strtol (base 0 and 2) and scanf %i (strtol
base 0 and scanf %i coming from a previous paper that added binary
integer literals). I intend to implement those features in a separate
patch or patches; as discussed in the thread starting at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
they will be more complicated because they involve adding extra public
symbols to ensure compatibility with existing code that might not
expect 0b constants to be handled by strtol base 0 and 2 and scanf %i,
whereas simply adding a new format specifier poses no such
compatibility concerns.
Note that the actual conversion from integer to string uses existing
code in _itoa.c. That code has special cases for bases 8, 10 and 16,
probably so that the compiler can optimize division by an integer
constant in the code for those bases. If desired such special cases
could easily be added for base 2 as well, but that would be an
optimization, not actually needed for these printf formats to work.
Tested for x86_64 and x86. Also tested with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline to make sure that the test does
indeed build with GCC 12 (where format checking warnings are enabled
for most of the test).
This second patch contains the actual implementation of a new sorting algorithm
for shared objects in the dynamic loader, which solves the slow behavior that
the current "old" algorithm falls into when the DSO set contains circular
dependencies.
The new algorithm implemented here is simply depth-first search (DFS) to obtain
the Reverse-Post Order (RPO) sequence, a topological sort. A new l_visited:1
bitfield is added to struct link_map to more elegantly facilitate such a search.
The DFS algorithm is applied to the input maps[nmap-1] backwards towards
maps[0]. This has the effect of a more "shallow" recursion depth in general
since the input is in BFS. Also, when combined with the natural order of
processing l_initfini[] at each node, this creates a resulting output sorting
closer to the intuitive "left-to-right" order in most cases.
Another notable implementation adjustment related to this _dl_sort_maps change
is the removing of two char arrays 'used' and 'done' in _dl_close_worker to
represent two per-map attributes. This has been changed to simply use two new
bit-fields l_map_used:1, l_map_done:1 added to struct link_map. This also allows
discarding the clunky 'used' array sorting that _dl_sort_maps had to sometimes
do along the way.
Tunable support for switching between different sorting algorithms at runtime is
also added. A new tunable 'glibc.rtld.dynamic_sort' with current valid values 1
(old algorithm) and 2 (new DFS algorithm) has been added. At time of commit
of this patch, the default setting is 1 (old algorithm).
Signed-off-by: Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The current language reads "This macro determines...", changing to
"Define this macro...". This is consistent with other feature macro
documentation language.
When I first read the previous language it seems to indicate that the
macro is already defined. By changing the language to "Define this
macro..." it's clear that its the user's responsibility to define it.
C2X adds a macro _PRINTF_NAN_LEN_MAX to <stdio.h>, giving the maximum
length of printf output for a NaN. glibc never includes an
n-char-sequence in its printf output for NaNs, so the correct value
for glibc is 4 ("-nan" or "-NAN"); define the macro accordingly.
This patch makes the macro definition conditional on __GLIBC_USE
(ISOC2X), as is generally done with features from new standard
versions. The name is in the implementation namespace for older
standards, so it would also be possible to define it unconditionally.
Tested for x86_64.
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs. fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN). fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN. fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value. All these functions also treat +0 as greater than -0. There
are also corresponding <tgmath.h> type-generic macros.
Add these functions to glibc. The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results. The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.
Tested for x86_64 and x86, and with build-many-glibcs.py.
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well. However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div. Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.
The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.
I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dchttps://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Fix two problems.
Rather than "larger than", better English is "greater than".
Then there is a wordinig error on the following line: "then lowfd"
appears to be cruft.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
All references to libraries in the manual are without the .so prefix,
so do the same for libc_malloc_debug.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
These functions call the core allocator functions (realloc and malloc
respectively) and are hence guaranteed to allocate memory using the
correct functions when multiple allocators are interposed. Having
these functions interposed in one allocator and not another may result
in confusion, hence discourage interposing them altogether.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
1. Install <bits/platform/x86.h> for <sys/platform/x86.h> which includes
<bits/platform/x86.h>.
2. Rename HAS_CPU_FEATURE to CPU_FEATURE_PRESENT which checks if the
processor has the feature.
3. Rename CPU_FEATURE_USABLE to CPU_FEATURE_ACTIVE which checks if the
feature is active. There may be other preconditions, like sufficient
stack space or further setup for AMX, which must be satisfied before the
feature can be used.
This fixes BZ #27958.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Make malloc hooks symbols compat-only so that new applications cannot
link against them and remove the declarations from the API. Also
remove the unused malloc-hooks.h.
Finally, mark all symbols in libc_malloc_debug.so as compat so that
the library cannot be linked against.
Add a note about the deprecation in NEWS.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Remove all malloc hook uses from core malloc functions and move it
into a new library libc_malloc_debug.so. With this, the hooks now no
longer have any effect on the core library.
libc_malloc_debug.so is a malloc interposer that needs to be preloaded
to get hooks functionality back so that the debugging features that
depend on the hooks, i.e. malloc-check, mcheck and mtrace work again.
Without the preloaded DSO these debugging features will be nops.
These features will be ported away from hooks in subsequent patches.
Similarly, legacy applications that need hooks functionality need to
preload libc_malloc_debug.so.
The symbols exported by libc_malloc_debug.so are maintained at exactly
the same version as libc.so.
Finally, static binaries will no longer be able to use malloc
debugging features since they cannot preload the debugging DSO.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The SSBD feature is implemented in 2 different ways on AMD processors:
newer systems (Zen3) provides AMD_SSBD (function 8000_0008, EBX[24]),
while older system provides AMD_VIRT_SSBD (function 8000_0008, EBX[25]).
However for AMD_VIRT_SSBD, kernel shows both 'ssdb' and 'virt_ssdb' on
/proc/cpuinfo; while for AMD_SSBD only 'ssdb' is provided.
This now check is AMD_SSBD is set to check for 'ssbd', otherwise check
if AMD_VIRT_SSDB is set to check for 'virt_ssbd'.
Checked on x86_64-linux-gnu on a Ryzen 9 5900x.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The constant PTHREAD_STACK_MIN may be too small for some processors.
Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE. When
_DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).
Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
provide a constant target specific PTHREAD_STACK_MIN value.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The function closes all open file descriptors greater than or equal to
input argument. Negative values are clamped to 0, i.e, it will close
all file descriptors.
As indicated by the bug report, this is a common symbol provided by
different systems (Solaris, OpenBSD, NetBSD, FreeBSD) and, although
its has inherent issues with not taking in consideration internal libc
file descriptors (such as syslog), this is also a common feature used
in multiple projects [1][2][3][4][5].
The Linux fallback implementation iterates over /proc and close all
file descriptors sequentially. Although it was raised the questioning
whether getdents on /proc/self/fd might return disjointed entries
when file descriptor are closed; it does not seems the case on my
testing on multiple kernel (v4.18, v5.4, v5.9) and the same strategy
is used on different projects [1][2][3][5].
Also, the interface is set a fail-safe meaning that a failure in the
fallback results in a process abort.
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
[1] 5238e95759/src/basic/fd-util.c (L217)
[2] ddf4b77e11/src/lxc/start.c (L236)
[3] 9e4f2f3a6b/Modules/_posixsubprocess.c (L220)
[4] 5f47c0613e/src/libstd/sys/unix/process2.rs (L303-L308)
[5] https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L82
It was added on Linux 5.9 (278a5fbaed89) with CLOSE_RANGE_CLOEXEC
added on 5.11 (582f1fb6b721f). Although FreeBSD has added the same
syscall, this only adds the symbol on Linux ports. This syscall is
required to provided a fail-safe way to implement the closefrom
symbol (BZ #10353).
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
The tunable will not work with *any* non-zero tunable value since its
list of allowed values is 0-3. Fix the documentation to reflect that.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
From
https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html
* Intel TSX will be disabled by default.
* The processor will force abort all Restricted Transactional Memory (RTM)
transactions by default.
* A new CPUID bit CPUID.07H.0H.EDX[11](RTM_ALWAYS_ABORT) will be enumerated,
which is set to indicate to updated software that the loaded microcode is
forcing RTM abort.
* On processors that enumerate support for RTM, the CPUID enumeration bits
for Intel TSX (CPUID.07H.0H.EBX[11] and CPUID.07H.0H.EBX[4]) continue to
be set by default after microcode update.
* Workloads that were benefited from Intel TSX might experience a change
in performance.
* System software may use a new bit in Model-Specific Register (MSR) 0x10F
TSX_FORCE_ABORT[TSX_CPUID_CLEAR] functionality to clear the Hardware Lock
Elision (HLE) and RTM bits to indicate to software that Intel TSX is
disabled.
1. Add RTM_ALWAYS_ABORT to CPUID features.
2. Set RTM usable only if RTM_ALWAYS_ABORT isn't set. This skips the
string/tst-memchr-rtm etc. testcases on the affected processors, which
always fail after a microcde update.
3. Check RTM feature, instead of usability, against /proc/cpuinfo.
This fixes BZ #28033.
Austin Group issue 62 [1] dropped the async-signal-safe requirement
for fork and provided a async-signal-safe _Fork replacement that
does not run the atfork handlers. It will be included in the next
POSIX standard.
It allow to close a long standing issue to make fork AS-safe (BZ#4737).
As indicated on the bug, besides the internal lock for the atfork
handlers itself; there is no guarantee that the handlers itself will
not introduce more AS-safe issues.
The idea is synchronize fork with the required internal locks to allow
children in multithread processes to use mostly of standard function
(even though POSIX states only AS-safe function should be used). On
signal handles, _Fork should be used intead and only AS-safe functions
should be used.
For testing, the new tst-_Fork only check basic usage. I also added
a new tst-mallocfork3 which uses the same strategy to check for
deadlock of tst-mallocfork2 but using threads instead of subprocesses
(and it does deadlock if it replaces _Fork with fork).
[1] https://austingroupbugs.net/view.php?id=62
The valgrind/helgrind test suite needs a way to make stack dealloction
more prompt, and this feature seems to be generally useful.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
AMD define different flags for IRPB, IBRS, and STIPBP [1], so new
x86_64_cpu are added and IBRS_IBPB is only tested for Intel.
The SSDB is also defined and implemented different on AMD [2],
and also a new AMD_SSDB flag is added. It should map to the
cpuinfo 'ssdb' on recent AMD cpus.
It fixes tst-cpu-features-cpuinfo and tst-cpu-features-cpuinfo-static
on recent AMD cpus.
Checked on x86_64-linux-gnu on AMD Ryzen 9 5900X.
[1] https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf
[2] https://bugzilla.kernel.org/show_bug.cgi?id=199889
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* NEWS: Don't imply the default will always be 32-bit.
* manual/creature.texi (Feature Test Macros):
Say that _TIME_BITS and _FILE_OFFSET_BITS defaults
may change in future releases.
A new build flag, _TIME_BITS, enables the usage of the newer 64-bit
time symbols for legacy ABI (where 32-bit time_t is default). The 64
bit time support is only enabled if LFS (_FILE_OFFSET_BITS=64) is
also used.
Different than LFS support, the y2038 symbols are added only for the
required ABIs (armhf, csky, hppa, i386, m68k, microblaze, mips32,
mips64-n32, nios2, powerpc32, sparc32, s390-32, and sh). The ABIs with
64-bit time support are unchanged, both for symbol and types
redirection.
On Linux the full 64-bit time support requires a minimum of kernel
version v5.1. Otherwise, the 32-bit fallbacks are used and might
results in error with overflow return code (EOVERFLOW).
The i686-gnu does not yet support 64-bit time.
This patch exports following rediretions to support 64-bit time:
* libc:
adjtime
adjtimex
clock_adjtime
clock_getres
clock_gettime
clock_nanosleep
clock_settime
cnd_timedwait
ctime
ctime_r
difftime
fstat
fstatat
futimens
futimes
futimesat
getitimer
getrusage
gettimeofday
gmtime
gmtime_r
localtime
localtime_r
lstat_time
lutimes
mktime
msgctl
mtx_timedlock
nanosleep
nanosleep
ntp_gettime
ntp_gettimex
ppoll
pselec
pselect
pthread_clockjoin_np
pthread_cond_clockwait
pthread_cond_timedwait
pthread_mutex_clocklock
pthread_mutex_timedlock
pthread_rwlock_clockrdlock
pthread_rwlock_clockwrlock
pthread_rwlock_timedrdlock
pthread_rwlock_timedwrlock
pthread_timedjoin_np
recvmmsg
sched_rr_get_interval
select
sem_clockwait
semctl
semtimedop
sem_timedwait
setitimer
settimeofday
shmctl
sigtimedwait
stat
thrd_sleep
time
timegm
timerfd_gettime
timerfd_settime
timespec_get
utime
utimensat
utimes
utimes
wait3
wait4
* librt:
aio_suspend
mq_timedreceive
mq_timedsend
timer_gettime
timer_settime
* libanl:
gai_suspend
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Commit 68ab82f566 added support for the scv
syscall ABI on powerpc. Since then systems that have kernel and processor
support started using scv. However adding the proper support for a new syscall
ABI requires changes to several other projects (e.g. qemu, valgrind, strace,
kernel), which are gradually receiving support.
Meanwhile, having a way to disable scv on glibc at build time can be useful for
distros that may encounter conflicts with projects that still do not support the
scv ABI, buying time until proper support is added.
This commit adds a --disable-scv option that disables scv support and uses sc
for all syscalls, like before commit 68ab82f566.
Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
Now that thread cancellation state is not accessed concurrently anymore,
it is possible to move it out the 'cancelhandling'.
The code is also simplified: CANCELLATION_P is replaced with a
internal pthread_testcancel call and the CANCELSTATE_BIT{MASK} is
removed.
With this behavior pthread_setcancelstate does not require to act on
cancellation if cancel type is asynchronous (is already handled either
by pthread_setcanceltype or by the signal handler).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
ISO C2X has made some changes to the handling of feature test macros
related to features from the floating-point TSes, and to exactly what
such features are present in what headers, that require corresponding
changes in glibc.
* For the few features that were controlled by
__STDC_WANT_IEC_60559_BFP_EXT__ (and the corresponding DFP macro) in
C2X, there is now instead a new feature test macro
__STDC_WANT_IEC_60559_EXT__ covering both binary and decimal FP.
This controls CR_DECIMAL_DIG in <float.h> (provided by GCC; I
implemented support for the new feature test macro for GCC 11) and
the totalorder and payload functions in <math.h>. C2X no longer
says anything about __STDC_WANT_IEC_60559_BFP_EXT__ (so it's
appropriate for that macro to continue to enable exactly the
features from TS 18661-1).
* The SNAN macros for each floating-point type have moved to <float.h>
(and been renamed in the process). Thus, the copies in <math.h>
should only be defined for __STDC_WANT_IEC_60559_BFP_EXT__, not for
C2X.
* The fmaxmag and fminmag functions have been removed (replaced by new
functions for the new min/max operations in IEEE 754-2019). Thus
those should also only be declared for
__STDC_WANT_IEC_60559_BFP_EXT__.
* The _FloatN / _FloatNx handling for the last two points in glibc is
trickier, since __STDC_WANT_IEC_60559_TYPES_EXT__ is still in C2X
(the integration of TS 18661-3 as an Annex, that is, which hasn't
yet been merged into the C standard git repository but has been
accepted by WG14), so C2X with that macro should not declare some
things that are declared for older standards with that macro. The
approach taken here is to provide the declarations (when
__STDC_WANT_IEC_60559_TYPES_EXT__ is enabled) only when (defined
__USE_GNU || !__GLIBC_USE (ISOC2X)), so if C2X features are enabled
then those declarations (that are only in TS 18661-3 and not in C2X)
will only be provided if _GNU_SOURCE is defined as well. Thus
_GNU_SOURCE remains a superset of the TS features as well as of C2X.
Some other somewhat related changes in C2X are not addressed here.
There's an open proposal not to include the fmin and fmax functions
for the _FloatN / _FloatNx types, given the new min/max operations,
which could be handled like the previous point if adopted. And the
fromfp functions have been changed to return a result in floating type
rather than intmax_t / uintmax_t; my inclination there is to treat
that like that change of totalorder type (new symbol versions etc. for
the ABI change; old versions become compat symbols and are no longer
supported as an API).
Tested for x86_64 and x86.
This patch optimizes the performance of memcpy/memmove for A64FX [1]
which implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB
cache per NUMA node.
The performance optimization makes use of Scalable Vector Register
with several techniques such as loop unrolling, memory access
alignment, cache zero fill, and software pipelining.
SVE assembler code for memcpy/memmove is implemented as Vector Length
Agnostic code so theoretically it can be run on any SOC which supports
ARMv8-A SVE standard.
We confirmed that all testcases have been passed by running 'make
check' and 'make xcheck' not only on A64FX but also on ThunderX2.
And also we confirmed that the SVE 512 bit vector register performance
is roughly 4 times better than Advanced SIMD 128 bit register and 8
times better than scalar 64 bit register by running 'make bench'.
[1] https://github.com/fujitsu/A64FX
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
This patch is a test helper script to change Vector Length for child
process. This script can be used as test-wrapper for 'make check'.
Usage examples:
~/build$ make check subdirs=string \
test-wrapper='~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16'
~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16 \
make test t=string/test-memcpy
~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 32 \
./debugglibc.sh string/test-memmove
~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 64 \
./testrun.sh string/test-memset
Previously, the source file nptl/cancellation.c was compiled multiple
times, for libc, libpthread, librt. This commit switches to a single
implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE,
__pthread_disable_asynccancel@@GLIBC_PRIVATE exports.
The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced
by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros. They call the
__pthread_* functions unconditionally now. The macros are still
needed because shared code uses them; Hurd has different definitions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cover key corner cases (e.g., whether errno is set) that are well
settled in glibc, fix some examples to avoid integer overflow, and
update some other dated examples (code needed for K&R C, e.g.).
* manual/charset.texi (Non-reentrant String Conversion):
* manual/filesys.texi (Symbolic Links):
* manual/memory.texi (Allocating Cleared Space):
* manual/socket.texi (Host Names):
* manual/string.texi (Concatenating Strings):
* manual/users.texi (Setting Groups):
Use reallocarray instead of realloc, to avoid integer overflow issues.
* manual/filesys.texi (Scanning Directory Content):
* manual/memory.texi (The GNU Allocator, Hooks for Malloc):
* manual/tunables.texi:
Use code font for 'malloc' instead of roman font.
(Symbolic Links): Don't assume readlink return value fits in 'int'.
* manual/memory.texi (Memory Allocation and C, Basic Allocation)
(Malloc Examples, Alloca Example):
* manual/stdio.texi (Formatted Output Functions):
* manual/string.texi (Concatenating Strings, Collation Functions):
Omit pointer casts that are needed only in ancient K&R C.
* manual/memory.texi (Basic Allocation):
Say that malloc sets errno on failure.
Say "convert" rather than "cast", since casts are no longer needed.
* manual/memory.texi (Basic Allocation):
* manual/string.texi (Concatenating Strings):
In examples, use C99 declarations after statements for brevity.
* manual/memory.texi (Malloc Examples): Add portability notes for
malloc (0), errno setting, and PTRDIFF_MAX.
(Changing Block Size): Say that realloc (p, 0) acts like
(p ? (free (p), NULL) : malloc (0)).
Add xreallocarray example, since other examples can use it.
Add portability notes for realloc (0, 0), realloc (p, 0),
PTRDIFF_MAX, and improve notes for reallocating to the same size.
(Allocating Cleared Space): Reword now-confusing discussion
about replacement, and xref "Replacing malloc".
* manual/stdio.texi (Formatted Output Functions):
Don't assume message size fits in 'int'.
* manual/string.texi (Concatenating Strings):
Fix undefined behavior involving arithmetic on a freed pointer.
Finally remove all mpa related files, headers, declarations, probes, unused
tables and update makefiles.
Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
This code adds new flag - '--allow-time-setting' to cross-test-ssh.sh
script to indicate if it is allowed to alter the date on the system
on which tests are executed. This change is supposed to be used with
test systems, which use virtual machines for testing.
The GLIBC_TEST_ALLOW_TIME_SETTING env variable is exported to the
remote environment on which the eligible test is run and brings no
functional change when it is not.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The TUNABLE_SET interface took a primitive C type argument, which
resulted in inconsistent type conversions internally due to incorrect
dereferencing of types, especialy on 32-bit architectures. This
change simplifies the TUNABLE setting logic along with the interfaces.
Now all numeric tunable values are stored as signed numbers in
tunable_num_t, which is intmax_t. All calls to set tunables cast the
input value to its primitive type and then to tunable_num_t for
storage. This relies on gcc-specific (although I suspect other
compilers woul also do the same) unsigned to signed integer conversion
semantics, i.e. the bit pattern is conserved. The reverse conversion
is guaranteed by the standard.
1. Add CPUID_INDEX_14_ECX_0 for CPUID leaf 0x14 to detect PTWRITE feature
in EBX of CPUID leaf 0x14 with ECX == 0.
2. Add PTWRITE detection to CPU feature tests.
3. Add 2 static CPU feature tests.
The struct tag is actually entry (not ENTRY). The data member has
type void *, and it can point to binary data. Only the key member is
required to be a null-terminated string.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from
AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack
space required in order to gurantee successful, non-nested handling
of a single signal whose handler is an empty function, and _SC_SIGSTKSZ
which is the suggested minimum number of bytes of stack space required
for a signal stack.
If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns
MINSIGSTKSZ. On Linux/x86 with XSAVE, the signal frame used by kernel
is composed of the following areas and laid out as:
------------------------------
| alignment padding |
------------------------------
| xsave buffer |
------------------------------
| fsave header (32-bit only) |
------------------------------
| siginfo + ucontext |
------------------------------
Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave
header (32-bit only) + size of siginfo and ucontext + alignment padding.
If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ
are redefined as
/* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). */
# undef SIGSTKSZ
# define SIGSTKSZ sysconf (_SC_SIGSTKSZ)
/* Minimum stack size for a signal handler: SIGSTKSZ. */
# undef MINSIGSTKSZ
# define MINSIGSTKSZ SIGSTKSZ
Compilation will fail if the source assumes constant MINSIGSTKSZ or
SIGSTKSZ.
The reason for not simply increasing the kernel's MINSIGSTKSZ #define
(apart from the fact that it is rarely used, due to glibc's shadowing
definitions) was that userspace binaries will have baked in the old
value of the constant and may be making assumptions about it.
For example, the type (char [MINSIGSTKSZ]) changes if this #define
changes. This could be a problem if an newly built library tries to
memcpy() or dump such an object defined by and old binary.
Bounds-checking and the stack sizes passed to things like sigaltstack()
and makecontext() could similarly go wrong.
Most packages have been tested with their latest releases, except for
Python, whose latest version is 3.9.1.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In <sys/platform/x86.h>, define CPU features as enum instead of using
the C preprocessor magic to make it easier to wrap this functionality
in other languages. Move the C preprocessor magic to internal header
for better GCC codegen when more than one features are checked in a
single expression as in x86-64 dl-hwcaps-subdirs.c.
1. Rename COMMON_CPUID_INDEX_XXX to CPUID_INDEX_XXX.
2. Move CPUID_INDEX_MAX to sysdeps/x86/include/cpu-features.h.
3. Remove struct cpu_features and __x86_get_cpu_features from
<sys/platform/x86.h>.
4. Add __x86_get_cpuid_feature_leaf to <sys/platform/x86.h> and put it
in libc.
5. Make __get_cpu_features() private to glibc.
6. Replace __x86_get_cpu_features(N) with __get_cpu_features().
7. Add _dl_x86_get_cpu_features to GLIBC_PRIVATE.
8. Use a single enum index for each CPU feature detection.
9. Pass the CPUID feature leaf to __x86_get_cpuid_feature_leaf.
10. Return zero struct cpuid_feature for the older glibc binary with a
smaller CPUID_INDEX_MAX [BZ #27104].
11. Inside glibc, use the C preprocessor magic so that cpu_features data
can be loaded just once leading to more compact code for glibc.
256 bits are used for each CPUID leaf. Some leaves only contain a few
features. We can add exceptions to such leaves. But it will increase
code sizes and it is harder to provide backward/forward compatibilities
when new features are added to such leaves in the future.
When new leaves are added, _rtld_global_ro offsets will change which
leads to race condition during in-place updates. We may avoid in-place
updates by
1. Rename the old glibc.
2. Install the new glibc.
3. Remove the old glibc.
NB: A function, __x86_get_cpuid_feature_leaf , is used to avoid the copy
relocation issue with IFUNC resolver as shown in IFUNC resolver tests.
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Introduce a new _FORTIFY_SOURCE level of 3 to enable additional
fortifications that may have a noticeable performance impact, allowing
more fortification coverage at the cost of some performance.
With llvm 9.0 or later, this will replace the use of
__builtin_object_size with __builtin_dynamic_object_size.
__builtin_dynamic_object_size
-----------------------------
__builtin_dynamic_object_size is an LLVM builtin that is similar to
__builtin_object_size. In addition to what __builtin_object_size
does, i.e. replace the builtin call with a constant object size,
__builtin_dynamic_object_size will replace the call site with an
expression that evaluates to the object size, thus expanding its
applicability. In practice, __builtin_dynamic_object_size evaluates
these expressions through malloc/calloc calls that it can associate
with the object being evaluated.
A simple motivating example is below; -D_FORTIFY_SOURCE=2 would miss
this and emit memcpy, but -D_FORTIFY_SOURCE=3 with the help of
__builtin_dynamic_object_size is able to emit __memcpy_chk with the
allocation size expression passed into the function:
void *copy_obj (const void *src, size_t alloc, size_t copysize)
{
void *obj = malloc (alloc);
memcpy (obj, src, copysize);
return obj;
}
Limitations
-----------
If the object was allocated elsewhere that the compiler cannot see, or
if it was allocated in the function with a function that the compiler
does not recognize as an allocator then __builtin_dynamic_object_size
also returns -1.
Further, the expression used to compute object size may be non-trivial
and may potentially incur a noticeable performance impact. These
fortifications are hence enabled at a new _FORTIFY_SOURCE level to
allow developers to make a choice on the tradeoff according to their
environment.
In the next release of POSIX, free must preserve errno
<https://www.austingroupbugs.net/view.php?id=385>.
Modify __libc_free to save and restore errno, so that
any internal munmap etc. syscalls do not disturb the caller's errno.
Add a test malloc/tst-free-errno.c (almost all by Bruno Haible),
and document that free preserves errno.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>