Both new HWCAPs were introduced in these kernel commits:
- 7e8403ecaf884f307b627f3c371475913dd29292
"s390: add HWCAP_S390_PCI_MIO to ELF hwcaps"
- 7e82523f2583e9813e4109df3656707162541297
"s390/hwcaps: make sie capability regular hwcap"
Also note that the kernel commit 511ad531afd4090625def4d9aba1f5227bd44b8e
"s390/hwcaps: shorten HWCAP defines" has shortened the prefix of the macros
from "HWCAP_S390_" to "HWCAP_". For compatibility reasons, we do not
change the prefix in public glibc header file.
The fd validity check in open_dev_null checks if fd > 0, which would
lead to a leaked fd if it is == 0.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linker creates the DT_DEBUG entry only in executables. Don't fill the
non-existent DT_DEBUG entry in ld.so with the run-time address of the
r_debug structure. This fixes BZ #28129.
Building stdlib/tst-setcontext.c fails with GCC mainline:
tst-setcontext.c: In function 'f2':
tst-setcontext.c:61:16: error: comparison between two arrays [-Werror=array-compare]
61 | if (on_stack < st2 || on_stack >= st2 + sizeof (st2))
| ^
tst-setcontext.c:61:16: note: use '&on_stack[0] < &st2[0]' to compare the addresses
The comparison in this case is deliberate, so adjust it as suggested
in that note.
Tested with build-many-glibcs.py (GCC mainline) for aarch64-linux-gnu.
The largest errors over the full binary32 range are after this
patch (on x86_64):
RNDN: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6
RNDZ: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6
RNDU: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6
RNDD: libm wrong by up to 8.98e+00 ulp(s) [9] for x=0x1.4b7066p+7
Inputs that were yielding huge errors have been added to "make check".
Reviewed-by: Adhemeral Zanella <adhemerval.zanella@linaro.org>
My glibc bot shows failures building the testsuite with GCC mainline
across all architectures:
tst-vfprintf-width-prec.c: In function 'do_test':
tst-vfprintf-width-prec.c:90:16: error: the comparison will always evaluate as 'false' for the address of 'result' will never be NULL [-Werror=address]
90 | if (result == NULL)
| ^~
tst-vfprintf-width-prec.c:89:13: note: 'result' declared here
89 | wchar_t result[100];
| ^~~~~~
This is clearly a correct warning; the comparison against NULL is
clearly a cut-and-paste mistake from an earlier case in the test that
does use calloc. Thus, remove the unnecessary check for NULL shown up
by the warning.
Similarly, two other tests have bogus comparisons against NULL; remove
those as well:
scanf14a.c:95:13: error: the comparison will always evaluate as 'false' for the address of 'fname' will never be NULL [-Werror=address]
95 | if (fname == NULL)
| ^~
scanf14a.c:93:8: note: 'fname' declared here
93 | char fname[strlen (tmpdir) + sizeof "/tst-scanf14.XXXXXX"];
| ^~~~~
scanf16a.c:125:13: error: the comparison will always evaluate as 'false' for the address of 'fname' will never be NULL [-Werror=address]
125 | if (fname == NULL)
| ^~
scanf16a.c:123:8: note: 'fname' declared here
123 | char fname[strlen (tmpdir) + sizeof "/tst-scanf16.XXXXXX"];
| ^~~~~
Tested with build-many-glibcs.py (GCC mainline) for aarch64-linux-gnu.
Building benchmarks as static executables:
=========================================
To build benchmarks as static executables, on the build system, run:
$ make STATIC-BENCHTESTS=yes bench-build
You can copy benchmark executables to another machine and run them
without copying the source nor build directories.
The fix for bug 19329 caused a regression such that pthread_create can
deadlock when concurrent ctors from dlopen are waiting for it to finish.
Use a new GL(dl_load_tls_lock) in pthread_create that is not taken
around ctors in dlopen.
The new lock is also used in __tls_get_addr instead of GL(dl_load_lock).
The new lock is held in _dl_open_worker and _dl_close_worker around
most of the logic before/after the init/fini routines. When init/fini
routines are running then TLS is in a consistent, usable state.
In _dl_open_worker the new lock requires catching and reraising dlopen
failures that happen in the critical section.
The new lock is reinitialized in a fork child, to keep the existing
behaviour and it is kept recursive in case malloc interposition or TLS
access from signal handlers can retake it. It is not obvious if this
is necessary or helps, but avoids changing the preexisting behaviour.
The new lock may be more appropriate for dl_iterate_phdr too than
GL(dl_load_write_lock), since TLS state of an incompletely loaded
module may be accessed. If the new lock can replace the old one,
that can be a separate change.
Fixes bug 28357.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Running the test on a 4.4 kernel within KVM, the precision used on
ITIMER_VIRTUAL and ITIMER_PROF seems to different than the one used
for ITIMER_REAL (it seems the same used for CLOCK_REALTIME_COARSE and
CLOCK_MONOTONIC_COARSE). I did not see it on other kernels, for
instance 5.11 and 4.15.
To avoid trying to guess the resolution used, do not check the
nanosecond internal values for the specific timers.
Checked on i686-linux-gnu with a 4.4 kernel.
The first test in the set do not require 64-bit time_t support, so there
is no need to return UNSUPPORTED for the whole test. The patch also adds
another test with arbitrary date prior y2038.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Unicode 14.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 14.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Total added characters in newly generated CHARMAP: 838
Total removed characters in newly generated WIDTH: 1
(Characters not in WIDTH get width 1 by default, i.e. these have width 1 now.)
removed: <U1734> 0 : eaw=N category=Mc bidi=L name=HANUNOO SIGN PAMUDPOD
That seems intentional, the character had category Mn (Mark, nonspacing) before
and now has Mc (Mark, spacing combining)
Total changed characters in newly generated WIDTH: 0
Total added characters in newly generated WIDTH: 175
The choice between the kill vs tgkill system calls is not just about
the TID reuse race, but also about whether the signal is sent to the
whole process (and any thread in it) or to a specific thread.
This was caught by the openposix test suite:
LTP: openposix test suite - FAIL: SIGUSR1 is member of new thread pendingset.
<https://gitlab.com/cki-project/kernel-tests/-/issues/764>
Fixes commit 526c3cf11e ("nptl: Fix race
between pthread_kill and thread exit (bug 12889)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Some kernel versions (observed with kernel 5.14 and earlier) can list
"0" entries in /proc/self/task. This happens when a thread exits
while the task list is being constructed. Treat this entry as not
present, like the proposed kernel patch does:
[PATCH] procfs: Do not list TID 0 in /proc/<pid>/task
<https://lore.kernel.org/all/8735pn5dx7.fsf@oldenburg.str.redhat.com/>
Fixes commit 032d74eaf6 ("support: Add
support_wait_for_thread_exit").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Linux added FUTEX_LOCK_PI2 to support clock selection
(commit bf22a6976897977b0a3f1aeba6823c959fc4fdae). With the new
flag we can now proper support CLOCK_MONOTONIC for
pthread_mutex_clocklock with Priority Inheritance. If kernel
does not support, EINVAL is returned instead.
The difference is the futex operation will be issued and the kernel
will advertise the missing support (instead of hard-code error
return).
Checked on x86_64-linux-gnu and i686-linux-gnu on Linux 5.14, 5.11,
and 4.15.
This patch uses the new futex PI operation provided by Linux v5.14
when it is required.
The futex_lock_pi64() is moved to futex-internal.c (since it used on
two different places and its code size might be large depending of the
kernel configuration) and clockid is added as an argument.
Co-authored-by: Kurt Kanzenbach <kurt@linutronix.de>
Linux v5.14.0 introduced a new futex operation called FUTEX_LOCK_PI2.
This kernel feature can be used to implement
pthread_mutex_clocklock(MONOTONIC)/PI.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
C2X adds a macro _PRINTF_NAN_LEN_MAX to <stdio.h>, giving the maximum
length of printf output for a NaN. glibc never includes an
n-char-sequence in its printf output for NaNs, so the correct value
for glibc is 4 ("-nan" or "-NAN"); define the macro accordingly.
This patch makes the macro definition conditional on __GLIBC_USE
(ISOC2X), as is generally done with features from new standard
versions. The name is in the implementation namespace for older
standards, so it would also be possible to define it unconditionally.
Tested for x86_64.
glibc has had exp10 functions since long before they were
standardized; now they are standardized in TS 18661-4 and C2X, they
are also specified there to have a corresponding type-generic macro.
Add one to <tgmath.h>, so fixing bug 26108.
glibc doesn't have other functions from TS 18661-4 yet, but when
added, it will be natural to add the type-generic macro for each
function family at the same time as the functions.
Tested for x86_64.
commit ec935dea63
Author: Florian Weimer <fweimer@redhat.com>
Date: Fri Apr 24 22:31:15 2020 +0200
elf: Implement __libc_early_init
has
@@ -856,6 +876,11 @@ no more namespaces available for dlmopen()"));
/* See if an error occurred during loading. */
if (__glibc_unlikely (exception.errstring != NULL))
{
+ /* Avoid keeping around a dangling reference to the libc.so link
+ map in case it has been cached in libc_map. */
+ if (!args.libc_already_loaded)
+ GL(dl_ns)[nsid].libc_map = NULL;
+
do_dlopen calls _dl_open with nsid == __LM_ID_CALLER (-2), which calls
dl_open_worker with args.nsid = nsid. dl_open_worker updates args.nsid
if it is __LM_ID_CALLER. After dl_open_worker returns, it is wrong to
use nsid.
Replace nsid with args.nsid after dl_open_worker returns. This fixes
BZ #27609.
GCC treats the pragma as a statement, so that the else branch only
consists of the pragma, not the return statement.
Fixes commit a725ff1de9 ("Suppress
-Wcast-qual warnings in bsearch").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The first cast to (void *) is redundant but should be (const void *)
anyway, because that's the type of the lvalue being assigned to.
The second cast is necessary and intentionally not const-correct, so
tell the compiler not to warn about it.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
When add ld.so to a new namespace, we don't actually load ld.so. We
create a new link map and refers the real one for almost everything.
Copy l_addr and l_ld from the real ld.so link map to avoid GDB warning:
warning: .dynamic section for ".../elf/ld-linux-x86-64.so.2" is not at the expected address (wrong library or version mismatch?)
when handling shared library loaded by dlmopen.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Recent versions of binutils (with commit
b25f942e18d6ecd7ec3e2d2e9930eb4f996c258a) stopped preserving "sticky"
options across a base `.machine` directive, nullifying the use of
passing "-many" through GCC to the assembler. As a result, some
instructions which were recognized even under older, more stringent
`.machine` directives become unrecognized instructions in that
context.
In `sysdeps/powerpc/tst-set_ppr.c`, the use of the `mfppr32` extended
mnemonic became unrecognized, as the default compilation with GCC for
32bit powerpc adds a `.machine ppc` in the resulting assembly, so the
command line option `-Wa,-many` is essentially ignored, and the ISA 2.06
instructions and mnemonics, like `mfppr32`, are unrecognized.
The compilation of `sysdeps/powerpc/tst-set_ppr.c` fails with:
Error: unrecognized opcode: `mfppr32'
Add appropriate `.machine` directives in the assembly to bracket the
`mfppr32` instruction.
Part of a 2019 fix (commit 9250e6610f) to
the above test's Makefile to add `-many` to the compilation when GCC
itself stopped passing `-many` to the assember no longer has any effect,
so remove that.
Reported-by: Joseph Myers <joseph@codesourcery.com>
At the last WG14 meeting,
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2711.htm> was
accepted, which places more emphasis on the new fmaximum / fminimum
functions and less on the old fmax / fmin functions. Some of the
changes are to examples, notes or otherwise don't require
implementation changes. However, the changes include removing the
_FloatN / _FloatNx versions of the fmax and fmin functions that came
from TS 18661-3.
Thus, those function versions should only be declared under similar
conditions to the _FloatN / _FloatNx versions of fmaxmag and fminmag:
for _GNU_SOURCE and pre-C2X use of __STDC_WANT_IEC_60559_TYPES_EXT__,
but not for C2X without _GNU_SOURCE.
In turn this requires a tgmath.h change so that the corresponding
tgmath.h macros, for C2X with __STDC_WANT_IEC_60559_TYPES_EXT__ but
without _GNU_SOURCE, don't try to use function variants that aren't
declared. (That issue doesn't arise for the tgmath.h macros for
fmaxmag and fminmag, because those aren't defined at all in those
circumstances unless __STDC_WANT_IEC_60559_BFP_EXT__ (from TS 18661-1
and not specified at all by C2X) is also defined, and in that case the
_FloatN / _FloatNx versions of fmaxmag and fminmag get declared - this
is only ever an issue when it's possible for some functions
corresponding to a type-generic-macro to be declared, and for _FloatN
/ _FloatNx functions in general to be declared, but without the
_FloatN / _FloatNx functions corresponding to that particular macro
being declared.)
Tested for x86_64.
C2X does not include fmaxmag and fminmag. When I updated feature test
macro handling accordingly (commit
858045ad1c, "Update floating-point
feature test macro handling for C2X", included in 2.34), I missed
updating tgmath.h so it doesn't define the corresponding type-generic
macros unless __STDC_WANT_IEC_60559_BFP_EXT__ is defined; I've now
reported this as bug 28397. Adjust the conditionals in tgmath.h
accordingly.
Tested for x86_64.
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs. fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN). fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN. fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value. All these functions also treat +0 as greater than -0. There
are also corresponding <tgmath.h> type-generic macros.
Add these functions to glibc. The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results. The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.
Tested for x86_64 and x86, and with build-many-glibcs.py.
AF_NETLINK support is not quite optional on modern Linux systems
anymore, so it is likely that the first attempt will always succeed.
Consequently, there is no need to cache the result. Keep AF_UNIX
and the Internet address families as a fallback, for the rare case
that AF_NETLINK is missing. The other address families previously
probed are totally obsolete be now, so remove them.
Use this simplified version as the generic implementation, disabling
Netlink support as needed.
When running this test on the OpenRISC port I am working on this test
fails with a timeout. The test passes when being straced or debugged.
Looking at the code there seems to be a race condition in that:
1 main thread: calls xpthread_cancel
2 sub thread : receives cancel signal
3 sub thread : cleanup routine waits on barrier
4 main thread: re-inits barrier
5 main thread: waits on barrier
After getting to 5 the main thread and sub thread wait forever as the 2
barriers are no longer the same.
Removing the barrier re-init seems to fix this issue. Also, the barrier
does not need to be reinitialized as that is done by default.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Although it provide an alternate implementation that communicates
using pipe() instead of shared memory, no port uses and it adds extra
burden for posix_spawn() extensions.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The use of sched_getaffinity on get_nproc and
sysconf (_SC_NPROCESSORS_ONLN) done in 903bc7dcc2 (BZ #27645)
breaks the top command in common hypervisor configurations and also
other monitoring tools.
The main issue using sched_getaffinity changed the symbols semantic
from system-wide scope of online CPUs to per-process one (which can
be changed with kernel cpusets or book parameters in VM).
This patch reverts mostly of the 903bc7dcc2, with the
exceptions:
* No more cached values and atomic updates, since they are inherent
racy.
* No /proc/cpuinfo fallback, since /proc/stat is already used and
it would require to revert more arch-specific code.
* The alloca is replace with a static buffer of 1024 bytes.
So the implementation first consult the sysfs, and fallbacks to procfs.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This patch simplifies the memory allocation code and uses the sched
routines instead of reimplement it. This still uses a stack
allocation buffer, so it can be used on malloc initialization code.
Linux currently supports at maximum of 4096 cpus for most architectures:
$ find -iname Kconfig | xargs git grep -A10 -w NR_CPUS | grep -w range
arch/alpha/Kconfig- range 2 32
arch/arc/Kconfig- range 2 4096
arch/arm/Kconfig- range 2 16 if DEBUG_KMAP_LOCAL
arch/arm/Kconfig- range 2 32 if !DEBUG_KMAP_LOCAL
arch/arm64/Kconfig- range 2 4096
arch/csky/Kconfig- range 2 32
arch/hexagon/Kconfig- range 2 6 if SMP
arch/ia64/Kconfig- range 2 4096
arch/mips/Kconfig- range 2 256
arch/openrisc/Kconfig- range 2 32
arch/parisc/Kconfig- range 2 32
arch/riscv/Kconfig- range 2 32
arch/s390/Kconfig- range 2 512
arch/sh/Kconfig- range 2 32
arch/sparc/Kconfig- range 2 32 if SPARC32
arch/sparc/Kconfig- range 2 4096 if SPARC64
arch/um/Kconfig- range 1 1
arch/x86/Kconfig-# [NR_CPUS_RANGE_BEGIN ... NR_CPUS_RANGE_END] range.
arch/x86/Kconfig- range NR_CPUS_RANGE_BEGIN NR_CPUS_RANGE_END
arch/xtensa/Kconfig- range 2 32
With x86 supporting 8192:
arch/x86/Kconfig
976 config NR_CPUS_RANGE_END
977 int
978 depends on X86_64
979 default 8192 if SMP && CPUMASK_OFFSTACK
980 default 512 if SMP && !CPUMASK_OFFSTACK
981 default 1 if !SMP
So using a maximum of 32k cpu should cover all cases (and I would
expect once we start to have many more CPUs that Linux would provide
a more straightforward way to query for such information).
A test is added to check if sched_getaffinity can successfully return
with large buffers.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This is an internal function meant to return the number of avaliable
processor where the process can scheduled, different than the
__get_nprocs which returns a the system available online CPU.
The Linux implementation currently only calls __get_nprocs(), which
in tuns calls sched_getaffinity.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
d482ebfa67 ('htl: Keep thread signals blocked during its initialization')
fixed not letting signals get delivered too early during thread creation,
but it also affected the main thread, thus making it block signals by
default. We need to just let the main thread sigset as it is.
Add tst-ro-dynamic-mod to modules-names-nobuild to avoid
../Makerules:767: warning: ignoring old recipe for target '.../elf/tst-ro-dynamic-mod.so'
This updates BZ #28340 fix.
No bug. Remove reallocation of bufs between implementation tests. Move
initialization outside of foreach implementation test loop. Increase
iteration count.
Generally before this commit was seeing a great deal of variability
between runs. The goal of this commit is to make the results more
reliable.
Benchtests build and bench-memcmp succeeding.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
TS 18661-1 and C2X specify predefined macros __STDC_IEC_60559_BFP__
and __STDC_IEC_60559_COMPLEX__, making __STDC_IEC_559__ and
__STDC_IEC_559_COMPLEX__ obsolescent (but still included in the
standard). Now that we have all the functions from TS 18661-1, define
these macros in stdc-predef.h, under the same conditions in which the
older macros are defined, since support for the floating-point
features in TS 18661-1 is now at the same level as that for those in
C11 and before (all library functions and other library APIs present,
but no standard pragma support).
The macros are defined for now with their TS 18661-1 values. C2X will
give them new values (listed as yyyymmL in the working drafts until
the final standard), at which point there will be the question of what
value to use in stdc-predef.h (where it could depend on
__STDC_VERSION__, but not on feature test macros defined by the user).
My inclination then would be to use the C2X value unconditionally
rather than using an older value to indicate TS support, and only have
any C standard version conditionals for the value when subsequent C
standard versions define further values.
(Note that I'm also inclined, when we implement the C2X change to the
return types of fromfp functions, to make that change unconditional
much like the change made to the types of totalorder functions, with
the old version only supported with compat symbols for already-linked
programs and not as an API for newly built objects. So using the C2X
value would also accurately reflect not supporting the versions of
APIs in the TS where those ended up being incompatible with the first
version actually added to the standard.)
Tested for x86_64.
It turns out the __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c
does not cover all cases where the x86 fenv_private.h macros might
manipulate one of the SSE and 387 floating-point state, while the
actual fma implementation uses the other. Specifically, in the 32-bit
case, with a compiler not defaulting to -mfpmath=sse, but testing on a
processor with hardware FMA support, the multiarch fma function
implementations will end up using SSE, while the fenv_private.h macros
will use the 387 state for double. Change the conditional to use the
default macros rather than the optimized ones in all cases except when
the compiler inlines an fma instruction (in which case, since all
those instructions are SSE instructions and -mfpmath=sse must be in
effect for them to be inlined, the optimized macros will only use the
SSE state and it's OK for them to only use the SSE state).
Tested for x86_64 and x86. H.J. reports in
<https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html>
that it fixes the problems he observed.
This drops reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time
address of _DYNAMIC.
The code sequence length does not change.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>