The preemptor sigcode doesn't match since the POSIX sigcode SI_TIMER is
used when SIGALRM is sent. In addition, The inline version of
hurd_preempt_signals doesn't update _hurdsig_preempted_set. For these
reasons, the preemptor would be skipped by post_signal.
* sysdeps/mach/hurd/setitimer.c (setitimer_locked): Fix preemptor setup.
This patch is a reimplementation of [1], which was submitted back in
2015. Copyright issue has been sorted [2] last year. It proposed a new
section (.gnu.xhash) and related dynamic tag (GT_GNU_XHASH). The new
section would be virtually identical to the existing .gnu.hash except
for the translation table (xlat) which would contain correct MIPS
.dynsym indexes corresponding to the hashvals in chains. This is because
MIPS ABI imposes a different ordering of the dynsyms than the one
expected by the .gnu.hash section. Another addition would be a leading
word at the beggining of the section, which would contain the number of
entries in the translation table.
In this patch, the new section name and dynamic tag are changed to
reflect the fact that the section should be treated as MIPS specific
(.MIPS.xhash and DT_MIPS_XHASH).
This patch addresses the alignment issue reported in [3] which is caused
by the leading word of the .MIPS.xhash section. Leading word is now
removed in the corresponding binutils patch, and the number of entries
in the translation table is computed using DT_MIPS_SYMTABNO dynamic tag.
Since the MIPS specific dl-lookup.c file was removed following the
initial patch submission, I opted for the definition of three new macros
in the generic ldsodefs.h. ELF_MACHINE_GNU_HASH_ADDRIDX defines the
index of the dynamic tag in the l_info array. ELF_MACHINE_HASH_SYMIDX is
used to calculate the index of a symbol in GNU hash. On MIPS, it is
defined to look up the symbol index in the translation table.
ELF_MACHINE_XHASH_SETUP is defined for MIPS only. It initializes the
.MIPS.xhash pointer in the link_map_machine struct.
The other major change is bumping the highest EI_ABIVERSION value for
MIPS to suggest that the dynamic linker now supports GNU hash.
The patch was tested by running the glibc testsuite for the three MIPS
ABIs (o32, n32 and n64) and for x86_64-linux-gnu.
[1] https://sourceware.org/ml/binutils/2015-10/msg00057.html
[2] https://sourceware.org/ml/binutils/2018-03/msg00025.html
[3] https://sourceware.org/ml/binutils/2016-01/msg00006.html
* elf/dl-addr.c (determine_info): Calculate the symbol index
using the newly defined ELF_MACHINE_HASH_SYMIDX macro.
* elf/dl-lookup.c (do_lookup_x): Ditto.
(_dl_setup_hash): Initialize MIPS xhash translation table.
* elf/elf.h (SHT_MIPS_XHASH): New define.
(DT_MIPS_XHASH): New define.
* sysdeps/generic/ldsodefs.h (ELF_MACHINE_GNU_HASH_ADDRIDX): New
define.
(ELF_MACHINE_HASH_SYMIDX): Ditto.
(ELF_MACHINE_XHASH_SETUP): Ditto.
* sysdeps/mips/ldsodefs.h (ELF_MACHINE_GNU_HASH_ADDRIDX): New
define.
(ELF_MACHINE_HASH_SYMIDX): Ditto.
(ELF_MACHINE_XHASH_SETUP): Ditto.
* sysdeps/mips/linkmap.h (struct link_map_machine): New member.
* sysdeps/unix/sysv/linux/mips/ldsodefs.h: Increment valid ABI
version.
* sysdeps/unix/sysv/linux/mips/libc-abis: New ABI version.
The fix for BZ#18231 requires new symbols only for sh4eb. This patch
adds the required folder and files for both BE and LE abilist. No
semantic changes are expected.
Checked with check-abi for sh4eb-linux-gnu and sh4-linux-gnu.
* sysdeps/sh/preconfigure.ac: New file.
* sysdeps/sh/preconfigure: Regenerate.
* sysdeps/sh/be/sh3/Implies: New file.
* sysdeps/sh/be/sh4/Implies: Likewise.
* sysdeps/sh/le/sh3/Implies: Likewise.
* sysdeps/sh/le/sh4/Implies: Likewise.
* sysdeps/unix/sysv/linux/sh/le/sh3/Implies: Likewise.
* sysdeps/unix/sysv/linux/sh/le/sh4/Implies: Likewise.
* sysdeps/unix/sysv/linux/sh/*.abilist: Move to
sysdeps/unix/sysv/linux/sh/le/*.abilist.
* sysdeps/unix/sysv/linux/sh/be/*.abilist: New files.
The fix for BZ#18231 requires new symbols only for microblaze. This patch
adds the required folder and files for both BE and LE abilist. No semantic
changes are expected.
Checked with check-abi for microblaze-linux-gnueabihf and
microblazeel-linux-gnueabihf.
* sysdeps/microblaze/preconfigure.ac: New file.
* sysdeps/microblaze/preconfigure: Regenerate.
* sysdeps/microblaze/be/implies: New file.
* sysdeps/microblaze/le/implies: Likewise.
* sysdeps/unix/sysv/linux/microblaze/be/implies: Likewise.
* sysdeps/unix/sysv/linux/microblaze/le/implies: Likewise.
* sysdeps/unix/sysv/linux/microblaze/*.abilist. Move to
sysdeps/unix/sysv/linux/microblaze/be/*.abilist.
* sysdeps/unix/sysv/linux/microblaze/le/*.abilist: New files.
The fix for BZ#18231 requires new symbols only for armeb. This patch
adds the required folder and files for both BE and LE abilist. No
semantic changes are expected.
Checked with check-abi for arm-linux-gnueabihf and armeb-linux-gnueabihf.
* sysdeps/arm/preconfigure.ac: Set machine based on endianness.
* sysdeps/arm/preconfigure: Regenerate.
* sysdeps/arm/be/Implies: New file.
* sysdeps/arm/be/armv6/Implies: Likewise.
* sysdeps/arm/be/armv6t2/Implies: Likewise.
* sysdeps/arm/be/armv7/Implies: Likewise.
* sysdeps/arm/le/Implies: Likewise.
* sysdeps/unix/sysv/linux/arm/be/Implies: Likewise.
* sysdeps/unix/sysv/linux/arm/le/Implies: Likewise.
* sysdeps/unix/sysv/linux/arm/*.abilist: Move to
sysdeps/unix/sysv/linux/arm/le/*.abilist.
* sysdeps/unix/sysv/linux/arm/be/l*.abilist: New files.
fegetenv_status() wants to use the lighter weight instruction 'mffsl'
for reading the Floating-Point Status and Control Register (FPSCR).
It currently will use it directly if compiled '-mcpu=power9', and will
perform a runtime check (cpu_supports("arch_3_00")) otherwise.
Nicely, it turns out that the 'mffsl' instruction will decode to
'mffs' on architectures older than "arch_3_00" because the additional
bits set for 'mffsl' are "don't care" for 'mffs'. 'mffs' is a superset
of 'mffsl'.
So, just generate 'mffsl'.
fesetenv() reads the current value of the Floating-Point Status and Control
Register (FPSCR) to determine the difference between the current state of
exception enables and the newly requested state. All of these bits are also
returned by the lighter weight 'mffsl' instruction used by fegetenv_status().
Use that instead.
Also, remove a local macro _FPU_MASK_ALL in favor of a common macro,
FPU_ENABLES_MASK from fenv_libc.h.
Finally, use a local variable ('new') in favor of a pointer dereference
('*envp').
SET_RESTORE_ROUND uses libc_feholdsetround_ppc_ctx and
libc_feresetround_ppc_ctx to bracket a block of code where the floating point
rounding mode must be set to a certain value.
For the *prologue*, libc_feholdsetround_ppc_ctx is used and performs:
1. Read/save FPSCR.
2. Create new value for FPSCR with new rounding mode and enables cleared.
3. If new value is different than current value,
a. If transitioning from a state where some exceptions enabled,
enter "ignore exceptions / non-stop" mode.
b. Write new value to FPSCR.
c. Put a mark on the wall indicating the FPSCR was changed.
(1) uses the 'mffs' instruction. On POWER9, the lighter weight 'mffsl'
instruction can be used, but it doesn't return all of the bits in the FPSCR.
fegetenv_status uses 'mffsl' on POWER9, 'mffs' otherwise, and can thus be
used instead of fegetenv_register.
(3b) uses 'mtfsf 0b11111111' to write the entire FPSCR, so it must
instead use 'mtfsf 0b00000011' to write just the enables and the mode,
because some of the rest of the bits are not valid if 'mffsl' was used.
fesetenv_mode uses 'mtfsf 0b00000011' on POWER9, 'mtfsf 0b11111111'
otherwise.
For the *epilogue*, libc_feresetround_ppc_ctx checks the mark on the wall, then
calls libc_feresetround_ppc, which just calls __libc_femergeenv_ppc with
parameters such that it performs:
1. Retreive saved value of FPSCR, saved in prologue above.
2. Read FPSCR.
3. Create new value of FPSCR where:
- Summary bits and exception indicators = current OR saved.
- Rounding mode and enables = saved.
- Status bits = current.
4. If transitioning from some exceptions enabled to none,
enter "ignore exceptions / non-stop" mode.
5. If transitioning from no exceptions enabled to some,
enter "catch exceptions" mode.
6. Write new value to FPSCR.
The summary bits are hardwired to the exception indicators, so there is no
need to restore any saved summary bits.
The exception indicator bits, which are sticky and remain set unless
explicitly cleared, would only need to be restored if the code block
might explicitly clear any of them. This is certainly not expected.
So, the only bits that need to be restored are the enables and the mode.
If it is the case that only those bits are to be restored, there is no need to
read the FPSCR. Steps (2) and (3) are unnecessary, and step (6) only needs to
write the bits being restored.
We know we are transitioning out of "ignore exceptions" mode, so step (4) is
unnecessary, and in step (6), we only need to check the state we are
entering.
Since fe{en,dis}ableexcept() and fesetmode() read-modify-write just the
"mode" (exception enable and rounding mode) bits of the Floating Point Status
Control Register (FPSCR), the lighter weight 'mffsl' instruction can be used
to read the FPSCR (enables and rounding mode), and 'mtfsf 0b00000011' can be
used to write just those bits back to the FPSCR. The net is better performance.
In addition, fe{en,dis}ableexcept() read the FPSCR again after writing it, or
they determine that it doesn't need to be written because it is not changing.
In either case, the local variable holds the current values of the enable
bits in the FPSCR. This local variable can be used instead of again reading
the FPSCR.
Also, that value of the FPSCR which is read the second time is validated
against the requested enables. Since the write can't fail, this validation
step is unnecessary, and can be removed. Instead, the exceptions to be
enabled (or disabled) are transformed into available bits in the FPSCR,
then validated after being transformed back, to ensure that all requested
bits are actually being set. For example, FE_INVALID_SQRT can be
requested, but cannot actually be set. This bit is not mapped during the
transformations, so a test for that bit being set before and after
transformations will show the bit would not be set, and the function will
return -1 for failure.
Finally, convert the local macros in fesetmode.c to more generally useful
macros in fenv_libc.h.
The exceptions passed to fe{en,dis}ableexcept() are defined in the ABI
as a bitmask, a combination of FE_INVALID, FE_OVERFLOW, etc.
Within the functions, these bits must be translated to/from the corresponding
enable bits in the Floating Point Status Control Register (FPSCR).
This translation is currently done bit-by-bit. The compiler generates
a series of conditional bit operations. Nicely, the "FE" exception
bits are all a uniform offset from the FPSCR enable bits, so the bit-by-bit
operation can instead be performed by a shift with appropriate masking.
Move non-ASCII contributor names from installed headers
into contrib.texi when possible, and when it's not (the
copyright notice in sysdeps/unix/sysv/linux/mips/sys/user.h)
go back to ASCIIfied names. Problem reported by Joseph Myers in:
https://www.sourceware.org/ml/libc-alpha/2019-08/msg00646.html
This bumps the highest valid EI_ABIVERSION value to ABSOLUTE ABI.
New testcase loads the symbol from the GOT with the "lb" instruction
so that the EI_ABIVERSION header field of the shared object is set
to ABSOLUTE (it doesn't actually check the value of the symbol), and
makes sure that the main executable is executed without "ABI version
invalid" error.
Tested for all three ABIs (o32, n32, n64) using both static linker which
handles undefined weak symbols correctly [1] (and sets the EI_ABIVERSION
of the test module) and the one that doesn't (EI_ABIVERSION left as 0).
[1] https://sourceware.org/ml/binutils/2018-07/msg00268.html
[BZ #24916]
* sysdeps/mips/Makefile [$(subdir) = elf] (tests): Add
tst-undefined-weak.
[$(subdir) = elf] (modules-names): Add tst-undefined-weak-lib.
[$(subdir) = elf] ($(objpfx)tst-undefined-weak): Add dependency.
* sysdeps/mips/tst-undefined-weak-lib.S: New file.
* sysdeps/mips/tst-undefined-weak.c: Likewise.
* sysdeps/unix/sysv/linux/mips/ldsodefs.h (VALID_ELF_ABIVERSION):
Increment highest valid ABIVERSION value.
Linux/Mips kernels prior to 4.8 could potentially crash the user
process when doing FPU emulation while running on non-executable
user stack.
Currently, gcc doesn't emit .note.GNU-stack for mips, but that will
change in the future. To ensure that glibc can be used with such
future gcc, without silently resulting in binaries that might crash
in runtime, this patch forces RWX stack for all built objects if
configured to run against minimum kernel version less than 4.8.
* sysdeps/unix/sysv/linux/mips/Makefile
(test-xfail-check-execstack):
Move under mips-has-gnustack != yes.
(CFLAGS-.o*, ASFLAGS-.o*): New rules.
Apply -Wa,-execstack if mips-force-execstack == yes.
* sysdeps/unix/sysv/linux/mips/configure: Regenerated.
* sysdeps/unix/sysv/linux/mips/configure.ac
(mips-force-execstack): New var.
Set to yes for hard-float builds with minimum_kernel < 4.8.0
or minimum_kernel not set at all.
(mips-has-gnustack): New var.
Use value of libc_cv_as_noexecstack
if mips-force-execstack != yes, otherwise set to no.
As indicated by Joseph's comment on BZ#17726, this symbol is most
likely a historical ABI accident. This patch make it on both arm
and sparc ABIs a compat_symbol.
Checked against a build arm-linux-gnueabihf, sparcv9-linux-gnu, adn
sparc64-linux-gnu to see if the symbol is still present.
* gmon/Versions (libc) [GLIBC_2.31]: New entry.
* sysdeps/unix/sysv/linux/arm/profil-counter.h (profil_counter):
Make a compat_symbol.
* sysdeps/unix/sysv/linux/sparc/profil-counter.h
(__profil_counter_global): Likewise.
This patch refactor sigcontextinfo.h header to use SA_SIGINFO as default
for both gmon and debug implementations. This allows simplify
profil-counter.h on Linux to use a single implementation and remove the
requirements for newer ports to redefine __sigaction/sigaction to use
SA_SIGINFO.
The GET_PC macro is also replaced with a function sigcontext_get_pc that
returns an uintptr_t instead of a void pointer. It allows easier convertion
to integer on ILP32 architecture, such as x32, without the need to suppress
compiler warnings.
The patch also requires some refactor of register-dump.h file for some
architectures (to reflect it is now called from a sa_sigaction instead of
sa_handler signal context).
- Alpha, i386, and s390 are straighfoward to take in consideration the
new argument type.
- ia64 takes in consideration the kernel pass a struct sigcontextt
as third argument for sa_sigaction.
- sparc take in consideration the kernel pass a pt_regs struct
as third argument for sa_sigaction.
- m68k dummy function is removed and the FP state is dumped on
register_dump itself.
- For SH the register-dump.h file is consolidate on a common implementation
and the floating-point state is checked based on ownedfp field.
The register_dump does not change its output format in any affected
architecture.
I checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, sparcv9-linux-gnu, sparc64-linux-gnu, powerpc-linux-gnu,
powerpc64-linux-gnu, and powerpc64le-linux-gnu.
I also checked the libSegFault.so through catchsegv on alpha-linux-gnu,
m68k-linux-gnu and sh4-linux-gnu to confirm the output has not changed.
Adhemerval Zanella <adhemerval.zanella@linaro.org>
Florian Weimer <fweimer@redhat.com>
* debug/segfault.c (install_handler): Use SA_SIGINFO if defined.
* sysdeps/generic/profil-counter.h (__profil_counter): Cast to
uintptr_t.
* sysdeps/generic/sigcontextinfo.h (GET_PC): Rename to
sigcontext_get_pc and return aligned cast to uintptr_t.
* sysdeps/mach/hurd/i386/sigcontextinfo.h (GET_PC): Likewise.
* sysdeps/posix/profil.c (profil_count): Change PC argument to
uintptr_t.
(__profil): Use SA_SIGINFO.
* sysdeps/posix/sprofil.c (profil_count): Change PCP argument to
uintptr_t.
(__sprofil): Use SA_SIGINFO.
* sysdeps/unix/sysv/linux/profil-counter.h: New file.
* sysdeps/unix/sysv/linux/aarch64/profil-counter.h: Remove file.
* sysdeps/unix/sysv/linux/csky/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/i386/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/mips/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/profil-counter.h: Likewise.
* sysdeps/sysv/linux/s390/s390-32/profil-counter.h: Likewise.
* sysdeps/sysv/linux/s390/s390-64/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/sh/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/arm/profil-counter.h (__profil_counter):
Assume SA_SIGINFO and use sigcontext_get_pc instead of GET_PC.
* sysdeps/unix/sysv/linux/sparc/profil-counter.h: New file.
* sysdeps/unix/sysv/linux/sparc/sparc64/profil-counter.h: Remove file.
* sysdeps/unix/sysv/linux/sparc/sparc32/profil-counter.h: Likewise.
* sysdpes/unix/sysv/linux/aarch64/sigcontextinfo.h (SIGCONTEXT,
GET_PC, __sigaction, sigaction): Remove defines.
(sigcontext_get_pc): New function.
* sysdeps/unix/sysv/linux/alpha/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/arm/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/csky/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/i386/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/mips/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/s390/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/sh/sigcontextinfo.h: Likewise.
* sysdeps/sysv/linux/sparc/sparc32/sigcontextinfo.h: Likewise.
* sysdeps/sysv/linux/sparc/sparc64/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/sigcontextinfo.h: Likewise.
* sysdeps/unix/sysv/linux/alpha/register-dump.h (register_dump):
Handle CTX argument as ucontext_t.
* sysdeps/unix/sysv/linux/i386/register-dump.h: Likewise.
Likewise.
* sysdeps/unix/sysv/linux/m68k/register-dump.h: Likewise.
* sysdeps/sysv/linux/s390/s390-32/register-dump.h: Likewise.
* sysdeps/sysv/linux/s390/s390-64/register-dump.h: Likewise.
* sysdeps/unix/sysv/linux/sh/register-dump.h: New file.
* sysdeps/unix/sysv/linux/sh/sh4/register-dump.h: Remove File.
* sysdeps/unix/sysv/linux/sh/sh3/register-dump.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/register-dump.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/register-dump.h: Likewise.
* sysdeps/unix/sysv/linux/Makefile (tests-internal): Add
tst-sigcontextinfo-get_pc.
* sysdeps/unix/sysv/linux/tst-sigcontextinfo-get_pc.c: New file.
(CFLAGS-tst-sigcontextinfo-get_pc.c): New rule.
Fix a couple of typos and v_regs field name in mcontext_t.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h: Fix typos and
field name in mcontext_t struct.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
It doesn't make sense to remove all the internal uses of time.
It's still a standard ISO C function, and its callers don't need
sub-second resolution and would be unnecessarily complicated if
they had to declare a struct timespec instead of just a time_t.
However, a handful of places were using the vestigial "result"
argument instead of the return value, which is slightly less
efficient and also looks strange. Correct this.
* misc/syslog.c (__vsyslog_internal)
* time/getdate.c (__getdate_r)
* time/tst_wcsftime.c (main):
Use return value of time, not its argument.
* string/strfry.c (strfry)
* sysdeps/mach/sleep.c (__sleep):
Remove unnecessary casts of NULL in calls to time.
If the process is in a bad state, we used to print backtraces in
many cases. This is problematic because doing so could involve
a lot of work, like loading libgcc_s using the dynamic linker,
and this could itself be targeted by exploit writers. For example,
if the crashing process was forked from a long-lived process, the
addresses in the error message could be used to bypass ASLR.
Commit ed421fca42 ("Avoid backtrace from
__stack_chk_fail [BZ #12189]"), backtraces where no longer printed
because backtrace_and_maps was always called with do_abort == 1.
Rather than fixing this logic error, this change removes the backtrace
functionality from the sources. With the prevalence of external crash
handlers, it does not appear to be particularly useful. The crash
handler may also destroy useful information for debugging.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The resolution of C floating-point Clarification Request 25
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2397.htm#dr_25> is
that the totalorder and totalordermag functions should take pointer
arguments, and this has been adopted in C2X (with const added; note
that the integration of this change into C2X is present in the C
standard git repository but postdates the most recent public PDF
draft).
This patch updates glibc accordingly. As a defect resolution, the API
is changed unconditionally rather than supporting any sort of TS
18661-1 mode for compilation with the old version of the API. There
are compat symbols for existing binaries that pass floating-point
arguments directly. As a consequence of changing to pointer
arguments, there are no longer type-generic macros in tgmath.h for
these functions.
Because of the fairly complicated logic for creating libm function
aliases and determining the set of aliases to create in a given glibc
configuration, rather than duplicating all that in individual source
files to create the versioned and compat symbols, the source files for
the various versions of totalorder functions are set up to redefine
weak_alias before using libm_alias_* macros to create the symbols
required. In turn, this requires creating a separate alias for each
symbol version pointing to the same implementation (see binutils bug
<https://sourceware.org/bugzilla/show_bug.cgi?id=23840>), which is
done automatically using __COUNTER__. (As I noted in
<https://sourceware.org/ml/libc-alpha/2018-10/msg00631.html>, it might
well make sense for glibc's symbol versioning macros to do that alias
creation with __COUNTER__ themselves, which would somewhat simplify
the logic in the totalorder source files.)
It is of course desirable to test the compat symbols. I did this with
the generic libm-test machinery, but didn't wish to duplicate the
actual tables of test inputs and outputs, and thought it risky to
attempt to have a single object file refer to both default and compat
versions of the same function in order to test them together. Thus, I
created libm-test-compat_totalorder.inc and
libm-test-compat_totalordermag.inc which include the generated .c
files (with the processed version of those tables of inputs) from the
non-compat tests, and added appropriate dependencies. I think this
provides sufficient test coverage for the compat symbols without also
needing to make the special ldbl-96 and ldbl-128ibm tests (of
peculiarities relating to the representations of those formats that
can't be covered in the generic tests) run for the compat symbols.
Tests of compat symbols need to be internal tests, meaning _ISOMAC is
not defined. Making some libm-test tests into internal tests showed
up two other issues. GCC diagnoses duplicate macro definitions of
__STDC_* macros, including __STDC_WANT_IEC_60559_TYPES_EXT__; I added
an appropriate conditional and filed
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91451> for this issue.
On ia64, include/setjmp.h ends up getting included indirectly from
libm-symbols.h, resulting in conflicting definitions of the STR macro
(also defined in libm-test-driver.c); I renamed the macros in
include/setjmp.h. (It's arguable that we should have common internal
headers used everywhere for stringizing and concatenation macros.)
Tested for x86_64 and x86, and with build-many-glibcs.py.
* math/bits/mathcalls.h
[__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN]
(totalorder): Take pointer arguments.
[__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN]
(totalordermag): Likewise.
* manual/arith.texi (totalorder): Likewise.
(totalorderf): Likewise.
(totalorderl): Likewise.
(totalorderfN): Likewise.
(totalorderfNx): Likewise.
(totalordermag): Likewise.
(totalordermagf): Likewise.
(totalordermagl): Likewise.
(totalordermagfN): Likewise.
(totalordermagfNx): Likewise.
* math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): Remove macro.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): Likewise.
* math/Versions (GLIBC_2.31): Add totalorder, totalorderf,
totalorderl, totalordermag, totalordermagf, totalordermagl,
totalorderf32, totalorderf64, totalorderf32x, totalordermagf32,
totalordermagf64, totalordermagf32x, totalorderf64x,
totalordermagf64x, totalorderf128 and totalordermagf128.
* math/Makefile (libm-test-funcs-noauto): Add compat_totalorder
and compat_totalordermag.
(libm-test-funcs-compat): New variable.
(libm-tests-compat): Likewise.
(tests): Do not include compat tests.
(tests-internal): Add compat tests.
($(foreach t,$(libm-tests-base),
$(objpfx)$(t)-compat_totalorder.o)): Depend
on $(objpfx)libm-test-totalorder.c.
($(foreach t,$(libm-tests-base),
$(objpfx)$(t)-compat_totalordermag.o): Depend on
$(objpfx)libm-test-totalordermag.c.
(tgmath3-macros): Remove totalorder and totalordermag.
* math/libm-test-compat_totalorder.inc: New file.
* math/libm-test-compat_totalordermag.inc: Likewise.
* math/libm-test-driver.c (struct test_ff_i_data): Update comment.
(RUN_TEST_fpfp_b): New macro.
(RUN_TEST_LOOP_fpfp_b): Likewise.
* math/libm-test-totalorder.inc (totalorder_test_data): Use
TEST_fpfp_b.
(totalorder_test): Condition on [!COMPAT_TEST].
(do_test): Likewise.
* math/libm-test-totalordermag.inc (totalordermag_test_data): Use
TEST_fpfp_b.
(totalordermag_test): Condition on [!COMPAT_TEST].
(do_test): Likewise.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Remove
totalorder and totalordermag.
* math/test-tgmath.c (NCALLS): Change to 132.
(F(compile_test)): Do not call totalorder or totalordermag.
(F(totalorder)): Remove.
(F(totalordermag)): Likewise.
* include/float.h (__STDC_WANT_IEC_60559_TYPES_EXT__): Do not
define if [__STDC_WANT_IEC_60559_TYPES_EXT__].
* include/setjmp.h [!_ISOMAC] (STR_HELPER): Rename to
SJSTR_HELPER.
[!_ISOMAC] (STR): Rename to SJSTR. Update call to STR_HELPER.
[!_ISOMAC] (TEST_SIZE): Update call to STR.
[!_ISOMAC] (TEST_ALIGN): Likewise.
[!_ISOMAC] (TEST_OFFSET): Likewise.
* sysdeps/ieee754/dbl-64/s_totalorder.c: Include <shlib-compat.h>
and <first-versions.h>.
(__totalorder): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/dbl-64/s_totalordermag.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermag): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalorder): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermag): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/float128/float128_private.h
(__totalorder_compatl): New macro.
(__totalordermag_compatl): Likewise.
* sysdeps/ieee754/flt-32/s_totalorderf.c: Include <shlib-compat.h>
and <first-versions.h>.
(__totalorderf): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/flt-32/s_totalordermagf.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermagf): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalorderl): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermagl): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Include
<shlib-compat.h>.
(__totalorderl): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Include
<shlib-compat.h>.
(__totalordermagl): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalorderl): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermagl): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-opt/nldbl-totalorder.c (totalorderl): Take
pointer arguments.
* sysdeps/ieee754/ldbl-opt/nldbl-totalordermag.c (totalordermagl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c
(do_test): Update calls to totalorderl and totalordermagl.
* sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c (do_test):
Update calls to totalorderl and totalordermagl.
* sysdeps/mach/hurd/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/csky/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
Commit 7532837d7b ("The
-Wstringop-truncation option new in GCC 8 detects common misuses")
added __attribute_nonstring__ to bits/utmp.h, but it did not update
the parallel bits/utmpx.h header. In struct utmp, the nonstring
attribute for ut_id was missing.
* sysdeps/aarch64/multiarch/memset_base64.S (DC_ZVA_THRESHOLD):
Disable DC ZVA code if this macro is defined as zero.
* sysdeps/aarch64/multiarch/memset_emag.S (DC_ZVA_THRESHOLD):
Change to zero to disable using DC ZVA.
This patch adds the SYNC_FILE_RANGE_WRITE_AND_WAIT constant from Linux
5.2 (a new name for a combination of existing bits, not actually a new
kernel interface) to bits/fcntl-linux.h.
Tested for x86_64.
* sysdeps/unix/sysv/linux/bits/fcntl-linux.h [__USE_GNU]
(SYNC_FILE_RANGE_WRITE_AND_WAIT): New macro.
The commit 5e855c8954
"s390: Enable VDSO for static linking" removed the definition of VDSO_SETUP
which leads to not setup the vdso symbols.
Instead it jumps to false addresses.
This patch just re adds the removed VDSO_SETUP macro definition.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/init-first.c (VDSO_SETUP): New define.
This patch adds the CLONE_PIDFD constant from Linux 5.2 to glibc's
bits/sched.h.
Tested for x86_64.
* sysdeps/unix/sysv/linux/bits/sched.h [__USE_GNU] (CLONE_PIDFD):
New macro.
This patch assumes static vDSO is supported as default, it is now supported
on all current architectures that support vDSO. It allows removing both
ALWAYS_USE_VSYSCALL define, which an architecture requires to explicit define
and USE_VSYSCALL (which defines vDSO only for shared or if architecture defines
ALWAYS_USE_VSYSCALL).
Checked with a build against all affected ABIs.
[BZ #19767]
* sysdeps/unix/sysv/linux/aarch64/sysdep.h (ALWAYS_USE_VSYSCALL):
Remove definition.
* sysdeps/unix/sysv/linux/arm/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/i386/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/riscv/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/sparc/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/x86/libc-vdso.h: Remove #if USE_VSYSCALL.
* sysdeps/unix/sysv/linux/sysdep-vdso.h: Likewise.
* sysdeps/unix/sysv/linux/sysdep.h (ALWAYS_USE_VSYSCALL,
USE_VSYSCALL): Remove defitions.
Although s390 only enables vDSO for dynamically linked elf binaries
(arch/s390/kernel/vdso.c:217), there is no indication in the code or
associated commit message for why not enable it for statically linked
binaries as well. To double check, I rebuilt a kernel with the
check removed and the vDSO does work for static build for supplied
symbols.
Checked on s390x-linux-gnu and s390-linux-gnu.
[BZ #19767]
* sysdeps/unix/sysv/linux/s390/init-first.c: Remove #ifdef SHARED.
* sysdeps/unix/sysv/linux/s390/libc-vdso.h: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h
(ALWAYS_USE_VSYSCALL): Define.
* sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
There is just one file-based implementation, so this dispatch
mechanism is unnecessary. Instead of the vtable pointer
__libc_utmp_jump_table, use a non-negative file_fd as the indicator
that the backend is initialized.
This patch updates the Linux kernel version in a comment in
syscall-names.list to agree with the following "kernel" line.
* sysdeps/unix/sysv/linux/syscall-names.list: Update comment.
The tst-mman-consts.py test includes a kernel version number, to avoid
failures because of newly added constants in the kernel (if kernel
headers are newer than this version of glibc) or missing constants in
the kernel (if kernel headers are older than this version of glibc).
This patch updates it to 5.2 to reflect that the MAP_* constants in
glibc are still current as of that kernel version.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/tst-mman-consts.py (main): Update Linux
kernel version number to 5.2.
Some implementations in sysdeps/powerpc/powerpc64/power8/*.S still had
pre power8 compatible binutils hardcoded macros and were not using
.machine power8.
This patch should not have semantic changes, in fact it should have the
same exact code generated.
Tested that generated stripped shared objects are identical when
using "strip --remove-section=.note.gnu.build-id".
Checked on:
- powerpc64le, power9, build-many-glibcs.py, gcc 6.4.1 20180104, binutils 2.26.2.20160726
- powerpc64le, power8, debian 9, gcc 6.3.0 20170516, binutils 2.28
- powerpc64le, power9, ubuntu 19.04, gcc 8.3.0, binutils 2.32
- powerpc64le, power9, opensuse tumbleweed, gcc 9.1.1 20190527, binutils 2.32
- powerpc64, power9, debian 10, gcc 8.3.0, binutils 2.31.1
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
This is missing bit for fully fix BZ#15813 (the other two were fixed
by 359653aaac).
Checked on x86_64-linux-gnu.
[BZ #15813]
sysdeps/posix/tempname.c (__gen_tempname): get entrypy on each
attempt.
Commit ffe8a9a831, "powerpc: Remove
rt_sigreturn usage on context function", removed from powerpc32
swapcontext a setting of r31 that is relied upon in subsequent code.
I'm not sure why this didn't produce test failures in Adhemerval's
32-bit testing; in my (soft-float) testing in preparation for 2.30
release, I see several context-related failures
FAIL: stdlib/tst-makecontext2
FAIL: stdlib/tst-makecontext3
FAIL: stdlib/tst-setcontext
FAIL: stdlib/tst-setcontext2
FAIL: stdlib/tst-setcontext4
FAIL: stdlib/tst-setcontext7
FAIL: stdlib/tst-setcontext9
FAIL: stdlib/tst-swapcontext1
that did not appear in 2.29 testing. This patch restores the removed
register setting in question, and thus fixes those failures.
Tested for powerpc (soft-float).
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S
(__CONTEXT_FUNC_NAME): Restore setting of r31.
When compiled with -O3 and AVX, GCC 8 and 9 optimize some loops in
sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions,
which leads to store forward stall:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
There is no easy fix in compiler. This patch limits vector width to
128 bits to work around this issue. It improves performance of sin
and cos by more than 40% on Skylake compiled with -O3 -march=skylake.
Tested with GCC 7/8/9 on x86-64.
[BZ #24603]
* sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128
works.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New. Set
to -mprefer-vector-width=128 if supported.
The kernel changes for a 64-bit time_t on 32-bit architectures
resulted in <asm/socket.h> indirectly including <linux/posix_types.h>.
The latter is not namespace-clean for the POSIX version of
<sys/socket.h>.
This issue has persisted across several Linux releases, so this commit
creates our own copy of the SO_* definitions for !__USE_MISC mode.
The new test socket/tst-socket-consts ensures that the copy is
consistent with the kernel definitions (which vary across
architectures). The test is tricky to get right because CPPFLAGS
includes include/libc-symbols.h, which in turn defines _GNU_SOURCE
unconditionally.
Tested with build-many-glibcs.py. I verified that a discrepancy in
the definitions actually results in a failure of the
socket/tst-socket-consts test.
The pthread _clock functions that were recently added to nptl need to be
declared in hppa's pthread.h too. After this change, the function
declaration part of sysdeps/nptl/pthread.h and
sysdeps/unix/sysv/linux/hppa/pthread.h are identical.
* sysdeps/unix/sysv/linux/hppa/pthread.h: Add declarations of
functions recently added to sysdeps/nptl/pthread.h:
pthread_mutex_clocklock, pthread_rwlock_clockrdlock,
pthread_rwlock_clockwrlock and pthread_cond_clockwait.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In afe4de7d28, I added forwarding functions
from libc to libpthread for __pthread_cond_clockwait and
pthread_cond_clockwait to mirror those for pthread_cond_timedwait. These
are unnecessary[1], since these functions aren't (yet) being called from
within libc itself. Let's remove them.
* nptl/forward.c: Remove unnecessary __pthread_cond_clockwait and
pthread_cond_clockwait forwarding functions. There are no internal
users, so it is unnecessary to expose these functions in libc.so.
* sysdeps/nptl/pthread-functions.h (pthread_functions): Remove
unnecessary ptr___pthread_cond_clockwait member.
* nptl/nptl-init.c (pthread_functions): Remove assignment of
removed member.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
[1] https://sourceware.org/ml/libc-alpha/2017-10/msg00082.html
The only implementation of futex_supports_exact_relative_timeouts always
returns true. Let's remove it and all its callers.
* nptl/pthread_cond_wait.c: (__pthread_cond_clockwait): Remove code
that is only useful if futex_supports_exact_relative_timeouts ()
returns false.
* nptl/pthread_condattr_setclock.c: (pthread_condattr_setclock):
Likewise.
* sysdeps/nptl/futex-internal.h: Remove comment about relative
timeouts potentially being imprecise since it's no longer true.
Remove declaration of futex_supports_exact_relative_timeouts.
* sysdeps/unix/sysv/linux/futex-internal.h: Remove implementation
of futex_supports_exact_relative_timeouts.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Rename lll_timedlock to lll_clocklock and add clockid
parameter to indicate the clock that the abstime parameter should
be measured against in preparation for adding
pthread_mutex_clocklock.
The name change mirrors the naming for the exposed pthread functions:
timed => absolute timeout measured against CLOCK_REALTIME (or clock
specified by attribute in the case of pthread_cond_timedwait.)
clock => absolute timeout measured against clock specified in preceding
parameter.
* sysdeps/nptl/lowlevellock.h (lll_clocklock): Rename from
lll_timedlock and add clockid parameter. (__lll_clocklock): Rename
from __lll_timedlock and add clockid parameter.
* sysdeps/unix/sysv/linux/sparc/lowlevellock.h (lll_clocklock):
Likewise.
* nptl/lll_timedlock_wait.c (__lll_clocklock_wait): Rename from
__lll_timedlock_wait and add clockid parameter. Use __clock_gettime
rather than __gettimeofday so that clockid can be used. This means
that conversion from struct timeval is no longer required.
* sysdeps/sparc/sparc32/lowlevellock.c (lll_clocklock_wait):
Likewise.
* sysdeps/sparc/sparc32/lll_timedlock_wait.c: Update comment to
refer to __lll_clocklock_wait rather than __lll_timedlock_wait.
* nptl/pthread_mutex_timedlock.c (lll_clocklock_elision): Rename
from lll_timedlock_elision, add clockid parameter and use
meaningful names for other parameters. (__pthread_mutex_timedlock):
Pass CLOCK_REALTIME where necessary to lll_clocklock and
lll_clocklock_elision.
* sysdeps/unix/sysv/linux/powerpc/lowlevellock.h
(lll_clocklock_elision): Rename from lll_timedlock_elision and add
clockid parameter. (__lll_clocklock_elision): Rename from
__lll_timedlock_elision and add clockid parameter.
* sysdeps/unix/sysv/linux/s390/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-timed.c
(__lll_lock_elision): Call __lll_clocklock_elision rather than
__lll_timedlock_elision. (EXTRAARG): Add clockid parameter.
(LLL_LOCK): Likewise.
* sysdeps/unix/sysv/linux/s390/elision-timed.c: Likewise.
* sysdeps/unix/sysv/linux/x86/elision-timed.c: Likewise.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add:
int pthread_rwlock_clockrdlock (pthread_rwlock_t *rwlock,
clockid_t clockid,
const struct timespec *abstime)
and:
int pthread_rwlock_clockwrlock (pthread_rwlock_t *rwlock,
clockid_t clockid,
const struct timespec *abstime)
which behave like pthread_rwlock_timedrdlock and
pthread_rwlock_timedwrlock respectively, except they always measure
abstime against the supplied clockid. The functions currently support
CLOCK_REALTIME and CLOCK_MONOTONIC and return EINVAL if any other
clock is specified.
* sysdeps/nptl/pthread.h: Add pthread_rwlock_clockrdlock and
pthread_wrlock_clockwrlock.
* nptl/Makefile: Build pthread_rwlock_clockrdlock.c and
pthread_rwlock_clockwrlock.c.
* nptl/pthread_rwlock_clockrdlock.c: Implement
pthread_rwlock_clockrdlock.
* nptl/pthread_rwlock_clockwrlock.c: Implement
pthread_rwlock_clockwrlock.
* nptl/pthread_rwlock_common.c (__pthread_rwlock_rdlock_full): Add
clockid parameter and verify that it indicates a supported clock on
entry so that we fail even if it doesn't end up being used. Pass
that clock on to futex_abstimed_wait when necessary.
(__pthread_rwlock_wrlock_full): Likewise.
* nptl/pthread_rwlock_rdlock.c: (__pthread_rwlock_rdlock): Pass
CLOCK_REALTIME to __pthread_rwlock_rdlock_full even though it won't
be used because there's no timeout.
* nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock): Pass
CLOCK_REALTIME to __pthread_rwlock_wrlock_full even though it won't
be used because there is no timeout.
* nptl/pthread_rwlock_timedrdlock.c (pthread_rwlock_timedrdlock):
Pass CLOCK_REALTIME to __pthread_rwlock_rdlock_full since abstime
uses that clock.
* nptl/pthread_rwlock_timedwrlock.c (pthread_rwlock_timedwrlock):
Pass CLOCK_REALTIME to __pthread_rwlock_wrlock_full since abstime
uses that clock.
* sysdeps/unix/sysv/linux/aarch64/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/alpha/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/arm/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/csky/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/hppa/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/i386/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/ia64/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/microblaze/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/nios2/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/sh/libpthread.abilist (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libpthread.abilist
(GLIBC_2.30): Likewise.
* nptl/tst-abstime.c (th): Add pthread_rwlock_clockrdlock and
pthread_rwlock_clockwrlock timeout tests to match the existing
pthread_rwlock_timedrdloock and pthread_rwlock_timedwrlock tests.
* nptl/tst-rwlock14.c (do_test): Likewise.
* nptl/tst-rwlock6.c Invent verbose_printf macro, and use for
ancillary output throughout. (tf): Accept thread_args structure so
that rwlock, a clockid and function name can be passed to the
thread. (do_test_clock): Rename from do_test. Accept clockid
parameter to specify test clock. Use the magic clockid value of
CLOCK_USE_TIMEDLOCK to indicate that pthread_rwlock_timedrdlock and
pthread_rwlock_timedwrlock should be tested, otherwise pass the
specified clockid to pthread_rwlock_clockrdlock and
pthread_rwlock_clockwrlock. Use xpthread_create and xpthread_join.
(do_test): Call do_test_clock to test each clockid in turn.
* nptl/tst-rwlock7.c: Likewise.
* nptl/tst-rwlock9.c (writer_thread, reader_thread): Accept
thread_args structure so that the (now int) thread number, the
clockid and the function name can be passed to the thread.
(do_test_clock): Renamed from do_test. Pass the necessary
thread_args when creating the reader and writer threads. Use
xpthread_create and xpthread_join.
(do_test): Call do_test_clock to test each clockid in turn.
* manual/threads.texi: Add documentation for
pthread_rwlock_clockrdlock and pthread_rwlock_clockwrclock.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add:
int pthread_cond_clockwait (pthread_cond_t *cond,
pthread_mutex_t *mutex,
clockid_t clockid,
const struct timespec *abstime)
which behaves just like pthread_cond_timedwait except it always measures
abstime against the supplied clockid. Currently supports CLOCK_REALTIME
and
CLOCK_MONOTONIC and returns EINVAL if any other clock is specified.
Includes feedback from many others. This function was originally
proposed[1] as pthread_cond_timedwaitonclock_np, but The Austin Group
preferred the new name.
* nptl/Makefile: Add tst-cond26 and tst-cond27
* nptl/Versions (GLIBC_2.30): Add pthread_cond_clockwait
* sysdeps/nptl/pthread.h: Likewise
* nptl/forward.c: Add __pthread_cond_clockwait
* nptl/forward.c: Likewise
* nptl/pthreadP.h: Likewise
* sysdeps/nptl/pthread-functions.h: Likewise
* nptl/pthread_cond_wait.c (__pthread_cond_wait_common): Add
clockid parameter and comment describing why we don't need to
check
its value. Use that value when calling
futex_abstimed_wait_cancelable rather than reading the clock
from
the flags. (__pthread_cond_wait): Pass unused clockid parameter.
(__pthread_cond_timedwait): Read clock from flags and pass it to
__pthread_cond_wait_common. (__pthread_cond_clockwait): Add new
function with weak alias from pthread_cond_clockwait.
* sysdeps/mach/hurd/i386/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/aarch64/libpthread.abilist
* (GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/alpha/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/arm/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/csky/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/hppa/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/i386/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/ia64/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/microblaze/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/nios2/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/sh/libpthread.abilist (GLIBC_2.30):
* Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libpthread.abilist
(GLIBC_2.30): Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libpthread.abilist
(GLIBC_2.30): Likewise.
* nptl/tst-cond11.c (run_test): Support testing
pthread_cond_clockwait too by using a special magic
CLOCK_USE_ATTR_CLOCK value to determine whether to call
pthread_cond_timedwait or pthread_cond_clockwait. (do_test):
Pass
CLOCK_USE_ATTR_CLOCK for existing tests, and add new tests using
all combinations of CLOCK_MONOTONIC and CLOCK_REALTIME.
* ntpl/tst-cond26.c: New test for passing unsupported and
* invalid
clocks to pthread_cond_clockwait.
* nptl/tst-cond27.c: Add test similar to tst-cond5.c, but using
struct timespec and pthread_cond_clockwait.
* manual/threads.texi: Document pthread_cond_clockwait. The
* comment
was provided by Carlos O'Donell.
[1] https://sourceware.org/ml/libc-alpha/2015-07/msg00193.html
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In preparation for adding POSIX clockwait variants of timedwait functions,
add a clockid_t parameter to futex_abstimed_wait functions and pass
CLOCK_REALTIME from all callers for the time being.
Replace lll_futex_timed_wait_bitset with lll_futex_clock_wait_bitset
which takes a clockid_t parameter rather than the magic clockbit.
* sysdeps/nptl/lowlevellock-futex.h,
sysdeps/unix/sysv/linux/lowlevellock-futex.h: Replace
lll_futex_timed_wait_bitset with lll_futex_clock_wait_bitset that
takes a clockid rather than a special clockbit.
* sysdeps/nptl/lowlevellock-futex.h: Add
lll_futex_supported_clockid so that client functions can check
whether their clockid parameter is valid even if they don't
ultimately end up calling lll_futex_clock_wait_bitset.
* sysdeps/nptl/futex-internal.h,
sysdeps/unix/sysv/linux/futex-internal.h
(futex_abstimed_wait, futex_abstimed_wait_cancelable): Add
clockid_t parameter to indicate which clock the absolute time
passed should be measured against. Pass that clockid onto
lll_futex_clock_wait_bitset. Add invalid clock as reason for
returning -EINVAL.
* sysdeps/nptl/futex-internal.h,
sysdeps/unix/sysv/linux/futex-internal.h: Introduce
futex_abstimed_supported_clockid so that client functions can check
whether their clockid parameter is valid even if they don't
ultimately end up calling futex_abstimed_wait.
* nptl/pthread_cond_wait.c (__pthread_cond_wait_common): Remove
code to calculate relative timeout for
__PTHREAD_COND_CLOCK_MONOTONIC_MASK and just pass CLOCK_MONOTONIC
or CLOCK_REALTIME as required to futex_abstimed_wait_cancelable.
* nptl/pthread_rwlock_common (__pthread_rwlock_rdlock_full)
(__pthread_wrlock_full), nptl/sem_waitcommon (do_futex_wait): Pass
additional CLOCK_REALTIME to futex_abstimed_wait_cancelable.
* nptl/pthread_mutex_timedlock.c (__pthread_mutex_timedlock):
Switch to lll_futex_clock_wait_bitset and pass CLOCK_REALTIME
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The fix for BZ#21270 (commit 158d5fa0e1) added a mask to avoid offset larger
than 1^44 to be used along __NR_mmap2. However mips64n32 users __NR_mmap,
as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such
x32, defines off_t being equal to off64_t). This leads to use the same
mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum
offset it can use with mmap64.
This patch fixes by setting the high mask only for __NR_mmap2 usage. The
posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The
patch also change the test to check for an arch-specific header that defines
the maximum supported offset.
Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset
on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.
[BZ #24699]
* posix/tst-mmap-offset.c: Mention BZ #24699.
(do_test_bz21270): Rename to do_test_large_offset and use
mmap64_maximum_offset to check for maximum expected offset value.
* sysdeps/generic/mmap_info.h: New file.
* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
__NR_mmap2 is used.
Remove unnecessary variant_pcs field: the dynamic tag can be checked
directly.
* sysdeps/aarch64/dl-machine.h (elf_machine_runtime_setup): Remove the
DT_AARCH64_VARIANT_PCS check.
(elf_machine_lazy_rel): Use l_info[DT_AARCH64 (VARIANT_PCS)].
* sysdeps/aarch64/linkmap.h (struct link_map_machine): Remove
variant_pcs.
Using __builtin_cpu_supports() requires support in GCC and Glibc.
My recent patch to fenv_libc.h added an unprotected use of
__builtin_cpu_supports(). Compilation of Glibc itself will fail
with a sufficiently new GCC and sufficiently old Glibc:
../sysdeps/powerpc/fpu/fegetexcept.c: In function ‘__fegetexcept’:
../sysdeps/powerpc/fpu/fenv_libc.h:52:20: error: builtin ‘__builtin_cpu_supports’ needs GLIBC (2.23 and newer) that exports hardware capability bits [-Werror]
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Fixes 3db85a9814.
The power7 logb implementation does not show a performance gain on
ISA 2.07+ chips with faster floating-point to GRP instructions
(currently POWER8 and POWER9).
This patch moves the POWER7 implementation to generic one and enables
it for POWER7. It also add some cleanup to use inline floating-point
number instead of define them using static const.
The performance difference is for POWER9:
- Without patch:
"logb": {
"subnormal": {
"duration": 4.99202e+09,
"iterations": 8.83662e+08,
"max": 75.194,
"min": 5.501,
"mean": 5.64925
},
"normal": {
"duration": 4.97063e+09,
"iterations": 9.97094e+08,
"max": 46.489,
"min": 4.956,
"mean": 4.98512
}
}
- With patch:
"logb": {
"subnormal": {
"duration": 4.97226e+09,
"iterations": 9.92036e+08,
"max": 77.209,
"min": 4.892,
"mean": 5.01218
},
"normal": {
"duration": 4.96192e+09,
"iterations": 1.07545e+09,
"max": 12.361,
"min": 4.593,
"mean": 4.61382
}
}
The ifunc implementation is also enabled only for powerpc64.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/power7/fpu/s_logb.c: Move to ...
* sysdeps/powerpc/fpu/s_logb.c: ... here. Use inline FP constants.
* sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ...
* sysdeps/powerpc/fpu/s_logbf.c: ... here. Use inline FP constants.
* sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ...
* sysdeps/powerpc/fpu/s_logbl.c: ... here. Use inline FP constants.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c:
Adjust implementation path.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c:
Adjust implementation path.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c:
Adjust implementation path.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_log* objects.
(CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c,
CFLAGS-s_logb-power7.c): New fule.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file.
* sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file.
* sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
- The resulting binary difference on 32 bits architecture is
minimum. On i686-linux-gnu (with architecture optimization
routine removed) there is no different using logb benchtests
- It helps wordsize-64 architectures that use ldbl-opt.
- It add some code simplification with reduction of duplicated
implementations.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Move to ...
* sysdeps/ieee754/dbl-64/s_logb.c: ... here. Add work around for
powerpc32 integer 0 converting to -0.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Passing a second argument to the ifunc resolver allows accessing
AT_HWCAP2 values from the resolver. AArch64 will start using AT_HWCAP2
on linux because for ilp32 to remain compatible with lp64 ABI no more
than 32bit hwcap flags can be in AT_HWCAP which is already used up.
Currently the relocation ordering logic does not guarantee that ifunc
resolvers can call libc apis or access libc objects, so only the
resolver arguments and runtime environment dependent instructions can
be used to do the dispatch (this affects ifunc resolvers outside of
the libc).
Since ifunc resolver is target specific and only supposed to be
called by the dynamic linker, the call ABI can be changed in a
backward compatible way:
Old call ABI passed hwcap as uint64_t, new abi sets the
_IFUNC_ARG_HWCAP flag in the hwcap and passes a second argument
that's a pointer to an extendible struct. A resolver has to check
the _IFUNC_ARG_HWCAP flag before accessing the second argument.
The new sys/ifunc.h installed header has the definitions for the
new ABI, everything is in the implementation reserved namespace.
An alternative approach is to try to support extern calls from ifunc
resolvers such as getauxval, but that seems non-trivial
https://sourceware.org/ml/libc-alpha/2017-01/msg00468.html
* sysdeps/aarch64/Makefile: Install sys/ifunc.h and add tests.
* sysdeps/aarch64/dl-irel.h (elf_ifunc_invoke): Update to new ABI.
* sysdeps/aarch64/sys/ifunc.h: New file.
* sysdeps/aarch64/tst-ifunc-arg-1.c: New file.
* sysdeps/aarch64/tst-ifunc-arg-2.c: New file.
With commit f0b2132b35 ("ld.so:
Support moving versioned symbols between sonames [BZ #24741]"), the
dynamic linker will find the definition of vfork in libc and binds
a vfork reference to that symbol, even if the soname in the version
reference says that the symbol should be located in libpthread.
As a result, the forwarder (whether it's IFUNC-based or a duplicate
of the libc implementation) is no longer necessary.
On older architectures, a placeholder symbol is required, to make sure
that the GLIBC_2.1.2 symbol version does not go away, or is turned in
to a weak symbol definition by the link editor. (The symbol version
needs to preserved so that the symbol coverage check in
elf/dl-version.c does not fail for old binaries.)
mips32 is an outlier: It defined __vfork@@GLIBC_2.2, but the
baseline is GLIBC_2.0. Since there are other @@GLIBC_2.2 symbols,
the placeholder symbol is not needed there.
Using 'mffs' instruction to read the Floating Point Status Control Register
(FPSCR) can force a processor flush in some cases, with undesirable
performance impact. If the values of the bits in the FPSCR which force the
flush are not needed, an instruction that is new to POWER9 (ISA version 3.0),
'mffsl' can be used instead.
Cases included: get_rounding_mode, fegetround, fegetmode, fegetexcept.
* sysdeps/powerpc/bits/fenvinline.h (__fegetround): Use
__fegetround_ISA300() or __fegetround_ISA2() as appropriate.
(__fegetround_ISA300) New.
(__fegetround_ISA2) New.
* sysdeps/powerpc/fpu_control.h (IS_ISA300): New.
(_FPU_MFFS): Move implementation...
(_FPU_GETCW): Here.
(_FPU_MFFSL): Move implementation....
(_FPU_GET_RC_ISA300): Here. New.
(_FPU_GET_RC): Use _FPU_GET_RC_ISA300() or _FPU_GETCW() as appropriate.
* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status_ISA300): New.
(fegetenv_status): New.
* sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status()
instead of fegetenv_register().
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Likewise.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
The kernel is evolving this interface (e.g., removal of the
restriction on cross-device copies), and keeping up with that
is difficult. Applications which need the function should
run kernels which support the system call instead of relying on
the imperfect glibc emulation.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The kernel interface uses type unsigned int, but there is an
internal conversion to int, so INT_MAX is the correct limit.
Part of the buffer will always be unused, but this is not a
problem. Such huge buffers do not occur in practice anyway.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since sysdeps/i386/dl-lookupcfg.h and sysdeps/x86_64/dl-lookupcfg.h are
identical, we can replace them with sysdeps/x86/dl-lookupcfg.h.
* sysdeps/i386/dl-lookupcfg.h: Moved to ...
* sysdeps/x86/dl-lookupcfg.h: Here.
* sysdeps/x86_64/dl-lookupcfg.h: Removed.
The nds32 creates two specific syscalls, udftrap and fp_udfiex_crtl, in
kernel v5.0 and v5.2, respectively. Add these two syscalls to
syscall-names.list.
Define all currently used Linux versions used for
PREPARE_VERSION{,_KNOWN} in sysdeps/unix/sysv/linux/dl-vdso.h and use
them instead of duplicating the versions and precomputed hashes across
architecture specific files.
* sysdeps/unix/sysv/linux/aarch64/gettimeofday.c (INIT_ARCH): Use
PREPARE_VERSION_KNOWN.
* sysdeps/unix/sysv/linux/aarch64/init-first.c: Likewise.
* sysdeps/unix/sysv/linux/dl-vdso.h (VDSO_NAME_LINUX_2_6_39): New
define.
(VDSO_HASH_LINUX_2_6_39): Likewise.
(VDSO_NAME_LINUX_4_9): Likewise.
(VDSO_HASH_LINUX_4_9): Likewise.
* sysdeps/unix/sysv/linux/powerpc/gettimeofday.c (INIT_ARCH): Likewise.
* sysdeps/unix/sysv/linux/powerpc/init-first.c
(_libc_vdso_platform_setup): Likewise.
* sysdeps/unix/sysv/linux/powerpc/time.c (INIT_ARCH): Likewise.
* sysdeps/unix/sysv/linux/s390/init-first.c (_libc_vdso_platform_setup):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/init-first.c (__vdso_platform_setup):
Likewise.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add 'volatile' keyword to a few asm statements, to force the compiler
to generate the instructions therein.
Some instances were implicitly volatile, but adding keyword for consistency.
2019-06-19 Paul A. Clarke <pc@us.ibm.com>
* sysdeps/powerpc/fpu/fenv_libc.h (relax_fenv_state): Add 'volatile'.
* sysdeps/powerpc/fpu/fpu_control.h (__FPU_MFFS): Likewise.
(__FPU_MFFSL): Likewise.
(_FPU_SETCW): Likewise.
__ppc_get_timebase_freq() always return 0 when using static linked
glibc.
This is a minimal example.c to reproduce:
/******************************/
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/platform/ppc.h>
int main() {
uint64_t freq = __ppc_get_timebase_freq();
printf("Time Base frequency = %"PRIu64" Hz\n", freq);
if (freq == 0)
return -1;
return 0;
}
/******************************/
Compile command: gcc -static example.c
This bug has been reproduced, fixed and tested on all powerpc platforms
(ppc32, ppc64 and ppc64le).
The underlying code of __ppc_get_timebase_freq uses __get_timebase_freq
that has a different implementation for shared and static version of
glibc. In the static version, there is an incorrect sense in the if
check for the fd returned when opening /proc/cpuinfo.
This solution is mostly a cherry-pick from:
commit 4791e4f773d060c1a37b27aac5b03cdfa9327afc
Author: Stan Shebs <stanshebs@google.com>
Date: Fri May 17 12:25:19 2019 -0700
Subject: Fix sense of a test in the static-linking version of ppc get_clockfreq
That is in branch glibc/google/grte/v5-2.27/master and was mentioned for
inclusion on master here:
https://www.sourceware.org/ml/libc-alpha/2019-05/msg00409.html
Adapted from original fix for get_clockfreq. That code was moved to
get_timebase_freq.
Also added a static-build testcase for __ppc_get_timebase_freq since the
underlying function has different implementations for shared and static
build.
[BZ #24640]
* sysdeps/unix/sysv/linux/powerpc/get_timebase_freq.c
[!SHARED] (__get_timebase_freq): Fix sense of a test in the
static-linking version.
* sysdeps/unix/sysv/linux/powerpc/Makefile
(tests-static): Add test-gettimebasefreq-static.
(tests): Likewise.
* sysdeps/unix/sysv/linux/powerpc/test-gettimebasefreq-static.c:
New file.
Although defined in initial TLS/NPTL ABI for m68k and ColdFire [1], kernel
support was never pushed upstream. This patch removes the unused m68k
vDSO support.
Checked with a build against m68k and m68k-coldfire and some basic
tests on ARAnyM.
* sysdeps/unix/sysv/linux/m68k/Makefile (sysdep_routines,
sysdep-rtld-routines): Remove rules.
* sysdeps/unix/sysv/linux/m68k/Versions (libc) [GLIBC_PRIVATE]:
Remove __vdso_atomic_cmpxchg_32 and __vdso_atomic_barrier.
(ld) [GLIBC_PRIVATE]: __rtld___vdso_read_tp,
__rtld___vdso_atomic_cmpxchg_32, and __rtld___vdso_atomic_barrier.
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h
(atomic_compare_and_exchange_val_acq, atomic_full_barrier): Remove
vDSO path for SHARED.
* sysdeps/unix/sysv/linux/m68k/init-first.c: Remove file.
* sysdeps/unix/sysv/linux/m68k/libc-m68k-vdso.c: Likewise.
* sysdeps/unix/sysv/linux/m68k/m68k-helpers.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/m68k-vdso.c: Likewise.
* sysdeps/unix/sysv/linux/m68k/m68k-vdso.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/m68k-helpers.c: New file.
[1] https://lists.debian.org/debian-68k/2007/11/msg00071.html
This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/fpu/s_llrint{f}.
The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_llrint-power8, s_llrint-power6x, and
s_llrint-ppc64.
(CFLAGS-s_llrint-power8.c, CFLAGS-s_llrint-power6x.c): New rule.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power6x.c: New
file.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power8.c:
Likewise.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lrint.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrintf.c: ... here.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: New file.
* sysdeps/powerpc/powerpc64/fpu/Makefile: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_llrint-* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llrint.c: New file.
* sysdeps/powerpc/powerpc64/fpu/s_llrintf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_lrint.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Remove file.
* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_lrint.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
The identifier linux is used as a predefined macro, so the actually
used path is 1/stat.h or 1/stat64.h. Using the quote-based version
triggers a file lookup for /usr/include/bits/linux/stat.h (or whatever
directory is used to store bits/statx.h), but since bits/ is pretty
much reserved by glibc, this appears to be acceptable.
This is related to GCC PR 80005: incorrect macro expansion of the
argument of __has_include.
Suggested by Zack Weinberg.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch adds the new constant IPV6_ROUTER_ALERT_ISOLATE from Linux
5.1 to sysdeps/unix/sysv/linux/bits/in.h.
Tested for x86_64.
* sysdeps/unix/sysv/linux/bits/in.h (IPV6_ROUTER_ALERT_ISOLATE):
New macro.
Some recent change on GCC mainline resulted in the localplt test
failing for powerpc soft-float (not sure exactly when, as the failure
appeared when there were other build test failures as well;
<https://sourceware.org/ml/libc-testresults/2019-q2/msg00261.html>
shows it remaining when other failures went away). The problem is a
call to memset that GCC now generates in the libgcc long double code.
Since memset is documented as a function GCC may always implicitly
generate calls to, it seems reasonable to allow that local PLT
reference (just like those for libgcc functions that GCC implicitly
generates calls to and that are also exported from libc.so), which
this patch does.
Tested for powerpc soft-float with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/localplt.data:
Allow memset in libc.so.
Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.
Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls. Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.
The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially. In this
patch such symbols are always resolved at load time, not lazily.
So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).
This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.
* sysdeps/aarch64/dl-dtprocnum.h: New file.
* sysdeps/aarch64/dl-machine.h (DT_AARCH64): Define.
(elf_machine_runtime_setup): Handle DT_AARCH64_VARIANT_PCS.
(elf_machine_lazy_rel): Check STO_AARCH64_VARIANT_PCS and bind such
symbols at load time.
* sysdeps/aarch64/linkmap.h (struct link_map_machine): Add variant_pcs.
The powerpc finite optimization do not show much gain:
- GCC will call libm iff -fsignaling-nans is used. This usage pattern
is usually not performance oriented and for such calls PLT overhead
should dominate execution time.
- The power7 uses ftdiv to optimize for some input patterns, but at
cost of others. Comparing against generic C implementation built
for powerpc64-linux-gnu-power7 (--with-cpu=power7):
- Generic sysdeps/ieee754 implementation:
"isfinite": {
"": {
"duration": 5.0082e+09,
"iterations": 2.45299e+09,
"max": 43.824,
"min": 2.008,
"mean": 2.04167
},
"INF": {
"duration": 4.66554e+09,
"iterations": 2.28288e+09,
"max": 35.73,
"min": 2.008,
"mean": 2.04371
},
"NAN": {
"duration": 4.66274e+09,
"iterations": 2.28716e+09,
"max": 34.161,
"min": 2.009,
"mean": 2.03866
}
}
- power7 optimized one:
"isfinite": {
"": {
"duration": 4.99111e+09,
"iterations": 2.65566e+09,
"max": 25.015,
"min": 1.716,
"mean": 1.87942
},
"INF": {
"duration": 4.6783e+09,
"iterations": 2.0999e+09,
"max": 35.264,
"min": 1.868,
"mean": 2.22787
},
"NAN": {
"duration": 4.67915e+09,
"iterations": 2.08678e+09,
"max": 38.099,
"min": 1.869,
"mean": 2.24228
}
}
So it basically optimizes marginally for normal numbers while
increasing the latency for other kind of FP.
- The power8 implementation is just the generic implementation using
ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
So generic implementation is the best option for powerpc64le.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdeps_routines, libm-sysdep_routines): Remove s_finite*
objects.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
Remove s_finite* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
- math.h will use compiler builtin for gcc 4.4 when built without
-fsignaling-nans and the builtin is expanded inline for all
support architectures. As an example, there is no intra finite
call on libm for the architecture I checked, x86, arm, aarch64,
and powerpc.
- The resulting binary difference on 32 bits architecture is minimum
for the non hotspot symbol.
- It helps wordsize-64 architectures that use ldbl-opt.
- It add some code simplification with reduction of duplicated
implementations.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ...
* sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
The powerpc isinf optimizations onyl adds complexity:
- GCC will call libm iff -fsignaling-nans is used. This usage pattern
is usually not performance oriented and for such calls PLT overhead
should dominate execution time.
- The power7 uses ftdiv to optimize for some input pattern and branch
implementation for INF and denormal that does:
return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000)
Although it does show slight better latency than generic algorithm
(as below), it is only for power7 and requires it to override it
for power8.
- The power8 implementation is just the generic implementation using
ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
So generic implementation is the best option for powerpc64le.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf*
objects.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
Remove s_isinf* and s_isinf* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
- math.h will use compiler builtin for gcc 4.4 when built without
-fsignaling-nans and the builtin is expanded inline for all
support architectures. As an example, there is no intra isinf
call on libm for the architecture I checked, x86, arm, aarch64,
and powerpc.
- The resulting binary difference on 32 bits architecture is minimum
for the non hotspot symbol.
- It helps wordsize-64 architectures that use ldbl-opt.
- It add some code simplification with reduction of duplicated
implementations.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ...
* sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
The powerpc isnan optimizations are not really a gain:
- GCC will call libm iff -fsignaling-nans is used. This usage pattern
is usually not performance oriented and for such calls PLT overhead
should dominate execution time.
- The power5, power6, and power6x are just micro-optimization to
improve the Load-Hit-Store hazards from floating-point to general
register transfer, and current GCC already has support to minimize
it by inserting either extra nops or group dispatch instructions.
- The power7 uses ftdiv to optimize for some input patterns, but at
cost of others. Comparing against generic C implementation built
for powerpc-linux-gnu-power4 (which uses the hp-timing support on
benchtests):
- Generic sysdeps/ieee754 implementation:
"isnan": {
"": {
"duration": 4.98415e+09,
"iterations": 2.34516e+09,
"max": 45.925,
"min": 2.052,
"mean": 2.12529
},
"INF": {
"duration": 4.74057e+09,
"iterations": 1.69761e+09,
"max": 91.01,
"min": 2.052,
"mean": 2.79249
},
"NAN": {
"duration": 4.74071e+09,
"iterations": 1.68768e+09,
"max": 282.343,
"min": 2.052,
"mean": 2.809
}
}
- power7 optimized one:
$ ./testrun.sh benchtests/bench-isnan
"isnan": {
"": {
"duration": 4.96842e+09,
"iterations": 2.56297e+09,
"max": 50.048,
"min": 1.872,
"mean": 1.93854
},
"INF": {
"duration": 4.76648e+09,
"iterations": 1.54213e+09,
"max": 373.408,
"min": 2.661,
"mean": 3.09084
},
"NAN": {
"duration": 4.76845e+09,
"iterations": 1.54515e+09,
"max": 51.016,
"min": 2.736,
"mean": 3.08607
}
}
So it basically optimizes marginally for normal numbers while
increasing the latency for other kind of FP.
- The generic implementation requires getting the floating point
status, disable the invalid operation bit, and restore the
floating-point status. Each operation is costly and requires
flushing the FP pipeline.
Using the same scenarion for the previous analysis:
"isnan": {
"": {
"duration": 5.08284e+09,
"iterations": 6.2898e+08,
"max": 41.844,
"min": 8.057,
"mean": 8.08108
},
"INF": {
"duration": 4.97904e+09,
"iterations": 6.16176e+08,
"max": 39.661,
"min": 8.057,
"mean": 8.08055
},
"NAN": {
"duration": 4.98695e+09,
"iterations": 5.95866e+08,
"max": 29.728,
"min": 8.345,
"mean": 8.36925
}
}
- The power8 implementation is just the generic implementation using
ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
So generic implementation is the best option for powerpc64le.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/s_isnan.c: Remove file.
* sysdeps/powerpc/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and
s_isnanf-* objects.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S:
Remove file
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls):
Remove s_isnan-* and s_isnanf-* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
- math.h will use compiler builtin for gcc 4.4 when built without
-fsignaling-nans and the builtin is expanded inline for all
support architectures. As an example, there is no intra isnan
call on libm for the architecture I checked, x86, arm, aarch64,
and powerpc.
- The resulting binary difference on 32 bits architecture is minimum
for the non hotspot symbol.
- It helps wordsize-64 architectures that use ldbl-opt.
- It add some code simplification with reduction of duplicated
implementations.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Move to ...
* sysdeps/ieee754/dbl-64/s_isnan.c: ... here and format code.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
GCC always expand copysign{f} for all possible cpus, so calling the libm
is only done if user explicitly states to disable the builtin (which is
done usually not for performance reason). So to provide ifunc variant
for copysign is just unrequired complexity, since libm will be called
on non-performance critical code.
This patch removes both powerpc32 and powerpc64 ifunc variants and
consolidates the powerpc implementation on
sysdeps/powerpc/fpu/s_copysign{f}.c using compiler builtins.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/s_copysign.c: New file.
* sysdeps/powerpc/fpu/s_copysignf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_copysignf.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdep_routines, libm-sysdep_routines): Remove s_copysign-power6 and
s_copysign-ppc32.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-power6.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_copysignf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdeps_calls):
Remove s_copysign-power6 s_copysign-ppc64.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S:
Remove file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysignf.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_copysignf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
This patches consolidates all the powerpc rint{f} implementations on
the generic sysdeps/powerpc/fpu/s_rint{f}.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode,
round_to_integer_float, round_mode): Add RINT handling.
(reset_fenv_mode): New symbol.
* sysdeps/powerpc/fpu/s_rint.c (__rint): Use generic implementation.
* sysdeps/powerpc/fpu/s_rintf.c (__rintf): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_rint.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Now that there are no internal users of __sysctl left, it is possible
to add an unconditional deprecation warning to <sys/sysctl.h>.
To avoid a test failure due this warning in check-install-headers,
skip the test for sys/sysctl.h.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
No 32-bit system call wrapper is added because the interface
is problematic because it cannot deal with 64-bit inode numbers
and 64-bit directory hashes.
A future commit will deprecate the undocumented getdirentries
and getdirentries64 functions.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
Also, mask the result to avoid bleeding unrelated bits into the result of
_FPU_GET_RC().
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
fegetexcept() included code which exactly duplicates the code in
fenv_reg_to_exceptions(). Replace with a call to that function.
2019-06-05 Paul A. Clarke <pc@us.ibm.com>
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
with call to equivalent function.
Linux only supports the required ISA sysctls on StrongARM devices,
which are armv4 and no longer tested during glibc development
and probably bit-rotted by this point. (No reported test results,
and the last discussion of armv4 support was in the glibc 2.19
release notes.)
<asm/unistd.h> on arm defines the following macros:
#define __ARM_NR_breakpoint (__ARM_NR_BASE+1)
#define __ARM_NR_cacheflush (__ARM_NR_BASE+2)
#define __ARM_NR_usr26 (__ARM_NR_BASE+3)
#define __ARM_NR_usr32 (__ARM_NR_BASE+4)
#define __ARM_NR_set_tls (__ARM_NR_BASE+5)
#define __ARM_NR_get_tls (__ARM_NR_BASE+6)
These do not follow the regular __NR_* naming convention and
have so far been ignored by the syscall-names.list consistency
checks. This commit adds these names to the file, preparing
for the availability of these names in the regular __NR_*
namespace.
Since GCC commit 271500 (svn), also known as the following commit on the
git mirror:
commit 61edec870f9fdfb5df3fa4e40f28cbaede28a5b1
Author: amodra <amodra@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed May 22 04:34:26 2019 +0000
[RS6000] Don't pass -many to the assembler
glibc builds are failing when an assembly implementation does not
declare the correct '.machine' directive, or when no such directive is
declared at all. For example, when a POWER6 instruction is used, but
'.machine power6' is not declared, the assembler will fail with an error
similar to the following:
../sysdeps/powerpc/powerpc64/power8/strcmp.S: Assembler messages:
24 ../sysdeps/powerpc/powerpc64/power8/strcmp.S:55: Error: unrecognized opcode: `cmpb'
This patch adds '.machine powerN' directives where none existed, as well
as it updates '.machine power7' directives on POWER8 files, because the
minimum binutils version required to build glibc (binutils 2.25) now
provides this machine version. It also adds '-many' to the assembler
command used to build tst-set_ppr.c.
Tested for powerpc, powerpc64, and powerpc64le, as well as with
build-many-glibcs.py for powerpc targets.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
The patch 6e8ba7fd57 meant to remove the all get_clockfreq.c. This
patch removes the missing files for sparcv9 and x86_64.
Checked against a build to x86_64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/unix/sysv/linux/sparc/sparc32/sparcv9/get_clockfreq.c:
Remove file.
* sysdeps/unix/sysv/linux/x86_64/get_clockfreq.c: Likewise.
This patch adds the new F_SEAL_FUTURE_WRITE constant from Linux 5.1 to
bits/fcntl-linux.h.
Tested for x86_64.
* sysdeps/unix/sysv/linux/bits/fcntl-linux.h [__USE_GNU]
(F_SEAL_FUTURE_WRITE): New macro.
GCC 9 dropped support for the SPE extensions to PowerPC, which means
powerpc*-*-*gnuspe* configurations are no longer buildable with that
compiler. This ISA extension was peculiar to the “e500” line of
embedded PowerPC chips, which, as far as I can tell, are no longer
being manufactured, so I think we should follow suit.
This patch was developed by grepping for “e500”, “__SPE__”, and
“__NO_FPRS__”, and may not eliminate every vestige of SPE support.
Most uses of __NO_FPRS__ are left alone, as they are relevant to
normal embedded PowerPC with soft-float.
* sysdeps/powerpc/preconfigure: Error out on powerpc-*-*gnuspe*
host type.
* scripts/build-many-glibcs.py: Remove powerpc-*-linux-gnuspe
and powerpc-*-linux-gnuspe-e500v1 from list of build configurations.
* sysdeps/powerpc/powerpc32/e500: Recursively delete.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/e500: Recursively delete.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/context-e500.h:
Delete.
* sysdeps/powerpc/fpu_control.h: Remove SPE variant.
Issue an #error if used with a compiler in SPE-float mode.
* sysdeps/powerpc/powerpc32/__longjmp_common.S
* sysdeps/powerpc/powerpc32/setjmp_common.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/getcontext-common.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/getcontext.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/setcontext.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/swapcontext.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S:
Remove code to preserve SPE register state.
* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
Remove __SPE__ ifndefs.
This patch add the missing SEMTIMEDOP_IPC_ARGS definions on powerpc
and sparc ipc_priv.h.
Checked on powerpc64le-linux-gnu and with a build for sparc64-linux-gnu.
* sysdeps/unix/sysv/linux/powerpc/ipc_priv.h (SEMTIMEDOP_IPC_ARGS):
New define.
* sysdeps/unix/sysv/linux/sparc/sparc64/ipc_priv.h
(SEMTIMEDOP_IPC_ARGS): Likewise.
This patch consolidates the s390-32 semtimedop implementation by defining
a arch-specific SEMTIMEDOP_IPC_ARGS to rearrange the arguments expected
by s390 Linux kABI. The idea is to avoid have multiples semtimedop
implementation changes for Linux v5.1 change to enable wire-up sysvipc
support.
Checked with a s390-linux-gnu and s390x-linux-gnu and checking that
resulting semtimedop objects did not change.
* sysdeps/unix/sysv/linux/ipc_priv.h (SEMTIMEDOP_IPC_ARGS): New
define.
* sysdpes/unix/sysv/linux/s390/ipc_priv.h: New file.
* sysdeps/unix/sysv/linux/s390/semtimedop.c: Remove file.
* sysdeps/unix/sysv/linux/semtimedop.c (semtimedop): Use
SEMTIMEDOP_IPC_ARGS for calls with __NR_ipc.
The __IPC64 flags is meant to be used to enable the new sysv struct
format when the architectures supports it (ARCH_WANT_IPC_PARSE_VERSION
config flag on Linux kernel).
This currently issue only affects alpha.
[BZ #24570]
* sysdeps/unix/sysv/linux/msgctl.c (__old_msgctl): Remove __IPC_64
usage.
Linux 5.1 adds missing syscalls to the syscall table for many Linux
kernel architectures. This patch updates the kernel-features.h
headers accordingly. __ASSUME_DIRECT_SYSVIPC_SYSCALLS is not updated
because of the differences between new and old syscalls described in
<https://sourceware.org/ml/libc-alpha/2019-05/msg00235.html>. The
statfs64 structure used by alpha matches what the new kernel syscalls
use.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/alpha/kernel-features.h
(__ASSUME_STATFS64): Only undefine if [__LINUX_KERNEL_VERSION <
0x050100].
* sysdeps/unix/sysv/linux/ia64/kernel-features.h (__ASSUME_STATX):
Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h
(__ASSUME_STATX): Likewise.
The function uses the internal service_user type, so it is not
really usable from the outside of glibc. Rename the function
to __nss_database_lookup2 for internal use, and change
__nss_database_lookup to always indicate failure to the caller.
__nss_next already was a compatibility symbol. The new
implementation always fails and no longer calls __nss_next2.
unscd, the alternative nscd implementation, does not use
__nss_database_lookup, so it is not affected by this change.
The tgkill function is sometimes used in crash handlers.
<bits/signal_ext.h> follows the same approach as <bits/unistd_ext.h>
(which was added for the gettid system call wrapper).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch removes the arch-specific x86 assembly implementation for
low level locking and consolidate both 64 bits and 32 bits in a
single implementation.
Different than other architectures, x86 lll_trylock, lll_lock, and
lll_unlock implements a single-thread optimization to avoid atomic
operation, using cmpxchgl instead. This patch implements by using
the new single-thread.h definitions in a generic way, although using
the previous semantic.
The lll_cond_trylock, lll_cond_lock, and lll_timedlock just use
atomic operations plus calls to lll_lock_wait*.
For __lll_lock_wait_private and __lll_lock_wait the generic implemtation
there is no indication that assembly implementation is required
performance-wise.
Checked on x86_64-linux-gnu and i686-linux-gnu.
* sysdeps/nptl/lowlevellock.h (__lll_trylock): New macro.
(lll_trylock): Call __lll_trylock.
* sysdeps/unix/sysv/linux/i386/libc-lowlevellock.S: Remove file.
* sysdeps/unix/sysv/linux/i386/lll_timedlock_wait.c: Likewise.
* sysdeps/unix/sysv/linux/i386/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/libc-lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lll_timedlock_wait.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86/lowlevellock.h: New file.
* sysdeps/unix/sysv/linux/x86_64/cancellation.S: Include
lowlevellock-futex.h.
Since hppa is not an outlier anymore regarding LLL_LOCK_INITIALIZER value,
we can now assume it 0 for all architectures.
Checked on a build for all major ABIs.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal): Remove
initialization for LLL_LOCK_INITIALIZER different than 0.
* nptl/old_pthread_cond_broadcast.c (__pthread_cond_broadcast_2_0):
Assume LLL_LOCK_INITIALIZER being 0.
* nptl/old_pthread_cond_signal.c (__pthread_cond_signal_2_0): Likewise.
* nptl/old_pthread_cond_timedwait.c (__pthread_cond_timedwait_2_0):
Likewise.
* nptl/old_pthread_cond_wait.c (__pthread_cond_wait_2_0): Likewise.
* sysdeps/nptl/libc-lockP.h (__libc_lock_define_initialized): Likewise.
This patch move the single-thread syscall optimization defintions from
syscall-cancel.h to new header file single-thread.h and also move the
cancellation definitions from pthreadP.h to syscall-cancel.h.
The idea is just simplify the inclusion of both syscall-cancel.h and
single-thread.h (without the requirement of including all pthreadP.h
defintions).
No semantic changes expected, checked on a build for all major ABIs.
* nptl/pthreadP.h (CANCEL_ASYNC, CANCEL_RESET, LIBC_CANCEL_ASYNC,
LIBC_CANCEL_RESET, __libc_enable_asynccancel,
__libc_disable_asynccancel, __librt_enable_asynccancel,
__libc_disable_asynccancel, __librt_enable_asynccancel,
__librt_disable_asynccancel): Move to ...
* sysdeps/unix/sysv/linux/sysdep-cancel.h: ... here.
(SINGLE_THREAD_P, RTLD_SINGLE_THREAD_P): Move to ...
* sysdeps/unix/sysv/linux/single-thread.h: ... here.
* sysdeps/generic/single-thread.h: New file.
* sysdeps/unix/sysdep.h: Include single-thread.h.
* sysdeps/unix/sysv/linux/futex-internal.h: Include sysdep-cancel.h.
* sysdeps/unix/sysv/linux/lowlevellock-futex.h: Likewise.
This patches consolidates all the powerpc trunc{f} implementations on
the generic sysdeps/powerpc/fpu/s_trunc{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.
The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/trunc_to_integer.h (set_fenv_mode): Add
TRUNC handling.
(round_mode): Add definition for TRUNC.
* sysdeps/powerpc/fpu/s_trunc.c: New file.
* sysdeps/powerpc/fpu/s_truncf.c: New file.
* sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.c: New
file.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.c:
Likewise.
* sysdep/powerpc/powerpc32/power5+/fpu/s_trunc.S: Remove file.
* sysdep/powerpc/powerpc32/power5+/fpu/s_truncf.S: Likewise.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_trunc-power5+, s_trunc-ppc64,
s_truncf-power5+, and s_truncf-ppc64.
(CFLAGS-s_trunc-power5+.c, CFLAGS-s_truncf-power5+.c): New rule.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_trunc.c: ... here.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_truncf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_trunc-power5+, s_trunc-ppc64,
s_truncf-power5+, and s_truncf-ppc64.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Remove
file.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S:
Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
This patches consolidates all the powerpc round{f} implementations on
the generic sysdeps/powerpc/fpu/s_round{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.
The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add
ROUND handling.
(round_mode): Add definition for ROUND.
(round_to_integer_float): Likewise.
* sysdeps/powerpc/fpu/s_round.c: New file.
* sysdeps/powerpc/fpu/s_roundf.c: New file.
* sysdeps/powerpc/powerpc32/fpu/s_round.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.c: New
file.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.c:
Likewise.
* sysdep/powerpc/powerpc32/power5+/fpu/s_round.S: Remove file.
* sysdep/powerpc/powerpc32/power5+/fpu/s_roundf.S: Likewise.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_round-power5+, s_round-ppc64,
s_roundf-power5+, and s_roundf-ppc64.
(CFLAGS-s_round-power5+.c, CFLAGS-s_roundf-power5+.c): New rule.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_round.c: ... here.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_roundf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_round-power5+, s_round-ppc64,
s_roundf-power5+, and s_roundf-ppc64.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Remove
file.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S:
Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
This patches consolidates all the powerpc floor{f} implementations on
the generic sysdeps/powerpc/fpu/s_floor{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.
The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode):
Add FLOOR option.
(round_mode): Add definition for FLOOR.
* sysdeps/powerpc/fpu/s_floor.c: New file.
* sysdeps/powerpc/fpu/s_floorf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_floor.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.S:
Likewise
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.c:
New file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_floor.S: Remove file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Remove file.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_floor-power5+, s_floor-ppc64,
s_floorf-power5+, and s_floorf-ppc64.
(CFLAGS-s_floor-power5+.c, CFLAGS-s_floorf-power5+.c): New rule.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-power5+.c: New
file.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floor.c: ... here.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-power5+.c: New
file.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floorf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_floor-power5+, s_floor-ppc64,
s_floorf-power5+, and s_floorf-ppc64.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
This patch updates syscall-names.list for Linux 5.1 (which has many
new syscalls, mainly but not entirely ones for 64-bit time).
Tested with build-many-glibcs.py (before the revert of the move to
Linux 5.1 there; verified there were no tst-syscall-list failures).
* sysdeps/unix/sysv/linux/syscall-names.list: Update kernel
version to 5.1.
(clock_adjtime64) New syscall.
(clock_getres_time64) Likewise.
(clock_gettime64) Likewise.
(clock_nanosleep_time64) Likewise.
(clock_settime64) Likewise.
(futex_time64) Likewise.
(io_pgetevents_time64) Likewise.
(io_uring_enter) Likewise.
(io_uring_register) Likewise.
(io_uring_setup) Likewise.
(mq_timedreceive_time64) Likewise.
(mq_timedsend_time64) Likewise.
(pidfd_send_signal) Likewise.
(ppoll_time64) Likewise.
(pselect6_time64) Likewise.
(recvmmsg_time64) Likewise.
(rt_sigtimedwait_time64) Likewise.
(sched_rr_get_interval_time64) Likewise.
(semtimedop_time64) Likewise.
(timer_gettime64) Likewise.
(timer_settime64) Likewise.
(timerfd_gettime64) Likewise.
(timerfd_settime64) Likewise.
(utimensat_time64) Likewise.
The performance improvement is about 20%-30% for
larger cases and about 1%-5% for smaller cases.
Used SIMD load/store instead of GPR for large
overlapping forward moves.
Reused existing memcpy implementation for smaller
or overlapping backward moves.
Fixed the existing memcpy implementation to allow it
to deal with the overlapping case.
Simplified loop tails in the memcpy implementation -
use branchless overlapping sequence of fixed length
load/stores instead of branching depending on the
size.
A cleanup/optimization converting str's to stp's.
Added __memmove_thunderx2 to the list of the
available implementations.
The twalk function is very difficult to use in a multi-threaded
program because there is no way to pass external state to the
iterator function.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Complementing commit 4a06ceea33 ("sysdeps/ieee754/soft-fp: ignore
maybe-uninitialized with -O [BZ #19444]") and commit 27c5e756a2
("sysdeps/ieee754: prevent maybe-uninitialized errors with -O [BZ
#19444]") also fix compilation errors observed at -O1 in `__ddivl' and
`__fdivl' with GCC 9 and RISC-V targets:
In file included from ../soft-fp/soft-fp.h:318,
from ../sysdeps/ieee754/soft-fp/s_fdivl.c:27:
../sysdeps/ieee754/soft-fp/s_fdivl.c: In function '__fdivl':
../soft-fp/op-2.h:108:9: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized]
108 | : (X##_f1 << (2*_FP_W_TYPE_SIZE - (N)))) \
| ^
../sysdeps/ieee754/soft-fp/s_fdivl.c:37:14: note: 'R_f1' was declared here
37 | FP_DECL_Q (R);
| ^
../soft-fp/op-common.h:39:3: note: in expansion of macro '_FP_FRAC_DECL_2'
39 | _FP_FRAC_DECL_##wc (X)
| ^~~~~~~~~~~~~~
../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL'
226 | # define FP_DECL_Q(X) _FP_DECL (2, X)
| ^~~~~~~~
../sysdeps/ieee754/soft-fp/s_fdivl.c:37:3: note: in expansion of macro 'FP_DECL_Q'
37 | FP_DECL_Q (R);
| ^~~~~~~~~
../soft-fp/op-2.h:109:8: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized]
109 | | X##_f0) != 0)); \
| ^
../sysdeps/ieee754/soft-fp/s_fdivl.c:37:14: note: 'R_f0' was declared here
37 | FP_DECL_Q (R);
| ^
../soft-fp/op-common.h:39:3: note: in expansion of macro '_FP_FRAC_DECL_2'
39 | _FP_FRAC_DECL_##wc (X)
| ^~~~~~~~~~~~~~
../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL'
226 | # define FP_DECL_Q(X) _FP_DECL (2, X)
| ^~~~~~~~
../sysdeps/ieee754/soft-fp/s_fdivl.c:37:3: note: in expansion of macro 'FP_DECL_Q'
37 | FP_DECL_Q (R);
| ^~~~~~~~~
In file included from ../soft-fp/soft-fp.h:318,
from ../sysdeps/ieee754/soft-fp/s_ddivl.c:31:
../sysdeps/ieee754/soft-fp/s_ddivl.c: In function '__ddivl':
../soft-fp/op-2.h:98:25: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized]
98 | X##_f0 = (X##_f1 << (_FP_W_TYPE_SIZE - (N)) | X##_f0 >> (N) \
| ^~
../sysdeps/ieee754/soft-fp/s_ddivl.c:41:14: note: 'R_f1' was declared here
41 | FP_DECL_Q (R);
| ^
../soft-fp/op-2.h:37:36: note: in definition of macro '_FP_FRAC_DECL_2'
37 | _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT
| ^
../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL'
226 | # define FP_DECL_Q(X) _FP_DECL (2, X)
| ^~~~~~~~
../sysdeps/ieee754/soft-fp/s_ddivl.c:41:3: note: in expansion of macro 'FP_DECL_Q'
41 | FP_DECL_Q (R);
| ^~~~~~~~~
../soft-fp/op-2.h:101:17: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized]
101 | : (X##_f0 << (_FP_W_TYPE_SIZE - (N))) != 0)); \
| ^~
../sysdeps/ieee754/soft-fp/s_ddivl.c:41:14: note: 'R_f0' was declared here
41 | FP_DECL_Q (R);
| ^
../soft-fp/op-2.h:37:14: note: in definition of macro '_FP_FRAC_DECL_2'
37 | _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT
| ^
../soft-fp/quad.h:226:24: note: in expansion of macro '_FP_DECL'
226 | # define FP_DECL_Q(X) _FP_DECL (2, X)
| ^~~~~~~~
../sysdeps/ieee754/soft-fp/s_ddivl.c:41:3: note: in expansion of macro 'FP_DECL_Q'
41 | FP_DECL_Q (R);
| ^~~~~~~~~
cc1: all warnings being treated as errors
make[2]: *** [.../sysd-rules:587: .../math/s_fdivl.o] Error 1
make[2]: *** Waiting for unfinished jobs....
cc1: all warnings being treated as errors
make[2]: *** [.../sysd-rules:587: .../math/s_ddivl.o] Error 1
This comes from cases in _FP_DIV that return a result described as
FP_CLS_ZERO or FP_CLS_INF and do not initialize the fractional part,
which is then operated on unconditionally in FP_TRUNC_COOKED before
being ignored by _FP_PACK_CANONICAL.
Clearly at this optimization level GCC cannot guarantee to be able to
determine that the fractional part is ultimately unused, so ignore the
error as with the earlier commits referred, letting compilation proceed.
[BZ #19444]
* sysdeps/ieee754/soft-fp/s_ddivl.c (__ddivl): Ignore errors
from `-Wmaybe-uninitialized'.
* sysdeps/ieee754/soft-fp/s_fdivl.c (__fdivl): Likewise.
This patches consolidates all the powerpc ceil{f} implementations on
the generic sysdeps/powerpc/fpu/s_ceil{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the frip
instruction) or a generic implementation which uses FP only operations.
It adds a generic implementation (round_to_integer.h) which is shared
with other rounding to integer routines. The resulting code should be
similar in term os performance to previous assembly one.
The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/fenv_libc.h (__fesetround_inline_nocheck): New
function.
* sysdeps/powerpc/fpu/round_to_integer.h: New file.
* sysdeps/powerpc/fpu/s_ceil.c: Likewise.
* sysdeps/powerpc/fpu/s_ceilf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_ceil.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(CFLAGS-s_ceil-power5+.c, CFLAGS-s_ceilf-power5+.c): New rule.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.c:
New file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceil.S: Remove file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Likewise.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile: New file.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-power5+.c:
Likewise.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil.c: ... here.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-power5+.c: New
file.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf.c: ...
* here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_ceil-power5+, s_ceil-ppc64,
s_ceilf-power5+, and s_ceilf-ppc64.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-power5+.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise.
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
Except the following functions, NPTL implementation assume sem_t
argument (or other arguments) are not NULL, so they would benefit
from having the nonnull attribute.
- sem_close(): can cope with a NULL sem_t and return -1 with error EINVAL;
- sem_destroy(): does nothing at all
* sysdeps/pthread/semaphore.h (sem_init): Add __nonnull attribute.
(sem_destroy, sem_open, sem_close, sem_unlink): Likewise.
(sem_wait, sem_timedwait, sem_trywait, sem_post): Likewise.
(sem_getvalue): Likewise.
While working on enabling D front-end (GDC) in GCC we noticed that
druntime was segfaulting if it is linked dynamically. This was tracked
to DL_RO_DYN_SECTION.
DL_RO_DYN_SECTION lines seem to be copied from MIPS file (which is the
only user of it), but the comment doesn't apply to RISC-V. There is no
such requirement in RISC-V ABI.
[BZ#24484]
* sysdeps/riscv/ldsodefs.h: Remove DL_RO_DYN_SECTION as it is not
required by RISC-V ABI.
This patch just refactor the assembly implementation to use compiler
builtins instead.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/s_fma.S: Remove file.
* sysdeps/powerpc/fpu/s_fmaf.S: Likewise.
* sysdeps/powerpc/fpu/s_fma.c: New file.
* sysdeps/powerpc/fpu/s_fmaf.c: Likewise.
Since be2e25bbd7 the generic ieee754 implementation uses
compiler builtin which generates fabs{f} for all supported targets.
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/fpu/s_fabs.S: Remove file.
* sysdeps/powerpc/fpu/s_fabsf.S: Likewise.
Similar to powerpc, mips also issues rt_sigreturn for setcontext
case the v0 value saved is not the one set by setcontext or
makecontext. As for powerpc, it is intention is no really supported
since setcontext is not async-signal-safe.
Checked the context tests on mips64-linux-gnu and mips-linux-gnu.
* sysdeps/unix/sysv/linux/mips/getcontext.S (__getcontext): Remove
the magic flag store.
* sysdeps/unix/sysv/linux/mips/makecontext.S (__makecontext):
Likewise.
* sysdeps/unix/sysv/linux/mips/swapcontext.S (__swapcontext):
Likewise.
* sysdeps/unix/sysv/linux/mips/setcontext.S (__setcontext):
Remove rt_sigreturn call.
As described in a recent glibc thread [1], the rt_sigreturn syscall
on setcontext and swapcontext is not used on default use and its
intention is no really supported since neither setcontext nor
swapcontext are async-signal-safe.
Checked on powerpc64-linux-gnu and powerpc-linux-gnu.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S:
Remove rt_sigreturn call.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S: Likewie.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S: Likewise.
[1] https://sourceware.org/ml/libc-alpha/2019-02/msg00367.html
This functionality was deprecated in glibc 2.25.
This commit only includes the core changes to remove the
functionality. It does not remove the RES_USE_INET6 handling in the
individual NSS service modules and the res_use_inet6 function.
These changes will happen in future commits.
Here is the updated patch for improving the long unaligned
code path (the one using "ext" instruction).
1. Always taken conditional branch at the beginning is
removed.
2. Epilogue code is placed after the end of the loop to
reduce the number of branches.
3. The redundant "mov" instructions inside the loop are
gone due to the changed order of the registers in the "ext"
instructions inside the loop, the prologue has additional
"ext" instruction.
4.Updating count in the prologue was hoisted out as
it is the same update for each prologue.
5. Invariant code of the loop epilogue was hoisted out.
6. As the current size of the ext chunk is exactly 16
instructions long "nop" was added at the beginning
of the code sequence so that the loop entry for all the
chunks be aligned.
* sysdeps/aarch64/multiarch/memcpy_thunderx2.S: Cleanup branching
and remove redundant code.
* sysdeps/alpha/divqu.S (__divqu): Move save of $f0 and excb after
conditional branch to DIVBYZERO. Fix unwind info.
* sysdeps/alpha/remqu.S (__remqu): Move saves of $f0, $f1, $f2 and
excb after conditional branch to $powerof2. Add missing unop
instructions and .align directives and reorder instructions to
match __divqu.
Signed-off-by: Uroš Bizjak <ubizjak@gmail.com>
Fixes build using v5.1-rc1 headers.
The kernel has cleaned up how these are defined. Previous behavior
was to define __NR_osf_shmat as 209 and not define __NR_shmat.
Current behavior is to define __NR_shmat as 209 and then define
__NR_osf_shmat as __NR_shmat.
* sysdeps/unix/sysv/linux/alpha/kernel-features.h (__NR_shmat):
Do not redefine.
* sysdeps/unix/sysv/linux/alpha/sysdep.h (__NR_osf_shmat):
Do not redefine.
Fix a:
.../sysdeps/unix/sysv/linux/riscv/configure: line 181: test: =: unary operator expected
message produced by the RISC-V configure fragment with the soft-float
ABI selected, caused by $libc_cv_riscv_float_abi evaluating to nil in
the invocation of `test $libc_cv_riscv_float_abi = no'.
* sysdeps/unix/sysv/linux/riscv/configure.ac: Quote
$libc_cv_riscv_float_abi in `test' invocation.
* sysdeps/unix/sysv/linux/riscv/configure: Regenerate.
Replace inline asm uses of the "mffs" and "mtfsf" instructions with
the analogous GCC builtins.
__builtin_mffs and __builtin_mtfsf are both available in GCC 5 and above.
Given the minimum GCC level for GLibC is now GCC 6.2, it is safe to use
these builtins without restriction.
2019-03-29 Paul A. Clarke <pc@us.ibm.com>
* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_register): Replace inline
asm with builtin.
* sysdeps/powerpc/powerpc64/le/fpu/sfp-machine.h (FP_INIT_ROUNDMODE):
Likewise.
* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (_GET_DI_FPSCR): Likewise.
(_GET_SI_FPSCR): Likewise.
(_SET_SI_FPSCR): Likewise.
This file is not used anywhere since removal of {k,e}_rem_pio2f.c
(commit ca3aac57ef).
Checked with a build for powerpc-linux-gnu (with --with-cpu=power4
and --with-cpu=power7), powerpc64-linux-gnu (with --with-cpu=power4
and --with-cpu=power7), and powerpc64le-linux (with --with-cpu=power8).
* sysdeps/powerpc/fpu/s_float_bitwise.h: Remove file.
Add missing generic hp_timing support. It uses clock_gettime (CLOCK_MONOTONIC)
which has unspecified starting time, nano-second accuracy, and should faster on
architectures that implementes the symbol as vDSO.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all afected ABIs.
* benchtests/Makefile (USE_CLOCK_GETTIME) Remove.
* benchtests/README: Update description.
* benchtests/bench-timing.h: Default to hp-timing.
* sysdeps/generic/hp-timing.h (HP_TIMING_DIFF, HP_TIMING_ACCUM_NT,
HP_TIMING_PRINT): Remove.
(HP_TIMING_NOW): Add generic implementation.
(hp_timing_t): Change to uint64_t.
This patch refactor how hp-timing is used on loader code for statistics
report. The HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed and
HP_TIMING_INLINE is used instead to check for hp-timing avaliability.
For alpha, which only defines HP_SMALL_TIMING_AVAIL, the HP_TIMING_INLINE
is set iff for IS_IN(rtld).
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all afected ABIs.
* benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with
HP_TIMING_INLINE.
* nptl/descr.h: Likewise.
* elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF,
RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define.
(dl_start_final_info, _dl_start_final, dl_main, print_statistics):
Abstract hp-timing usage with RTLD_* macros.
* sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld).
(HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove.
* sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL,
HP_TIMING_NONAVAIL): Likewise.
* sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
Likewise.
* sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
Likewise.
* sysdeps/generic/hp-timing-common.h: Update comment with
HP_TIMING_AVAIL removal.
This patch removes the HP_TIMING_BITS usage for fast random bits and replace
with clock_gettime (CLOCK_MONOTONIC). It has unspecified starting time and
nano-second accuracy, so its randomness is significantly better than
gettimeofday.
Althoug it should incur in more overhead (specially for architecture that
support hp-timing), the symbol is also common implemented as a vDSO.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked on a i686-gnu build.
* include/random-bits.h: New file.
* resolv/res_mkquery.c [HP_TIMING_AVAIL] (RANDOM_BITS,
(__res_context_mkquery): Remove usage hp-timing usage and replace with
random_bits.
* resolv/res_send.c [HP_TIMING_AVAIL] (nameserver_offset): Likewise.
* sysdeps/posix/tempname.c [HP_TIMING_AVAIL] (__gen_tempname):
Likewise.
With clock_getres, clock_gettime, and clock_settime refactor to remove the
generic CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID support through
hp-timing, there is no usage of internal __get_clockfreq. This patch removes
both generic and Linux implementation..
Checked with a build against aarch64-linux-gnu, i686-linux-gnu, ia64-linux-gnu,
sparc64-linux-gnu, powerpc-linux-gnu-power4.
* include/libc-internal.h (__get_clockfreq): Remove prototype.
* rt/Makefile (clock-routines): Remove get_clockfreq.
* rt/get_clockfreq.c: Remove file.
* sysdeps/unix/sysv/linux/i386/get_clockfreq.c: Likewise.
* sysdeps/unix/sysv/linux/ia64/get_clockfreq.c: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/get_clockfreq.c: Likewise.
* sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c: Move code to ...
* sysdeps/unix/sysv/linux/powerpc/get_timebase_freq.c: ... here.
The Linux 3.2 clock_getres kernel code (kernel/posix-cpu-timers.c)
issued for clock_getres CLOCK_PROCESS_CPUTIME_ID (process_cpu_clock_getres)
and CLOCK_THREAD_CPUTIME_ID (thread_cpu_clock_getres) call
posix_cpu_clock_getres. And it fails on check_clock only if an invalid
clock is used (not the case) or if we pass an invalid the pid/tid in
29 msb of clock_id (not the case either).
This patch assumes that clock_getres syscall always support
CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID, so there is no need
to fallback to hp-timing support for _SC_MONOTONIC_CLOCK neither to issue
the syscall to certify the clock_id is supported bt the kernel. This
allows simplify the sysconf support to always use the syscall.
it also removes ia64 itc drift check and assume kernel handles it correctly.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
* sysdeps/unix/sysv/linux/ia64/has_cpuclock.c: Remove file.
* sysdeps/unix/sysv/linux/ia64/sysconf.c: Likewise.
* sysdeps/unix/sysv/linux/sysconf.c (has_cpuclock): Remove function.
(__sysconf): Assume kernel support for _SC_MONOTONIC_CLOCK,
_SC_CPUTIME, and _SC_THREAD_CPUTIME.
This patch removes CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID support
from clock_gettime and clock_settime generic implementation. For Linux, kernel
already provides supports through the syscall and Hurd HTL lacks
__pthread_clock_gettime and __pthread_clock_settime internal implementation.
As described in clock_gettime man-page [1] on 'Historical note for SMP
system', implementing CLOCK_{THREAD,PROCESS}_CPUTIME_ID with timer registers
is error-prone and susceptible to timing and accurary issues that the libc
can not deal without kernel support.
This allows removes unused code which, however, still incur in some runtime
overhead in thread creation (the struct pthread cpuclock_offset
initialization).
If hurd eventually wants to support them it should either either implement as
a kernel facility (or something related due its architecture) or in system
specific implementation.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked on a i686-gnu build.
* nptl/Makefile (libpthread-routines): Remove pthread_clock_gettime and
pthread_clock_settime.
* nptl/pthreadP.h (__find_thread_by_id): Remove prototype.
* elf/dl-support.c [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset): Remove.
(_dl_non_dynamic_init): Remove _dl_cpuclock_offset setting.
* elf/rtld.c (_dl_start_final): Likewise.
* nptl/allocatestack.c (__find_thread_by_id): Remove function.
* sysdeps/generic/ldsodefs.h [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset):
Remove.
* sysdeps/mach/hurd/dl-sysdep.c [!HP_TIMING_NOAVAIL]
(_dl_cpuclock_offset): Remove.
* nptl/descr.h (struct pthread): Rename cpuclock_offset to
cpuclock_offset_ununsed.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal): Remove
cpuclock_offset set.
* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* nptl/pthread_clock_gettime.c: Remove file.
* nptl/pthread_clock_settime.c: Likewise.
* sysdeps/unix/clock_gettime.c (hp_timing_gettime): Remove function.
[HP_TIMING_AVAIL] (realtime_gettime): Remove CLOCK_THREAD_CPUTIME_ID
and CLOCK_PROCESS_CPUTIME_ID support.
* sysdeps/unix/clock_settime.c (hp_timing_gettime): Likewise.
[HP_TIMING_AVAIL] (realtime_gettime): Likewise.
* sysdeps/posix/clock_getres.c (hp_timing_getres): Likewise.
[HP_TIMING_AVAIL] (__clock_getres): Likewise.
* sysdeps/unix/clock_nanosleep.c (CPUCLOCK_P, INVALID_CLOCK_P):
Likewise.
(__clock_nanosleep): Remove CPUCLOCK_P and INVALID_CLOCK_P usage.
[1] http://man7.org/linux/man-pages/man2/clock_gettime.2.html
This patch introduces the new arch13 ifunc variant for memmem.
For needles longer than 9 bytes it is relying on the common-code
implementation. For shorter needles it is using the new vstrs instruction
which is able to search a substring within a vector register.
ChangeLog:
* sysdeps/s390/Makefile (sysdep_routines): Add memmem-arch13.
* sysdeps/s390/ifunc-memmem.h (HAVE_MEMMEM_ARCH13, MEMMEM_ARCH13,
MEMMEM_Z13_ONLY_USED_AS_FALLBACK, HAVE_MEMMEM_IFUNC_AND_ARCH13_SUPPORT):
New defines.
* sysdeps/s390/memmem-arch13.S: New file.
* sysdeps/s390/memmem-vx.c: Omit GI symbol for z13 memmem ifunc variant
if it is only used as fallback.
* sysdeps/s390/memmem.c (memmem): Add arch13 variant in ifunc selector.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add ifunc variant for arch13 memmem.
This patch introduces the new arch13 ifunc variant for strstr.
For needles longer than 9 charachters it is relying on the common-code
implementation. For shorter needles it is using the new vstrs instruction
which is able to search a substring within a vector register.
ChangeLog:
* sysdeps/s390/Makefile (sysdep_routines): Add strstr-arch13.
* sysdeps/s390/ifunc-strstr.h (HAVE_STRSTR_ARCH13, STRSTR_ARCH13,
STRSTR_Z13_ONLY_USED_AS_FALLBACK, HAVE_STRSTR_IFUNC_AND_ARCH13_SUPPORT):
New defines.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add ifunc variant for arch13 strstr.
* sysdeps/s390/strstr-arch13.S: New file.
* sysdeps/s390/strstr-vx.c: Omit GI symbol for z13 strstr ifunc variant
if it is only used as fallback.
* sysdeps/s390/strstr.c (strstr): Add arch13 variant in ifunc selector.
This patch introduces the new arch13 ifunc variant for memmove.
For the forward or non-overlapping case it is just using memcpy.
For the backward case it relies on the new instruction mvcrl.
The instruction copies up to 256 bytes at once.
In case of an overlap, it copies the bytes like copying them
one by one starting from right to left.
ChangeLog:
* sysdeps/s390/ifunc-memcpy.h (HAVE_MEMMOVE_ARCH13, MEMMOVE_ARCH13
HAVE_MEMMOVE_IFUNC_AND_ARCH13_SUPPORT): New defines.
* sysdeps/s390/memcpy-z900.S: Add arch13 memmove implementation.
* sysdeps/s390/memmove.c (memmove): Add arch13 variant in
ifunc selector.
* sysdeps/s390/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add ifunc variant for arch13 memmove.
* sysdeps/s390/multiarch/ifunc-resolve.h (S390_STFLE_BITS_ARCH13_MIE3,
S390_IS_ARCH13_MIE3): New defines.
Add two configure checks which detect if arch13 is supported
by the assembler at all - by explicitely setting the machine -
and if it is supported with default settings.
ChangeLog:
* config.h.in (HAVE_S390_MIN_ARCH13_ZARCH_ASM_SUPPORT,
HAVE_S390_ARCH13_ASM_SUPPORT): New undefine.
* sysdeps/s390/configure.ac: Add checks for arch13 support.
* sysdeps/s390/configure: Regenerated.
This patch adds vx and vxe as important hwcaps
which allows one to provide shared libraries
tuned for platforms with non-vx/-vxe, vx or vxe.
ChangeLog:
* sysdeps/s390/dl-procinfo.h (HWCAP_IMPORTANT):
Add HWCAP_S390_VX and HWCAP_S390_VXE.
This patch adds new AArch64 HWCAPs from Linux 5.0 to the AArch64
bits/hwcap.h and dl-procinfo.c.
Tested (compilation only) with build-many-glibcs.py for
aarch64-linux-gnu.
* sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h (HWCAP_SB): New
macro.
(HWCAP_PACA): Likewise.
(HWCAP_PACG): Likewise.
* sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c (_DL_HWCAP_COUNT):
Increase to 32.
(_dl_aarch64_cap_flags): Add new entries for new HWCAPs.
This patch updates sysdeps/unix/sysv/linux/syscall-names.list for
Linux 5.0. Based on testing with build-many-glibcs.py, the only new
entry needed is for old_getpagesize (a newly added __NR_* name for an
old syscall on ia64). (Because 5.0 changes how syscall tables are
handled in the kernel, checking diffs wasn't a useful way of looking
for new syscalls in 5.0 as most of the syscall tables were moved to
the new representation without actually adding any syscalls to them.)
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/syscall-names.list: Update kernel
version to 5.0.
(old_getpagesize): New syscall.
The stub implementations are turned into compat symbols.
Linux actually has two reserved system call numbers (for getpmsg
and putpmsg), but these system calls have never been implemented,
and there are no plans to implement them, so this patch replaces
the wrappers with the generic stubs.
According to <https://bugzilla.redhat.com/show_bug.cgi?id=436349>,
the presence of the XSI STREAMS declarations is a minor portability
hazard because they are not actually implemented.
This commit does not change the TIRPC support code in
sunrpc/rpc_svcout.c. It uses additional XTI functionality and
therefore never worked with glibc.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Mach does not support IP_RECVERR, so replace this function with a
stub in a sysdeps override for Hurd.
This fixes commit 08504de718
("resolv: Enable full ICMP errors for UDP DNS sockets [BZ #24047]").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
inttypes.h and stdint.h are in sysdeps/generic, but there are no other
versions of these headers anywhere in the source tree, so they aren’t
actually system-dependent. Move them to the subdirectory that
installs them (stdlib).
Reviewed-by: Joseph Myers <joseph@codesourcery.com>
* sysdeps/generic/inttypes.h, sysdeps/generic/stdint.h:
Move to stdlib.
* include/inttypes.h: Adjust to match.
* include/stdint.h: New wrapper.
Starting with commit 1616d034b6
the output was corrupted on some platforms as _dl_procinfo
was called for every auxv entry and on some architectures like s390
all entries were represented as "AT_HWCAP".
This patch is removing the condition and let _dl_procinfo decide if
an entry is printed in a platform specific or generic way.
This patch also adjusts all _dl_procinfo implementations which assumed
that they are only called for AT_HWCAP or AT_HWCAP2. They are now just
returning a non-zero-value for entries which are not handled platform
specifc.
ChangeLog:
* elf/dl-sysdep.c (_dl_show_auxv): Remove condition and always
call _dl_procinfo.
* sysdeps/unix/sysv/linux/s390/dl-procinfo.h (_dl_procinfo):
Ignore types other than AT_HWCAP.
* sysdeps/sparc/dl-procinfo.h (_dl_procinfo): Likewise.
* sysdeps/unix/sysv/linux/i386/dl-procinfo.h (_dl_procinfo):
Likewise.
* sysdeps/powerpc/dl-procinfo.h (_dl_procinfo): Adjust comment
in the case of falling back to generic output mechanism.
* sysdeps/unix/sysv/linux/arm/dl-procinfo.h (_dl_procinfo):
Likewise.
Mark the lr register as undefined at the start of execution, so unwind
will stop at this frame. run-backtrace-*.sh from elfutils testsuite will
fail without this patch.
* sysdeps/csky/abiv2/start.S: Mark lr as undefined.
* sysdeps/unix/sysv/linux/csky/abiv2/clone.S: Likewise.
* sysdeps/unix/sysv/linux/csky/abiv2/setcontext.S: Likewise.
C-SKY GDB dose not use this file for ptrace and coredump. ptrace can use
pt_regs definition from linux kernel directly. The old definition only
got 34 regs instead of 38 regs from linux kernel, which will corrupted
the memory after ptrace PTRACE_GETREGSET call.
* sysdeps/unix/sysv/linux/csky/sys/procfs.h: Use linux definition
directly.
* sysdeps/unix/sysv/linux/csky/sys/user.h: Remove user_regs
definition.
C-SKY defines SIGCONTEXT as siginfo_t *_si, struct ucontext_t * for
__profil_counter. ucontext_t get an extra __mask field which is miss
match with the struct sigcontext from linux kernel. The time value
from gprof report will be always zero without this patch. This
patch also fix the registers sequence in register-dump.h.
* sysdeps/unix/sysv/linux/csky/register-dump.h: Adjust offset change.
* sysdeps/unix/sysv/linux/csky/sys/ucontext.h: Remove __mask field
in mcontext_t
This patch fixes further coding style issues where code should have
broken lines before operators in accordance with the GNU Coding
Standards but instead was breaking lines after them.
Tested for x86_64, and with build-many-glibcs.py.
* stdio-common/vfscanf-internal.c (ARG): Break lines before rather
than after operators.
* sysdeps/mach/hurd/setitimer.c (timer_thread): Likewise.
(setitimer_locked): Likewise.
* sysdeps/mach/hurd/sigaction.c (__sigaction): Likewise.
* sysdeps/mach/hurd/sigaltstack.c (__sigaltstack): Likewise.
* sysdeps/mach/pagecopy.h (PAGE_COPY_FWD): Likewise.
* sysdeps/mach/thread_state.h (machine_get_basic_state): Likewise.
* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c
(PPC_CPU_SUPPORTED): Likewise.
* sysdeps/unix/sysv/linux/alpha/a.out.h (N_TXTOFF): Likewise.
* sysdeps/unix/sysv/linux/generic/wordsize-32/overflow.h
(stat_overflow): Likewise.
(statfs_overflow): Likewise.
* sysdeps/unix/sysv/linux/tst-personality.c (do_test): Likewise.
* sysdeps/unix/sysv/linux/tst-ttyname.c (eq_ttyname): Likewise.
(eq_ttyname_r): Likewise.
(run_chroot_tests): Likewise.
This patch assumes realtime clock support for nptl and thus removes
all the associated code.
For __pthread_mutex_timedlock the fallback usage for the case where
lll_futex_timed_wait_bitset it not set define is also removed. The
generic lowlevellock-futex.h always define it, so for NPTL code the
check always yield true.
Checked on x86_64-linux-gnu and i686-linux-gnu.
* nptl/nptl-init.c (__have_futex_clock_realtime,
__have_futex_clock_realtime): Remove definition.
(__pthread_initialize_minimal_internal): Remove FUTEX_CLOCK_REALTIME
check test for !__ASSUME_FUTEX_CLOCK_REALTIME.
* nptl/pthread_mutex_timedlock.c (__pthread_mutex_timedlock): Assume
__ASSUME_FUTEX_CLOCK_REALTIME support.
* sysdeps/unix/sysv/linux/i386/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/kernel-features.h
(__ASSUME_FUTEX_CLOCK_REALTIME): Remove.
* sysdeps/nptl/lowlevellock-futex.h (lll_futex_timed_wait_bitset):
Adjust comment.
Since the commit
commit 81a1443941
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date: Tue Feb 5 17:35:12 2019 -0200
wcsmbs: optimize wcscat
powerpc64 and powerpc64le builds fail when configured with
--disable-multi-arch and --with-cpu=power6 (or newer), due to an
undefined reference to __GI___wcscpy. This patch fixes this on
sysdeps/powerpc/powerpc64/power6/wcscpy.c, which is only used when
multi-arch is disabled.
This patch does nothing for the failures on 32-bits powerpc builds,
because the file is under the powerpc64 subdirectory, however, powerpc
builds were already failing with --disable-multi-arch, with multiple
error messages, even before the aforementioned commit.
Tested for powerpc, powerpc64, and powerpc64le with multi-arch enabled
(all pass) and disabled (powerpc still fails as explained above).
Set the default function alignment to 16 bytes in order to
get rid of some unwanted performance effects.
Please see also GCC commit "S/390: Set default function
alignment to 16." (Subversion revision 262817)
ChangeLog:
* sysdeps/s390/s390-64/sysdep.h(ENTRY): Use alignment of 16byte.
* sysdeps/s390/s390-32/sysdep.h: Likewise.
This patch adds test cases for the compatibility versions of the
functions: err, errx, verr, verrx, warn, warnx, vwarn, vwarnx (from
err.h), error, and error_at_line (from error.h), when long double has
the same format as double (-mlong-double-64).
Tested for powerpc, powerpc64 and powerpc64le.
On platforms where long double may have the same format as double
(-mlong-double-64), error and error_at_line do not take that into
account and might produce wrong output if a long double conversion is
requested by the format string ('%Lf'). This patch adds compatibility
functions for this situation and redirects calls via header magic.
Tested for powerpc, powerpc64 and powerpc64le.
When support for long double format with 128-bits (-mlong-double-128)
was added for platforms where long double had the same format as double,
such as powerpc, compatibility versions for the functions listed in the
commit title were missed. Since the older format of long double can
still be used (with -mlong-double-64), using these functions with a
format string that requests the printing of long double variables will
produce wrong outputs.
This patch adds the missing compatibility functions and header magic to
redirect calls to them when -mlong-double-64 is in use.
Tested for powerpc, powerpc64 and powerpc64le.
The test case tst-ldbl-argp checks that the conversion specifier '%Lf'
correctly prints long double values with the default long double format
for a platform. This patch reuses the test case for long double with
the same format as double (-mlong-double-64).
Tested for powerpc, powerpc64 and powerpc64le.
The functions argp_error and argp_failure are missing support for
printing long double values when long double has the same format as
double. This patch adds the new functions __nldbl_argp_error and
__nldbl_argp_failure, as well as header magic to redirect calls to them
when -mlong-double-64 is in use.
Tested for powerpc, powerpc64 and powerpc64le.
The recent commit 81a1443941
has introduced __wcscpy, __GI___wcscpy and the weak alias wcscpy.
This patch also introduces those symbols if glibc is build
with CFLAGS="-march=z13" where the ifunc is omitted.
ChangeLog:
* sysdeps/s390/wcscpy-vx.S: Add strong aliases to
__wcscpy, __GI___wcscpy and weak alias to wcscpy.
This patch fixes more places where a space should have been present
before '(' in accordance with the GNU Coding Standards (as with the
previous patch, mainly for calls to sizeof).
Tested with build-many-glibcs.py.
* sysdeps/powerpc/powerpc32/dl-machine.c
(__elf_machine_fixup_plt): Use space before '('.
(__process_machine_rela): Likewise.
* sysdeps/powerpc/powerpc32/register-dump.h (register_dump):
Likewise.
* sysdeps/powerpc/powerpc64/le/fpu/sfp-machine.h (TI_BITS):
Likewise.
* sysdeps/powerpc/powerpc64/register-dump.h (register_dump):
Likewise.
* sysdeps/powerpc/test-arith.c (union_t): Likewise.
(pattern): Likewise.
(delta): Likewise.
(check_result): Likewise.
(check_excepts): Likewise.
(check_op): Likewise.
(fail_xr): Likewise.
* sysdeps/unix/alpha/sysdep.h (syscall_promote): Likewise.
* sysdeps/unix/sysv/linux/alpha/a.out.h (AOUTHSZ): Likewise.
(SCNHSZ): Likewise.
* sysdeps/unix/sysv/linux/hppa/makecontext.c (FRAME_SIZE_BYTES):
Likewise.
(ARGS): Likewise.
(__makecontext): Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (ucontext_t):
Likewise.
This patch rewrites wcscat using wcslen and wcscpy. This is similar to
the optimization done on strcat by 6e46de42fe.
The strcpy changes are mainly to add the internal alias to avoid PLT
calls.
Checked on x86_64-linux-gnu and a build against the affected
architectures.
* include/wchar.h (__wcscpy): New prototype.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c
(__wcscpy): Route internal symbol to generic implementation.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy):
Add internal __wcscpy alias.
* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise.
* sysdeps/s390/wcscpy.c (wcscpy): Likewise.
* sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise.
* wcsmbs/wcscpy.c (wcscpy): Add
* sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to
use generic implementation.
* wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
This patch rewrites wcpcpy using wcslen and wmemcpy. This is
similar to the optimizatio done on stpcpy by f559d8cf29.
Checked on x86_64-linux-gnu and string tests on a simulated
m68k-linux-gnu.
* sysdeps/m68k/wcpcpy.c: Remove file.
* wcsmbs/wcpcpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
This patch fixes -Wimplicit-fallthrough warnings in system-specific
code that show up building glibc with -Wextra, by adding fall-through
comments, or moving existing such comments to the place required for
them to work (immediately before the case label being fallen through).
Tested with build-many-glibcs.py.
* sysdeps/i386/dl-machine.h (elf_machine_rela): Add fall-through
comments.
* sysdeps/m68k/m680x0/fpu/s_cexp_template.c (s(__cexp)): Likewise.
* sysdeps/m68k/memcopy.h (WORD_COPY_FWD): Likewise.
(WORD_COPY_BWD): Likewise.
* sysdeps/mach/hurd/ioctl.c (__ioctl): Likewise.
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/s390/iso-8859-1_cp037_z900.c (TR_LOOP): Likewise.
* sysdeps/mips/dl-machine.h (elf_machine_reloc): Move fall-through
comment.
* sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise.
The configure fragment for powerpc64le contains a test for the presence
of several compiler builtins and of the __float128 type, which are
provided by GCC 6.2 for powerpc64le. Since this configure test was
added, the compiler version required to build glibc for powerpc64le was
different than that required for the other architectures.
Now that glibc requires GCC 6.2 globally (since commit ID 4dcbbc3b28),
this patch removes the powerpc64le-specific test.
Even tough the configure test checks for compiler features rather than
compiler version, the intent of the test was to stop build attempts at
early stages, if they had been configured with a too old compiler. It
was not the intention of the test to detect compiler breakage (such as
the removal of the required compiler features in future GCC versions),
and glibc is not the place to test for compiler regressions, anyway.
Tested for powerpc64le with GCC 6.2 (built with build-many-glibcs.py).
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Building glibc with -Wextra shows a -Wimplicit-fallthrough warning for
SPARC64 that appears to be a real bug in glibc. The dynamic linker
handling of R_SPARC_H34 falls through to that of R_SPARC_H44, which in
the case of this code is nonsensical (it means the value computed for
R_SPARC_H34 gets overwritten by one computed with the different logic
for R_SPARC_H44). Thus, this patch adds the missing break there.
Note: I do not have a testcase to demonstrate this bug.
Tested with build-many-glibcs.py.
[BZ #24231]
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Add break
after R_SPARC_H34 case.
From time to time the test misc/tst-clone3 fails with a timeout.
Then futex_wait is blocking. Usually ctid should be set to zero
due to CLONE_CHILD_CLEARTID and the futex should be waken up.
But the fail occures if the thread has already exited before
ctid is set to the return value of clone(). Then futex_wait() will
block as there will be nobody who wakes the futex up again.
This patch initializes ctid to a known value before calling clone
and the kernel is the only one who updates the value to zero after clone.
If futex_wait is called then it is either waked up due to the exited thread
or the futex syscall fails as *ctid_ptr is already zero instead of the
specified value 1.
ChangeLog:
* sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
Initialize ctid with a known value and remove update of ctid
after clone.
(wait_tid): Adjust arguments and call futex_wait with ctid_val
as assumed current value of ctid_ptr.
sys/procfs.h was already using this sysdeps directory.
This avoids the need for nptl-specific wrapper headers under
include/, a generic location in the source tree.
With internal fcntl64 internal (commit 06ab719d), it is possible to
consolidate lockf implementation by using the LFS fcntl interface
instead of using arch and system-specific implementations.
For Linux, the i386 implementation is used as generic implementation
by replacing the direct syscall with fcntl64 call. The LFS symbol
alias for default LFS ABI (__OFF_T_MATCHES_OFF64_T) is used to avoid
the duplicate symbol (instead of overriding the implementation with an
empty file).
For Hurd lockf64 semantic is changed: previous generic lockf64
implementation returned EOVERFLOW if LEN input is larger than 32-bit
off_t. However, Hurd fcntl64 implementation for F_GETLK64, F_SETLK64,
and F_SETLKW64 do accept off64_t inputs (__f_setlk accepts only off64_t
inputs).
Checked on i686-linux-gnu and x86_64-linux-gnu along with a i686-gnu
build.
* io/Makefile (tests): Add tst-lockf.
* io/lockf.c (lockf): Use __fcntl and only define for
!__OFF_T_MATCHES_OFF64_T.
* io/lockf64.c (__lockf64): Call __fcntl64 and alias to lockf for
__OFF_T_MATCHES_OFF64_T case.
* io/tst-lockf.c: New file.
* sysdeps/unix/sysv/linux/i386/lockf64.c: Remove file.
* sysdeps/unix/sysv/linux/arm/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/m68k/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/sh/lockf64.c: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/lockf64.c: Likewise.
Patch ce7eb0e903 ("nptl: Cleanup cancellation macros") changed the
join sequence for internal common __pthread_timedjoin_ex to use the
new macro lll_wait_tid. The idea was this macro would issue the
cancellable futex operation depending whether the timeout is used or
not. However if a timeout is used, __lll_timedwait_tid is called and
it is not a cancellable entrypoint.
This patch fixes it by simplifying the code in various ways:
- Instead of adding the cancellation handling on __lll_timedwait_tid,
it moves the generic implementation to pthread_join_common.c (called
now timedwait_tid with some fixes to use the correct type for pid).
- The llvm_wait_tid macro is removed, along with its replication on
x86_64, i686, and sparc arch-specific lowlevellock.h.
- sparc32 __lll_timedwait_tid is also removed, since the code is similar
to generic one.
- x86_64 and i386 provides arch-specific __lll_timedwait_tid which is
also removed since they are similar in functionality to generic C code
and there is no indication it is better than compiler generated code.
New tests, tst-join8 and tst-join9, are provided to check if
pthread_timedjoin_np acts as a cancellation point.
Checked on x86_64-linux-gnu, i686-linux-gnu, sparcv9-linux-gnu, and
aarch64-linux-gnu.
[BZ #24215]
* nptl/Makefile (lpthread-routines): Remove lll_timedwait_tid.
(tests): Add tst-join8 tst-join9.
* nptl/lll_timedwait_tid.c: Remove file.
* sysdeps/sparc/sparc32/lll_timedwait_tid.c: Likewise.
* sysdeps/unix/sysv/linux/i386/lll_timedwait_tid.c: Likewise.
* sysdeps/sysv/linux/x86_64/lll_timedwait_tid.c: Likewise.
* nptl/pthread_join_common.c (timedwait_tid): New function.
(__pthread_timedjoin_ex): Act as cancellation entrypoint is block
is set.
* nptl/tst-join5.c (thread_join): New function.
(tf1, tf2, do_test): Use libsupport and add pthread_timedjoin_np
check.
* nptl/tst-join8.c: New file.
* nptl/tst-join9.c: Likewise.
* sysdeps/nptl/lowlevellock-futex.h (lll_futex_wait_cancel,
lll_futex_timed_wait_cancel): Add generic macros.
* sysdeps/nptl/lowlevellock.h (__lll_timedwait_tid, lll_wait_tid):
Remove definitions.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h: Likewise.
* sysdeps/sparc/sparc32/lowlevellock.c (__lll_timedwait_tid):
Remove function.
* sysdeps/unix/sysv/linux/i386/lowlevellock.S (__lll_timedwait_tid):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/lowlevellock-futex.h
(lll_futex_timed_wait_cancel): New macro.
The clone.S patch fixes 2 elfutils testsuite unwind failures, where the
backtrace gets stuck repeating __thread_start until we hit the backtrace
limit. This was confirmed by building and installing a patched glibc and
then building elfutils and running its testsuite.
Unfortunately, the testcase isn't working as expected and I don't know why.
The testcase passes even when my clone.S patch is not installed. The testcase
looks logically similarly to the elfutils testcases that are failing. Maybe
there is a subtle difference in how the glibc unwinding works versus the
elfutils unwinding? I don't have good gdb pthread support yet, so I haven't
found a way to debug this. Anyways, I don't know if the testcase is useful or
not. If the testcase isn't useful then maybe the clone.S patch is OK without
a testcase?
Jim
[BZ #24040]
* elf/Makefile (CFLAGS-tst-unwind-main.c): Add -DUSE_PTHREADS=0.
* elf/tst-unwind-main.c: If USE_PTHEADS, include pthread.h and error.h
(func): New.
(main): If USE_PTHREADS, call pthread_create to run func. Otherwise
call func directly.
* nptl/Makefile (tests): Add tst-unwind-thread.
(CFLAGS-tst-unwind-thread.c): Define.
* nptl/tst-unwind-thread.c: New file.
* sysdeps/unix/sysv/linux/riscv/clone.S (__thread_start): Mark ra
as undefined.
This patch adds fall-through comments in some cases where -Wextra
produces implicit-fallthrough warnings.
The patch is non-exhaustive. Apart from architecture-specific code
for non-x86_64 architectures, it does not change sunrpc/xdr.c (legacy
code, probably should have such changes, but left to be dealt with
separately), or places that already had comments about the
fall-through but not matching the form expected by
-Wimplicit-fallthrough=3 (the default level with -Wextra; my
inclination is to adjust those comments to match rather than
downgrading to -Wimplicit-fallthrough=1 to allow any comment), or one
place where I thought the implicit fallthrough was not correct and so
should be handled separately as a bug fix. I think the key thing to
consider in review of this patch is whether the fall-through is indeed
intended and correct in each place where such a comment is added.
Tested for x86_64.
* elf/dl-exception.c (_dl_exception_create_format): Add
fall-through comments.
* elf/ldconfig.c (parse_conf_include): Likewise.
* elf/rtld.c (print_statistics): Likewise.
* locale/programs/charmap.c (parse_charmap): Likewise.
* misc/mntent_r.c (__getmntent_r): Likewise.
* posix/wordexp.c (parse_arith): Likewise.
(parse_backtick): Likewise.
* resolv/ns_ttl.c (ns_parse_ttl): Likewise.
* sysdeps/x86/cpu-features.c (init_cpu_features): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
The type used within e_sqrt.c(__slow_ieee754_sqrtf) was, unnecessarily and
likely inadvertently, double. float is not only appropriate, but also
more efficient, avoiding the need for the compiler to emit a
round-to-single-precision instruction.
This is the difference in compiled code:
0000000000000000 <__ieee754_sqrtf>:
0: 2c 08 20 ec fsqrts f1,f1
- 4: 18 08 20 fc frsp f1,f1
- 8: 20 00 80 4e blr
+ 4: 20 00 80 4e blr
(Found by Anton Blanchard.)
Continuing the process of moving away from having bits/mathinline.h
headers in glibc, leaving the compiler to inline functions as
appropriate depending on the options passed to it, this patch removes
the header for powerpc.
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88558> is the
corresponding GCC bug for adding replacements for these powerpc
(32-bit-only) lrint / lrintf inlines.
Tested with build-many-glibcs.py for its powerpc configurations.
* sysdeps/powerpc/bits/mathinline.h: Remove.
No functional change; the previous path worked as well, but it
re-added the obsolete sysdeps/generic/bits directory, which was
removed (for the first time) in commit
c72565e5f1.
Fixes commit e47d82c99a ("Provide
<bits/unistd_ext.h> as a sysdeps header exclusively").
Non-sysdeps headers cannot be overriden by sysdeps headers across the
entire build, so it is necessary to turn such extension headers into
sysdeps headers themselves. The approach here follows the existing
<bits/shm.h> header (although it uses sysdeps/gnu instead of
sysdeps/generic).
Fixes commit 1d0fc21382 ("Linux: Add
gettid system call wrapper [BZ #6399]") and commit
8f89ab216f ("posix: Fix missing wrapper
header for <bits/unistd_ext.h>").
Commit 27761a1042 ("Refactor atfork
handlers") introduced a lock, atfork_lock, around fork handler list
accesses. It turns out that this lock occasionally results in
self-deadlocks in malloc/tst-mallocfork2:
(gdb) bt
#0 __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
#1 0x00007f160c6f927a in __run_fork_handlers (who=(unknown: 209394016),
who@entry=atfork_run_prepare) at register-atfork.c:116
#2 0x00007f160c6b7897 in __libc_fork () at ../sysdeps/nptl/fork.c:58
#3 0x00000000004027d6 in sigusr1_handler (signo=<optimized out>)
at tst-mallocfork2.c:80
#4 sigusr1_handler (signo=<optimized out>) at tst-mallocfork2.c:64
#5 <signal handler called>
#6 0x00007f160c6f92e4 in __run_fork_handlers (who=who@entry=atfork_run_parent)
at register-atfork.c:136
#7 0x00007f160c6b79a2 in __libc_fork () at ../sysdeps/nptl/fork.c:152
#8 0x0000000000402567 in do_test () at tst-mallocfork2.c:156
#9 0x0000000000402dd2 in support_test_main (argc=1, argv=0x7ffc81ef1ab0,
config=config@entry=0x7ffc81ef1970) at support_test_main.c:350
#10 0x0000000000402362 in main (argc=<optimized out>, argv=<optimized out>)
at ../support/test-driver.c:168
If no locking happens in the single-threaded case (where fork is
expected to be async-signal-safe), this deadlock is avoided.
(pthread_atfork is not required to be async-signal-safe, so a fork
call from a signal handler interrupting pthread_atfork is not
a problem.)
This commit adds gettid to <unistd.h> on Linux, and not to the
kernel-independent GNU API.
gettid is now supportable on Linux because too many things assume a
1:1 mapping between libpthread threads and kernel threads.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
One group of warnings seen with -Wextra is warnings for static or
inline not at the start of a declaration (-Wold-style-declaration).
This patch fixes various such cases for inline, ensuring it comes at
the start of the declaration (after any static). A common case of the
fix is "static inline <type> __always_inline"; the definition of
__always_inline starts with __inline, so the natural change is to
"static __always_inline <type>". Other cases of the warning may be
harder to fix (one pattern is a function definition that gets
rewritten to be static by an including file, "#define funcname static
wrapped_funcname" or similar), but it seems worth fixing these cases
with inline anyway.
Tested for x86_64.
* elf/dl-load.h (_dl_postprocess_loadcmd): Use __always_inline
before return type, without separate inline.
* elf/dl-tunables.c (maybe_enable_malloc_check): Likewise.
* elf/dl-tunables.h (tunable_is_name): Likewise.
* malloc/malloc.c (do_set_trim_threshold): Likewise.
(do_set_top_pad): Likewise.
(do_set_mmap_threshold): Likewise.
(do_set_mmaps_max): Likewise.
(do_set_mallopt_check): Likewise.
(do_set_perturb_byte): Likewise.
(do_set_arena_test): Likewise.
(do_set_arena_max): Likewise.
(do_set_tcache_max): Likewise.
(do_set_tcache_count): Likewise.
(do_set_tcache_unsorted_limit): Likewise.
* nis/nis_subr.c (count_dots): Likewise.
* nptl/allocatestack.c (advise_stack_range): Likewise.
* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Likewise.
(do_sin): Likewise.
(reduce_sincos): Likewise.
(do_sincos): Likewise.
* sysdeps/unix/sysv/linux/x86/elision-conf.c
(do_set_elision_enable): Likewise.
(TUNABLE_CALLBACK_FNDECL): Likewise.
In the i386 case, it appears that the sole remaining LIBC_PROBE was
removed in commit a9fe4c5aa8 ("Support
six-argument syscalls from C for 32-bit x86, use generic
lowlevellock-futex.h (bug 18138)."), when
sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h was replaced with
the generic version.
For x86_64, the relevant change is commit
76f71081cd ("Use generic
lowlevellock-futex.h in x86_64 lowlevellock.h."), again by using the
generic version of <lowlevellock-futex.h>.
Tested on i386 and x86_64, with and without --enable-systemtap.
The recent commit 65f7767a91
has introduced __wmemcmp and the weak alias wmemcmp.
This patch also introduces those symbols if glibc is build
with CFLAGS="-march=z13" where the ifunc is omitted.
ChangeLog:
* sysdeps/s390/wmemcmp-vx.S: Add strong alias to
__wmemcmp and weak alias to wmemcmp.
With the default "nor" constraint, current GCC will use the "o"
constraint for constants, after emitting the constant to memory. That
results in unparseable Systemtap probe notes such as "-4@.L1052".
Removing the "o" alternative and using "nr" instead avoids this.
This fixes the same bug in fnmatch that was fixed by commit 7e2f0d2d77 for
regexp matching. As a side effect it also removes the use of an unbound
VLA.
Since the size argument is unsigned. we should use unsigned Jcc
instructions, instead of signed, to check size.
Tested on x86-64 and x32, with and without --disable-multi-arch.
[BZ #24155]
CVE-2019-7309
* NEWS: Updated for CVE-2019-7309.
* sysdeps/x86_64/memcmp.S: Use RDX_LP for size. Clear the
upper 32 bits of RDX register for x32. Use unsigned Jcc
instructions, instead of signed.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2.
* sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.
On Linux, we define _POSIX_PRIORITY_SCHEDULING, but functions such
as sched_setparam and sched_setscheduler apply to individual threads,
not processes.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Clock_gettime, settime and getres implementations are unncessarily
complex due to using defines and C file inclusion. Simplify the
code by replacing the redundant defines and removing the inclusion,
making it much easier to understand. No functional changes.
* sysdeps/posix/clock_getres.c (__clock_getres): Cleanup.
* sysdeps/unix/clock_gettime.c (__clock_gettime): Cleanup.
* sysdeps/unix/clock_settime.c (__clock_settime): Cleanup.
* sysdeps/unix/sysv/linux/clock_getres.c (__clock_getres): Cleanup.
* sysdeps/unix/sysv/linux/clock_gettime.c (__clock_gettime): Cleanup.
* sysdeps/unix/sysv/linux/clock_settime.c (__clock_settime): Cleanup.
This version uses general register based memory instruction to load
data, because vector register based is slightly slower in emag.
Character-matching is performed on 16-byte (both size and alignment)
memory block in parallel each iteration.
* sysdeps/aarch64/memchr.S (__memchr): Rename to MEMCHR.
[!MEMCHR](MEMCHR): Set to __memchr.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memchr_generic and memchr_nosimd.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add memchr ifuncs.
* sysdeps/aarch64/multiarch/memchr.c: New file.
* sysdeps/aarch64/multiarch/memchr_generic.S: Likewise.
* sysdeps/aarch64/multiarch/memchr_nosimd.S: Likewise.
This version uses general register based memory store instead of
vector register based, for the former is faster than the latter
in emag.
The fact that DC ZVA size in emag is 64-byte, is used by IFUNC
dispatch to select this memset, so that cost of runtime-check on
DC ZVA size can be saved.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memset_emag.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __memset_emag to memset ifunc.
* sysdeps/aarch64/multiarch/memset.c (libc_ifunc):
Add IS_EMAG check for ifunc dispatch.
* sysdeps/aarch64/multiarch/memset_base64.S: New file.
* sysdeps/aarch64/multiarch/memset_emag.S: New file.
Emag is a 64-bit CPU core released by AmpereComputing.
Add its name to cpu list, and corresponding macro as utilities for
later IFUNC dispatch.
* manual/tunables.texi (Tunable glibc.cpu.name): Add emag.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list):
Add emag.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_EMAG):
New macro.
There was missing restore of $f3 before the return from the function
via the $y_is_neg path. This caused the math/big testcase from Go-1.11
testsuite (that includes lots of corner cases that exercise remqu) FAIL.
[BZ #24130]
* sysdeps/alpha/remqu.S (__remqu): Add missing restore
of $f3 register on $y_is_neg path.
The IPv4 address parser in the getaddrinfo function is changed so that
it does not ignore trailing whitespace and all characters after it.
For backwards compatibility, the getaddrinfo function still recognizes
legacy name syntax, such as 192.000.002.010 interpreted as 192.0.2.8
(octal).
This commit does not change the behavior of inet_addr and inet_aton.
gethostbyname already had additional sanity checks (but is switched
over to the new __inet_aton_exact function for completeness as well).
To avoid sending the problematic query names over DNS, commit
6ca53a2453 ("resolv: Do not send queries
for non-host-names in nss_dns [BZ #24112]") is needed.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
Clear the upper 32 bits of RSI register.
* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
and tst-size_t-wcsnlen.
* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strncpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcpy-avx2.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Likewise.
* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
* sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes the strncmp family for x32. Tested on x86-64 and x32.
On x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcmp-avx2.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/strcmp-sse42.S: Likewise.
* sysdeps/x86_64/strcmp.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp,
tst-size_t-strncmp and tst-size_t-wcsncmp.
* sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memset/wmemset for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use
RDX_LP for length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
* sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memrchr for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
* sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy.
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcmp/wmemcmp for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and
tst-size_t-wmemcmp.
* sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memchr/wmemchr for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/memchr.S: Use RDX_LP for length. Clear the
upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/test-size_t.h: New file.
* sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.
A single underscore was omitted in
sysdeps/powerpc/powerpc64/multiarch/strncmp.c, resulting in use of
power8 version of strncmp instead of power9 version, with significant
performance degradation.
* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Fix #ifdef.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
An error "impossible register constraint in 'asm'" was raised on POWER
5 and due to __vector __int128_t being used as operands without passing the
option -msvx to gcc.
This patch replaces "__vector __int128_t" with "__vector unsigned int"
which requires only -maltivec, available since POWER ISA 2.03, and which
is already passed to the compiler.
* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c:
(do_test): Changed __vector __int128_t to __vector unsigned int.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Optimize x86-64 strcat/strncat, strcpy/strncpy and stpcpy/stpncpy with AVX2.
It uses vector comparison as much as possible. In general, the larger the
source string, the greater performance gain observed, reaching speedups of
1.6x compared to SSE2 unaligned routines. Select AVX2 strcat/strncat,
strcpy/strncpy and stpcpy/stpncpy on AVX2 machines where vzeroupper is
preferred and AVX unaligned load is fast.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
strcat-avx2, strncat-avx2, strcpy-avx2, strncpy-avx2,
stpcpy-avx2 and stpncpy-avx2.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c:
(__libc_ifunc_impl_list): Add tests for __strcat_avx2,
__strncat_avx2, __strcpy_avx2, __strncpy_avx2, __stpcpy_avx2
and __stpncpy_avx2.
* sysdeps/x86_64/multiarch/{ifunc-unaligned-ssse3.h =>
ifunc-strcpy.h}: rename header for a more generic name.
* sysdeps/x86_64/multiarch/ifunc-strcpy.h:
(IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if
AVX unaligned load is fast and vzeroupper is preferred.
* sysdeps/x86_64/multiarch/stpcpy-avx2.S: New file
* sysdeps/x86_64/multiarch/stpncpy-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strcat-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strcpy-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strncat-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strncpy-avx2.S: Likewise
This patch fix VSCR position on ucontext. VSCR was read in the wrong
position on ucontext structure because it was ignoring the machine
endianess.
[BZ #24088]
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (vscr_t): Added
ifdef to fix read of VSCR.
* sysdeps/powerpc/powerpc64/Makefile [$subdir == stdlib]: Add
tst-ucontext-ppc64-vscr.c to test list.
* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c: New test file.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
With this patch applied, I get 13 glibc testsuite failures using
TIMEOUTFACTOR=4 on a HiFive Unleashed running Fedora Core 29, using top of
tree binutils and gcc. 5 of those failures are due to a kernel bug. Without
the patch, there are over a hundred failures.
This patch is incidentally similar to the powerpc-nofpu ulps update that
Joseph Myers added a few days ago.
* sysdeps/riscv/rv64/rvd/libm-test-ulps: Update.
Add Ares to the midr_el0 list and support ifunc dispatch. Since Ares
supports 2 128-bit loads/stores, use Neon registers for memcpy by
selecting __memcpy_falkor by default (we should rename this to
__memcpy_simd or similar).
* manual/tunables.texi (glibc.cpu.name): Add ares tunable.
* sysdeps/aarch64/multiarch/memcpy.c (__libc_memcpy): Use
__memcpy_falkor for ares.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_ARES):
Add new define.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list):
Add ares cpu.
With -O included in CFLAGS it fails to build with:
../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_jnl':
../sysdeps/ieee754/ldbl-96/e_jnl.c:146:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
b = invsqrtpi * temp / sqrtl (x);
~~~~~~~~~~^~~~~~
../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_ynl':
../sysdeps/ieee754/ldbl-96/e_jnl.c:375:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
b = invsqrtpi * temp / sqrtl (x);
~~~~~~~~~~^~~~~~
../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_jn':
../sysdeps/ieee754/dbl-64/e_jn.c:113:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
b = invsqrtpi * temp / sqrt (x);
~~~~~~~~~~^~~~~~
../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_yn':
../sysdeps/ieee754/dbl-64/e_jn.c:320:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
b = invsqrtpi * temp / sqrt (x);
~~~~~~~~~~^~~~~~
Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64
with -O, -O1, -Os.
For AARCH64 it needs one more fix in locale for -Os:
https://sourceware.org/ml/libc-alpha/2018-09/msg00539.html
[BZ #19444]
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Use
__builtin_unreachable for default case in switch.
(__ieee754_yn): Likewise.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
An attempt to re-create a different PTY under the same name can fail
if the PTY has a very high number. Try to increase the file
descriptor limit in this case, and bail out if this still does not
allow the test to proceed.
This patch wraps all uses of *_{enable,disable}_asynccancel and
and *_CANCEL_{ASYNC,RESET} in either already provided macros
(lll_futex_timed_wait_cancel) or creates new ones if the
functionality is not provided (SYSCALL_CANCEL_NCS, lll_futex_wait_cancel,
and lll_futex_timed_wait_cancel).
Also for some generic implementations, the direct call of the macros
are removed since the underlying symbols are suppose to provide
cancellation support.
This is a priliminary patch intended to simplify the work required
for BZ#12683 fix. It is a refactor change, no semantic changes are
expected.
Checked on x86_64-linux-gnu and i686-linux-gnu.
* nptl/pthread_join_common.c (__pthread_timedjoin_ex): Use
lll_wait_tid with timeout.
* nptl/sem_wait.c (__old_sem_wait): Use lll_futex_wait_cancel.
* sysdeps/nptl/aio_misc.h (AIO_MISC_WAIT): Use
futex_reltimed_wait_cancelable for cancelabla mode.
* sysdeps/nptl/gai_misc.h (GAI_MISC_WAIT): Likewise.
* sysdeps/posix/open64.c (__libc_open64): Do not call cancelation
macros.
* sysdeps/posix/sigwait.c (__sigwait): Likewise.
* sysdeps/posix/waitid.c (__sigwait): Likewise.
* sysdeps/unix/sysdep.h (__SYSCALL_CANCEL_CALL,
SYSCALL_CANCEL_NCS): New macro.
* sysdeps/nptl/lowlevellock.h (lll_wait_tid): Add timeout argument.
(lll_timedwait_tid): Remove macro.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h (lll_wait_tid):
Likewise.
(lll_timedwait_tid): Likewise.
* sysdeps/unix/sysv/linux/sparc/lowlevellock.h (lll_wait_tid):
Likewise.
(lll_timedwait_tid): Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (lll_wait_tid):
Likewise.
(lll_timedwait_tid): Likewise.
* sysdeps/unix/sysv/linux/clock_nanosleep.c (__clock_nanosleep):
Use INTERNAL_SYSCALL_CANCEL.
* sysdeps/unix/sysv/linux/futex-internal.h
(futex_reltimed_wait_cancelable): Use LIBC_CANCEL_{ASYNC,RESET}
instead of __pthread_{enable,disable}_asynccancel.
* sysdeps/unix/sysv/linux/lowlevellock-futex.h
(lll_futex_wait_cancel): New macro.
The x86 defines optimized THREAD_ATOMIC_* macros where reference always
the current thread instead of the one indicated by input 'descr' argument.
It work as long the input is the self thread pointer, however it generates
wrong code if the semantic is to set a bit atomicialy from another thread.
This is not an issue for current GLIBC usage, however the new cancellation
code expects that some synchronization code to atomically set bits from
different threads.
If some usage indeed proves to be a hotspot we can add an extra macro
with a more descriptive name (THREAD_ATOMIC_BIT_SET_SELF for instance)
where i386 might optimize it.
Checked on i686-linux-gnu.
* sysdeps/i686/nptl/tls.h (THREAD_ATOMIC_CMPXCHG_VAL,
THREAD_ATOMIC_AND, THREAD_ATOMIC_BIT_SET): Remove macros.
The x86 defines optimized THREAD_ATOMIC_* macros where reference always
the current thread instead of the one indicated by input 'descr' argument.
It work as long the input is the self thread pointer, however it generates
wrong code if the semantic is to set a bit atomicialy from another thread.
This is not an issue for current GLIBC usage, however the new cancellation
code expects that some synchronization code to atomically set bits from
different threads.
The generic code generates an additional load to reference to TLS segment,
for instance the code:
THREAD_ATOMIC_BIT_SET (THREAD_SELF, cancelhandling, CANCELED_BIT);
Compiles to:
lock;orl $4, %fs:776
Where with patch changes it now compiles to:
mov %fs:16,%rax
lock;orl $4, 776(%rax)
If some usage indeed proves to be a hotspot we can add an extra macro
with a more descriptive name (THREAD_ATOMIC_BIT_SET_SELF for instance)
where x86_64 might optimize it.
Checked on x86_64-linux-gnu.
* sysdeps/x86_64/nptl/tls.h (THREAD_ATOMIC_CMPXCHG_VAL,
THREAD_ATOMIC_AND, THREAD_ATOMIC_BIT_SET): Remove macros.
bits/hwcap.h should be updated together with dl-procinfo.c.
* sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h: Add comment.
* sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c (_DL_HWCAP_COUNT):
Update.
Austin Group issue #411 [1] proposes that posix_spawn file action
posix_spawn_file_actions_adddup2 resets the close-on-exec when
source and destination refer to same file descriptor.
It solves the issue on multi-thread applications which uses
close-on-exec as default, and want to hand-chose specifically
file descriptor to purposefully inherited into a child process.
Current approach to achieve this scenario is to use two adddup2 file
actions and a temporary file description which do not conflict with
any other, coupled with a close file action to avoid leaking the
temporary file descriptor. This approach, besides being complex,
may fail with EMFILE/ENFILE file descriptor exaustion.
This can be more easily accomplished with an in-place removal of
FD_CLOEXEC. Although the resulting adddup2 semantic is slight
different than dup2 (equal file descriptors should be handled as
no-op), the proposed possible solution are either more complex
(fcntl action which a limited set of operations) or results in
unrequired operations (dup3 which also returns EINVAL for same
file descriptor).
Checked on aarch64-linux-gnu.
[BZ #23640]
* posix/tst-spawn.c (do_prepare, handle_restart, do_test): Add
posix_spawn_file_actions_adddup2 test to check O_CLOCEXEC reset.
* sysdeps/unix/sysv/linux/spawni.c (__spawni_child): Add
close-on-exec reset for adddup2 file action.
* sysdeps/posix/spawni.c (__spawni_child): Likewise.
[1] http://austingroupbugs.net/view.php?id=411