Update libm-test-ulps for AVX512 mathvec tests by running
“make regen-ulps” on an Intel Xeon processor with AVX512.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
(cherry picked from commit fcaaca412f)
This patch adds two new internal defines to set the internal
pthread_mutex_t layout required by the supported ABIs:
1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND, which controls whether the
__nusers field is defined before or after __kind.  The preferred
value for new ports is 0, which sets __nusers before __kind.
2. __PTHREAD_MUTEX_USE_UNION, which controls whether the internal
__spins and __list members will be placed inside a union for
linuxthreads compatibility.  The preferred value for new ports is 0,
which defines both fields without a union.
It fixes the wrong offset value of __kind on x86_64-linux-gnu-x32.
Checked with a make check run-built-tests=no on all affected ABIs.
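A minimal sketch of how the two defines are meant to select the layout
(field set and types simplified; the real struct __pthread_mutex_s in
sysdeps/nptl/bits/thread-shared-types.h has more members and uses
__pthread_list_t/__pthread_slist_t instead of the void * stand-in):

  struct sketch_pthread_mutex_s
  {
    int __lock;
    unsigned int __count;
    int __owner;
  #if !__PTHREAD_MUTEX_NUSERS_AFTER_KIND   /* Preferred for new ports.  */
    unsigned int __nusers;
  #endif
    int __kind;
  #if __PTHREAD_MUTEX_NUSERS_AFTER_KIND    /* Legacy ABIs.  */
    unsigned int __nusers;
  #endif
  #if !__PTHREAD_MUTEX_USE_UNION           /* Preferred: plain members.  */
    short __spins;
    short __elision;
    void *__list;                          /* Stand-in for __pthread_list_t.  */
  #else                                    /* Linuxthreads-compatible ABIs.  */
    union
    {
      struct { short __espins; short __eelision; } __elision_data;
      void *__list;
    };
  #endif
  };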
[BZ #22298]
* nptl/allocatestack.c (allocate_stack): Check if
__PTHREAD_MUTEX_HAVE_PREV is non-zero, instead of whether
__PTHREAD_MUTEX_HAVE_PREV is defined.
* nptl/descr.h (pthread): Likewise.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal):
Likewise.
* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise.
* sysdeps/nptl/bits/thread-shared-types.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New
defines.
(__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead
of __WORDSIZE for internal layout.
(__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead
of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION
instead of __WORDSIZE whether to use a union for __spins and __list
fields.
(__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION
case.
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New
defines.
* sysdeps/alpha/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/hppa/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/ia64/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/s390/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/sparc/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/tile/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/x86/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 06be6368da)
This patch adds a new build test to check the offsets of the internal
fields that are user visible.  Although currently the only field which
is statically initialized to a non-zero value is
pthread_mutex_t.__data.__kind, the tests also check the offsets of the
__kind, __spins, __elision (if supported), and __list internal members.
An internal header (pthread-offsets.h) is added to each major ABI with
the reference values.
Checked on x86_64-linux-gnu and with a build check for all affected
ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf,
hppa-linux-gnu, i686-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu, mips64-linux-gnu, mips64-n32-linux-gnu,
mips-linux-gnu, powerpc64le-linux-gnu, powerpc-linux-gnu,
s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu, sparc64-linux-gnu,
sparcv9-linux-gnu, tilegx-linux-gnu, tilegx-linux-gnu-x32,
tilepro-linux-gnu, x86_64-linux-gnu, and x86_64-linux-x32).
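The check amounts to a compile-time offsetof assertion; a hedged sketch
(the 16 below is only an illustrative reference value, normally taken
from the per-ABI pthread-offsets.h header):

  #include <stddef.h>
  #include <pthread.h>

  /* Illustrative reference value; the real one comes from
     sysdeps/<arch>/nptl/pthread-offsets.h.  */
  #define SKETCH_PTHREAD_MUTEX_KIND_OFFSET 16

  /* Fail the build if the user-visible offset of __kind ever moves.  */
  _Static_assert (offsetof (pthread_mutex_t, __data.__kind)
                  == SKETCH_PTHREAD_MUTEX_KIND_OFFSET,
                  "pthread_mutex_t.__data.__kind offset changed");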
* nptl/pthreadP.h (ASSERT_PTHREAD_STRING,
ASSERT_PTHREAD_INTERNAL_OFFSET): New macro.
* nptl/pthread_mutex_init.c (__pthread_mutex_init): Add build time
checks for internal pthread_mutex_t offsets.
* sysdeps/aarch64/nptl/pthread-offsets.h
(__PTHREAD_MUTEX_NUSERS_OFFSET, __PTHREAD_MUTEX_KIND_OFFSET,
__PTHREAD_MUTEX_SPINS_OFFSET, __PTHREAD_MUTEX_ELISION_OFFSET,
__PTHREAD_MUTEX_LIST_OFFSET): New macro.
* sysdeps/alpha/nptl/pthread-offsets.h: Likewise.
* sysdeps/arm/nptl/pthread-offsets.h: Likewise.
* sysdeps/hppa/nptl/pthread-offsets.h: Likewise.
* sysdeps/i386/nptl/pthread-offsets.h: Likewise.
* sysdeps/ia64/nptl/pthread-offsets.h: Likewise.
* sysdeps/m68k/nptl/pthread-offsets.h: Likewise.
* sysdeps/microblaze/nptl/pthread-offsets.h: Likewise.
* sysdeps/mips/nptl/pthread-offsets.h: Likewise.
* sysdeps/nios2/nptl/pthread-offsets.h: Likewise.
* sysdeps/powerpc/nptl/pthread-offsets.h: Likewise.
* sysdeps/s390/nptl/pthread-offsets.h: Likewise.
* sysdeps/sh/nptl/pthread-offsets.h: Likewise.
* sysdeps/sparc/nptl/pthread-offsets.h: Likewise.
* sysdeps/tile/nptl/pthread-offsets.h: Likewise.
* sysdeps/x86_64/nptl/pthread-offsets.h: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit dff91cd45e)
As shown in some buildbot issues on aarch64 and powerpc, calling
clone (VFORK) and waitpid (WNOHANG) does not guarantee the child
is ready to be collected.  This patch changes the waitpid options back
to 0, as before the fe05e1cb6d fix.
This change can lead to the scenario 4.3 described in the commit,
where the waitpid call can hang indefinitely.  However, this is also a
very unlikely and undefined situation, where the caller is trying to
terminate a pid before posix_spawn returns and the pid-reuse race is
triggered at the same time.  I don't see how to correctly handle this
specific situation within posix_spawn.
Checked on x86_64-linux-gnu, aarch64-linux-gnu and
powerpc64-linux-gnu.
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of
WNOHANG in waitpid call.
(cherry picked from commit aa95a2414e)
As noted by Florian Weimer, current Linux posix_spawn implementation
can trigger an assert if the auxiliary process is terminated before
actually setting the err member:
340 /* Child must set args.err to something non-negative - we rely on
341 the parent and child sharing VM. */
342 args.err = -1;
[...]
362 new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
363 CLONE_VM | CLONE_VFORK | SIGCHLD, &args);
364
365 if (new_pid > 0)
366 {
367 ec = args.err;
368 assert (ec >= 0);
Another possible issue is killing the child between setting the err and
actually calling execve.  In this case the process will not run, but
posix_spawn also will not report any error:
269
270 args->err = 0;
271 args->exec (args->file, args->argv, args->envp);
As suggested by Andreas Schwab, this patch removes the faulty assert
and also handles any signal that happens before fork and execve as if
the spawn were successful (thus leaving it to the caller to figure this
out).  Unlike Florian, I can not see why using atomics to set err would
help here; essentially the code runs sequentially (due to CLONE_VFORK)
and I think it would not be legal for the compiler to evaluate ec
without checking the new_pid result (thus there is no need for a
compiler barrier).
Summarizing the possible scenarios on posix_spawn execution, we
have:
1. For the default case with a successful execution, args.err will be 0,
the pid will not be collected, and it will be reported to the caller.
2. For the default failure case, args.err will be positive and it will
be collected by the waitpid.  An error will be reported to the caller.
3. For the unlikely case where the process was terminated and not
collected by a caller signal handler, it will be reported as a
successful execution and not be collected by posix_spawn (since
args.err will be 0).  The caller will need to actually handle this
case.
4. For the unlikely case where the process was terminated and collected
by the caller, we have 3 other possible scenarios:
4.1. The auxiliary process was terminated with args.err equal to 0:
it will be handled as in 1. (so it does not matter if we hit the
pid-reuse race, since we won't possibly collect an unexpected
process).
4.2. The auxiliary process was terminated after execve (due to a
failure in calling it) and before setting args.err to -1: it will
also be handled as in 1., but with the issue of not being able to
report a possible execve failure to the caller.
4.3. The auxiliary process was terminated after args.err is set to -1:
this is the case where it is possible to hit the pid-reuse race,
where we will need to collect the auxiliary pid but cannot be sure
it will be the expected one.  I think for this case we need to
actually change waitpid to use WNOHANG, to avoid hanging
indefinitely on the call, and report an error to the caller, since
we can't differentiate between a default failure as in 2. and a
possible pid-reuse race issue.
Checked on x86_64-linux-gnu.
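A hedged, self-contained sketch of the control flow the scenarios above
reason about (the real __spawnix in sysdeps/unix/sysv/linux/spawni.c
allocates the child stack itself and uses internal macros;
sketch_spawn and struct spawn_args are illustrative names):

  #define _GNU_SOURCE
  #include <errno.h>
  #include <sched.h>
  #include <signal.h>
  #include <sys/wait.h>
  #include <unistd.h>

  struct spawn_args { int err; /* plus file/argv/envp in the real code */ };

  static int
  sketch_spawn (pid_t *pid, int (*child_fn) (void *), void *child_stack,
                struct spawn_args *args)
  {
    args->err = 0;              /* The child overwrites this only on failure.  */
    pid_t new_pid = clone (child_fn, child_stack,
                           CLONE_VM | CLONE_VFORK | SIGCHLD, args);
    int ec;
    if (new_pid > 0)
      {
        ec = args->err;         /* Shared VM: the child's write is visible.  */
        if (ec != 0)
          /* The child reported a failure before execve; reap it.  */
          waitpid (new_pid, NULL, 0);
      }
    else
      ec = errno;
    if (ec == 0)
      *pid = new_pid;           /* Success, or the child was killed early (3.).  */
    return ec;
  }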
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Handle the case where
the auxiliary process is terminated by a signal before calling _exit
or execve.
(cherry picked from commit fe05e1cb6d)
This patch fixes the compat glob implementation consolidation from
commit 116f1c64d with the following changes:
- Add a compat implementation on s390 to keep the architecture from
building the symbols from the default Linux oldglob.c, by setting
GLOB_NO_OLD_VERSION.
- Remove the duplicate rule to build oldglob on alpha.
Checked on s390-linux-gnu and alpha-linux-gnu using build-many-glibcs.py.
* sysdeps/unix/sysv/linux/s390/s390-32/oldglob.c: New file.
* sysdeps/unix/sysv/linux/alpha/Makefile
[$(subdir) = csu] (sysdep_routines): Remove rule.
(cherry picked from commit 3ca622e4d6)
This patch consolidates the glob implementation. The main changes are:
* On Linux all implementations now use the default one at
sysdeps/unix/sysv/linux/glob{free}{64}.c, with the exception
of alpha (which requires specific versioning) and s390-32 (which,
unlike other 32-bit ports, does not add a compat symbol for the
2.1 version).
* The default implementation uses XSTAT_IS_XSTAT64 to define whether
both glob{free} and glob{free}64 should be different implementations.
For architectures that define XSTAT_IS_XSTAT64, glob{free} is an alias
to glob{free}64.
* Move i386 olddirent.h header to Linux default directory, since it is
the only header with this name and it is shared among different
architectures (and used on compat glob symbol as well).
Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py
for all major architectures.
* sysdeps/unix/sysv/linux/arm/glob64.c: Remove file.
* sysdeps/unix/sysv/linux/i386/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/m68k/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/glob.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise.
* sysdeps/wordsize-64/glob.c: Likewise.
* sysdeps/wordsize-64/glob64.c: Likewise.
* sysdeps/wordsize-64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/glob.c: New file.
* sysdeps/unix/sysv/linux/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/globfree.c: Likewise.
* sysdeps/unix/sysv/linux/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/glob64.c: Likewise.
* sysdeps/unix/sysv/linux/oldglob.c [SHLIB_COMPAT]: Also
adds !GLOB_NO_OLD_VERSION as an extra condition.
* sysdeps/unix/sysv/linux/i386/alphasort64.c: Include olddirent.h
using relative path instead of absolute one.
* sysdeps/unix/sysv/linux/i386/getdents64.c: Likewise.
* sysdeps/unix/sysv/linux/i386/readdir64.c: Likewise.
* sysdeps/unix/sysv/linux/i386/readdir64_r.c: Likewise.
* sysdeps/unix/sysv/linux/i386/versionsort64.c: Likewise.
* sysdeps/unix/sysv/linux/i386/olddirent.h: Move to ...
* sysdeps/unix/sysv/linux/olddirent.h: ... here.
(cherry picked from commit 116f1c64d8)
Similar to bug 21987 for SPARC, MIPS64 wrongly installs the ldbl-128
version of bits/long-double.h, meaning incorrect results when using
headers installed from a 64-bit installation for a 32-bit build. (I
haven't actually seen this cause build failures before its interaction
with bits/floatn.h did so - installed headers wrongly expecting
_Float128 to be available in a 32-bit configuration.)
This patch fixes the bug by moving the MIPS header to
sysdeps/mips/ieee754, which comes before sysdeps/ieee754/ldbl-128 in
the sysdeps directory ordering. (bits/floatn.h will need a similar
fix - duplicating the ldbl-128 version for MIPS will suffice - for
headers from a 32-bit installation to be correct for 64-bit builds.)
Tested with build-many-glibcs.py (compilers build for
mips64-linux-gnu, where there was previously a libstdc++ build failure
as at
<https://sourceware.org/ml/libc-testresults/2017-q4/msg00130.html>).
[BZ #22322]
* sysdeps/mips/bits/long-double.h: Move to ....
* sysdeps/mips/ieee754/bits/long-double.h: ... here.
(cherry picked from commit 37bb78cb8c)
In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB.  However, using fxsave/xsave/xsavec takes slightly more cycles
than saving and restoring vector and bound registers individually.
Latency for _dl_runtime_resolve to lookup the function, foo, from one
shared library plus libc.so:
                                Before    After    Change
Westmere (SSE)/fxsave              345      866      151%
IvyBridge (AVX)/xsave              420      643       53%
Haswell (AVX)/xsave                713     1252       75%
Skylake (AVX+MPX)/xsavec           559      719       28%
Skylake (AVX512+MPX)/xsavec        145      272       87%
Ryzen (AVX)/xsavec                 280      553       97%
This is the worst case, where the portion of time spent saving and
restoring registers is bigger than in the majority of cases.  With the
smaller _dl_runtime_resolve code size, the overall performance impact
is negligible.
On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are noise.  On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also noise.
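For reference, the XSAVE state size and XSAVEC availability that
get_common_indeces now records can be queried from CPUID leaf 0xD; a
hedged stand-alone sketch (not the glibc code itself):

  #include <cpuid.h>
  #include <stdio.h>

  int
  main (void)
  {
    unsigned int eax, ebx, ecx, edx;
    /* Sub-leaf 0: EBX is the XSAVE area size for the currently enabled
       state components (XCR0).  */
    if (__get_cpuid_count (0xd, 0, &eax, &ebx, &ecx, &edx))
      printf ("xsave state size: %u bytes\n", ebx);
    /* Sub-leaf 1: EAX bit 1 advertises XSAVEC.  */
    if (__get_cpuid_count (0xd, 1, &eax, &ebx, &ecx, &edx))
      printf ("xsavec supported: %s\n", (eax & (1u << 1)) ? "yes" : "no");
    return 0;
  }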
[BZ #21265]
* sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
New.
* sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>.
(get_common_indeces): Set xsave_state_size, xsave_state_full_size
and bit_arch_XSAVEC_Usable if needed.
(init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
and bit_arch_Use_dl_runtime_resolve_opt.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
Removed.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(bit_arch_Prefer_No_AVX512): Updated.
(bit_arch_MathVec_Prefer_No_AVX512): Likewise.
(bit_arch_XSAVEC_Usable): New.
(STATE_SAVE_OFFSET): Likewise.
(STATE_SAVE_MASK): Likewise.
[__ASSEMBLER__]: Include <cpu-features-offsets.h>.
(cpu_features): Add xsave_state_size and xsave_state_full_size.
(index_arch_Use_dl_runtime_resolve_opt): Removed.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_XSAVEC_Usable): New.
* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow.
* sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables
is enabled.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
_dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
_dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
_dl_runtime_resolve_xsavec.
* sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
Removed.
(DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
instead of VEC_SIZE.
(REGISTER_SAVE_BND0): Removed.
(REGISTER_SAVE_BND1): Likewise.
(REGISTER_SAVE_BND3): Likewise.
(REGISTER_SAVE_RAX): Always defined to 0.
(VMOV): Removed.
(_dl_runtime_resolve_avx): Likewise.
(_dl_runtime_resolve_avx_slow): Likewise.
(_dl_runtime_resolve_avx_opt): Likewise.
(_dl_runtime_resolve_avx512): Likewise.
(_dl_runtime_resolve_avx512_opt): Likewise.
(_dl_runtime_resolve_sse): Likewise.
(_dl_runtime_resolve_sse_vex): Likewise.
(USE_FXSAVE): New.
(_dl_runtime_resolve_fxsave): Likewise.
(USE_XSAVE): Likewise.
(_dl_runtime_resolve_xsave): Likewise.
(USE_XSAVEC): Likewise.
(_dl_runtime_resolve_xsavec): Likewise.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
Removed.
(_dl_runtime_resolve_avx512_opt): Likewise.
(_dl_runtime_resolve_avx): Likewise.
(_dl_runtime_resolve_avx_opt): Likewise.
(_dl_runtime_resolve_sse): Likewise.
(_dl_runtime_resolve_sse_vex): Likewise.
(_dl_runtime_resolve_fxsave): New.
(_dl_runtime_resolve_xsave): Likewise.
(_dl_runtime_resolve_xsavec): Likewise.
(cherry picked from commit b52b0d793d)
Before glibc 2.26, ld.so set dl_platform to "x86_64" and searched the
"x86_64" subdirectory when loading a shared library. ld.so in glibc
2.26 was changed to set dl_platform to "haswell" or "xeon_phi", based
on supported ISAs. This led to shared library loading failure for
shared libraries placed under the "x86_64" subdirectory.
This patch adds "x86_64" to x86-64 dl_hwcap so that ld.so will always
search the "x86_64" subdirectory when loading a shared library.
NB: We can't set x86-64 dl_platform to "x86-64" since ld.so will skip
the "haswell" and "xeon_phi" subdirectories on "haswell" and "xeon_phi"
machines.
Tested on i686 and x86-64.
[BZ #22093]
* sysdeps/x86/cpu-features.c (init_cpu_features): Initialize
GLRO(dl_hwcap) to HWCAP_X86_64 for x86-64.
* sysdeps/x86/dl-hwcap.h (HWCAP_COUNT): Updated.
(HWCAP_IMPORTANT): Likewise.
(HWCAP_X86_64): New enum.
(HWCAP_X86_AVX512_1): Updated.
* sysdeps/x86/dl-procinfo.c (_dl_x86_hwcap_flags): Add "x86_64".
* sysdeps/x86_64/Makefile (tests): Add tst-x86_64-1.
(modules-names): Add x86_64/tst-x86_64mod-1.
(LDFLAGS-tst-x86_64mod-1.so): New.
($(objpfx)tst-x86_64-1): Likewise.
($(objpfx)x86_64/tst-x86_64mod-1.os): Likewise.
(tst-x86_64-1-clean): Likewise.
* sysdeps/x86_64/tst-x86_64-1.c: New file.
* sysdeps/x86_64/tst-x86_64mod-1.c: Likewise.
(cherry picked from commit 45ff34638f)
This patch syncs the posix/glob.c implementation with gnulib version
b5ec983 (glob: simplify symlink detection).  The only difference from
the gnulib code is:
* DT_UNKNOWN, DT_DIR, and DT_LNK are defined in case they were not
already defined.  The gnulib code, which uses
HAVE_STRUCT_DIRENT_D_TYPE, would redefine them wrongly because
glibc does not define HAVE_STRUCT_DIRENT_D_TYPE.  Instead the
patch checks for each definition individually.
Also, the patch requires additional globfree and globfree64 files
for the compatibility versions on some architectures.  The code
simplification also removes the need for the NO_GLOB_PATTERN_P
macro.
Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py
for all major architectures.
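The fallback definitions mentioned in the first bullet amount to
something like the following hedged sketch (the values match the
<dirent.h> constants):

  #ifndef DT_UNKNOWN
  # define DT_UNKNOWN 0
  #endif
  #ifndef DT_DIR
  # define DT_DIR 4
  #endif
  #ifndef DT_LNK
  # define DT_LNK 10
  #endif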
[BZ #1062]
* posix/Makefile (routines): Add globfree, globfree64, and
glob_pattern_p.
* posix/flexmember.h: New file.
* posix/glob_internal.h: Likewise.
* posix/glob_pattern_p.c: Likewise.
* posix/globfree.c: Likewise.
* posix/globfree64.c: Likewise.
* sysdeps/gnu/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/alpha/globfree.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/oldglob.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise.
* sysdeps/wordsize-64/globfree.c: Likewise.
* sysdeps/wordsize-64/globfree64.c: Likewise.
* posix/glob.c (HAVE_CONFIG_H): Use !_LIBC instead.
[NDEBUG]: Remove comments.
(GLOB_ONLY_P, _AMIGA, VMS): Remove define.
(dirent_type): New type. Use uint_fast8_t not
uint8_t, as C99 does not require uint8_t.
(DT_UNKNOWN, DT_DIR, DT_LNK): New macros.
(struct readdir_result): Use dirent_type. Do not define skip_entry
unless it is needed; this saves a byte on platforms lacking d_ino.
(readdir_result_type, readdir_result_skip_entry):
New functions, replacing ...
(readdir_result_might_be_symlink, readdir_result_might_be_dir):
these functions, which were removed. This makes the callers
easier to read. All callers changed.
(D_INO_TO_RESULT): Now empty if there is no d_ino.
(size_add_wrapv, glob_use_alloca): New static functions.
(glob, glob_in_dir): Check for size_t overflow in several places,
and fix some size_t checks that were not quite right.
Remove old code using SHELL since Bash no longer
uses this.
(glob, prefix_array): Separate MS code better.
(glob_in_dir): Remove old Amiga and VMS code.
(globfree, __glob_pattern_type, __glob_pattern_p): Move to
separate files.
(glob_in_dir): Do not rely on undefined behavior in accessing
struct members beyond their bounds. Use a flexible array member
instead.
(link_stat): Rename from link_exists2_p and return -1/0 instead of
0/1. Caller changed.
(glob): Fix memory leaks.
* posix/glob64.c (globfree64): Move to separate file.
* sysdeps/gnu/glob64.c (NO_GLOB_PATTERN_P): Remove define.
(globfree64): Remove hidden alias.
* sysdeps/unix/sysv/linux/Makefile (sysdep_routines): Add
oldglob.
* sysdeps/unix/sysv/linux/alpha/glob.c (__new_globfree): Move to
separate file.
* sysdeps/unix/sysv/linux/i386/glob64.c (NO_GLOB_PATTERN_P): Remove
define.
Move compat code to separate file.
* sysdeps/wordsize-64/glob.c (globfree): Move definitions to
separate file.
(cherry picked from commit c66c908230)
Hide internal __old_glob64 function to allow direct access within
libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/i386/glob64.c (__old_glob64): Add
libc_hidden_proto and libc_hidden_def.
(cherry picked from commit 2585d7b839)
After commit 37f802f864 (Remove
__need_IOV_MAX and __need_FOPEN_MAX), UIO_MAXIOV is no longer supplied
(indirectly) through <bits/stdio_lim.h>, so sysdeps/posix/sysconf.c no
longer sees the definition.
(cherry picked from commit 63b4baa44e)
The IEEE 754 implementation of lgammal in sysdeps/ieee754/ldbl-128/ used
to be shared by IBM's implementation in sysdeps/ieee754/ldbl-128ibm/ (by
an inclusion of the source file). In order for the algorithm to work
for IBM's implementation, a check for LDBL_MANT_DIG was required. Since
the source file is no longer shared, the requirement for the check is
gone. This patch removes the conditionals.
Tested for powerpc64le and s390x.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Remove conditionals on LDBL_MANT_DIG.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c
(__ieee754_lgammal_r): Likewise.
(cherry picked from commit 9ac3c68218)
The ldbl-128ibm implementation of j0l, j1l, lgammal_r, and cbrtl, as
well as the tables used by expl were copied from ldbl-128. However, the
original files used _Float128 for the type and L() for the literal
suffix. This patch uses the following sed command to rewrite _Float128
as long double and L(x) as xL (for e_expl.c, e_j0l.c, e_j1l.c,
e_lgammal_r.c, and t_expl.h):
sed -i <filename> \
-e "/^#define _Float128 long double/d" \
-e "/^#define L(x) x ## L/d" \
-e "/L(/s/)/L/" \
-e "/L(/s/L(//" \
-e "s/_Float128/long double/g"
For sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c, this sed command incorrectly
replaces a few occurrences of L(), so the following command is used
instead:
sed -i sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c \
-e "/^#define _Float128 long double/d" \
-e "/^#define L(x) x ## L/d" \
-e "s/L(0\.3\{40\})/0.3333333333333333333333333333333333333333L/" \
-e "s/L(3\.7568280825958912391243e-1)/3.7568280825958912391243e-1L/" \
-e "/L(/s/)/L/" \
-e "/L(/s/L(//" \
-e "s/_Float128/long double/g"
Tested for powerpc64le with patched [1] and unpatched gcc.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Remove definitions of
_Float128 and L().
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Remove definitions of
_Float128 and L(). Replace _Float128 with long double and L(x)
with xL, throughout the file.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise.
(cherry picked from commit d2f0ed09f8)
Some files under sysdeps/ieee754/ldbl-128ibm/ are able to reuse the
implementation in sysdeps/ieee754/ldbl-128/ by defining _Float128 to
long double. This relied on compiler support for _Float128 being
disabled. On powerpc, such support was disabled by default, however, it
got enabled by default [1] in GCC 8.
This patch copies the implementations from ldbl-128 to ldbl-128ibm. The
uses of _Float128 and L() are kept intact in this patch and are replaced
with a script in a subsequent patch.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
Tested for powerpc64 and powerpc64le.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Include tables from
sysdeps/ieee754/ldbl-128ibm.
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Copy contents from the
equivalent implementation in sysdeps/ieee754/ldbl-128/ instead
of including it. Keep _Float128 and L() intact. These will be
reviewed by a separate patch.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise.
(cherry picked from commit c5c2e667bf)
On powerpc64le, compiler support for float128 is not enabled by default
on gcc. To enable it, the flag -mfloat128 must be passed as a command
line option to the compiler. This means that only the few files that
actively have -mfloat128 passed as an argument get compiler support for
float128, whereas all other files don't.
When -mfloat128 becomes enabled by default on powerpc [1], all the files
that do not currently have compiler support for float128 enabled during
their compilation, will start to have it. This will lead to build
errors in s_finite.c, s_isinf.c, and s_isnan.c.
The errors are due to the unintended macro expansion of __finitef128 to
__redirect___finitef128 in math/bits/mathcalls-helper-functions.h.  In
that header, __MATHDECL_1 takes '__finite' and 'f128' as arguments and
concatenates them.  However, since '__finite' has been redefined in
s_finite.c, the function declaration becomes __redirect___finitef128:
extern int __redirect___finitef128 (_Float128 __value) __attribute__ ((__nothrow__ )) __attribute__ ((__const__));
This declaration itself is OK. The problem arises when include/math.h
creates the hidden prototype ('hidden_proto (__finitef128)'), which
expands to:
extern __typeof (__finitef128) __finitef128 __attribute__ ((visibility ("hidden")));
Since __finitef128 is not declared, __typeof fails. This effect was
already true for the 'float' and 'long double' versions and is now true
for float128.  Likewise for isinff128 and isnanf128.
This patch defines __finitef128 as __redirect___finitef128 in
sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c, similarly to what's
done for the float and long double versions of these functions, to get
rid of the build error. Likewise for isinff128 and isnanf128.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
Tested for powerpc64 and powerpc64le.
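The redirect boils down to renaming the declaration before <math.h> is
included, so that the later hidden_proto (__finitef128) sees a declared
identifier; a hedged sketch of the pattern (only meaningful inside the
glibc build):

  #define __finitef128 __redirect___finitef128
  #include <math.h>
  #undef __finitef128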
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c
(__finitef128): Define to __redirect___finitef128.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c
(__isinff128): Define to __redirect___isinff128.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c
(__isnanf128): Define to __redirect___isnanf128.
(cherry picked from commit e010deb231)
On powerpc64le, not all files can have the flag -mfloat128 passed as an
option on the compile command, since that could conflict with other
flags, such as -mno-vsx. Each file that needs the flag, gets it through
a CFLAGS-filename variable on sysdeps/powerpc/powerpc64le/Makefile.
The test cases tst-strtod-nan-locale and tst-wcstod-nan-locale are
missing this flag.
Tested for powerpc64le.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-tst-strtod-nan-locale.c): New variable.
(CFLAGS-tst-wcstod-nan-locale.c): New variable.
(cherry picked from commit ffa448041b)
This is an optimized memmove implementation for the Qualcomm Falkor
processor core.  Due to the way the falkor memcpy needs to be written,
code cannot be easily shared between memmove and memcpy, as is done in
other aarch64 memcpy implementations, which is why this routine is
separate.  The underlying principle is the same as that of memcpy,
where it tries to use registers with the same lower 4 bits for
fetching the same stream, thus optimizing hardware prefetcher
performance.
The memcpy copy loop copies 64 bytes at a time using the same register
pair since that's the way to train the hardware prefetcher on the
falkor core. memmove cannot quite do that since it needs to avoid
overlaps, so it does the next best thing, i.e. has a 32 byte loop with
a 32 byte end (prefetch a loop ahead to account for overlapping
locations) with register pairs that alias so that they hit the same
prefetcher. Due to this difference in loop size, they have to
currently be separate implementations, but efforts are underway to try
to get memmove to fall back to memcpy whenever it can, without simply
duplicating all of the code.
Performance:
The routine fares around 20-25% better than the generic memmove for
most medium to large sizes (i.e. > 128 bytes) for the new walking
memmove benchmark (memmove-walk) with an unexplained regression
between 1K and 2K.  The minor regression is something worth looking
into for us, but the remaining gains are significant enough that we
would like this included upstream while we look into the cause of the
regression.  Here is a snippet of the numbers as generated from the
microbenchmark by the compare_strings script. Comparisons are against
__memmove_generic:
Function: memmove
Variant: walk
__memmove_thunderx __memmove_falkor __memmove_generic
========================================================================================================================
<snip>
length=16384: 12508800.00 ( 6.09%) 11486800.00 ( 13.76%) 13319600.00
length=16400: 13614200.00 ( -0.67%) 11585000.00 ( 14.33%) 13523600.00
length=16385: 13448400.00 ( 0.10%) 11732700.00 ( 12.84%) 13461200.00
length=16399: 13594100.00 ( -0.22%) 11859600.00 ( 12.57%) 13564400.00
length=16386: 13211600.00 ( 1.13%) 11503800.00 ( 13.91%) 13362400.00
length=16398: 13218600.00 ( 2.12%) 11573200.00 ( 14.30%) 13504700.00
length=16387: 13510900.00 ( -0.37%) 11744200.00 ( 12.76%) 13461300.00
length=16397: 13603700.00 ( -0.15%) 11878200.00 ( 12.55%) 13583200.00
length=16388: 13461700.00 ( -0.13%) 11558000.00 ( 14.03%) 13444100.00
length=16396: 13517500.00 ( -0.03%) 11561300.00 ( 14.45%) 13513900.00
length=16389: 13534100.00 ( 0.17%) 11756800.00 ( 13.28%) 13556900.00
length=16395: 13585600.00 ( 0.11%) 11791800.00 ( 13.30%) 13601200.00
length=16390: 13480100.00 ( -0.13%) 11685500.00 ( 13.20%) 13462100.00
length=16394: 13529900.00 ( -0.23%) 11549800.00 ( 14.43%) 13498200.00
length=16391: 13595400.00 ( -0.26%) 11768200.00 ( 13.22%) 13560600.00
length=16393: 13567000.00 ( 0.20%) 11779700.00 ( 13.35%) 13594700.00
length=32768: 71308800.00 ( -6.53%) 50220800.00 ( 24.98%) 66939200.00
length=32784: 72100800.00 (-11.55%) 50114100.00 ( 22.47%) 64636300.00
length=32769: 71767000.00 ( -7.10%) 51238400.00 ( 23.54%) 67010000.00
length=32783: 70113700.00 (-40.95%) 51129000.00 ( -2.78%) 49744400.00
length=32770: 71367600.00 ( -6.52%) 50244700.00 ( 25.01%) 67000900.00
length=32782: 64366700.00 ( 4.71%) 50101400.00 ( 25.83%) 67545600.00
length=32771: 71440100.00 ( -6.51%) 51263900.00 ( 23.57%) 67074900.00
length=32781: 66993000.00 ( 0.34%) 51108300.00 ( 23.97%) 67220300.00
length=32772: 71443900.00 (-60.50%) 50062100.00 (-12.47%) 44512600.00
length=32780: 71759100.00 ( -6.58%) 50263200.00 ( 25.35%) 67328600.00
length=32773: 71714900.00 (-33.21%) 51076600.00 ( 5.12%) 53835400.00
length=32779: 71756900.00 ( -6.56%) 51290800.00 ( 23.83%) 67337800.00
length=32774: 59689300.00 (-34.55%) 50068400.00 (-12.86%) 44363300.00
length=32778: 71847500.00 (-18.20%) 50084100.00 ( 17.61%) 60786500.00
length=32775: 71599300.00 ( -6.54%) 51278200.00 ( 23.70%) 67204800.00
length=32777: 71862900.00 (-60.85%) 51094000.00 (-14.36%) 44677900.00
length=65536: 282848000.00 ( -6.60%) 199187000.00 ( 24.93%) 265325000.00
length=65552: 243285000.00 (-41.61%) 198512000.00 (-15.54%) 171805000.00
length=65537: 255415000.00 (-23.47%) 202499000.00 ( 2.11%) 206858000.00
length=65551: 280122000.00 (-62.95%) 203349000.00 (-18.29%) 171911000.00
length=65538: 283676000.00 (-14.46%) 198368000.00 ( 19.96%) 247848000.00
length=65550: 275566000.00 (-51.76%) 198494000.00 ( -9.31%) 181581000.00
length=65539: 283699000.00 ( -6.58%) 203453000.00 ( 23.57%) 266195000.00
length=65549: 286572000.00 ( -6.65%) 202607000.00 ( 24.60%) 268712000.00
length=65540: 283710000.00 ( -6.59%) 199161000.00 ( 25.17%) 266160000.00
length=65548: 237573000.00 ( 11.48%) 198462000.00 ( 26.06%) 268395000.00
length=65541: 284150000.00 ( -6.58%) 203273000.00 ( 23.75%) 266600000.00
length=65547: 286250000.00 ( -6.70%) 202594000.00 ( 24.48%) 268263000.00
length=65542: 284167000.00 ( -6.60%) 199122000.00 ( 25.31%) 266584000.00
length=65546: 285656000.00 ( -6.59%) 198443000.00 ( 25.95%) 268002000.00
length=65543: 284600000.00 ( -6.58%) 203247000.00 ( 23.89%) 267030000.00
length=65545: 285665000.00 ( -6.40%) 202575000.00 ( 24.55%) 268472000.00
<snip>
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add
memmove_falkor.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Likewise.
* sysdeps/aarch64/multiarch/memmove.c: Likewise.
* sysdeps/aarch64/multiarch/memmove_falkor.S: New file.
All representations of floating-point numbers in types with IEC 60559
binary exchange format are canonical. On the other hand, types with IEC
60559 extended formats, such as those implemented under ldbl-96 and
ldbl-128ibm, contain representations that are not canonical.
TS 18661-1 introduced the type-generic macro iscanonical, which returns
whether a floating-point value is canonical or not. In Glibc, this
type-generic macro is implemented using the macro __MATH_TG, which, when
support for float128 is enabled, relies on __builtin_types_compatible_p
to select between floating-point types. However, this use of
iscanonical breaks C++ applications, because the builtin is only
available in C mode.
This patch provides a C++ implementation of iscanonical that relies on
function overloading, rather than builtins, to select between
floating-point types.
Unlike the C++ implementations for iszero and issignaling, this
implementation ignores __NO_LONG_DOUBLE_MATH. The double type always
matches IEC 60559 double format, which is always canonical. Thus, when
double and long double are the same (__NO_LONG_DOUBLE_MATH), iscanonical
always returns 1 and is not implemented with __MATH_TG.
Tested for powerpc64, powerpc64le and x86_64.
[BZ #22235]
* math/math.h: Trivial fix for unbalanced parentheses in comment.
* math/Makefile [CXX] (tests): Add test-math-iscanonical.cc.
(CFLAGS-test-math-iscanonical.cc): New variable.
* math/test-math-iscanonical.cc: New file.
* sysdeps/ieee754/ldbl-96/bits/iscanonical.h (iscanonical):
Provide a C++ implementation based on function overloading,
rather than using __MATH_TG, which uses C-only builtins.
* sysdeps/ieee754/ldbl-128ibm/bits/iscanonical.h (iscanonical):
Likewise.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iscanonical.cc): New variable.
(cherry picked from commit aa0235dfde)
My refactoring of long double information
commit 0acb8a2a85
Author: Joseph Myers <joseph@codesourcery.com>
Date: Wed Dec 14 18:27:56 2016 +0000
Refactor long double information into bits/long-double.h.
resulted in sparc32 configurations installing the ldbl-opt version of
bits/long-double.h instead of the intended
sysdeps/unix/sysv/linux/sparc version.
For sparc32 by itself, this is not a problem, since the ldbl-opt
version is correct for sparc32. However, both sparc32 and sparc64 are
supposed to install sets of headers that work for both of them, so
that a single sysroot, whichever order the libraries are built and
installed in, works for both. The effect of having the wrong version
installed is that you end up with a miscompiled sparc64 libstdc++
which fails glibc's configure tests for the C++ compiler.
This patch moves the header from sysdeps/unix/sysv/linux/sparc to
separate copies of the same file for sparc32 and sparc64, to ensure it
comes before ldbl-opt in the sysdeps directory ordering.
Tested with build-many-glibcs.py for sparc64-linux-gnu and
sparcv9-linux-gnu.
[BZ #21987]
* sysdeps/unix/sysv/linux/sparc/bits/long-double.h: Remove file
and copy to ...
* sysdeps/unix/sysv/linux/sparc/sparc32/bits/long-double.h:
... here.
* sysdeps/unix/sysv/linux/sparc/sparc64/bits/long-double.h:
... and here.
(cherry picked from commit 80f91666fe)
In <https://sourceware.org/ml/libc-alpha/2013-05/msg00722.html> I
remarked on the possibility of arithmetic in various nearbyint
implementations being scheduled before feholdexcept calls, resulting
in spurious "inexact" exceptions.
I'm now actually observing this occurring in glibc built for ARM with
GCC 7 (in fact, both copies of the same addition/subtraction sequence
being combined and moved out before the conditionals and
feholdexcept/fesetenv pairs), resulting in test failures.
This patch makes the nearbyint implementations with this particular
feholdexcept / arithmetic / fesetenv pattern consistently use
math_opt_barrier on the function argument when first used in
arithmetic, and also consistently use math_force_eval before fesetenv
(the latter was generally already done, but the dbl-64/wordsize-64
implementation used math_opt_barrier instead, and as
math_opt_barrier's intended effect is through its output value being
used, such a use that doesn't use the return value is suspect).
Tested for x86_64 (--disable-multi-arch so more of these
implementations get used), and for ARM in a configuration where I saw
the problem scheduling.
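A hedged sketch of the intended ordering, with stand-ins for glibc's
internal math_opt_barrier/math_force_eval macros (GCC statement
expressions and asm barriers assumed; this only shows the
barrier-placement pattern, not a complete nearbyint):

  #include <fenv.h>

  /* Stand-ins for the internal barriers (generic definitions).  */
  #define opt_barrier(x) \
    ({ __typeof (x) __x = (x); __asm ("" : "+m" (__x)); __x; })
  #define force_eval(x) \
    ({ __typeof (x) __x = (x); __asm __volatile ("" : : "m" (__x)); })

  /* Only valid for small non-negative arguments; shows the pattern.  */
  static double
  sketch_nearbyint_pattern (double x)
  {
    fenv_t env;
    feholdexcept (&env);                    /* Save env, clear exceptions.  */
    double t = opt_barrier (x) + 0x1p52;    /* Keep the add after feholdexcept.  */
    t -= 0x1p52;
    force_eval (t);                         /* Force evaluation before fesetenv.  */
    fesetenv (&env);                        /* Discard any spurious "inexact".  */
    return t;
  }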
[BZ #22225]
* sysdeps/ieee754/dbl-64/s_nearbyint.c (__nearbyint): Use
math_opt_barrier on argument when doing arithmetic on it.
* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint):
Likewise. Use math_force_eval not math_opt_barrier after
arithmetic.
* sysdeps/ieee754/flt-32/s_nearbyintf.c (__nearbyintf): Use
math_opt_barrier on argument when doing arithmetic on it.
* sysdeps/ieee754/ldbl-128/s_nearbyintl.c (__nearbyintl):
Likewise.
(cherry picked from commit f124cb3811)
This makes the __tls_get_addr_opt test run as a shared library, and so
actually test that DTPMOD64/DTPREL64 pairs are processed by ld.so to
support the __tls_get_addr_opt call stub fast return.  After a
2017-01-24 patch (binutils f0158f4416) ld.bfd no longer emitted
unnecessary dynamic relocations against local thread variables,
instead setting up the __tls_index GOT entries for the call stub fast
return. This meant tst-tlsopt-powerpc passed but did not check ld.so
relocation support. After a 2017-07-16 patch (binutils 676ee2b5fa)
ld.bfd no longer set up the __tls_index GOT entries for the call stub
fast return, and tst-tlsopt-powerpc failed.
Compiling mod-tlsopt-powerpc.c with -DSHARED exposed a bug in
powerpc64/tls-macros.h, which defines a __TLS_GET_ADDR macro that
clashes with one defined in dl-tls.h. The tls-macros.h version is
only used in that file, so delete it and expand.
* sysdeps/powerpc/mod-tlsopt-powerpc.c: Extract from
tst-tlsopt-powerpc.c with function name change and no test harness.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: Remove body of test.
Call tls_get_addr_opt_test.
* sysdeps/powerpc/Makefile (LDFLAGS-tst-tlsopt-powerpc): Don't define.
(modules-names): Add mod-tlsopt-powerpc.
(mod-tlsopt-powerpc.so-no-z-defs): Define.
(tst-tlsopt-powerpc): Depend on .so.
* sysdeps/powerpc/powerpc64/tls-macros.h (__TLS_GET_ADDR): Don't
define. Expand use in TLS_GD and TLS_LD.
(cherry picked from commit e98c925fa4)
The old code uses errno as the primary indicator for success or
failure. This is wrong because errno is only set for specific
combinations of the status return value and the h_errno variable.
(cherry picked from commit f4a6be2582)
This simplifies the code because it is not necessary to propagate the
temporary h_errno value to the thread-local variable. It also increases
compatibility with NSS modules which update only one of the two places.
(cherry picked from commit 53250a21b8)
When signaling nans are enabled (with -fsignaling-nans), the C++ version
of iszero uses the fpclassify macro, which is defined with __MATH_TG.
However, when support for float128 is available, __MATH_TG uses the
builtin __builtin_types_compatible_p, which is only available in C mode.
This patch refactors the C++ version of iszero so that it uses function
overloading to select between the floating-point types, instead of
relying on fpclassify and __MATH_TG.
Tested for powerpc64le, s390x, x86_64, and with build-many-glibcs.py.
[BZ #21930]
* math/math.h [defined __cplusplus && defined __SUPPORT_SNAN__]
(iszero): New C++ implementation that does not use
fpclassify/__MATH_TG/__builtin_types_compatible_p, when
signaling nans are enabled, since __builtin_types_compatible_p
is a C-only feature.
* math/test-math-iszero.cc: When __HAVE_DISTINCT_FLOAT128 is
defined, include ieee754_float128.h for access to the union and
member ieee854_float128.ieee.
[__HAVE_DISTINCT_FLOAT128] (do_test): Call check_float128.
[__HAVE_DISTINCT_FLOAT128] (check_float128): New function.
* sysdeps/powerpc/powerpc64le/Makefile [subdir == math]
(CXXFLAGS-test-math-iszero.cc): Add -mfloat128 to the build
options of test-math-zero on powerpc64le.
(cherry picked from commit 42496114ec)
The macro __MATH_TG contains the logic to select between long double and
_Float128, when these types are ABI-distinct. This logic relies on
__builtin_types_compatible_p, which is not available in C++ mode.
On the other hand, C++ function overloading provides the means to
distinguish between the floating-point types. The overloading
resolution will match the correct parameter regardless of type
qualifiers, i.e.: const and volatile.
Tested for powerpc64le, s390x, and x86_64.
* math/math.h [defined __cplusplus] (issignaling): Provide a C++
definition for issignaling that does not rely on __MATH_TG,
since __MATH_TG uses __builtin_types_compatible_p, which is only
available in C mode.
(CFLAGS-test-math-issignaling.cc): New variable.
* math/Makefile [CXX] (tests): Add test-math-issignaling.
* math/test-math-issignaling.cc: New test for C++ implementation
of type-generic issignaling.
* sysdeps/powerpc/powerpc64le/Makefile [subdir == math]
(CXXFLAGS-test-math-issignaling.cc): Add -mfloat128 to the build
options of test-math-issignaling on powerpc64le.
(cherry picked from commit a16e8bc08e)
POWER ISA 3.0 introduces the xssqrtqp instruction, which expects
operands to be in Vector Registers (Altivec/VMX), even though this
instruction belongs to the Vector-Scalar Instruction Set.
In GCC's Extended Assembly for POWER, the 'wq' register constraint is
provided for use with IEEE 754 128-bit floating-point values. However,
this constraint does not limit the register allocation to Vector
Registers (Altivec/VMX) and could assign a Vector-Scalar Register (VSX)
to the operands of the instruction.
This patch changes the register constraint used in sqrtf128 from 'wq' to
'v', in order to request a Vector Register (Altivec/VMX) for use with
the xssqrtqp instruction.
Tested for powerpc64le and --with-cpu=power9.
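The change amounts to switching the constraints in the inline asm; a
hedged sketch (requires -mcpu=power9 -mfloat128; the real code lives in
sysdeps/powerpc/fpu/math_private.h and e_sqrtf128.c):

  static inline _Float128
  sketch_sqrtf128 (_Float128 x)
  {
    _Float128 r;
    /* 'v' asks for an Altivec/VMX register, which xssqrtqp requires;
       'wq' could also hand out a VSX register.  */
    __asm__ ("xssqrtqp %0,%1" : "=v" (r) : "v" (x));
    return r;
  }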
[BZ #21941]
* sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrtf128): Since
xssqrtqp requires operands to be in Vector Registers
(Altivec/VMX), replace the register constraint 'wq' with 'v'.
* sysdeps/powerpc/powerpc64le/power9/fpu/e_sqrtf128.c
(__ieee754_sqrtf128): Likewise.
(cherry picked from commit 4d98ace9de)
Unlike other architectures, hppa-linux-gnu defines different values
for ENOTSUP and EOPNOTSUPP, where the latter is a Linux-specific one.
This leads to tst-preadwritev{64}v2 test failures:
$ ./testrun.sh misc/tst-preadvwritev2
error: tst-preadvwritev2-common.c:35: preadv2 failure did not set errno to ENOTSUP (223)
error: 1 test failures
The straightforward fix is to return the POSIX-defined ENOTSUP on all
p{read,write}v{64}v2 implementations instead of the Linux-specific one.
Checked on x86_64-linux-gnu and the tst-preadwritev{64}v2 tests on
hppa-linux-gnu (although, due to the kernel installed on my testing
system, pwritev{64}v2 with an invalid flag still fails due to a known
kernel issue [1]).
[BZ #21780]
* sysdeps/posix/preadv2.c (preadv2): Use ENOTSUP instead of
EOPNOTSUPP.
* sysdeps/posix/preadv64v2.c (preadv64v2): Likewise.
* sysdeps/posix/pwritev2.c (pwritev2): Likewise.
* sysdeps/posix/pwritev64v2.c (pwritev64v2): Likewise.
* sysdeps/unix/sysv/linux/preadv2.c (preadv2): Likewise.
* sysdeps/unix/sysv/linux/preadv64v2.c (preadv64v2): Likewise.
* sysdeps/unix/sysv/linux/pwritev2.c (pwritev2): Likewise.
* sysdeps/unix/sysv/linux/pwritev64v2.c (pwritev64v2): Likewise.
[1] https://sourceware.org/ml/libc-alpha/2017-06/msg00726.html
Cherry-pick of 852d631207
On AVX machines with XGETBV (ECX == 1) like Skylake processors,
(gdb) disass _dl_runtime_resolve_avx_opt
Dump of assembler code for function _dl_runtime_resolve_avx_opt:
0x0000000000015890 <+0>: push %rax
0x0000000000015891 <+1>: push %rcx
0x0000000000015892 <+2>: push %rdx
0x0000000000015893 <+3>: mov $0x1,%ecx
0x0000000000015898 <+8>: xgetbv
0x000000000001589b <+11>: mov %eax,%r11d
0x000000000001589e <+14>: pop %rdx
0x000000000001589f <+15>: pop %rcx
0x00000000000158a0 <+16>: pop %rax
0x00000000000158a1 <+17>: and $0x4,%r11d
0x00000000000158a5 <+21>: bnd je 0x16200 <_dl_runtime_resolve_sse_vex>
End of assembler dump.
is slower than:
(gdb) disass _dl_runtime_resolve_avx_slow
Dump of assembler code for function _dl_runtime_resolve_avx_slow:
0x0000000000015850 <+0>: vorpd %ymm0,%ymm1,%ymm8
0x0000000000015854 <+4>: vorpd %ymm2,%ymm3,%ymm9
0x0000000000015858 <+8>: vorpd %ymm4,%ymm5,%ymm10
0x000000000001585c <+12>: vorpd %ymm6,%ymm7,%ymm11
0x0000000000015860 <+16>: vorpd %ymm8,%ymm9,%ymm9
0x0000000000015865 <+21>: vorpd %ymm10,%ymm11,%ymm10
0x000000000001586a <+26>: vpcmpeqd %xmm8,%xmm8,%xmm8
0x000000000001586f <+31>: vorpd %ymm9,%ymm10,%ymm10
0x0000000000015874 <+36>: vptest %ymm10,%ymm8
0x0000000000015879 <+41>: bnd jae 0x158b0 <_dl_runtime_resolve_avx>
0x000000000001587c <+44>: vzeroupper
0x000000000001587f <+47>: bnd jmpq 0x16200 <_dl_runtime_resolve_sse_vex>
End of assembler dump.
(gdb)
since xgetbv takes many more cycles than single-cycle operations like
vpord/vpcmpeq/ptest.  _dl_runtime_resolve_opt should be used only with
AVX512 where AVX512 instructions lead to lower CPU frequency on Skylake
server.
[BZ #21871]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
bit_arch_Use_dl_runtime_resolve_opt only with AVX512F.
(cherry picked from commit d2cf37c0a2)
This comes from running “make regen-ulps” on an AMD Opteron 2378 CPU.
ChangeLog:
* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Regenerated.
(cherry picked from commit 144bdab050)
The relative branch directly to __libc_vfork results in a relocation
that cannot be resolved. Specifically a R_MICROBLAZE_64_PCREL relocation
is created for this branch, however for MicroBlaze R_MICROBLAZE_64_PCREL
type relocations symbols are not resolved. Additionally due to the
branch being located in the .text section the instruction cannot be
rewritten as the section is not writable, and causes a segfault at
runtime when loading libpthread.
To resolve this issue, ensure the branch is done using PLT. This removes
the need to modify the instruction and trades the R_MICROBLAZE_64_PCREL
for a more common R_MICROBLAZE_JUMP via the PLT.
[BZ #21779]
* sysdeps/unix/sysv/linux/microblaze/pt-vfork.S: Branch using PLT.
The function maybe_enable_malloc_check, which is called by
__tunables_init, calls __access_noerrno.  This isn't a problem when the
symbol is in ld.so, which has a special version of __access_noerrno
without stack protector.  But when glibc is built with stack protector,
maybe_enable_malloc_check in libc.a can't call the regular version of
__access_noerrno with stack protector.
This patch changes how Linux defines __access_noerrno, making it an
inline call instead and thus avoiding different build rules for
ld/static and shared.
H.J. Lu <hongjiu.lu@intel.com>
Adhemerval Zanella <adhemerval.zanella@linaro.org>
[BZ #21744]
* elf/dl-tunables.c: Include not-errno.h header.
* include/unistd.h (__access_noerrno): Remove definition.
* sysdeps/unix/sysv/linux/access.c (__access_noerrno): Likewise.
* sysdeps/generic/not-errno.h: New file.
* sysdeps/unix/sysv/linux/not-errno.h: Likewise.
__libc_argv[0] points to an address on the stack and
__libc_secure_getenv accesses environment variables which are on the
stack.  We should avoid accessing the stack when the stack is corrupted.
This patch also renames function argument in __fortify_fail_abort
from do_backtrace to need_backtrace to avoid confusion with do_backtrace
from enum __libc_message_action.
[BZ #21752]
* debug/fortify_fail.c (__fortify_fail_abort): Don't pass down
__libc_argv[0] if we aren't doing backtrace. Rename do_backtrace
to need_backtrace.
* sysdeps/posix/libc_fatal.c (__libc_message): Don't call
__libc_secure_getenv if we aren't doing backtrace.
sys/ptrace.h on S390 used to be includible both before and after
asm/ptrace.h, until commit b08a6a0dea, among other changes,
introduced the PTRACE_SINGLEBLOCK enum constant, which is also defined
in asm/ptrace.h as a macro, making sys/ptrace.h fail to compile when
included after asm/ptrace.h.
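The guard described in the ChangeLog entry below amounts to something
like this hedged sketch (only PTRACE_SINGLEBLOCK shown; the real header
undefines every affected PTRACE_* macro):

  #if defined _LINUX_PTRACE_H || defined _S390_PTRACE_H
  /* asm/ptrace.h was included first: drop its macro so the glibc enum
     below can provide the constant.  */
  # undef PTRACE_SINGLEBLOCK
  #endif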
* sysdeps/unix/sysv/linux/s390/sys/ptrace.h [_LINUX_PTRACE_H ||
_S390_PTRACE_H]: Undefine all PTRACE_* macro constants defined
later as enum constants, except PTRACE_PEEKUSER, PTRACE_POKEUSER,
and PTRACE_SEIZE_DEVEL that are not defined by Linux headers.
This patch fixes the argument passing for exit syscall after
the clone function returns on alpha.  This fixes misc/tst-clone2
on alpha-linux-gnu.
Checked misc/tst-clone2 on alpha-linux-gnu.
[BZ #21512]
* sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Fix argument
passing to syscall exit.
Since there are no multiarch versions of memmove_chk and memset_chk,
test multiarch versions of memmove_chk and memset_chk only in libc.so.
[BZ #21741]
* sysdeps/i386/i686/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test memmove_chk and memset_chk only
in libc.so.
The patch proposed by Peter Bergner [1] to libgcc in order to fix
[BZ #21707] adds a dependency on a symbol provided by the loader,
forcing the loader to be linked to tests after libgcc was linked.
It also requires reading the thread pointer during IRELA relocations.
Tested on powerpc, powerpc64, powerpc64le, s390x and x86_64.
[1] https://sourceware.org/ml/libc-alpha/2017-06/msg01383.html
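A hedged sketch of the per-architecture hooks introduced by the patch:
the generic sysdeps/generic/libc-start.h applies IREL{,A} relocations
before the TCB is set up, while an architecture that needs the thread
pointer for them (such as powerpc) overrides the pair the other way
around.  Simplified:

  /* Generic default: IREL{,A} relocations happen before TCB setup.  */
  #ifndef ARCH_SETUP_IREL
  # define ARCH_SETUP_IREL() apply_irel ()
  # define ARCH_APPLY_IREL()      /* Nothing left to do afterwards.  */
  #endif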
[BZ #21707]
* csu/libc-start.c (LIBC_START_MAIN): Perform IREL{,A}
relocations before or after initializing the TCB on statically
linked executables. That's a per-architecture definition.
* elf/rtld.c (dl_main): Add a comment about thread-local
variables initialization.
* sysdeps/generic/libc-start.h: New file. Define
ARCH_APPLY_IREL and ARCH_SETUP_IREL.
* sysdeps/powerpc/Makefile:
[$(subdir) = elf && $(multi-arch) != no] (tests-static-internal): Add tst-tlsifunc-static.
[$(subdir) = elf && $(multi-arch) != no && $(build-shared) == yes]
(tests-internal): Add tst-tlsifunc.
* sysdeps/powerpc/tst-tlsifunc.c: New file.
* sysdeps/powerpc/tst-tlsifunc-static.c: Likewise.
* sysdeps/powerpc/powerpc64le/Makefile (f128-loader-link): New
variable.
[$(subdir) = math] (test-float128% test-ifloat128%): Force
linking to the loader after linking to libgcc.
[$(subdir) = wcsmbs || $(subdir) = stdlib] (bug-strtod bug-strtod2)
(bug-strtod2 tst-strtod-round tst-wcstod-round tst-strtod6 tst-strfrom)
(tst-strfrom-locale strfrom-skeleton): Likewise.
* sysdeps/unix/sysv/linux/powerpc/libc-start.h: New file. Define
ARCH_APPLY_IREL and ARCH_SETUP_IREL.
This patch fixes the argument passing for exit syscall after
the clone function returns on hppa. This fixes misc/tst-clone2
on hppa-linux-gnu.
Checked misc/tst-clone2 on hppa-linux-gnu.
[BZ #21512]
* sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Fix argument
passing to syscall exit.
This patch adds the HWCAP_JSCVT, HWCAP_FCMA and HWCAP_LRCPC macros
from Linux 4.12 to the AArch64 bits/hwcap.h.
* sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h (HWCAP_FCMA): New macro.
(HWCAP_JSCVT, HWCAP_LRCPC): Likewise.
There is a bug report that ld.so in GLIBC 2.24 built by Binutils 2.29
will crash on arm-linux-gnueabihf.  This is confirmed, and the details
are at:
https://sourceware.org/bugzilla/show_bug.cgi?id=21725.
As analyzed in the PR, the old code assumed that the assembler won't
set bit 0 of a Thumb function address if it comes from PC-relative
instructions and the calculation can be finished during assembling.
This assumption, however, does not hold after PR gas/21458.
* sysdeps/arm/dl-machine.h (elf_machine_load_address): Also strip bit 0
of pcrel_address under Thumb mode.
On powerpc64le, the compilation of the files related to float128 support
requires the option -mfloat128 to be passed to gcc. However, not all
possible object suffixes were covered in the Makefile. This patch uses
$(all-object-suffixes) in all remaining rules.
Tested for powerpc64le.
* sysdeps/powerpc/powerpc64le/Makefile: Use $(all-object-suffixes)
to iterate over all possible object suffixes. Add a comment
explaining the use of sysdep-CFLAGS instead of CFLAGS.
__stack_chk_fail is called on a corrupted stack.  A stack backtrace is
very unreliable against a corrupted stack.  __libc_message is changed to accept
enum __libc_message_action and call BEFORE_ABORT only if action includes
do_backtrace. __fortify_fail_abort is added to avoid backtrace from
__stack_chk_fail.
[BZ #12189]
* debug/Makefile (CFLAGS-tst-ssp-1.c): New.
(tests): Add tst-ssp-1 if -fstack-protector works.
* debug/fortify_fail.c: Include <stdbool.h>.
(_fortify_fail_abort): New function.
(__fortify_fail): Call _fortify_fail_abort.
(__fortify_fail_abort): Add a hidden definition.
* debug/stack_chk_fail.c: Include <stdbool.h>.
(__stack_chk_fail): Call __fortify_fail_abort, instead of
__fortify_fail.
* debug/tst-ssp-1.c: New file.
* include/stdio.h (__libc_message_action): New enum.
(__libc_message): Replace int with enum __libc_message_action.
(__fortify_fail_abort): New hidden prototype.
* malloc/malloc.c (malloc_printerr): Update __libc_message calls.
* sysdeps/posix/libc_fatal.c (__libc_message): Replace int
with enum __libc_message_action. Call BEFORE_ABORT only if
action includes do_backtrace.
(__libc_fatal): Update __libc_message call.
Linux 4.12 (b745fafaf70c0a98a2e1e7ac8cb14542889ceb0e) adds a new
p{read,write}v2 flag RWF_NOWAIT.  This patch adds it to the Linux
uio-ext.h header.
Checked on x86_64-linux-gnu (on a 4.10 kernel).
[BZ #21738]
* manual/llio.texi (RWF_NOWAIT): New item.
* misc/tst-preadvwritev2-common.c (do_test_with_invalid_flags):
Add RWF_NOWAIT check.
* sysdeps/unix/sysv/linux/bits/uio-ext.h (RWF_NOWAIT): New flag.
The request PTRACE_SINGLEBLOCK was introduced in Linux 3.15. Thus the ptrace call
will fail on older kernels.
Thus the test now tests PTRACE_SINGLEBLOCK with the data argument
pointing to a buffer on the stack, which is assumed to fail.  If the
request were interpreted as PTRACE_GETREGS, then the ptrace call would
not fail and the regs would be written to buf.
If we run on a kernel with support for PTRACE_SINGLEBLOCK, a ptrace call
with data=NULL returns zero with no error.  If we run on a kernel without
support for PTRACE_SINGLEBLOCK, a ptrace call with data=NULL reports an
error.  In the latter case, the test just continues with PTRACE_CONT.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/tst-ptrace-singleblock.c:
Support running on kernels without PTRACE_SINGLEBLOCK.
Since there are no multiarch versions of memmove_chk and memset_chk,
test multiarch versions of memmove_chk and memset_chk only in libc.so.
[BZ #21741]
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test memmove_chk and memset_chk only
in libc.so.
This patch fixes some build issues when including types/sigevent_t.h
along with bits/pthreadtypes.h.
Checked on x86_64-linux-gnu and on a build on supported major ABIs.
[BZ #21715]
* sysdeps/nptl/bits/pthreadtypes.h (__have_pthread_attr_t): Fix typo
on definition.
This change forces realignment of the stack pointer in __tls_get_addr, so
that binaries compiled by GCCs older than GCC 4.9:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066
continue to work even if vector instructions are used in glibc which
require the ABI stack realignment.
__tls_get_addr_slow is added to handle the slow paths in the default
implementation of __tls_get_addr in elf/dl-tls.c. The new __tls_get_addr
calls __tls_get_addr_slow after realigning the stack. Internal calls
within ld.so go directly to the default implementation of __tls_get_addr
because they do not need stack realignment.
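A rough C-level illustration of the dispatch (the real entry point is the new
x86_64 assembly in tls_get_addr.S; the tls_index layout and the use of a GCC
attribute instead of hand-written realignment are stand-ins for this sketch):
  /* Hypothetical descriptor layout, for illustration only.  */
  typedef struct { unsigned long ti_module, ti_offset; } tls_index;
  extern void *__tls_get_addr_slow (tls_index *ti);  /* Default slow path.  */
  /* The exported entry realigns the stack before taking the slow path;
     internal ld.so callers bypass this and use the default code directly.  */
  __attribute__ ((force_align_arg_pointer))
  void *
  __tls_get_addr (tls_index *ti)
  {
    return __tls_get_addr_slow (ti);
  }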
[BZ #21609]
* sysdeps/x86_64/Makefile (sysdep-dl-routines): Add tls_get_addr.
(gen-as-const-headers): Add rtld-offsets.sym.
* sysdeps/x86_64/dl-tls.c: New file.
* sysdeps/x86_64/rtld-offsets.sym: Likewise.
* sysdeps/x86_64/tls_get_addr.S: Likewise.
* sysdeps/x86_64/dl-tls.h: Add multiple inclusion guards.
* sysdeps/x86_64/tlsdesc.sym (TI_MODULE_OFFSET): New.
(TI_OFFSET_OFFSET): Likewise.
This patch fixes the return value for error conditions in the default
posix_spawn implementation (where the errno value is expected). It also avoids
clobbering errno on the fork call.
Checked on x86_64 (with the Linux implementation removed).
[BZ #21697]
* sysdeps/posix/spawni.c (__spawni_child): Fix return value.
(__spawnix): Do not clobber errno.
Locking overhead can be significant in some stdio operations
that are common in single threaded applications.
This patch adds the _IO_FLAGS2_NEED_LOCK flag to indicate if
an _IO_FILE object needs to be locked and some of the stdio
functions just jump to their _unlocked variant when not. The
flag is set on all _IO_FILE objects when the first thread is
created. A new GLIBC_PRIVATE libc symbol, _IO_enable_locks,
was added to do this from libpthread.
The optimization can be applied to more stdio functions;
currently it is only applied to single flag checks or single
non-wide-char standard operations. The flag should probably
never be set for files with _IO_USER_LOCK, but that's just a
further optimization, not a correctness requirement.
The optimization is valid in a single thread because stdio
operations are non-as-safe (so lock state is not observable
from a signal handler) and stdio locks are recursive (so lock
state is not observable via deadlock). The optimization is not
valid if a thread may be created while an stdio lock is taken
and thus it should be disabled if any user code may run during
an stdio operation (interposed malloc, printf hooks, etc).
This makes the optimization more complicated for some stdio
operations (e.g. printf), but those are bigger and thus less
important to optimize so this patch does not try to do that.
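A self-contained sketch of the flag check (the real code uses the
_IO_need_lock macro on the _IO_FILE flags2 word; the structure and flag value
here are stand-ins):
  #include <stdio.h>
  #define NEED_LOCK_FLAG 0x01   /* Stand-in for _IO_FLAGS2_NEED_LOCK.  */
  struct wrapped_file { FILE *fp; int flags2; };
  static int
  wrapped_getc (struct wrapped_file *f)
  {
    /* Until the first thread is created the flag stays clear, so we can
       jump straight to the unlocked variant and skip the lock overhead.  */
    if ((f->flags2 & NEED_LOCK_FLAG) == 0)
      return getc_unlocked (f->fp);
    return getc (f->fp);
  }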
* libio/libio.h (_IO_FLAGS2_NEED_LOCK, _IO_need_lock): Define.
* libio/libioP.h (_IO_enable_locks): Declare.
* libio/Versions (_IO_enable_locks): New symbol.
* libio/genops.c (_IO_enable_locks): Define.
(_IO_old_init): Initialize flags2.
* libio/feof.c (_IO_feof): Avoid locking when not needed.
* libio/ferror.c (_IO_ferror): Likewise.
* libio/fputc.c (fputc): Likewise.
* libio/putc.c (_IO_putc): Likewise.
* libio/getc.c (_IO_getc): Likewise.
* libio/getchar.c (getchar): Likewise.
* libio/ioungetc.c (_IO_ungetc): Likewise.
* nptl/pthread_create.c (__pthread_create_2_1): Enable stdio locks.
* libio/iofopncook.c (_IO_fopencookie): Enable locking for the file.
* sysdeps/pthread/flockfile.c (__flockfile): Likewise.
struct resolv_context objects provide a temporary resolver context
which does not change during a name lookup operation. Only when the
outermost context is created, the stub resolver configuration is
verified to be current (at present, only against previous res_init
calls). Subsequent attempts to obtain the context will reuse the
result of the initial verification operation.
struct resolv_context can also be extended in the future to store
data which needs to be deallocated during thread cancellation.
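The intended usage pattern looks roughly like this (the function names come
from resolv_context.h, but treat the snippet as an illustration rather than
the exact internal API):
  #include <resolv_context.h>   /* Internal glibc header (resolv/).  */
  static int
  lookup_with_context (void)
  {
    struct resolv_context *ctx = __resolv_context_get ();
    if (ctx == NULL)
      return -1;   /* errno is set by the failed verification step.  */
    /* ... perform the name lookup; the configuration held by CTX does
       not change for the duration of the operation ...  */
    __resolv_context_put (ctx);
    return 0;
  }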
In math/math.h, __MATH_TG will expand signbit to __builtin_signbit*,
e.g.: __builtin_signbitf128, before GCC 6. However, there has never
been a __builtin_signbitf128 in GCC and the type-generic builtin is
only available since GCC 6. For older GCC, this patch defines
__builtin_signbitf128 to __signbitf128, so that the internal function
is used instead of the non-existent builtin.
This patch also changes the implementation of __signbitf128, because
it was reusing the implementation of __signbitl from ldbl-128, which
calls __builtin_signbitl. Using the long double version of the
builtin is not correct on machines where _Float128 is ABI-distinct
from long double (i.e.: ia64, powerpc64le, x86, x86_64). The new
implementation does not rely on builtins when being built with GCC
versions older than 6.0.
The new code does not currently affect powerpc64le builds, because
only GCC 6.2 fulfills the requirements from configure. It might
affect powerpc64le builds if those requirements are backported to
older versions of the compiler. The new code affects x86_64 builds,
since glibc is supposed to build correctly with older versions of GCC.
Tested for powerpc64le and x86_64.
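One builtin-free way to implement the check, sketched under the assumption
that _Float128 is available and uses the IEEE binary128 format (the actual
glibc implementation may instead use its internal ieee854 union):
  #include <stdint.h>
  #include <string.h>
  int
  signbitf128_sketch (_Float128 x)
  {
    uint64_t w[2];
    memcpy (w, &x, sizeof w);   /* Copy the raw 128-bit representation.  */
  #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return w[1] >> 63;          /* The sign bit lives in the high word.  */
  #else
    return w[0] >> 63;
  #endif
  }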
* include/math.h (__signbitf128): Define as hidden.
* sysdeps/ieee754/float128/s_signbitf128.c (__signbitf128):
Reimplement without builtins.
* sysdeps/ia64/bits/floatn.h [!__GNUC_PREREQ (6, 0)]
(__builtin_signbitf128): Define to __signbitf128.
* sysdeps/powerpc/bits/floatn.h: Likewise.
* sysdeps/x86/bits/floatn.h: Likewise.
Add a new tunable (glibc.tune.cpu) to override CPU identification on
aarch64. This is useful in two cases: first, where it is desirable to
pretend to be another CPU for purposes of testing or because routines
written for that CPU are beneficial for specific workloads; and second,
where the underlying kernel does not support emulation of MRS to get
the MIDR of the CPU.
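A condensed sketch of the mechanism (structure and function names follow the
ChangeLog below; the MIDR values are illustrative, not authoritative):
  #include <stdint.h>
  #include <string.h>
  struct cpu_list { const char *name; uint64_t midr; };
  static const struct cpu_list cpu_list[] =
    {
      { "thunderxt88", 0x430F0A10 },  /* Illustrative MIDR value.  */
      { "generic",     0x0 },
    };
  static uint64_t
  get_midr_from_mcpu (const char *mcpu)
  {
    for (size_t i = 0; i < sizeof cpu_list / sizeof cpu_list[0]; i++)
      if (strcmp (mcpu, cpu_list[i].name) == 0)
        return cpu_list[i].midr;
    return UINT64_MAX;   /* Unknown name: keep the hardware-reported MIDR.  */
  }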
* elf/dl-tunables.h (tunable_is_name): Move from...
* elf/dl-tunables.c (is_name): ... here.
(parse_tunables, __tunables_init): Adjust.
* manual/tunables.texi: Document glibc.tune.cpu.
* sysdeps/aarch64/dl-tunables.list: New file.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (struct
cpu_list): New type.
(cpu_list): New list of CPU names and their MIDR.
(get_midr_from_mcpu): New function.
(init_cpu_features): Override MIDR if necessary.
The string function implementations added so far do not use any
instructions that may deviate from standard aarch64, so it is possible
for all routines to run on all armv8 hardware. Select all
implementations in the benchmarks and tests.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Unconditionally select thunderx
routine for testing.
GCC 7 changed the definition of max_align_t on i386:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2
As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.
This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:
error: allocation function 0, size 144 not aligned to 16
This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.
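Sketch of the resulting split (simplified; the generic formula is only
illustrative of what malloc-internal.h previously had):
  /* sysdeps/generic/malloc-alignment.h (sketch): keep the old generic rule.  */
  #define MALLOC_ALIGNMENT (2 * SIZE_SZ < __alignof__ (long double) \
                            ? __alignof__ (long double) : 2 * SIZE_SZ)
  /* sysdeps/i386/malloc-alignment.h (sketch): force 16-byte alignment so
     malloc chunks satisfy the GCC 7 max_align_t on i386.  */
  #define MALLOC_ALIGNMENT 16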
[BZ #21120]
* malloc/malloc-internal.h (MALLOC_ALIGNMENT): Moved to ...
* sysdeps/generic/malloc-alignment.h: Here. New file.
* sysdeps/i386/malloc-alignment.h: Likewise.
* sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.
This patch improves the default posix implementation of posix_spawn{p}
and aligns it with the Linux one. The main idea is to fix some issues already
fixed in the Linux code and to remove the internal use of vfork (a source of
various bug reports). In short:
- It removes the POSIX_SPAWN_USEVFORK usage and makes it a no-op. Since
the process that actually spawns the new process does not share
memory with the parent (as it would with vfork), this fixes BZ#14750 for
this implementation.
- It uses a pipe to correctly obtain the error status upon failure
to execute (BZ#18433); see the sketch below.
- It correctly enables/disables asynchronous cancellation (checked
with nptl/tst-exec5.c).
- It correctly disables/enables signal handling.
Using this version instead of the Linux one shows only one regression,
posix/tst-spawn3, because the pipe2 usage increases the total
number of file descriptors.
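A minimal sketch of the pipe-based error reporting (simplified and not the
actual __spawnix code; the helper name is made up):
  #define _GNU_SOURCE   /* For pipe2.  */
  #include <unistd.h>
  #include <fcntl.h>
  #include <errno.h>
  #include <sys/wait.h>
  static int
  spawn_sketch (const char *path, char *const argv[], char *const envp[])
  {
    int p[2];
    if (pipe2 (p, O_CLOEXEC) != 0)
      return errno;
    int err = 0;
    pid_t pid = fork ();
    if (pid < 0)
      err = errno;
    else if (pid == 0)
      {
        close (p[0]);
        execve (path, argv, envp);
        /* exec failed: send errno to the parent through the pipe.  */
        int child_err = errno;
        while (write (p[1], &child_err, sizeof child_err) < 0
               && errno == EINTR)
          ;
        _exit (127);
      }
    close (p[1]);
    /* O_CLOEXEC closes the write end on a successful exec, so read
       returns 0 on success and the child's errno on exec failure.  */
    if (pid > 0 && read (p[0], &err, sizeof err) <= 0)
      err = 0;
    close (p[0]);
    if (pid > 0 && err != 0)
      waitpid (pid, NULL, 0);   /* Reap the failed child.  */
    return err;
  }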
* sysdeps/posix/spawni.c (__spawni_child): New function.
(__spawni): Rename to __spawnix.
This patch implements a requirement of binutils >= 2.25 (up from 2.22)
to build glibc. Tests for 2.24 or later on x86_64 and s390 are
removed. It was already the case, as indicated by buildbot results,
that 2.24 was too old for building tests for 32-bit x86 (produced
internal linker errors linking elf/tst-gnu2-tls1mod.so). I don't know
if any configure tests for binutils features are obsolete given the
increased version requirement.
Tested for x86_64.
* configure.ac (AS): Require binutils 2.25 or later.
(LD): Likewise.
* configure: Regenerated.
* sysdeps/s390/configure.ac (AS): Remove version check.
* sysdeps/s390/configure: Regenerated.
* sysdeps/x86_64/configure.ac (AS): Remove version check.
* sysdeps/x86_64/configure: Regenerated.
* manual/install.texi (Tools for Compilation): Document
requirement for binutils 2.25 or later.
* INSTALL: Regenerated.
This patch fixes various miscellaneous namespace issues in
sys/ucontext.h headers.
Some struct tags are removed where the structs also have *_t typedef
names, while other struct tags without such names are renamed to start
__; the changes are noted in NEWS as they can affect C++ name mangling
(although there seems to be little if any external use of these types,
at least based on checking codesearch.debian.net). For powerpc,
pointers to struct pt_regs (not defined in this header) are changed to
point to struct __ctx(pt_regs), so in the __USE_MISC case those struct
fields continue to point to the existing struct pt_regs type for
maximum compatibility, while when that's a namespace issue they point
to a struct __pt_regs type which is always an incomplete struct.
Tested for affected architectures with build-many-glibcs.py.
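The renaming relies on a small helper macro in the affected headers;
schematically (simplified and shown only as an illustration):
  #ifdef __USE_MISC
  # define __ctx(fld) fld         /* Keep the traditional names.  */
  #else
  # define __ctx(fld) __ ## fld   /* Emit only reserved, __-prefixed names.  */
  #endif
  /* A field such as the powerpc pt_regs pointer then expands to either
     "struct pt_regs *regs" or "struct __pt_regs *__regs".  */
  struct __ctx(pt_regs) *__ctx(regs);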
[BZ #21457]
* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h (fpregset_t): Remove
struct tag.
* sysdeps/unix/sysv/linux/mips/sys/ucontext.h (fpregset_t):
Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h (mcontext_t):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (pt_regs):
Declare struct type with __ctx.
[__WORDSIZE != 32] (mcontext_t): Use __ctx with pt_regs struct
tag.
(ucontext_t) [__WORDSIZE == 32]: Use __ctx with pt_regs struct tag
and regs field name.
This patch provides an optimised implementation of memchr using NEON
instructions to improve its performance, especially with longer search regions.
This gave an improvement in performance against the Thumb2+DSP optimised code,
with more significant gains for larger inputs. The NEON code also wins in cases
where the input is small (less than 8 bytes) by defaulting to a simple
byte-by-byte search. This avoids the overhead imposed by filling two quadword
registers from memory.
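The small-input fallback is easiest to see in C (a stand-in for the assembly;
the function name here is made up):
  #include <stddef.h>
  /* For very short buffers a plain byte loop is cheaper than filling two
     quadword NEON registers from memory.  */
  static void *
  memchr_small (const void *s, int c, size_t n)
  {
    const unsigned char *p = s;
    for (size_t i = 0; i < n; i++)
      if (p[i] == (unsigned char) c)
        return (void *) (p + i);
    return NULL;
  }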
* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to
sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for
__memchr_neon.
Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.
Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a
full toolchain bootstrap. Benchmark tests were run on ARMv7-A and ARMv8-A
hardware targets.