posix/sched_cpucount.c assumes that size of __cpu_mask == size of long,
which is incorrect for x32. This patch uses __builtin_popcount, which
is available in GCC 4.9, in posix/sched_cpucount.c.
Tested on i686, x86-64 and x32 with multi-arch disabled.
[BZ #21696]
* posix/sched_cpucount.c: Don't include <limits.h>.
(__sched_cpucount): Use __builtin_popcount.
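The idea can be sketched as follows. This is a minimal illustration, not
the glibc source: mask_word_t and count_cpus are hypothetical names, and
the sketch assumes a GCC-compatible compiler that provides
__builtin_popcountll.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __cpu_mask.  On x32 the real type is wider
   than long, which is the case the old code mishandled.  */
typedef uint64_t mask_word_t;

/* Count the set bits in a mask that is SETSIZE bytes long.  */
static int
count_cpus (size_t setsize, const mask_word_t *mask)
{
  int count = 0;
  for (size_t i = 0; i < setsize / sizeof (mask_word_t); i++)
    /* __builtin_popcountll takes unsigned long long, so no bits are
       dropped whatever the width of long is.  */
    count += __builtin_popcountll (mask[i]);
  return count;
}
```

Because the loop is written in terms of the mask word type rather than
long, the result does not depend on sizeof (long).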
In math/math.h, __MATH_TG will expand signbit to __builtin_signbit*,
e.g.: __builtin_signbitf128, before GCC 6. However, there has never
been a __builtin_signbitf128 in GCC and the type-generic builtin is
only available since GCC 6. For older GCC, this patch defines
__builtin_signbitf128 to __signbitf128, so that the internal function
is used instead of the non-existent builtin.
This patch also changes the implementation of __signbitf128, because
it was reusing the implementation of __signbitl from ldbl-128, which
calls __builtin_signbitl. Using the long double version of the
builtin is not correct on machines where _Float128 is ABI-distinct
from long double (i.e.: ia64, powerpc64le, x86, x86_64). The new
implementation does not rely on builtins when being built with GCC
versions older than 6.0.
The new code does not currently affect powerpc64le builds, because
only GCC 6.2 fulfills the requirements from configure. It might
affect powerpc64le builds if those requirements are backported to
older versions of the compiler. The new code affects x86_64 builds,
since glibc is supposed to build correctly with older versions of GCC.
Tested for powerpc64le and x86_64.
* include/math.h (__signbitf128): Define as hidden.
* sysdeps/ieee754/float128/s_signbitf128.c (__signbitf128):
Reimplement without builtins.
* sysdeps/ia64/bits/floatn.h [!__GNUC_PREREQ (6, 0)]
(__builtin_signbitf128): Define to __signbitf128.
* sysdeps/powerpc/bits/floatn.h: Likewise.
* sysdeps/x86/bits/floatn.h: Likewise.
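A sign-bit test that does not rely on builtins can be sketched as below.
The sketch uses double rather than _Float128 so that it compiles
everywhere, assumes double is IEEE 754 binary64, and uses a hypothetical
function name; it is not the code in s_signbitf128.c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return nonzero if the sign bit of X is set,
   inspecting the binary64 representation directly instead of calling
   a compiler builtin.  */
static int
signbit_no_builtin (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);   /* Well-defined type punning.  */
  return (int) (bits >> 63);         /* The sign is the top bit.  */
}
```

The same approach extends to a 128-bit format by extracting the most
significant word of the representation and testing its top bit.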
Add a new tunable (glibc.tune.cpu) to override CPU identification on
aarch64. This is useful in two cases: one where it is desirable to
pretend to be another CPU for purposes of testing or because routines
written for that CPU are beneficial for specific workloads and second
where the underlying kernel does not support emulation of MRS to get
the MIDR of the CPU.
* elf/dl-tunables.h (tunable_is_name): Move from...
* elf/dl-tunables.c (is_name): ... here.
(parse_tunables, __tunables_init): Adjust.
* manual/tunables.texi: Document glibc.tune.cpu.
* sysdeps/aarch64/dl-tunables.list: New file.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (struct
cpu_list): New type.
(cpu_list): New list of CPU names and their MIDR.
(get_midr_from_mcpu): New function.
(init_cpu_features): Override MIDR if necessary.
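The name-to-MIDR lookup can be sketched as below. The struct name, the
function name, the CPU names, and the MIDR values are all placeholders
chosen for illustration; they are not the entries in cpu-features.c and
do not describe real CPUs.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder table entry: a CPU name and the MIDR to report for it.  */
struct cpu_entry
{
  const char *name;
  uint64_t midr;
};

/* Placeholder names and values, for illustration only.  */
static const struct cpu_entry cpu_table[] =
{
  { "example-a", 0x1111 },
  { "example-b", 0x2222 },
};

/* Return the MIDR for NAME, or UINT64_MAX if NAME is unknown so that
   the caller can keep the value read from the hardware.  */
static uint64_t
midr_from_name (const char *name)
{
  for (size_t i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++)
    if (strcmp (name, cpu_table[i].name) == 0)
      return cpu_table[i].midr;
  return UINT64_MAX;
}
```

A caller would apply the override only when the lookup succeeds, leaving
the detected MIDR untouched for unrecognized names.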
The string function implementations written so far do not use any
instructions that may deviate from standard aarch64, so it is possible
for all routines to run on all armv8 hardware. Select all
implementations in the benchmarks and tests.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Unconditionally select thunderx
routine for testing.
GCC 7 changed the definition of max_align_t on i386:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2
As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.
This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:
error: allocation function 0, size 144 not aligned to 16
This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.
[BZ #21120]
* malloc/malloc-internal.h (MALLOC_ALIGNMENT): Moved to ...
* sysdeps/generic/malloc-alignment.h: Here. New file.
* sysdeps/i386/malloc-alignment.h: Likewise.
* sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.
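The property that the failing test checks can be sketched as below. The
helper names are hypothetical and the sketch assumes a C11 compiler (for
_Alignof and max_align_t); it is not the code of
malloc/tst-malloc-thread-fail.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Return 1 if P is aligned to ALIGN bytes, where ALIGN is a power of
   two.  */
static int
is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Allocate SIZE bytes and report whether the block satisfies the
   alignment of max_align_t.  Returns -1 if the allocation fails.  */
static int
malloc_meets_max_align (size_t size)
{
  void *p = malloc (size);
  if (p == NULL)
    return -1;
  int ok = is_aligned (p, _Alignof (max_align_t));
  free (p);
  return ok;
}
```

With GCC 7 on i386, _Alignof (max_align_t) is 16, so an allocator that
only guarantees 8-byte alignment can make this check return 0.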
This patch improves the default POSIX implementation of posix_spawn{p}
and aligns it with the Linux one. The main idea is to fix some issues
already fixed in the Linux code, and to deprecate internal vfork usage
(a source of various bug reports). In short:
- It moves POSIX_SPAWN_USEVFORK usage and makes it a no-op. Since
the process that actually spawns the new process does not share
memory with the parent (as it would with vfork), it fixes BZ#14750
for this implementation.
- It uses a pipe to correctly obtain the return value upon failure
of execution (BZ#18433).
- It correctly enables/disables asynchronous cancellation (checked
on nptl/tst-exec5.c).
- It correctly disables/enables signal handling.
Using this version instead of the Linux one shows only one regression,
posix/tst-spawn3, because of pipe2 usage, which increases the total
number of file descriptors.
* sysdeps/posix/spawni.c (__spawni_child): New function.
(__spawni): Rename to __spawnix.
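The pipe technique for reporting an exec failure can be sketched as
below. This is an illustration under simplifying assumptions, not the
code in sysdeps/posix/spawni.c: it uses plain fork, pipe, and fcntl
rather than pipe2, ignores signal and cancellation handling, and the
function name is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start PATH with ARGV in a child process.
   Returns 0 and stores the child PID in *CHILD if exec succeeded, or
   returns the errno value that made fork or exec fail.  */
static int
spawn_report_errno (const char *path, char *const argv[], pid_t *child)
{
  int fds[2];
  if (pipe (fds) != 0)
    return errno;
  /* Close-on-exec: a successful exec closes the write end, so the
     parent reads EOF.  Only a failed exec leaves it open to write.  */
  fcntl (fds[1], F_SETFD, FD_CLOEXEC);

  pid_t pid = fork ();
  if (pid < 0)
    {
      int err = errno;
      close (fds[0]);
      close (fds[1]);
      return err;
    }
  if (pid == 0)
    {
      close (fds[0]);
      execv (path, argv);
      /* exec failed: send errno to the parent and exit.  */
      int err = errno;
      ssize_t w = write (fds[1], &err, sizeof err);
      (void) w;
      _exit (127);
    }

  close (fds[1]);
  int err = 0;
  ssize_t n;
  do
    n = read (fds[0], &err, sizeof err);
  while (n < 0 && errno == EINTR);
  close (fds[0]);

  if (n == (ssize_t) sizeof err)
    {
      waitpid (pid, NULL, 0);   /* Reap the child that failed to exec.  */
      return err;
    }
  *child = pid;                 /* EOF: exec succeeded.  */
  return 0;
}
```

The extra pipe is also why a caller with a tight file descriptor limit
can observe a difference, since two descriptors are open for the
duration of the spawn.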
This patch adds tgmath.h support for _Float128, so eliminating the
awkward caveat in NEWS about the type not being supported there. This
does inevitably increase the size of macro expansions (which grows
particularly fast when you have nested calls to tgmath.h macros), but
only when _Float128 is supported and the declarations of _Float128
interfaces are visible; otherwise the expansions are unchanged.
Tested for x86_64 and arm.
* math/tgmath.h: Include <bits/libc-header-start.h> and
<bits/floatn.h>.
(__TGMATH_F128): New macro.
(__TGMATH_CF128): Likewise.
(__TGMATH_UNARY_REAL_ONLY): Use __TGMATH_F128.
(__TGMATH_UNARY_REAL_RET_ONLY): Likewise.
(__TGMATH_BINARY_FIRST_REAL_ONLY): Likewise.
(__TGMATH_BINARY_FIRST_REAL_STD_ONLY): New macro.
(__TGMATH_BINARY_REAL_ONLY): Use __TGMATH_F128.
(__TGMATH_BINARY_REAL_STD_ONLY): New macro.
(__TGMATH_BINARY_REAL_RET_ONLY): Use __TGMATH_F128.
(__TGMATH_TERNARY_FIRST_SECOND_REAL_ONLY): Likewise.
(__TGMATH_TERNARY_REAL_ONLY): Likewise.
(__TGMATH_TERNARY_FIRST_REAL_RET_ONLY): Likewise.
(__TGMATH_UNARY_REAL_IMAG): Use __TGMATH_CF128.
(__TGMATH_UNARY_IMAG): Use __TGMATH_F128.
(__TGMATH_UNARY_REAL_IMAG_RET_REAL): Use __TGMATH_CF128.
(__TGMATH_BINARY_REAL_IMAG): Likewise.
(nexttoward): Use __TGMATH_BINARY_FIRST_REAL_STD_ONLY.
[__USE_MISC] (scalb): Use __TGMATH_BINARY_REAL_STD_ONLY.
* math/gen-tgmath-tests.py (Type.init_types): Enable _FloatN and
_FloatNx types if the corresponding HUGE_VAL macros are defined.
As a GNU extension, for _GNU_SOURCE glibc's complex.h provides a
clog10 function and tgmath.h supports complex arguments to the log10
macro. However, tgmath.h uses __clog10 not clog10 in defining the
macro.
There is no namespace reason (ignoring the block-scope namespace
issues that would apply equally to *every* function called by tgmath.h
macros) for using __clog10 here, since this is only for _GNU_SOURCE so
clog10 is always visible when this macro definition is used.
Furthermore, __clog10f128 is not exported, so supporting _Float128 in
tgmath.h implies using clog10 not __clog10 there. (__clog10 and
clog10 aren't used in libstdc++ either, although that library would
have a good case for using the __clog10 reserved-namespace export: the
standard C++ library includes log10 of a complex number.) This patch
duly changes the header to use clog10, and enables tests of the macro
for complex arguments.
Tested for x86_64.
* math/tgmath.h [__USE_GNU] (log10): Use clog10 not __clog10.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Test log10 for
complex arguments.
The tgmath.h totalorder and totalordermag macros wrongly return a
floating-point type. They should return int, like the underlying
functions. This patch fixes them accordingly, updating tests
including enabling tests of those functions from gen-tgmath-tests.py.
Tested for x86_64.
[BZ #21687]
* math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): New macro.
(totalorder): Use it.
(totalordermag): Likewise.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Enable tests of
totalorder and totalordermag.
* math/test-tgmath.c (F(compile_test)): Do not call totalorder or
totalordermag in arguments of calls to those functions.
(NCALLS): Change to 134.
The tgmath.h macros for functions with integer return types generate
unnecessary casts to the return type. Since in those cases the return
type does not depend on the argument type, all the cases in the
conditional expressions already have the right type, and no casts are
needed; this patch removes them.
Tested for x86_64.
* math/tgmath.h (__TGMATH_UNARY_REAL_RET_ONLY): Do not take or
cast to return type argument.
(__TGMATH_TERNARY_FIRST_REAL_RET_ONLY): Likewise.
(lrint): Update call to __TGMATH_UNARY_REAL_RET_ONLY.
(llrint): Likewise.
(lround): Likewise.
(llround): Likewise.
(ilogb): Likewise.
(llogb): Likewise.
(fromfp): Update call to __TGMATH_TERNARY_FIRST_REAL_RET_ONLY.
(ufromfp): Likewise.
(fromfpx): Likewise.
(ufromfpx): Likewise.
As noted in bug 21607, NO_LONG_DOUBLE conditionals in libm tests are
no longer effective. For most this is harmless - they were only
present because of long double functions not being declared with _LIBC
defined, and _LIBC is no longer defined for building most tests. For
the few where this is actually relevant to the test, testing
LDBL_MANT_DIG > DBL_MANT_DIG is more appropriate as that limits the
test to public APIs. This patch fixes the tests accordingly.
Tested for x86_64 and arm.
[BZ #21607]
* math/basic-test.c [!NO_LONG_DOUBLE]: Change conditionals to
[LDBL_MANT_DIG > DBL_MANT_DIG].
* math/bug-nextafter.c [!NO_LONG_DOUBLE]: Remove conditionals.
* math/bug-nexttoward.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-math-isinff.cc [!NO_LONG_DOUBLE]: Likewise.
* math/test-math-iszero.cc [!NO_LONG_DOUBLE]: Likewise.
* math/test-nan-overflow.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-nan-payload.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-nearbyint-except-2.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-nearbyint-except.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-powl.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-signgam-finite-c99.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-signgam-finite.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-signgam-main.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-snan.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-tgmath-ret.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-tgmath.c: Include <float.h>.
[!NO_LONG_DOUBLE]: Change conditionals to [LDBL_MANT_DIG >
DBL_MANT_DIG].
* math/test-tgmath2.c: Include <float.h>.
[!NO_LONG_DOUBLE]: Change conditionals to [LDBL_MANT_DIG >
DBL_MANT_DIG].
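The replacement conditional can be sketched as below. The function name
is hypothetical; the point is only that the check uses the public
<float.h> macros instead of an internal NO_LONG_DOUBLE define.

```c
#include <float.h>

/* Return 1 if long double visibly carries more precision than double,
   0 if it does not, and -1 where long double has the same format as
   double and the check does not apply.  */
static int
extra_precision_visible (void)
{
#if LDBL_MANT_DIG > DBL_MANT_DIG
  volatile long double one = 1.0L;
  /* Half of DBL_EPSILON is lost when added to 1.0 in double, but is
     kept when long double has a wider significand.  */
  volatile long double tiny = DBL_EPSILON / 2;
  return one + tiny != one;
#else
  return -1;
#endif
}
```

On configurations where long double has the same format as double, the
long-double-specific part is compiled out.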
This patch adds a more thorough test of tgmath.h macros, verifying
both the return type and the function called for all the cases of
valid argument types. (Cases with current problems - I've just filed
four bugs - are disabled or omitted pending fixing those problems.)
The test uses a Python generator (works with both Python 2 and 3) to
generate a C file which is then built and run as a test in the usual
way (and that C file includes its own dummy definitions of libm
functions similar to existing tgmath.h tests). The motivation is to
make it easier to add tests of tgmath.h for _Float128 when adding
tgmath.h support for that type; the _FloatN / _FloatNx support is
present in the script, but disabled until the tgmath.h support is
written.
Tested for x86_64, and for arm to check things in the long double =
double case. (In that case, it's OK to call either double or long
double functions when the selected type is double or long double, as
long as the return type of the macro is exactly correct.)
* math/gen-tgmath-tests.py: New file.
* math/Makefile [PYTHON] (tests): Add test-tgmath3.
[PYTHON] (generated): Add test-tgmath3.c.
[PYTHON] (CFLAGS-test-tgmath3.c): New variable.
[PYTHON] ($(objpfx)test-tgmath3.c): New rule.
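The kind of check the generated test performs can be sketched as below.
All names are hypothetical, and the sketch uses C11 _Generic both for
dispatch and for type checking, which is an assumption of this example
rather than a description of how tgmath.h or the generated file is
written.

```c
/* Dummy stand-ins for libm functions; each records which variant ran.  */
static const char *called;

static float
my_fabsf (float x)
{
  called = "f";
  return x < 0 ? -x : x;
}

static double
my_fabs (double x)
{
  called = "d";
  return x < 0 ? -x : x;
}

static long double
my_fabsl (long double x)
{
  called = "l";
  return x < 0 ? -x : x;
}

/* Type-generic macro.  Integer arguments select the double variant.  */
#define my_tg_fabs(x) \
  _Generic ((x), float: my_fabsf, long double: my_fabsl, \
	    default: my_fabs) (x)

/* Map the type of an expression to a tag so that a test can compare
   it.  The operand is not evaluated.  */
#define type_tag(e) \
  _Generic ((e), float: 'f', double: 'd', long double: 'l', default: '?')
```

A test then checks type_tag (my_tg_fabs (arg)) for the return type and
inspects called after an actual call to see which function ran.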
This patch implements a requirement of binutils >= 2.25 (up from 2.22)
to build glibc. Tests for 2.24 or later on x86_64 and s390 are
removed. It was already the case, as indicated by buildbot results,
that 2.24 was too old for building tests for 32-bit x86 (produced
internal linker errors linking elf/tst-gnu2-tls1mod.so). I don't know
if any configure tests for binutils features are obsolete given the
increased version requirement.
Tested for x86_64.
* configure.ac (AS): Require binutils 2.25 or later.
(LD): Likewise.
* configure: Regenerated.
* sysdeps/s390/configure.ac (AS): Remove version check.
* sysdeps/s390/configure: Regenerated.
* sysdeps/x86_64/configure.ac (AS): Remove version check.
* sysdeps/x86_64/configure: Regenerated.
* manual/install.texi (Tools for Compilation): Document
requirement for binutils 2.25 or later.
* INSTALL: Regenerated.
This patch fixes various miscellaneous namespace issues in
sys/ucontext.h headers.
Some struct tags are removed where the structs also have *_t typedef
names, while other struct tags without such names are renamed to start
__; the changes are noted in NEWS as they can affect C++ name mangling
(although there seems to be little if any external use of these types,
at least based on checking codesearch.debian.net). For powerpc,
pointers to struct pt_regs (not defined in this header) are changed to
point to struct __ctx(pt_regs), so in the __USE_MISC case those struct
fields continue to point to the existing struct pt_regs type for
maximum compatibility, while when that's a namespace issue they point
to a struct __pt_regs type which is always an incomplete struct.
Tested for affected architectures with build-many-glibcs.py.
[BZ #21457]
* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h (fpregset_t): Remove
struct tag.
* sysdeps/unix/sysv/linux/mips/sys/ucontext.h (fpregset_t):
Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h (mcontext_t):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (pt_regs):
Declare struct type with __ctx.
[__WORDSIZE != 32] (mcontext_t): Use __ctx with pt_regs struct
tag.
(ucontext_t) [__WORDSIZE == 32]: Use __ctx with pt_regs struct tag
and regs field name.
Building the testsuite with current GCC mainline fails with:
loadtest.c: In function 'main':
loadtest.c:76:3: error: macro expands to multiple statements [-Werror=multistatement-macros]
for (map = MAPS; map != NULL; map = map->l_next) \
^
loadtest.c:165:2: note: in expansion of macro 'OUT'
OUT;
^~~
loadtest.c:164:7: note: some parts of macro expansion are not guarded by this 'if' clause
if (debug)
^~
This seems like a genuine bug, although fairly harmless; it means the
fflush call in the OUT macro is unconditional instead of being inside
the conditional as presumably intended. This patch makes this macro
use do { } while (0) to avoid the problem.
Tested for x86_64 (testsuite), and with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline.
* elf/loadtest.c (OUT): Define using do { } while (0).
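The shape of the fix can be sketched as below. The macro body and the
counters are hypothetical stand-ins, not the contents of the OUT macro
in elf/loadtest.c.

```c
static int printed;
static int flushed;

/* Hypothetical two-statement macro.  Without the do/while wrapper,
   "if (debug) OUT ();" would guard only the first statement and the
   second would always run.  With it, the body is a single statement
   that also composes correctly with "else".  */
#define OUT() \
  do            \
    {           \
      printed++; \
      flushed++; \
    }           \
  while (0)

/* Return the number of flushes performed for a given DEBUG setting.  */
static int
run (int debug)
{
  printed = 0;
  flushed = 0;
  if (debug)
    OUT ();
  else
    printed = -1;
  return flushed;
}
```

With the wrapper in place, run (0) performs no flush, which is the
behaviour the conditional was presumably meant to have.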
Building with current GCC mainline fails with:
strftime_l.c: In function '__strftime_internal':
strftime_l.c:719:4: error: macro expands to multiple statements [-Werror=multistatement-macros]
digits = d > width ? d : width; \
^
strftime_l.c:1260:6: note: in expansion of macro 'DO_NUMBER'
DO_NUMBER (1, tp->tm_year + TM_YEAR_BASE);
^~~~~~~~~
strftime_l.c:1259:4: note: some parts of macro expansion are not guarded by this 'else' clause
else
^~~~
In fact this particular instance is harmless; the code looks like:
if (modifier == L_('O'))
goto bad_format;
else
DO_NUMBER (1, tp->tm_year + TM_YEAR_BASE);
and because of the goto, it doesn't matter that part of the expansion
isn't under the "else" conditional. But it's also clearly bad style
to rely on that. This patch changes DO_NUMBER and DO_NUMBER_SPACEPAD
to use do { } while (0) to avoid such problems.
Tested (full testsuite) for x86_64 (GCC 6), and with
build-many-glibcs.py with GCC mainline, in conjunction with my libgcc
patch <https://gcc.gnu.org/ml/gcc-patches/2017-06/msg02032.html>.
* time/strftime_l.c (DO_NUMBER): Define using do { } while (0).
(DO_NUMBER_SPACEPAD): Likewise.
This patch provides an optimised implementation of memchr using NEON
instructions to improve its performance, especially with longer search regions.
This gave an improvement in performance against the Thumb2+DSP optimised code,
with more significant gains for larger inputs. The NEON code also wins in cases
where the input is small (less than 8 bytes) by defaulting to a simple
byte-by-byte search. This avoids the overhead imposed by filling two quadword
registers from memory.
* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to
sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for
__memchr_neon.
Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.
Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a
full toolchain bootstrap. Benchmark tests were run on ARMv7-A and ARMv8-A
hardware targets.
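The small-input fallback can be sketched in portable C as below. This is
only an analogy for the assembly: a 64-bit word comparison stands in for
the NEON vector compare, the 8-byte threshold is taken from the
description above, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simple byte-by-byte search over N bytes starting at P.  */
static const void *
find_bytewise (const unsigned char *p, unsigned char c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (p[i] == c)
      return p + i;
  return NULL;
}

/* Hypothetical memchr variant: inputs shorter than 8 bytes use the
   byte loop directly; longer ones are scanned 8 bytes per step.  */
static const void *
find_byte (const void *s, int ch, size_t n)
{
  const unsigned char *p = s;
  unsigned char c = (unsigned char) ch;
  if (n < 8)
    return find_bytewise (p, c, n);

  const uint64_t ones = 0x0101010101010101ULL;
  const uint64_t highs = 0x8080808080808080ULL;
  const uint64_t pattern = ones * c;     /* C repeated in every byte.  */
  while (n >= 8)
    {
      uint64_t w;
      memcpy (&w, p, sizeof w);          /* Unaligned-safe load.  */
      w ^= pattern;                      /* Matching bytes become zero.  */
      if (((w - ones) & ~w & highs) != 0)
	return find_bytewise (p, c, 8);  /* Locate the match in this chunk.  */
      p += 8;
      n -= 8;
    }
  return find_bytewise (p, c, n);        /* Tail of fewer than 8 bytes.  */
}
```

Skipping the wide path for short inputs avoids paying the setup cost
when there are too few bytes to amortize it.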
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils supports z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu21 instruction.
ChangeLog:
* sysdeps/s390/utf8-utf16-z9.c (__to_utf8_loop_vx_cu):
Use vector and cu21 instruction.
* sysdeps/s390/multiarch/utf8-utf16-z9.c:
Add __to_utf8_loop_vx_cu in ifunc resolver.
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils supports z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu24 instruction.
ChangeLog:
* sysdeps/s390/utf16-utf32-z9.c (__from_utf16_loop_vx_cu):
Use vector and cu24 instruction.
* sysdeps/s390/multiarch/utf16-utf32-z9.c:
Add __from_utf16_loop_vx_cu in ifunc resolver.
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils supports z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu42 instruction.
ChangeLog:
* sysdeps/s390/utf16-utf32-z9.c (__to_utf16_loop_vx_cu):
Use vector and cu42 instruction.
* sysdeps/s390/multiarch/utf16-utf32-z9.c:
Add __to_utf16_loop_vx_cu in ifunc resolver.
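The runtime selection common to these three variants can be sketched as
below. The capability bit, the function names, and the return values are
hypothetical; a real ifunc resolver receives the hardware capability
bits from the dynamic loader and tests HWCAP_S390_VXE, whereas here the
caller passes the bits in so that the choice is easy to exercise.

```c
/* Hypothetical capability bit standing in for HWCAP_S390_VXE.  */
#define EXAMPLE_HWCAP_FAST (1UL << 3)

typedef int (*convert_fn) (void);

/* Placeholder implementations; each returns an identifying value.  */
static int
convert_generic (void)
{
  return 0;
}

static int
convert_fast (void)
{
  return 1;
}

/* Pick an implementation from the capability bits.  */
static convert_fn
select_convert (unsigned long hwcap)
{
  return (hwcap & EXAMPLE_HWCAP_FAST) ? convert_fast : convert_generic;
}
```

The build-time condition on binutils decides whether the fast variant
exists at all; the runtime check then decides whether it is used.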