Add a new tunable (glibc.tune.cpu) to override CPU identification on
aarch64. This is useful in two cases: one where it is desirable to
pretend to be another CPU for purposes of testing or because routines
written for that CPU are beneficial for specific workloads and second
where the underlying kernel does not support emulation of MRS to get
the MIDR of the CPU.
* elf/dl-tunables.h (tunable_is_name): Move from...
* elf/dl-tunables.c (is_name): ... here.
(parse_tunables, __tunables_init): Adjust.
* manual/tunables.texi: Document glibc.tune.cpu.
* sysdeps/aarch64/dl-tunables.list: New file.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (struct
cpu_list): New type.
(cpu_list): New list of CPU names and their MIDR.
(get_midr_from_mcpu): New function.
(init_cpu_features): Override MIDR if necessary.
The string function implementations implemented so far do not use any
instructions that may deviate from standard aarch64, so it is possible
for all routines to run on all armv8 hardware. Select all
implementations in the benchmarks and tests.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Unconditionally select thunderx
routine for testing.
GCC 7 changed the definition of max_align_t on i386:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2
As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.
This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:
error: allocation function 0, size 144 not aligned to 16
This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.
[BZ #21120]
* malloc/malloc-internal.h (MALLOC_ALIGNMENT): Moved to ...
* sysdeps/generic/malloc-alignment.h: Here. New file.
* sysdeps/i386/malloc-alignment.h: Likewise.
* sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.
This updates a bunch of locales based on CLDR v29 data:
az_AZ: changing Azərbaycanca to azərbaycan dili
be_BY: changing беларуская мова to беларуская
bem_ZM: changing iciBemba to Ichibemba
bg_BG: changing български език to български
bo_CN: changing པོད་སྐད་ to བོད་སྐད་
bo_IN: changing པོད་སྐད་ to བོད་སྐད་
br_FR: changing Brezhoneg to brezhoneg
brx_IN: lang_name: setting to बड़ो
ce_RU: changing нохчийн мотт to нохчийн
cs_CZ: changing Čeština to čeština
dz_BT: changing (རྫོང་ཁ to རྫོང་ཁ
el_CY: changing ελληνικά to Ελληνικά
el_GR: changing ελληνικά to Ελληνικά
es_AR: changing Español to español
es_BO: changing Español to español
es_CL: changing Español to español
es_CO: changing Español to español
es_CR: changing Español to español
es_CU: changing Español to español
es_DO: changing Español to español
es_EC: changing Español to español
es_ES: changing Español to español
es_GT: changing Español to español
es_HN: changing Español to español
es_MX: changing Español to español
es_NI: changing Español to español
es_PA: changing Español to español
es_PE: changing Español to español
es_PR: changing Español to español
es_PY: changing Español to español
es_SV: changing Español to español
es_US: changing Español to español
es_UY: changing Español to español
es_VE: changing Español to español
et_EE: changing eesti keel to eesti
eu_ES: changing Euskara to euskara
fr_BE: changing Français to français
fr_CA: changing Français to français
fr_CH: changing Français to français
fr_FR: changing Français to français
fr_LU: changing Français to français
fur_IT: changing Furlan to furlan
fy_NL: changing Frysk to West-Frysk
gl_ES: changing Galego to galego
gv_GB: changing y Ghaelg to Gaelg
he_IL: lang_name: setting to עברית
hsb_DE: changing Hornjoserbšćina to hornjoserbšćina
hy_AM: changing Հայերեն to հայերեն
id_ID: changing Bahasa Indonesia to Bahasa Indonesia
it_CH: changing Italiano to italiano
it_IT: changing Italiano to italiano
kl_GL: changing Kalaallisut to kalaallisut
km_KH: changing ភាសាខ្មែរ to ខ្មែរ
ko_KR: changing 한국말 to 한국어
ks_IN: changing kạ̄šur to کٲشُر
kw_GB: changing Kernowek to kernewek
ky_KG: changing Кыргызча to кыргызча
lg_UG: changing Oluganda to Luganda
lt_LT: changing lietuvių kalba to lietuvių
lv_LV: changing latviešu valoda to latviešu
mk_MK: changing македонск/и јазик to македонски
mn_MN: changing Монгол хэл to монгол
nb_NO: changing Bokmål to norsk bokmål
nn_NO: changing Nynorsk to nynorsk
os_RU: lang_name: setting to ирон
ru_RU: lang_name: setting to русский
ru_UA: lang_name: setting to русский
se_NO: changing Davvisámegiella to davvisámegiella
sk_SK: lang_name: setting to slovenčina
ta_IN: lang_name: setting to தமிழ்
ta_LK: lang_name: setting to தமிழ்
tk_TM: changing Türkmençe to türkmençe
tr_CY: changing Turkish to Türkçe
tr_TR: changing Turkish to Türkçe
ur_IN: lang_name: setting to {اردو}
ur_PK: lang_name: setting to {اردو}
vi_VN: changing Việt ngữ to Tiếng Việt
yo_NG: changing Yorùbá to Èdè Yorùbá
zu_ZA: changing IsiZulu to isiZulu
Most of these are simple case changes, but they match the CLDR db.
A search for a few of the others suggests they're also correct.
This patch improves the default posix implementation of posix_spawn{p}
and align with Linux one. The main idea is to fix some issues already
fixed in Linux code, and deprecated vfork internal usage (source of
various bug reports). In a short:
- It moves POSIX_SPAWN_USEVFORK usage and sets it a no-op. Since
the process that actually spawn the new process do not share
memory with parent (with vfork), it fixes BZ#14750 for this
implementation.
- It uses a pipe to correctly obtain the return upon failure
of execution (BZ#18433).
- It correctly enable/disable asynchronous cancellation (checked
on ptl/tst-exec5.c).
- It correctly disable/enable signal handling.
Using this version instead of Linux shows only one regression,
posix/tst-spawn3, because of pipe2 usage which increase total
number of file descriptor.
* sysdeps/posix/spawni.c (__spawni_child): New function.
(__spawni): Rename to __spawnix.
This patch adds tgmath.h support for _Float128, so eliminating the
awkward caveat in NEWS about the type not being supported there. This
does inevitably increase the size of macro expansions (which grows
particularly fast when you have nested calls to tgmath.h macros), but
only when _Float128 is supported and the declarations of _Float128
interfaces are visible; otherwise the expansions are unchanged.
Tested for x86_64 and arm.
* math/tgmath.h: Include <bits/libc-header-start.h> and
<bits/floatn.h>.
(__TGMATH_F128): New macro.
(__TGMATH_CF128): Likewise.
(__TGMATH_UNARY_REAL_ONLY): Use __TGMATH_F128.
(__TGMATH_UNARY_REAL_RET_ONLY): Likewise.
(__TGMATH_BINARY_FIRST_REAL_ONLY): Likewise.
(__TGMATH_BINARY_FIRST_REAL_STD_ONLY): New macro.
(__TGMATH_BINARY_REAL_ONLY): Use __TGMATH_F128.
(__TGMATH_BINARY_REAL_STD_ONLY): New macro.
(__TGMATH_BINARY_REAL_RET_ONLY): Use __TGMATH_F128.
(__TGMATH_TERNARY_FIRST_SECOND_REAL_ONLY): Likewise.
(__TGMATH_TERNARY_REAL_ONLY): Likewise.
(__TGMATH_TERNARY_FIRST_REAL_RET_ONLY): Likewise.
(__TGMATH_UNARY_REAL_IMAG): Use __TGMATH_CF128.
(__TGMATH_UNARY_IMAG): Use __TGMATH_F128.
(__TGMATH_UNARY_REAL_IMAG_RET_REAL): Use __TGMATH_CF128.
(__TGMATH_BINARY_REAL_IMAG): Likewise.
(nexttoward): Use __TGMATH_BINARY_FIRST_REAL_STD_ONLY.
[__USE_MISC] (scalb): Use __TGMATH_BINARY_REAL_STD_ONLY.
* math/gen-tgmath-tests.py (Type.init_types): Enable _FloatN and
_FloatNx types if the corresponding HUGE_VAL macros are defined.
As a GNU extension, for _GNU_SOURCE glibc's complex.h provides a
clog10 function and tgmath.h supports complex arguments to the log10
macro. However, tgmath.h uses __clog10 not clog10 in defining the
macro.
There is no namespace reason (ignoring the block-scope namespace
issues that would apply equally to *every* function called by tgmath.h
macros) for using __clog10 here, since this is only for _GNU_SOURCE so
clog10 is always visible when this macro definition is used.
Furthermore, __clog10f128 is not exported, so supporting _Float128 in
tgmath.h implies using clog10 not __clog10 there. (__clog10 and
clog10 aren't used in libstdc++ either, although that library would
have a good case for using the __clog10 reserved-namespace export: the
standard C++ library includes log10 of a complex number.) This patch
duly changes the header to use clog10, and enables tests of the macro
for complex arguments.
Tested for x86_64.
* math/tgmath.h [__USE_GNU] (log10): Use clog10 not __clog10.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Test log10 for
complex arguments.
The tgmath.h totalorder and totalordermag macros wrongly return a
floating-point type. They should return int, like the underlying
functions. This patch fixes them accordingly, updating tests
including enabling tests of those functions from gen-tgmath-tests.py.
Tested for x86_64.
[BZ #21687]
* math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): New macro.
(totalorder): Use it.
(totalordermag): Likewise.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Enable tests of
totalorder and totalordermag.
* math/test-tgmath.c (F(compile_test)): Do not call totalorder or
totalordermag in arguments of calls to those functions.
(NCALLS): Change to 134.
The tgmath.h macros for function with integer return types generate
unnecessary casts to the return type. Since in those cases the return
type does not depend on the argument type, all the cases in the
conditional expressions already have the right type, and no casts are
needed; this patch removes them.
Tested for x86_64.
* math/tgmath.h (__TGMATH_UNARY_REAL_RET_ONLY): Do not take or
cast to return type argument.
(__TGMATH_TERNARY_FIRST_REAL_RET_ONLY): Likewise.
(lrint): Update call to __TGMATH_UNARY_REAL_RET_ONLY.
(llrint): Likewise.
(lround): Likewise.
(llround): Likewise.
(ilogb): Likewise.
(llogb): Likewise.
(fromfp): Update call to __TGMATH_TERNARY_FIRST_REAL_RET_ONLY.
(ufromfp): Likewise.
(fromfpx): Likewise.
(ufromfpx): Likewise.
As noted in bug 21607, NO_LONG_DOUBLE conditionals in libm tests are
no longer effective. For most this is harmless - they were only
present because of long double functions not being declared with _LIBC
defined, and _LIBC is no longer defined for building most tests. For
the few where this is actually relevant to the test, testing
LDBL_MANT_DIG > DBL_MANT_DIG is more appropriate as that limits the
test to public APIs. This patch fixes the tests accordingly.
Tested for x86_64 and arm.
[BZ #21607]
* math/basic-test.c [!NO_LONG_DOUBLE]: Change conditionals to
[LDBL_MANT_DIG > DBL_MANT_DIG].
* math/bug-nextafter.c [!NO_LONG_DOUBLE]: Remove conditionals.
* math/bug-nexttoward.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-math-isinff.cc [!NO_LONG_DOUBLE]: Likewise.
* math/test-math-iszero.cc [!NO_LONG_DOUBLE]: Likewise.
* math/test-nan-overflow.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-nan-payload.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-nearbyint-except-2.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-nearbyint-except.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-powl.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-signgam-finite-c99.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-signgam-finite.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-signgam-main.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-snan.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-tgmath-ret.c [!NO_LONG_DOUBLE]: Likewise.
* math/test-tgmath.c: Include <float.h>.
[!NO_LONG_DOUBLE]: Change conditionals to [LDBL_MANT_DIG >
DBL_MANT_DIG].
* math/test-tgmath2.c: Include <float.h>.
[!NO_LONG_DOUBLE]: Change conditionals to [LDBL_MANT_DIG >
DBL_MANT_DIG].
This patch adds a more thorough test of tgmath.h macros, verifying
both the return type and the function called for all the cases of
valid argument types. (Cases with current problems - I've just filed
four bugs - are disabled or omitted pending fixing those problems.)
The test uses a Python generator (works with both Python 2 and 3) to
generate a C file which is then built and run as a test in the usual
way (and that C file includes its own dummy definitions of libm
functions similar to existing tgmath.h tests). The motivation is to
make it easier to add tests of tgmath.h for _Float128 when adding
tgmath.h support for that type; the _FloatN / _FloatNx support is
present in the script, but disabled until the tgmath.h support is
written.
Tested for x86_64, and for arm to check things in the long double =
double case. (In that case, it's OK to call either double or long
double functions when the selected type is double or long double, as
long as the return type of the macro is exactly correct.)
* math/gen-tgmath-tests.py: New file.
* math/Makefile [PYTHON] (tests): Add test-tgmath3.
[PYTHON] (generated): Add test-tgmath3.c.
[PYTHON] (CFLAGS-test-tgmath3.c): New variable.
[PYTHON] ($(objpfx)test-tgmath3.c): New rule.
This patch implements a requirement of binutils >= 2.25 (up from 2.22)
to build glibc. Tests for 2.24 or later on x86_64 and s390 are
removed. It was already the case, as indicated by buildbot results,
that 2.24 was too old for building tests for 32-bit x86 (produced
internal linker errors linking elf/tst-gnu2-tls1mod.so). I don't know
if any configure tests for binutils features are obsolete given the
increased version requirement.
Tested for x86_64.
* configure.ac (AS): Require binutils 2.25 or later.
(LD): Likewise.
* configure: Regenerated.
* sysdeps/s390/configure.ac (AS): Remove version check.
* sysdeps/s390/configure: Regenerated.
* sysdeps/x86_64/configure.ac (AS): Remove version check.
* sysdeps/x86_64/configure: Regenerated.
* manual/install.texi (Tools for Compilation): Document
requirement for binutils 2.25 or later.
* INSTALL: Regenerated.
This patch fixes various miscellaneous namespace issues in
sys/ucontext.h headers.
Some struct tags are removed where the structs also have *_t typedef
names, while other struct tags without such names are renamed to start
__; the changes are noted in NEWS as they can affect C++ name mangling
(although there seems to be little if any external use of these types,
at least based on checking codesearch.debian.net). For powerpc,
pointers to struct pt_regs (not defined in this header) are changed to
point to struct __ctx(pt_regs), so in the __USE_MISC case those struct
fields continue to point to the existing struct pt_regs type for
maximum compatibility, while when that's a namespace issue they point
to a struct __pt_regs type which is always an incomplete struct.
Tested for affected architectures with build-many-glibcs.py.
[BZ #21457]
* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h (fpregset_t): Remove
struct tag.
* sysdeps/unix/sysv/linux/mips/sys/ucontext.h (fpregset_t):
Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h (mcontext_t):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (pt_regs):
Declare struct type with __ctx.
[__WORDSIZE != 32] (mcontext_t): Use __ctx with pt_regs struct
tag.
(ucontext_t) [__WORDSIZE == 32]: Use __ctx with pt_regs struct tag
and regs field name.
Building the testsuite with current GCC mainline fails with:
loadtest.c: In function 'main':
loadtest.c:76:3: error: macro expands to multiple statements [-Werror=multistatement-macros]
for (map = MAPS; map != NULL; map = map->l_next) \
^
loadtest.c:165:2: note: in expansion of macro 'OUT'
OUT;
^~~
loadtest.c:164:7: note: some parts of macro expansion are not guarded by this 'if' clause
if (debug)
^~
This seems like a genuine bug, although fairly harmless; it means the
fflush call in the OUT macro is unconditional instead of being inside
the conditional as presumably intended. This patch makes this macro
use do { } while (0) to avoid the problem.
Tested for x86_64 (testsuite), and with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline.
* elf/loadtest.c (OUT): Define using do { } while (0).
Building with current GCC mainline fails with:
strftime_l.c: In function '__strftime_internal':
strftime_l.c:719:4: error: macro expands to multiple statements [-Werror=multistatement-macros]
digits = d > width ? d : width; \
^
strftime_l.c:1260:6: note: in expansion of macro 'DO_NUMBER'
DO_NUMBER (1, tp->tm_year + TM_YEAR_BASE);
^~~~~~~~~
strftime_l.c:1259:4: note: some parts of macro expansion are not guarded by this 'else' clause
else
^~~~
In fact this particular instance is harmless; the code looks like:
if (modifier == L_('O'))
goto bad_format;
else
DO_NUMBER (1, tp->tm_year + TM_YEAR_BASE);
and because of the goto, it doesn't matter that part of the expansion
isn't under the "else" conditional. But it's also clearly bad style
to rely on that. This patch changes DO_NUMBER and DO_NUMBER_SPACEPAD
to use do { } while (0) to avoid such problems.
Tested (full testsuite) for x86_64 (GCC 6), and with
build-many-glibcs.py with GCC mainline, in conjunction with my libgcc
patch <https://gcc.gnu.org/ml/gcc-patches/2017-06/msg02032.html>.
* time/strftime_l.c (DO_NUMBER): Define using do { } while (0).
(DO_NUMBER_SPACEPAD): Likewise.
This patch provides an optimised implementation of memchr using NEON
instructions to improve its performance, especially with longer search regions.
This gave an improvement in performance against the Thumb2+DSP optimised code,
with more significant gains for larger inputs. The NEON code also wins in cases
where the input is small (less than 8 bytes) by defaulting to a simple
byte-by-byte search. This avoids the overhead imposed by filling two quadword
registers from memory.
* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to
sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for
__memchr_neon.
Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.
Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a
full toolchain bootstrap. Benchmark tests were ran on ARMv7-A and ARMv8-A
hardware targets.
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils support z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu21 instruction.
ChangeLog:
* sysdeps/s390/utf8-utf16-z9.c (__to_utf8_loop_vx_cu):
Use vector and cu21 instruction.
* sysdeps/s390/multiarch/utf8-utf16-z9.c:
Add __to_utf8_loop_vx_cu in ifunc resolver.
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils support z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu24 instruction.
ChangeLog:
* sysdeps/s390/utf16-utf32-z9.c (__from_utf16_loop_vx_cu):
Use vector and cu24 instruction.
* sysdeps/s390/multiarch/utf16-utf32-z9.c:
Add __from_utf16_loop_vx_cu in ifunc resolver.
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils support z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu42 instruction.
ChangeLog:
* sysdeps/s390/utf16-utf32-z9.c (__to_utf16_loop_vx_cu):
Use vector and cu42 instruction.
* sysdeps/s390/multiarch/utf16-utf32-z9.c:
Add __to_utf16_loop_vx_cu in ifunc resolver.
This patch adds an ifunc variant to use the cu instruction on arch12 CPUs.
This new ifunc variant can be built if binutils support z13 vector
instructions. At runtime, HWCAP_S390_VXE decides if we can use the
cu41 instruction.
ChangeLog:
* sysdeps/s390/utf8-utf32-z9.c (__to_utf8_loop_vx_cu):
Use vector and cu41 instruction.
* sysdeps/s390/multiarch/utf8-utf32-z9.c: Add __to_utf8_loop_vx_cu
in ifunc resolver.
The new hwcap values indicate support for:
- Vector packed decimal facility
- Vector enhancements facility 1
- Guarded storage facility
Note: arch12 is NOT the official name of the new CPU.
It refers to the edition number of the Principle of Operations manual.
ChangeLog:
* sysdeps/s390/dl-procinfo.c (_dl_s390_cap_flags):
Add vxd, vxe, gs flag.
* sysdeps/s390/dl-procinfo.h: Add HWCAP_S390_VXD, HWCAP_S390_VXE,
HWCAP_S390_GS capability.
* sysdeps/unix/sysv/linux/s390/bits/hwcap.h
(HWCAP_S390_VXD, HWCAP_S390_VXE, HWCAP_S390_GS): Define.
Check the first 32 bytes before checking size when size >= 32 bytes
to avoid unnecessary branch if the difference is in the first 32 bytes.
Replace vpmovmskb/subl/jnz with vptest/jnc.
On Haswell, the new version is as fast as the previous one. On Skylake,
the new version is a little bit faster.
* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S (MEMCMP): Check
the first 32 bytes before checking size when size >= 32 bytes.
Replace vpmovmskb/subl/jnz with vptest/jnc.