This version uses general register based memory instruction to load
data, because vector register based is slightly slower in emag.
Character-matching is performed on 16-byte (both size and alignment)
memory block in parallel each iteration.
* sysdeps/aarch64/memchr.S (__memchr): Rename to MEMCHR.
[!MEMCHR](MEMCHR): Set to __memchr.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memchr_generic and memchr_nosimd.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add memchr ifuncs.
* sysdeps/aarch64/multiarch/memchr.c: New file.
* sysdeps/aarch64/multiarch/memchr_generic.S: Likewise.
* sysdeps/aarch64/multiarch/memchr_nosimd.S: Likewise.
This version uses general register based memory store instead of
vector register based, for the former is faster than the latter
in emag.
The fact that DC ZVA size in emag is 64-byte, is used by IFUNC
dispatch to select this memset, so that cost of runtime-check on
DC ZVA size can be saved.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memset_emag.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __memset_emag to memset ifunc.
* sysdeps/aarch64/multiarch/memset.c (libc_ifunc):
Add IS_EMAG check for ifunc dispatch.
* sysdeps/aarch64/multiarch/memset_base64.S: New file.
* sysdeps/aarch64/multiarch/memset_emag.S: New file.
Emag is a 64-bit CPU core released by AmpereComputing.
Add its name to cpu list, and corresponding macro as utilities for
later IFUNC dispatch.
* manual/tunables.texi (Tunable glibc.cpu.name): Add emag.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list):
Add emag.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_EMAG):
New macro.
From time to time I get fails in tst-spawn like:
tst-spawn.c:111: numeric comparison failure
left: 0 (0x0); from: xlseek (fd2, 0, SEEK_CUR)
right: 28 (0x1c); from: strlen (fd2string)
error: 1 test failures
tst-spawn.c:252: numeric comparison failure
left: 1 (0x1); from: WEXITSTATUS (status)
right: 0 (0x0); from: 0
error: 1 test failures
It turned out, that a child process is testing it's open file descriptors
with e.g. a sequence of testing the current position, setting the position
to zero and reading a specific amount of bytes.
Unfortunately starting with commit 2a69f853c0
the test is spawning a second child process which is sharing some of the
file descriptors. If the test sequence as mentioned above is running in parallel
it leads to test failures.
As the second call of posix_spawn shall test a NULL pid argument,
this patch is just moving the waitpid of the first child
before the posix_spawn of the second child.
ChangeLog:
* posix/tst-spawn do_test(): Move waitpid before posix_spawn.
For a full analysis of both the pthread_rwlock_tryrdlock() stall
and the pthread_rwlock_trywrlock() stall see:
https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
In the pthread_rwlock_trydlock() function we fail to inspect for
PTHREAD_RWLOCK_FUTEX_USED in __wrphase_futex and wake the waiting
readers.
In the pthread_rwlock_trywrlock() function we write 1 to
__wrphase_futex and loose the setting of the PTHREAD_RWLOCK_FUTEX_USED
bit, again failing to wake waiting readers during unlock.
The fix in the case of pthread_rwlock_trydlock() is to check for
PTHREAD_RWLOCK_FUTEX_USED and wake the readers.
The fix in the case of pthread_rwlock_trywrlock() is to only write
1 to __wrphase_futex if we installed the write phase, since all other
readers would be spinning waiting for this step.
We add two new tests, one exercises the stall for
pthread_rwlock_trywrlock() which is easy to exercise, and one exercises
the stall for pthread_rwlock_trydlock() which is harder to exercise.
The pthread_rwlock_trywrlock() test fails consistently without the fix,
and passes after. The pthread_rwlock_tryrdlock() test fails roughly
5-10% of the time without the fix, and passes all the time after.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: Rik Prohaska <prohaska7@gmail.com>
Co-authored-by: Torvald Riegel <triegel@redhat.com>
Co-authored-by: Rik Prohaska <prohaska7@gmail.com>
GLIBC explicitly allows one to assign a new FILE pointer to stdout and
other standard streams. printf and wprintf were honouring assignment to
stdout and using the new value, but puts, putchar, and wide char variants
did not.
The stdout part is fixed here. The stdin part will be fixed in a followup.
Problem found by AddressSanitizer, reported by Hongxu Chen in:
https://debbugs.gnu.org/34140
* posix/regexec.c (proceed_next_node):
Do not read past end of input buffer.
If /etc/aliases ends with a continuation line (a line that starts
with whitespace) which does not have a trailing newline character,
the file parser would crash due to a null pointer dereference.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* NEWS: Add the list of bugs fixed in 2.29.
* manual/contrib.texi: Update contributors list with some more
names.
* manual/install.texi: Update latest versions of packages
tested.
* INSTALL: Regenerated.
There was missing restore of $f3 before the return from the function
via the $y_is_neg path. This caused the math/big testcase from Go-1.11
testsuite (that includes lots of corner cases that exercise remqu) FAIL.
[BZ #24130]
* sysdeps/alpha/remqu.S (__remqu): Add missing restore
of $f3 register on $y_is_neg path.
The full representation of the alternative calendar year (%EY)
typically includes an internal use of "%Ey". As a GNU extension,
apply any flags on "%EY" (e.g. "%_EY", "%-EY") to the internal "%Ey",
allowing users of "%EY" to control how the year is padded.
Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
Reviewed-by: Zack Weinberg <zackw@panix.com>
ChangeLog:
[BZ #24096]
* manual/time.texi (strftime): Document "%EC" and "%EY".
* time/Makefile (tests): Add tst-strftime2.
(LOCALES): Add ja_JP.UTF-8, lo_LA.UTF-8, and th_TH.UTF-8.
* time/strftime_l.c (__strftime_internal): Add argument yr_spec to
override padding for "%Ey".
If an optional flag ('_' or '-') is specified to "%EY", interpret the
"%Ey" in the subformat as if decorated with that flag.
* time/tst-strftime2.c: New file.
In Japanese locales, strftime's alternative year format (%Ey) produces
a year numbered within a time period called an _era_. A new era
typically begins when a new emperor is enthroned. The result of "%Ey"
is therefore usually a one- or two-digit number.
Many programs that display Japanese era dates assume that the era year
is two digits wide. To improve how these programs display dates
during the first nine years of a new era, change "%Ey" to pad one-
digit numbers on the left with a zero. This change applies to all
locales. It is expected to be harmless for other locales that use the
alternative year format (e.g. lo_LA and th_TH, in which "%Ey" produces
the year of the Buddhist calendar) as those calendars' year numbers
are already more than two digits wide, and this is not expected to
change.
This change needs to be in place before 2019-05-01 CE, as a new era is
scheduled to begin on that date.
Reviewed-by: Zack Weinberg <zackw@panix.com>
Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
ChangeLog:
[BZ #23758]
* manual/time.texi (strftime): Document "%Ey".
* time/strftime_l.c (__strftime_internal): Set the default width
padding with zero of "%Ey" to 2.
Hurd does not support MAP_NORESERVE and MAP_STACK.
Checked on i686-gnu build.
* support/xsigstack.c (MAP_NORESERVE, MAP_STACK): Define if they
are not defined.
* hurd/lookup-at.c (__file_name_lookup_at): When at_flags contains
AT_EMPTY_PATH, call __dir_lookup and __hurd_file_name_lookup_retry
directly instead of __hurd_file_name_lookup.
The IPv4 address parser in the getaddrinfo function is changed so that
it does not ignore trailing whitespace and all characters after it.
For backwards compatibility, the getaddrinfo function still recognizes
legacy name syntax, such as 192.000.002.010 interpreted as 192.0.2.8
(octal).
This commit does not change the behavior of inet_addr and inet_aton.
gethostbyname already had additional sanity checks (but is switched
over to the new __inet_aton_exact function for completeness as well).
To avoid sending the problematic query names over DNS, commit
6ca53a2453 ("resolv: Do not send queries
for non-host-names in nss_dns [BZ #24112]") is needed.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
Clear the upper 32 bits of RSI register.
* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
and tst-size_t-wcsnlen.
* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strncpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcpy-avx2.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Likewise.
* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
* sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes the strncmp family for x32. Tested on x86-64 and x32.
On x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcmp-avx2.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/strcmp-sse42.S: Likewise.
* sysdeps/x86_64/strcmp.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp,
tst-size_t-strncmp and tst-size_t-wcsncmp.
* sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memset/wmemset for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use
RDX_LP for length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
* sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memrchr for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
* sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy.
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcmp/wmemcmp for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and
tst-size_t-wmemcmp.
* sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memchr/wmemchr for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/memchr.S: Use RDX_LP for length. Clear the
upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/test-size_t.h: New file.
* sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.
Before this commit, nss_dns would send a query which did not contain a
host name as the query name (such as invalid\032name.example.com) and
then reject the answer in getanswer_r and gaih_getanswer_slice, using
a check based on res_hnok. With this commit, no query is sent, and a
host-not-found error is returned to NSS without network interaction.
Commit 6923f6db1e ("malloc: Use current
(C11-style) atomics for fastbin access") caused a substantial
performance regression on POWER and Aarch64, and the old atomics,
while hard to prove correct, seem to work in practice.
Since MINSIGSTKSZ may not have sufficent stack space to allow lazy
binding, build tests for minimal signal handler with -Wl,-z,now to
disable lazy binding.
* signal/Makefile (LDFLAGS-tst-minsigstksz-1): New. Set to
-Wl,-z,now.
(LDFLAGS-tst-minsigstksz-2): Likewise.
(LDFLAGS-tst-minsigstksz-3): Likewise.
(LDFLAGS-tst-minsigstksz-3a): Likewise.
(LDFLAGS-tst-minsigstksz-4): Likewise.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
A single underscore was omitted in
sysdeps/powerpc/powerpc64/multiarch/strncmp.c, resulting in use of
power8 version of strncmp instead of power9 version, with significant
performance degradation.
* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Fix #ifdef.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
There is general agreement that the very short list of things that ISO
C says you can do in an async signal handler should all work when the
handler is running on an alternate signal stack with only MINSIGSTKSZ
space. This patch adds tests to make sure those things do work.
To facilitate this, there is a new set of test support routines for
setting up alternate signal stacks; see support/xsignal.h for the API.
* support/xsignal.h (xalloc_sigstack, xfree_sigstack)
(xget_sigstack_location): New test support functions.
* support/xsigstack.c: New file, implementing them.
* support/tst-xsigstack.c: New test for them.
* support/Makefile: Update.
* signal/tst-minsigstksz-1.c
* signal/tst-minsigstksz-2.c
* signal/tst-minsigstksz-3.c
* signal/tst-minsigstksz-3a.c
* signal/tst-minsigstksz-4.c: New tests.
* signal/Makefile: Run them.
Ignore 112 errors in math/test-ldouble-fma and math/test-ildouble-fma
when IBM 128-bit long double used.
These errors are caused by spurious overflows from libgcc.
* math/libm-test-fma.inc (fma_test_data): Set
XFAIL_ROUNDING_IBM128_LIBGCC to more tests.
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
An error "impossible register constraint in 'asm'" was raised on POWER
5 and due to __vector __int128_t being used as operands without passing the
option -msvx to gcc.
This patch replaces "__vector __int128_t" with "__vector unsigned int"
which requires only -maltivec, available since POWER ISA 2.03, and which
is already passed to the compiler.
* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c:
(do_test): Changed __vector __int128_t to __vector unsigned int.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Optimize x86-64 strcat/strncat, strcpy/strncpy and stpcpy/stpncpy with AVX2.
It uses vector comparison as much as possible. In general, the larger the
source string, the greater performance gain observed, reaching speedups of
1.6x compared to SSE2 unaligned routines. Select AVX2 strcat/strncat,
strcpy/strncpy and stpcpy/stpncpy on AVX2 machines where vzeroupper is
preferred and AVX unaligned load is fast.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
strcat-avx2, strncat-avx2, strcpy-avx2, strncpy-avx2,
stpcpy-avx2 and stpncpy-avx2.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c:
(__libc_ifunc_impl_list): Add tests for __strcat_avx2,
__strncat_avx2, __strcpy_avx2, __strncpy_avx2, __stpcpy_avx2
and __stpncpy_avx2.
* sysdeps/x86_64/multiarch/{ifunc-unaligned-ssse3.h =>
ifunc-strcpy.h}: rename header for a more generic name.
* sysdeps/x86_64/multiarch/ifunc-strcpy.h:
(IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if
AVX unaligned load is fast and vzeroupper is preferred.
* sysdeps/x86_64/multiarch/stpcpy-avx2.S: New file
* sysdeps/x86_64/multiarch/stpncpy-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strcat-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strcpy-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strncat-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strncpy-avx2.S: Likewise
This patch fix VSCR position on ucontext. VSCR was read in the wrong
position on ucontext structure because it was ignoring the machine
endianess.
[BZ #24088]
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (vscr_t): Added
ifdef to fix read of VSCR.
* sysdeps/powerpc/powerpc64/Makefile [$subdir == stdlib]: Add
tst-ucontext-ppc64-vscr.c to test list.
* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c: New test file.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>