Commit Graph

34062 Commits

Author SHA1 Message Date
Florian Weimer
5dab5eafb3 malloc: Various cleanups for malloc/tst-mxfast
(cherry picked from commit f9769a2397)
2019-10-30 19:03:40 +01:00
DJ Delorie
cb89ba9c72 Add glibc.malloc.mxfast tunable
* elf/dl-tunables.list: Add glibc.malloc.mxfast.
* manual/tunables.texi: Document it.
* malloc/malloc.c (do_set_mxfast): New.
(__libc_mallopt): Call it.
* malloc/arena.c: Add mxfast tunable.
* malloc/tst-mxfast.c: New.
* malloc/Makefile: Add it.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit c48d92b430)
2019-10-30 19:03:40 +01:00
Andreas Schwab
a645c48756 nscd: avoid assertion failure during persistent db check
nscd should not abort when it finds inconsistencies in the persistent db.

(cherry picked from commit 61595e3d36)
2019-10-30 13:59:26 +01:00
Wilco Dijkstra
f88c59f465 Small tcache improvements
Change the tcache->counts[] entries to uint16_t - this removes
the limit set by char and allows a larger tcache.  Remove a few
redundant asserts.

bench-malloc-thread with 4 threads is ~15% faster on Cortex-A72.

Reviewed-by: DJ Delorie <dj@redhat.com>

	* malloc/malloc.c (MAX_TCACHE_COUNT): Increase to UINT16_MAX.
	(tcache_put): Remove redundant assert.
	(tcache_get): Remove redundant asserts.
	(__libc_malloc): Check tcache count is not zero.
	* manual/tunables.texi (glibc.malloc.tcache_count): Update maximum.

(cherry picked from commit 1f50f2ad85)
2019-10-30 12:01:42 +01:00
Joseph Myers
3640758943 Fix assertion in malloc.c:tcache_get.
One of the warnings that appears with -Wextra is "ordered comparison
of pointer with integer zero" in malloc.c:tcache_get, for the
assertion:

  assert (tcache->entries[tc_idx] > 0);

Indeed, a "> 0" comparison does not make sense for
tcache->entries[tc_idx], which is a pointer.  My guess is that
tcache->counts[tc_idx] is what's intended here, and this patch changes
the assertion accordingly.

Tested for x86_64.

	* malloc/malloc.c (tcache_get): Compare tcache->counts[tc_idx]
	with 0, not tcache->entries[tc_idx].

(cherry picked from commit 77dc0d8643)
2019-10-30 11:56:49 +01:00
Wilco Dijkstra
b58bace7ce Improve performance of memmem
This patch significantly improves performance of memmem using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 2 use a dedicated linear search.  Very long needles
use the Two-Way algorithm (to avoid increasing stack size or slowing down
the common case, inlining is disabled).

The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8.

Tested against GLIBC testsuite and randomized tests.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/memmem.c (__memmem): Rewrite to improve performance.

(cherry picked from commit 680942b016)
2019-09-13 15:19:26 +01:00
Wilco Dijkstra
354f52e984 Improve performance of strstr
This patch significantly improves performance of strstr using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 3 use a dedicated linear search.  Very long needles
use the Two-Way algorithm.

The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/str-two-way.h (two_way_short_needle): Add inline to avoid
	warning.
	(two_way_long_needle): Block inlining.
	* string/strstr.c (strstr2): Add new function.
	(strstr3): Likewise.
	(STRSTR): Completely rewrite strstr to improve performance.

(cherry picked from commit 5e0a7ecb66)
2019-09-13 15:19:19 +01:00
Rajalakshmi Srinivasaraghavan
d6ccf2f45c Speedup first memmem match
As done in commit 284f42bc77, memcmp
can be used after memchr to avoid the initialization overhead of the
two-way algorithm for the first match.  This has shown improvement
>40% for first match.

(cherry picked from commit c8dd67e7c9)
2019-09-13 15:17:56 +01:00
Wilco Dijkstra
1def6a34ae Simplify and speedup strstr/strcasestr first match
Looking at the benchtests, both strstr and strcasestr spend a lot of time
in a slow initialization loop handling one character per iteration.
This can be simplified and use the much faster strlen/strnlen/strchr/memcmp.
Read ahead a few cachelines to reduce the number of strnlen calls, which
improves performance by ~3-4%.  This patch improves the time taken for the
full strstr benchtest by >40%.

	* string/strcasestr.c (STRCASESTR): Simplify and speedup first match.
	* string/strstr.c (AVAILABLE): Likewise.

(cherry picked from commit 284f42bc77)
2019-09-13 15:15:36 +01:00
Wilco Dijkstra
1533274d5f [AArch64] Add ifunc support for Ares
Add Ares to the midr_el0 list and support ifunc dispatch.  Since Ares
supports 2 128-bit loads/stores, use Neon registers for memcpy by
selecting __memcpy_falkor by default (we should rename this to
__memcpy_simd or similar).

	* manual/tunables.texi (glibc.cpu.name): Add ares tunable.
	* sysdeps/aarch64/multiarch/memcpy.c (__libc_memcpy): Use
	__memcpy_falkor for ares.
	* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_ARES):
	Add new define.
	* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list):
	Add ares cpu.

(cherry picked from commit 02f440c1ef)
2019-09-06 18:49:02 +01:00
Adhemerval Zanella
57922433fa posix: Fix large mmap64 offset for mips64n32 (BZ#24699)
The fix for BZ#21270 (commit 158d5fa0e1) added a mask to avoid offset larger
than 1^44 to be used along __NR_mmap2.  However mips64n32 users __NR_mmap,
as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such
x32, defines off_t being equal to off64_t).  This leads to use the same
mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum
offset it can use with mmap64.

This patch fixes by setting the high mask only for __NR_mmap2 usage. The
posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The
patch also change the test to check for an arch-specific header that defines
the maximum supported offset.

Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset
on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.

	[BZ #24699]
	* posix/tst-mmap-offset.c: Mention BZ #24699.
	(do_test_bz21270): Rename to do_test_large_offset and use
	mmap64_maximum_offset to check for maximum expected offset value.
	* sysdeps/generic/mmap_info.h: New file.
	* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
	* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
	__NR_mmap2 is used.

(cherry picked from commit a008c76b56)
2019-07-12 20:59:32 +00:00
Szabolcs Nagy
298e659172 aarch64: handle STO_AARCH64_VARIANT_PCS
Backport of commit 82bc69c012
and commit 30ba037546
without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check.
This is needed so the internal abi between ld.so and libc.so is unchanged.

Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.

Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls.  Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.

The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially.  In this
patch such symbols are always resolved at load time, not lazily.

So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).

This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.

	* sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check
	STO_AARCH64_VARIANT_PCS and bind such symbols at load time.
2019-07-12 10:37:56 +01:00
Szabolcs Nagy
b95c8a3150 aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCS
STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking
symbols that reference functions that may follow a variant PCS with
different register usage convention from the base PCS.

DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that
have R_*_JUMP_SLOT relocations for symbols marked with
STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT).

	* elf/elf.h (STO_AARCH64_VARIANT_PCS): Define.
	(DT_AARCH64_VARIANT_PCS): Define.
2019-07-10 15:49:37 +01:00
Florian Weimer
5f7fa9ac30 io: Remove copy_file_range emulation [BZ #24744]
The kernel is evolving this interface (e.g., removal of the
restriction on cross-device copies), and keeping up with that
is difficult.  Applications which need the function should
run kernels which support the system call instead of relying on
the imperfect glibc emulation.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 5a659ccc0e)
2019-07-09 10:33:10 +02:00
Dmitry V. Levin
91b02c5b4d libio: do not attempt to free wide buffers of legacy streams [BZ #24228]
Commit a601b74d31 aka glibc-2.23~693
("In preparation for fixing BZ#16734, fix failure in misc/tst-error1-mem
when _G_HAVE_MMAP is turned off.") introduced a regression:
_IO_unbuffer_all now invokes _IO_wsetb to free wide buffers of all
files, including legacy standard files which are small statically
allocated objects that do not have wide buffers and the _mode member,
causing memory corruption.

Another memory corruption in _IO_unbuffer_all happens when -1
is assigned to the _mode member of legacy standard files that
do not have it.

[BZ #24228]
* libio/genops.c (_IO_unbuffer_all)
[SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)]: Do not attempt to free wide
buffers and access _IO_FILE_complete members of legacy libio streams.
* libio/tst-bz24228.c: New file.
* libio/tst-bz24228.map: Likewise.
* libio/Makefile [build-shared] (tests): Add tst-bz24228.
[build-shared] (generated): Add tst-bz24228.mtrace and
tst-bz24228.check.
[run-built-tests && build-shared] (tests-special): Add
$(objpfx)tst-bz24228-mem.out.
(LDFLAGS-tst-bz24228, tst-bz24228-ENV): New variables.
($(objpfx)tst-bz24228-mem.out): New rule.

(cherry picked from commit 21cc130b78)
2019-06-20 17:32:07 +00:00
Wilco Dijkstra
58d2672f64 Fix tcache count maximum (BZ #24531)
The tcache counts[] array is a char, which has a very small range and thus
may overflow.  When setting tcache_count tunable, there is no overflow check.
However the tunable must not be larger than the maximum value of the tcache
counts[] array, otherwise it can overflow when filling the tcache.

	[BZ #24531]
	* malloc/malloc.c (MAX_TCACHE_COUNT): New define.
	(do_set_tcache_count): Only update if count is small enough.
	* manual/tunables.texi (glibc.malloc.tcache_count): Document max value.

(cherry picked from commit 5ad533e8e6)
2019-05-22 14:37:54 +01:00
Mark Wielaard
059d6750f9 dlfcn: Guard __dlerror_main_freeres with __libc_once_get (once) [BZ#24476]
dlerror.c (__dlerror_main_freeres) will try to free resources which only
have been initialized when init () has been called. That function is
called when resources are needed using __libc_once (once, init) where
once is a __libc_once_define (static, once) in the dlerror.c file.
Trying to free those resources if init () hasn't been called will
produce errors under valgrind memcheck. So guard the freeing of those
resources using __libc_once_get (once) and make sure we have a valid
key. Also add a similar guard to __dlerror ().

	* dlfcn/dlerror.c (__dlerror_main_freeres): Guard using
	__libc_once_get (once) and static_bug == NULL.
	(__dlerror): Check we have a valid key, set result to static_buf
	otherwise.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 11b451c886)
2019-05-16 15:19:44 +02:00
Andreas Schwab
d948478bc5 Fix crash in _IO_wfile_sync (bug 20568)
When computing the length of the converted part of the stdio buffer, use
the number of consumed wide characters, not the (negative) distance to the
end of the wide buffer.

(cherry picked from commit 32ff397533)
2019-05-15 17:08:48 +02:00
Adam Maris
4a5e58827f malloc: Check for large bin list corruption when inserting unsorted chunk
Fixes bug 24216. This patch adds security checks for bk and bk_nextsize pointers
of chunks in large bin when inserting chunk from unsorted bin. It was possible
to write the pointer to victim (newly inserted chunk) to arbitrary memory
locations if bk or bk_nextsize pointers of the next large bin chunk
got corrupted.

(cherry picked from commit 5b06f538c5)
2019-05-02 14:29:11 +02:00
Istvan Kurucsai
38e8981833 malloc: Check the alignment of mmapped chunks before unmapping.
* malloc/malloc.c (munmap_chunk): Verify chunk alignment.

(cherry picked from commit c0e82f1173)
2019-05-02 14:28:59 +02:00
Istvan Kurucsai
b9054d6602 malloc: Add more integrity checks to mremap_chunk.
* malloc/malloc.c (mremap_chunk): Additional checks.

(cherry picked from commit ebe544bf6e)
2019-05-02 14:28:48 +02:00
Florian Weimer
4d24775292 malloc: Add ChangeLog for accidentally committed change
Commit b90ddd08f6 ("malloc: Additional
checks for unsorted bin integrity I.") was committed without a
whitespace fix, so it is adjusted here as well.

(cherry picked from commit 35cfefd960)
2019-05-02 14:27:51 +02:00
Adhemerval Zanella
5cbb73004b elf: Fix pldd (BZ#18035)
Since 9182aa6799 (Fix vDSO l_name for GDB's, BZ#387) the initial link_map
for executable itself and loader will have both l_name and l_libname->name
holding the same value due:

 elf/dl-object.c

 95   new->l_name = *realname ? realname : (char *) newname->name + libname_len - 1;

Since newname->name points to new->l_libname->name.

This leads to pldd to an infinite call at:

 elf/pldd-xx.c

203     again:
204       while (1)
205         {
206           ssize_t n = pread64 (memfd, tmpbuf.data, tmpbuf.length, name_offset);

228           /* Try the l_libname element.  */
229           struct E(libname_list) ln;
230           if (pread64 (memfd, &ln, sizeof (ln), m.l_libname) == sizeof (ln))
231             {
232               name_offset = ln.name;
233               goto again;
234             }

Since the value at ln.name (l_libname->name) will be the same as previously
read. The straightforward fix is just avoid the check and read the new list
entry.

I checked also against binaries issues with old loaders with fix for BZ#387,
and pldd could dump the shared objects.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.

	[BZ #18035]
	* elf/Makefile (tests-container): Add tst-pldd.
	* elf/pldd-xx.c: Use _Static_assert in of pldd_assert.
	(E(find_maps)): Avoid use alloca, use default read file operations
	instead of explicit LFS names, and fix infinite	loop.
	* elf/pldd.c: Explicit set _FILE_OFFSET_BITS, cleanup headers.
	(get_process_info): Use _Static_assert instead of assert, use default
	directory operations instead of explicit LFS names, and free some
	leadek pointers.
	* elf/tst-pldd.c: New file.

(cherry picked from commit 1a4c27355e)
(Backported without the test case due to lack of test-in-container
support.)
2019-04-26 14:08:20 +02:00
Florian Weimer
9b3aac5869 Revert "memusagestat: use local glibc when linking [BZ #18465]"
This reverts commit 630d7201ce.

The position of the -Wl,-rpath-link= options on the linker command
line is not correct, so the new way of linking memusagestat does not
always work.
2019-04-25 14:59:42 +02:00
Mike Frysinger
630d7201ce memusagestat: use local glibc when linking [BZ #18465]
The memusagestat is the only binary that has its own link line which
causes it to be linked against the existing installed C library.  It
has been this way since it was originally committed in 1999, but I
don't see any reason as to why.  Since we want all the programs we
build locally to be against the new copy of glibc, change the build
to be like all other programs.

(cherry picked from commit f9b645b4b0)
2019-04-24 19:30:53 +02:00
TAMUKI Shoichi
7423da211d ja_JP locale: Add entry for the new Japanese era [BZ #22964]
The Japanese era name will be changed on May 1, 2019.  The Japanese
government made a preliminary announcement on April 1, 2019.

The glibc ja_JP locale must be updated to include the new era name for
strftime's alternative year format support.

This is a minimal cherry pick of just the required locale changes.

(cherry picked from commit 466afec308)
2019-04-03 18:23:14 -04:00
Stefan Liebler
4aeff335ca S390: Mark vx and vxe as important hwcap.
This patch adds vx and vxe as important hwcaps
which allows one to provide shared libraries
tuned for platforms with non-vx/-vxe, vx or vxe.

ChangeLog:

	* sysdeps/s390/dl-procinfo.h (HWCAP_IMPORTANT):
	Add HWCAP_S390_VX and HWCAP_S390_VXE.

(cherry picked from commit 61f5e9470f)

Conflicts:
	ChangeLog
2019-03-21 09:29:57 +01:00
Aurelien Jarno
54e725e39d Record CVE-2019-9169 in NEWS and ChangeLog [BZ #24114]
(cherry picked from commit b626c5aa5d)
2019-03-16 23:34:23 +01:00
Paul Eggert
2aee101ff6 regex: fix read overrun [BZ #24114]
Problem found by AddressSanitizer, reported by Hongxu Chen in:
https://debbugs.gnu.org/34140
* posix/regexec.c (proceed_next_node):
Do not read past end of input buffer.

(cherry picked from commit 583dd860d5)
2019-03-16 23:32:36 +01:00
Andreas Schwab
4bf5ab3196 RISC-V: don't assume PI mutexes and robust futexes before 4.20 (bug 23864)
Support for futex_cmpxchg as only been added to 4.20-rc1.

(cherry picked from commit 295132ff05)
2019-03-14 22:59:06 +01:00
Adhemerval Zanella
e5366c12d0 powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC
Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
instead it suspend and resume it when leaving the kernel.  The
side-effects of the syscall will always remain visible, even if the
transaction is aborted.  This is an issue when transaction is used along
with futex syscall, on pthread_cond_wait for instance, where the futex
call might succeed but the transaction is rolled back leading the
pthread_cond object in an inconsistent state.

Glibc used to prevent it by always aborting a transaction before issuing
a syscall.  Linux 4.2 also decided to abort active transaction in
syscalls which makes the glibc workaround superfluous.  Worse, glibc
transaction abortion leads to a performance issue on recent kernels
where the HTM state is saved/restore lazily (v4.9).  By aborting a
transaction on every syscalls, regardless whether a transaction has being
initiated before, GLIBS makes the kernel always save/restore HTM state
(it can not even lazily disable it after a certain number of syscall
iterations).

Because of this shortcoming, Transactional Lock Elision is just enabled
when it has been explicitly set (either by tunables of by a configure
switch) and if kernel aborts HTM transactions on syscalls
(PPC_FEATURE2_HTM_NOSC).  It is reported that using simple benchmark [1],
the context-switch is about 5% faster by not issuing a tabort in every
syscall in newer kernels.

Checked on powerpc64le-linux-gnu with 4.4.0 kernel (Ubuntu 16.04).

	* NEWS: Add note about new TLE support on powerpc64le.
	* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
	* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
	__ununsed1.
	(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
	(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
	* sysdeps/powerpc/powerpc32/sysdep.h,
	sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
	ABORT_TRANSACTION): Remove macros.
	* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
	__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
	sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
	sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
	usage.
	* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.

Reported-by: Breno Leitão <leitao@debian.org>
(cherry picked from commit f0458cf4f9)
2019-02-27 17:36:47 +01:00
Jim Wilson
384113d1c0 RISC-V: Fix elfutils testsuite unwind failures.
The clone.S patch fixes 2 elfutils testsuite unwind failures, where the
backtrace gets stuck repeating __thread_start until we hit the backtrace
limit.  This was confirmed by building and installing a patched glibc and
then building elfutils and running its testsuite.

Unfortunately, the testcase isn't working as expected and I don't know why.
The testcase passes even when my clone.S patch is not installed.  The testcase
looks logically similarly to the elfutils testcases that are failing.  Maybe
there is a subtle difference in how the glibc unwinding works versus the
elfutils unwinding?  I don't have good gdb pthread support yet, so I haven't
found a way to debug this.  Anyways, I don't know if the testcase is useful or
not.  If the testcase isn't useful then maybe the clone.S patch is OK without
a testcase?

Jim

	[BZ #24040]
	* elf/Makefile (CFLAGS-tst-unwind-main.c): Add -DUSE_PTHREADS=0.
	* elf/tst-unwind-main.c: If USE_PTHEADS, include pthread.h and error.h
	(func): New.
	(main): If USE_PTHREADS, call pthread_create to run func.  Otherwise
	call func directly.
	* nptl/Makefile (tests): Add tst-unwind-thread.
	(CFLAGS-tst-unwind-thread.c): Define.
	* nptl/tst-unwind-thread.c: New file.
	* sysdeps/unix/sysv/linux/riscv/clone.S (__thread_start): Mark ra
	as undefined.

(cherry picked from commit 85bd1ddbdf)
2019-02-19 06:38:23 +01:00
Carlos O'Donell
e8c13d5f7a nptl: Fix pthread_rwlock_try*lock stalls (Bug 23844)
For a full analysis of both the pthread_rwlock_tryrdlock() stall
and the pthread_rwlock_trywrlock() stall see:
https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14

In the pthread_rwlock_trydlock() function we fail to inspect for
PTHREAD_RWLOCK_FUTEX_USED in __wrphase_futex and wake the waiting
readers.

In the pthread_rwlock_trywrlock() function we write 1 to
__wrphase_futex and loose the setting of the PTHREAD_RWLOCK_FUTEX_USED
bit, again failing to wake waiting readers during unlock.

The fix in the case of pthread_rwlock_trydlock() is to check for
PTHREAD_RWLOCK_FUTEX_USED and wake the readers.

The fix in the case of pthread_rwlock_trywrlock() is to only write
1 to __wrphase_futex if we installed the write phase, since all other
readers would be spinning waiting for this step.

We add two new tests, one exercises the stall for
pthread_rwlock_trywrlock() which is easy to exercise, and one exercises
the stall for pthread_rwlock_trydlock() which is harder to exercise.

The pthread_rwlock_trywrlock() test fails consistently without the fix,
and passes after. The pthread_rwlock_tryrdlock() test fails roughly
5-10% of the time without the fix, and passes all the time after.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: Rik Prohaska <prohaska7@gmail.com>
Co-authored-by: Torvald Riegel <triegel@redhat.com>
Co-authored-by: Rik Prohaska <prohaska7@gmail.com>
(cherry picked from commit 5fc9ed4c40)
2019-02-17 11:44:19 +01:00
Florian Weimer
60f8062425 nptl: Avoid fork handler lock for async-signal-safe fork [BZ #24161]
Commit 27761a1042 ("Refactor atfork
handlers") introduced a lock, atfork_lock, around fork handler list
accesses.  It turns out that this lock occasionally results in
self-deadlocks in malloc/tst-mallocfork2:

(gdb) bt
#0  __lll_lock_wait_private ()
    at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
#1  0x00007f160c6f927a in __run_fork_handlers (who=(unknown: 209394016),
    who@entry=atfork_run_prepare) at register-atfork.c:116
#2  0x00007f160c6b7897 in __libc_fork () at ../sysdeps/nptl/fork.c:58
#3  0x00000000004027d6 in sigusr1_handler (signo=<optimized out>)
    at tst-mallocfork2.c:80
#4  sigusr1_handler (signo=<optimized out>) at tst-mallocfork2.c:64
#5  <signal handler called>
#6  0x00007f160c6f92e4 in __run_fork_handlers (who=who@entry=atfork_run_parent)
    at register-atfork.c:136
#7  0x00007f160c6b79a2 in __libc_fork () at ../sysdeps/nptl/fork.c:152
#8  0x0000000000402567 in do_test () at tst-mallocfork2.c:156
#9  0x0000000000402dd2 in support_test_main (argc=1, argv=0x7ffc81ef1ab0,
    config=config@entry=0x7ffc81ef1970) at support_test_main.c:350
#10 0x0000000000402362 in main (argc=<optimized out>, argv=<optimized out>)
    at ../support/test-driver.c:168

If no locking happens in the single-threaded case (where fork is
expected to be async-signal-safe), this deadlock is avoided.
(pthread_atfork is not required to be async-signal-safe, so a fork
call from a signal handler interrupting pthread_atfork is not
a problem.)

(cherry picked from commit 669ff911e2)
2019-02-08 12:55:21 +01:00
Stefan Liebler
a9f60b1571 Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
instructions:
140:   a5 1b 00 01             oill    %r1,1
144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI

vs (with compiler barriers):
140:   a5 1b 00 01             oill    %r1,1
144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0

Please have a look at the discussion:
"Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
(https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)

This patch is introducing the same compiler barriers and comments
for pthread_mutex_trylock as introduced for pthread_mutex_lock and
pthread_mutex_timedlock by commit 8f9450a0b7
"Add compiler barriers around modifications of the robust mutex list."

ChangeLog:

	[BZ #24180]
	* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
	Add compiler barriers and comments.

(cherry picked from commit 823624bdc4)
2019-02-07 15:39:15 +01:00
Aurelien Jarno
85224b0290 NEWS: Mention bug 24112.
The fix has been backported in the commit
37edf1d3f8.
2019-02-04 22:35:41 +01:00
Florian Weimer
c533244b8e nscd: Do not use __inet_aton_exact@GLIBC_PRIVATE [BZ #20018]
This commit avoids referencing the __inet_aton_exact@GLIBC_PRIVATE
symbol from nscd.  In master, the separately-compiled getaddrinfo
implementation in nscd needs it, however such an internal ABI change
is not desirable on a release branch if it can be avoided.
2019-02-04 21:36:38 +01:00
Florian Weimer
2373941bd7 CVE-2016-10739: getaddrinfo: Fully parse IPv4 address strings [BZ #20018]
The IPv4 address parser in the getaddrinfo function is changed so that
it does not ignore trailing whitespace and all characters after it.
For backwards compatibility, the getaddrinfo function still recognizes
legacy name syntax, such as 192.000.002.010 interpreted as 192.0.2.8
(octal).

This commit does not change the behavior of inet_addr and inet_aton.
gethostbyname already had additional sanity checks (but is switched
over to the new __inet_aton_exact function for completeness as well).

To avoid sending the problematic query names over DNS, commit
6ca53a2453 ("resolv: Do not send queries
for non-host-names in nss_dns [BZ #24112]") is needed.

(cherry picked from commit 108bc4049f)
2019-02-04 21:36:37 +01:00
Florian Weimer
37edf1d3f8 resolv: Do not send queries for non-host-names in nss_dns [BZ #24112]
Before this commit, nss_dns would send a query which did not contain a
host name as the query name (such as invalid\032name.example.com) and
then reject the answer in getanswer_r and gaih_getanswer_slice, using
a check based on res_hnok.  With this commit, no query is sent, and a
host-not-found error is returned to NSS without network interaction.

(cherry picked from commit 6ca53a2453)
2019-02-04 21:35:01 +01:00
Florian Weimer
8e92ca5dd7 resolv: Reformat inet_addr, inet_aton to GNU style
(cherry picked from commit 5e30b8ef07)
2019-02-04 21:35:01 +01:00
H.J. Lu
9aaa083387 x86-64 memcmp: Use unsigned Jcc instructions on size [BZ #24155]
Since the size argument is unsigned. we should use unsigned Jcc
instructions, instead of signed, to check size.

Tested on x86-64 and x32, with and without --disable-multi-arch.

	[BZ #24155]
	CVE-2019-7309
	* NEWS: Updated for CVE-2019-7309.
	* sysdeps/x86_64/memcmp.S: Use RDX_LP for size.  Clear the
	upper 32 bits of RDX register for x32.  Use unsigned Jcc
	instructions, instead of signed.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2.
	* sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.

(cherry picked from commit 3f635fb433)
2019-02-04 08:56:04 -08:00
H.J. Lu
d09b11cbe5 x86-64 strnlen/wcsnlen: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes strnlen/wcsnlen for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
	Clear the upper 32 bits of RSI register.
	* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
	and tst-size_t-wcsnlen.
	* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.

(cherry picked from commit 5165de69c0)
2019-02-01 12:24:16 -08:00
H.J. Lu
07a42c0eff x86-64 strncpy: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes strncpy for x32.  Tested on x86-64 and x32.  On x86-64,
libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Use RDX_LP
	for length.
	* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
	* sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.

(cherry picked from commit c7c54f65b0)
2019-02-01 12:23:34 -08:00
H.J. Lu
c678b80233 x86-64 strncmp family: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes the strncmp family for x32.  Tested on x86-64 and x32.
On x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/strcmp-avx2.S: Use RDX_LP for length.
	* sysdeps/x86_64/multiarch/strcmp-sse42.S: Likewise.
	* sysdeps/x86_64/strcmp.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp,
	tst-size_t-strncmp and tst-size_t-wcsncmp.
	* sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
	* sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.

(cherry picked from commit ee915088a0)
2019-02-01 12:22:42 -08:00
H.J. Lu
17fc7debcc x86-64 memset/wmemset: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memset/wmemset for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use
	RDX_LP for length.  Clear the upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
	* sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.

(cherry picked from commit 82d0b4a4d7)
2019-02-01 12:21:50 -08:00
H.J. Lu
eee0a3d04b x86-64 memrchr: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memrchr for x32.  Tested on x86-64 and x32.  On x86-64,
libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
	* sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
	* sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.

(cherry picked from commit ecd8b842cf)
2019-02-01 12:21:03 -08:00
H.J. Lu
781403407b x86-64 memcpy: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memcpy for x32.  Tested on x86-64 and x32.  On x86-64,
libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for
	length.  Clear the upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy.
	tst-size_t-wmemchr.
	* sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.

(cherry picked from commit 231c56760c)
2019-02-01 12:20:16 -08:00
H.J. Lu
f57666aa30 x86-64 memcmp/wmemcmp: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memcmp/wmemcmp for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for
	length.  Clear the upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
	* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and
	tst-size_t-wmemcmp.
	* sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.

(cherry picked from commit b304fc201d)
2019-02-01 12:19:20 -08:00
H.J. Lu
492524a691 x86-64 memchr/wmemchr: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memchr/wmemchr for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/memchr.S: Use RDX_LP for length.  Clear the
	upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and
	tst-size_t-wmemchr.
	* sysdeps/x86_64/x32/test-size_t.h: New file.
	* sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
	* sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.

(cherry picked from commit 97700a34f3)
2019-02-01 12:17:27 -08:00
Tulio Magno Quites Machado Filho
b297581acb Add XFAIL_ROUNDING_IBM128_LIBGCC to more fma() tests
Ignore 16 errors in math/test-ldouble-fma and 4 errors in
math/test-ildouble-fma when IBM 128-bit long double used.
These errors are caused by spurious overflows from libgcc.

	* math/libm-test-fma.inc (fma_test_data): Set
	XFAIL_ROUNDING_IBM128_LIBGCC to more tests.

Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
(cherry picked from commit ecdacd34a2)
2019-01-16 15:55:23 -02:00