Commit Graph

32134 Commits

Author SHA1 Message Date
Siddhesh Poyarekar
ad64510e5c aarch64,falkor: Use vector registers for memcpy
Vector registers perform better than scalar register pairs for copying
data so prefer them instead.  This results in a time reduction of over
50% (i.e. 2x speed improvemnet) for some smaller sizes for memcpy-walk.
Larger sizes show improvements of around 1% to 2%.  memcpy-random shows
a very small improvement, in the range of 1-2%.

	* sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor):
	Use vector registers.

(cherry picked from commit 0aec4c1d18)
2019-09-06 18:42:14 +01:00
Siddhesh Poyarekar
d3c05bfffa aarch64,falkor: Ignore prefetcher tagging for smaller copies
For smaller and medium sized copies, the effect of hardware
prefetching are not as dominant as instruction level parallelism.
Hence it makes more sense to load data into multiple registers than to
try and route them to the same prefetch unit.  This is also the case
for the loop exit where we are unable to latch on to the same prefetch
unit anyway so it makes more sense to have data loaded in parallel.

The performance results are a bit mixed with memcpy-random, with
numbers jumping between -1% and +3%, i.e. the numbers don't seem
repeatable.  memcpy-walk sees a 70% improvement (i.e. > 2x) for 128
bytes and that improvement reduces down as the impact of the tail copy
decreases in comparison to the loop.

	* sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor):
	Use multiple registers to copy data in loop tail.

(cherry picked from commit db725a458e)
2019-09-06 18:41:04 +01:00
Siddhesh Poyarekar
e3c35100d3 aarch64/strncmp: Use lsr instead of mov+lsr
A lsr can do what the mov and lsr did.

(cherry picked from commit b47c3e7637)
2019-09-06 17:15:27 +01:00
Siddhesh Poyarekar
00fd3acde1 aarch64/strncmp: Unbreak builds with old binutils
Binutils 2.26.* and older do not support moves with shifted registers,
so use a separate shift instruction instead.

(cherry picked from commit d46f84de74)
2019-09-06 17:14:47 +01:00
Siddhesh Poyarekar
af9381b734 aarch64: Improve strncmp for mutually misaligned inputs
The mutually misaligned inputs on aarch64 are compared with a simple
byte copy, which is not very efficient.  Enhance the comparison
similar to strcmp by loading a double-word at a time.  The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark is as follows:

falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)

All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.

	* sysdeps/aarch64/strncmp.S (count): New macro.
	(strncmp): Store misaligned length in SRC1 in COUNT.
	(mutual_align): Adjust.
	(misaligned8): Load dword at a time when it is safe.

(cherry picked from commit 7108f1f944)
2019-09-06 17:14:05 +01:00
Siddhesh Poyarekar
01de24dbca aarch64/strcmp: fix misaligned loop jump target
I accidentally set the loop jump back label as misaligned8 instead of
do_misaligned.  The typo is harmless but it's always nice to not have
to unnecessarily execute those two instructions.

	* sysdeps/aarch64/strcmp.S (do_misaligned): Jump back to
	do_misaligned, not misaligned8.

(cherry picked from commit 6ca24c4348)
2019-09-06 17:13:34 +01:00
Siddhesh Poyarekar
4e75091d6c aarch64: Improve strcmp unaligned performance
Replace the simple byte-wise compare in the misaligned case with a
dword compare with page boundary checks in place.  For simplicity I've
chosen a 4K page boundary so that we don't have to query the actual
page size on the system.

This results in up to 3x improvement in performance in the unaligned
case on falkor and about 2.5x improvement on mustang as measured using
bench-strcmp.

	* sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a
	time whenever possible.

(cherry picked from commit 2bce01ebba)
2019-09-06 17:13:02 +01:00
Siddhesh Poyarekar
8569357e11 aarch64: Fix branch target to loop16
I goofed up when changing the loop8 name to loop16 and missed on out
the branch instance.  Fixed and actually build tested this time.

	* sysdeps/aarch64/memcmp.S (more16): Fix branch target loop16.

(cherry picked from commit 4e54d91863)
2019-09-06 16:33:12 +01:00
Siddhesh Poyarekar
ec4512194f aarch64: Optimized memcmp for medium to large sizes
This improved memcmp provides a fast path for compares up to 16 bytes
and then compares 16 bytes at a time, thus optimizing loads from both
sources.  The glibc memcmp microbenchmark retains performance (with an
error of ~1ns) for smaller compare sizes and reduces up to 31% of
execution time for compares up to 4K on the APM Mustang.  On Qualcomm
Falkor this improves to almost 48%, i.e. it is almost 2x improvement
for sizes of 2K and above.

	* sysdeps/aarch64/memcmp.S: Widen comparison to 16 bytes at a
	time.

(cherry picked from commit 30a81dae5b)
2019-09-06 16:33:04 +01:00
Siddhesh Poyarekar
600e4e866c aarch64: Use the L() macro for labels in memcmp
The L() macro makes the assembly a bit more readable.

	* sysdeps/aarch64/memcmp.S: Use L() macro for labels.

(cherry picked from commit 84c94d2fd9)
2019-09-06 16:31:36 +01:00
Wilco Dijkstra
1896de3d92 [AArch64] Optimized memcmp.
This is an optimized memcmp for AArch64.  This is a complete rewrite
using a different algorithm.  The previous version split into cases
where both inputs were aligned, the inputs were mutually aligned and
unaligned using a byte loop.  The new version combines all these cases,
while small inputs of less than 8 bytes are handled separately.

This allows the main code to be sped up using unaligned loads since
there are now at least 8 bytes to be compared.  After the first 8 bytes,
align the first input.  This ensures each iteration does at most one
unaligned access and mutually aligned inputs behave as aligned.
After the main loop, process the last 8 bytes using unaligned accesses.

This improves performance of (mutually) aligned cases by 25% and
unaligned by >500% (yes >6 times faster) on large inputs.

	* sysdeps/aarch64/memcmp.S (memcmp):
	Rewrite of optimized memcmp.

(cherry picked from commit 922369032c)
2019-09-06 16:30:10 +01:00
Adhemerval Zanella
54194d8b4d posix: Fix large mmap64 offset for mips64n32 (BZ#24699)
The fix for BZ#21270 (commit 158d5fa0e1) added a mask to avoid offset larger
than 1^44 to be used along __NR_mmap2.  However mips64n32 users __NR_mmap,
as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such
x32, defines off_t being equal to off64_t).  This leads to use the same
mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum
offset it can use with mmap64.

This patch fixes by setting the high mask only for __NR_mmap2 usage. The
posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The
patch also change the test to check for an arch-specific header that defines
the maximum supported offset.

Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset
on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.

	[BZ #24699]
	* posix/tst-mmap-offset.c: Mention BZ #24699.
	(do_test_bz21270): Rename to do_test_large_offset and use
	mmap64_maximum_offset to check for maximum expected offset value.
	* sysdeps/generic/mmap_info.h: New file.
	* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
	* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
	__NR_mmap2 is used.

(cherry picked from commit a008c76b56)
2019-07-12 19:02:04 +00:00
Szabolcs Nagy
f2f501ff39 aarch64: handle STO_AARCH64_VARIANT_PCS
Backport of commit 82bc69c012
and commit 30ba037546
without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check.
This is needed so the internal abi between ld.so and libc.so is unchanged.

Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.

Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls.  Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.

The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially.  In this
patch such symbols are always resolved at load time, not lazily.

So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).

This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.

	* sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check
	STO_AARCH64_VARIANT_PCS and bind such symbols at load time.
2019-07-12 10:50:17 +01:00
Szabolcs Nagy
71c2578a9b aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCS
STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking
symbols that reference functions that may follow a variant PCS with
different register usage convention from the base PCS.

DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that
have R_*_JUMP_SLOT relocations for symbols marked with
STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT).

	* elf/elf.h (STO_AARCH64_VARIANT_PCS): Define.
	(DT_AARCH64_VARIANT_PCS): Define.
2019-07-12 10:50:12 +01:00
Wilco Dijkstra
ac92c66821 Fix tcache count maximum (BZ #24531)
The tcache counts[] array is a char, which has a very small range and thus
may overflow.  When setting tcache_count tunable, there is no overflow check.
However the tunable must not be larger than the maximum value of the tcache
counts[] array, otherwise it can overflow when filling the tcache.

	[BZ #24531]
	* malloc/malloc.c (MAX_TCACHE_COUNT): New define.
	(do_set_tcache_count): Only update if count is small enough.
	* manual/tunables.texi (glibc.malloc.tcache_count): Document max value.

(cherry picked from commit 5ad533e8e6)
2019-05-22 15:41:24 +01:00
Andreas Schwab
4385ec1d8a Fix crash in _IO_wfile_sync (bug 20568)
When computing the length of the converted part of the stdio buffer, use
the number of consumed wide characters, not the (negative) distance to the
end of the wide buffer.

(cherry picked from commit 32ff397533)
2019-05-16 10:10:04 +02:00
Stefan Liebler
c165427d55 Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
instructions:
140:   a5 1b 00 01             oill    %r1,1
144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI

vs (with compiler barriers):
140:   a5 1b 00 01             oill    %r1,1
144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0

Please have a look at the discussion:
"Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
(https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)

This patch is introducing the same compiler barriers and comments
for pthread_mutex_trylock as introduced for pthread_mutex_lock and
pthread_mutex_timedlock by commit 8f9450a0b7
"Add compiler barriers around modifications of the robust mutex list."

ChangeLog:

	[BZ #24180]
	* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
	Add compiler barriers and comments.

(cherry picked from commit 823624bdc4)
2019-02-07 15:49:36 +01:00
H.J. Lu
04e767b59b x86-64 memcmp: Use unsigned Jcc instructions on size [BZ #24155]
Since the size argument is unsigned. we should use unsigned Jcc
instructions, instead of signed, to check size.

Tested on x86-64 and x32, with and without --disable-multi-arch.

	[BZ #24155]
	CVE-2019-7309
	* NEWS: Updated for CVE-2019-7309.
	* sysdeps/x86_64/memcmp.S: Use RDX_LP for size.  Clear the
	upper 32 bits of RDX register for x32.  Use unsigned Jcc
	instructions, instead of signed.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2.
	* sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.

(cherry picked from commit 3f635fb433)
2019-02-04 10:27:37 -08:00
H.J. Lu
dc968f5573 x86-64 strnlen/wcsnlen: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes strnlen/wcsnlen for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
	Clear the upper 32 bits of RSI register.
	* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
	and tst-size_t-wcsnlen.
	* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.

(cherry picked from commit 5165de69c0)
2019-02-01 15:35:22 -08:00
H.J. Lu
40575878cd x86-64 strncpy: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes strncpy for x32.  Tested on x86-64 and x32.  On x86-64,
libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Use RDX_LP
	for length.
	* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
	* sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.

(cherry picked from commit c7c54f65b0)
2019-02-01 15:35:00 -08:00
H.J. Lu
15ce2f62f6 x86-64 strncmp family: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes the strncmp family for x32.  Tested on x86-64 and x32.
On x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/strcmp-sse42.S: Use RDX_LP for length.
	* sysdeps/x86_64/strcmp.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp,
	tst-size_t-strncmp and tst-size_t-wcsncmp.
	* sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
	* sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.

(cherry picked from commit ee915088a0)
2019-02-01 15:34:41 -08:00
H.J. Lu
885e4af2ac x86-64 memset/wmemset: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memset/wmemset for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use
	RDX_LP for length.  Clear the upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
	* sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.

(cherry picked from commit 82d0b4a4d7)
2019-02-01 15:34:29 -08:00
H.J. Lu
c9ea2e82d4 x86-64 memrchr: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memrchr for x32.  Tested on x86-64 and x32.  On x86-64,
libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
	* sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
	* sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.

(cherry picked from commit ecd8b842cf)
2019-02-01 15:34:17 -08:00
H.J. Lu
94b88894b1 x86-64 memcpy: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memcpy for x32.  Tested on x86-64 and x32.  On x86-64,
libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for
	length.  Clear the upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy.
	tst-size_t-wmemchr.
	* sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.

(cherry picked from commit 231c56760c)
2019-02-01 15:34:04 -08:00
H.J. Lu
232a7628f0 x86-64 memcmp/wmemcmp: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memcmp/wmemcmp for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for
	length.  Clear the upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
	* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and
	tst-size_t-wmemcmp.
	* sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
	* sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.

(cherry picked from commit b304fc201d)
2019-02-01 15:33:50 -08:00
H.J. Lu
bff8346b01 x86-64 memchr/wmemchr: Properly handle the length parameter [BZ #24097]
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits.  The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.

This pach fixes memchr/wmemchr for x32.  Tested on x86-64 and x32.  On
x86-64, libc.so is the same with and withou the fix.

	[BZ #24097]
	CVE-2019-6488
	* sysdeps/x86_64/memchr.S: Use RDX_LP for length.  Clear the
	upper 32 bits of RDX register.
	* sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
	* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and
	tst-size_t-wmemchr.
	* sysdeps/x86_64/x32/test-size_t.h: New file.
	* sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
	* sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.

(cherry picked from commit 97700a34f3)
2019-02-01 15:32:53 -08:00
Gabriel F. T. Gomes
7ab39c6a3c powerpc: Regenerate ULPs
On POWER9, cbrtf128 fails by 1 ULP.

	* sysdeps/powerpc/fpu/libm-test-ulps: Regenerate.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
(cherry picked from commit 428fc49eaa)
2019-01-11 14:14:10 -02:00
Tulio Magno Quites Machado Filho
d851e9199e [BZ #21745] powerpc: build some IFUNC math functions for libc and libm
Some math functions have to be distributed in libc because they're
required by printf.
libc and libm require their own builds of these functions, e.g. libc
functions have to call __stack_chk_fail_local in order to bypass the
PLT, while libm functions have to call __stack_chk_fail.

While math/Makefile treat the generic cases, i.e. s_isinff, the
multiarch Makefile has to treat its own files, i.e. s_isinff-ppc64.

	[BZ #21745]
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:
	[$(subdir) = math] (sysdep_calls): New variable.  Has the
	previous contents of sysdep_routines, but re-sorted..
	[$(subdir) = math] (sysdep_routines): Re-use the contents from
	sysdep_calls.
	[$(subdir) = math] (libm-sysdep_routines): Remove the functions
	defined in sysdep_calls and replace by the respective m_* names.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S:
	(compat_symbol): Undefine to avoid duplicated compat symbols in
	libc.

(cherry picked from commit 61c45f2505)
2019-01-11 14:14:10 -02:00
Florian Weimer
a5bd0ba192 intl: Do not return NULL on asprintf failure in gettext [BZ #24018]
Fixes commit 9695dd0c93 ("DCIGETTEXT:
Use getcwd, asprintf to construct absolute pathname").

(cherry picked from commit 8c1aafc1f3)
2019-01-02 20:01:05 +01:00
Florian Weimer
94417f6c26 malloc: Always call memcpy in _int_realloc [BZ #24027]
This commit removes the custom memcpy implementation from _int_realloc
for small chunk sizes.  The ncopies variable has the wrong type, and
an integer wraparound could cause the existing code to copy too few
elements (leaving the new memory region mostly uninitialized).
Therefore, removing this code fixes bug 24027.

(cherry picked from commit b50dd3bc8c)
2019-01-01 11:07:26 +01:00
Florian Weimer
a0bc5dd3be CVE-2018-19591: if_nametoindex: Fix descriptor for overlong name [BZ #23927]
(cherry picked from commit d527c860f5)
2018-11-27 21:35:31 +01:00
Alexandra Hájková
27e46f5836 Add an additional test to resolv/tst-resolv-network.c
Test for the infinite loop in getnetbyname, bug #17630.

(cherry picked from commit ac8060265b)
2018-11-09 14:43:45 +01:00
Andreas Schwab
fcd86c6253 libanl: properly cleanup if first helper thread creation failed (bug 22927)
(cherry picked from commit bd3b0fbae3)
2018-11-06 17:12:07 +01:00
Adhemerval Zanella
dc40423dba x86: Fix Haswell CPU string flags (BZ#23709)
Th commit 'Disable TSX on some Haswell processors.' (2702856bf4) changed the
default flags for Haswell models.  Previously, new models were handled by the
default switch path, which assumed a Core i3/i5/i7 if AVX is available. After
the patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) do not set the flags
Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and
Prefer_PMINUB_for_stringop (only the TSX one).

This patch fixes it by disentangle the TSX flag handling from the memory
optimization ones.  The strstr case cited on patch now selects the
__strstr_sse2_unaligned as expected for the Haswell cpu.

Checked on x86_64-linux-gnu.

	[BZ #23709]
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits
	independently of other flags.

(cherry picked from commit c3d8dc45c9)
2018-11-02 11:14:05 +01:00
Joseph Myers
33f5de7a79 Disable -Wrestrict for two nptl/tst-attr3.c tests.
nptl/tst-attr3 fails to build with GCC mainline because of
(deliberate) aliasing between the second (attributes) and fourth
(argument to thread start routine) arguments to pthread_create.

Although both those arguments are restrict-qualified in POSIX,
pthread_create does not actually dereference its fourth argument; it's
an opaque pointer passed to the thread start routine.  Thus, the
aliasing is actually valid in this case, and it's deliberate in the
test.  So this patch makes the test disable -Wrestrict for the two
pthread_create calls in question.  (-Wrestrict was added in GCC 7,
hence the __GNUC_PREREQ conditions, but the particular warning in
question is new in GCC 8.)

Tested compilation with build-many-glibcs.py for aarch64-linux-gnu.

	* nptl/tst-attr3.c: Include <libc-diag.h>.
	(do_test) [__GNUC_PREREQ (7, 0)]: Ignore -Wrestrict for two tests.

(cherry picked from commit 40c4162df6)
2018-10-22 14:20:23 +02:00
Joseph Myers
6ae2ca620a Fix string/bug-strncat1.c build with GCC 8.
GCC 8 warns about strncat calls with truncated output.
string/bug-strncat1.c tests such a call; this patch disables the
warning for it.

Tested (compilation) with GCC 8 for x86_64-linux-gnu with
build-many-glibcs.py (in conjunction with Martin's patch to allow
glibc to build).

	* string/bug-strncat1.c: Include <libc-diag.h>.
	(main): Disable -Wstringop-truncation for strncat call for GCC 8.

(cherry picked from commit ec72135e5f)
2018-10-22 14:16:02 +02:00
Joseph Myers
fe5978e1a5 Ignore -Wrestrict for one strncat test.
With current GCC mainline, one strncat test involving a size close to
SIZE_MAX results in a -Wrestrict warning that that buffer size would
imply that the two buffers must overlap.  This patch fixes the build
by adding disabling of -Wrestrict (for GCC versions supporting that
option) to the already-present disabling of -Wstringop-overflow= and
-Warray-bounds for this test.

Tested with build-many-glibcs.py that this restores the testsuite
build with GCC mainline for aarch64-linux-gnu.

	* string/tester.c (test_strncat) [__GNUC_PREREQ (7, 0)]: Also
	ignore -Wrestrict for one test.

(cherry picked from commit 35ebb6b0c4)
2018-10-22 14:15:29 +02:00
Joseph Myers
70e810a30c Disable strncat test array-bounds warnings for GCC 8.
Some strncat tests fail to build with GCC 8 because of -Warray-bounds
warnings.  These tests are deliberately test over-large size arguments
passed to strncat, and already disable -Wstringop-overflow warnings,
but now the warnings for these tests come under -Warray-bounds so that
option needs disabling for them as well, which this patch does (with
an update on the comments; the DIAG_IGNORE_NEEDS_COMMENT call for
-Warray-bounds doesn't need to be conditional itself, because that
option is supported by all versions of GCC that can build glibc).

Tested compilation with build-many-glibcs.py for aarch64-linux-gnu.

	* string/tester.c (test_strncat): Also disable -Warray-bounds
	warnings for two tests.

(cherry picked from commit 1421f39b7e)
2018-10-22 14:15:29 +02:00
Joseph Myers
dd03d15e28 Fix string/tester.c build with GCC 8.
GCC 8 warns about more cases of string functions truncating their
output or not copying a trailing NUL byte.

This patch fixes testsuite build failures caused by such warnings in
string/tester.c.  In general, the warnings are disabled around the
relevant calls using DIAG_* macros, since the relevant cases are being
deliberately tested.  In one case, the warning is with
-Wstringop-overflow= instead of -Wstringop-truncation; in that case,
the conditional is __GNUC_PREREQ (7, 0) (being the version where
-Wstringop-overflow= was introduced), to allow the conditional to be
removed sooner, since it's harmless to disable the warning for a
GCC version where it doesn't actually occur.  In the case of warnings
for strncpy calls in test_memcmp, the calls in question are changed to
use memcpy, as they don't copy a trailing NUL and the point of that
code is to test memcmp rather than strncpy.

Tested (compilation) with GCC 8 for x86_64-linux-gnu with
build-many-glibcs.py (in conjunction with Martin's patch to allow
glibc to build).

	* string/tester.c (test_stpncpy): Disable -Wstringop-truncation
	for stpncpy calls for GCC 8.
	(test_strncat): Disable -Wstringop-truncation warning for strncat
	calls for GCC 8.  Disable -Wstringop-overflow= warning for one
	strncat call for GCC 7.
	(test_strncpy): Disable -Wstringop-truncation warning for strncpy
	calls for GCC 8.
	(test_memcmp): Use memcpy instead of strncpy for calls not copying
	trailing NUL.

(cherry picked from commit 2e64ec9c9e)
2018-10-22 14:15:28 +02:00
Joseph Myers
935cecfe9a Fix nscd readlink argument aliasing (bug 22446).
Current GCC mainline detects that nscd calls readlink with the same
buffer for both input and output, which is not valid (those arguments
are both restrict-qualified in POSIX).  This patch makes it use a
separate buffer for readlink's input (with a size that is sufficient
to avoid truncation, so there should be no problems with warnings
about possible truncation, though not strictly minimal, but much
smaller than the buffer for output) to avoid this problem.

Tested compilation for aarch64-linux-gnu with build-many-glibcs.py.

	[BZ #22446]
	* nscd/connections.c (handle_request) [SO_PEERCRED]: Use separate
	buffers for readlink input and output.

(cherry picked from commit 49b036bce9)
2018-10-22 14:08:12 +02:00
Steve Ellcey
3fb525c103 Increase buffer size due to warning from ToT GCC
* nscd/dbg_log.c (dbg_log): Increase msg buffer size.

(cherry picked from commit a7e3edf4f2)
2018-10-22 14:00:39 +02:00
Joseph Myers
636f49ba92 Fix p_secstodate overflow handling (bug 22463).
The resolv/res_debug.c function p_secstodate (which is a public
function exported from libresolv, taking an unsigned long argument)
does:

        struct tm timebuf;
        time = __gmtime_r(&clock, &timebuf);
        time->tm_year += 1900;
        time->tm_mon += 1;
        sprintf(output, "%04d%02d%02d%02d%02d%02d",
                time->tm_year, time->tm_mon, time->tm_mday,
                time->tm_hour, time->tm_min, time->tm_sec);

If __gmtime_r returns NULL (because the year overflows the range of
int), this will dereference a null pointer.  Otherwise, if the
computed year does not fit in four characters, this will cause a
buffer overrun of the fixed-size 15-byte buffer.  With current GCC
mainline, there is a compilation failure because of the possible
buffer overrun.

I couldn't find a specification for how this function is meant to
behave, but Paul pointed to RFC 4034 as relevant to the cases where
this function is called from within glibc.  The function's interface
is inherently problematic when dates beyond Y2038 might be involved,
because of the ambiguity in how to interpret 32-bit timestamps as such
dates (the RFC suggests interpreting times as being within 68 years of
the present date, which would mean some kind of interface whose
behavior depends on the present date).

This patch works on the basis of making a minimal fix in preparation
for obsoleting the function.  The function is made to handle times in
the interval [0, 0x7fffffff] only, on all platforms, with <overflow>
used as the output string in other cases (and errno set to EOVERFLOW
in such cases).  This seems to be a reasonable state for the function
to be in when made a compat symbol by a future patch, being compatible
with any existing uses for existing timestamps without trying to work
for later timestamps.  Results independent of the range of time_t also
simplify the testcase.

I couldn't persuade GCC to recognize the ranges of the struct tm
fields by adding explicit range checks with a call to
__builtin_unreachable if outside the range (this looks similar to
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80776>), so having added
a range check on the input, this patch then disables the
-Wformat-overflow= warning for the sprintf call (I prefer that to the
use of strftime, as being more transparently correct without knowing
what each of %m and %M etc. is).

I do not know why this build failure should be new with mainline GCC
(that is, I don't know what GCC change might have introduced it, when
the basic functionality for such warnings was already in GCC 7).

I do not know if this is a security issue (that is, if there are
plausible ways in which a date before -999 or after 9999 from an
untrusted source might end up in this function).  The system clock is
arguably an untrusted source (in that e.g. NTP is insecure), but
probably not to that extent (NTP can't communicate such wild
timestamps), and uses from within glibc are limited to 32-bit inputs.

Tested with build-many-glibcs.py that this restores the build for arm
with yesterday's mainline GCC.  Also tested for x86_64 and x86.

	[BZ #22463]
	* resolv/res_debug.c: Include <libc-diag.h>.
	(p_secstodate): Assert time_t at least as wide as u_long.  On
	overflow, use integer seconds since the epoch as output, or use
	"<overflow>" as output and set errno to EOVERFLOW if integer
	seconds since the epoch would be 14 or more characters.
	(p_secstodate) [__GNUC_PREREQ (7, 0)]: Disable -Wformat-overflow=
	for sprintf call.
	* resolv/tst-p_secstodate.c: New file.
	* resolv/Makefile (tests): Add tst-p_secstodate.
	($(objpfx)tst-p_secstodate): Depend on $(objpfx)libresolv.so.

(cherry picked from commit f120cda607)
2018-10-22 14:00:17 +02:00
Paul Eggert
d161b294e1 timezone: pacify GCC -Wstringop-truncation
Problem reported by Martin Sebor in:
https://sourceware.org/ml/libc-alpha/2017-11/msg00336.html
* timezone/zic.c (writezone): Use memcpy, not strncpy.

(cherry picked from commit e69897bf20)
2018-10-22 14:00:17 +02:00
Martin Sebor
e37ec9c813 utmp: Avoid -Wstringop-truncation warning
The -Wstringop-truncation option new in GCC 8 detects common misuses
of the strncat and strncpy function that may result in truncating
the copied string before the terminating NUL.  To avoid false positive
warnings for correct code that intentionally creates sequences of
characters that aren't guaranteed to be NUL-terminated, arrays that
are intended to store such sequences should be decorated with a new
nonstring attribute.  This change add this attribute to Glibc and
uses it to suppress such false positives.

ChangeLog:
	* misc/sys/cdefs.h (__attribute_nonstring__): New macro.
	* sysdeps/gnu/bits/utmp.h (struct utmp): Use it.
	* sysdeps/unix/sysv/linux/s390/bits/utmp.h (struct utmp): Same.

(cherry picked from commit 7532837d7b)
2018-10-22 14:00:13 +02:00
Joseph Myers
27611fd05b Avoid use of strlen in getlogin_r (bug 22447).
Building glibc with current mainline GCC fails, among other reasons,
because of an error for use of strlen on the nonstring ut_user field.
This patch changes the problem code in getlogin_r to use __strnlen
instead.  It also needs to set the trailing NUL byte of the result
explicitly, because of the case where ut_user does not have such a
trailing NUL byte (but the result should always have one).

Tested for x86_64.  Also tested that, in conjunction with
<https://sourceware.org/ml/libc-alpha/2017-11/msg00797.html>, it fixes
the build for arm with mainline GCC.

	[BZ #22447]
	* sysdeps/unix/getlogin_r.c (__getlogin_r): Use __strnlen not
	strlen to compute length of ut_user and set trailing NUL byte of
	result explicitly.

(cherry picked from commit 4bae615022)
2018-10-22 13:58:39 +02:00
Ilya Yu. Malakhov
48bef587bf signal: Use correct type for si_band in siginfo_t [BZ #23562]
(cherry picked from commit f997b4be18)
2018-10-22 13:43:59 +02:00
Adhemerval Zanella
202d08db40 Fix misreported errno on preadv2/pwritev2 (BZ#23579)
The fallback code of Linux wrapper for preadv2/pwritev2 executes
regardless of the errno code for preadv2, instead of the case where
the syscall is not supported.

This fixes it by calling the fallback code iff errno is ENOSYS. The
patch also adds tests for both invalid file descriptor and invalid
iov_len and vector count.

The only discrepancy between preadv2 and fallback code regarding
error reporting is when an invalid flags are used.  The fallback code
bails out earlier with ENOTSUP instead of EINVAL/EBADF when the syscall
is used.

Checked on x86_64-linux-gnu on a 4.4.0 and 4.15.0 kernel.

	[BZ #23579]
	* misc/tst-preadvwritev2-common.c (do_test_with_invalid_fd): New
	test.
	* misc/tst-preadvwritev2.c, misc/tst-preadvwritev64v2.c (do_test):
	Call do_test_with_invalid_fd.
	* sysdeps/unix/sysv/linux/preadv2.c (preadv2): Use fallback code iff
	errno is ENOSYS.
	* sysdeps/unix/sysv/linux/preadv64v2.c (preadv64v2): Likewise.
	* sysdeps/unix/sysv/linux/pwritev2.c (pwritev2): Likewise.
	* sysdeps/unix/sysv/linux/pwritev64v2.c (pwritev64v2): Likewise.

(cherry picked from commit 7a16bdbb9f)
2018-09-28 15:16:04 -03:00
Florian Weimer
3022a296bd preadv2/pwritev2: Handle offset == -1 [BZ #22753]
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

(cherry picked from commit d4b4a00a46)
2018-09-28 15:16:02 -03:00
Stefan Liebler
c5c90b480e Fix segfault in maybe_script_execute.
If glibc is built with gcc 8 and -march=z900,
the testcase posix/tst-spawn4-compat crashes with a segfault.

In function maybe_script_execute, the new_argv array is dynamically
initialized on stack with (argc + 1) elements.
The function wants to add _PATH_BSHELL as the first argument
and writes out of bounds of new_argv.
There is an off-by-one because maybe_script_execute fails to count
the terminating NULL when sizing new_argv.

ChangeLog:

	* sysdeps/unix/sysv/linux/spawni.c (maybe_script_execute):
	Increment size of new_argv by one.

(cherry picked from commit 28669f86f6)
2018-09-10 14:27:40 +02:00
Martin Kuchta
174709d879 pthread_cond_broadcast: Fix waiters-after-spinning case [BZ #23538]
(cherry picked from commit 99ea93ca31)
2018-08-27 19:14:58 +02:00