Commit Graph

990 Commits

Author SHA1 Message Date
H.J. Lu
5cd7af016d Don't put SSE2/AVX/AVX512 memmove/memset in ld.so
Since memmove and memset in ld.so don't use IFUNC, don't put SSE2, AVX
and AVX512 memmove and memset in ld.so.

	* sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: Skip
	if not in libc.
	* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
2016-04-03 14:35:38 -07:00
H.J. Lu
ea2785e96f Fix memmove-vec-unaligned-erms.S
__mempcpy_erms and __memmove_erms can't be placed between __memmove_chk
and __memmove it breaks __memmove_chk.

Don't check source == destination first since it is less common.

	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
	(__mempcpy_erms, __memmove_erms): Moved before __mempcpy_chk
	with unaligned_erms.
	(__memmove_erms): Skip if source == destination.
	(__memmove_unaligned_erms): Don't check source == destination
	first.
2016-04-03 12:38:25 -07:00
H.J. Lu
830566307f Add x86-64 memset with unaligned store and rep stosb
Implement x86-64 memset with unaligned store and rep movsb.  Support
16-byte, 32-byte and 64-byte vector register sizes.  A single file
provides 2 implementations of memset, one with rep stosb and the other
without rep stosb.  They share the same codes when size is between 2
times of vector register size and REP_STOSB_THRESHOLD which defaults
to 2KB.

Key features:

1. Use overlapping store to avoid branch.
2. For size <= 4 times of vector register size, fully unroll the loop.
3. For size > 4 times of vector register size, store 4 times of vector
register size at a time.

	[BZ #19881]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and
	memset-avx512-unaligned-erms.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned,
	__memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned,
	__memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned,
	__memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned,
	__memset_sse2_unaligned_erms, __memset_erms,
	__memset_avx2_unaligned, __memset_avx2_unaligned_erms,
	__memset_avx512_unaligned_erms and __memset_avx512_unaligned.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New
	file.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:
	Likewise.
2016-03-31 10:06:07 -07:00
H.J. Lu
88b57b8ed4 Add x86-64 memmove with unaligned load/store and rep movsb
Implement x86-64 memmove with unaligned load/store and rep movsb.
Support 16-byte, 32-byte and 64-byte vector register sizes.  When
size <= 8 times of vector register size, there is no check for
address overlap bewteen source and destination.  Since overhead for
overlap check is small when size > 8 times of vector register size,
memcpy is an alias of memmove.

A single file provides 2 implementations of memmove, one with rep movsb
and the other without rep movsb.  They share the same codes when size is
between 2 times of vector register size and REP_MOVSB_THRESHOLD which
is 2KB for 16-byte vector register size and scaled up by large vector
register size.

Key features:

1. Use overlapping load and store to avoid branch.
2. For size <= 8 times of vector register size, load  all sources into
registers and store them together.
3. If there is no address overlap bewteen source and destination, copy
from both ends with 4 times of vector register size at a time.
4. If address of destination > address of source, backward copy 8 times
of vector register size at a time.
5. Otherwise, forward copy 8 times of vector register size at a time.
6. Use rep movsb only for forward copy.  Avoid slow backward rep movsb
by fallbacking to backward copy 8 times of vector register size at a
time.
7. Skip when address of destination == address of source.

	[BZ #19776]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memmove-sse2-unaligned-erms, memmove-avx-unaligned-erms and
	memmove-avx512-unaligned-erms.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Test
	__memmove_chk_avx512_unaligned_2,
	__memmove_chk_avx512_unaligned_erms,
	__memmove_chk_avx_unaligned_2, __memmove_chk_avx_unaligned_erms,
	__memmove_chk_sse2_unaligned_2,
	__memmove_chk_sse2_unaligned_erms, __memmove_avx_unaligned_2,
	__memmove_avx_unaligned_erms, __memmove_avx512_unaligned_2,
	__memmove_avx512_unaligned_erms, __memmove_erms,
	__memmove_sse2_unaligned_2, __memmove_sse2_unaligned_erms,
	__memcpy_chk_avx512_unaligned_2,
	__memcpy_chk_avx512_unaligned_erms,
	__memcpy_chk_avx_unaligned_2, __memcpy_chk_avx_unaligned_erms,
	__memcpy_chk_sse2_unaligned_2, __memcpy_chk_sse2_unaligned_erms,
	__memcpy_avx_unaligned_2, __memcpy_avx_unaligned_erms,
	__memcpy_avx512_unaligned_2, __memcpy_avx512_unaligned_erms,
	__memcpy_sse2_unaligned_2, __memcpy_sse2_unaligned_erms,
	__memcpy_erms, __mempcpy_chk_avx512_unaligned_2,
	__mempcpy_chk_avx512_unaligned_erms,
	__mempcpy_chk_avx_unaligned_2, __mempcpy_chk_avx_unaligned_erms,
	__mempcpy_chk_sse2_unaligned_2, __mempcpy_chk_sse2_unaligned_erms,
	__mempcpy_avx512_unaligned_2, __mempcpy_avx512_unaligned_erms,
	__mempcpy_avx_unaligned_2, __mempcpy_avx_unaligned_erms,
	__mempcpy_sse2_unaligned_2, __mempcpy_sse2_unaligned_erms and
	__mempcpy_erms.
	* sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: New
	file.
	* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
	Likwise.
	* sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S:
	Likwise.
	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
	Likwise.
2016-03-31 10:04:40 -07:00
H.J. Lu
064f01b10b Make __memcpy_avx512_no_vzeroupper an alias
Since x86-64 memcpy-avx512-no-vzeroupper.S implements memmove, make
__memcpy_avx512_no_vzeroupper an alias of __memmove_avx512_no_vzeroupper
to reduce code size of libc.so.

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
	memcpy-avx512-no-vzeroupper.
	* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Renamed
	to ...
	* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: This.
	(MEMCPY): Don't define.
	(MEMCPY_CHK): Likewise.
	(MEMPCPY): Likewise.
	(MEMPCPY_CHK): Likewise.
	(MEMPCPY_CHK): Renamed to ...
	(__mempcpy_chk_avx512_no_vzeroupper): This.
	(MEMPCPY_CHK): Renamed to ...
	(__mempcpy_chk_avx512_no_vzeroupper): This.
	(MEMCPY_CHK): Renamed to ...
	(__memmove_chk_avx512_no_vzeroupper): This.
	(MEMCPY): Renamed to ...
	(__memmove_avx512_no_vzeroupper): This.
	(__memcpy_avx512_no_vzeroupper): New alias.
	(__memcpy_chk_avx512_no_vzeroupper): Likewise.
2016-03-28 13:16:22 -07:00
H.J. Lu
c365e615f7 Implement x86-64 multiarch mempcpy in memcpy
Implement x86-64 multiarch mempcpy in memcpy to share most of code.  It
reduces code size of libc.so.

	[BZ #18858]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
	mempcpy-ssse3, mempcpy-ssse3-back, mempcpy-avx-unaligned
	and mempcpy-avx512-no-vzeroupper.
	* sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S (MEMPCPY_CHK):
	New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S
	(MEMPCPY_CHK): New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S (MEMPCPY_CHK): New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/memcpy-ssse3.S (MEMPCPY_CHK): New.
	(MEMPCPY): Likewise.
	* sysdeps/x86_64/multiarch/mempcpy-avx-unaligned.S: Removed.
	* sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S:
	Likewise.
	* sysdeps/x86_64/multiarch/mempcpy-ssse3-back.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy-ssse3.S: Likewise.
2016-03-28 13:13:51 -07:00
H.J. Lu
e41b395523 [x86] Add a feature bit: Fast_Unaligned_Copy
On AMD processors, memcpy optimized with unaligned SSE load is
slower than emcpy optimized with aligned SSSE3 while other string
functions are faster with unaligned SSE load.  A feature bit,
Fast_Unaligned_Copy, is added to select memcpy optimized with
unaligned SSE load.

	[BZ #19583]
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set
	Fast_Unaligned_Copy with Fast_Unaligned_Load for Intel
	processors.  Set Fast_Copy_Backward for AMD Excavator
	processors.
	* sysdeps/x86/cpu-features.h (bit_arch_Fast_Unaligned_Copy):
	New.
	(index_arch_Fast_Unaligned_Copy): Likewise.
	* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check
	Fast_Unaligned_Copy instead of Fast_Unaligned_Load.
2016-03-28 04:40:03 -07:00
Florian Weimer
f327f5b47b tst-audit10: Fix compilation on compilers without bit_AVX512F [BZ #19860]
[BZ# 19860]
	* sysdeps/x86_64/tst-audit10.c (avx512_enabled): Always return
	zero if the compiler does not provide the AVX512F bit.
2016-03-25 11:11:42 +01:00
Joseph Myers
c898991d8b Fix x86_64 / x86 powl inaccuracy for integer exponents (bug 19848).
Bug 19848 reports cases where powl on x86 / x86_64 has error
accumulation, for small integer exponents, larger than permitted by
glibc's accuracy goals, at least in some rounding modes.  This patch
further restricts the exponent range for which the
small-integer-exponent logic is used to limit the possible error
accumulation.

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #19848]
	* sysdeps/i386/fpu/e_powl.S (p3): Rename to p2 and change value
	from 8 to 4.
	(__ieee754_powl): Compare integer exponent against 4 not 8.
	* sysdeps/x86_64/fpu/e_powl.S (p3): Rename to p2 and change value
	from 8 to 4.
	(__ieee754_powl): Compare integer exponent against 4 not 8.
	* math/auto-libm-test-in: Add more tests of pow.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2016-03-24 01:32:52 +00:00
H.J. Lu
3c9a4cd16c Don't set %rcx twice before "rep movsb"
* sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S (MEMCPY):
	Don't set %rcx twice before "rep movsb".
2016-03-22 08:36:16 -07:00
H.J. Lu
86ed888255 Use JUMPTARGET in x86-64 mathvec
When PLT may be used, JUMPTARGET should be used instead calling the
function directly.

	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S
	(_ZGVbN2v_cos_sse4): Use JUMPTARGET to call cos.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S
	(_ZGVdN4v_cos_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S
	(_ZGVdN4v_cos): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S
	(_ZGVbN2v_exp_sse4): Use JUMPTARGET to call exp.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S
	(_ZGVdN4v_exp_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S
	(_ZGVdN4v_exp): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S
	(_ZGVbN2v_log_sse4): Use JUMPTARGET to call log.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S
	(_ZGVdN4v_log_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S
	(_ZGVdN4v_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S
	(_ZGVbN2vv_pow_sse4): Use JUMPTARGET to call pow.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S
	(_ZGVdN4vv_pow_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S
	(_ZGVdN4vv_pow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S
	(_ZGVbN2v_sin_sse4): Use JUMPTARGET to call sin.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S
	(_ZGVdN4v_sin_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S
	(_ZGVdN4v_sin): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S
	(_ZGVbN2vvv_sincos_sse4): Use JUMPTARGET to call sin and cos.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S
	(_ZGVdN4vvv_sincos_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S
	(_ZGVdN4vvv_sincos): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S
	(_ZGVdN8v_cosf): Use JUMPTARGET to call cosf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S
	(_ZGVbN4v_cosf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S
	(_ZGVdN8v_cosf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S
	(_ZGVdN8v_expf): Use JUMPTARGET to call expf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S
	(_ZGVbN4v_expf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S
	(_ZGVdN8v_expf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S
	(_ZGVdN8v_logf): Use JUMPTARGET to call logf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S
	(_ZGVbN4v_logf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S
	(_ZGVdN8v_logf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S
	(_ZGVdN8vv_powf): Use JUMPTARGET to call powf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S
	(_ZGVbN4vv_powf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S
	(_ZGVdN8vv_powf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S
	(_ZGVdN8vv_powf): Use JUMPTARGET to call sinf and cosf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S
	(_ZGVbN4vvv_sincosf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S
	(_ZGVdN8vvv_sincosf_avx2): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S
	(_ZGVdN8v_sinf): Use JUMPTARGET to call sinf.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S
	(_ZGVbN4v_sinf_sse4): Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S
	(_ZGVdN8v_sinf_avx2): Likewise.
	* sysdeps/x86_64/fpu/svml_d_wrapper_impl.h (WRAPPER_IMPL_SSE2):
	Use JUMPTARGET to call callee.
	(WRAPPER_IMPL_SSE2_ff): Likewise.
	(WRAPPER_IMPL_SSE2_fFF): Likewise.
	(WRAPPER_IMPL_AVX): Likewise.
	(WRAPPER_IMPL_AVX_ff): Likewise.
	(WRAPPER_IMPL_AVX_fFF): Likewise.
	(WRAPPER_IMPL_AVX512): Likewise.
	(WRAPPER_IMPL_AVX512_ff): Likewise.
	* sysdeps/x86_64/fpu/svml_s_wrapper_impl.h (WRAPPER_IMPL_SSE2):
	Likewise.
	(WRAPPER_IMPL_SSE2_ff): Likewise.
	(WRAPPER_IMPL_SSE2_fFF): Likewise.
	(WRAPPER_IMPL_AVX): Likewise.
	(WRAPPER_IMPL_AVX_ff): Likewise.
	(WRAPPER_IMPL_AVX_fFF): Likewise.
	(WRAPPER_IMPL_AVX512): Likewise.
	(WRAPPER_IMPL_AVX512_ff): Likewise.
	(WRAPPER_IMPL_AVX512_fFF): Likewise.
2016-03-16 14:24:19 -07:00
Roland McGrath
3bd80c0de2 Fix tst-audit10 build when -mavx512f is not supported. 2016-03-08 12:32:59 -08:00
Florian Weimer
3c0f7407ee tst-audit4, tst-audit10: Compile AVX/AVX-512 code separately [BZ #19269]
This ensures that GCC will not use unsupported instructions before
the run-time check to ensure support.
2016-03-07 16:00:25 +01:00
H.J. Lu
fee9eb6200 Group AVX512 functions in .text.avx512 section
* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S:
	Replace .text with .text.avx512.
	* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S:
	Likewise.
2016-03-06 16:48:11 -08:00
H.J. Lu
16b23e0363 Replace PREINIT_FUNCTION@PLT with *%rax in call
Since we have loaded address of PREINIT_FUNCTION into %rax, we can
avoid extra branch to PLT slot.

	[BZ #19745]
	* sysdeps/x86_64/crti.S (_init): Replace PREINIT_FUNCTION@PLT
	with *%rax in call.
2016-03-04 16:15:41 -08:00
H.J. Lu
21683b5a7d Replace @PLT with @GOTPCREL(%rip) in call
Since __libc_start_main is called very early, lazy binding isn't relevant
here.  Use indirect branch via GOT to avoid extra branch to PLT slot.

	[BZ #19745]
	* sysdeps/x86_64/start.S (_start): __libc_start_main@PLT
	with *__libc_start_main@GOTPCREL(%rip) in call.
2016-03-04 16:15:41 -08:00
H.J. Lu
97f7112728 Add a comment in sysdeps/x86_64/Makefile
Mention recursive calls when ENTRY is used in _mcount.S.

	* sysdeps/x86_64/Makefile (sysdep_noprof): Add a comment.
2016-03-04 08:44:58 -08:00
H.J. Lu
14a1d7cc4c x86-64: Fix memcpy IFUNC selection
Chek Fast_Unaligned_Load, instead of Slow_BSF, and also check for
Fast_Copy_Backward to enable __memcpy_ssse3_back.  Existing selection
order is updated with following selection order:

1. __memcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __memcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __memcpy_sse2 if SSSE3 isn't available.
4. __memcpy_ssse3_back if Fast_Copy_Backward bit it set.
5. __memcpy_ssse3

	[BZ #18880]
	* sysdeps/x86_64/multiarch/memcpy.S: Check Fast_Unaligned_Load,
	instead of Slow_BSF, and also check for Fast_Copy_Backward to
	enable __memcpy_ssse3_back.
2016-03-04 08:39:07 -08:00
Paul Pluzhnikov
5cdc3d9db0 2016-03-03 Paul Pluzhnikov <ppluzhnikov@google.com>
[BZ #19490]
	* sysdeps/x86_64/_mcount.S (_mcount): Add unwind descriptor.
	(__fentry__): Likewise
2016-03-03 09:53:49 -08:00
H.J. Lu
87a07a4376 Copy x86_64 _mcount.op from _mcount.o
No need to compile x86_64 _mcount.S with -pg.  We can just copy the
normal static object.

	* gmon/Makefile (noprof): Add $(sysdep_noprof).
	* sysdeps/x86_64/Makefile (sysdep_noprof): Add _mcount.
2016-03-03 06:56:22 -08:00
H.J. Lu
ec215346b9 Call x86-64 __mcount_internal/__sigjmp_save directly
Since __mcount_internal and __sigjmp_save are internal to x86-64 libc.so:

3532: 0000000000104530   289 FUNC    LOCAL  DEFAULT   13 __mcount_internal
3391: 0000000000034170    38 FUNC    LOCAL  DEFAULT   13 __sigjmp_save

they can be called directly without PLT.

	* sysdeps/x86_64/_mcount.S (C_LABEL(_mcount)): Call
	__mcount_internal directly.
	(C_LABEL(__fentry__)): Likewise.
	* sysdeps/x86_64/setjmp.S __sigsetjmp): Call __sigjmp_save
	directly.
2016-03-01 16:58:07 -08:00
H.J. Lu
8d9c92017d [x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8
Due to GCC bug:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

__tls_get_addr may be called with 8-byte stack alignment.  Although
this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume
that stack will be always aligned at 16 bytes.  Since SSE optimized
memory/string functions with aligned SSE register load and store are
used in the dynamic linker, we must set DL_RUNTIME_UNALIGNED_VEC_SIZE
to 8 so that _dl_runtime_resolve_sse will align the stack before
calling _dl_fixup:

Dump of assembler code for function _dl_runtime_resolve_sse:
   0x00007ffff7deea90 <+0>:	push   %rbx
   0x00007ffff7deea91 <+1>:	mov    %rsp,%rbx
   0x00007ffff7deea94 <+4>:	and    $0xfffffffffffffff0,%rsp
                                ^^^^^^^^^^^ Align stack to 16 bytes
   0x00007ffff7deea98 <+8>:	sub    $0x100,%rsp
   0x00007ffff7deea9f <+15>:	mov    %rax,0xc0(%rsp)
   0x00007ffff7deeaa7 <+23>:	mov    %rcx,0xc8(%rsp)
   0x00007ffff7deeaaf <+31>:	mov    %rdx,0xd0(%rsp)
   0x00007ffff7deeab7 <+39>:	mov    %rsi,0xd8(%rsp)
   0x00007ffff7deeabf <+47>:	mov    %rdi,0xe0(%rsp)
   0x00007ffff7deeac7 <+55>:	mov    %r8,0xe8(%rsp)
   0x00007ffff7deeacf <+63>:	mov    %r9,0xf0(%rsp)
   0x00007ffff7deead7 <+71>:	movaps %xmm0,(%rsp)
   0x00007ffff7deeadb <+75>:	movaps %xmm1,0x10(%rsp)
   0x00007ffff7deeae0 <+80>:	movaps %xmm2,0x20(%rsp)
   0x00007ffff7deeae5 <+85>:	movaps %xmm3,0x30(%rsp)
   0x00007ffff7deeaea <+90>:	movaps %xmm4,0x40(%rsp)
   0x00007ffff7deeaef <+95>:	movaps %xmm5,0x50(%rsp)
   0x00007ffff7deeaf4 <+100>:	movaps %xmm6,0x60(%rsp)
   0x00007ffff7deeaf9 <+105>:	movaps %xmm7,0x70(%rsp)

	[BZ #19679]
	* sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
	Renamed to ...
	(DL_RUNTIME_UNALIGNED_VEC_SIZE): This.  Set to 8.
	(DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.  Updated.
	(DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
	* sysdeps/x86_64/dl-trampoline.h
	(DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ...
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
2016-02-19 15:45:09 -08:00
Andrew Senkevich
a5df3210a6 Use PIC relocation in ALIAS_IMPL
Since libmvec_nonshared.a may be linked into shared objects, ALIAS_IMPL
should use PIC relocation.

	[BZ #19590]
	* sysdeps/x86_64/fpu/svml_finite_alias.S (ALIAS_IMPL): Use PIC
	relocation.
2016-02-17 14:23:32 -08:00
Paul Pluzhnikov
b274130206 2016-01-20 Paul Pluzhnikov <ppluzhnikov@google.com>
[BZ #19490]
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S (pthread_cond_broadcast): Use ENTRY/END
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S (pthread_cond_signal): Likewise
* sysdeps/x86_64/nptl/pthread_spin_lock.S (pthread_spin_lock): Likewise
* sysdeps/x86_64/nptl/pthread_spin_trylock.S (pthread_spin_trylock): Likewise
* sysdeps/x86_64/nptl/pthread_spin_unlock.S (pthread_spin_unlock): Likewise
2016-01-20 13:39:20 -08:00
Andrew Senkevich
df782dc690 Fixed build with assembler w/o AVX-512 support.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Fixed build with
    assembler not supporting AVX-512.
2016-01-19 14:34:53 +03:00
Andrew Senkevich
214a44f394 Fixed typos in __memcpy_chk.
* sysdeps/x86_64/multiarch/memcpy_chk.S: Fixed typos.
2016-01-16 14:42:26 +03:00
Andrew Senkevich
72276d6e88 Added memcpy/memmove family optimized with AVX512 for KNL hardware.
Added AVX512 implementations of memcpy, mempcpy, memmove, memcpy_chk,
mempcpy_chk, memmove_chk.
It shows average improvement more than 30% over AVX versions on KNL
hardware (performance results in the thread
<https://sourceware.org/ml/libc-alpha/2016-01/msg00258.html>).

    * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new files.
    * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
    * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: New file.
    * sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memcpy.S: Added new IFUNC branch.
    * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove.c: Likewise.
    * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
    * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
    * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
2016-01-16 00:49:45 +03:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Andrew Senkevich
83d776f979 Added memset optimized with AVX512 for KNL hardware.
It shows improvement up to 28% over AVX2 memset (performance results
attached at <https://sourceware.org/ml/libc-alpha/2015-12/msg00052.html>).

    * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: New file.
    * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new file.
    * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
    * sysdeps/x86_64/multiarch/memset.S: Added new IFUNC branch.
    * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
    * sysdeps/x86/cpu-features.h (bit_Prefer_No_VZEROUPPER,
    index_Prefer_No_VZEROUPPER): New.
    * sysdeps/x86/cpu-features.c (init_cpu_features): Set the
    Prefer_No_VZEROUPPER for Knights Landing.
2015-12-19 02:47:28 +03:00
Andrew Senkevich
977a30801f Better workaround for aliases of *_finite symbols in vector math library.
Old workaround based on assembly aliases can lead to link fail (bug 19058).
This patch makes workaround in another way to avoid it.

    [BZ #19058]
    * math/Makefile ($(inst_libdir)/libm.so): Added libmvec_nonshared.a
    to AS_NEEDED.
    * sysdeps/x86/fpu/bits/math-vector.h: Removed code with old workaround.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support,
    libmvec-static-only-routines): Added new file.
    * sysdeps/x86_64/fpu/svml_finite_alias.S: New file.
2015-11-27 16:22:26 +03:00
Joseph Myers
01189b083b Fix i386/x86_64 log* (1) zero sign for -ffinite-math-only (bug 19213).
For the -ffinite-math-only versions of various x86_64 and x86 log*
functions, a zero result from log* (1) is returned with incorrect sign
in round-downward mode.  This patch fixes this in a similar way to the
previous fixes for the non-*_finite versions of the functions.

Tested for x86_64 and x86 (including an i586 build), together with a
patch that will be applied separately to enable the main libm-test.inc
tests for the finite-math-only functions.

	[BZ #19213]
	* sysdeps/i386/fpu/e_log.S (__log_finite): Ensure +0 is always
	returned for argument 1.
	* sysdeps/i386/fpu/e_logf.S (__logf_finite): Likewise.
	* sysdeps/i386/fpu/e_logl.S (__logl_finite): Likewise.
	* sysdeps/i386/i686/fpu/e_logl.S (__logl_finite): Likewise.
	* sysdeps/x86_64/fpu/e_log10l.S (__log10l_finite): Likewise.
	* sysdeps/x86_64/fpu/e_log2l.S (__log2l_finite): Likewise.
	* sysdeps/x86_64/fpu/e_logl.S (__logl_finite): Likewise.
2015-11-05 21:56:31 +00:00
Joseph Myers
9f9f27248b Remove miscellaneous GCC >= 4.7 version conditionals.
This patch removes miscellaneous __GNUC_PREREQ (4, 7) conditionals
that are now dead.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).

	* sysdeps/arm/atomic-machine.h
	[__GNUC_PREREQ (4, 7) && __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]:
	Change conditional to [__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4].
	[__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 && !__GNUC_PREREQ (4, 7)]:
	Remove conditional code.
	[!__GNUC_PREREQ (4, 7) || !__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]:
	Change conditional to [!__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4].
	* sysdeps/i386/sysdep.h [__ASSEMBLER__ && __GNUC_PREREQ (4, 7)]:
	Change conditional to [__ASSEMBLER__].
	[__ASSEMBLER__ && !__GNUC_PREREQ (4, 7)]: Remove conditional code.
	[!__ASSEMBLER__ && __GNUC_PREREQ (4, 7)]: Change conditional to
	[!__ASSEMBLER__].
	[!__ASSEMBLER__ && !__GNUC_PREREQ (4, 7)]: Remove conditional
	code.
	* sysdeps/unix/sysv/linux/sh/atomic-machine.h (rNOSP): Remove
	conditional macro definitions.
	(__arch_compare_and_exchange_val_8_acq): Use "u" instead of rNOSP.
	(__arch_compare_and_exchange_val_16_acq): Likewise.
	(__arch_compare_and_exchange_val_32_acq): Likewise.
	(atomic_exchange_and_add): Likewise.
	(atomic_add): Likewise.
	(atomic_add_negative): Likewise.
	(atomic_add_zero): Likewise.
	(atomic_bit_set): Likewise.
	(atomic_bit_test_set): Likewise.
	* sysdeps/x86_64/atomic-machine.h [__GNUC_PREREQ (4, 7)]: Make
	code unconditional.
	[!__GNUC_PREREQ (4, 7)]: Remove conditional code.
2015-11-04 21:34:36 +00:00
Joseph Myers
91bcb95ad4 Remove cpuid.h configure tests.
There are configure tests for the cpuid.h header for x86 / x86_64.
GCC 4.3 and later install this header, so those tests are obsolete.
This patch removes them.

Tested for x86_64 and x86 (testsuite, and that installed shared
libraries are unchanged by the patch).

	* sysdeps/i386/configure.ac (cpuid.h): Do not test for header.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/x86_64/configure.ac (cpuid.h): Do not test for header.
	* sysdeps/x86_64/configure: Regenerated.
2015-10-29 14:43:46 +00:00
Joseph Myers
2145f97cee Handle more state in i386/x86_64 fesetenv (bug 16068).
fenv_t should include architecture-specific floating-point modes and
status flags.  i386 and x86_64 fesetenv limit which bits they use from
the x87 status and control words, when using saved state, and limit
which parts of the state they set to fixed values, when using
FE_DFL_ENV / FE_NOMASK_ENV.  The following should be included but are
excluded in at least some cases: status and masking for the "denormal
operand" exception (which isn't part of FE_ALL_EXCEPT); precision
control (explicitly mentioned in Annex F as something that counts as
part of the floating-point environment); MXCSR FZ and DAZ bits (for
FE_DFL_ENV and FE_NOMASK_ENV).  This patch arranges for this extra
state to be handled by fesetenv (and thereby by feupdateenv, which
calls fesetenv).

(Note that glibc functions using floating point are not generally
expected to work correctly with non-default values of this state,
especially precision control, but it is still logically part of the
floating-point environment and should be handled as such by fesetenv.
Changes to the state relating to subnormals ought generally to work
with libm functions when the arguments aren't subnormal and neither
are the expected results; that's a consequence of functions avoiding
spurious internal underflows.)

A question arising from this is whether FE_NOMASK_ENV should or should
not mask the "denormal operand" exception.  I decided it should mask
that exception.  This is the status quo - previously that exception
could only be unmasked by direct manipulation of control registers
(possibly via <fpu_control.h>).  In addition, it means that use of
FE_NOMASK_ENV leaves a floating-point environment the same as could be
obtained by fesetenv (FE_DFL_ENV); feenableexcept (FE_ALL_EXCEPT);,
rather than an environment in which an exception is unmasked that
could only be masked again by using fesetenv with FE_DFL_ENV (or a
previously saved environment) - this exception not being usable with
other <fenv.h> functions because it's outside FE_ALL_EXCEPT.

Tested for x86_64 and x86.

	[BZ #16068]
	* sysdeps/i386/fpu/fesetenv.c: Include <fpu_control.h>.
	(FE_ALL_EXCEPT_X86): New macro.
	(__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of
	FE_ALL_EXCEPT.  Ensure precision control is included in
	floating-point state.  Ensure that FE_DFL_ENV and FE_NOMASK_ENV
	handle "denormal operand exception" and clear FZ and DAZ bits.
	* sysdeps/x86_64/fpu/fesetenv.c: Include <fpu_control.h>.
	(FE_ALL_EXCEPT_X86): New macro.
	(__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of
	FE_ALL_EXCEPT.  Ensure precision control is included in
	floating-point state.  Ensure that FE_DFL_ENV and FE_NOMASK_ENV
	handle "denormal operand exception" and clear FZ and DAZ bits.
	* sysdeps/x86/fpu/test-fenv-sse-2.c: New file.
	* sysdeps/x86/fpu/test-fenv-x87.c: Likewise.
	* sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add
	test-fenv-x87 and test-fenv-sse-2.
	[$(subdir) = math] (CFLAGS-test-fenv-sse-2.c): New variable.
2015-10-28 22:58:29 +00:00
Joseph Myers
0b9af583a5 Fix i386/x86_64 fesetenv SSE exception clearing (bug 19181).
The i386 and x86_64 versions of fesetenv, when called with FE_DFL_ENV
or FE_NOMASK_ENV as argument, do not clear SSE exceptions raised in
MXCSR.  These arguments should, like other fenv_t values, represent
the whole of the floating-point state, so such exceptions should be
cleared; this patch adds the required clearing.  (Discovered while
working on bug 16068.)

Tested for x86_64 and x86.

	[BZ #19181]
	* sysdeps/i386/fpu/fesetenv.c (__fesetenv): Clear already-raised
	SSE exceptions when argument is FE_DFL_ENV or FE_NOMASK_ENV.
	* sysdeps/x86_64/fpu/fesetenv.c (__fesetenv): Likewise.
	* math/test-fenv-clear-main.c: New file.
	* math/test-fenv-clear.c: Likewise.
	* math/Makefile (tests): Add test-fenv-clear.
	* sysdeps/x86/fpu/test-fenv-clear-sse.c: New file.
	* sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add
	test-fenv-clear-sse.
	[$(subdir) = math] (CFLAGS-test-fenv-clear-sse.c): New variable.
2015-10-28 18:50:20 +00:00
Joseph Myers
c871b9b096 Remove -mavx2 configure tests.
There are configure tests for the -mavx2 compiler option.  AVX2
support was added in GCC 4.7, so these tests are now obsolete; this
patch removes them.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).

	* sysdeps/i386/configure.ac (libc_cv_cc_avx2): Remove configure
	test.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/x86_64/configure.ac (libc_cv_cc_avx2): Remove configure
	test.
	* sysdeps/x86_64/configure: Regenerated.
	* config.h.in (HAVE_AVX2_SUPPORT): Remove #undef.
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memset-avx2 unconditionally instead of conditionally on
	[$(config-cflags-avx2) = yes].
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list) [HAVE_AVX2_SUPPORT]: Make code
	unconditional.
	* sysdeps/x86_64/multiarch/memset.S [HAVE_AVX2_SUPPORT]: Likewise.
	* sysdeps/x86_64/multiarch/memset_chk.S
	[IS_IN (libc) && SHARED && HAVE_AVX2_SUPPORT]: Change conditional
	to [IS_IN (libc) && SHARED].
2015-10-28 13:29:03 +00:00
Florian Weimer
e39edb0552 x86_64: Regenerate ulps [BZ #19168]
This comes from running “make regen-ulps” on AMD Opteron 6272 CPUs.
2015-10-26 13:15:48 +01:00
Joseph Myers
9d1687b2df Add more libm tests (ilogb, is*, j0, j1, jn, lgamma, log*).
This patch improves the libm test coverage for a few more functions.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of log, log10, log1p and
	log2.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (MAX_EXP): New macro.
	(ilogb_test_data): Add more tests.
	(isfinite_test_data): Likewise.
	(isgreater_test_data): Likewise.
	(isgreaterequal_test_data): Likewise.
	(isinf_test_data): Likewise.
	(isless_test_data): Likewise.
	(islessequal_test_data): Likewise.
	(islessgreater_test_data): Likewise.
	(isnan_test_data): Likewise.
	(isnormal_test_data): Likewise.
	(issignaling_test_data): Likewise.
	(isunordered_test_data): Likewise.
	(j0_test_data): Likewise.
	(j1_test_data): Likewise.
	(jn_test_data): Likewise.
	(lgamma_test_data): Likewise.
	(log_test_data): Likewise.
	(log10_test_data): Likewise.
	(log1p_test_data): Likewise.
	(log2_test_data): Likewise.
	(logb_test_data): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2015-10-23 22:46:05 +00:00
Joseph Myers
846d9a4a3a Fix i386 / x86_64 nearbyint exception clearing (bug 15491).
The implementations of nearbyint functions using x87 floating point
(i386 all versions, x86_64 long double only) use the fclex
instruction, which clears any exceptions that were raised before the
function was called.  These functions must not clear exceptions that
were raised before they were called.

This patch fixes these functions to save and restore the whole
floating-point environment (fnstenv / fldenv) as the way of avoiding
raising "inexact" (recall that there isn't an x87 instruction for
loading just the status word, so the whole environment has to be saved
and loaded instead - the code already saved and loaded the control
word, which is now obtained from the saved environment after this
patch, to disable traps on "inexact").  In the case of the long double
functions, any "invalid" exception from frndint (applied to a
signaling NaN) needs merging into the saved state; this issue doesn't
apply to the float and double functions because that exception would
have been raised when the argument is loaded, before the environment
is saved.

	[BZ #15491]
	* sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Save and restore
	floating-point environment instead of clearing all exceptions.
	* sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
	* sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise,
	merging in "invalid" exceptions from frndint.
	* sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.
	* math/test-nearbyint-except.c: New file.
	* math/Makefile (tests): Add test-nearbyint-except.
2015-10-22 23:14:55 +00:00
Joseph Myers
bd2260a206 Convert 231 sysdeps function definitions to prototype style.
This mostly automatically-generated patch converts 231 sysdeps
function definitions in glibc from old-style K&R to prototype-style.

For __aio_sigqueue and __gai_sigqueue I had to add internal_function
to the definitions as noted by Florian in
<https://sourceware.org/ml/libc-alpha/2015-10/msg00595.html> to keep
the functions compiling on x86 after conversion to prototype
definitions.  Otherwise, the patch is automatically generated with all
the same exclusions and caveats as in
<https://sourceware.org/ml/libc-alpha/2015-10/msg00594.html> except
that it's a patch for sysdeps files.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).  Also tested for arm,
mips64 and powerpc32 that installed stripped shared libraries are
unchanged by the patch.

	* sysdeps/arm/backtrace.c (__backtrace): Convert to
	prototype-style function definition.
	* sysdeps/i386/backtrace.c (__backtrace): Likewise.
	* sysdeps/i386/ffs.c (__ffs): Likewise.
	* sysdeps/i386/i686/ffs.c (__ffs): Likewise.
	* sysdeps/ia64/nptl/pthread_spin_lock.c (pthread_spin_lock):
	Likewise.
	* sysdeps/ia64/nptl/pthread_spin_trylock.c (pthread_spin_trylock):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l):
	Likewise.
	* sysdeps/m68k/ffs.c (__ffs): Likewise.
	* sysdeps/m68k/m680x0/fpu/e_acos.c (FUNC): Likewise.
	* sysdeps/m68k/m680x0/fpu/e_fmod.c (FUNC): Likewise.
	* sysdeps/mach/adjtime.c (__adjtime): Likewise.
	* sysdeps/mach/gettimeofday.c (__gettimeofday): Likewise.
	* sysdeps/mach/hurd/_exit.c (_exit): Likewise.
	* sysdeps/mach/hurd/access.c (__access): Likewise.
	* sysdeps/mach/hurd/adjtime.c (__adjtime): Likewise.
	* sysdeps/mach/hurd/chdir.c (__chdir): Likewise.
	* sysdeps/mach/hurd/chmod.c (__chmod): Likewise.
	* sysdeps/mach/hurd/chown.c (__chown): Likewise.
	* sysdeps/mach/hurd/cthreads.c (cthread_keycreate): Likewise.
	(cthread_getspecific): Likewise.
	(cthread_setspecific): Likewise.
	(__libc_getspecific): Likewise.
	* sysdeps/mach/hurd/euidaccess.c (__euidaccess): Likewise.
	* sysdeps/mach/hurd/faccessat.c (faccessat): Likewise.
	* sysdeps/mach/hurd/fchdir.c (__fchdir): Likewise.
	* sysdeps/mach/hurd/fchmod.c (__fchmod): Likewise.
	* sysdeps/mach/hurd/fchmodat.c (fchmodat): Likewise.
	* sysdeps/mach/hurd/fchown.c (__fchown): Likewise.
	* sysdeps/mach/hurd/fchownat.c (fchownat): Likewise.
	* sysdeps/mach/hurd/flock.c (__flock): Likewise.
	* sysdeps/mach/hurd/fsync.c (fsync): Likewise.
	* sysdeps/mach/hurd/ftruncate.c (__ftruncate): Likewise.
	* sysdeps/mach/hurd/getgroups.c (__getgroups): Likewise.
	* sysdeps/mach/hurd/gethostname.c (__gethostname): Likewise.
	* sysdeps/mach/hurd/getitimer.c (__getitimer): Likewise.
	* sysdeps/mach/hurd/getlogin_r.c (__getlogin_r): Likewise.
	* sysdeps/mach/hurd/getpgid.c (__getpgid): Likewise.
	* sysdeps/mach/hurd/getrusage.c (__getrusage): Likewise.
	* sysdeps/mach/hurd/getsockname.c (__getsockname): Likewise.
	* sysdeps/mach/hurd/group_member.c (__group_member): Likewise.
	* sysdeps/mach/hurd/isatty.c (__isatty): Likewise.
	* sysdeps/mach/hurd/lchown.c (__lchown): Likewise.
	* sysdeps/mach/hurd/link.c (__link): Likewise.
	* sysdeps/mach/hurd/linkat.c (linkat): Likewise.
	* sysdeps/mach/hurd/listen.c (__listen): Likewise.
	* sysdeps/mach/hurd/mkdir.c (__mkdir): Likewise.
	* sysdeps/mach/hurd/mkdirat.c (mkdirat): Likewise.
	* sysdeps/mach/hurd/openat.c (__openat): Likewise.
	* sysdeps/mach/hurd/poll.c (__poll): Likewise.
	* sysdeps/mach/hurd/readlink.c (__readlink): Likewise.
	* sysdeps/mach/hurd/readlinkat.c (readlinkat): Likewise.
	* sysdeps/mach/hurd/recv.c (__recv): Likewise.
	* sysdeps/mach/hurd/rename.c (rename): Likewise.
	* sysdeps/mach/hurd/renameat.c (renameat): Likewise.
	* sysdeps/mach/hurd/revoke.c (revoke): Likewise.
	* sysdeps/mach/hurd/rewinddir.c (__rewinddir): Likewise.
	* sysdeps/mach/hurd/rmdir.c (__rmdir): Likewise.
	* sysdeps/mach/hurd/seekdir.c (seekdir): Likewise.
	* sysdeps/mach/hurd/send.c (__send): Likewise.
	* sysdeps/mach/hurd/setdomain.c (setdomainname): Likewise.
	* sysdeps/mach/hurd/setegid.c (setegid): Likewise.
	* sysdeps/mach/hurd/seteuid.c (seteuid): Likewise.
	* sysdeps/mach/hurd/setgid.c (__setgid): Likewise.
	* sysdeps/mach/hurd/setgroups.c (setgroups): Likewise.
	* sysdeps/mach/hurd/sethostid.c (sethostid): Likewise.
	* sysdeps/mach/hurd/sethostname.c (sethostname): Likewise.
	* sysdeps/mach/hurd/setlogin.c (setlogin): Likewise.
	* sysdeps/mach/hurd/setpgid.c (__setpgid): Likewise.
	* sysdeps/mach/hurd/setregid.c (__setregid): Likewise.
	* sysdeps/mach/hurd/setreuid.c (__setreuid): Likewise.
	* sysdeps/mach/hurd/settimeofday.c (__settimeofday): Likewise.
	* sysdeps/mach/hurd/setuid.c (__setuid): Likewise.
	* sysdeps/mach/hurd/shutdown.c (shutdown): Likewise.
	* sysdeps/mach/hurd/sigaction.c (__sigaction): Likewise.
	* sysdeps/mach/hurd/sigaltstack.c (__sigaltstack): Likewise.
	* sysdeps/mach/hurd/sigpending.c (sigpending): Likewise.
	* sysdeps/mach/hurd/sigprocmask.c (__sigprocmask): Likewise.
	* sysdeps/mach/hurd/sigsuspend.c (__sigsuspend): Likewise.
	* sysdeps/mach/hurd/socket.c (__socket): Likewise.
	* sysdeps/mach/hurd/symlink.c (__symlink): Likewise.
	* sysdeps/mach/hurd/symlinkat.c (symlinkat): Likewise.
	* sysdeps/mach/hurd/telldir.c (telldir): Likewise.
	* sysdeps/mach/hurd/truncate.c (__truncate): Likewise.
	* sysdeps/mach/hurd/umask.c (__umask): Likewise.
	* sysdeps/mach/hurd/unlink.c (__unlink): Likewise.
	* sysdeps/mach/hurd/unlinkat.c (unlinkat): Likewise.
	* sysdeps/mips/mips64/__longjmp.c (__longjmp): Likewise.
	* sysdeps/posix/alarm.c (alarm): Likewise.
	* sysdeps/posix/cuserid.c (cuserid): Likewise.
	* sysdeps/posix/dirfd.c (dirfd): Likewise.
	* sysdeps/posix/dup.c (__dup): Likewise.
	* sysdeps/posix/dup2.c (__dup2): Likewise.
	* sysdeps/posix/euidaccess.c (euidaccess): Likewise.
	(main): Likewise.
	* sysdeps/posix/flock.c (__flock): Likewise.
	* sysdeps/posix/fpathconf.c (__fpathconf): Likewise.
	* sysdeps/posix/getcwd.c (__getcwd): Likewise.
	* sysdeps/posix/gethostname.c (__gethostname): Likewise.
	* sysdeps/posix/gettimeofday.c (__gettimeofday): Likewise.
	* sysdeps/posix/isatty.c (__isatty): Likewise.
	* sysdeps/posix/killpg.c (killpg): Likewise.
	* sysdeps/posix/libc_fatal.c (__libc_fatal): Likewise.
	* sysdeps/posix/mkfifoat.c (mkfifoat): Likewise.
	* sysdeps/posix/raise.c (raise): Likewise.
	* sysdeps/posix/remove.c (remove): Likewise.
	* sysdeps/posix/rename.c (rename): Likewise.
	* sysdeps/posix/rewinddir.c (__rewinddir): Likewise.
	* sysdeps/posix/seekdir.c (seekdir): Likewise.
	* sysdeps/posix/sigblock.c (__sigblock): Likewise.
	* sysdeps/posix/sigignore.c (sigignore): Likewise.
	* sysdeps/posix/sigintr.c (siginterrupt): Likewise.
	* sysdeps/posix/signal.c (__bsd_signal): Likewise.
	* sysdeps/posix/sigset.c (sigset): Likewise.
	* sysdeps/posix/sigsuspend.c (__sigsuspend): Likewise.
	* sysdeps/posix/sysconf.c (__sysconf): Likewise.
	* sysdeps/posix/sysv_signal.c (__sysv_signal): Likewise.
	* sysdeps/posix/time.c (time): Likewise.
	* sysdeps/posix/ttyname.c (getttyname): Likewise.
	(ttyname): Likewise.
	* sysdeps/posix/ttyname_r.c (__ttyname_r): Likewise.
	* sysdeps/posix/utime.c (utime): Likewise.
	* sysdeps/powerpc/fpu/s_isnan.c (__isnan): Likewise.
	* sysdeps/powerpc/nptl/pthread_spin_lock.c (pthread_spin_lock):
	Likewise.
	* sysdeps/powerpc/nptl/pthread_spin_trylock.c
	(pthread_spin_trylock): Likewise.
	* sysdeps/pthread/aio_error.c (aio_error): Likewise.
	* sysdeps/pthread/aio_read.c (aio_read): Likewise.
	* sysdeps/pthread/aio_read64.c (aio_read64): Likewise.
	* sysdeps/pthread/aio_write.c (aio_write): Likewise.
	* sysdeps/pthread/aio_write64.c (aio_write64): Likewise.
	* sysdeps/pthread/flockfile.c (__flockfile): Likewise.
	* sysdeps/pthread/ftrylockfile.c (__ftrylockfile): Likewise.
	* sysdeps/pthread/funlockfile.c (__funlockfile): Likewise.
	* sysdeps/pthread/timer_create.c (timer_create): Likewise.
	* sysdeps/pthread/timer_getoverr.c (timer_getoverrun): Likewise.
	* sysdeps/pthread/timer_gettime.c (timer_gettime): Likewise.
	* sysdeps/s390/ffs.c (__ffs): Likewise.
	* sysdeps/s390/nptl/pthread_spin_lock.c (pthread_spin_lock):
	Likewise.
	* sysdeps/s390/nptl/pthread_spin_trylock.c (pthread_spin_trylock):
	Likewise.
	* sysdeps/sh/nptl/pthread_spin_lock.c (pthread_spin_lock):
	Likewise.
	* sysdeps/sparc/nptl/pthread_barrier_destroy.c
	(pthread_barrier_destroy): Likewise.
	* sysdeps/sparc/nptl/pthread_barrier_wait.c
	(__pthread_barrier_wait): Likewise.
	* sysdeps/sparc/sparc32/e_sqrt.c (__ieee754_sqrt): Likewise.
	* sysdeps/sparc/sparc32/pthread_barrier_wait.c
	(__pthread_barrier_wait): Likewise.
	* sysdeps/sparc/sparc32/sem_init.c (__old_sem_init): Likewise.
	* sysdeps/tile/memcmp.c (memcmp_common_alignment): Likewise.
	(memcmp_not_common_alignment): Likewise.
	(MEMCMP): Likewise.
	* sysdeps/tile/wordcopy.c (_wordcopy_fwd_aligned): Likewise.
	(_wordcopy_fwd_dest_aligned): Likewise.
	(_wordcopy_bwd_aligned): Likewise.
	(_wordcopy_bwd_dest_aligned): Likewise.
	* sysdeps/unix/bsd/ftime.c (ftime): Likewise.
	* sysdeps/unix/bsd/gtty.c (gtty): Likewise.
	* sysdeps/unix/bsd/stty.c (stty): Likewise.
	* sysdeps/unix/bsd/tcflow.c (tcflow): Likewise.
	* sysdeps/unix/bsd/tcflush.c (tcflush): Likewise.
	* sysdeps/unix/bsd/tcgetattr.c (__tcgetattr): Likewise.
	* sysdeps/unix/bsd/tcgetpgrp.c (tcgetpgrp): Likewise.
	* sysdeps/unix/bsd/tcsendbrk.c (tcsendbreak): Likewise.
	* sysdeps/unix/bsd/tcsetattr.c (tcsetattr): Likewise.
	* sysdeps/unix/bsd/tcsetpgrp.c (tcsetpgrp): Likewise.
	* sysdeps/unix/bsd/ualarm.c (ualarm): Likewise.
	* sysdeps/unix/bsd/wait3.c (__wait3): Likewise.
	* sysdeps/unix/getlogin_r.c (__getlogin_r): Likewise.
	* sysdeps/unix/sockatmark.c (sockatmark): Likewise.
	* sysdeps/unix/stime.c (stime): Likewise.
	* sysdeps/unix/sysv/linux/_exit.c (_exit): Likewise.
	* sysdeps/unix/sysv/linux/aio_sigqueue.c (__aio_sigqueue):
	Likewise.  Use internal_function.
	* sysdeps/unix/sysv/linux/arm/sigaction.c (__libc_sigaction):
	Convert to prototype-style function definition.
	* sysdeps/unix/sysv/linux/faccessat.c (faccessat): Likewise.
	* sysdeps/unix/sysv/linux/fchmodat.c (fchmodat): Likewise.
	* sysdeps/unix/sysv/linux/fpathconf.c (__fpathconf): Likewise.
	* sysdeps/unix/sysv/linux/gai_sigqueue.c (__gai_sigqueue):
	Likewise.  Use internal_function.
	* sysdeps/unix/sysv/linux/gethostid.c (sethostid): Convert to
	prototype-style function definition
	* sysdeps/unix/sysv/linux/getlogin_r.c (__getlogin_r_loginuid):
	Likewise.
	(__getlogin_r): Likewise.
	* sysdeps/unix/sysv/linux/getpt.c (__posix_openpt): Likewise.
	* sysdeps/unix/sysv/linux/hppa/pthread_cond_broadcast.c
	(__pthread_cond_broadcast): Likewise.
	* sysdeps/unix/sysv/linux/hppa/pthread_cond_destroy.c
	(__pthread_cond_destroy): Likewise.
	* sysdeps/unix/sysv/linux/hppa/pthread_cond_init.c
	(__pthread_cond_init): Likewise.
	* sysdeps/unix/sysv/linux/hppa/pthread_cond_signal.c
	(__pthread_cond_signal): Likewise.
	* sysdeps/unix/sysv/linux/hppa/pthread_cond_wait.c
	(__pthread_cond_wait): Likewise.
	* sysdeps/unix/sysv/linux/i386/getmsg.c (getmsg): Likewise.
	* sysdeps/unix/sysv/linux/i386/setegid.c (setegid): Likewise.
	* sysdeps/unix/sysv/linux/ia64/sigaction.c (__libc_sigaction):
	Likewise.
	* sysdeps/unix/sysv/linux/ia64/sigpending.c (sigpending):
	Likewise.
	* sysdeps/unix/sysv/linux/ia64/sigprocmask.c (__sigprocmask):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/sigaction.c (__libc_sigaction):
	Likewise.
	* sysdeps/unix/sysv/linux/msgget.c (msgget): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/ftruncate64.c
	(__ftruncate64): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/truncate64.c
	(truncate64): Likewise.
	* sysdeps/unix/sysv/linux/pt-raise.c (raise): Likewise.
	* sysdeps/unix/sysv/linux/pthread_getcpuclockid.c
	(pthread_getcpuclockid): Likewise.
	* sysdeps/unix/sysv/linux/pthread_getname.c (pthread_getname_np):
	Likewise.
	* sysdeps/unix/sysv/linux/pthread_setname.c (pthread_setname_np):
	Likewise.
	* sysdeps/unix/sysv/linux/pthread_sigmask.c (pthread_sigmask):
	Likewise.
	* sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue):
	Likewise.
	* sysdeps/unix/sysv/linux/raise.c (raise): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/sigaction.c
	(__libc_sigaction): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/sigpending.c (sigpending):
	Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/sigprocmask.c
	(__sigprocmask): Likewise.
	* sysdeps/unix/sysv/linux/semget.c (semget): Likewise.
	* sysdeps/unix/sysv/linux/semop.c (semop): Likewise.
	* sysdeps/unix/sysv/linux/setrlimit64.c (setrlimit64): Likewise.
	* sysdeps/unix/sysv/linux/shmat.c (shmat): Likewise.
	* sysdeps/unix/sysv/linux/shmdt.c (shmdt): Likewise.
	* sysdeps/unix/sysv/linux/shmget.c (shmget): Likewise.
	* sysdeps/unix/sysv/linux/sigaction.c (__libc_sigaction):
	Likewise.
	* sysdeps/unix/sysv/linux/sigpending.c (sigpending): Likewise.
	* sysdeps/unix/sysv/linux/sigprocmask.c (__sigprocmask): Likewise.
	* sysdeps/unix/sysv/linux/sigqueue.c (__sigqueue): Likewise.
	* sysdeps/unix/sysv/linux/sigstack.c (sigstack): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/sigpending.c (sigpending):
	Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/sigprocmask.c
	(__sigprocmask): Likewise.
	* sysdeps/unix/sysv/linux/speed.c (cfgetospeed): Likewise.
	(cfgetispeed): Likewise.
	(cfsetospeed): Likewise.
	(cfsetispeed): Likewise.
	* sysdeps/unix/sysv/linux/tcflow.c (tcflow): Likewise.
	* sysdeps/unix/sysv/linux/tcflush.c (tcflush): Likewise.
	* sysdeps/unix/sysv/linux/tcgetattr.c (__tcgetattr): Likewise.
	* sysdeps/unix/sysv/linux/tcsetattr.c (tcsetattr): Likewise.
	* sysdeps/unix/sysv/linux/time.c (time): Likewise.
	* sysdeps/unix/sysv/linux/timer_create.c (timer_create): Likewise.
	* sysdeps/unix/sysv/linux/timer_delete.c (timer_delete): Likewise.
	* sysdeps/unix/sysv/linux/timer_getoverr.c (timer_getoverrun):
	Likewise.
	* sysdeps/unix/sysv/linux/timer_gettime.c (timer_gettime):
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/sigpending.c (sigpending):
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/sigprocmask.c (__sigprocmask):
	Likewise.
	* sysdeps/x86_64/backtrace.c (__backtrace): Likewise.
2015-10-19 12:04:33 +00:00
H.J. Lu
9edf9b18b1 Mark x86 _dl_unmap/_dl_make_tlsdesc_dynamic hidden
Since x86 _dl_unmap and _dl_make_tlsdesc_dynamic are only used
internally in ld.so, they can be made hidden.

	[BZ #19122]
	* sysdeps/i386/dl-lookupcfg.h (_dl_unmap): Add attribute_hidden.
	* sysdeps/i386/dl-tlsdesc.h (_dl_make_tlsdesc_dynamic):
	Likewise.
	* sysdeps/x86_64/dl-tlsdesc.h (_dl_make_tlsdesc_dynamic):
	Likewise.
	* sysdeps/x86_64/dl-lookupcfg.h (_dl_unmap): Likewise.
2015-10-15 13:48:54 -07:00
H.J. Lu
d3d9c95aef Support PLT and GOT references in local PIC check
Linker in binutils 2.26 and newer generate GOT references instead
PLT references when -z now is passed to linker.  We need to extend
scripts/localplt.awk to allow PLT or GOT references.

	[BZ #19007]
	* scripts/localplt.awk: Also allow GOT references.
	* sysdeps/unix/sysv/linux/i386/localplt.data: Mark
	_Unwind_Find_FDE, calloc, memalign, realloc and __libc_memalign
	with "+ REL R_386_GLOB_DAT".
	* sysdeps/x86_64/localplt.data: Mark calloc, memalign, realloc
	and __libc_memalign with "+ RELA R_X86_64_GLOB_DAT".
2015-10-14 06:00:02 -07:00
H.J. Lu
0a5768fe9c Support x86-64 assmebler without AVX512
When x86-64 assmebler doesn't support AVX512, we should make
_dl_runtime_resolve_avx512/_dl_runtime_profile_avx512 as aliases of
_dl_runtime_resolve_avx/_dl_runtime_profile_avx.  Tested on x86-64
using GCC 5.2 with binutils 20151008 and GCC 4.8 with binutils 20130219.
There are no differences in ld.so with binutils 20151008.  There are no
unexpected failures with binutils 20130219 and 20151008.

	[BZ #19124]
	* sysdeps/x86_64/dl-trampoline.S [!HAVE_AVX512_ASM_SUPPORT]
	(_dl_runtime_resolve_avx512): Make it a hidden alias of
	_dl_runtime_resolve_avx.
	(_dl_runtime_profile_avx512): Make it a hidden alias of
	_dl_runtime_profile_avx.
2015-10-13 10:36:27 -07:00
H.J. Lu
4b71ce6c1a Update lrint/lrintf/lrintl for x32
The x86_64 versions of lrint/lrintf/ lrintl are aliases for the long
long versions which isn't correct for x32, where exceptions must respect
overflow for 32-bit long.  Separate versions of the long functions for
x32 that convert to 32-bit long and raise the right exceptions for that
conversion, while keeping the aliases in the non-x32 case.

Tested on x86_64 and x32.  There are no code changes in libm.so on
x86_64.

	* sysdeps/x86_64/fpu/s_llrint.S (__lrint): Add alias only if
	__ILP32__ isn't defined.
	(lrint): Likewise.
	* sysdeps/x86_64/fpu/s_llrintf.S (__lrintf): Likewise.
	(lrintf): Likewise.
	* sysdeps/x86_64/fpu/s_llrintl.S (__lrintl): Likewise.
	(lrintl): Likewise.
	* sysdeps/x86_64/x32/fpu/s_lrint.S: New file.
	* sysdeps/x86_64/x32/fpu/s_lrintf.S: Likewise.
	* sysdeps/x86_64/x32/fpu/s_lrintl.S: Likewise.
2015-10-09 11:42:10 -07:00
Joseph Myers
ae5d8eaed0 Remove configure tests for -mno-vzeroupper support.
GCC added support for -mno-vzeroupper in version 4.6.  Thus the
configure tests for this support are obsolete, and this patch removes
them.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by this patch).

	* sysdeps/i386/configure.ac (libc_cv_cc_novzeroupper): Remove
	configure test.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/x86_64/configure.ac (libc_cv_cc_novzeroupper): Remove
	configure test.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/Makefile [$(config-cflags-novzeroupper) = yes]:
	Make code unconditional.
2015-10-09 16:03:48 +00:00
Joseph Myers
b7848899a5 Remove configure tests for FMA4 support.
GCC added support for -mfma4 in version 4.5.  Thus the configure tests
for this support are obsolete, and this patch removes them.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by this patch).

	* sysdeps/i386/configure.ac (libc_cv_cc_fma4): Remove configure
	test.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/x86_64/configure.ac (libc_cv_cc_fma4): Remove configure
	test.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/fpu/multiarch/Makefile [$(have-mfma4) = yes]:
	Make code unconditional.
	* sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_atan2.c [HAVE_FMA4_SUPPORT]:
	Likewise.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/e_log.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/e_pow.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	* sysdeps/x86_64/fpu/multiarch/s_atan.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/s_sin.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c [HAVE_FMA4_SUPPORT]: Make
	code unconditional.
	[!HAVE_FMA4_SUPPORT]: Remove conditional code.
	* config.h.in (HAVE_FMA4_SUPPORT): Remove #undef.
2015-10-09 16:02:54 +00:00
Joseph Myers
1b12cd7f4d Remove configure tests for AVX support.
GCC added support for -mavx and -msse2avx in version 4.4.  Thus the
configure tests for this support are obsolete, and this patch removes
them.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by this patch).

	* sysdeps/i386/configure.ac (libc_cv_cc_avx): Remove configure
	test.
	(libc_cv_cc_sse2avx): Likewise.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/i386/i686/multiarch/Makefile
	[$(subdir)$(config-cflags-avx) = mathyes]: Change conditional to
	[$(subdir) = math].
	* sysdeps/i386/i686/multiarch/s_fma-fma.c [HAVE_AVX_SUPPORT]: Make
	code unconditional.
	* sysdeps/i386/i686/multiarch/s_fma.c [HAVE_AVX_SUPPORT]:
	Likewise.
	* sysdeps/i386/i686/multiarch/s_fmaf-fma.c [HAVE_AVX_SUPPORT]:
	Likewise.
	* sysdeps/i386/i686/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/configure.ac (libc_cv_cc_avx): Remove configure
	test.
	(libc_cv_cc_sse2avx): Likewise.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/Makefile [$(config-cflags-avx) = yes]: Make code
	unconditional.
	* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_profile)
	[HAVE_AVX_SUPPORT || HAVE_AVX512_ASM_SUPPORT]: Make code
	unconditional.
	(_dl_runtime_profile)
	[!(HAVE_AVX_SUPPORT || HAVE_AVX512_ASM_SUPPORT)]: Remove
	conditional code.
	* sysdeps/x86_64/fpu/multiarch/Makefile
	[$(config-cflags-sse2avx) = yes]: Make code unconditional.
	* sysdeps/x86_64/fpu/multiarch/e_atan2.c
	[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c
	[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log.c
	[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan.c
	[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_fma.c [HAVE_AVX_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_fmaf.c [HAVE_AVX_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sin.c
	[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c
	[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Likewise.
	* sysdeps/x86_64/multiarch/strcmp.S [HAVE_AVX_SUPPORT]: Likewise.
	* config.h.in (HAVE_AVX_SUPPORT): Remove #undef.
	(HAVE_SSE2AVX_SUPPORT): Likewise.
2015-10-08 15:59:32 +00:00
Joseph Myers
b75bc69cdf Don't use dbl-64/wordsize-64 lround based on llround for ILP32 (bug 19079).
The implementation of lround in dbl-64/wordsize-64 as an alias or
wrapper for llround is always incorrect when long is not 64-bit,
because it misses required exceptions in overflow cases, as shown by
my recently added tests.  This patch removes that alias / wrapper in
the non-LP64 case, together with the REGISTER_CAST_INT32_TO_INT64
macro, restoring the previous version of lround for dbl-64/wordsize-64
(newly conditioned on !_LP64).

Tested for x86_64, and for mips64 with use of dbl-64/wordsize-64
enabled.

	[BZ #19079]
	* sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Restore previous
	file, conditioned on [!_LP64].
	* sysdeps/ieee754/dbl-64/wordsize-64/s_llround.c
	[!_LP64] (__lround): Do not define as function or alias.
	[!_LP64] (lround): Likewise.
	[!_LP64] (__lroundl): Likewise.
	[!_LP64] (lroundl): Likewise.
	* sysdeps/tile/sysdep.h (REGISTER_CAST_INT32_TO_INT64): Remove
	macro.
	* sysdeps/x86_64/x32/sysdep.h (REGISTER_CAST_INT32_TO_INT64):
	Likewise.
2015-10-07 00:40:12 +00:00
Joseph Myers
3b7aa5bf59 Remove configure tests for SSE4 support.
GCC added support for -msse4 in version 4.3.  Thus the configure tests
for it are obsolete, and this patch removes them.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by this patch).

	* sysdeps/i386/configure.ac (libc_cv_cc_sse4): Remove configure
	test.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/i386/i686/multiarch/Makefile
	[$(config-cflags-sse4) = yes]: Make code unconditional.
	* sysdeps/i386/i686/multiarch/strcspn.S [HAVE_SSE4_SUPPORT]:
	Likewise.
	* sysdeps/i386/i686/multiarch/strspn.S [HAVE_SSE4_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/configure.ac (libc_cv_cc_sse4): Remove configure
	test.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/multiarch/Makefile [$(config-cflags-sse4) = yes]:
	Make code unconditional.
	* sysdeps/x86_64/multiarch/strcspn.S [HAVE_SSE4_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/multiarch/strspn.S [HAVE_SSE4_SUPPORT]: Likewise.
	* config.h.in (HAVE_SSE4_SUPPORT): Remove #undef.
2015-10-06 20:47:40 +00:00
Paul Pluzhnikov
57352c2201 sysdeps/x86_64/fpu/libm-test-ulps: Regenerated on Haswell. 2015-10-03 11:31:21 -07:00