Commit Graph

926 Commits

Author SHA1 Message Date
H.J. Lu
d8725b1fba Use SSE2 optimized strcmp in x86-64 ld.so
Since ld.so preserves vector registers now, we can use the same SSE2
optimized strcmp in x86-64 libc and ld.so.

	* sysdeps/x86_64/strcmp.S: Remove "#if !IS_IN (libc)".
2015-08-25 12:38:11 -07:00
H.J. Lu
2194737e77 Replace %xmm[8-12] with %xmm[0-4]
Since ld.so preserves vector registers now, we can use %xmm[0-4] to
avoid the REX prefix.

	* sysdeps/x86_64/strlen.S: Replace %xmm[8-12] with %xmm[0-4].
2015-08-25 08:51:23 -07:00
H.J. Lu
2339c6f4bd Remove x86-64 rtld-xxx.c and rtld-xxx.S
Since ld.so preserves vector registers now, we can use the regular,
non-ifunc string and memory functions in ld.so.

	* sysdeps/x86_64/rtld-memcmp.c: Removed.
	* sysdeps/x86_64/rtld-memset.S: Likewise.
	* sysdeps/x86_64/rtld-strchr.S: Likewise.
	* sysdeps/x86_64/rtld-strlen.S: Likewise.
	* sysdeps/x86_64/multiarch/rtld-memcmp.c: Likewise.
	* sysdeps/x86_64/multiarch/rtld-memset.S: Likewise.
2015-08-25 08:50:06 -07:00
H.J. Lu
5f92ec52e7 Replace %xmm8 with %xmm0
Since ld.so preserves vector registers now, we can use %xmm0 to avoid
the REX prefix.

	* sysdeps/x86_64/memset.S: Replace %xmm8 with %xmm0.
2015-08-25 08:48:34 -07:00
Ondřej Bílka
0d0325ed4b Fix strcpy_chk and stpcpy_chk performance.
Hi, as I wrote in previous patches a performance of checked strcpy and
stpcpy is terrible as these don't use sse2 and are around four times
slower that strcpy and stpcpy now.

As this bug shows that these functions are not performance sensitive I
decided just to improve generic implementation instead for easier
maintainance.

        * debug/strcpy_chk.c: Improve performance.
        * debug/stpcpy_chk.c: Likewise.
        * sysdeps/x86_64/strcpy_chk.S: Remove.
        * sysdeps/x86_64/stpcpy_chk.S: Remove.
2015-08-25 15:06:33 +02:00
H.J. Lu
f3dcae82d5 Save and restore vector registers in x86-64 ld.so
This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve
and _dl_runtime_profile, which save and restore the first 8 vector
registers used for parameter passing.  elf_machine_runtime_setup
selects the proper _dl_runtime_resolve or _dl_runtime_profile based
on _dl_x86_cpu_features.  It avoids race condition caused by
FOREIGN_CALL macros, which are only used for x86-64.

Performance impact of saving and restoring 8 vector registers are
negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when
ld.so is optimized with SSE2.

	[BZ #15128]
	* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
	ifuncmain8.
	(modules-names): Add ifuncmod8.
	($(objpfx)ifuncmain8): New rule.
	* sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and
	<cpuid.h>.
	(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
	_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
	_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
	_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
	* sysdeps/x86_64/dl-trampoline.S: Rewrite.
	* sysdeps/x86_64/dl-trampoline.h: Likewise.
	* sysdeps/x86_64/ifuncmain8.c: New file.
	* sysdeps/x86_64/ifuncmod8.c: Likewise.
	* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
	Removed.
	* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
	(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
	Change rtld_savespace_sse to __glibc_unused2.
	(RTLD_CHECK_FOREIGN_CALL): Removed.
	(RTLD_ENABLE_FOREIGN_CALL): Likewise.
	(RTLD_PREPARE_FOREIGN_CALL): Likewise.
	(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
2015-08-25 04:34:13 -07:00
H.J. Lu
daa4db69fc Remove the unused IFUNC files
sysdeps/i386/i686/multiarch/strcasestr-c.c became unused after

commit 1818483b15
Author: Andreas Schwab <schwab@suse.de>
Date:   Wed Dec 18 11:53:27 2013 +1000

    Remove use of SSE4.2 functions for strstr on i686

which contains

-sysdep_routines += strcspn-c strpbrk-c strspn-c strstr-c strcasestr-c
+sysdep_routines += strcspn-c strpbrk-c strspn-c

sysdeps/x86_64/multiarch/strcasestr.c became useless after

t 584b18eb4d
Author: Ondřej Bílka <neleai@seznam.cz>
Date:   Sat Dec 14 19:33:56 2013 +0100

    Add strstr with unaligned loads. Fixes bug 12100.

which changes sysdeps/x86_64/multiarch/strcasestr.c to

libc_ifunc (__strcasestr, __strcasestr_sse2);

This patch removes these file.

	* i386/i686/multiarch/strcasestr-c.c: Removed.
	* x86_64/multiarch/strcasestr.c: Likewise.
	* x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
	Remove strcasestr.
2015-08-20 12:47:20 -07:00
H.J. Lu
1ae6c72dc1 Move x86_64 init-arch.h to sysdeps/x86/init-arch.h
Move sysdeps/x86_64/multiarch/init-arch.h to sysdeps/x86/init-arch.h
which can be used for both i386 and x86_64.

	* sysdeps/i386/i686/multiarch/init-arch.h: Removed.
	* sysdeps/unix/sysv/linux/x86/init-arch.h: Likewise.
	* sysdeps/x86_64/cacheinfo.c: Include <init-arch.h> instead
	of "multiarch/init-arch.h".
	* sysdeps/x86_64/multiarch/init-arch.h: Renamed to ...
	* sysdeps/x86/init-arch.h: This.
2015-08-20 04:29:23 -07:00
Petar Jovanovic
fa19d5c48a Fix dynamic linker issue with bind-now
Fix the bind-now case when DT_REL and DT_JMPREL sections are separate
and there is a gap between them.

	[BZ #14341]
	* elf/dynamic-link.h (elf_machine_lazy_rel): Properly handle the
	case when there is a gap between DT_REL and DT_JMPREL sections.
	* sysdeps/x86_64/Makefile (tests): Add tst-split-dynreloc.
	(LDFLAGS-tst-split-dynreloc): New.
	(tst-split-dynreloc-ENV): Likewise.
	* sysdeps/x86_64/tst-split-dynreloc.c: New file.
	* sysdeps/x86_64/tst-split-dynreloc.lds: Likewise.
2015-08-19 05:37:01 -07:00
Paul Pluzhnikov
d5dff793af Fix BZ #18084 -- backtrace (..., 0) dumps core on x86.
Other architectures also had bugs, or did unnecessary work.
2015-08-15 11:42:43 -07:00
Paul Pluzhnikov
db7f8c8fe0 Regenerated sysdeps/x86_64/fpu/libm-test-ulps with AVX2. 2015-08-14 09:59:04 -07:00
Siddhesh Poyarekar
37dd6a19ca Remove incorrect register mov in floorf/nearbyint on x86_64
The change in 0b5395f052 replaced calls
to __get_cpu_features@plt followed by a mov from rax to rdx, with a
single macro LOAD_RTLD_GLOBAL_RO_RDX.  It is pretty clear that there
was a typo in s_floorf and __nearbyint due to which the (now incorrect)
mov was not removed.  This patch removes that mov.

	* sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf): Remove
	unnecessary movq.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S (__nearbyint):
	Likewise.
2015-08-14 05:30:17 -07:00
Joseph Myers
3ba0ac10fa Add more random libm-test inputs.
This patch adds more test inputs to various libm functions found
through random generation to have larger ulps errors than previously
listed in libm-test-ulp, on at least one of x86_64 and x86.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
	asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc,
	exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh
	and tgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-08-13 23:23:23 +00:00
H.J. Lu
1dfa4a94ae Update libmvec multiarch functions for <cpu-features.h>
This patch updates libmvec multiarch functions to use the newly defined
HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from
<cpu-features.h>.

	* math/Makefile ($(addprefix $(objpfx), $(libm-vec-tests))):
	Remove $(objpfx)init-arch.o.
	* sysdeps/x86_64/fpu/Makefile (libmvec-support): Remove
	init-arch.
	* sysdeps/x86_64/fpu/math-tests-arch.h (avx_usable): Removed.
	(INIT_ARCH_EXT): Defined as empty.
	(CHECK_ARCH_EXT): Replace HAS_XXX with HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Remove
	__init_cpu_features call.  Replace HAS_XXX with
	HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: Likewise.
2015-08-13 03:41:47 -07:00
H.J. Lu
0b5395f052 Update x86_64 multiarch functions for <cpu-features.h>
This patch updates x86_64 multiarch functions to use the newly defined
HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from
<cpu-features.h>.

	* sysdeps/x86_64/fpu/multiarch/e_asin.c: Replace HAS_XXX with
	HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_ceil.S: Use
	LOAD_RTLD_GLOBAL_RO_RDX and HAS_CPU_FEATURE (SSE4_1).
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floor.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S : Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.S : Likewise.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise.
	* sysdeps/x86_64/multiarch/sched_cpucount.c: Likewise.
	* sysdeps/x86_64/multiarch/strstr.c: Likewise.
	* sysdeps/x86_64/multiarch/memmove.c: Likewise.
	* sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
	* sysdeps/x86_64/multiarch/test-multiarch.c: Likewise.
	* sysdeps/x86_64/multiarch/memcmp.S: Remove __init_cpu_features
	call.  Add LOAD_RTLD_GLOBAL_RO_RDX.  Replace HAS_XXX with
	HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/multiarch/memcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/memset.S: Likewise.
	* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/strcat.S: Likewise.
	* sysdeps/x86_64/multiarch/strchr.S: Likewise.
	* sysdeps/x86_64/multiarch/strcmp.S: Likewise.
	* sysdeps/x86_64/multiarch/strcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/strcspn.S: Likewise.
	* sysdeps/x86_64/multiarch/strspn.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.S: Likewise.
	* sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.
2015-08-13 03:41:30 -07:00
H.J. Lu
e2e4f56056 Add _dl_x86_cpu_features to rtld_global
This patch adds _dl_x86_cpu_features to rtld_global in x86 ld.so
and initializes it early before __libc_start_main is called so that
cpu_features is always available when it is used and we can avoid
calling __init_cpu_features in IFUNC selectors.

	* sysdeps/i386/dl-machine.h: Include <cpu-features.c>.
	(dl_platform_init): Call init_cpu_features.
	* sysdeps/i386/dl-procinfo.c (_dl_x86_cpu_features): New.
	* sysdeps/i386/i686/cacheinfo.c
	(DISABLE_PREFERRED_MEMORY_INSTRUCTION): Removed.
	* sysdeps/i386/i686/multiarch/Makefile (aux): Remove init-arch.
	* sysdeps/i386/i686/multiarch/Versions: Removed.
	* sysdeps/i386/i686/multiarch/ifunc-defines.sym (KIND_OFFSET):
	Removed.
	* sysdeps/i386/ldsodefs.h: Include <cpu-features.h>.
	* sysdeps/unix/sysv/linux/x86/Makefile
	(libpthread-sysdep_routines): Remove init-arch.
	* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.c: Include
	<sysdeps/x86_64/dl-procinfo.c> instead of
	sysdeps/generic/dl-procinfo.c>.
	* sysdeps/x86/Makefile [$(subdir) == csu] (gen-as-const-headers):
	Add cpu-features-offsets.sym and rtld-global-offsets.sym.
	[$(subdir) == elf] (sysdep-dl-routines): Add dl-get-cpu-features.
	[$(subdir) == elf] (tests): Add tst-get-cpu-features.
	[$(subdir) == elf] (tests-static): Add
	tst-get-cpu-features-static.
	* sysdeps/x86/Versions: New file.
	* sysdeps/x86/cpu-features-offsets.sym: Likewise.
	* sysdeps/x86/cpu-features.c: Likewise.
	* sysdeps/x86/cpu-features.h: Likewise.
	* sysdeps/x86/dl-get-cpu-features.c: Likewise.
	* sysdeps/x86/libc-start.c: Likewise.
	* sysdeps/x86/rtld-global-offsets.sym: Likewise.
	* sysdeps/x86/tst-get-cpu-features-static.c: Likewise.
	* sysdeps/x86/tst-get-cpu-features.c: Likewise.
	* sysdeps/x86_64/dl-procinfo.c: Likewise.
	* sysdeps/x86_64/cacheinfo.c (__cpuid_count): Removed.
	Assume USE_MULTIARCH is defined and don't check it.
	(is_intel): Replace __cpu_features with GLRO(dl_x86_cpu_features).
	(is_amd): Likewise.
	(max_cpuid): Likewise.
	(intel_check_word): Likewise.
	(__cache_sysconf): Don't call __init_cpu_features.
	(__x86_preferred_memory_instruction): Removed.
	(init_cacheinfo): Don't call __init_cpu_features. Replace
	__cpu_features with GLRO(dl_x86_cpu_features).
	* sysdeps/x86_64/dl-machine.h: <cpu-features.c>.
	(dl_platform_init): Call init_cpu_features.
	* sysdeps/x86_64/ldsodefs.h: Include <cpu-features.h>.
	* sysdeps/x86_64/multiarch/Makefile (aux): Remove init-arch.
	* sysdeps/x86_64/multiarch/Versions: Removed.
	* sysdeps/x86_64/multiarch/cacheinfo.c: Likewise.
	* sysdeps/x86_64/multiarch/init-arch.c: Likewise.
	* sysdeps/x86_64/multiarch/ifunc-defines.sym (KIND_OFFSET):
	Removed.
	* sysdeps/x86_64/multiarch/init-arch.h: Rewrite.
2015-08-13 03:41:22 -07:00
Joseph Myers
4afe4b20ce Add more tests of various libm functions.
This patch adds more tests of various libm functions found through
random test generation to give increased ulps on 32-bit x86.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acosh, asin, asinh,
	atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10,
	expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-08-11 00:58:28 +00:00
H.J. Lu
72354ab5e1 Align stack to 16 bytes when calling __errno_location
We should align stack to 16 bytes when calling __errno_location.

	[BZ #18661]
	* sysdeps/x86_64/fpu/s_cosf.S (__cosf): Align stack to 16 bytes
	when calling __errno_location.
	* sysdeps/x86_64/fpu/s_sincosf.S (__sincosf): Likewise.
	* sysdeps/x86_64/fpu/s_sinf.S (__sinf): Likewise.
2015-08-05 08:36:27 -07:00
H.J. Lu
3b8d2eb7f8 Compile {memcpy,strcmp}-sse2-unaligned.S only for libc
{memcpy,strcmp}-sse2-unaligned.S aren't needed in ld.so.

	* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Compile
	only for libc.
	* sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: Likewise.
2015-08-05 08:28:37 -07:00
Andrew Senkevich
a9e8ea51cc Prevent runtime fail of SSE vector math tests on non SSE4.1 machine.
[BZ #18740]
    * sysdeps/x86_64/fpu/Makefile (double-vlen2-arch-ext-cflags,
    float-vlen4-arch-ext-cflags): Removed.
    * math/Makefile (CFLAGS-test-double-vlen2-wrappers.c,
    CFLAGS-test-float-vlen4-wrappers.c): Likewise.
2015-07-30 18:00:24 +03:00
H.J. Lu
9637d8a253 Extend local PLT reference check
On x86, linker in binutils 2.26 and newer consolidates R_*_JUMP_SLOT with
R_*_GLOB_DAT relocation against the same symbol.  This patch extends
local PLT reference check to support alternate relocations.

	[BZ #18078]
	* scripts/check-localplt.awk: Support alternate relocations.
	* scripts/localplt.awk: Also check relocations in DT_RELA/DT_REL
	sections.
	* sysdeps/unix/sysv/linux/i386/localplt.data: Mark free and
	malloc entries with + REL R_386_GLOB_DAT.
	* sysdeps/x86_64/localplt.data: New file.
2015-07-29 11:58:06 -07:00
Andrew Senkevich
febce2ac5f Added runtime check for AVX vector math tests.
[BZ #18731]
    * sysdeps/x86_64/fpu/math-tests-arch.h: Added AVX runtime check.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
2015-07-29 19:47:29 +03:00
Andrew Senkevich
9901716135 Fixed several libmvec bugs found during testing on KNL hardware.
AVX512 IFUNC implementations, implementations of wrappers to
AVX2 versions and KNL expf implementation fixed.

    * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Fixed AVX512 IFUNC.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Fixed wrappers to AVX2.
    * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Fixed KNL
    implementation.
2015-07-24 14:47:23 +03:00
Arjun Shankar
0035851c3c Modify several tests to use test-skeleton.c
These tests were skipped by the use-test-skeleton conversion done in
commit 29955b5d because they were reused in other tests via the #include
directive, and so deemed worth an inspection before they were modified.
This has now been done.

ChangeLog:

2015-07-09  Arjun Shankar  <arjun.is@lostca.se>

	* elf/tst-leaks1.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* localedata/tst-langinfo.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* math/test-fpucw.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* math/test-tgmath.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* math/test-tgmath2.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* setjmp/tst-setjmp.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* stdio-common/tst-sscanf.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* sysdeps/x86_64/tst-audit6.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
2015-07-15 15:10:23 +05:30
H.J. Lu
2eb9ef29b6 Improve bndmov encoding with zero displacement
If x86-64 assembler doesn't support MPX, we encode bndmov instruction by
hand.  When displacement is zero, assembler generates shorter encoding.
This patch improves bndmov encoding with zero displacement so that ld.so
is identical when using assemblers with and without MPX support.

	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve
	bndmov encoding with zero displacement.
2015-07-09 09:30:20 -07:00
Igor Zamyatin
14c5cbabc2 Preserve bound registers for pointer pass/return
We need to save/restore bound registers and add a BND prefix before
branches in _dl_runtime_profile so that bound registers for pointer
pass and return are preserved when LD_AUDIT is used.

	[BZ #18134]
	* sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
	(_dl_runtime_profile): Save and restore Intel MPX return bound
	registers when calling _dl_call_pltexit.  Add
	PRESERVE_BND_REGS_PREFIX before return.
	* sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
	(LRV_BND1_OFFSET): Likewise.
	* sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and
	lrv_bnd1.
	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
	typo in bndmov encoding.
	* sysdeps/x86_64/dl-trampoline.h: Properly save and restore
	Intel MPX bound registers.  Add PRESERVE_BND_REGS_PREFIX before
	branch instructions to preserve bounds.
2015-07-09 06:50:12 -07:00
H.J. Lu
9aec6d2a2f Add la_symbind32 to x86-64 audit tests
la_symbind32 is used for x32 in x86-64 audit tests.  We should define
both la_symbind32 and la_symbind64 in x86-64 audit tests.

	* sysdeps/x86_64/tst-auditmod10b.c (la_symbind32): New.
	* sysdeps/x86_64/tst-auditmod4b.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod5b.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod6b.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod6c.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod7b.c (la_symbind32): Likewise.
2015-07-07 06:09:29 -07:00
Torvald Riegel
4eb984d3ab Clean up BUSY_WAIT_NOP and atomic_delay.
This patch combines BUSY_WAIT_NOP and atomic_delay into a new
atomic_spin_nop function and adjusts all clients.  The new function is
put into atomic.h because what is best done in a spin loop is
architecture-specific, and atomics must be used for spinning.  The
function name is meant to tell users that this has no effect on
synchronization semantics but is a performance aid for spinning.
2015-06-30 15:57:15 +02:00
Joseph Myers
e02920bc02 Improve tgamma accuracy (bug 18613).
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.

Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations.  However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).

Some additional complications also arose.  Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow.  (libm-test.inc also doesn't handle being lax about whether
the result in cases very close to the overflow threshold is infinity
or a finite value close to overflow, but that doesn't cause problems
in this case though I've seen it cause problems with random test
generation for some functions.)  Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc).  And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.

I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.

This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18613]
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
	X_ADJ not X when adjusting exponent.
	(__ieee754_gamma_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammaf_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* math/libm-test.inc (tgamma_test_data): Remove one test.  Moved
	to auto-libm-test-in.
	(tgamma_test): Use ALL_RM_TEST.
	* math/auto-libm-test-in: Add one test of tgamma.  Mark some other
	tests of tgamma with spurious-overflow.
	* math/auto-libm-test-out: Regenerated.
	* math/gen-libm-have-vector-test.sh: Do not check for START.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-29 23:29:35 +00:00
Joseph Myers
a8e2112ae3 Use round-to-nearest internally in jn, test with ALL_RM_TEST (bug 18602).
Some existing jn tests, if run in non-default rounding modes, produce
errors above those accepted in glibc, which causes problems for moving
tests of jn to use ALL_RM_TEST.  This patch makes jn set rounding
to-nearest internally, as was done for yn some time ago, then computes
the appropriate underflowing value for results that underflowed to
zero in to-nearest, and moves the tests to ALL_RM_TEST.  It does
nothing about the general inaccuracy of Bessel function
implementations in glibc, though it should make jn more accurate on
average in non-default rounding modes through reduced error
accumulation.  The recomputation of results that underflowed to zero
should as a side-effect fix some cases of bug 16559, where jn just
used an exact zero, but that is *not* the goal of this patch and other
cases of that bug remain unfixed.

(Most of the changes in the patch are reindentation to add new scopes
for SET_RESTORE_ROUND*.)

Tested for x86_64, x86, powerpc and mips64.

	[BZ #16559]
	[BZ #18602]
	* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set
	round-to-nearest internally then recompute results that
	underflowed to zero in the original rounding mode.
	* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
	* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise
	* math/libm-test.inc (jn_test): Use ALL_RM_TEST.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-25 21:46:02 +00:00
Joseph Myers
f9536db790 Refactor libm tests.
This patch refactors the libm tests using libm-test.inc to reduce the
level of duplicate definitions.  New headers are created for the
definitions shared by tests for a particular type; by tests of inline
functions; by tests of non-inline functions; by scalar tests; and by
vector tests.  The unused MATHCONST macro is removed.  A new macro
VEC_LEN is added to the vector headers to allow the macros defining
wrappers for vector functions to be defined once, instead of six times
each (differing only in vector length) as before.  There is still
scope for further refactoring, but this seems a useful start.

Tested for x86_64.

	* math/test-double.h: New file.
	* math/test-float.h: Likewise.
	* math/test-ldouble.h: Likewise.
	* math/test-math-inline.h: Likewise.
	* math/test-math-no-inline.h: Likewise.
	* math/test-math-scalar.h: Likewise.
	* math/test-math-vector.h: Likewise.
	* math/test-vec-loop.h: Remove file.  Contents moved into
	test-math-vector.h.
	* math/libm-test.inc (MATHCONST): Do not document macro.
	* math/test-double.c: Include test-double.h, test-math-no-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-float.c: Include test-float.h, test-math-no-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-idouble.c: Include test-double.h, test-math-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(TEST_INLINE): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-ifloat.c: Include test-float.h, test-math-inline.h and
	test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(TEST_INLINE): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-ildoubl.c: Include test-ldouble.h, test-math-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_LDOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(TEST_INLINE): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-ldouble.c: Include test-ldouble.h,
	test-math-no-inline.h and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_LDOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-double-vlen2.h: Include test-double.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-double-vlen4.h: Include test-double.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-double-vlen8.h: Include test-double.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-float-vlen4.h: Include test-float.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-float-vlen8.h: Include test-float.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-float-vlen16.h: Include test-float.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Do not include
	test-vec-loop.h.
	* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.
2015-06-24 23:27:18 +00:00
Joseph Myers
a67894c505 Fix cexp, ccos, ccosh, csin, csinh spurious underflows (bug 18594).
cexp, ccos, ccosh, csin and csinh have spurious underflows in cases
where they compute sin of the smallest normal, that produces an
underflow exception (depending on which sin implementation is in use)
but the final result does not underflow.  ctan and ctanh may also have
such underflows, or they may be latent (the issue there is that
e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value
above DBL_MIN, which under glibc's accuracy goals may not have an
underflow exception, but the intermediate computation of sin (DBL_MIN)
would legitimately underflow on before-rounding architectures).

This patch fixes all those functions so they use plain comparisons (>
DBL_MIN etc.) instead of comparing the result of fpclassify with
FP_SUBNORMAL (in all these cases, we already know the number being
compared is finite).  Note that in the case of csin / csinf / csinl,
there is no need for fabs calls in the comparison because the real
part has already been reduced to its absolute value.

As the patch fixes the failures that previously obstructed moving
tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST
by the patch (two functions remain yet to be converted).

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #18594]
	* math/s_ccosh.c (__ccosh): Compare with least normal value
	instead of comparing class with FP_SUBNORMAL.
	* math/s_ccoshf.c (__ccoshf): Likewise.
	* math/s_ccoshl.c (__ccoshl): Likewise.
	* math/s_cexp.c (__cexp): Likewise.
	* math/s_cexpf.c (__cexpf): Likewise.
	* math/s_cexpl.c (__cexpl): Likewise.
	* math/s_csin.c (__csin): Likewise.
	* math/s_csinf.c (__csinf): Likewise.
	* math/s_csinh.c (__csinh): Likewise.
	* math/s_csinhf.c (__csinhf): Likewise.
	* math/s_csinhl.c (__csinhl): Likewise.
	* math/s_csinl.c (__csinl): Likewise.
	* math/s_ctan.c (__ctan): Likewise.
	* math/s_ctanf.c (__ctanf): Likewise.
	* math/s_ctanh.c (__ctanh): Likewise.
	* math/s_ctanhf.c (__ctanhf): Likewise.
	* math/s_ctanhl.c (__ctanhl): Likewise.
	* math/s_ctanl.c (__ctanl): Likewise.
	* math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp,
	csin, csinh, ctan and ctanh.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (cexp_test): Use ALL_RM_TEST.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-24 21:04:51 +00:00
Joseph Myers
ac831b362a Fix csin, csinh overflow in directed rounding modes (bug 18593).
csin and csinh can produce bad results when overflowing in directed
rounding modes, because a multiplication that can overflow is followed
by a possible negation.  This patch fixes this by negating one of the
arguments of the multiplication before the multiplication instead of
negating the result.

The new tests for this issue are added to auto-libm-test-in, starting
use of that file for csin and csinh.  The issue was found in the
course of moving existing tests for csin and csinh (existing tests, by
being enabled in more cases than previously, showed the issue for
float and double but not for long double); that move will now be done
separately.

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #18593]
	* math/s_csin.c (__csin): Negate before rather than after possibly
	overflowing multiplication.
	* math/s_csinf.c (__csinf): Likewise.
	* math/s_csinh.c (__csinh): Likewise.
	* math/s_csinhf.c (__csinhf): Likewise.
	* math/s_csinhl.c (__csinhl): Likewise.
	* math/s_csinl.c (__csinl): Likewise.
	* math/auto-libm-test-in: Add some tests of csin and csinh.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c.
	(csinh_test_data): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2015-06-24 16:20:48 +00:00
Andrew Senkevich
36870482d2 Combination of data tables for x86_64 vector functions sinf, cosf and sincosf.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: Renamed variable
    and included header.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/svml_s_trig_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_trig_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_s_cosf_data.S: Removed file.
    * sysdeps/x86_64/fpu/svml_s_cosf_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sinf_data.S: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sinf_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: Likewise.
2015-06-24 17:44:35 +03:00
Torvald Riegel
c47ca9647f Fix atomic_full_barrier on x86 and x86_64.
This fixes BZ #17403 by defining atomic_full_barrier,
atomic_read_barrier, and atomic_write_barrier on x86 and x86_64.  A full
barrier is implemented through an atomic idempotent modification to the
stack and not through using mfence because the latter can supposedly be
somewhat slower due to having to provide stronger guarantees wrt.
self-modifying code, for example.
2015-06-23 19:20:52 +02:00
Andrew Senkevich
5872b8352a Combination of data tables for x86_64 vector functions sin, cos and sincos.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list.
    * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: Renamed variable
    and included header.
    * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_trig_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_trig_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_d_cos2_core.S: Removed unneeded include.
    * sysdeps/x86_64/fpu/svml_d_cos4_core.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_cos8_core.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_cos_data.S: Removed file.
    * sysdeps/x86_64/fpu/svml_d_cos_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_d_sin_data.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_sin_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_d_sincos_data.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_sincos_data.h: Likewise.
2015-06-23 19:21:50 +03:00
Joseph Myers
cb0937b299 Fix x86_64 / x86 expm1l (-min_subnorm) result sign (bug 18569).
In the x86 / x86_64 implementations of expm1l, when expm1l's result
should underflow to 0 (argument minus the least subnormal, in some
rounding modes), it can be a zero of the wrong sign.  This patch fixes
this by returning the argument with underflow forced in that case
(this is a 1ulp error relative to the correctly rounded result of -0,
which is OK in terms of the documented accuracy goals, whereas a
result with the wrong sign never is).

Tested for x86_64 and x86.

	[BZ #18569]
	* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force
	underflow and return argument in case of subnormal argument.
	* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
	Likewise.
	* math/auto-libm-test-in: Add more tests of expm1.
	* math/auto-libm-test-out: Regenerated.
2015-06-21 18:43:10 +00:00
Joseph Myers
7540cfc5a8 Fix x86 / x86_64 expl, exp10l missing underflows (bug 16361).
Similar to various other bugs in this area, the x86 and x86_64
implementations of expl / exp10l can fail to produce underflow
exceptions when the unscaled result has trailing 0 bits so the scaling
down to subnormal precision is exact.  This patch fixes this by
forcing the exception in the case of tiny results.

Tested for x86_64 and x86.

	[BZ #16361]
	* sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
	[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
	tiny results.
	* sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
	[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
	tiny results.
	* math/auto-libm-test-in: Add more tests of exp and exp10.  Do not
	mark underflow exceptions as possibly missing for bug 16361.
	* math/auto-libm-test-out: Regenerated.
2015-06-21 17:48:04 +00:00
Andrew Senkevich
a6336cc446 Vector sincosf for x86_64 and tests.
Here is implementation of vectorized sincosf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * NEWS: Mention addition of x86_64 vector sincosf.
    * math/test-float-vlen16.h: Added wrapper for sincosf tests.
    * math/test-float-vlen4.h: Likewise.
    * math/test-float-vlen8.h: Likewise.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines):
    Added build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S
    * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S
    * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S
    * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S
    * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S
    * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file.
    * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers.
    * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests.
    * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
2015-06-18 20:11:27 +03:00
Andrew Senkevich
c9a8c526ac Vector sincos for x86_64 and tests.
Here is implementation of vectorized sincos containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * NEWS: Mention addition of x86_64 vector sincos.
    * bits/libm-simd-decl-stubs.h: Added stubs for sincos.
    * math/math.h (__MATHDECL_VEC): New macro.
    * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC.
    * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper
    declaration under condition.
    * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored.
    * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected
    TEST_VEC_LOOP change.
    * math/test-double-vlen4.h: Likewise.
    * math/test-double-vlen8.h: Likewise.
    * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change.
    * math/test-float-vlen4.h: Likewise.
    * math/test-float-vlen8.h: Likewise.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines):
    Added build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file.
    * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
2015-06-18 17:55:55 +03:00
Andrew Senkevich
8aa92022e2 Vector powf for x86_64 and tests.
Here is implementation of vectorized powf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm
    redirections for powf.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines):
    Added build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file.
    * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests.
    * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
    * math/test-float-vlen16.h: Fixed 2 argument macro.
    * math/test-float-vlen4.h: Likewise.
    * math/test-float-vlen8.h: Likewise.
    * NEWS: Mention addition of x86_64 vector powf.
2015-06-18 17:04:07 +03:00
Andrew Senkevich
c10b9b13f7 Vector pow for x86_64 and tests.
Here is implementation of vectorized pow containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

  * bits/libm-simd-decl-stubs.h: Added stubs for pow.
    * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm
    redirections for pow.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector pow.
2015-06-17 16:22:26 +03:00
Andrew Senkevich
1663be053d Vector expf for x86_64 and tests.
Here is implementation of vectorized expf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm
    redirections for expf.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/svml_s_expf16_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_expf4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_expf8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_expf8_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_s_expf_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_expf_data.h: New file.
    * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector expf tests.
    * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector expf.
2015-06-17 16:10:51 +03:00
Andrew Senkevich
9c02f663f6 Vector exp for x86_64 and tests.
Here is implementation of vectorized exp containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * bits/libm-simd-decl-stubs.h: Added stubs for exp.
    * math/bits/mathcalls.h: Added exp declaration with __MATHCALL_VEC.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm
    redirections for exp.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_exp2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_exp4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_exp4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_exp8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_exp_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_exp_data.h: New file.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector exp test.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector exp.
2015-06-17 15:58:05 +03:00
Andrew Senkevich
774488f88a Vector logf for x86_64 and tests.
Here is implementation of vectorized logf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm
    redirections for logf.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/svml_s_logf16_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_logf4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_logf8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_logf8_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_s_logf_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_logf_data.h: New file.
    * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector logf tests.
    * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector logf.
2015-06-17 15:53:00 +03:00
Andrew Senkevich
6af25acc7b Vector log for x86_64 and tests.
Here is implementation of vectorized log containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * bits/libm-simd-decl-stubs.h: Added stubs for log.
    * math/bits/mathcalls.h: Added log declaration with __MATHCALL_VEC.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm
    redirections for log.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_log2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_log4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_log4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_log8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_log_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_log_data.h: New file.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector log test.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector log.
2015-06-17 15:38:29 +03:00
Andrew Senkevich
2a8c2c7b33 Vector sinf for x86_64 and tests.
Here is implementation of vectorized sinf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for sinf.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sinf16_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sinf4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sinf8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sinf8_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sinf_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_sinf_data.h: New file.
    * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector sinf tests.
    * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector sinf.
2015-06-15 15:06:53 +03:00
Andrew Senkevich
4b9c2b707b Vector sin for x86_64 and tests.
Here is implementation of vectorized sin containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * bits/libm-simd-decl-stubs.h: Added stubs for sin.
    * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
    * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
    build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector sin.
2015-06-11 17:12:38 +03:00
Andrew Senkevich
0724d898bb More strict check of AVX512 support in assembler.
Binutils 2.24 doesn't support some AVX512 instructions with ZMM
registers, so we need add more strict check.

    * configure.ac: Added more strict check.
    * configure: Regenerated.
2015-06-11 13:50:07 +03:00
Joseph Myers
2f44ee08db Fix regcomp wcscoll, wcscmp namespace (bug 18497).
regcomp brings in references to wcscoll, which isn't in all the
standards that contain regcomp.  In turn, wcscoll brings in references
to wcscmp, also not in all those standards.  This patch fixes this by
making those functions into weak aliases of __wcscoll and __wcscmp and
calling those names instead as needed.

Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).

	[BZ #18497]
	* wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead
	of wcscmp.
	(wcscmp): Define as weak alias of WCSCMP.
	* wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of
	wcscoll.
	(USE_HIDDEN_DEF): Define.
	[!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of
	__wcscoll.  Don't use libc_hidden_weak.
	* wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of
	wcscmp.
	* sysdeps/i386/i686/multiarch/wcscmp-c.c
	[SHARED] (libc_hidden_def): Define __GI___wcscmp instead of
	__GI_wcscmp.
	(weak_alias): Undefine and redefine.
	* sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to
	__wcscmp and define as weak alias of __wcscmp.
	* sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
	* include/wchar.h (__wcscmp): Declare.  Use libc_hidden_proto.
	(__wcscoll): Likewise.
	(wcscmp): Don't use libc_hidden_proto.
	(wcscoll): Likewise.
	* posix/regcomp.c (build_range_exp): Call __wcscoll instead of
	wcscoll.
	* posix/regexec.c (check_node_accept_bytes): Likewise.
	* conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove
	variable.
	(test-xfail-XPG4/regex.h/linknamespace): Likewise.
	(test-xfail-POSIX/regex.h/linknamespace): Likewise.
2015-06-09 21:07:30 +00:00