Commit Graph

942 Commits

Author SHA1 Message Date
Joseph Myers
3b7aa5bf59 Remove configure tests for SSE4 support.
GCC added support for -msse4 in version 4.3.  Thus the configure tests
for it are obsolete, and this patch removes them.

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by this patch).

	* sysdeps/i386/configure.ac (libc_cv_cc_sse4): Remove configure
	test.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/i386/i686/multiarch/Makefile
	[$(config-cflags-sse4) = yes]: Make code unconditional.
	* sysdeps/i386/i686/multiarch/strcspn.S [HAVE_SSE4_SUPPORT]:
	Likewise.
	* sysdeps/i386/i686/multiarch/strspn.S [HAVE_SSE4_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/configure.ac (libc_cv_cc_sse4): Remove configure
	test.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/multiarch/Makefile [$(config-cflags-sse4) = yes]:
	Make code unconditional.
	* sysdeps/x86_64/multiarch/strcspn.S [HAVE_SSE4_SUPPORT]:
	Likewise.
	* sysdeps/x86_64/multiarch/strspn.S [HAVE_SSE4_SUPPORT]: Likewise.
	* config.h.in (HAVE_SSE4_SUPPORT): Remove #undef.
2015-10-06 20:47:40 +00:00
Paul Pluzhnikov
57352c2201 sysdeps/x86_64/fpu/libm-test-ulps: Regenerated on Haswell. 2015-10-03 11:31:21 -07:00
Joseph Myers
93e448cbed Improve test coverage of real libm functions [a-e]*.
This patch improves test coverage of the real libm functions [a-e]*,
ensuring that special cases and ranges of input values of potential
significance (such as close to overflow and underflow thresholds) are
more systematically covered.

This is a followup to
<https://sourceware.org/ml/libc-alpha/2013-12/msg00757.html> which
covered [a-c]* (however, I found more weaknesses in the coverage of
those functions when preparing this patch, hence the additional tests
being added for them here).

Addition of a test for acosh (-qNaN) is temporarily deferred, to be
included as part of a fix for bug 19032 which was discovered in the
course of adding these tests (and which illustrates the use of testing
-qNaN as well as +qNaN as input even to functions for which the sign
of a NaN isn't meant to be significant).

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
	atan, atan2, atanh, cbrt, cos, cosh, erf, erfc, exp, exp10, exp2
	and expm1.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (acos_test_data): Add more tests.
	(asin_test_data): Likewise.
	(asinh_test_data): Likewise.
	(atan_test_data): Likewise.
	(atanh_test_data): Likewise.
	(atan2_test_data): Likewise.
	(cbrt_test_data): Likewise.
	(ceil_test_data): Likewise.
	(copysign_test_data): Likewise.
	(cos_test_data): Likewise.
	(cosh_test_data): Likewise.
	(erf_test_data): Likewise.
	(erfc_test_data): Likewise.
	(exp_test_data): Likewise.
	(exp10_test_data): Likewise.
	(exp2_test_data): Likewise.
	(expm1_test_data): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2015-09-30 18:06:02 +00:00
Joseph Myers
a5721ebc68 Fix clog, clog10 inaccuracy (bug 19016).
For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large
errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids
cancellation error and then using log1p.

However, the thresholds for using that approach still result in log
being used on argument as large as sqrt(13/16) > 0.9, leading to
significant errors, in some cases above the 9ulp maximum allowed in
glibc libm.  This patch arranges for the approach using log1p to be
used in any cases where |X|, |Y| < 1 and X^2 + Y^2 >= 0.5 (with the
existing allowance for cases where one of X and Y is very small),
adjusting the __x2y2m1 functions to work with the wider range of
inputs.  This way, log only gets used on arguments below sqrt(1/2) (or
substantially above 1), where the error involved is much less.

Tested for x86_64, x86, mips64 and powerpc.  For the ulps regeneration
I removed the existing clog and clog10 ulps before regenerating to
allow any reduced ulps to appear.  Tests added include those found by
random test generation to produce large ulps either before or after
the patch, and some found by trying inputs close to the (0.75, 0.5)
threshold where the potential errors from using log are largest.

	[BZ #19016]
	* sysdeps/generic/math_private.h (__x2y2m1f): Update comment to
	allow more cases with X^2 + Y^2 >= 0.5.
	* sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise.  Add -1 as
	normal element in sum instead of special-casing based on values of
	arguments.
	* sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment.
	* sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise.  Add
	-1 as normal element in sum instead of special-casing based on
	values of arguments.
	* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise.
	* sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0]
	(__x2y2m1): Update comment.
	* sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise.  Add -1
	as normal element in sum instead of special-casing based on values
	of arguments.
	* math/s_clog.c (__clog): Handle more cases using log1p without
	hypot.
	* math/s_clog10.c (__clog10): Likewise.
	* math/s_clog10f.c (__clog10f): Likewise.
	* math/s_clog10l.c (__clog10l): Likewise.
	* math/s_clogf.c (__clogf): Likewise.
	* math/s_clogl.c (__clogl): Likewise.
	* math/auto-libm-test-in: Add more tests of clog and clog10.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-28 22:11:22 +00:00
Joseph Myers
fa752c6981 Fix powf inaccuracy (bug 18956).
The flt-32 version of powf can be inaccurate because of bugs in the
extra-precision calculation of (x-1)/(x+1) or (x-1.5)/(x+1.5) as part
of calculating log(x) with extra precision: a constant used (as part
of adding 1 or 1.5 through integer arithmetic) is incorrect, and then
the code fails to mask a computed high part before using it in
arithmetic that relies on s_h*t_h being exactly representable.  This
patch fixes these bugs.

Tested for x86_64 and x86.  x86_64 ulps for powf removed and
regenerated to reflect reduced ulps from the increased accuracy for
existing tests.

	[BZ #18956]
	* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Add 0x00400000
	not 0x0040000 for high bit of mantissa.  Mask with 0xfffff000 when
	extracting high part.
	* math/auto-libm-test-in: Add another test of pow.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2015-09-26 00:27:06 +00:00
Joseph Myers
6ace393821 Fix pow missing underflows (bug 18825).
Similar to various other bugs in this area, pow functions can fail to
raise the underflow exception when the result is tiny and inexact but
one or more low bits of the intermediate result that is scaled down
(or, in the i386 case, converted from a wider evaluation format) are
zero.  This patch forces the exception in a similar way to previous
fixes, thereby concluding the fixes for known bugs with missing
underflow exceptions currently filed in Bugzilla.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18825]
	* sysdeps/i386/fpu/i386-math-asm.h (FLT_NARROW_EVAL_UFLOW_NONNAN):
	New macro.
	(DBL_NARROW_EVAL_UFLOW_NONNAN): Likewise.
	(LDBL_CHECK_FORCE_UFLOW_NONNAN): Likewise.
	* sysdeps/i386/fpu/e_pow.S: Use DEFINE_DBL_MIN.
	(__ieee754_pow): Use DBL_NARROW_EVAL_UFLOW_NONNAN instead of
	DBL_NARROW_EVAL, reloading the PIC register as needed.
	* sysdeps/i386/fpu/e_powf.S: Use DEFINE_FLT_MIN.
	(__ieee754_powf): Use FLT_NARROW_EVAL_UFLOW_NONNAN instead of
	FLT_NARROW_EVAL.  Use separate return path for case when first
	argument is NaN.
	* sysdeps/i386/fpu/e_powl.S: Include <i386-math-asm.h>.  Use
	DEFINE_LDBL_MIN.
	(__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN, reloading the
	PIC register.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Force
	underflow for subnormal result.
	* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Use
	math_check_force_underflow_nonneg.
	* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Use
	math_check_force_underflow.
	* sysdeps/x86_64/fpu/x86_64-math-asm.h
	(LDBL_CHECK_FORCE_UFLOW_NONNAN): New macro.
	* sysdeps/x86_64/fpu/e_powl.S: Include <x86_64-math-asm.h>.  Use
	DEFINE_LDBL_MIN.
	(__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN.
	* math/auto-libm-test-in: Add more tests of pow.
	* math/auto-libm-test-out: Regenerated.
2015-09-25 22:29:10 +00:00
Joseph Myers
b2a64460ba Refactor x86_64 libm code forcing underflow exceptions.
This patch refactors code in sysdeps/x86_64/fpu that forces underflow
exceptions and closely follows corresponding i386 code to use common
macros in x86_64-math-asm.h for that purpose.  This is mainly about
keeping the code similar to the i386 code as far as possible, since
each macro apart from DEFINE_LDBL_MIN ends up used only once.  It
would be possible to do a further refactoring to share these macros
between i386 and x86_64 (with i386 using the fcomip / fucomip versions
when building for i686 and above), but I have no immediate plans to do
so.

Tested for x86_64.

	* sysdeps/x86_64/fpu/x86_64-math-asm.h: New file.
	* sysdeps/x86_64/fpu/e_exp2l.S: Include <x86_64-math-asm.h>.
	(ldbl_min): Replace with use of DEFINE_LDBL_MIN.
	(__ieee754_exp2l): Use LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN.
	* sysdeps/x86_64/fpu/e_expl.S: Include <x86_64-math-asm.h>.
	[!USE_AS_EXPM1L] (cmin): Replace with use of DEFINE_LDBL_MIN.
	(IEEE754_EXPL): Use LDBL_CHECK_FORCE_UFLOW_NONNEG.
2015-09-24 22:25:30 +00:00
Joseph Myers
51df260506 Fix x86_64 fma4 pow inappropriate contraction (bug 19003).
The x86_64 fma4 version of pow fails to disable contraction of
operations other than those explicitly intended to use fma
instructions, so resulting in large ulps errors on processors with
fma4 instructions, as in bug 18104 (165ulp for the test added for that
bug; error originally reported by "blaaa" on #glibc).  This patch adds
$(config-cflags-nofma) for e_pow-fma4.c, corresponding to the use for
e_pow.c in sysdeps/ieee754/dbl-64/Makefile.

Tested for x86_64 on a processor with fma4.

	[BZ #19003]
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma4.c): Add
	$(config-cflags-nofma).
2015-09-24 16:48:32 +00:00
Joseph Myers
da2f4f2dd5 Make scalbn set errno (bug 6803).
As noted in bug 6803, scalbn fails to set errno on overflow and
underflow.  This patch fixes this by making scalbn an alias of ldexp,
which has exactly the same semantics (for floating-point types with
radix 2) and already has wrappers that deal with setting errno,
instead of an alias of the internal __scalbn (which ldexp calls).

Notes:

* Where compat symbols were defined for scalbn functions, I didn't
  change what they point to (to keep the patch minimal), so such
  compat symbols continue to go directly to the non-errno-setting
  functions.

* Mike, I didn't do anything with the IA64 versions of these
  functions, where I think both the ldexp and scalbn functions already
  deal with setting errno.  As a cleanup (not needed to fix this bug)
  however you might want to make those functions into aliases for
  IA64; there is no need for them to be separate function
  implementations at all.

* This concludes the fix for bug 6803 since the scalb and scalbln
  cases of that bug were fixed some time ago.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #6803]
	* math/s_ldexp.c (scalbn): Define as weak alias of __ldexp.
	[NO_LONG_DOUBLE] (scalbnl): Define as weak alias of __ldexp.
	* math/s_ldexpf.c (scalbnf): Define as weak alias of __ldexpf.
	* math/s_ldexpl.c (scalbnl): Define as weak alias of __ldexpl.
	* sysdeps/i386/fpu/s_scalbn.S (scalbn): Remove alias.
	* sysdeps/i386/fpu/s_scalbnf.S (scalbnf): Likewise.
	* sysdeps/i386/fpu/s_scalbnl.S (scalbnl): Likewise.
	* sysdeps/ieee754/dbl-64/s_scalbn.c (scalbn): Likewise.
	[NO_LONG_DOUBLE] (scalbnl): Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (scalbn):
	Likewise.
	[NO_LONG_DOUBLE] (scalbnl): Likewise.
	* sysdeps/ieee754/flt-32/s_scalbnf.c (scalbnf): Likewise.
	* sysdeps/ieee754/ldbl-128/s_scalbnl.c (scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (scalbnl): Remove
	long_double_symbol calls.
	* sysdeps/ieee754/ldbl-64-128/s_scalbnl.c (scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-opt/s_ldexpl.c (__ldexpl_2): Define as
	strong alias of __ldexpl.
	(scalbnl): Define using long_double_symbol.
	* sysdeps/m68k/m680x0/fpu/s_scalbn.c (__CONCATX(scalbn,suffix)):
	Remove alias.
	* sysdeps/sparc/sparc64/soft-fp/s_scalbnl.c (scalbnl): Likewise.
	* sysdeps/x86_64/fpu/s_scalbnl.S (scalbnl): Likewise.
	* math/libm-test.inc (scalbn_test_data): Add errno expectations.
	(scalbln_test_data): Add more errno expectations.
2015-09-16 21:11:00 +00:00
Joseph Myers
903af5af9a Fix exp2 missing underflows (bug 16521).
Various exp2 implementations in glibc can miss underflow exceptions
when the scaling down part of the calculation is exact (or, in the x86
case, when the conversion from extended precision to the target
precision is exact).  This patch forces the exception in a similar way
to previous fixes.

The x86 exp2f changes may in fact not be needed for this purpose -
it's likely to be the case that no argument of type float has an exp2
result so close to an exact subnormal float value that it equals that
value when rounded to 64 bits (even taking account of variation
between different x86 implementations).  However, they are included
for consistency with the changes to exp2 and so as to fix the exp2f
part of bug 18875 by ensuring that excess range and precision is
removed from underflowing return values.

Tested for x86_64, x86 and mips64.

	[BZ #16521]
	[BZ #18875]
	* math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for
	small results.
	* sysdeps/i386/fpu/e_exp2.S (dbl_min): New object.
	(MO): New macro.
	(__ieee754_exp2): For small results, force underflow exception and
	remove excess range and precision from return value.
	* sysdeps/i386/fpu/e_exp2f.S (flt_min): New object.
	(MO): New macro.
	(__ieee754_exp2f): For small results, force underflow exception
	and remove excess range and precision from return value.
	* sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object.
	(MO): New macro.
	(__ieee754_exp2l): Force underflow exception for small results.
	* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
	* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
	* sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object.
	(MO): New macro.
	(__ieee754_exp2l): Force underflow exception for small results.
	* math/auto-libm-test-in: Add more tests or exp2.
	* math/auto-libm-test-out: Regenerated.
2015-09-14 22:00:12 +00:00
Joseph Myers
a1f99ba28b Add more random libm test inputs (mainly for ldbl-128).
This patch adds more libm test inputs found through random test
generation to increase previously known ulps.  This particular test
generation was run for mips64, so most of the increased ulps are for
ldbl-128 (float and double having been fairly well covered by such
testing for x86_64), but there's the odd ulps increase for other
formats.

Tested for x86_64, x86 and mips64.

	* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
	asinh, atan, atan2, atanh, cabs, carg, cos, csqrt, erfc, exp,
	exp10, exp2, log, log1p, log2, pow, sin, sincos, sinh, tan and
	tanh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/mips/mips32/libm-test-ulps: Likewise.
	* sysdeps/mips/mips64/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-12 00:01:38 +00:00
Joseph Myers
de071d199a Move bits/atomic.h to atomic-machine.h (bug 14912).
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/atomic.h to atomic-machine.h to follow that
convention.

This is the only change in this series that needs to change the
filename rather than simply removing a directory level (because both
atomic.h and bits/atomic.h exist at present).

Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).

	[BZ #14912]
	* sysdeps/aarch64/bits/atomic.h: Move to ...
	* sysdeps/aarch64/atomic-machine.h: ...here.
	(_AARCH64_BITS_ATOMIC_H): Rename macro to
	_AARCH64_ATOMIC_MACHINE_H.
	* sysdeps/alpha/bits/atomic.h: Move to ...
	* sysdeps/alpha/atomic-machine.h: ...here.
	* sysdeps/arm/bits/atomic.h: Move to ...
	* sysdeps/arm/atomic-machine.h: ...here.  Update comments.
	* bits/atomic.h: Move to ...
	* sysdeps/generic/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/i386/bits/atomic.h: Move to ...
	* sysdeps/i386/atomic-machine.h: ...here.
	* sysdeps/ia64/bits/atomic.h: Move to ...
	* sysdeps/ia64/atomic-machine.h: ...here.
	* sysdeps/m68k/coldfire/bits/atomic.h: Move to ...
	* sysdeps/m68k/coldfire/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ...
	* sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here.
	* sysdeps/microblaze/bits/atomic.h: Move to ...
	* sysdeps/microblaze/atomic-machine.h: ...here.
	* sysdeps/mips/bits/atomic.h: Move to ...
	* sysdeps/mips/atomic-machine.h: ...here.
	(_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H.
	* sysdeps/powerpc/bits/atomic.h: Move to ...
	* sysdeps/powerpc/atomic-machine.h: ...here.  Update comments.
	* sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ...
	* sysdeps/powerpc/powerpc32/atomic-machine.h: ...here.  Update
	comments.  Include <atomic-machine.h> instead of <bits/atomic.h>.
	* sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ...
	* sysdeps/powerpc/powerpc64/atomic-machine.h: ...here.  Include
	<atomic-machine.h> instead of <bits/atomic.h>.
	* sysdeps/s390/bits/atomic.h: Move to ...
	* sysdeps/s390/atomic-machine.h: ...here.
	* sysdeps/sparc/sparc32/bits/atomic.h: Move to ...
	* sysdeps/sparc/sparc32/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ...
	* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here.
	* sysdeps/sparc/sparc64/bits/atomic.h: Move to ...
	* sysdeps/sparc/sparc64/atomic-machine.h: ...here.
	* sysdeps/tile/bits/atomic.h: Move to ...
	* sysdeps/tile/atomic-machine.h: ...here.
	* sysdeps/tile/tilegx/bits/atomic.h: Move to ...
	* sysdeps/tile/tilegx/atomic-machine.h: ...here.  Include
	<sysdeps/tile/atomic-machine.h> instead of
	<sysdeps/tile/bits/atomic.h>.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/tile/tilepro/bits/atomic.h: Move to ...
	* sysdeps/tile/tilepro/atomic-machine.h: ...here.  Include
	<sysdeps/tile/atomic-machine.h> instead of
	<sysdeps/tile/bits/atomic.h>.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here.  Include
	<sysdeps/arm/atomic-machine.h> instead of
	<sysdeps/arm/bits/atomic.h>.
	* sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here.
	(_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here.
	* sysdeps/x86_64/bits/atomic.h: Move to ...
	* sysdeps/x86_64/atomic-machine.h: ...here.
	* include/atomic.h: Include <atomic-machine.h> instead of
	<bits/atomic.h>.
2015-09-11 20:00:19 +00:00
Joseph Myers
00a7073c38 Add more randomly-generated libm tests.
This patch adds more libm test inputs found through random test
generation to increase observed ulps on x86_64.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acosh, atanh, cbrt,
	cosh, csqrt, erfc, expm1 and lgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-11 15:03:10 +00:00
Joseph Myers
050f29c188 Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558).
The existing implementations of lgamma functions (except for the ia64
versions) use the reflection formula for negative arguments.  This
suffers large inaccuracy from cancellation near zeros of lgamma (near
where the gamma function is +/- 1).

This patch fixes this inaccuracy.  For arguments above -2, there are
no zeros and no large cancellation, while for sufficiently large
negative arguments the zeros are so close to integers that even for
integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation
is not significant.  Thus, it is only necessary to take special care
about cancellation for arguments around a limited number of zeros.

Accordingly, this patch uses precomputed tables of relevant zeros,
expressed as the sum of two floating-point values.  The log of the
ratio of two sines can be computed accurately using log1p in cases
where log would lose accuracy.  The log of the ratio of two gamma(1-x)
values can be computed using Stirling's approximation (the difference
between two values of that approximation to lgamma being computable
without computing the two values and then subtracting), with
appropriate adjustments (which don't reduce accuracy too much) in
cases where 1-x is too small to use Stirling's approximation directly.

In the interval from -3 to -2, using the ratios of sines and of
gamma(1-x) can still produce too much cancellation between those two
parts of the computation (and that interval is also the worst interval
for computing the ratio between gamma(1-x) values, which computation
becomes more accurate, while being less critical for the final result,
for larger 1-x).  Because this can result in errors slightly above
those accepted in glibc, this interval is instead dealt with by
polynomial approximations.  Separate polynomial approximations to
(|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8
from -3 to -2, where n (-3 or -2) is the nearest integer to the
1/8-interval and x0 is the zero of lgamma in the relevant half-integer
interval (-3 to -2.5 or -2.5 to -2).

Together, the two approaches are intended to give sufficient accuracy
for all negative arguments in the problem range.  Outside that range,
the previous implementation continues to be used.

Tested for x86_64, x86, mips64 and powerpc.  The mips64 and powerpc
testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm
with large negative arguments giving spurious "invalid" exceptions
(exposed by newly added tests for cases this patch doesn't affect the
logic for); I'll address those problems separately.

	[BZ #2542]
	[BZ #2543]
	[BZ #2558]
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call
	__lgamma_neg for arguments from -28.0 to -2.0.
	* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call
	__lgamma_negf for arguments from -15.0 to -2.0.
	* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
	Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0.
	* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r):
	Call __lgamma_negl for arguments from -33.0 to -2.0.
	* sysdeps/ieee754/dbl-64/lgamma_neg.c: New file.
	* sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise.
	* sysdeps/generic/math_private.h (__lgamma_negf): New prototype.
	(__lgamma_neg): Likewise.
	(__lgamma_negl): Likewise.
	(__lgamma_product): Likewise.
	(__lgamma_productl): Likewise.
	* math/Makefile (libm-calls): Add lgamma_neg and lgamma_product.
	* math/auto-libm-test-in: Add more tests of lgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-10 22:27:58 +00:00
Joseph Myers
ec999b8e5e Move bits/libc-lock.h and bits/libc-lockP.h out of bits/ (bug 14912).
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/libc-lock.h to plain libc-lock.h and
bits/libc-lockP.h to plain libc-lockP.h to follow that convention.

Note that I don't know where libc-lockP.h comes from for Hurd (the
Hurd libc-lock.h includes libc-lockP.h, but the only libc-lockP.h in
the glibc source tree is for NPTL) - some unmerged patch? - but I
updated the #include in the Hurd libc-lock.h anyway.

Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).

	[BZ #14912]
	* bits/libc-lock.h: Move to ...
	* sysdeps/generic/libc-lock.h: ...here.
	(_BITS_LIBC_LOCK_H): Rename macro to _LIBC_LOCK_H.
	* sysdeps/mach/hurd/bits/libc-lock.h: Move to ...
	* sysdeps/mach/hurd/libc-lock.h: ...here.
	(_BITS_LIBC_LOCK_H): Rename macro to _LIBC_LOCK_H.
	[_LIBC]: Include <libc-lockP.h> instead of <bits/libc-lockP.h>.
	* sysdeps/mach/bits/libc-lock.h: Move to ...
	* sysdeps/mach/libc-lock.h: ...here.
	(_BITS_LIBC_LOCK_H): Rename macro to _LIBC_LOCK_H.
	* sysdeps/nptl/bits/libc-lock.h: Move to ...
	* sysdeps/nptl/libc-lock.h: ...here.
	(_BITS_LIBC_LOCK_H): Rename macro to _LIBC_LOCK_H.
	* sysdeps/nptl/bits/libc-lockP.h: Move to ...
	* sysdeps/nptl/libc-lockP.h: ...here.
	(_BITS_LIBC_LOCKP_H): Rename macro to _LIBC_LOCKP_H.
	* crypt/crypt_util.c: Include <libc-lock.h> instead of
	<bits/libc-lock.h>.
	* dirent/scandir-tail.c: Likewise.
	* dlfcn/dlerror.c: Likewise.
	* elf/dl-close.c: Likewise.
	* elf/dl-iteratephdr.c: Likewise.
	* elf/dl-lookup.c: Likewise.
	* elf/dl-open.c: Likewise.
	* elf/dl-support.c: Likewise.
	* elf/dl-writev.h: Likewise.
	* elf/rtld.c: Likewise.
	* grp/fgetgrent.c: Likewise.
	* gshadow/fgetsgent.c: Likewise.
	* gshadow/sgetsgent.c: Likewise.
	* iconv/gconv_conf.c: Likewise.
	* iconv/gconv_db.c: Likewise.
	* iconv/gconv_dl.c: Likewise.
	* iconv/gconv_int.h: Likewise.
	* iconv/gconv_trans.c: Likewise.
	* include/link.h: Likewise.
	* inet/getnameinfo.c: Likewise.
	* inet/getnetgrent.c: Likewise.
	* inet/getnetgrent_r.c: Likewise.
	* intl/bindtextdom.c: Likewise.
	* intl/dcigettext.c: Likewise.
	* intl/finddomain.c: Likewise.
	* intl/gettextP.h: Likewise.
	* intl/loadmsgcat.c: Likewise.
	* intl/localealias.c: Likewise.
	* intl/textdomain.c: Likewise.
	* libidn/idn-stub.c: Likewise.
	* libio/libioP.h: Likewise.
	* locale/duplocale.c: Likewise.
	* locale/freelocale.c: Likewise.
	* locale/newlocale.c: Likewise.
	* locale/setlocale.c: Likewise.
	* login/getutent_r.c: Likewise.
	* login/getutid_r.c: Likewise.
	* login/getutline_r.c: Likewise.
	* login/utmp-private.h: Likewise.
	* login/utmpname.c: Likewise.
	* malloc/mtrace.c: Likewise.
	* misc/efgcvt.c: Likewise.
	* misc/error.c: Likewise.
	* misc/fstab.c: Likewise.
	* misc/getpass.c: Likewise.
	* misc/mntent.c: Likewise.
	* misc/syslog.c: Likewise.
	* nis/nis_call.c: Likewise.
	* nis/nis_callback.c: Likewise.
	* nis/nss-default.c: Likewise.
	* nis/nss_compat/compat-grp.c: Likewise.
	* nis/nss_compat/compat-initgroups.c: Likewise.
	* nis/nss_compat/compat-pwd.c: Likewise.
	* nis/nss_compat/compat-spwd.c: Likewise.
	* nis/nss_nis/nis-alias.c: Likewise.
	* nis/nss_nis/nis-ethers.c: Likewise.
	* nis/nss_nis/nis-grp.c: Likewise.
	* nis/nss_nis/nis-hosts.c: Likewise.
	* nis/nss_nis/nis-network.c: Likewise.
	* nis/nss_nis/nis-proto.c: Likewise.
	* nis/nss_nis/nis-pwd.c: Likewise.
	* nis/nss_nis/nis-rpc.c: Likewise.
	* nis/nss_nis/nis-service.c: Likewise.
	* nis/nss_nis/nis-spwd.c: Likewise.
	* nis/nss_nisplus/nisplus-alias.c: Likewise.
	* nis/nss_nisplus/nisplus-ethers.c: Likewise.
	* nis/nss_nisplus/nisplus-grp.c: Likewise.
	* nis/nss_nisplus/nisplus-hosts.c: Likewise.
	* nis/nss_nisplus/nisplus-initgroups.c: Likewise.
	* nis/nss_nisplus/nisplus-network.c: Likewise.
	* nis/nss_nisplus/nisplus-proto.c: Likewise.
	* nis/nss_nisplus/nisplus-pwd.c: Likewise.
	* nis/nss_nisplus/nisplus-rpc.c: Likewise.
	* nis/nss_nisplus/nisplus-service.c: Likewise.
	* nis/nss_nisplus/nisplus-spwd.c: Likewise.
	* nis/ypclnt.c: Likewise.
	* nptl/libc_pthread_init.c: Likewise.
	* nss/getXXbyYY.c: Likewise.
	* nss/getXXent.c: Likewise.
	* nss/getXXent_r.c: Likewise.
	* nss/nss_db/db-XXX.c: Likewise.
	* nss/nss_db/db-netgrp.c: Likewise.
	* nss/nss_db/nss_db.h: Likewise.
	* nss/nss_files/files-XXX.c: Likewise.
	* nss/nss_files/files-alias.c: Likewise.
	* nss/nsswitch.c: Likewise.
	* posix/regex_internal.h: Likewise.
	* posix/wordexp.c: Likewise.
	* pwd/fgetpwent.c: Likewise.
	* resolv/res_hconf.c: Likewise.
	* resolv/res_libc.c: Likewise.
	* shadow/fgetspent.c: Likewise.
	* shadow/lckpwdf.c: Likewise.
	* shadow/sgetspent.c: Likewise.
	* socket/opensock.c: Likewise.
	* stdio-common/reg-modifier.c: Likewise.
	* stdio-common/reg-printf.c: Likewise.
	* stdio-common/reg-type.c: Likewise.
	* stdio-common/vfprintf.c: Likewise.
	* stdio-common/vfscanf.c: Likewise.
	* stdlib/abort.c: Likewise.
	* stdlib/cxa_atexit.c: Likewise.
	* stdlib/fmtmsg.c: Likewise.
	* stdlib/random.c: Likewise.
	* stdlib/setenv.c: Likewise.
	* string/strsignal.c: Likewise.
	* sunrpc/auth_none.c: Likewise.
	* sunrpc/bindrsvprt.c: Likewise.
	* sunrpc/create_xid.c: Likewise.
	* sunrpc/key_call.c: Likewise.
	* sunrpc/rpc_thread.c: Likewise.
	* sysdeps/arm/backtrace.c: Likewise.
	* sysdeps/generic/ldsodefs.h: Likewise.
	* sysdeps/generic/stdio-lock.h: Likewise.
	* sysdeps/generic/unwind-dw2-fde.c: Likewise.
	* sysdeps/i386/backtrace.c: Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-compat.c: Likewise.
	* sysdeps/m68k/backtrace.c: Likewise.
	* sysdeps/mach/hurd/cthreads.c: Likewise.
	* sysdeps/mach/hurd/dirstream.h: Likewise.
	* sysdeps/mach/hurd/malloc-machine.h: Likewise.
	* sysdeps/nptl/malloc-machine.h: Likewise.
	* sysdeps/nptl/stdio-lock.h: Likewise.
	* sysdeps/posix/dirstream.h: Likewise.
	* sysdeps/posix/getaddrinfo.c: Likewise.
	* sysdeps/posix/system.c: Likewise.
	* sysdeps/pthread/aio_suspend.c: Likewise.
	* sysdeps/s390/s390-32/backtrace.c: Likewise.
	* sysdeps/s390/s390-64/backtrace.c: Likewise.
	* sysdeps/unix/sysv/linux/check_pf.c: Likewise.
	* sysdeps/unix/sysv/linux/if_index.c: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/getutent_r.c: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/getutid_r.c: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/getutline_r.c: Likewise.
	* sysdeps/unix/sysv/linux/shm-directory.c: Likewise.
	* sysdeps/unix/sysv/linux/system.c: Likewise.
	* sysdeps/x86_64/backtrace.c: Likewise.
	* time/alt_digit.c: Likewise.
	* time/era.c: Likewise.
	* time/tzset.c: Likewise.
	* wcsmbs/wcsmbsload.c: Likewise.
	* nptl/tst-initializers1.c (do_test): Refer to <libc-lock.h>
	instead of <bits/libc-lock.h> in comment.
2015-09-08 21:11:03 +00:00
H.J. Lu
38d22f9f48 Don't disable SSE in x86-64 ld.so
Since x86-64 ld.so preserves vector registers now, we can use SSE in
x86-64 ld.so.  We should run tst-ld-sse-use.sh only on i386.

	* sysdeps/x86/Makefile [$(subdir) == elf] (CFLAGS-.os,
	tests-special, $(objpfx)tst-ld-sse-use.out): Moved to ...
	* sysdeps/i386/Makefile [$(subdir) == elf] (CFLAGS-.os,
	tests-special, $(objpfx)tst-ld-sse-use.out): Here.  Update
	comments.
	* sysdeps/x86_64/Makefile [$(subdir) == elf] (CFLAGS-.os): Add
	-mno-mmx for $(all-rtld-routines).
	* sysdeps/x86/tst-ld-sse-use.sh: Moved to ...
	* sysdeps/i386/tst-ld-sse-use.sh: Here.  Replace x86-64 with
	i386.
2015-08-26 07:55:42 -07:00
H.J. Lu
d8725b1fba Use SSE2 optimized strcmp in x86-64 ld.so
Since ld.so preserves vector registers now, we can use the same SSE2
optimized strcmp in x86-64 libc and ld.so.

	* sysdeps/x86_64/strcmp.S: Remove "#if !IS_IN (libc)".
2015-08-25 12:38:11 -07:00
H.J. Lu
2194737e77 Replace %xmm[8-12] with %xmm[0-4]
Since ld.so preserves vector registers now, we can use %xmm[0-4] to
avoid the REX prefix.

	* sysdeps/x86_64/strlen.S: Replace %xmm[8-12] with %xmm[0-4].
2015-08-25 08:51:23 -07:00
H.J. Lu
2339c6f4bd Remove x86-64 rtld-xxx.c and rtld-xxx.S
Since ld.so preserves vector registers now, we can use the regular,
non-ifunc string and memory functions in ld.so.

	* sysdeps/x86_64/rtld-memcmp.c: Removed.
	* sysdeps/x86_64/rtld-memset.S: Likewise.
	* sysdeps/x86_64/rtld-strchr.S: Likewise.
	* sysdeps/x86_64/rtld-strlen.S: Likewise.
	* sysdeps/x86_64/multiarch/rtld-memcmp.c: Likewise.
	* sysdeps/x86_64/multiarch/rtld-memset.S: Likewise.
2015-08-25 08:50:06 -07:00
H.J. Lu
5f92ec52e7 Replace %xmm8 with %xmm0
Since ld.so preserves vector registers now, we can use %xmm0 to avoid
the REX prefix.

	* sysdeps/x86_64/memset.S: Replace %xmm8 with %xmm0.
2015-08-25 08:48:34 -07:00
Ondřej Bílka
0d0325ed4b Fix strcpy_chk and stpcpy_chk performance.
Hi, as I wrote in previous patches a performance of checked strcpy and
stpcpy is terrible as these don't use sse2 and are around four times
slower that strcpy and stpcpy now.

As this bug shows that these functions are not performance sensitive I
decided just to improve generic implementation instead for easier
maintainance.

        * debug/strcpy_chk.c: Improve performance.
        * debug/stpcpy_chk.c: Likewise.
        * sysdeps/x86_64/strcpy_chk.S: Remove.
        * sysdeps/x86_64/stpcpy_chk.S: Remove.
2015-08-25 15:06:33 +02:00
H.J. Lu
f3dcae82d5 Save and restore vector registers in x86-64 ld.so
This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve
and _dl_runtime_profile, which save and restore the first 8 vector
registers used for parameter passing.  elf_machine_runtime_setup
selects the proper _dl_runtime_resolve or _dl_runtime_profile based
on _dl_x86_cpu_features.  It avoids race condition caused by
FOREIGN_CALL macros, which are only used for x86-64.

Performance impact of saving and restoring 8 vector registers are
negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when
ld.so is optimized with SSE2.

	[BZ #15128]
	* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
	ifuncmain8.
	(modules-names): Add ifuncmod8.
	($(objpfx)ifuncmain8): New rule.
	* sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and
	<cpuid.h>.
	(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
	_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
	_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
	_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
	* sysdeps/x86_64/dl-trampoline.S: Rewrite.
	* sysdeps/x86_64/dl-trampoline.h: Likewise.
	* sysdeps/x86_64/ifuncmain8.c: New file.
	* sysdeps/x86_64/ifuncmod8.c: Likewise.
	* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
	Removed.
	* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
	(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
	Change rtld_savespace_sse to __glibc_unused2.
	(RTLD_CHECK_FOREIGN_CALL): Removed.
	(RTLD_ENABLE_FOREIGN_CALL): Likewise.
	(RTLD_PREPARE_FOREIGN_CALL): Likewise.
	(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
2015-08-25 04:34:13 -07:00
H.J. Lu
daa4db69fc Remove the unused IFUNC files
sysdeps/i386/i686/multiarch/strcasestr-c.c became unused after

commit 1818483b15
Author: Andreas Schwab <schwab@suse.de>
Date:   Wed Dec 18 11:53:27 2013 +1000

    Remove use of SSE4.2 functions for strstr on i686

which contains

-sysdep_routines += strcspn-c strpbrk-c strspn-c strstr-c strcasestr-c
+sysdep_routines += strcspn-c strpbrk-c strspn-c

sysdeps/x86_64/multiarch/strcasestr.c became useless after

t 584b18eb4d
Author: Ondřej Bílka <neleai@seznam.cz>
Date:   Sat Dec 14 19:33:56 2013 +0100

    Add strstr with unaligned loads. Fixes bug 12100.

which changes sysdeps/x86_64/multiarch/strcasestr.c to

libc_ifunc (__strcasestr, __strcasestr_sse2);

This patch removes these file.

	* i386/i686/multiarch/strcasestr-c.c: Removed.
	* x86_64/multiarch/strcasestr.c: Likewise.
	* x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
	Remove strcasestr.
2015-08-20 12:47:20 -07:00
H.J. Lu
1ae6c72dc1 Move x86_64 init-arch.h to sysdeps/x86/init-arch.h
Move sysdeps/x86_64/multiarch/init-arch.h to sysdeps/x86/init-arch.h
which can be used for both i386 and x86_64.

	* sysdeps/i386/i686/multiarch/init-arch.h: Removed.
	* sysdeps/unix/sysv/linux/x86/init-arch.h: Likewise.
	* sysdeps/x86_64/cacheinfo.c: Include <init-arch.h> instead
	of "multiarch/init-arch.h".
	* sysdeps/x86_64/multiarch/init-arch.h: Renamed to ...
	* sysdeps/x86/init-arch.h: This.
2015-08-20 04:29:23 -07:00
Petar Jovanovic
fa19d5c48a Fix dynamic linker issue with bind-now
Fix the bind-now case when DT_REL and DT_JMPREL sections are separate
and there is a gap between them.

	[BZ #14341]
	* elf/dynamic-link.h (elf_machine_lazy_rel): Properly handle the
	case when there is a gap between DT_REL and DT_JMPREL sections.
	* sysdeps/x86_64/Makefile (tests): Add tst-split-dynreloc.
	(LDFLAGS-tst-split-dynreloc): New.
	(tst-split-dynreloc-ENV): Likewise.
	* sysdeps/x86_64/tst-split-dynreloc.c: New file.
	* sysdeps/x86_64/tst-split-dynreloc.lds: Likewise.
2015-08-19 05:37:01 -07:00
Paul Pluzhnikov
d5dff793af Fix BZ #18084 -- backtrace (..., 0) dumps core on x86.
Other architectures also had bugs, or did unnecessary work.
2015-08-15 11:42:43 -07:00
Paul Pluzhnikov
db7f8c8fe0 Regenerated sysdeps/x86_64/fpu/libm-test-ulps with AVX2. 2015-08-14 09:59:04 -07:00
Siddhesh Poyarekar
37dd6a19ca Remove incorrect register mov in floorf/nearbyint on x86_64
The change in 0b5395f052 replaced calls
to __get_cpu_features@plt followed by a mov from rax to rdx, with a
single macro LOAD_RTLD_GLOBAL_RO_RDX.  It is pretty clear that there
was a typo in s_floorf and __nearbyint due to which the (now incorrect)
mov was not removed.  This patch removes that mov.

	* sysdeps/x86_64/fpu/multiarch/s_floorf.S (__floorf): Remove
	unnecessary movq.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S (__nearbyint):
	Likewise.
2015-08-14 05:30:17 -07:00
Joseph Myers
3ba0ac10fa Add more random libm-test inputs.
This patch adds more test inputs to various libm functions found
through random generation to have larger ulps errors than previously
listed in libm-test-ulp, on at least one of x86_64 and x86.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
	asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc,
	exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh
	and tgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-08-13 23:23:23 +00:00
H.J. Lu
1dfa4a94ae Update libmvec multiarch functions for <cpu-features.h>
This patch updates libmvec multiarch functions to use the newly defined
HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from
<cpu-features.h>.

	* math/Makefile ($(addprefix $(objpfx), $(libm-vec-tests))):
	Remove $(objpfx)init-arch.o.
	* sysdeps/x86_64/fpu/Makefile (libmvec-support): Remove
	init-arch.
	* sysdeps/x86_64/fpu/math-tests-arch.h (avx_usable): Removed.
	(INIT_ARCH_EXT): Defined as empty.
	(CHECK_ARCH_EXT): Replace HAS_XXX with HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Remove
	__init_cpu_features call.  Replace HAS_XXX with
	HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: Likewise.
2015-08-13 03:41:47 -07:00
H.J. Lu
0b5395f052 Update x86_64 multiarch functions for <cpu-features.h>
This patch updates x86_64 multiarch functions to use the newly defined
HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from
<cpu-features.h>.

	* sysdeps/x86_64/fpu/multiarch/e_asin.c: Replace HAS_XXX with
	HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_ceil.S: Use
	LOAD_RTLD_GLOBAL_RO_RDX and HAS_CPU_FEATURE (SSE4_1).
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floor.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S : Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.S : Likewise.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise.
	* sysdeps/x86_64/multiarch/sched_cpucount.c: Likewise.
	* sysdeps/x86_64/multiarch/strstr.c: Likewise.
	* sysdeps/x86_64/multiarch/memmove.c: Likewise.
	* sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
	* sysdeps/x86_64/multiarch/test-multiarch.c: Likewise.
	* sysdeps/x86_64/multiarch/memcmp.S: Remove __init_cpu_features
	call.  Add LOAD_RTLD_GLOBAL_RO_RDX.  Replace HAS_XXX with
	HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
	* sysdeps/x86_64/multiarch/memcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/memset.S: Likewise.
	* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/strcat.S: Likewise.
	* sysdeps/x86_64/multiarch/strchr.S: Likewise.
	* sysdeps/x86_64/multiarch/strcmp.S: Likewise.
	* sysdeps/x86_64/multiarch/strcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/strcspn.S: Likewise.
	* sysdeps/x86_64/multiarch/strspn.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.S: Likewise.
	* sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.
2015-08-13 03:41:30 -07:00
H.J. Lu
e2e4f56056 Add _dl_x86_cpu_features to rtld_global
This patch adds _dl_x86_cpu_features to rtld_global in x86 ld.so
and initializes it early before __libc_start_main is called so that
cpu_features is always available when it is used and we can avoid
calling __init_cpu_features in IFUNC selectors.

	* sysdeps/i386/dl-machine.h: Include <cpu-features.c>.
	(dl_platform_init): Call init_cpu_features.
	* sysdeps/i386/dl-procinfo.c (_dl_x86_cpu_features): New.
	* sysdeps/i386/i686/cacheinfo.c
	(DISABLE_PREFERRED_MEMORY_INSTRUCTION): Removed.
	* sysdeps/i386/i686/multiarch/Makefile (aux): Remove init-arch.
	* sysdeps/i386/i686/multiarch/Versions: Removed.
	* sysdeps/i386/i686/multiarch/ifunc-defines.sym (KIND_OFFSET):
	Removed.
	* sysdeps/i386/ldsodefs.h: Include <cpu-features.h>.
	* sysdeps/unix/sysv/linux/x86/Makefile
	(libpthread-sysdep_routines): Remove init-arch.
	* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.c: Include
	<sysdeps/x86_64/dl-procinfo.c> instead of
	sysdeps/generic/dl-procinfo.c>.
	* sysdeps/x86/Makefile [$(subdir) == csu] (gen-as-const-headers):
	Add cpu-features-offsets.sym and rtld-global-offsets.sym.
	[$(subdir) == elf] (sysdep-dl-routines): Add dl-get-cpu-features.
	[$(subdir) == elf] (tests): Add tst-get-cpu-features.
	[$(subdir) == elf] (tests-static): Add
	tst-get-cpu-features-static.
	* sysdeps/x86/Versions: New file.
	* sysdeps/x86/cpu-features-offsets.sym: Likewise.
	* sysdeps/x86/cpu-features.c: Likewise.
	* sysdeps/x86/cpu-features.h: Likewise.
	* sysdeps/x86/dl-get-cpu-features.c: Likewise.
	* sysdeps/x86/libc-start.c: Likewise.
	* sysdeps/x86/rtld-global-offsets.sym: Likewise.
	* sysdeps/x86/tst-get-cpu-features-static.c: Likewise.
	* sysdeps/x86/tst-get-cpu-features.c: Likewise.
	* sysdeps/x86_64/dl-procinfo.c: Likewise.
	* sysdeps/x86_64/cacheinfo.c (__cpuid_count): Removed.
	Assume USE_MULTIARCH is defined and don't check it.
	(is_intel): Replace __cpu_features with GLRO(dl_x86_cpu_features).
	(is_amd): Likewise.
	(max_cpuid): Likewise.
	(intel_check_word): Likewise.
	(__cache_sysconf): Don't call __init_cpu_features.
	(__x86_preferred_memory_instruction): Removed.
	(init_cacheinfo): Don't call __init_cpu_features. Replace
	__cpu_features with GLRO(dl_x86_cpu_features).
	* sysdeps/x86_64/dl-machine.h: <cpu-features.c>.
	(dl_platform_init): Call init_cpu_features.
	* sysdeps/x86_64/ldsodefs.h: Include <cpu-features.h>.
	* sysdeps/x86_64/multiarch/Makefile (aux): Remove init-arch.
	* sysdeps/x86_64/multiarch/Versions: Removed.
	* sysdeps/x86_64/multiarch/cacheinfo.c: Likewise.
	* sysdeps/x86_64/multiarch/init-arch.c: Likewise.
	* sysdeps/x86_64/multiarch/ifunc-defines.sym (KIND_OFFSET):
	Removed.
	* sysdeps/x86_64/multiarch/init-arch.h: Rewrite.
2015-08-13 03:41:22 -07:00
Joseph Myers
4afe4b20ce Add more tests of various libm functions.
This patch adds more tests of various libm functions found through
random test generation to give increased ulps on 32-bit x86.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acosh, asin, asinh,
	atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10,
	expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-08-11 00:58:28 +00:00
H.J. Lu
72354ab5e1 Align stack to 16 bytes when calling __errno_location
We should align stack to 16 bytes when calling __errno_location.

	[BZ #18661]
	* sysdeps/x86_64/fpu/s_cosf.S (__cosf): Align stack to 16 bytes
	when calling __errno_location.
	* sysdeps/x86_64/fpu/s_sincosf.S (__sincosf): Likewise.
	* sysdeps/x86_64/fpu/s_sinf.S (__sinf): Likewise.
2015-08-05 08:36:27 -07:00
H.J. Lu
3b8d2eb7f8 Compile {memcpy,strcmp}-sse2-unaligned.S only for libc
{memcpy,strcmp}-sse2-unaligned.S aren't needed in ld.so.

	* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Compile
	only for libc.
	* sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: Likewise.
2015-08-05 08:28:37 -07:00
Andrew Senkevich
a9e8ea51cc Prevent runtime fail of SSE vector math tests on non SSE4.1 machine.
[BZ #18740]
    * sysdeps/x86_64/fpu/Makefile (double-vlen2-arch-ext-cflags,
    float-vlen4-arch-ext-cflags): Removed.
    * math/Makefile (CFLAGS-test-double-vlen2-wrappers.c,
    CFLAGS-test-float-vlen4-wrappers.c): Likewise.
2015-07-30 18:00:24 +03:00
H.J. Lu
9637d8a253 Extend local PLT reference check
On x86, linker in binutils 2.26 and newer consolidates R_*_JUMP_SLOT with
R_*_GLOB_DAT relocation against the same symbol.  This patch extends
local PLT reference check to support alternate relocations.

	[BZ #18078]
	* scripts/check-localplt.awk: Support alternate relocations.
	* scripts/localplt.awk: Also check relocations in DT_RELA/DT_REL
	sections.
	* sysdeps/unix/sysv/linux/i386/localplt.data: Mark free and
	malloc entries with + REL R_386_GLOB_DAT.
	* sysdeps/x86_64/localplt.data: New file.
2015-07-29 11:58:06 -07:00
Andrew Senkevich
febce2ac5f Added runtime check for AVX vector math tests.
[BZ #18731]
    * sysdeps/x86_64/fpu/math-tests-arch.h: Added AVX runtime check.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
2015-07-29 19:47:29 +03:00
Andrew Senkevich
9901716135 Fixed several libmvec bugs found during testing on KNL hardware.
AVX512 IFUNC implementations, implementations of wrappers to
AVX2 versions and KNL expf implementation fixed.

    * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Fixed AVX512 IFUNC.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Fixed wrappers to AVX2.
    * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Fixed KNL
    implementation.
2015-07-24 14:47:23 +03:00
Arjun Shankar
0035851c3c Modify several tests to use test-skeleton.c
These tests were skipped by the use-test-skeleton conversion done in
commit 29955b5d because they were reused in other tests via the #include
directive, and so deemed worth an inspection before they were modified.
This has now been done.

ChangeLog:

2015-07-09  Arjun Shankar  <arjun.is@lostca.se>

	* elf/tst-leaks1.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* localedata/tst-langinfo.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* math/test-fpucw.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* math/test-tgmath.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* math/test-tgmath2.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* setjmp/tst-setjmp.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* stdio-common/tst-sscanf.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
	* sysdeps/x86_64/tst-audit6.c (main): Converted to ...
	(do_test): ... this.
	(TEST_FUNCTION): New macro.
	Include test-skeleton.c.
2015-07-15 15:10:23 +05:30
H.J. Lu
2eb9ef29b6 Improve bndmov encoding with zero displacement
If x86-64 assembler doesn't support MPX, we encode bndmov instruction by
hand.  When displacement is zero, assembler generates shorter encoding.
This patch improves bndmov encoding with zero displacement so that ld.so
is identical when using assemblers with and without MPX support.

	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve
	bndmov encoding with zero displacement.
2015-07-09 09:30:20 -07:00
Igor Zamyatin
14c5cbabc2 Preserve bound registers for pointer pass/return
We need to save/restore bound registers and add a BND prefix before
branches in _dl_runtime_profile so that bound registers for pointer
pass and return are preserved when LD_AUDIT is used.

	[BZ #18134]
	* sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
	* sysdeps/i386/configure: Regenerated.
	* sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
	(_dl_runtime_profile): Save and restore Intel MPX return bound
	registers when calling _dl_call_pltexit.  Add
	PRESERVE_BND_REGS_PREFIX before return.
	* sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
	(LRV_BND1_OFFSET): Likewise.
	* sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and
	lrv_bnd1.
	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
	typo in bndmov encoding.
	* sysdeps/x86_64/dl-trampoline.h: Properly save and restore
	Intel MPX bound registers.  Add PRESERVE_BND_REGS_PREFIX before
	branch instructions to preserve bounds.
2015-07-09 06:50:12 -07:00
H.J. Lu
9aec6d2a2f Add la_symbind32 to x86-64 audit tests
la_symbind32 is used for x32 in x86-64 audit tests.  We should define
both la_symbind32 and la_symbind64 in x86-64 audit tests.

	* sysdeps/x86_64/tst-auditmod10b.c (la_symbind32): New.
	* sysdeps/x86_64/tst-auditmod4b.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod5b.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod6b.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod6c.c (la_symbind32): Likewise.
	* sysdeps/x86_64/tst-auditmod7b.c (la_symbind32): Likewise.
2015-07-07 06:09:29 -07:00
Torvald Riegel
4eb984d3ab Clean up BUSY_WAIT_NOP and atomic_delay.
This patch combines BUSY_WAIT_NOP and atomic_delay into a new
atomic_spin_nop function and adjusts all clients.  The new function is
put into atomic.h because what is best done in a spin loop is
architecture-specific, and atomics must be used for spinning.  The
function name is meant to tell users that this has no effect on
synchronization semantics but is a performance aid for spinning.
2015-06-30 15:57:15 +02:00
Joseph Myers
e02920bc02 Improve tgamma accuracy (bug 18613).
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.

Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations.  However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).

Some additional complications also arose.  Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow.  (libm-test.inc also doesn't handle being lax about whether
the result in cases very close to the overflow threshold is infinity
or a finite value close to overflow, but that doesn't cause problems
in this case though I've seen it cause problems with random test
generation for some functions.)  Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc).  And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.

I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.

This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18613]
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
	X_ADJ not X when adjusting exponent.
	(__ieee754_gamma_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammaf_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* math/libm-test.inc (tgamma_test_data): Remove one test.  Moved
	to auto-libm-test-in.
	(tgamma_test): Use ALL_RM_TEST.
	* math/auto-libm-test-in: Add one test of tgamma.  Mark some other
	tests of tgamma with spurious-overflow.
	* math/auto-libm-test-out: Regenerated.
	* math/gen-libm-have-vector-test.sh: Do not check for START.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-29 23:29:35 +00:00
Joseph Myers
a8e2112ae3 Use round-to-nearest internally in jn, test with ALL_RM_TEST (bug 18602).
Some existing jn tests, if run in non-default rounding modes, produce
errors above those accepted in glibc, which causes problems for moving
tests of jn to use ALL_RM_TEST.  This patch makes jn set rounding
to-nearest internally, as was done for yn some time ago, then computes
the appropriate underflowing value for results that underflowed to
zero in to-nearest, and moves the tests to ALL_RM_TEST.  It does
nothing about the general inaccuracy of Bessel function
implementations in glibc, though it should make jn more accurate on
average in non-default rounding modes through reduced error
accumulation.  The recomputation of results that underflowed to zero
should as a side-effect fix some cases of bug 16559, where jn just
used an exact zero, but that is *not* the goal of this patch and other
cases of that bug remain unfixed.

(Most of the changes in the patch are reindentation to add new scopes
for SET_RESTORE_ROUND*.)

Tested for x86_64, x86, powerpc and mips64.

	[BZ #16559]
	[BZ #18602]
	* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set
	round-to-nearest internally then recompute results that
	underflowed to zero in the original rounding mode.
	* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
	* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise
	* math/libm-test.inc (jn_test): Use ALL_RM_TEST.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-25 21:46:02 +00:00
Joseph Myers
f9536db790 Refactor libm tests.
This patch refactors the libm tests using libm-test.inc to reduce the
level of duplicate definitions.  New headers are created for the
definitions shared by tests for a particular type; by tests of inline
functions; by tests of non-inline functions; by scalar tests; and by
vector tests.  The unused MATHCONST macro is removed.  A new macro
VEC_LEN is added to the vector headers to allow the macros defining
wrappers for vector functions to be defined once, instead of six times
each (differing only in vector length) as before.  There is still
scope for further refactoring, but this seems a useful start.

Tested for x86_64.

	* math/test-double.h: New file.
	* math/test-float.h: Likewise.
	* math/test-ldouble.h: Likewise.
	* math/test-math-inline.h: Likewise.
	* math/test-math-no-inline.h: Likewise.
	* math/test-math-scalar.h: Likewise.
	* math/test-math-vector.h: Likewise.
	* math/test-vec-loop.h: Remove file.  Contents moved into
	test-math-vector.h.
	* math/libm-test.inc (MATHCONST): Do not document macro.
	* math/test-double.c: Include test-double.h, test-math-no-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-float.c: Include test-float.h, test-math-no-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-idouble.c: Include test-double.h, test-math-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(TEST_INLINE): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-ifloat.c: Include test-float.h, test-math-inline.h and
	test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(TEST_INLINE): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-ildoubl.c: Include test-ldouble.h, test-math-inline.h
	and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_LDOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(TEST_INLINE): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-ldouble.c: Include test-ldouble.h,
	test-math-no-inline.h and test-math-scalar.h.
	(FUNC): Remove macro.
	(FUNC_TEST): Likewise.
	(FLOAT): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_LDOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	* math/test-double-vlen2.h: Include test-double.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-double-vlen4.h: Include test-double.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-double-vlen8.h: Include test-double.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_DOUBLE): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-float-vlen4.h: Include test-float.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-float-vlen8.h: Include test-float.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* math/test-float-vlen16.h: Include test-float.h,
	test-math-no-inline.h and test-math-vector.h.
	(FLOAT): Remove macro.
	(FUNC): Likewise.
	(MATHCONST): Likewise.
	(PRINTF_EXPR): Likewise.
	(PRINTF_XEXPR): Likewise.
	(PRINTF_NEXPR): Likewise.
	(TEST_FLOAT): Likewise.
	(TEST_MATHVEC): Likewise.
	(__NO_MATH_INLINES): Likewise.
	(CNCT): Likewise.
	(CONCAT): Likewise.
	(WRAPPER_NAME): Likewise.
	(WRAPPER_DECL): Likewise.
	(WRAPPER_DECL_ff): Likewise.
	(WRAPPER_DECL_fFF): Likewise.
	(VECTOR_WRAPPER): Likewise.
	(VECTOR_WRAPPER_ff): Likewise.
	(VECTOR_WRAPPER_fFF): Likewise.
	(VEC_LEN): New macro.
	* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Do not include
	test-vec-loop.h.
	* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.
2015-06-24 23:27:18 +00:00
Joseph Myers
a67894c505 Fix cexp, ccos, ccosh, csin, csinh spurious underflows (bug 18594).
cexp, ccos, ccosh, csin and csinh have spurious underflows in cases
where they compute sin of the smallest normal, that produces an
underflow exception (depending on which sin implementation is in use)
but the final result does not underflow.  ctan and ctanh may also have
such underflows, or they may be latent (the issue there is that
e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value
above DBL_MIN, which under glibc's accuracy goals may not have an
underflow exception, but the intermediate computation of sin (DBL_MIN)
would legitimately underflow on before-rounding architectures).

This patch fixes all those functions so they use plain comparisons (>
DBL_MIN etc.) instead of comparing the result of fpclassify with
FP_SUBNORMAL (in all these cases, we already know the number being
compared is finite).  Note that in the case of csin / csinf / csinl,
there is no need for fabs calls in the comparison because the real
part has already been reduced to its absolute value.

As the patch fixes the failures that previously obstructed moving
tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST
by the patch (two functions remain yet to be converted).

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #18594]
	* math/s_ccosh.c (__ccosh): Compare with least normal value
	instead of comparing class with FP_SUBNORMAL.
	* math/s_ccoshf.c (__ccoshf): Likewise.
	* math/s_ccoshl.c (__ccoshl): Likewise.
	* math/s_cexp.c (__cexp): Likewise.
	* math/s_cexpf.c (__cexpf): Likewise.
	* math/s_cexpl.c (__cexpl): Likewise.
	* math/s_csin.c (__csin): Likewise.
	* math/s_csinf.c (__csinf): Likewise.
	* math/s_csinh.c (__csinh): Likewise.
	* math/s_csinhf.c (__csinhf): Likewise.
	* math/s_csinhl.c (__csinhl): Likewise.
	* math/s_csinl.c (__csinl): Likewise.
	* math/s_ctan.c (__ctan): Likewise.
	* math/s_ctanf.c (__ctanf): Likewise.
	* math/s_ctanh.c (__ctanh): Likewise.
	* math/s_ctanhf.c (__ctanhf): Likewise.
	* math/s_ctanhl.c (__ctanhl): Likewise.
	* math/s_ctanl.c (__ctanl): Likewise.
	* math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp,
	csin, csinh, ctan and ctanh.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (cexp_test): Use ALL_RM_TEST.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-24 21:04:51 +00:00
Joseph Myers
ac831b362a Fix csin, csinh overflow in directed rounding modes (bug 18593).
csin and csinh can produce bad results when overflowing in directed
rounding modes, because a multiplication that can overflow is followed
by a possible negation.  This patch fixes this by negating one of the
arguments of the multiplication before the multiplication instead of
negating the result.

The new tests for this issue are added to auto-libm-test-in, starting
use of that file for csin and csinh.  The issue was found in the
course of moving existing tests for csin and csinh (existing tests, by
being enabled in more cases than previously, showed the issue for
float and double but not for long double); that move will now be done
separately.

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #18593]
	* math/s_csin.c (__csin): Negate before rather than after possibly
	overflowing multiplication.
	* math/s_csinf.c (__csinf): Likewise.
	* math/s_csinh.c (__csinh): Likewise.
	* math/s_csinhf.c (__csinhf): Likewise.
	* math/s_csinhl.c (__csinhl): Likewise.
	* math/s_csinl.c (__csinl): Likewise.
	* math/auto-libm-test-in: Add some tests of csin and csinh.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c.
	(csinh_test_data): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2015-06-24 16:20:48 +00:00
Andrew Senkevich
36870482d2 Combination of data tables for x86_64 vector functions sinf, cosf and sincosf.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: Renamed variable
    and included header.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise.
    * sysdeps/x86_64/fpu/svml_s_trig_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_trig_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_s_cosf_data.S: Removed file.
    * sysdeps/x86_64/fpu/svml_s_cosf_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sinf_data.S: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sinf_data.h: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: Likewise.
    * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: Likewise.
2015-06-24 17:44:35 +03:00