Commit Graph

26 Commits

Author SHA1 Message Date
Joseph Myers
a5721ebc68 Fix clog, clog10 inaccuracy (bug 19016).
For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large
errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids
cancellation error and then using log1p.

However, the thresholds for using that approach still result in log
being used on argument as large as sqrt(13/16) > 0.9, leading to
significant errors, in some cases above the 9ulp maximum allowed in
glibc libm.  This patch arranges for the approach using log1p to be
used in any cases where |X|, |Y| < 1 and X^2 + Y^2 >= 0.5 (with the
existing allowance for cases where one of X and Y is very small),
adjusting the __x2y2m1 functions to work with the wider range of
inputs.  This way, log only gets used on arguments below sqrt(1/2) (or
substantially above 1), where the error involved is much less.

Tested for x86_64, x86, mips64 and powerpc.  For the ulps regeneration
I removed the existing clog and clog10 ulps before regenerating to
allow any reduced ulps to appear.  Tests added include those found by
random test generation to produce large ulps either before or after
the patch, and some found by trying inputs close to the (0.75, 0.5)
threshold where the potential errors from using log are largest.

	[BZ #19016]
	* sysdeps/generic/math_private.h (__x2y2m1f): Update comment to
	allow more cases with X^2 + Y^2 >= 0.5.
	* sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise.  Add -1 as
	normal element in sum instead of special-casing based on values of
	arguments.
	* sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment.
	* sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise.  Add
	-1 as normal element in sum instead of special-casing based on
	values of arguments.
	* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise.
	* sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0]
	(__x2y2m1): Update comment.
	* sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise.  Add -1
	as normal element in sum instead of special-casing based on values
	of arguments.
	* math/s_clog.c (__clog): Handle more cases using log1p without
	hypot.
	* math/s_clog10.c (__clog10): Likewise.
	* math/s_clog10f.c (__clog10f): Likewise.
	* math/s_clog10l.c (__clog10l): Likewise.
	* math/s_clogf.c (__clogf): Likewise.
	* math/s_clogl.c (__clogl): Likewise.
	* math/auto-libm-test-in: Add more tests of clog and clog10.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-28 22:11:22 +00:00
Joseph Myers
d96164c330 Refactor code forcing underflow exceptions.
Various floating-point functions have code to force underflow
exceptions if a tiny result was computed in a way that might not have
resulted in such exceptions even though the result is inexact.  This
typically uses math_force_eval to ensure that the underflowing
expression is evaluated, but sometimes uses volatile.

This patch refactors such code to use three new macros
math_check_force_underflow, math_check_force_underflow_nonneg and
math_check_force_underflow_complex (which in turn use
math_force_eval).  In the limited number of cases not suited to a
simple conversion to these macros, existing uses of volatile are
changed to use math_force_eval instead.  The converted code does not
always execute exactly the same sequence of operations as the original
code, but the overall effects should be the same.

Tested for x86_64, x86, mips64 and powerpc.

	* sysdeps/generic/math_private.h (fabs_tg): New macro.
	(min_of_type): Likewise.
	(math_check_force_underflow): Likewise.
	(math_check_force_underflow_nonneg): Likewise.
	(math_check_force_underflow_complex): Likewise.
	* math/e_exp2l.c (__ieee754_exp2l): Use
	math_check_force_underflow_nonneg.
	* math/k_casinh.c (__kernel_casinh): Likewise.
	* math/k_casinhf.c (__kernel_casinhf): Likewise.
	* math/k_casinhl.c (__kernel_casinhl): Likewise.
	* math/s_catan.c (__catan): Use
	math_check_force_underflow_complex.
	* math/s_catanf.c (__catanf): Likewise.
	* math/s_catanh.c (__catanh): Likewise.
	* math/s_catanhf.c (__catanhf): Likewise.
	* math/s_catanhl.c (__catanhl): Likewise.
	* math/s_catanl.c (__catanl): Likewise.
	* math/s_ccosh.c (__ccosh): Likewise.
	* math/s_ccoshf.c (__ccoshf): Likewise.
	* math/s_ccoshl.c (__ccoshl): Likewise.
	* math/s_cexp.c (__cexp): Likewise.
	* math/s_cexpf.c (__cexpf): Likewise.
	* math/s_cexpl.c (__cexpl): Likewise.
	* math/s_clog.c (__clog): Use math_check_force_underflow_nonneg.
	* math/s_clog10.c (__clog10): Likewise.
	* math/s_clog10f.c (__clog10f): Likewise.
	* math/s_clog10l.c (__clog10l): Likewise.
	* math/s_clogf.c (__clogf): Likewise.
	* math/s_clogl.c (__clogl): Likewise.
	* math/s_csin.c (__csin): Use math_check_force_underflow_complex.
	* math/s_csinf.c (__csinf): Likewise.
	* math/s_csinh.c (__csinh): Likewise.
	* math/s_csinhf.c (__csinhf): Likewise.
	* math/s_csinhl.c (__csinhl): Likewise.
	* math/s_csinl.c (__csinl): Likewise.
	* math/s_csqrt.c (__csqrt): Use math_check_force_underflow.
	* math/s_csqrtf.c (__csqrtf): Likewise.
	* math/s_csqrtl.c (__csqrtl): Likewise.
	* math/s_ctan.c (__ctan): Use math_check_force_underflow_complex.
	* math/s_ctanf.c (__ctanf): Likewise.
	* math/s_ctanh.c (__ctanh): Likewise.
	* math/s_ctanhf.c (__ctanhf): Likewise.
	* math/s_ctanhl.c (__ctanhl): Likewise.
	* math/s_ctanl.c (__ctanl): Likewise.
	* stdlib/strtod_l.c (round_and_return): Use math_force_eval
	instead of volatile.
	* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Use
	math_check_force_underflow.
	* sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Do not use
	volatile when forcing underflow.
	* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
	Likewise.
	* sysdeps/ieee754/dbl-64/e_j1.c (__ieee754_j1): Use
	math_check_force_underflow.
	* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
	* sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise.
	* sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise.
	* sysdeps/ieee754/dbl-64/s_atan.c (atan): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/dbl-64/s_erf.c (__erf): Use
	math_check_force_underflow.
	* sysdeps/ieee754/dbl-64/s_expm1.c (__expm1): Likewise.
	* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Use math_force_eval
	instead of volatile.
	* sysdeps/ieee754/dbl-64/s_log1p.c (__log1p): Use
	math_check_force_underflow.
	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Likewise.
	* sysdeps/ieee754/dbl-64/s_tan.c (tan): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/dbl-64/s_tanh.c (__tanh): Use
	math_check_force_underflow.
	* sysdeps/ieee754/flt-32/e_asinf.c (__ieee754_asinf): Likewise.
	* sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise.
	* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_j1f): Use
	math_check_force_underflow.
	* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
	* sysdeps/ieee754/flt-32/e_sinhf.c (__ieee754_sinhf): Likewise.
	* sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise.
	* sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Likewise.
	* sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise.
	* sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Likewise.
	* sysdeps/ieee754/flt-32/s_erff.c (__erff): Likewise.
	* sysdeps/ieee754/flt-32/s_expm1f.c (__expm1f): Likewise.
	* sysdeps/ieee754/flt-32/s_log1pf.c (__log1pf): Likewise.
	* sysdeps/ieee754/flt-32/s_tanhf.c (__tanhf): Likewise.
	* sysdeps/ieee754/ldbl-128/e_asinl.c (__ieee754_asinl): Likewise.
	* sysdeps/ieee754/ldbl-128/e_atanhl.c (__ieee754_atanhl):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_expl.c (__ieee754_expl): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Use
	math_check_force_underflow.
	* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
	* sysdeps/ieee754/ldbl-128/e_sinhl.c (__ieee754_sinhl): Likewise.
	* sysdeps/ieee754/ldbl-128/k_sincosl.c (__kernel_sincosl):
	Likewise.
	* sysdeps/ieee754/ldbl-128/k_sinl.c (__kernel_sinl): Likewise.
	* sysdeps/ieee754/ldbl-128/k_tanl.c (__kernel_tanl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_asinhl.c (__asinhl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_atanl.c (__atanl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_erfl.c (__erfl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise.
	* sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Use math_force_eval
	instead of volatile.
	* sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Use
	math_check_force_underflow.
	* sysdeps/ieee754/ldbl-128/s_tanhl.c (__tanhl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl): Use
	math_check_force_underflow.
	* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Use math_check_force_underflow_nonneg.
	* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Use
	math_check_force_underflow.
	* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c (__ieee754_sinhl):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c (__kernel_sincosl):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/k_sinl.c (__kernel_sinl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_atanl.c (__atanl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_asinl.c (__ieee754_asinl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/ldbl-96/e_j1l.c (__ieee754_j1l): Use
	math_check_force_underflow.
	* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_sinhl.c (__ieee754_sinhl): Likewise.
	* sysdeps/ieee754/ldbl-96/k_sinl.c (__kernel_sinl): Likewise.
	* sysdeps/ieee754/ldbl-96/k_tanl.c (__kernel_tanl): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Use
	math_check_force_underflow.
	* sysdeps/ieee754/ldbl-96/s_erfl.c (__erfl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Use math_force_eval
	instead of volatile.
	* sysdeps/ieee754/ldbl-96/s_tanhl.c (__tanhl): Use
	math_check_force_underflow.
2015-09-23 22:42:30 +00:00
Joseph Myers
c8235dda72 Avoid excess range overflowing results from cosh, sinh, lgamma (bug 18980).
Various i386 libm functions return values with excess range and
precision; Wilco Dijkstra's patches to make isfinite etc. expand
inline cause this pre-existing issue to result in test failures (when
e.g. a result that overflows float but not long double gets counted as
overflowing for some purposes but not others).

This patch addresses those cases arising from functions defined in C,
adding a math_narrow_eval macro that forces values to memory to
eliminate excess precision if FLT_EVAL_METHOD indicates this is
needed, and is a no-op otherwise.  I'll convert existing uses of
volatile and asm for this purpose to use the new macro later, once
i386 has clean test results again (which requires fixes for .S files
as well).

Tested for x86_64 and x86.  Committed.

	[BZ #18980]
	* sysdeps/generic/math_private.h: Include <float.h>.
	(math_narrow_eval): New macro.
	[FLT_EVAL_METHOD != 0] (excess_precision): Likewise.
	* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Use
	math_narrow_eval on overflowing return value.
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r):
	Likewise.
	* sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise.
	* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Likewise.
	* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_sinhf.c (__ieee754_sinhf): Likewise.
2015-09-18 20:00:48 +00:00
Joseph Myers
050f29c188 Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558).
The existing implementations of lgamma functions (except for the ia64
versions) use the reflection formula for negative arguments.  This
suffers large inaccuracy from cancellation near zeros of lgamma (near
where the gamma function is +/- 1).

This patch fixes this inaccuracy.  For arguments above -2, there are
no zeros and no large cancellation, while for sufficiently large
negative arguments the zeros are so close to integers that even for
integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation
is not significant.  Thus, it is only necessary to take special care
about cancellation for arguments around a limited number of zeros.

Accordingly, this patch uses precomputed tables of relevant zeros,
expressed as the sum of two floating-point values.  The log of the
ratio of two sines can be computed accurately using log1p in cases
where log would lose accuracy.  The log of the ratio of two gamma(1-x)
values can be computed using Stirling's approximation (the difference
between two values of that approximation to lgamma being computable
without computing the two values and then subtracting), with
appropriate adjustments (which don't reduce accuracy too much) in
cases where 1-x is too small to use Stirling's approximation directly.

In the interval from -3 to -2, using the ratios of sines and of
gamma(1-x) can still produce too much cancellation between those two
parts of the computation (and that interval is also the worst interval
for computing the ratio between gamma(1-x) values, which computation
becomes more accurate, while being less critical for the final result,
for larger 1-x).  Because this can result in errors slightly above
those accepted in glibc, this interval is instead dealt with by
polynomial approximations.  Separate polynomial approximations to
(|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8
from -3 to -2, where n (-3 or -2) is the nearest integer to the
1/8-interval and x0 is the zero of lgamma in the relevant half-integer
interval (-3 to -2.5 or -2.5 to -2).

Together, the two approaches are intended to give sufficient accuracy
for all negative arguments in the problem range.  Outside that range,
the previous implementation continues to be used.

Tested for x86_64, x86, mips64 and powerpc.  The mips64 and powerpc
testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm
with large negative arguments giving spurious "invalid" exceptions
(exposed by newly added tests for cases this patch doesn't affect the
logic for); I'll address those problems separately.

	[BZ #2542]
	[BZ #2543]
	[BZ #2558]
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call
	__lgamma_neg for arguments from -28.0 to -2.0.
	* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call
	__lgamma_negf for arguments from -15.0 to -2.0.
	* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
	Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0.
	* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r):
	Call __lgamma_negl for arguments from -33.0 to -2.0.
	* sysdeps/ieee754/dbl-64/lgamma_neg.c: New file.
	* sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise.
	* sysdeps/generic/math_private.h (__lgamma_negf): New prototype.
	(__lgamma_neg): Likewise.
	(__lgamma_negl): Likewise.
	(__lgamma_product): Likewise.
	(__lgamma_productl): Likewise.
	* math/Makefile (libm-calls): Add lgamma_neg and lgamma_product.
	* math/auto-libm-test-in: Add more tests of lgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-10 22:27:58 +00:00
Joseph Myers
8116321f65 Fix libm feupdateenv namespace (bug 17748).
Concluding the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of feupdateenv by making it a weak alias for
__feupdateenv and making the affected code call __feupdateenv.

Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).  Also tested for ARM
(soft-float) that the math.h linknamespace tests now pass.

	[BZ #17748]
	* include/fenv.h (__feupdateenv): Use libm_hidden_proto.
	* math/feupdateenv.c (__feupdateenv): Use libm_hidden_def.
	* sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Rename to
	__feupdateenv and define as weak alias of __feupdateenv.  Use
	libm_hidden_weak.
	* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Use
	libm_hidden_def.
	* sysdeps/arm/feupdateenv.c (feupdateenv): Rename to __feupdateenv
	and define as weak alias of __feupdateenv.  Use libm_hidden_weak.
	* sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Use
	libm_hidden_def.
	* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Rename to
	__feupdateenv and define as weak alias of __feupdateenv.  Use
	libm_hidden_weak.
	* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Use
	libm_hidden_def.
	* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Rename to
	__feupdateenv and define as weak alias of __feupdateenv.  Use
	libm_hidden_weak.
	* sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Use
	libm_hidden_def.
	* sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c
	(__feupdateenv): Likewise.
	* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Rename to
	__feupdateenv and define as weak alias of __feupdateenv.  Use
	libm_hidden_weak.
	* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Use
	libm_hidden_def.
	* sysdeps/tile/math_private.h (__feupdateenv): New inline
	function.
	* sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Use
	libm_hidden_def.
	* sysdeps/generic/math_private.h (default_libc_feupdateenv): Call
	__feupdateenv instead of feupdateenv.
	(default_libc_feupdateenv_test): Likewise.
	(libc_feresetround_ctx): Likewise.
2015-01-07 19:01:20 +00:00
Joseph Myers
01238691bb Fix libm fesetround namespace (bug 17748).
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fesetround by making it a weak alias of
__fesetround and making the affected code call __fesetround.  An
existing __fesetround function in fenv_libc.h for powerpc is renamed
to __fesetround_inline.

Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch).  Also tested for ARM
(soft-float) that fesetround failures disappear from the linknamespace
test results (feupdateenv remains to be addressed to complete fixing
bug 17748).

	[BZ #17748]
	* include/fenv.h (__fesetround): Declare.  Use libm_hidden_proto.
	* math/fesetround.c (fesetround): Rename to __fesetround and
	define as weak alias of __fesetround.  Use libm_hidden_weak.
	* sysdeps/aarch64/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/alpha/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/arm/fesetround.c (fesetround): Likewise.
	* sysdeps/hppa/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/i386/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/ia64/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/m68k/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/mips/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/powerpc/fpu/fenv_libc.h (__fesetround): Rename to
	__fesetround_inline.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_fesetround_ppc): Call
	__fesetround_inline instead of __fesetround.
	* sysdeps/powerpc/fpu/fesetround.c (fesetround): Rename to
	__fesetround and define as weak alias of __fesetround.  Use
	libm_hidden_weak.  Call __fesetround_inline instead of
	__fesetround.
	* sysdeps/powerpc/nofpu/fesetround.c (fesetround): Rename to
	__fesetround and define as weak alias of __fesetround.  Use
	libm_hidden_weak.
	* sysdeps/powerpc/powerpc32/e500/nofpu/fesetround.c (fesetround):
	Likewise.
	* sysdeps/s390/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/sh/sh4/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/sparc/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/tile/math_private.h (__fesetround): New inline function.
	* sysdeps/x86_64/fpu/fesetround.c (fesetround): Rename to
	__fesetround and define as weak alias of __fesetround.  Use
	libm_hidden_weak.
	* sysdeps/generic/math_private.h (default_libc_fesetround): Call
	__fesetround instead of fesetround.
	(default_libc_feholdexcept_setround): Likewise.
	(libc_feholdsetround_ctx): Likewise.
	(libc_feholdsetround_noex_ctx): Likewise.
2015-01-07 00:41:23 +00:00
Joseph Myers
cd42798aef Fix libm fesetenv namespace (bug 17748).
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fesetenv by making it a weak alias of
__fesetenv and making the affected code (including various copies of
feupdateenv which also gets called from C90 functions) call
__fesetenv.

Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch).  Also tested for ARM
(soft-float) that fesetenv failures disappear from the linknamespace
test results (fsetround and feupdateenv remain to be addressed to
complete fixing bug 17748).

	[BZ #17748]
	* include/fenv.h (__fesetenv): Use libm_hidden_proto.
	* math/fesetenv.c (__fesetenv): Use libm_hidden_def.
	* sysdeps/aarch64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv
	and define as weak alias of __fesetenv.  Use libm_hidden_weak.
	* sysdeps/alpha/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
	* sysdeps/arm/fesetenv.c (fesetenv): Rename to __fesetenv and
	define as weak alias of __fesetenv.  Use libm_hidden_weak.
	* sysdeps/hppa/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/i386/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
	* sysdeps/ia64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and
	define as weak alias of __fesetenv.  Use libm_hidden_weak.
	* sysdeps/m68k/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
	* sysdeps/mips/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and
	define as weak alias of __fesetenv.  Use libm_hidden_weak.
	* sysdeps/powerpc/fpu/fesetenv.c (__fesetenv): Use
	libm_hidden_def.
	* sysdeps/powerpc/nofpu/fesetenv.c (__fesetenv): Likewise.
	* sysdeps/powerpc/powerpc32/e500/nofpu/fesetenv.c (__fesetenv):
	Likewise.
	* sysdeps/s390/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and
	define as weak alias of __fesetenv.  Use libm_hidden_weak.
	* sysdeps/sh/sh4/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/sparc/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
	* sysdeps/tile/math_private.h (__fesetenv): New inline function.
	* sysdeps/x86_64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv
	and define as weak alias of __fesetenv.  Use libm_hidden_weak.
	* sysdeps/generic/math_private.h (default_libc_fesetenv): Use
	__fesetenv instead of fesetenv.
	(libc_feresetround_noex_ctx): Likewise.
	* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c
	(__feupdateenv): Likewise.
	* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
	* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Likewise.
	* sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Likewise.
2015-01-06 23:36:20 +00:00
Joseph Myers
ef9faf1385 Fix libm feholdexcept namespace (bug 17748).
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of feholdexcept by making it a weak alias of
__feholdexcept and making the affected code call __feholdexcept.

Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch).  Also tested for ARM
(soft-float) that feholdexcept failures disappear from the
linknamespace test failures (fesetenv, fsetround and feupdateenv
remain to be addressed to complete fixing bug 17748).

	[BZ #17748]
	* include/fenv.h (__feholdexcept): Declare.  Use
	libm_hidden_proto.
	* math/feholdexcpt.c (feholdexcept): Rename to __feholdexcept and
	define as weak alias of __feholdexcept.  Use libm_hidden_weak.
	* sysdeps/aarch64/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/alpha/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/arm/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/hppa/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/i386/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/ia64/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/m68k/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/mips/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/powerpc/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/powerpc/nofpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/powerpc/powerpc32/e500/nofpu/feholdexcpt.c
	(feholdexcept): Likewise.
	* sysdeps/s390/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/sh/sh4/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/sparc/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/x86_64/fpu/feholdexcpt.c (feholdexcept): Likewise.
	* sysdeps/generic/math_private.h (default_libc_feholdexcept): Use
	__feholdexcept instead of feholdexcept.
	(default_libc_feholdexcept_setround): Likewise.
2015-01-05 23:06:14 +00:00
Joseph Myers
73a268c759 Fix libm fegetenv namespace (bug 17748).
Some C90 libm functions call fegetenv via libc_feholdsetround*
functions in math_private.h.  This patch makes them call __fegetenv
instead, making fegetenv into a weak alias for __fegetenv as needed.

Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch).  Also tested for ARM
(soft-float) that fegetenv failures disappear from the linknamespace
test failures (however, similar fixes will also be needed for
fegetround, feholdexcept, fesetenv, fesetround and feupdateenv before
this set of namespace issues covered by bug 17748 is fully fixed and
those linknamespace tests start passing).

	[BZ #17748]
	* include/fenv.h (__fegetenv): Use libm_hidden_proto.
	* math/fegetenv.c (__fegetenv): Use libm_hidden_def.
	* sysdeps/aarch64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv
	and define as weak alias of __fegetenv.  Use libm_hidden_weak.
	* sysdeps/alpha/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
	* sysdeps/arm/fegetenv.c (fegetenv): Rename to __fegetenv and
	define as weak alias of __fegetenv.  Use libm_hidden_weak.
	* sysdeps/hppa/fpu/fegetenv.c (fegetenv): Likewise.
	* sysdeps/i386/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
	* sysdeps/ia64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and
	define as weak alias of __fegetenv.  Use libm_hidden_weak.
	* sysdeps/m68k/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
	* sysdeps/mips/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and
	define as weak alias of __fegetenv.  Use libm_hidden_weak.
	* sysdeps/powerpc/fpu/fegetenv.c (__fegetenv): Use
	libm_hidden_def.
	* sysdeps/powerpc/nofpu/fegetenv.c (__fegetenv): Likewise.
	* sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (__fegetenv):
	Likewise.
	* sysdeps/s390/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and
	define as weak alias of __fegetenv.  Use libm_hidden_weak.
	* sysdeps/sh/sh4/fpu/fegetenv.c (fegetenv): Likewise.
	* sysdeps/sparc/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
	* sysdeps/tile/math_private.h (__fegetenv): New inline function.
	* sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv
	and define as weak alias of __fegetenv.  Use libm_hidden_weak.
	* sysdeps/generic/math_private.h (libc_feholdsetround_ctx): Use
	__fegetenv instead of fegetenv.
	(libc_feholdsetround_noex_ctx): Likewise.
2014-12-31 22:07:52 +00:00
Wilco Dijkstra
e6d90d675d Add generic HAVE_RM_CTX implementation
This patch adds a generic implementation of HAVE_RM_CTX using standard
fenv calls. As a result math functions using SET_RESTORE_ROUND* macros
do not suffer from a large slowdown on targets which do not implement
optimized libc_fe*_ctx inline functions. Most of the libc_fe* inline
functions are now unused and could be removed in the future (there are
a few math functions left which use a mixture of standard fenv calls
and libc_fe* inline functions - they could be updated to use
SET_RESTORE_ROUND or improved to avoid expensive fenv manipulations
across just a few FP instructions).

libc_feholdsetround*_noex_ctx is added to enable better optimization of
SET_RESTORE_ROUND_NOEX* implementations.

Performance measurements on ARM and x86 of sin() show significant gains
over the current default, fairly close to a highly optimized fenv_private:

                        ARM   x86
no fenv_private      : 100%  100%
generic HAVE_RM_CTX  : 250%  350%
fenv_private (CTX)   : 250%  450%

2014-06-23  Will Newton  <will.newton@linaro.org>
	    Wilco  <wdijkstr@arm.com>

	* sysdeps/generic/math_private.h: Add generic HAVE_RM_CTX
	implementation.  Include get-rounding-mode.h.
	[!HAVE_RM_CTX]: Define HAVE_RM_CTX to zero.
	[!libc_feholdsetround_noex_ctx]: Define
	libc_feholdsetround_noex_ctx.
	[!libc_feholdsetround_noexf_ctx]: Define
	libc_feholdsetround_noexf_ctx.
	[!libc_feholdsetround_noexl_ctx]: Define
	libc_feholdsetround_noexl_ctx.
	(libc_feholdsetround_ctx): New function.
	(libc_feresetround_ctx): New function.
	(libc_feholdsetround_noex_ctx): New function.
	(libc_feresetround_noex_ctx): New function.
2014-06-23 17:29:00 +01:00
Will Newton
d0ac132472 Revert "Fix HAVE_RM_CTX -Wundef warnings"
This reverts commit 9290130a8d.
2014-03-17 20:36:06 +00:00
Will Newton
9290130a8d Fix HAVE_RM_CTX -Wundef warnings
ChangeLog:

2014-03-17  Will Newton  <will.newton@linaro.org>

	* sysdeps/generic/math_private.h: Check whether
	HAVE_RM_CTX is defined with #ifdef rather
	than #if.
2014-03-17 16:05:24 +00:00
Siddhesh Poyarekar
09544cbcd6 Consolidate multiple precision sin/cos functions 2013-10-08 11:50:17 +05:30
Siddhesh Poyarekar
2506109403 Set/restore rounding mode only when needed
The most common use case of math functions is with default rounding
mode, i.e. rounding to nearest.  Setting and restoring rounding mode
is an unnecessary overhead for this, so I've added support for a
context, which does the set/restore only if the FP status needs a
change.  The code is written such that only x86 uses these.  Other
architectures should be unaffected by it, but would definitely benefit
if the set/restore has as much overhead relative to the rest of the
code, as the x86 bits do.

Here's a summary of the performance improvement due to these
improvements; I've only mentioned functions that use the set/restore
and have benchmark inputs for x86_64:

Before:

cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy
exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy
pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy
sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy
tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy

After:

cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy
exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy
pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy
sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy
tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy

So the improvements are:

cos: 27.9089%
exp: 22.6919%
pow: 4.01564%
sin: 19.1585%
tan: 1.96086%

The downside of the change is that it will have an adverse performance
impact on non-default rounding modes, but I think the tradeoff is
justified.
2013-06-12 10:36:48 +05:30
Siddhesh Poyarekar
4c60cb0c83 Skip modifying exception mask and flags in SET_RESTORE_ROUND_53BIT
We only need to set/restore rounding mode to ensure correct
computation for non-default rounding modes.
2013-06-05 13:56:19 +05:30
Joseph Myers
d8cd06db62 Improve tgamma accuracy (bugs 2546, 2560, 5159, 15426). 2013-05-08 11:58:18 +00:00
Joseph Myers
5b5b04d628 Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796). 2012-11-03 19:48:53 +00:00
Joseph Myers
d032e0d29b Fix inaccuracy of clog, clog10 near |z| = 1 (bug 13629). 2012-09-25 19:43:49 +00:00
Adhemerval Zanella
76da726532 Fix ilogb exception and errno (bug 6794)
[BZ #6794]
Following Joseph comments about bug 6794, here is a proposed fix. It turned out
to be a large fix mainly because I had to move some file along to follow libm
files/names conventions.

Basically I have added wrappers (w_ilogb.c, w_ilogbf.c, w_ilogbl.c) that now calls
the symbol '__ieee754_ilogb'. The wrappers checks for '__ieee754_ilogb' output and
set the errno and raise exceptions as expected.

The '__ieee754_ilogb' is implemented in sysdeps. I have moved the 's_ilogb[f|l]' files
to e_ilogb[f|l] and renamed the '__ilogb[f|l]' to '__ieee754_ilogb[f|l]'.

I also found out a bug in i386 and x86-64 assembly coded ilogb implementation where
it raises a FE_DIVBYZERO when argument is '0.0'. I corrected this issue as well.

Finally I added the errno and FE_INVALID tests for 0.0, NaN and +-InF argument. Tested
on i386, x86-64, ppc32 and ppc64.
2012-04-17 22:12:53 +02:00
Joseph Myers
41bf21a1e7 Avoid overflows from long double functions using __kernel_standard. 2012-03-28 09:32:12 +00:00
Richard Henderson
0fe0f1f86f Create and use libc_feupdateenv_test.
We can reduce the number of STMXCSR, and often we can avoid the
call to __feraiseexcept.
2012-03-19 06:50:41 -07:00
Richard Henderson
eb92c487b3 Create and use SET_RESTORE_ROUND{,_NOEX,_53BIT}{,F,L}. 2012-03-19 06:49:44 -07:00
Richard Henderson
b4dabbb47a Convert libc_feholdexcept et al from macros to inline functions. 2012-03-19 06:48:27 -07:00
Richard Henderson
4851a949b4 Make inline __isnan, __isinf_ns, __finite generic.
For code generation to stay identical on x86_64, this requires that
we define the fp word manipulation macros before including the
generic header.
2012-03-19 06:47:43 -07:00
Andreas Jaeger
c4814b6b3a Implement and use libc_feholdexcept_setround_53bit and libc_feupdateenv_53bit
so that double arithmetic in s_sin is done in 53 bit (without extend i386 double precision)
2012-03-14 17:20:10 +01:00
Richard Henderson
5f0a5daeee Move math/math_private.h to sysdeps/generic/math_private.h.
This reverts commit 60d6f5a6f5.
2012-03-09 16:12:17 -08:00