Commit Graph

505 Commits

Author SHA1 Message Date
Joseph Myers
26b0bf9600 Avoid "inexact" exceptions in i386/x86_64 ceil functions (bug 15479).
As discussed in
<https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS
18661-1 disallows ceil, floor, round and trunc functions from raising
the "inexact" exception, in accordance with general IEEE 754 semantics
for when that exception is raised.  Fixing this for x87 floating point
is more complicated than for the other versions of these functions,
because they use the frndint instruction that raises "inexact" and
this can only be avoided by saving and restoring the whole
floating-point environment.

As I noted in
<https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have
now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7,
such that GCC will inline these functions on x86, without caring about
"inexact", when the default -ffp-int-builtin-inexact is in effect.
This allows users to get optimized code depending on the options they
pass to the compiler, while making the out-of-line functions follow TS
18661-1 semantics and avoid "inexact".

This patch duly fixes the out-of-line ceil function implementations to
avoid "inexact", in the same way as the nearbyint implementations.

I do not know how the performance of implementations such as these
based on saving the environment and changing the rounding mode
temporarily compares to that of the C versions or SSE 4.1 versions (of
course, for 32-bit x86 SSE implementations still need to get the
return value in an x87 register); it's entirely possible other
implementations could be faster in some cases.

Tested for x86_64 and x86.

	[BZ #15479]
	* sysdeps/i386/fpu/s_ceil.S (__ceil): Save and restore
	floating-point environment rather than just control word.
	* sysdeps/i386/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/i386/fpu/s_ceill.S (__ceill): Save and restore
	floating-point environment, with "invalid" exceptions merged in,
	rather than just control word.
	* sysdeps/x86_64/fpu/s_ceill.S (__ceill): Likewise.
	* math/libm-test.inc (ceil_test_data): Do not allow spurious
	"inexact" exceptions.
2016-06-27 17:24:30 +00:00
Joseph Myers
40244be372 Fix i386/x86_64 scalbl with sNaN input (bug 20296).
The x86_64 and i386 versions of scalbl return sNaN for some cases of
sNaN input and are missing "invalid" exceptions for other cases.  This
results from overly complicated code that either returns a NaN input,
or discards both inputs when one is NaN and loads a NaN from memory.
This patch fixes this by simplifying the code to add the arguments
when either one is NaN.

Tested for x86_64 and x86.

	[BZ #20296]
	* sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Add arguments
	when either argument is a NaN.
	* sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
	* math/libm-test.inc (scalb_test_data): Add sNaN tests.
2016-06-23 22:17:41 +00:00
Joseph Myers
4e9bf327ad Simplify x86 nearbyint functions.
The i386 implementations of nearbyint functions, and x86_64
nearbyintl, contain code to mask the "inexact" exception.  However,
the fnstenv instruction has the effect of masking all exceptions, so
this masking code has been redundant since fnstenv was added to those
implementations (by commit 846d9a4a3acdb4939ca7bf6aed48f9f6f26911be;
commit 71d1b0166b added the test
math/test-nearbyint-except-2.c that verifies these functions do work
when called with "inexact" traps enabled); this patch removes the
redundant code.

Tested for x86_64 and x86.

	* sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Do not mask
	"inexact" exceptions after fnstenv.
	* sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
	* sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise.
	* sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.
2016-06-22 15:40:30 +00:00
Joseph Myers
228a78c21b Fix i386 fdim double rounding (bug 20255).
fdim suffers from double rounding on i386 because subtracting two
double values can produce an inexact long double value exactly half
way between two double values.  This patch fixes this by creating an
i386-specific version of fdim - C, based on the generic version,
unlike the previous .S version - which sets the x87 precision control
to double precision for the subtraction and then restores it
afterwards.  As noted in the comment added, there are no issues of
double rounding for subnormals (a case that setting precision control
does not address) because subtraction cannot produce an inexact result
in the subnormal range.

Tested for x86_64 and x86.

	[BZ #20255]
	* sysdeps/i386/fpu/s_fdim.c: New file.  Based on math/s_fdim.c.
	* math/libm-test.inc (fdim_test_data): Add another test.
2016-06-14 16:41:50 +00:00
Joseph Myers
f4015c8a86 Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256).
Some architectures have their own versions of fdim functions, which
are missing errno setting (bug 6796) and may also return sNaN instead
of qNaN for sNaN input, in the case of the x86 / x86_64 long double
versions (bug 20256).

These versions are not actually doing anything that a compiler
couldn't generate, just straightforward comparisons / arithmetic (and,
in the x86 / x86_64 case, testing for NaNs with fxam, which isn't
actually needed once you use an unordered comparison and let the NaNs
pass through the same subtraction as non-NaN inputs).  This patch
removes the x86 / x86_64 / powerpc versions, so that those
architectures use the generic C versions, which correctly handle
setting errno and deal properly with sNaN inputs.  This seems better
than dealing with setting errno in lots of .S versions.

The i386 versions also return results with excess range and precision,
which is not appropriate for a function exactly defined by reference
to IEEE operations.  For errno setting to work correctly on overflow,
it's necessary to remove excess range with math_narrow_eval, which
this patch duly does in the float and double versions so that the
tests can reliably pass on x86.  For float, this avoids any double
rounding issues as the long double precision is more than twice that
of float.  For double, double rounding issues will need to be
addressed separately, so this patch does not fully fix bug 20255.

Tested for x86_64, x86 and powerpc.

	[BZ #6796]
	[BZ #20255]
	[BZ #20256]
	* math/s_fdim.c: Include <math_private.h>.
	(__fdim): Use math_narrow_eval on result.
	* math/s_fdimf.c: Include <math_private.h>.
	(__fdimf): Use math_narrow_eval on result.
	* sysdeps/i386/fpu/s_fdim.S: Remove file.
	* sysdeps/i386/fpu/s_fdimf.S: Likewise.
	* sysdeps/i386/fpu/s_fdiml.S: Likewise.
	* sysdeps/i386/i686/fpu/s_fdim.S: Likewise.
	* sysdeps/i386/i686/fpu/s_fdimf.S: Likewise.
	* sysdeps/i386/i686/fpu/s_fdiml.S: Likewise.
	* sysdeps/powerpc/fpu/s_fdim.c: Likewise.
	* sysdeps/powerpc/fpu/s_fdimf.c: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise.
	* sysdeps/x86_64/fpu/s_fdiml.S: Likewise.
	* math/libm-test.inc (fdim_test_data): Expect errno setting on
	overflow.  Add sNaN tests.
2016-06-14 16:04:19 +00:00
Joseph Myers
88283451b2 Fix frexp (NaN) (bug 20250).
Various implementations of frexp functions return sNaN for sNaN
input.  This patch fixes them to add such arguments to themselves so
that qNaN is returned.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #20250]
	* sysdeps/i386/fpu/s_frexpl.S (__frexpl): Add non-finite input to
	itself.
	* sysdeps/ieee754/dbl-64/s_frexp.c (__frexp): Add non-finite or
	zero input to itself.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c (__frexp):
	Likewise.
	* sysdeps/ieee754/flt-32/s_frexpf.c (__frexpf): Likewise.
	* sysdeps/ieee754/ldbl-128/s_frexpl.c (__frexpl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_frexpl.c (__frexpl): Likewise.
	* math/libm-test.inc (frexp_test_data): Add sNaN tests.
2016-06-13 17:27:19 +00:00
Joseph Myers
f00faa4a43 Fix i386/x86_64 log2l (sNaN) (bug 20235).
The i386/x86_64 versions of log2l return sNaN for sNaN input.  This
patch fixes them to add NaN inputs to themselves so that qNaN is
returned in this case.

Tested for x86_64 and x86.

	[BZ #20235]
	* sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Add NaN input to
	itself.
	* sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Likewise.
	* math/libm-test.inc (log2_test_data): Add sNaN tests.
2016-06-09 18:04:30 +00:00
Joseph Myers
8c010e2f71 Fix i386/x86_64 log1pl (sNaN) (bug 20229).
The i386/x86_64 versions of log1pl return sNaN for sNaN input.  This
patch fixes them to add a NaN input to itself so that qNaN is returned
in this case.

Tested for x86_64 and x86.

	[BZ #20229]
	* sysdeps/i386/fpu/s_log1pl.S (__log1pl): Add NaN input to itself.
	* sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Likewise.
	* math/libm-test.inc (log1p_test_data): Add sNaN tests.
2016-06-08 23:11:42 +00:00
Joseph Myers
09096b3615 Fix i386/x86_64 log10l (sNaN) (bug 20228).
The i386/x86_64 versions of log10l return sNaN for sNaN input.  This
patch fixes them to add a NaN input to itself so that qNaN is returned
in this case.

Tested for x86_64 and x86.

	[BZ #20228]
	* sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Add NaN input to
	itself.
	* sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Likewise.
	* math/libm-test.inc (log10_test_data): Add sNaN tests.
2016-06-08 22:59:18 +00:00
Joseph Myers
df179d8808 Fix i386/x86_64 logl (sNaN) (bug 20227).
The i386/x86_64 versions of logl return sNaN for sNaN input.  This
patch fixes them to add a NaN input to itself so that qNaN is returned
in this case.

Tested for x86_64 and x86 (including a build for i586 to cover the
non-i686 logl version).

	[BZ #20227]
	* sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Add NaN input to
	itself.
	* sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise.
	* sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise.
	* math/libm-test.inc (log_test_data): Add sNaN tests.
2016-06-08 22:24:06 +00:00
Joseph Myers
9bd3ef8e19 Fix i386/x86_64 expl, exp10l, expm1l for sNaN input (bug 20226).
The i386 and x86_64 implementations of expl, exp10l and expm1l (code
shared between the functions) return sNaN for sNaN input.  This patch
fixes them to add NaN inputs to themselves so that qNaN is returned in
this case.

Tested for x86_64 and x86.

	[BZ #20226]
	* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Add NaN argument to
	itself.
	* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise.
	* math/libm-test.inc (exp_test_data): Add sNaN tests.
	(exp10_test_data): Likewise.
	(expm1_test_data): Likewise.
2016-06-08 21:55:06 +00:00
Joseph Myers
40720ec9f9 Fix i386 cbrtl (sNaN) (bug 20224).
The i386 version of cbrtl returns sNaN (without raising any
exceptions) for sNaN input.  This patch fixes it to add non-finite
arguments to themselves (the code path in question is also reached for
zero arguments, for which adding them to themselves is also harmless),
so that "invalid" is raised and qNaN returned.

Tested for x86_64 and x86.

	[BZ #20224]
	* sysdeps/i386/fpu/s_cbrtl.S (__cbrtl): Add non-finite or zero
	argument to itself.
	* math/libm-test.inc (cbrt_test_data): Add sNaN tests.
2016-06-08 21:02:40 +00:00
Joseph Myers
8fa8a330f9 Fix i386 atanhl (sNaN) (bug 20219).
The i386 version of atanhl returns sNaN for sNaN input.  This patch
fixes it to add NaN arguments to themselves so it returns qNaN in this
case.

Tested for x86_64 and x86.

	[BZ #20219]
	* sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): Add NaN argument
	to itself.
	* math/libm-test.inc (atanh_test_data): Add sNaN tests.
2016-06-07 23:08:32 +00:00
Joseph Myers
c23805a95d Fix i386 asinhl (sNaN) (bug 20218).
The i386 version of asinhl returns sNaN (without raising any
exceptions) for sNaN input.  This patch fixes it to add non-finite
arguments to themselves, so that "invalid" is raised and qNaN
returned.

Tested for x86_64 and x86.

	[BZ #20218]
	* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Add non-finite argument
	to itself.
	* math/libm-test.inc (asinh_test_data): Add sNaN tests.
2016-06-07 22:54:58 +00:00
Joseph Myers
8cbd1453ec Fix x86/x86_64 nextafterl incrementing negative subnormals (bug 20205).
The x86 / x86_64 implementation of nextafterl (also used for
nexttowardl) produces incorrect results (NaNs) when negative
subnormals, the low 32 bits of whose mantissa are zero, are
incremented towards zero.  This patch fixes this by disabling the
logic to decrement the exponent in that case.

Tested for x86_64 and x86.

	[BZ #20205]
	* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Do not adjust
	exponent when incrementing negative subnormal with low mantissa
	word zero.
	* math/libm-test.inc (nextafter_test_data) [TEST_COND_intel96]:
	Add another test.
2016-06-03 21:30:12 +00:00
Joseph Myers
c898991d8b Fix x86_64 / x86 powl inaccuracy for integer exponents (bug 19848).
Bug 19848 reports cases where powl on x86 / x86_64 has error
accumulation, for small integer exponents, larger than permitted by
glibc's accuracy goals, at least in some rounding modes.  This patch
further restricts the exponent range for which the
small-integer-exponent logic is used to limit the possible error
accumulation.

Tested for x86_64 and x86 and ulps updated accordingly.

	[BZ #19848]
	* sysdeps/i386/fpu/e_powl.S (p3): Rename to p2 and change value
	from 8 to 4.
	(__ieee754_powl): Compare integer exponent against 4 not 8.
	* sysdeps/x86_64/fpu/e_powl.S (p3): Rename to p2 and change value
	from 8 to 4.
	(__ieee754_powl): Compare integer exponent against 4 not 8.
	* math/auto-libm-test-in: Add more tests of pow.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2016-03-24 01:32:52 +00:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Aurelien Jarno
5537f466d6 i386: move ULPs to i686/multiarch and regenerate new ones for i386
The i386 ULPs are actually the i686/multiarch ones. The i686/multiarch
float ULPs are more precise as the SSE2 version (when available) uses
double for the cosf and sinf functions.

On the other hand the higher precision of the x86 FPU improves the
precision for a few other math functions.

	* sysdeps/i386/fpu/libm-test-ulps: Move to ....
	* sysdeps/i386/i686/multiarch/fpu/libm-test-ulps: ...here.
	* sysdeps/i386/fpu/libm-test-ulps: Regenerate.
2015-12-20 16:36:45 +01:00
Joseph Myers
2d2c271aea Fix math_private.h multiple include guards.
Various math_private.h headers are guarded by "#ifndef
_MATH_PRIVATE_H", but never define the macro.  Nothing else defines
the macro either (the generic math_private.h that they include defines
a different macro, _MATH_PRIVATE_H_), so those guards are ineffective.

With the recent inclusion of s_sin.c in s_sincos.c, this breaks the
build for MIPS, since the build of s_sincos.c ends up including
<math_private.h> twice and the MIPS version defines inline functions
such as libc_feholdexcept_mips, without a separate fenv_private.h
header with its own guards such as some architectures have.

This patch fixes all the problem headers to use architecture-specific
guard macro names, and to define those macros in the headers they
guard, just as some architectures already do.

Tested for x86 (testsuite, and that installed shared libraries are
unchanged by the patch), and for mips64 (that it fixes the build).

	* sysdeps/arm/math_private.h [!_MATH_PRIVATE_H]: Change guard to
	[!ARM_MATH_PRIVATE_H].
	[!ARM_MATH_PRIVATE_H] (ARM_MATH_PRIVATE_H): Define macro.
	* sysdeps/hppa/math_private.h [!_MATH_PRIVATE_H]: Change guard to
	[!HPPA_MATH_PRIVATE_H].
	[!HPPA_MATH_PRIVATE_H] (HPPA_MATH_PRIVATE_H): Define macro.
	* sysdeps/i386/fpu/math_private.h [!_MATH_PRIVATE_H]: Change guard
	to [!I386_MATH_PRIVATE_H].
	[!I386_MATH_PRIVATE_H] (I386_MATH_PRIVATE_H): Define macro.
	* sysdeps/m68k/m680x0/fpu/math_private.h [!_MATH_PRIVATE_H]:
	Change guard to [!M68K_MATH_PRIVATE_H].
	[!M68K_MATH_PRIVATE_H] (M68K_MATH_PRIVATE_H): Define macro.
	* sysdeps/microblaze/math_private.h [!_MATH_PRIVATE_H]: Change
	guard to [!MICROBLAZE_MATH_PRIVATE_H].
	[!MICROBLAZE_MATH_PRIVATE_H] (MICROBLAZE_MATH_PRIVATE_H): Define
	macro.
	* sysdeps/mips/math_private.h [!_MATH_PRIVATE_H]: Change guard to
	[!MIPS_MATH_PRIVATE_H].
	[!MIPS_MATH_PRIVATE_H] (MIPS_MATH_PRIVATE_H): Define macro.
	* sysdeps/nios2/math_private.h [!_MATH_PRIVATE_H]: Change guard to
	[!NIO2_MATH_PRIVATE_H].
	[!NIO2_MATH_PRIVATE_H] (NIO2_MATH_PRIVATE_H): Define macro.
	* sysdeps/tile/math_private.h [!_MATH_PRIVATE_H]: Change guard to
	[!TILE_MATH_PRIVATE_H].
	[!TILE_MATH_PRIVATE_H] (TILE_MATH_PRIVATE_H): Define macro.
2015-11-20 23:46:23 +00:00
Joseph Myers
01189b083b Fix i386/x86_64 log* (1) zero sign for -ffinite-math-only (bug 19213).
For the -ffinite-math-only versions of various x86_64 and x86 log*
functions, a zero result from log* (1) is returned with incorrect sign
in round-downward mode.  This patch fixes this in a similar way to the
previous fixes for the non-*_finite versions of the functions.

Tested for x86_64 and x86 (including an i586 build), together with a
patch that will be applied separately to enable the main libm-test.inc
tests for the finite-math-only functions.

	[BZ #19213]
	* sysdeps/i386/fpu/e_log.S (__log_finite): Ensure +0 is always
	returned for argument 1.
	* sysdeps/i386/fpu/e_logf.S (__logf_finite): Likewise.
	* sysdeps/i386/fpu/e_logl.S (__logl_finite): Likewise.
	* sysdeps/i386/i686/fpu/e_logl.S (__logl_finite): Likewise.
	* sysdeps/x86_64/fpu/e_log10l.S (__log10l_finite): Likewise.
	* sysdeps/x86_64/fpu/e_log2l.S (__log2l_finite): Likewise.
	* sysdeps/x86_64/fpu/e_logl.S (__logl_finite): Likewise.
2015-11-05 21:56:31 +00:00
Joseph Myers
199a338654 Add more libm tests (scalb*, signbit, sin, sincos, sinh, sqrt, tan, tanh, tgamma, y0, y1, yn, significand).
This patch improves the libm test coverage for a few more functions.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of sin, sincos, sinh,
	sqrt, tan, tanh, y0, y1 and yn.
	* math/auto-libm-test-out: Regenerated.
	* math/libm-test.inc (scalb_test_data): Add more tests.
	(scalbn_test_data): Likewise.
	(scalbln_test_data): Likewise.
	(signbit_test_data): Likewise.
	(sin_test_data): Likewise.
	(sincos_test_data): Likewise.
	(sinh_test_data): Likewise.
	(sqrt_test_data): Likewise.
	(tan_test_data): Likewise.
	(tanh_test_data): Likewise.
	(tgamma_test_data): Likewise.
	(y0_test_data): Likewise.
	(y1_test_data): Likewise.
	(yn_test_data): Likewise.
	(significand_test_data): Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-11-04 00:45:23 +00:00
Joseph Myers
85422c2acb Make nextafter, nexttoward set errno (bug 6799).
nextafter and nexttoward fail to set errno on overflow and underflow.
This patch makes them do so in cases that should include all the cases
where such errno setting is required by glibc's goals for when to set
errno (but not all cases of underflow where the result is nonzero and
so glibc's goals do not require errno setting).

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #6799]
	* math/s_nextafter.c: Include <errno.h>.
	(__nextafter): Set errno on overflow and underflow.
	* math/s_nexttowardf.c: Include <errno.h>.
	(__nexttowardf): Set errno on overflow and underflow.
	* sysdeps/i386/fpu/s_nextafterl.c: Include <errno.h>.
	(__nextafterl): Set errno on overflow and underflow.
	* sysdeps/i386/fpu/s_nexttoward.c: Include <errno.h>.
	(__nexttoward): Set errno on overflow and underflow.
	* sysdeps/i386/fpu/s_nexttowardf.c: Include <errno.h>.
	(__nexttowardf): Set errno on overflow and underflow.
	* sysdeps/ieee754/flt-32/s_nextafterf.c: Include <errno.h>.
	(__nextafterf): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-128/s_nextafterl.c: Include <errno.h>.
	(__nextafterl): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-128/s_nexttoward.c: Include <errno.h>.
	(__nexttoward): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-128/s_nexttowardf.c: Include <errno.h>.
	(__nexttowardf): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c: Include <errno.h>.
	(__nextafterl): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c: Include <errno.h>.
	(__nexttoward): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c: Include <errno.h>.
	(__nexttowardf): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-96/s_nexttoward.c: Include <errno.h>.
	(__nexttoward): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-96/s_nexttowardf.c: Include <errno.h>.
	(__nexttowardf): Set errno on overflow and underflow.
	* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c: Include <errno.h>.
	(__nldbl_nexttowardf): Set errno on overflow and underflow.
	* sysdeps/m68k/m680x0/fpu/s_nextafterl.c: Include <errno.h>.
	(__nextafterl): Set errno on overflow and underflow.
	* math/libm-test.inc (nextafter_test_data): Do not allow errno
	setting to be missing on overflow.  Add more tests.
	(nexttoward_test_data): Likewise.
2015-11-02 18:54:19 +00:00
Joseph Myers
2145f97cee Handle more state in i386/x86_64 fesetenv (bug 16068).
fenv_t should include architecture-specific floating-point modes and
status flags.  i386 and x86_64 fesetenv limit which bits they use from
the x87 status and control words, when using saved state, and limit
which parts of the state they set to fixed values, when using
FE_DFL_ENV / FE_NOMASK_ENV.  The following should be included but are
excluded in at least some cases: status and masking for the "denormal
operand" exception (which isn't part of FE_ALL_EXCEPT); precision
control (explicitly mentioned in Annex F as something that counts as
part of the floating-point environment); MXCSR FZ and DAZ bits (for
FE_DFL_ENV and FE_NOMASK_ENV).  This patch arranges for this extra
state to be handled by fesetenv (and thereby by feupdateenv, which
calls fesetenv).

(Note that glibc functions using floating point are not generally
expected to work correctly with non-default values of this state,
especially precision control, but it is still logically part of the
floating-point environment and should be handled as such by fesetenv.
Changes to the state relating to subnormals ought generally to work
with libm functions when the arguments aren't subnormal and neither
are the expected results; that's a consequence of functions avoiding
spurious internal underflows.)

A question arising from this is whether FE_NOMASK_ENV should or should
not mask the "denormal operand" exception.  I decided it should mask
that exception.  This is the status quo - previously that exception
could only be unmasked by direct manipulation of control registers
(possibly via <fpu_control.h>).  In addition, it means that use of
FE_NOMASK_ENV leaves a floating-point environment the same as could be
obtained by fesetenv (FE_DFL_ENV); feenableexcept (FE_ALL_EXCEPT);,
rather than an environment in which an exception is unmasked that
could only be masked again by using fesetenv with FE_DFL_ENV (or a
previously saved environment) - this exception not being usable with
other <fenv.h> functions because it's outside FE_ALL_EXCEPT.

Tested for x86_64 and x86.

	[BZ #16068]
	* sysdeps/i386/fpu/fesetenv.c: Include <fpu_control.h>.
	(FE_ALL_EXCEPT_X86): New macro.
	(__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of
	FE_ALL_EXCEPT.  Ensure precision control is included in
	floating-point state.  Ensure that FE_DFL_ENV and FE_NOMASK_ENV
	handle "denormal operand exception" and clear FZ and DAZ bits.
	* sysdeps/x86_64/fpu/fesetenv.c: Include <fpu_control.h>.
	(FE_ALL_EXCEPT_X86): New macro.
	(__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of
	FE_ALL_EXCEPT.  Ensure precision control is included in
	floating-point state.  Ensure that FE_DFL_ENV and FE_NOMASK_ENV
	handle "denormal operand exception" and clear FZ and DAZ bits.
	* sysdeps/x86/fpu/test-fenv-sse-2.c: New file.
	* sysdeps/x86/fpu/test-fenv-x87.c: Likewise.
	* sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add
	test-fenv-x87 and test-fenv-sse-2.
	[$(subdir) = math] (CFLAGS-test-fenv-sse-2.c): New variable.
2015-10-28 22:58:29 +00:00
Joseph Myers
0b9af583a5 Fix i386/x86_64 fesetenv SSE exception clearing (bug 19181).
The i386 and x86_64 versions of fesetenv, when called with FE_DFL_ENV
or FE_NOMASK_ENV as argument, do not clear SSE exceptions raised in
MXCSR.  These arguments should, like other fenv_t values, represent
the whole of the floating-point state, so such exceptions should be
cleared; this patch adds the required clearing.  (Discovered while
working on bug 16068.)

Tested for x86_64 and x86.

	[BZ #19181]
	* sysdeps/i386/fpu/fesetenv.c (__fesetenv): Clear already-raised
	SSE exceptions when argument is FE_DFL_ENV or FE_NOMASK_ENV.
	* sysdeps/x86_64/fpu/fesetenv.c (__fesetenv): Likewise.
	* math/test-fenv-clear-main.c: New file.
	* math/test-fenv-clear.c: Likewise.
	* math/Makefile (tests): Add test-fenv-clear.
	* sysdeps/x86/fpu/test-fenv-clear-sse.c: New file.
	* sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add
	test-fenv-clear-sse.
	[$(subdir) = math] (CFLAGS-test-fenv-clear-sse.c): New variable.
2015-10-28 18:50:20 +00:00
Joseph Myers
846d9a4a3a Fix i386 / x86_64 nearbyint exception clearing (bug 15491).
The implementations of nearbyint functions using x87 floating point
(i386 all versions, x86_64 long double only) use the fclex
instruction, which clears any exceptions that were raised before the
function was called.  These functions must not clear exceptions that
were raised before they were called.

This patch fixes these functions to save and restore the whole
floating-point environment (fnstenv / fldenv) as the way of avoiding
raising "inexact" (recall that there isn't an x87 instruction for
loading just the status word, so the whole environment has to be saved
and loaded instead - the code already saved and loaded the control
word, which is now obtained from the saved environment after this
patch, to disable traps on "inexact").  In the case of the long double
functions, any "invalid" exception from frndint (applied to a
signaling NaN) needs merging into the saved state; this issue doesn't
apply to the float and double functions because that exception would
have been raised when the argument is loaded, before the environment
is saved.

	[BZ #15491]
	* sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Save and restore
	floating-point environment instead of clearing all exceptions.
	* sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
	* sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise,
	merging in "invalid" exceptions from frndint.
	* sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.
	* math/test-nearbyint-except.c: New file.
	* math/Makefile (tests): Add test-nearbyint-except.
2015-10-22 23:14:55 +00:00
Joseph Myers
59a63cca11 Fix nexttoward overflow in non-default rounding modes (bug 19059).
ISO C requires overflowing results from nexttoward to be the
appropriate infinity independent of the rounding mode, but some
implementations use a rounding-mode-dependent result (this is the same
issue as was fixed for nextafter in bug 16677).  This patch fixes the
problem by making the nexttoward implementations discard the result
from the floating-point computation that forced an overflow exception
and then return the infinity previously computed with integer
arithmetic.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #19059]
	* math/s_nexttowardf.c (__nexttowardf): Do not return value from
	overflowing computation.
	* sysdeps/i386/fpu/s_nexttoward.c (__nexttoward): Likewise.
	* sysdeps/i386/fpu/s_nexttowardf.c (__nexttowardf): Likewise.
	* sysdeps/ieee754/ldbl-128/s_nexttoward.c (__nexttoward):
	Likewise.
	* sysdeps/ieee754/ldbl-128/s_nexttowardf.c (__nexttowardf):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
	Likewise.
	* sysdeps/ieee754/ldbl-96/s_nexttoward.c (__nexttoward): Likewise.
	* sysdeps/ieee754/ldbl-96/s_nexttowardf.c (__nexttowardf):
	Likewise.
	* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c (__nldbl_nexttowardf):
	Likewise.
	* math/libm-test.inc (nexttoward_test_data): Add more tests.
2015-10-02 17:11:13 +00:00
Joseph Myers
8c6c923636 Fix i386 acosh (-qNaN) spurious "invalid" exception.
The i386 versions of acoshf and acosh raise a spurious "invalid"
exception for an argument that is a quiet NaN with the sign bit set.
The integer arithmetic to detect arguments < 1 also detects -NaN, and
then the computation 0 / 0 in that case raises the exception.  This
patch fixes this by using (x - x) / (x - x) as the computation in that
case instead, which will always raise the exception for non-NaN
arguments reaching that code, but not for quiet NaN arguments.

Tested for x86_64 and x86.

	[BZ #19032]
	* sysdeps/i386/fpu/e_acosh.S (__ieee754_acosh): For arguments < 1,
	compute result as (x - x) / (x - x) not as 0 / 0.
	* sysdeps/i386/fpu/e_acoshf.S (__ieee754_acoshf): Likewise.
	* math/libm-test.inc (acosh_test_data): Add another test of acosh.
2015-09-30 21:44:42 +00:00
Joseph Myers
a5721ebc68 Fix clog, clog10 inaccuracy (bug 19016).
For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large
errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids
cancellation error and then using log1p.

However, the thresholds for using that approach still result in log
being used on argument as large as sqrt(13/16) > 0.9, leading to
significant errors, in some cases above the 9ulp maximum allowed in
glibc libm.  This patch arranges for the approach using log1p to be
used in any cases where |X|, |Y| < 1 and X^2 + Y^2 >= 0.5 (with the
existing allowance for cases where one of X and Y is very small),
adjusting the __x2y2m1 functions to work with the wider range of
inputs.  This way, log only gets used on arguments below sqrt(1/2) (or
substantially above 1), where the error involved is much less.

Tested for x86_64, x86, mips64 and powerpc.  For the ulps regeneration
I removed the existing clog and clog10 ulps before regenerating to
allow any reduced ulps to appear.  Tests added include those found by
random test generation to produce large ulps either before or after
the patch, and some found by trying inputs close to the (0.75, 0.5)
threshold where the potential errors from using log are largest.

	[BZ #19016]
	* sysdeps/generic/math_private.h (__x2y2m1f): Update comment to
	allow more cases with X^2 + Y^2 >= 0.5.
	* sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise.  Add -1 as
	normal element in sum instead of special-casing based on values of
	arguments.
	* sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment.
	* sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise.  Add
	-1 as normal element in sum instead of special-casing based on
	values of arguments.
	* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise.
	* sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0]
	(__x2y2m1): Update comment.
	* sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise.  Add -1
	as normal element in sum instead of special-casing based on values
	of arguments.
	* math/s_clog.c (__clog): Handle more cases using log1p without
	hypot.
	* math/s_clog10.c (__clog10): Likewise.
	* math/s_clog10f.c (__clog10f): Likewise.
	* math/s_clog10l.c (__clog10l): Likewise.
	* math/s_clogf.c (__clogf): Likewise.
	* math/s_clogl.c (__clogl): Likewise.
	* math/auto-libm-test-in: Add more tests of clog and clog10.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-28 22:11:22 +00:00
Joseph Myers
6ace393821 Fix pow missing underflows (bug 18825).
Similar to various other bugs in this area, pow functions can fail to
raise the underflow exception when the result is tiny and inexact but
one or more low bits of the intermediate result that is scaled down
(or, in the i386 case, converted from a wider evaluation format) are
zero.  This patch forces the exception in a similar way to previous
fixes, thereby concluding the fixes for known bugs with missing
underflow exceptions currently filed in Bugzilla.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18825]
	* sysdeps/i386/fpu/i386-math-asm.h (FLT_NARROW_EVAL_UFLOW_NONNAN):
	New macro.
	(DBL_NARROW_EVAL_UFLOW_NONNAN): Likewise.
	(LDBL_CHECK_FORCE_UFLOW_NONNAN): Likewise.
	* sysdeps/i386/fpu/e_pow.S: Use DEFINE_DBL_MIN.
	(__ieee754_pow): Use DBL_NARROW_EVAL_UFLOW_NONNAN instead of
	DBL_NARROW_EVAL, reloading the PIC register as needed.
	* sysdeps/i386/fpu/e_powf.S: Use DEFINE_FLT_MIN.
	(__ieee754_powf): Use FLT_NARROW_EVAL_UFLOW_NONNAN instead of
	FLT_NARROW_EVAL.  Use separate return path for case when first
	argument is NaN.
	* sysdeps/i386/fpu/e_powl.S: Include <i386-math-asm.h>.  Use
	DEFINE_LDBL_MIN.
	(__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN, reloading the
	PIC register.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Use
	math_check_force_underflow_nonneg.
	* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Force
	underflow for subnormal result.
	* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Use
	math_check_force_underflow_nonneg.
	* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Use
	math_check_force_underflow.
	* sysdeps/x86_64/fpu/x86_64-math-asm.h
	(LDBL_CHECK_FORCE_UFLOW_NONNAN): New macro.
	* sysdeps/x86_64/fpu/e_powl.S: Include <x86_64-math-asm.h>.  Use
	DEFINE_LDBL_MIN.
	(__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN.
	* math/auto-libm-test-in: Add more tests of pow.
	* math/auto-libm-test-out: Regenerated.
2015-09-25 22:29:10 +00:00
Joseph Myers
f6987f5aa4 Fix hypot missing underflows (bug 18803).
Similar to various other bugs in this area, hypot functions can fail
to raise the underflow exception when the result is tiny and inexact
but one or more low bits of the intermediate result that is scaled
down (or, in the i386 case, converted from a wider evaluation format)
are zero.  This patch forces the exception in a similar way to
previous fixes.

Note that this issue cannot arise for implementations of hypotf using
double (or wider) for intermediate evaluation (if hypotf should
underflow, that means the double square root is being computed of some
number of the form N*2^-298, for 0 < N < 2^46, which is exactly
represented as a double, and whatever the rounding mode such a square
root cannot have a mantissa with all zeroes after the initial 23
bits).  Thus no changes are made to hypotf implementations in this
patch, only to hypot and hypotl.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18803]
	* sysdeps/i386/fpu/e_hypot.S: Use DEFINE_DBL_MIN.
	(MO): New macro.
	(__ieee754_hypot) [PIC]: Load PIC register.
	(__ieee754_hypot): Use DBL_NARROW_EVAL_UFLOW_NONNEG instead of
	DBL_NARROW_EVAL.
	* sysdeps/ieee754/dbl-64/e_hypot.c (__ieee754_hypot): Use
	math_check_force_underflow_nonneg in case where result might be
	tiny.
	* sysdeps/ieee754/ldbl-128/e_hypotl.c (__ieee754_hypotl):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_hypotl.c (__ieee754_hypotl): Likewise.
	* sysdeps/powerpc/fpu/e_hypot.c (__ieee754_hypot): Likewise.
	* math/auto-libm-test-in: Add more tests of hypot.
	* math/auto-libm-test-out: Regenerated.
2015-09-24 23:43:57 +00:00
Joseph Myers
1a19b8894f Use LOAD_PIC_REG in i386 atanh.
sysdeps/i386/fpu/e_atanh.S, unlike all other functions in that
directory, loads the PIC register with its own code using
_GLOBAL_OFFSET_TABLE_, rather than with the LOAD_PIC_REG macro.  I see
no good reason for the difference; this patch makes it use the common
macro.

Tested for x86.

	* sysdeps/i386/fpu/e_atanh.S (__ieee754_atanh) [PIC]: Use
	LOAD_PIC_REG.
2015-09-24 21:48:22 +00:00
Joseph Myers
37825319ee Refactor i386 libm code forcing underflow exceptions.
This patch refactors code in sysdeps/i386/fpu that forces underflow
exceptions to use common macros for that purpose as far as possible.
(Although some of the macros end up used in only one place, I think
it's cleanest to define all these macros together so that all the code
forcing underflow uses such macros.  Some more uses of such macros
will also be introduced when fixing remaining bugs about missing
underflow exceptions, and it would be possible to do further
refactoring of the macros in i386-math-asm.h to share more code by
using other macros internally.  Places that test for underflow by
examining the representation of the argument with integer operations,
rather that using floating-point comparisons on the argument or
result, are unchanged by this patch.)

Most of this code uses a macro MO to abstract away the differences
between PIC and non-PIC memory references to constants.  log1p
functions, however, hardcoded PIC conditionals for this.  Because the
common macros rely on the use of MO, I changed the log1p functions to
use the normal style here, and, for consistency, also made that change
to log1pl which is otherwise unaffected by this patch.

Tested for x86.

	* sysdeps/i386/fpu/i386-math-asm.h (DEFINE_LDBL_MIN): New macro.
	(FLT_CHECK_FORCE_UFLOW): Likewise.
	(DBL_CHECK_FORCE_UFLOW): Likewise.
	(FLT_CHECK_FORCE_UFLOW_NARROW): Likewise.
	(DBL_CHECK_FORCE_UFLOW_NARROW): Likewise.
	(LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN): Likewise.
	(FLT_CHECK_FORCE_UFLOW_NONNAN): Likewise.
	(DBL_CHECK_FORCE_UFLOW_NONNAN): Likewise.
	(FLT_CHECK_FORCE_UFLOW_NONNEG): Likewise.
	(DBL_CHECK_FORCE_UFLOW_NONNEG): Likewise.
	(LDBL_CHECK_FORCE_UFLOW_NONNEG): Likewise.
	* sysdeps/i386/fpu/e_asin.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__ieee754_asin): Use DBL_CHECK_FORCE_UFLOW.
	* sysdeps/i386/fpu/e_asinf.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__ieee754_asinf): Use FLT_CHECK_FORCE_UFLOW.
	* sysdeps/i386/fpu/e_atan2.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__ieee754_atan2): Use DBL_CHECK_FORCE_UFLOW_NARROW.
	* sysdeps/i386/fpu/e_atan2f.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__ieee754_atan2f): Use FLT_CHECK_FORCE_UFLOW_NARROW.
	* sysdeps/i386/fpu/e_atanh.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__ieee754_atanh): Use DBL_CHECK_FORCE_UFLOW_NONNEG.
	* sysdeps/i386/fpu/e_atanhf.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__ieee754_atanhf): Use FLT_CHECK_FORCE_UFLOW_NONNEG.
	* sysdeps/i386/fpu/e_exp2l.S: Include <i386-math-asm.h>.
	(ldbl_min): Replace with use of DEFINE_LDBL_MIN.
	(__ieee754_exp2l): Use LDBL_CHECK_FORCE_UFLOW_NONNEG_NAN.
	* sysdeps/i386/fpu/e_expl.S: Include <i386-math-asm.h>.
	[!USE_AS_EXPM1L] (cmin): Replace with use of DEFINE_LDBL_MIN.
	(IEEE754_EXPL): Use LDBL_CHECK_FORCE_UFLOW_NONNEG.
	* sysdeps/i386/fpu/s_atan.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__atan): Use DBL_CHECK_FORCE_UFLOW.
	* sysdeps/i386/fpu/s_atanf.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__atanf): Use FLT_CHECK_FORCE_UFLOW.
	* sysdeps/i386/fpu/s_expm1.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__expm1): Use DBL_CHECK_FORCE_UFLOW.  Move underflow check after
	main computation.
	* sysdeps/i386/fpu/s_expm1f.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__expm1f): Use FLT_CHECK_FORCE_UFLOW.  Move underflow check after
	main computation.
	* sysdeps/i386/fpu/s_log1p.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(MO): New macro.
	(__log1p): Use MO.  Use DBL_CHECK_FORCE_UFLOW_NONNAN.
	* sysdeps/i386/fpu/s_log1pf.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(MO): New macro.
	(__log1pf): Use MO.  Use FLT_CHECK_FORCE_UFLOW_NONNAN.
	* sysdeps/i386/fpu/s_log1pl.S (MO): New macro.
	(__log1pl): Use MO.
2015-09-24 21:41:00 +00:00
Joseph Myers
54142c44e9 Use math_narrow_eval more consistently.
Where glibc code needs to avoid excess range and precision in
floating-point arithmetic, code variously uses either asms or volatile
to force the results of that arithmetic to memory; mostly this is
conditional on FLT_EVAL_METHOD, but in the case of lrint / llrint
functions some use of volatile is unconditional (and is present
unnecessarily in versions for long double).  This patch make such code
use the recently-added math_narrow_eval macro consistently, removing
the unnecessary uses of volatile in long double lrint / llrint
implementations completely.

Tested for x86_64, x86, mips64 and powerpc.

	* math/s_nexttowardf.c (__nexttowardf): Use math_narrow_eval.
	* stdlib/strtod_l.c: Include <math_private.h>.
	(overflow_value): Use math_narrow_eval.
	(underflow_value): Likewise.
	* sysdeps/i386/fpu/s_nexttoward.c (__nexttoward): Likewise.
	* sysdeps/i386/fpu/s_nexttowardf.c (__nexttowardf): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Likewise.
	(__ieee754_gamma_r): Likewise.
	* sysdeps/ieee754/dbl-64/gamma_productf.c (__gamma_productf):
	Likewise.
	* sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2):
	Likewise.
	* sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise.
	* sysdeps/ieee754/dbl-64/s_erf.c (__erfc): Likewise.
	* sysdeps/ieee754/dbl-64/s_llrint.c (__llrint): Likewise.
	* sysdeps/ieee754/dbl-64/s_lrint.c (__lrint): Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
	(__ieee754_gammaf_r): Likewise.
	* sysdeps/ieee754/flt-32/k_rem_pio2f.c (__kernel_rem_pio2f):
	Likewise.
	* sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise.
	* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Likewise.
	* sysdeps/ieee754/flt-32/s_llrintf.c (__llrintf): Likewise.
	* sysdeps/ieee754/flt-32/s_lrintf.c (__lrintf): Likewise.
	* sysdeps/ieee754/ldbl-128/s_llrintl.c (__llrintl): Do not use
	volatile.
	* sysdeps/ieee754/ldbl-128/s_lrintl.c (__lrintl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_nexttoward.c (__nexttoward): Use
	math_narrow_eval.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
	Likewise.
	* sysdeps/ieee754/ldbl-96/gamma_product.c (__gamma_product):
	Likewise.
	* sysdeps/ieee754/ldbl-96/s_llrintl.c (__llrintl): Do not use
	volatile.
	* sysdeps/ieee754/ldbl-96/s_lrintl.c (__lrintl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_nexttoward.c (__nexttoward): Use
	math_narrow_eval.
	* sysdeps/ieee754/ldbl-96/s_nexttowardf.c (__nexttowardf):
	Likewise.
	* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c (__nldbl_nexttowardf):
	Likewise.
2015-09-23 18:14:57 +00:00
Joseph Myers
6f0f237bf5 Avoid excess range in results from i386 exp, hypot, pow functions (bug 18980).
i386 exp, hypot and pow functions can return overflowing and
underflowing values with excess range and precision; ; Wilco
Dijkstra's patches to make isfinite etc. expand inline cause this
pre-existing issue to result in test failures.

This patch fixes those functions to avoid excess range and precision
in their return values.  Appropriate macros are added for the repeated
code sequences; in future I'll add more such macros and refactor
existing code forcing underflow (with or without also eliminating
excess range and precision from the return value) to use such macros.

Tested for x86.  If, after this patch, you still see x86 libm test
failures with excess range or precision, please file bugs in Bugzilla.

	[BZ #18980]
	* sysdeps/i386/fpu/i386-math-asm.h (DEFINE_FLT_MIN): New macro.
	(DEFINE_DBL_MIN): Likewise.
	(FLT_NARROW_EVAL_UFLOW_NONNEG_NAN): Likewise.
	(DBL_NARROW_EVAL_UFLOW_NONNEG_NAN): Likewise.
	(FLT_NARROW_EVAL_UFLOW_NONNEG): Likewise.
	(DBL_NARROW_EVAL_UFLOW_NONNEG): Likewise.
	* sysdeps/i386/fpu/e_exp.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__ieee754_exp): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
	(__exp_finite): Use DBL_NARROW_EVAL_UFLOW_NONNEG.
	* sysdeps/i386/fpu/e_exp10.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__ieee754_exp10): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
	* sysdeps/i386/fpu/e_exp10f.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__ieee754_exp10f): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
	* sysdeps/i386/fpu/e_exp2.S: Include <i386-math-asm.h>.
	(dbl_min): Replace with use of DEFINE_DBL_MIN.
	(__ieee754_exp2): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
	* sysdeps/i386/fpu/e_exp2f.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__ieee754_exp2f): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
	* sysdeps/i386/fpu/e_expf.S: Include <i386-math-asm.h>.
	(flt_min): Replace with use of DEFINE_FLT_MIN.
	(__ieee754_expf): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
	(__expf_finite): Use FLT_NARROW_EVAL_UFLOW_NONNEG.
	* sysdeps/i386/fpu/e_hypot.S: Include <i386-math-asm.h>.
	(__ieee754_hypot): Use DBL_NARROW_EVAL.
	* sysdeps/i386/fpu/e_hypotf.S: Include <i386-math-asm.h>.
	(__ieee754_hypotf): Use FLT_NARROW_EVAL.
	* sysdeps/i386/fpu/e_pow.S: Include <i386-math-asm.h>.
	(__ieee754_pow): Use DBL_NARROW_EVAL.
	* sysdeps/i386/fpu/e_powf.S: Include <i386-math-asm.h>.
	(__ieee754_powf): Use FLT_NARROW_EVAL.
	* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S
	(__ieee754_expf_sse2): Convert double-precision result to single
	precision.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-09-18 21:53:22 +00:00
Joseph Myers
94ced920a9 Avoid excess range in results from i386 scalb functions (bug 18981).
i386 scalb / scalbn / scalbln (and thus ldexp) functions for float and
double can return results with excess range (and consequently excess
precision for subnormal results).  As the results of these functions
are fully determined by reference to IEEE 754 operations, this is
unambiguously a bug, apart from the testsuite failures it causes.

This patch makes those functions store their results on the stack and
load them back to eliminate the excess range.  Double rounding is not
a problem, as the only cases where it could occur are when the result
overflows or underflows for extended precision, and then the
double-rounded results are the same as the single-rounded results.

The new macros will be used for more functions, more such macros
added, and existing code refactored to use such macros, in subsequent
patches.

Tested for x86.  Committed.

	[BZ #18981]
	* sysdeps/i386/fpu/i386-math-asm.h: New file.
	* sysdeps/i386/fpu/e_scalb.S: Include <i386-math-asm.h>.
	(__ieee754_scalb): Use DBL_NARROW_EVAL.
	* sysdeps/i386/fpu/e_scalbf.S: Include <i386-math-asm.h>.
	(__ieee754_scalbf): Use FLT_NARROW_EVAL.
	* sysdeps/i386/fpu/s_scalbn.S: Include <i386-math-asm.h>.
	(__scalbn): Use DBL_NARROW_EVAL.
	* sysdeps/i386/fpu/s_scalbnf.S: Include <i386-math-asm.h>.
	(__scalbnf): Use FLT_NARROW_EVAL.
2015-09-18 20:34:59 +00:00
Joseph Myers
da2f4f2dd5 Make scalbn set errno (bug 6803).
As noted in bug 6803, scalbn fails to set errno on overflow and
underflow.  This patch fixes this by making scalbn an alias of ldexp,
which has exactly the same semantics (for floating-point types with
radix 2) and already has wrappers that deal with setting errno,
instead of an alias of the internal __scalbn (which ldexp calls).

Notes:

* Where compat symbols were defined for scalbn functions, I didn't
  change what they point to (to keep the patch minimal), so such
  compat symbols continue to go directly to the non-errno-setting
  functions.

* Mike, I didn't do anything with the IA64 versions of these
  functions, where I think both the ldexp and scalbn functions already
  deal with setting errno.  As a cleanup (not needed to fix this bug)
  however you might want to make those functions into aliases for
  IA64; there is no need for them to be separate function
  implementations at all.

* This concludes the fix for bug 6803 since the scalb and scalbln
  cases of that bug were fixed some time ago.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #6803]
	* math/s_ldexp.c (scalbn): Define as weak alias of __ldexp.
	[NO_LONG_DOUBLE] (scalbnl): Define as weak alias of __ldexp.
	* math/s_ldexpf.c (scalbnf): Define as weak alias of __ldexpf.
	* math/s_ldexpl.c (scalbnl): Define as weak alias of __ldexpl.
	* sysdeps/i386/fpu/s_scalbn.S (scalbn): Remove alias.
	* sysdeps/i386/fpu/s_scalbnf.S (scalbnf): Likewise.
	* sysdeps/i386/fpu/s_scalbnl.S (scalbnl): Likewise.
	* sysdeps/ieee754/dbl-64/s_scalbn.c (scalbn): Likewise.
	[NO_LONG_DOUBLE] (scalbnl): Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (scalbn):
	Likewise.
	[NO_LONG_DOUBLE] (scalbnl): Likewise.
	* sysdeps/ieee754/flt-32/s_scalbnf.c (scalbnf): Likewise.
	* sysdeps/ieee754/ldbl-128/s_scalbnl.c (scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (scalbnl): Remove
	long_double_symbol calls.
	* sysdeps/ieee754/ldbl-64-128/s_scalbnl.c (scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-opt/s_ldexpl.c (__ldexpl_2): Define as
	strong alias of __ldexpl.
	(scalbnl): Define using long_double_symbol.
	* sysdeps/m68k/m680x0/fpu/s_scalbn.c (__CONCATX(scalbn,suffix)):
	Remove alias.
	* sysdeps/sparc/sparc64/soft-fp/s_scalbnl.c (scalbnl): Likewise.
	* sysdeps/x86_64/fpu/s_scalbnl.S (scalbnl): Likewise.
	* math/libm-test.inc (scalbn_test_data): Add errno expectations.
	(scalbln_test_data): Add more errno expectations.
2015-09-16 21:11:00 +00:00
Joseph Myers
828bf6828b Fix i386 exp10 missing underflows (bug 18966).
On i386, the double version of exp10 can miss underflow exceptions if
the result is in the subnormal range for double but the last 11 bits
of the 64-bit extended-precision mantissa happen to be zero.  This
patch forces the exception in a similar way to previous fixes.

As with the exp2 and exp fixes, the exp10f changes may in fact not be
needed to ensure underflow exceptions, but are included for
consistency and to fix the exp10 part of bug 18875 by ensuring that
excess range and precision is removed from underflowing return values.

Tested for x86_64 and x86.

	[BZ #18875]
	[BZ #18966]
	* sysdeps/i386/fpu/e_exp10.S (dbl_min): New object.
	(MO): New macro.
	(__ieee754_exp10): For small results, force underflow exception
	and remove excess range and precision from return value.
	* sysdeps/i386/fpu/e_exp10f.S (flt_min): New object.
	(MO): New macro.
	(__ieee754_exp10f): For small results, force underflow exception
	and remove excess range and precision from return value.
	* math/auto-libm-test-in: Add more tests of exp10.
	* math/auto-libm-test-out: Regenerated.
2015-09-15 16:50:02 +00:00
Joseph Myers
de5e81691c Fix i386 exp missing underflows (bug 18961).
On i386, the double version of exp can miss underflow exceptions if
the result is in the subnormal range for double but the last 11 bits
of the 64-bit extended-precision mantissa happen to be zero.  This
patch forces the exception in a similar way to previous fixes.

As with the exp2 fixes, the expf changes may in fact not be needed to
ensure underflow exceptions, but are included for consistency and to
fix the exp part of bug 18875 by ensuring that excess range and
precision is removed from underflowing return values.

Tested for x86_64 and x86.

	[BZ #18875]
	[BZ #18961]
	* sysdeps/i386/fpu/e_exp.S (dbl_min): New object.
	(MO): New macro.
	(__ieee754_exp): For small results, force underflow exception and
	remove excess range and precision from return value.
	(__exp_finite): Likewise.
	* sysdeps/i386/fpu/e_expf.S (flt_min): New object.
	(MO): New macro.
	(__ieee754_expf): For small results, force underflow exception and
	remove excess range and precision from return value.
	(__expf_finite): Likewise.
	* math/auto-libm-test-in: Add more tests of exp.
	* math/auto-libm-test-out: Regenerated.
2015-09-14 22:40:05 +00:00
Joseph Myers
903af5af9a Fix exp2 missing underflows (bug 16521).
Various exp2 implementations in glibc can miss underflow exceptions
when the scaling down part of the calculation is exact (or, in the x86
case, when the conversion from extended precision to the target
precision is exact).  This patch forces the exception in a similar way
to previous fixes.

The x86 exp2f changes may in fact not be needed for this purpose -
it's likely to be the case that no argument of type float has an exp2
result so close to an exact subnormal float value that it equals that
value when rounded to 64 bits (even taking account of variation
between different x86 implementations).  However, they are included
for consistency with the changes to exp2 and so as to fix the exp2f
part of bug 18875 by ensuring that excess range and precision is
removed from underflowing return values.

Tested for x86_64, x86 and mips64.

	[BZ #16521]
	[BZ #18875]
	* math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for
	small results.
	* sysdeps/i386/fpu/e_exp2.S (dbl_min): New object.
	(MO): New macro.
	(__ieee754_exp2): For small results, force underflow exception and
	remove excess range and precision from return value.
	* sysdeps/i386/fpu/e_exp2f.S (flt_min): New object.
	(MO): New macro.
	(__ieee754_exp2f): For small results, force underflow exception
	and remove excess range and precision from return value.
	* sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object.
	(MO): New macro.
	(__ieee754_exp2l): Force underflow exception for small results.
	* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
	* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
	* sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object.
	(MO): New macro.
	(__ieee754_exp2l): Force underflow exception for small results.
	* math/auto-libm-test-in: Add more tests or exp2.
	* math/auto-libm-test-out: Regenerated.
2015-09-14 22:00:12 +00:00
Joseph Myers
a1f99ba28b Add more random libm test inputs (mainly for ldbl-128).
This patch adds more libm test inputs found through random test
generation to increase previously known ulps.  This particular test
generation was run for mips64, so most of the increased ulps are for
ldbl-128 (float and double having been fairly well covered by such
testing for x86_64), but there's the odd ulps increase for other
formats.

Tested for x86_64, x86 and mips64.

	* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
	asinh, atan, atan2, atanh, cabs, carg, cos, csqrt, erfc, exp,
	exp10, exp2, log, log1p, log2, pow, sin, sincos, sinh, tan and
	tanh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/mips/mips32/libm-test-ulps: Likewise.
	* sysdeps/mips/mips64/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-12 00:01:38 +00:00
Joseph Myers
00a7073c38 Add more randomly-generated libm tests.
This patch adds more libm test inputs found through random test
generation to increase observed ulps on x86_64.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acosh, atanh, cbrt,
	cosh, csqrt, erfc, expm1 and lgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-11 15:03:10 +00:00
Joseph Myers
050f29c188 Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558).
The existing implementations of lgamma functions (except for the ia64
versions) use the reflection formula for negative arguments.  This
suffers large inaccuracy from cancellation near zeros of lgamma (near
where the gamma function is +/- 1).

This patch fixes this inaccuracy.  For arguments above -2, there are
no zeros and no large cancellation, while for sufficiently large
negative arguments the zeros are so close to integers that even for
integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation
is not significant.  Thus, it is only necessary to take special care
about cancellation for arguments around a limited number of zeros.

Accordingly, this patch uses precomputed tables of relevant zeros,
expressed as the sum of two floating-point values.  The log of the
ratio of two sines can be computed accurately using log1p in cases
where log would lose accuracy.  The log of the ratio of two gamma(1-x)
values can be computed using Stirling's approximation (the difference
between two values of that approximation to lgamma being computable
without computing the two values and then subtracting), with
appropriate adjustments (which don't reduce accuracy too much) in
cases where 1-x is too small to use Stirling's approximation directly.

In the interval from -3 to -2, using the ratios of sines and of
gamma(1-x) can still produce too much cancellation between those two
parts of the computation (and that interval is also the worst interval
for computing the ratio between gamma(1-x) values, which computation
becomes more accurate, while being less critical for the final result,
for larger 1-x).  Because this can result in errors slightly above
those accepted in glibc, this interval is instead dealt with by
polynomial approximations.  Separate polynomial approximations to
(|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8
from -3 to -2, where n (-3 or -2) is the nearest integer to the
1/8-interval and x0 is the zero of lgamma in the relevant half-integer
interval (-3 to -2.5 or -2.5 to -2).

Together, the two approaches are intended to give sufficient accuracy
for all negative arguments in the problem range.  Outside that range,
the previous implementation continues to be used.

Tested for x86_64, x86, mips64 and powerpc.  The mips64 and powerpc
testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm
with large negative arguments giving spurious "invalid" exceptions
(exposed by newly added tests for cases this patch doesn't affect the
logic for); I'll address those problems separately.

	[BZ #2542]
	[BZ #2543]
	[BZ #2558]
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call
	__lgamma_neg for arguments from -28.0 to -2.0.
	* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call
	__lgamma_negf for arguments from -15.0 to -2.0.
	* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
	Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0.
	* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r):
	Call __lgamma_negl for arguments from -33.0 to -2.0.
	* sysdeps/ieee754/dbl-64/lgamma_neg.c: New file.
	* sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
	* sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise.
	* sysdeps/generic/math_private.h (__lgamma_negf): New prototype.
	(__lgamma_neg): Likewise.
	(__lgamma_negl): Likewise.
	(__lgamma_product): Likewise.
	(__lgamma_productl): Likewise.
	* math/Makefile (libm-calls): Add lgamma_neg and lgamma_product.
	* math/auto-libm-test-in: Add more tests of lgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-09-10 22:27:58 +00:00
Joseph Myers
948e12a238 Fix csqrt missing underflows (bug 18370).
The csqrt implementations in glibc can miss underflow exceptions when
the real or imaginary part of the result becomes tiny in the course of
scaling down (in particular, multiplication by 0.5) and that scaling
is exact although the relevant part of the mathematical result isn't.
This patch forces the exception in a similar way to previous fixes.

Tested for x86_64 and x86.

	[BZ #18370]
	* math/s_csqrt.c (__csqrt): Force underflow exception for results
	whose real or imaginary part has small absolute value.
	* math/s_csqrtf.c (__csqrtf): Likewise.
	* math/s_csqrtl.c (__csqrtl): Likewise.
	* math/auto-libm-test-in: Add more tests of csqrt.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-08-19 22:42:01 +00:00
Joseph Myers
3ba0ac10fa Add more random libm-test inputs.
This patch adds more test inputs to various libm functions found
through random generation to have larger ulps errors than previously
listed in libm-test-ulp, on at least one of x86_64 and x86.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acos, acosh, asin,
	asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc,
	exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh
	and tgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-08-13 23:23:23 +00:00
Joseph Myers
37d83a089d Fix tanh missing underflows (bug 16520).
Similar to various other bugs in this area, some tanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact.  This patch forces the exception in a
similar way to previous fixes.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #16520]
	* sysdeps/ieee754/dbl-64/s_tanh.c: Include <float.h>.
	(__tanh): Force underflow exception for arguments with small
	absolute value.
	* sysdeps/ieee754/flt-32/s_tanhf.c: Include <float.h>.
	(__tanhf): Force underflow exception for arguments with small
	absolute value.
	* sysdeps/ieee754/ldbl-128/s_tanhl.c: Include <float.h>.
	(__tanhl): Force underflow exception for arguments with small
	absolute value.
	* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: Include <float.h>.
	(__tanhl): Force underflow exception for arguments with small
	absolute value.
	* sysdeps/ieee754/ldbl-96/s_tanhl.c: Include <float.h>.
	(__tanhl): Force underflow exception for arguments with small
	absolute value.
	* math/auto-libm-test-in: Add more tests of tanh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-08-13 16:40:39 +00:00
Joseph Myers
4afe4b20ce Add more tests of various libm functions.
This patch adds more tests of various libm functions found through
random test generation to give increased ulps on 32-bit x86.

Tested for x86_64 and x86.

	* math/auto-libm-test-in: Add more tests of acosh, asin, asinh,
	atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10,
	expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-08-11 00:58:28 +00:00
Joseph Myers
7ee06ef158 Fix ldbl-128ibm tanhl inaccuracy (bug 18790).
ldbl-128ibm tanhl uses a too-small threshold to decide when to return
+/-1, resulting in large errors.  This patch changes it to a more
appropriate threshold (the requirement is for 2*exp(-2|x|) to be small
in terms of ulps of 1).

Tested for x86_64, x86 and powerpc.

	[BZ #18790]
	* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Increase
	threshold for returning +/- 1.
	* math/auto-libm-test-in: Add more tests of tanh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-08-10 20:35:30 +00:00
Joseph Myers
d0649b2d8e Fix ldbl-128ibm sinhl inaccuracy near 0 (bug 18789).
ldbl-128ibm sinhl uses a too-big threshold to decide when to return
the argument, resulting in large errors.  This patch fixes it to use a
more appropriate threshold.

Tested for x86_64, x86 and powerpc.

	[BZ #18789]
	* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c (__ieee754_sinhl): Use
	smaller threshold for returning the argument.
	* math/auto-libm-test-in: Add more tests of sinh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-08-10 15:25:10 +00:00
Joseph Myers
5e29dd5737 Fix sinh missing underflows (bug 16519).
Similar to various other bugs in this area, some sinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact.  This patch forces the exception in a
similar way to previous fixes.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #16519]
	* sysdeps/ieee754/dbl-64/e_sinh.c: Include <float.h>.
	(__ieee754_sinh): Force underflow exception for arguments with
	small absolute value.
	* sysdeps/ieee754/flt-32/e_sinhf.c: Include <float.h>.
	(__ieee754_sinhf): Force underflow exception for arguments with
	small absolute value.
	* sysdeps/ieee754/ldbl-128/e_sinhl.c: Include <float.h>.
	(__ieee754_sinhl): Force underflow exception for arguments with
	small absolute value.
	* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: Include <float.h>.
	(__ieee754_sinhl): Force underflow exception for arguments with
	small absolute value.
	* sysdeps/ieee754/ldbl-96/e_sinhl.c: Include <float.h>.
	(__ieee754_sinhl): Force underflow exception for arguments with
	small absolute value.
	* math/auto-libm-test-in: Add more tests of sinh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
2015-08-06 23:01:09 +00:00
Joseph Myers
e02920bc02 Improve tgamma accuracy (bug 18613).
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.

Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations.  However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).

Some additional complications also arose.  Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow.  (libm-test.inc also doesn't handle being lax about whether
the result in cases very close to the overflow threshold is infinity
or a finite value close to overflow, but that doesn't cause problems
in this case though I've seen it cause problems with random test
generation for some functions.)  Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc).  And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.

I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.

This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18613]
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
	X_ADJ not X when adjusting exponent.
	(__ieee754_gamma_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammaf_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* math/libm-test.inc (tgamma_test_data): Remove one test.  Moved
	to auto-libm-test-in.
	(tgamma_test): Use ALL_RM_TEST.
	* math/auto-libm-test-in: Add one test of tgamma.  Mark some other
	tests of tgamma with spurious-overflow.
	* math/auto-libm-test-out: Regenerated.
	* math/gen-libm-have-vector-test.sh: Do not check for START.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-06-29 23:29:35 +00:00