Commit Graph

791 Commits

Author SHA1 Message Date
Paul E. Murphy
02bbfb414f ldbl-128: Use L(x) macro for long double constants
This runs the attached sed script against these files using
a regex which aggressively matches long double literals
when not obviously part of a comment.

Likewise, 5 digit or less integral constants are replaced
with integer constants, excepting the two cases of 0 used
in large tables, which are also the only integral values
of the form x.0*E0L encountered within these converted
files.

Likewise, -L(x) is transformed into L(-x).

Naturally, the script has a few minor hiccups which are
more clearly remedied via the attached fixup patch.  Such
hiccups include, context-sensitive promotion to a real
type, and munging constants inside harder to detect
comment blocks.
2016-09-13 15:33:59 -05:00
Siddhesh Poyarekar
54c86ccab6 Inline all support functions for sin and cos
The support functions for sin and cos have a lot of identical
functionality, so inlining them gives a pretty decent jump in
functionality: ~19% in the sincos function.  On SPEC2006 this
translates to about 2.1% in the tonto test.

	* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Mark as inline.
	(do_cos_slow): Likewise.
	(do_sin): Likewise.
	(do_sin_slow): Likewise.
	(slow): Likewise.
	(slow1): Likewise.
	(slow2): Likewise.
	(sloww): Likewise.
	(sloww1): Likewise.
	(sloww2): Likewise.
	(bsloww): Likewise.
	(bsloww1): Likewise.
	(bsloww2): Likewise.
	(cslow2): Likewise.
2016-09-02 20:08:41 +05:30
Siddhesh Poyarekar
25e440c6c7 Use do_sin for sin(x) where 0.25 < |x| < 0.855469
The only code looks slightly different from do_sin but on closer
examination, should give exactly the same result.  Drop it in favour
of the do_sin function call.

	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Use do_sin.
2016-09-02 20:08:41 +05:30
Siddhesh Poyarekar
758e79ec89 Consolidate input partitioning into do_cos and do_sin
All calls to do_cos are preceded by code that partitions x into a
larger double that gives an offset into the sincos table and a smaller
double that is used in a polynomial computation.  Consolidate all of
them into do_cos and do_sin to reduce code duplication.

	* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Accept X and DX as input
	arguments.  Consolidate input partitioning from callers here.
	(do_cos_slow): Likewise.
	(do_sin): Likewise.
	(do_sin_slow): Likewise.
	(do_sincos_1): Remove the no longer necessary input partitioning.
	(do_sincos_2): Likewise.
	(__sin): Likewise.
	(__cos): Likewise.
	(slow1): Likewise.
	(slow2): Likewise.
	(sloww1): Likewise.
	(sloww2): Likewise.
	(bsloww1): Likewise.
	(bsloww2): Likewise.
	(cslow2): Likewise.
2016-09-02 20:08:41 +05:30
Paul E. Murphy
f306ea1ada Make common fmin implementation generic. 2016-09-01 09:31:05 -05:00
Paul E. Murphy
847c9161c7 Make common fmax implementation generic.
Also update aarch64 to ensure the correct s_fmin.c is included.
The include order favors including the generated copy.
2016-09-01 09:31:05 -05:00
Paul E. Murphy
ee8a49071c Make common nextdown implementation generic.
With the exception of those machines using the ldbl-opt in
an Implies file, this is a trivial transformation.

nextdownl is not subject to the non-trivial versioning rules
of the other generated functions, so to keep things simple,
it is handled as a one-off case in ldbl-opt to preserve the
existing behavior.
2016-09-01 09:31:03 -05:00
Paul E. Murphy
7b7c39450b Make common fdim implementation generic.
The only difference is the usage of math_narrow_eval when
building s_fdiml.c.  This should be harmless for long double,
but I did observe some code generation changes on m68k, but
lack the resources to test it.

Likewise, to more easily support overriding symbol generation,
the aliasing macros are always conditionally defined on their
absence to reduce boilerplate.

I also ran builds for i486, ppc64, sparcv9, aarch64,
s390x and observed no changes to s_fdim* objects.
2016-09-01 09:28:05 -05:00
Paul E. Murphy
de6b6d14e9 ldbl-128: Cleanup e_gammal_r.c after _Float128 rename 2016-08-31 17:17:03 -05:00
Paul E. Murphy
15089e046b ldbl-128: Rename 'long double' to '_Float128'
Add a layer of macro indirection for long double files
which need to be built using another typename.  Likewise,
add the L(num) macro used in a later patch to override
real constants.

These macros are only defined through the ldbl-128
math_ldbl.h header, thereby implicitly restricting
these macros to machines which back long double
with an IEEE binary128 format.

Likewise, appropriate changes are made for the few
files which indirectly include such ldbl-128 files.

These changes produce identical binaries for s390x,
aarch64, and ppc64.
2016-08-31 10:38:11 -05:00
Siddhesh Poyarekar
9d84d0e51d Use fabs(x) instead of branching on signedness of input to sin and cos
The sin and cos code is inconsistent about its use of fabs to get the
absolute value of X where in some places it conditionalizes the code
while in others it uses fabs.  fabs seems to be a better candidate in
most cases because it avoids a branch.  Similarly there is an attempt
to make it easier for the compiler to emit conditional assignment
instructions (like fcsel on aarch64) where it can, by isolating
conditional assignment constructs from the rest of the expression.

A further benefit of this change is to identify common constructs
across functions and consolidate them in future patches.

	* sysdeps/ieee754/dbl-64/s_sin.c (do_cos_slow): Use ternary
	instead of if/else.
	(do_sin_slow): Likewise.
	(do_sincos_1): Use fabs instead of if/else.
	(do_sincos_2): Likewise.
	(__sin): Likewise.
	(__cos): Likewise.
	(slow2): Likewise.
	(sloww): Likewise.
	(sloww1): Likewise.  Drop argument M.
	(sloww2): Use fabs instead of if/else.
	(bsloww): Likewise.
	(bsloww1): Likewise.
	(bsloww2): Likewise.
2016-08-30 13:01:59 +05:30
Siddhesh Poyarekar
1a822c6184 Add fall through comments
Add fall through comments I had missed writing in previously.
2016-08-30 13:00:29 +05:30
Siddhesh Poyarekar
32efd690bd Consolidate reduce_and_compute code
This patch reshuffles the reduce_and_compute code so that the
structure matches other code structures of the same type elsewhere in
s_sin.c and s_sincos.c.  This is the beginning of an attempt to
consolidate and reduce code duplication in functions in s_sin.c to
make it easier to read and possibly also easier for the compiler to
optimize.

	* sysdeps/ieee754/dbl-64/s_sin.c (reduce_and_compute):
	Consolidate switch cases 0 and 2.
2016-08-30 12:51:39 +05:30
Paul E. Murphy
feb62ddacb Convert remaining complex function to generated files
Convert cpow, clog, clog10, cexp, csqrt, and cproj functions
into generated templates.  Note, ldbl-opt still retains
s_clog10l.c as the aliasing rules are non-trivial.
2016-08-29 12:43:38 -05:00
Paul E. Murphy
d5602cebf1 Convert _Complex tangent functions to generated code
This converts s_c{,a}tan{,h}{f,,l} into a single
templated file c{,a}tan{,h}_template.c with the
exception of alpha.
2016-08-19 16:47:31 -05:00
Paul E. Murphy
c50eee19c4 Convert _Complex sine functions to generated code
Refactor s_c{,a}sin{,h}{f,,l} into a single templated
macro.
2016-08-19 16:46:41 -05:00
Paul E. Murphy
4482ff226e Merge common usage of mul_split function
A number of files share identical code for the
mul_split function.

This moves the duplicated function mul_split into its
own header, and refactors the fma usage into a single
selection macro.  Likewise, mul_split when used by a
long double implementation is renamed mul_splitl for
clarity.
2016-08-19 11:29:43 -05:00
Paul E. Murphy
01ee387015 Convert _Complex cosine functions to generated code
This is fairly straight fowards.  m68k overrides are
updated to use the framework, and thus are simplified
a bit.
2016-08-19 11:28:55 -05:00
Stefan Liebler
b65f0b7b2e Get rid of array-bounds warning in __kernel_rem_pio2[f] with gcc 6.1 -O3.
On s390x I get the following werror when build with gcc 6.1 (or current gcc head) and -O3:
../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function ‘__kernel_rem_pio2’:
../sysdeps/ieee754/dbl-64/k_rem_pio2.c:254:18: error: array subscript is below array bounds [-Werror=array-bounds]
    for (k = 1; iq[jk - k] == 0; k++)
                ~~^~~~~~~~
I get the same error with sysdeps/ieee754/flt-32/k_rem_pio2f.c.

This patch adds DIAG_* macros around it.

ChangeLog:

	* sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2):
	Use DIAG_*_NEEDS_COMMENT macro to get rid of array-bounds warning.
	* sysdeps/ieee754/flt-32/k_rem_pio2f.c (__kernel_rem_pio2f):
	Likewise.
2016-08-18 12:20:35 +02:00
Paul E. Murphy
ee19f1de0d ldbl-128: Remove unused sqrtl declaration in e_asinl.c
This did not alter compilation for s390x and aarch64
targets.
2016-08-17 14:06:54 -05:00
Paul E. Murphy
ce6698ea0a Support for type-generic libm function implementations libm
This defines a new classes of libm objects.  The
<func>_template.c file which is used in conjunction
with the new makefile hooks to derive variants for
each type supported by the target machine.

The headers math-type-macros-TYPE.h are used to supply
macros to a common implementation of a function in
a file named FUNC_template.c and glued togethor via
a generated file matching existing naming in the
build directory.

This has the properties of preserving the existing
override mechanism and not requiring any arcane
build system twiddling.  Likewise, it enables machines
to override these files without any additional work.

I have verified the built objects for ppc64, x86_64,
alpha, arm, and m68k do not change in any meaningful
way with these changes using the Fedora cross toolchains.
I have verified the x86_64 and ppc64 changes still run.
2016-08-17 14:06:03 -05:00
Paul E. Murphy
cad1d6066f Remove tacit double usage in ldbl-128
There is quiet truncation to double arithmetic in several
files.  I noticed them when building ldbl-128 in a
soft-fp context.  This did not change any test results.
2016-08-03 11:01:25 -05:00
Aurelien Jarno
bdf20beac1 sparc64: add a VIS3 version of ceil, floor and trunc
sparc64 passes floating point values in the floating point registers.
As the the generic ceil, floor and trunc functions use integer
instructions, it makes sense to provide a VIS3 version consisting in
the the generic version compiled with -mvis3. GCC will then use
movdtox, movxtod, movwtos and movstow instructions.

sparc32 passes the floating point values in the integer registers, so it
doesn't make sense to do the same.

Changelog:
	* sysdeps/ieee754/dbl-64/s_trunc.c: Avoid alias renamed.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise.
	* sysdeps/ieee754/flt-32/s_truncf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/Makefile
	[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
	Add s_ceilf-vis3, s_ceil-vis3, s_floorf-vis3, s_floor-vis3,
	s_truncf-vis3, s_trunc-vis3.
	(CFLAGS-s_ceilf-vis3.c): New. Set to -Wa,-Av9d -mvis3.
	(CFLAGS-s_ceil-vis3.c): Likewise.
	(CFLAGS-s_floorf-vis3.c): Likewise.
	(CFLAGS-s_floor-vis3.c): Likewise.
	(CFLAGS-s_truncf-vis3.c): Likewise.
	(CFLAGS-s_trunc-vis3.c): Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis3.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis3.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis3.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis3.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-vis3.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-vis3.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise.
2016-08-03 13:35:22 +02:00
Siddhesh Poyarekar
cbf88869ed Fix cos computation for multiple precision fallback (bz #20357)
During the sincos consolidation I made two mistakes, one was a logical
error due to which cos(0x1.8475e5afd4481p+0) returned
sin(0x1.8475e5afd4481p+0) instead.

The second issue was an error in negating inputs for the correct
quadrants for sine.  I could not find a suitable test case for this
despite running a program to search for such an input for a couple of
hours.

Following patch fixes both issues.  Tested on x86_64.  Thanks to Matt
Clay for identifying the issue.

	[BZ #20357]
	* sysdeps/ieee754/dbl-64/s_sin.c (sloww): Fix up condition
	to call __mpsin/__mpcos and to negate values.
	* math/auto-libm-test-in: Add test.
	* math/auto-libm-test-out: Regenerate.
2016-07-18 22:33:09 +05:30
Rajalakshmi Srinivasaraghavan
41a359e22f Add nextup and nextdown math functions
TS 18661 adds nextup and nextdown functions alongside nextafter to provide
support for float128 equivalent to it.  This patch adds nextupl, nextup,
nextupf, nextdownl, nextdown and nextdownf to libm before float128 support.

The nextup functions return the next representable value in the direction of
positive infinity and the nextdown functions return the next representable
value in the direction of negative infinity.  These are currently enabled
as GNU extensions.
2016-06-16 21:37:45 +05:30
Joseph Myers
a2ae1696f7 Fix dbl-64 atan2 (sNaN, qNaN) (bug 20252).
The dbl-64 implementation of atan2, passed arguments (sNaN, qNaN),
fails to raise the "invalid" exception.  This patch fixes it to add
both arguments, rather than just adding the second argument to itself,
in the case where the second argument is a NaN (which is checked for
before checking for the first argument being a NaN).  sNaN tests for
atan2 are added, along with some qNaN tests I noticed were missing but
should have been there by analogy with other tests present.

Tested for x86_64 and x86.

	[BZ #20252]
	* sysdeps/ieee754/dbl-64/e_atan2.c (__ieee754_atan2): Add both
	arguments when second argument is a NaN.
	* math/libm-test.inc (atan2_test_data): Add sNaN tests and more
	qNaN tests.
2016-06-13 21:43:22 +00:00
Joseph Myers
88283451b2 Fix frexp (NaN) (bug 20250).
Various implementations of frexp functions return sNaN for sNaN
input.  This patch fixes them to add such arguments to themselves so
that qNaN is returned.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #20250]
	* sysdeps/i386/fpu/s_frexpl.S (__frexpl): Add non-finite input to
	itself.
	* sysdeps/ieee754/dbl-64/s_frexp.c (__frexp): Add non-finite or
	zero input to itself.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c (__frexp):
	Likewise.
	* sysdeps/ieee754/flt-32/s_frexpf.c (__frexpf): Likewise.
	* sysdeps/ieee754/ldbl-128/s_frexpl.c (__frexpl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_frexpl.c (__frexpl): Likewise.
	* math/libm-test.inc (frexp_test_data): Add sNaN tests.
2016-06-13 17:27:19 +00:00
Joseph Myers
b7519f61fe Fix ldbl-128ibm log1pl (sNaN) (bug 20234).
The ldbl-128ibm version of log1pl returns sNaN for sNaN input.  This
patch fixes it to add such inputs to themselves so that qNaN is
returned in this case.

Tested for powerpc.

	[BZ #20234]
	* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Add positive
	infinity or NaN input to itself.
2016-06-09 17:25:54 +00:00
Joseph Myers
f8fc4b4494 Fix ldbl-128ibm expm1l (sNaN) (bug 20233).
The ldbl-128ibm version of expm1l returns sNaN for sNaN input.  This
patch fixes it to add such inputs to themselves so that qNaN is
returned in this case.

Tested for powerpc.

	[BZ #20233]
	* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Add NaN input
	to itself.
2016-06-09 17:24:52 +00:00
Joseph Myers
59e53a7898 Fix ldbl-128 expm1l (sNaN) (bug 20232).
The ldbl-128 version of expm1l returns sNaN for sNaN input.  This
patch fixes it to add such inputs to themselves so that qNaN is
returned in this case.

Tested for mips64.

	[BZ #20232]
	* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Add NaN input to
	itself.
2016-06-09 17:23:51 +00:00
Joseph Myers
3d8b06bc61 Fix dbl-64 asin (sNaN) (bug 20213).
The dbl-64 version of asin returns sNaN for sNaN arguments.  This
patch fixes it to add NaN arguments to themselves so that qNaN is
returned in this case.

Tested for x86_64 and x86.

	[BZ #20213]
	* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Add NaN
	argument to itself.
	* math/libm-test.inc (asin_test_data): Add sNaN tests.
2016-06-06 22:21:11 +00:00
Joseph Myers
af0cfbaf1d Fix dbl-64 acos (sNaN) (bug 20212).
The dbl-64 version of acos returns sNaN for sNaN arguments.  This
patch fixes it to add NaN arguments to themselves so that qNaN is
returned in this case.

Tested for x86_64 and x86.

	[BZ #20212]
	* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_acos): Add NaN
	argument to itself.
	* math/libm-test.inc (acos_test_data): Add sNaN tests.
2016-06-06 22:10:11 +00:00
Joseph Myers
bba1419589 Fix ldbl-128ibm ceill, rintl etc. for sNaN arguments (bug 20156).
The ldbl-128ibm implementations of ceill, floorl, roundl, truncl,
rintl and nearbyintl wrongly return an sNaN when given an sNaN
argument.  This patch fixes them to add such an argument to itself to
turn it into a quiet NaN.  (The code structure means this "else" case
applies to any argument which is zero or not finite; it's OK to do
this in all such cases.)

Tested for powerpc.

	[BZ #20156]
	* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (__ceill): Add high part
	to itself when zero or not finite.
	* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (__floorl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c (__rintl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_roundl.c (__roundl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.
2016-05-27 13:59:24 +00:00
Joseph Myers
98c9c9d9ca Fix ldbl-128ibm sqrtl (sNaN) (bug 20153).
The ldbl-128ibm implementation of sqrtl wrongly returns an sNaN for
signaling NaN arguments.  This patch fixes it to quiet its argument,
using the same x * x + x return for infinities and NaNs as the dbl-64
implementation uses to ensure that +Inf maps to +Inf while -Inf and
NaN map to NaN.

Tested for powerpc.

	[BZ #20153]
	* sysdeps/ieee754/ldbl-128ibm/e_sqrtl.c (__ieee754_sqrtl): Return
	x * x + x for infinities and NaNs.
2016-05-26 22:58:36 +00:00
Joseph Myers
d73e7bdb3a Fix ldbl-128 j0l, j1l, y0l, y1l for sNaN argument (bug 20151).
The ldbl-128 implementations of j0l, j1l, y0l, y1l (also used for
ldbl-128ibm) return an sNaN argument unchanged.  This patch fixes them
to add a NaN argument to itself to quiet it before return.

Tested for mips64.

	[BZ #20151]
	* sysdeps/ieee754/ldbl-128/e_j0l.c (__ieee754_j0l): Add NaN
	argument to itself before returning result.
	(__ieee754_y0l): Likewise.
	* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Likewise.
	(__ieee754_y1l).
2016-05-26 20:55:03 +00:00
Joseph Myers
078d1cf8ac Do not raise "inexact" from generic round (bug 15479).
C99 and C11 allow but do not require ceil, floor, round and trunc to
raise the "inexact" exception for noninteger arguments.  TS 18661-1
requires that this exception not be raised by these functions.  This
aligns them with general IEEE semantics, where "inexact" is only
raised if the final step of rounding the infinite-precision result to
the result type is inexact; for these functions, the
infinite-precision integer result is always representable in the
result type, so "inexact" should never be raised.

The generic implementations of ceil, floor and round functions contain
code to force "inexact" to be raised.  This patch removes it for round
functions to align them with TS 18661-1 in this regard.  The tests
*are* updated by this patch; there are fewer architecture-specific
versions than for ceil and floor, and I fixed the powerpc ones some
time ago.  If any others still have the issue, as shown by tests for
round failing with spurious exceptions, they can be fixed separately
by architecture maintainers or others.

Tested for x86_64, x86 and mips64.

	[BZ #15479]
	* sysdeps/ieee754/dbl-64/s_round.c (huge): Remove variable.
	(__round): Do not force "inexact" exception.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_round.c (huge): Remove
	variable.
	(__round): Do not force "inexact" exception.
	* sysdeps/ieee754/flt-32/s_roundf.c (huge): Remove variable.
	(__roundf): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-128/s_roundl.c (huge): Remove variable.
	(__roundl): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-96/s_roundl.c (huge): Remove variable.
	(__roundl): Do not force "inexact" exception.
	* math/libm-test.inc (round_test_data): Do not allow spurious
	"inexact" exceptions.
2016-05-24 17:46:55 +00:00
Joseph Myers
876c5bd30c Do not raise "inexact" from generic floor (bug 15479).
C99 and C11 allow but do not require ceil, floor, round and trunc to
raise the "inexact" exception for noninteger arguments.  TS 18661-1
requires that this exception not be raised by these functions.  This
aligns them with general IEEE semantics, where "inexact" is only
raised if the final step of rounding the infinite-precision result to
the result type is inexact; for these functions, the
infinite-precision integer result is always representable in the
result type, so "inexact" should never be raised.

The generic implementations of ceil, floor and round functions contain
code to force "inexact" to be raised.  This patch removes it for floor
functions to align them with TS 18661-1 in this regard.  Note that
some architecture-specific versions may still raise "inexact", so the
tests are not updated and the bug is not yet fixed.

Tested for x86_64, x86 and mips64.

	[BZ #15479]
	* sysdeps/ieee754/dbl-64/s_floor.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__floor): Do not force "inexact" exception.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Do not mention
	"inexact" exception in comment.
	(huge): Remove variable.
	(__floor): Do not force "inexact" exception.
	* sysdeps/ieee754/flt-32/s_floorf.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__floorf): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-128/s_floorl.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__floorl): Do not force "inexact" exception.
2016-05-24 17:44:46 +00:00
Joseph Myers
ac2cc6f021 Do not raise "inexact" from generic ceil (bug 15479).
C99 and C11 allow but do not require ceil, floor, round and trunc to
raise the "inexact" exception for noninteger arguments.  TS 18661-1
requires that this exception not be raised by these functions.  This
aligns them with general IEEE semantics, where "inexact" is only
raised if the final step of rounding the infinite-precision result to
the result type is inexact; for these functions, the
infinite-precision integer result is always representable in the
result type, so "inexact" should never be raised.

The generic implementations of ceil, floor and round functions contain
code to force "inexact" to be raised.  This patch removes it for ceil
functions to align them with TS 18661-1 in this regard.  Note that
some architecture-specific versions may still raise "inexact", so the
tests are not updated and the bug is not yet fixed.

Tested for x86_64, x86 and mips64.

	[BZ #15479]
	* sysdeps/ieee754/dbl-64/s_ceil.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__ceil): Do not force "inexact" exception.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Do not mention
	"inexact" exception in comment.
	(huge): Remove variable.
	(__ceil): Do not force "inexact" exception.
	* sysdeps/ieee754/flt-32/s_ceilf.c (huge): Remove variable.
	(__ceilf): Do not force "inexact" exception.
	* sysdeps/ieee754/ldbl-128/s_ceill.c: Do not mention "inexact"
	exception in comment.
	(huge): Remove variable.
	(__ceill): Do not force "inexact" exception.
2016-05-24 17:42:10 +00:00
Joseph Myers
ffe9aaf2b9 Implement proper fmal for ldbl-128ibm (bug 13304).
ldbl-128ibm had an implementation of fmal that just did (x * y) + z in
most cases, with no attempt at actually being a fused operation.

This patch replaces it with a genuine fused operation.  It is not
necessarily correctly rounding, but should produce a result at least
as accurate as the long double arithmetic operations in libgcc, which
I think is all that can reasonably be expected for such a non-IEEE
format where arithmetic is approximate rather than rounded according
to any particular rule for determining the exact result.  Like the
libgcc arithmetic, it may produce spurious overflow and underflow
results, and it falls back to the libgcc multiplication in the case of
(finite, finite, zero).

This concludes the fixes for bug 13304; any subsequently found fma
issues should go in separate Bugzilla bugs.  Various other pieces of
bug 13304 were fixed in past releases over the past several years.

Tested for powerpc.

	[BZ #13304]
	* sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Include <fenv.h>,
	<float.h>, <math_private.h> and <stdlib.h>.
	(add_split): New function.
	(mul_split): Likewise.
	(ext_val): New typedef.
	(store_ext_val): New function.
	(mul_ext_val): New function.
	(compare): New function.
	(add_split_ext): New function.
	(__fmal): After checking for Inf, NaN and zero, compute result as
	an exact sum of scaled double values in round-to-nearest before
	adding those up and adjusting for other rounding modes.
	* math/auto-libm-test-in: Remove xfail-rounding:ldbl-128ibm from
	tests of fma.
	* math/auto-libm-test-out: Regenerated.
2016-05-19 20:10:56 +00:00
Paul E. Murphy
37a4c70bd4 Increase internal precision of ldbl-128ibm decimal printf [BZ #19853]
When the signs differ, the precision of the conversion sometimes
drops below 106 bits.  This strategy is identical to the
hexadecimal variant.

I've refactored tst-sprintf3 to enable testing a value with more
than 30 significant digits in order to demonstrate this failure
and its solution.

Additionally, this implicitly fixes a typo in the shift
quantities when subtracting from the high mantissa to compute
the difference.
2016-03-31 12:14:33 -05:00
Joseph Myers
613c92b3b5 Fix ldbl-128ibm nearbyintl in non-default rounding modes (bug 19790).
The ldbl-128ibm implementation of nearbyintl uses logic that only
works in round-to-nearest mode.  This contrasts with rintl, which
works in all rounding modes.

Now, arguably nearbyintl could simply be aliased to rintl, given that
spurious "inexact" is generally allowed for ldbl-128ibm, even for the
underlying arithmetic operations.  But given that the only point of
nearbyintl is to avoid "inexact", this patch follows the more
conservative approach of adding conditionals to the rintl
implementation to make it suitable for use to implement nearbyintl,
then builds it for nearbyintl with USE_AS_NEARBYINTL defined.  The
test test-nearbyint-except-2 shows up issues when traps on "inexact"
are enabled, which turn out to be problems with the powerpc
fenv_private.h implementation (two functions that should disable
exception traps potentially failing to do so in some cases); this
patch duly fixes that as well (I don't see any other existing cases
where this would be user-visible; there isn't much use of *_NOEX,
*hold* etc. in libm that requires exceptions to be discarded and not
trapped on).

Tested for powerpc.

	[BZ #19790]
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c [USE_AS_NEARBYINTL]
	(rintl): Define as macro.
	[USE_AS_NEARBYINTL] (__rintl): Likewise.
	(__rintl) [USE_AS_NEARBYINTL]: Use SET_RESTORE_ROUND_NOEX instead
	of fesetround.  Ensure results are evaluated before end of scope.
	* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Define
	USE_AS_NEARBYINTL and include s_rintl.c.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc):
	Disable exception traps in new environment.
	(libc_feholdsetround_ppc_ctx): Likewise.
2016-03-09 00:30:59 +00:00
Joseph Myers
cc4084017e Fix ldbl-128ibm remainderl equality test for zero low part (bug 19677).
The ldbl-128ibm implementation of remainderl has logic resulting in
incorrect tests for equality of the absolute values of the arguments
in the case of zero low parts.  If the low parts are both zero but
with different signs, this can wrongly cause equal arguments to be
treated as different, resulting in turn in incorrect signs of zero
result in nondefault rounding modes arising from the subtractions done
when the arguments are not equal.

This patch fixes the logic to convert -0 low parts into +0 before the
comparison (remquo already has separate logic to deal with signs of
zero results, so doesn't need such a change).  Tests are added for
remainderl and remquol similar to that for fmodl, and based on a
refactoring of it, since the bug depends on low parts which should not
be relied upon in tests not setting the representation explicitly
(although in fact the bug shows up in test-ldouble with current GCC).
Tested for powerpc.

	[BZ #19677]
	* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c
	(__ieee754_remainderl): Put zero low parts in canonical form.
	* sysdeps/ieee754/ldbl-128ibm/test-fmodrem-ldbl-128ibm.c: New
	file.  Based on
	sysdeps/ieee754/ldbl-128ibm/test-fmodl-ldbl-128ibm.c.
	* sysdeps/ieee754/ldbl-128ibm/test-fmodl-ldbl-128ibm.c: Replace
	with wrapper round test-fmodrem-ldbl-128ibm.c.
	* sysdeps/ieee754/ldbl-128ibm/test-remainderl-ldbl-128ibm.c: New
	file.
	* sysdeps/ieee754/ldbl-128ibm/test-remquol-ldbl-128ibm.c:
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/Makefile (tests): Add
	test-remainderl-ldbl-128ibm and test-remquol-ldbl-128ibm.
2016-03-08 00:27:21 +00:00
Joseph Myers
7b428e744b Fix ldbl-128ibm nextafterl, nexttowardl sign of zero result (bug 19678).
The ldbl-128ibm implementation of nextafterl / nexttowardl returns -0
in FE_DOWNWARD mode when taking the next value below the least
positive subnormal, when it should return +0.  This patch fixes it to
check explicitly for this case.

Tested for powerpc.

	[BZ #19678]
	* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl):
	Ensure +0.0 is returned when taking the next value below the least
	positive value.
2016-02-19 17:19:53 +00:00
Joseph Myers
c091488e51 Fix ldbl-128ibm powl overflow handling (bug 19674).
The ldbl-128ibm implementation of powl has some problems in the case
of overflow or underflow, which are mainly visible in non-default
rounding modes.

* When overflow or underflow is detected early, the correct sign of an
  overflowing or underflowing result is not allowed for.  This is
  mostly hidden in the default rounding mode by the errno-setting
  wrappers recomputing the result (except in non-default
  error-handling modes such as -lieee), but visible in other rounding
  modes where a result that is not zero or infinity causes the
  wrappers not to do the recomputation.

* The final scaling is done before the sign is incorporated in the
  result, but should be done afterwards for correct overflowing and
  underflowing results in directed rounding modes.

This patch fixes those problems.  Tested for powerpc.

	[BZ #19674]
	* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Include
	sign in overflowing and underflowing results when overflow or
	underflow is detected early.  Include sign in result before rather
	than after scaling.
2016-02-19 01:07:40 +00:00
Joseph Myers
9120a57f48 Fix ldbl-128ibm remainderl, remquol equality tests (bug 19603).
The ldbl-128ibm implementations of remainderl and remquol have logic
resulting in incorrect tests for equality of the absolute values of
the arguments.  Equality is tested based on the integer
representations of the high and low parts, with the sign bit masked
off the high part - but when this changes the sign of the high part,
the sign of the low part needs to be changed as well, and failure to
do this means arguments are wrongly treated as equal when they are
not.

This patch fixes the logic to adjust signs of low parts as needed.
Tested for powerpc.

	[BZ #19603]
	* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c
	(__ieee754_remainderl): Adjust sign of integer version of low part
	when taking absolute value of high part.
	* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
	* math/libm-test.inc (remainder_test_data): Add another test.
	(remquo_test_data): Likewise.
2016-02-19 00:55:46 +00:00
Joseph Myers
0fed79a827 Fix ldbl-128ibm fmodl handling of equal arguments with low part zero (bug 19602).
The ldbl-128ibm implementation of fmodl has logic to detect when the
first argument has absolute value less than or equal to the second.
This logic is only correct for nonzero low parts; if the high parts
are equal and the low parts are zero, then the signs of the low parts
(which have no semantic effect on the value of the long double number)
can result in equal values being wrongly treated as unequal, and an
incorrect result being returned from fmodl.  This patch fixes this by
checking for the case of zero low parts.

Although this does show up in tests from libm-test.inc (both tests of
fmodl, and, indirectly, of remainderl / dreml), the dependence on
non-semantic zero low parts means that test shouldn't be expected to
reproduce it reliably; thus, this patch adds a standalone test that
sets up affected values using unions.

Tested for powerpc.

	[BZ #19602]
	* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Handle
	equal high parts and both low parts zero specially.
	* sysdeps/ieee754/ldbl-128ibm/test-fmodl-ldbl-128ibm.c: New test.
	* sysdeps/ieee754/ldbl-128ibm/Makefile [$(subdir) = math] (tests):
	Add test-fmodl-ldbl-128ibm.
2016-02-18 22:54:07 +00:00
Joseph Myers
e2c631384a Fix ldbl-128ibm fmodl handling of subnormal results (bug 19595).
The ldbl-128ibm implementation of fmodl has completely bogus logic for
subnormal results (in this context, that means results for which the
result is in the subnormal range for double, not results with absolute
value below LDBL_MIN), based on code used for ldbl-128 that is correct
in that case but incorrect in the ldbl-128ibm use.  This patch fixes
it to convert the mantissa into the correct form expected by
ldbl_insert_mantissa, removing the other cases of the code that were
incorrect and in one case unreachable for ldbl-128ibm.  A correct
exponent value is then passed to ldbl_insert_mantissa to reflect the
shifted result.

Tested for powerpc.

	[BZ #19595]
	* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Use
	common logic for all cases of shifting subnormal results.  Do not
	insert sign bit in shifted mantissa.  Always pass -1023 as biased
	exponent to ldbl_insert_mantissa in subnormal case.
2016-02-18 22:42:06 +00:00
Joseph Myers
b9a76339be Fix ldbl-128ibm roundl for non-default rounding modes (bug 19594).
The ldbl-128ibm implementation of roundl is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases).  This patch reimplements it along
the lines used for floorl, ceill and truncl, using __round on the high
part, and on the low part if the high part is an integer, and then
adjusting in the cases where this is incorrect.

Tested for powerpc.

	[BZ #19594]
	* sysdeps/ieee754/ldbl-128ibm/s_roundl.c (__roundl): Use __round
	on high and low parts then adjust result and use
	ldbl_canonicalize_int if needed.
2016-02-18 22:24:32 +00:00
Joseph Myers
e2310a27be Fix ldbl-128ibm truncl for non-default rounding modes (bug 19593).
The ldbl-128ibm implementation of truncl is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases).  It is also unnecessarily
complicated, rounding both high and low parts to the nearest integer
and then adjusting for the semantics of trunc, when it seems more
natural to take the truncation of the high part (__trunc optimized
inline versions can be used), and the floor or ceiling of the low part
(depending on the sign of the high part) if the high part is an
integer, as was done for floorl and ceill.  This patch makes it use
that simpler approach.

Tested for powerpc.

	[BZ #19593]
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Use __trunc
	on high part and __floor or __ceil on low part then use
	ldbl_canonicalize_int if needed.
2016-02-18 21:52:07 +00:00
Joseph Myers
8a9fa0086d Fix ldbl-128ibm ceill for non-default rounding modes (bug 19592).
The ldbl-128ibm implementation of ceill is only correct in
round-to-nearest mode (in other modes, there are incorrect results and
overflow exceptions in some cases).  It is also unnecessarily
complicated, rounding both high and low parts to the nearest integer
and then adjusting for the semantics of ceil, when it seems more
natural to take the ceiling of the high part (__ceil optimized inline
versions can be used), and that of the low part if the high part is an
integer, as was done for floorl.  This patch makes it use that simpler
approach.

Tested for powerpc.

	[BZ #19592]
	* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (__ceill): Use __ceil on
	high and low parts then use ldbl_canonicalize_int if needed.
2016-02-18 21:40:39 +00:00