Like the previous change, exploit the fact that computation for sin
and cos is identical except that it is apart by a quadrant. Also
remove csloww, csloww1 and csloww2 since they can easily be expressed
in terms of sloww, sloww1 and sloww2.
The sin and cos computation for this range of input is identical
except for a difference in quadrants by 1. Exploit that fact and the
common argument reduction to reduce computations for sincos.
Range reduction needs to be done only once for sin and cos, so copy
over all of the relevant functions (__sin, __cos, reduce_and_compute)
and consolidate common code.
Include the __sin and __cos functions as local static copies to allow
deper optimization of the functions. This change shows an improvement
of about 17% in the min case and 12.5% in the mean case for the sincos
microbenchmark on x86_64.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin)[IN_SINCOS]: Mark function
static and don't set or restore rounding.
(__cos)[IN_SINCOS]: Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c: Include s_sin.c.
(__sincos): Set and restore rounding mode. Remove check for infinite
or NaN input.
Similar to various other bugs in this area, some sin and sincos
implementations do not raise the underflow exception for subnormal
arguments, when the result is tiny and inexact. This patch forces the
exception in a similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16526]
[BZ #16538]
* sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>.
(__sin): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sin and sincos.
* math/auto-libm-test-out: Regenerated.
In 84ba214c, I removed some redundant sign computations and in the
process, I incorrectly got rid of a temporary variable, thus passing
the absolute value of the input to bsloww1. This caused #16623.
This fix undoes the incorrect change.
This patch consolidates the multiple copies of code that looks up sin
and cos of a number from the lookup table and computes the final
value, into static functions. This does not have a noticeable
performance impact since the functions are inlined by gcc.
There is further scope for consolidation in the functions but they
cause a more noticable impact on performance (>5%) due to which I have
held back on them.
Removed more redundant computations in the slow paths of the sin and
cos functions. The notable change is the passing of the most
significant bits of X to the slow functions to check if X is positive
so that just the absolute value of x can be passed and the repeated
ABS() operation is avoided.
There are multiple points in the code where the absolute value of a
number is computed multiple times or is computed even though the value
can only be positive. This change removes those redundant
computations. Tested on x86_64 to verify that there were no
regressions in the testsuite.
- Remove redundant mynumber union definitions
- Clean up a clumsy ternary operator
- Rename TAYLOR_SINCOS to TAYLOR_SIN since we're only expanding the
sin Taylor series in it.
2001-03-12 Ulrich Drepper <drepper@redhat.com>
* sysdeps/ieee754/dbl-64/e_remainder.c: Fix handling of boundary
conditions.
* sysdeps/ieee754/dbl-64/e_pow.c: Fix handling of boundary
conditions.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Handle Inf and NaN
correctly.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Handle NaN
correctly.
(__ieee754_acos): Likewise.
redefinition.
* sysdeps/ieee754/dbl-64/endian.h: Define also one of BIG_ENDI and
LITTLE_ENDI.
* sysdeps/ieee754/dbl-64/MathLib.h (Init_Lib): Use void as
parameter list.
2001-03-11 Ulrich Drepper <drepper@redhat.com>
Last-bit accurate math library implementation by IBM Haifa.
Contributed by Abraham Ziv <ziv@il.ibm.com>, Moshe Olshansky
<olshansk@il.ibm.com>, Ealan Henis <ealan@il.ibm.com>, and
Anna Reitman <reitman@il.ibm.com>.
* math/Makefile (dbl-only-routines): New variable.
(libm-routines): Add $(dbl-only-routines).
* sysdeps/ieee754/dbl-64/e_acos.c: Empty, definition is in e_asin.c.
* sysdeps/ieee754/dbl-64/e_asin.c: Replaced with accurate asin
implementation.
* sysdeps/ieee754/dbl-64/e_atan2.c: Replaced with accurate atan2
implementation.
* sysdeps/ieee754/dbl-64/e_exp.c: Replaced with accurate exp
implementation.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c: Don't use __kernel_sin and
__kernel_cos.
* sysdeps/ieee754/dbl-64/e_log.c: Replaced with accurate log
implementation.
* sysdeps/ieee754/dbl-64/e_remainder.c: Replaced with accurate
remainder implementation.
* sysdeps/ieee754/dbl-64/e_pow.c: Replaced with accurate pow
implementation.
* sysdeps/ieee754/dbl-64/e_sqrt.c: Replaced with accurate sqrt
implementation.
* sysdeps/ieee754/dbl-64/k_cos.c: Empty, definition is in s_sin.c.
* sysdeps/ieee754/dbl-64/k_sin.c: Empty, definition is in s_sin.c.
* sysdeps/ieee754/dbl-64/s_atan.c: Replaced with accurate atan
implementation.
* sysdeps/ieee754/dbl-64/s_cos.c: Empty, definition is in s_sin.c.
* sysdeps/ieee754/dbl-64/s_sin.c: Replaced with accurate sin/cos
implementation.
* sysdeps/ieee754/dbl-64/s_sincos.c: Rewritten to not use __kernel_sin
and __kernel_cos.
* sysdeps/ieee754/dbl-64/s_tan.c: Replaced with accurate tan
implementation.
* sysdeps/ieee754/dbl-64/Dist: Add new non-code files.
* sysdeps/ieee754/dbl-64/MathLib.h: New file.
* sysdeps/ieee754/dbl-64/asincos.tbl: New file.
* sysdeps/ieee754/dbl-64/atnat.h: New file.
* sysdeps/ieee754/dbl-64/atnat2.h: New file.
* sysdeps/ieee754/dbl-64/branred.c: New file.
* sysdeps/ieee754/dbl-64/branred.h: New file.
* sysdeps/ieee754/dbl-64/dla.h: New file.
* sysdeps/ieee754/dbl-64/doasin.c: New file.
* sysdeps/ieee754/dbl-64/doasin.h: New file.
* sysdeps/ieee754/dbl-64/dosincos.c: New file.
* sysdeps/ieee754/dbl-64/dosincos.h: New file.
* sysdeps/ieee754/dbl-64/endian.h: New file.
* sysdeps/ieee754/dbl-64/halfulp.c: New file.
* sysdeps/ieee754/dbl-64/mpa.c: New file.
* sysdeps/ieee754/dbl-64/mpa.h: New file.
* sysdeps/ieee754/dbl-64/mpa2.h: New file.
* sysdeps/ieee754/dbl-64/mpatan.c: New file.
* sysdeps/ieee754/dbl-64/mpatan.h: New file.
* sysdeps/ieee754/dbl-64/mpatan2.c: New file.
* sysdeps/ieee754/dbl-64/mpexp.c: New file.
* sysdeps/ieee754/dbl-64/mpexp.h: New file.
* sysdeps/ieee754/dbl-64/mplog.c: New file.
* sysdeps/ieee754/dbl-64/mplog.h: New file.
* sysdeps/ieee754/dbl-64/mpsqrt.c: New file.
* sysdeps/ieee754/dbl-64/mpsqrt.h: New file.
* sysdeps/ieee754/dbl-64/mptan.c: New file.
* sysdeps/ieee754/dbl-64/mydefs.h: New file.
* sysdeps/ieee754/dbl-64/powtwo.tbl: New file.
* sysdeps/ieee754/dbl-64/root.tbl: New file.
* sysdeps/ieee754/dbl-64/sincos.tbl: New file.
* sysdeps/ieee754/dbl-64/sincos32.c: New file.
* sysdeps/ieee754/dbl-64/sincos32.h: New file.
* sysdeps/ieee754/dbl-64/slowexp.c: New file.
* sysdeps/ieee754/dbl-64/slowpow.c: New file.
* sysdeps/ieee754/dbl-64/uasncs.h: New file.
* sysdeps/ieee754/dbl-64/uatan.tbl: New file.
* sysdeps/ieee754/dbl-64/uexp.h: New file.
* sysdeps/ieee754/dbl-64/uexp.tbl: New file.
* sysdeps/ieee754/dbl-64/ulog.h: New file.
* sysdeps/ieee754/dbl-64/ulog.tbl: New file.
* sysdeps/ieee754/dbl-64/upow.h: New file.
* sysdeps/ieee754/dbl-64/upow.tbl: New file.
* sysdeps/ieee754/dbl-64/urem.h: New file.
* sysdeps/ieee754/dbl-64/uroot.h: New file.
* sysdeps/ieee754/dbl-64/usncs.h: New file.
* sysdeps/ieee754/dbl-64/utan.h: New file.
* sysdeps/ieee754/dbl-64/utan.tbl: New file.
* sysdeps/i386/fpu/branred.c: New file.
* sysdeps/i386/fpu/doasin.c: New file.
* sysdeps/i386/fpu/dosincos.c: New file.
* sysdeps/i386/fpu/halfulp.c: New file.
* sysdeps/i386/fpu/mpa.c: New file.
* sysdeps/i386/fpu/mpatan.c: New file.
* sysdeps/i386/fpu/mpatan2.c: New file.
* sysdeps/i386/fpu/mpexp.c: New file.
* sysdeps/i386/fpu/mplog.c: New file.
* sysdeps/i386/fpu/mpsqrt.c: New file.
* sysdeps/i386/fpu/mptan.c: New file.
* sysdeps/i386/fpu/sincos32.c: New file.
* sysdeps/i386/fpu/slowexp.c: New file.
* sysdeps/i386/fpu/slowpow.c: New file.
* sysdeps/ia64/fpu/branred.c: New file.
* sysdeps/ia64/fpu/doasin.c: New file.
* sysdeps/ia64/fpu/dosincos.c: New file.
* sysdeps/ia64/fpu/halfulp.c: New file.
* sysdeps/ia64/fpu/mpa.c: New file.
* sysdeps/ia64/fpu/mpatan.c: New file.
* sysdeps/ia64/fpu/mpatan2.c: New file.
* sysdeps/ia64/fpu/mpexp.c: New file.
* sysdeps/ia64/fpu/mplog.c: New file.
* sysdeps/ia64/fpu/mpsqrt.c: New file.
* sysdeps/ia64/fpu/mptan.c: New file.
* sysdeps/ia64/fpu/sincos32.c: New file.
* sysdeps/ia64/fpu/slowexp.c: New file.
* sysdeps/ia64/fpu/slowpow.c: New file.
* sysdeps/m68k/fpu/branred.c: New file.
* sysdeps/m68k/fpu/doasin.c: New file.
* sysdeps/m68k/fpu/dosincos.c: New file.
* sysdeps/m68k/fpu/halfulp.c: New file.
* sysdeps/m68k/fpu/mpa.c: New file.
* sysdeps/m68k/fpu/mpatan.c: New file.
* sysdeps/m68k/fpu/mpatan2.c: New file.
* sysdeps/m68k/fpu/mpexp.c: New file.
* sysdeps/m68k/fpu/mplog.c: New file.
* sysdeps/m68k/fpu/mpsqrt.c: New file.
* sysdeps/m68k/fpu/mptan.c: New file.
* sysdeps/m68k/fpu/sincos32.c: New file.
* sysdeps/m68k/fpu/slowexp.c: New file.
* sysdeps/m68k/fpu/slowpow.c: New file.
* iconvdata/gconv-modules: Add a number of alias, mostly for IBM
codepages.