Commit Graph

344 Commits

Author SHA1 Message Date
Siddhesh Poyarekar
2506109403 Set/restore rounding mode only when needed
The most common use case of math functions is with default rounding
mode, i.e. rounding to nearest.  Setting and restoring rounding mode
is an unnecessary overhead for this, so I've added support for a
context, which does the set/restore only if the FP status needs a
change.  The code is written such that only x86 uses these.  Other
architectures should be unaffected by it, but would definitely benefit
if the set/restore has as much overhead relative to the rest of the
code, as the x86 bits do.

Here's a summary of the performance improvement due to these
improvements; I've only mentioned functions that use the set/restore
and have benchmark inputs for x86_64:

Before:

cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy
exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy
pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy
sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy
tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy

After:

cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy
exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy
pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy
sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy
tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy

So the improvements are:

cos: 27.9089%
exp: 22.6919%
pow: 4.01564%
sin: 19.1585%
tan: 1.96086%

The downside of the change is that it will have an adverse performance
impact on non-default rounding modes, but I think the tradeoff is
justified.
2013-06-12 10:36:48 +05:30
Joseph Myers
fab7ce3f5b Link extra-libs consistently with libc and ld.so. 2013-05-31 16:16:33 +00:00
Joseph Myers
dd4259b9f7 Test drem and pow10 in libm-test.inc. 2013-05-24 20:33:14 +00:00
Joseph Myers
4f8dfe270b Use same tests for isfinite/finite, lgamma/gamma. 2013-05-24 19:21:22 +00:00
Joseph Myers
b50a71810b Don't include expected results in libm-test test names. 2013-05-22 11:49:36 +00:00
Joseph Myers
db62a90753 Handle sincos with generic libm-test logic. 2013-05-19 14:45:41 +00:00
Joseph Myers
601a3a5fd5 Convert TEST_ff_f tests from code to data. 2013-05-12 13:17:09 +00:00
Joseph Myers
d8cd06db62 Improve tgamma accuracy (bugs 2546, 2560, 5159, 15426). 2013-05-08 11:58:18 +00:00
Joseph Myers
10de07f5fd Fix catan, catanh spurious underflows (bug 15423). 2013-05-01 10:07:00 +00:00
Joseph Myers
caf84319c1 Fix catan, catanh inaccuracy from atan2 denominators near 0 (bug 15416). 2013-04-30 11:27:35 +00:00
Joseph Myers
5b4217d71f Fix catan, catanh spurious overflows (bug 15409). 2013-04-27 14:57:41 +00:00
Allan McRae
4721b2d1ca Update i386 libm-test ULPs 2013-04-27 15:13:12 +10:00
Joseph Myers
2f38fbfe09 Fix catan, catanh inaccuracy through use of log (bug 15394). 2013-04-24 18:49:13 +00:00
Carlos O'Donell
aba5e333d4 libm-test.inc: Fix tests where cos(PI/2) != 0.
The value of PI is never exactly PI in any floating point representation,
and the value of PI/2 is never PI/2. It is wrong to expect cos(M_PI_2l)
to return 0, instead it will return an answer that is  non-zero because
M_PI_2l doesn't round to exactly PI/2 in the type used.

That is to say that the correct answer is to do the following:
* Take PI or PI/2.
* Round to the floating point representation.
* Take the rounded value and compute an infinite precision cos or sin.
* Use the rounded result of the infinite precision cos or sin as the
  answer to the test.

I used printf to do the type rounding, and Wolfram's Alpha to do the
infinite precision cos calculations.

The following changes bring x86-64 and x86 to 1/2 ulp for two tests.
It shows that the x86 cos implementation is quite good, and that
our test are flawed.

Unfortunately given that the rounding errors are type dependent we
need to fix this for each type. No regressions on x86-64 or x86.

---

2013-04-11  Carlos O'Donell  <carlos@redhat.com>

	* math/libm-test.inc (cos_test): Fix PI/2 test.
	(sincos_test): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Regenerate.
	* sysdeps/i386/fpu/libm-test-ulps: Regenerate.
2013-04-11 08:52:18 -04:00
Thomas Schwinge
74d87055bf Refer to two GCC PRs. 2013-04-03 14:13:44 +02:00
Joseph Myers
52ce486045 Fix cacosh inaccuracy and spurious exceptions (bug 15327). 2013-04-02 22:54:00 +00:00
Joseph Myers
ccc8cadf75 Fix casinh inaccuracy for imaginary part < 1.0, real part small (bug 10357). 2013-03-30 13:31:53 +00:00
Joseph Myers
3a7182a14b Fix casinh inaccuracy near i, imaginary part > 1 (bug 15307). 2013-03-27 14:38:44 +00:00
Thomas Schwinge
5aa4a1a1fd On 32-bit x86, disable certain tests involving sNaN values.
Follow-up to commit 495ded2c8c.
2013-03-21 16:05:29 +01:00
Joseph Myers
98c48fe5cc Fix Bessel function spurious overflows for ldbl-128 / ldbl-128ibm (bug 15285). 2013-03-21 13:57:21 +00:00
Joseph Myers
0a1b2ae6f6 Fix casinh inaccuracy for argument with imaginary part 1 (bug 15287). 2013-03-21 10:27:10 +00:00
Joseph Myers
d2f9799e7c Fix y1l spurious overflows for ldbl-96 (bug 15283). 2013-03-16 17:51:48 +00:00
Joseph Myers
2366713d87 Remove remaining bounded-pointers support from i386 .S files. 2013-02-21 22:21:52 +00:00
Joseph Myers
92945b5261 Remove some bounded-pointers support from i386 .S files. 2013-02-19 21:58:08 +00:00
Joseph Myers
e97ed6ddbe Remove bp-sym.h and BP_SYM uses from C code. 2013-02-14 13:12:02 +00:00
Joseph Myers
8cf28c5ebe Fix casinh spurious underflows away from [-i,i] (bug 15062). 2013-01-31 22:55:29 +00:00
Siddhesh Poyarekar
0b57daebab Fix application of the exception mask
Fixes BZ #14496.
2013-01-18 14:16:25 +05:30
Joseph Myers
728d7b43fc Fix cacos real-part inaccuracy for result real part near 0 (bug 15023). 2013-01-17 20:25:51 +00:00
Joseph Myers
a9708fed77 Fix casinh, casin overflow (bug 14996). 2013-01-07 14:59:53 +00:00
Joseph Myers
cdc1c96fba Fix casinh, casin inaccuracy from cancellation (bug 14994). 2013-01-04 13:25:17 +00:00
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Joseph Myers
1bead169c3 Fix powl inaccuracy for x86_64 and x86 (bug 13881). 2012-11-28 13:40:54 +00:00
Andreas Schwab
fff1530e61 Update i386 libm-test ULPs 2012-11-22 14:59:33 +01:00
David S. Miller
05b227bdae Correct tinyness handling in long-double and float y0/y1.
With help from Joseph Myers.
	* sysdeps/ieee754/flt-32/e_j0f.c (__ieee754_y0f): Adjust tinyness
	cutoff to 2**-13.
	* sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_y1f): Adjust tinyness
	cutoff to 2**-25.
	* sysdeps/ieee754/ldbl-128/e_j0l.c (U0): New constant.
	( __ieee754_y0l): Avoid arithmetic underflow when 'x' is very
	small.
	* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_y1l): Likewise.
	* math/libm-test.inc (y0_test): New tests.
	(y1_test): New tests.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
	* sysdeps/sparc/fpu/libm-test-ulps: Update.
2012-11-18 12:33:53 -08:00
Joseph Myers
60e235ee2a Fix spurious underflows from pow with results close to 1 (bug 14811). 2012-11-07 13:03:31 +00:00
Joseph Myers
5b5b04d628 Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796). 2012-11-03 19:48:53 +00:00
Joseph Myers
d032e0d29b Fix inaccuracy of clog, clog10 near |z| = 1 (bug 13629). 2012-09-25 19:43:49 +00:00
Liubov Dmitrieva
22bf5c1793 Add optimized sincosf for SSE2 for x86 and x86-64 2012-09-25 20:47:20 +02:00
Allan McRae
19fcedd5fc Update i386 ULPs for recently added math tests 2012-09-12 13:58:53 +10:00
Andreas Jaeger
bcd6c8dc64 Update libm-test-ulps 2012-09-03 15:43:56 +02:00
Andreas Jaeger
fab967007b Another ULPs update. 2012-08-14 08:04:51 +02:00
Andreas Jaeger
e11f5155b2 Update i386 ULPs 2012-08-14 08:02:08 +02:00
Marek Polacek
b67e9372b2 Get rid of ASM_TYPE_DIRECTIVE{,_PREFIX}. 2012-08-02 21:04:29 +02:00
Joseph Myers
d0419dbfbd Improve clog, clog10 handling of values with real or imaginary part slightly above 1 (bug 13629). 2012-07-31 14:21:19 +00:00
Joseph Myers
da865e95bc Improve clog, clog10 handling of values with real or imaginary part 1 (bug 13629). 2012-07-26 11:31:35 +00:00
Joseph Myers
638a572eb0 Fix clog, clog10 spurious underflow exceptions (bug 14337). 2012-07-09 11:06:34 +00:00
Joseph Myers
9ad63c23ea Fix tanf underflow close to pi/4 (bug 14154). 2012-07-06 21:19:38 +00:00
Joseph Myers
f17ac40d7c Fix expm1 spurious underflow exceptions (bug 6778). 2012-07-06 11:17:41 +00:00
Joseph Myers
cdfe2c5eb3 Fix csqrt underflow (bugs 14157, 14331). 2012-07-05 11:02:13 +00:00
Joseph Myers
ca61cf32d9 Fix ctan, ctanh of subnormals in round-upwards mode (bug 14328). 2012-07-04 09:55:26 +00:00