The ldbl-128ibm implementations of ceill, floorl, roundl, truncl,
rintl and nearbyintl wrongly return an sNaN when given an sNaN
argument. This patch fixes them to add such an argument to itself to
turn it into a quiet NaN. (The code structure means this "else" case
applies to any argument which is zero or not finite; it's OK to do
this in all such cases.)
Tested for powerpc.
[BZ #20156]
* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (__ceill): Add high part
to itself when zero or not finite.
* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (__floorl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c (__rintl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_roundl.c (__roundl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.
The ldbl-128ibm implementation of nearbyintl uses logic that only
works in round-to-nearest mode. This contrasts with rintl, which
works in all rounding modes.
Now, arguably nearbyintl could simply be aliased to rintl, given that
spurious "inexact" is generally allowed for ldbl-128ibm, even for the
underlying arithmetic operations. But given that the only point of
nearbyintl is to avoid "inexact", this patch follows the more
conservative approach of adding conditionals to the rintl
implementation to make it suitable for use to implement nearbyintl,
then builds it for nearbyintl with USE_AS_NEARBYINTL defined. The
test test-nearbyint-except-2 shows up issues when traps on "inexact"
are enabled, which turn out to be problems with the powerpc
fenv_private.h implementation (two functions that should disable
exception traps potentially failing to do so in some cases); this
patch duly fixes that as well (I don't see any other existing cases
where this would be user-visible; there isn't much use of *_NOEX,
*hold* etc. in libm that requires exceptions to be discarded and not
trapped on).
Tested for powerpc.
[BZ #19790]
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c [USE_AS_NEARBYINTL]
(rintl): Define as macro.
[USE_AS_NEARBYINTL] (__rintl): Likewise.
(__rintl) [USE_AS_NEARBYINTL]: Use SET_RESTORE_ROUND_NOEX instead
of fesetround. Ensure results are evaluated before end of scope.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Define
USE_AS_NEARBYINTL and include s_rintl.c.
* sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc):
Disable exception traps in new environment.
(libc_feholdsetround_ppc_ctx): Likewise.
The natural fix for some linknamespace test failures, where C90 libm
functions call C99 <fenv.h> functions, is to make fe* into weak
aliases for __fe* and call __fe* from within libm as needed.
To do this, the __fe* names need to be available for that purpose -
that is, they must not be used for something other than aliases of
fe*. On powerpc, however, __fegetround is an inline function in
fenv_libc.h, with no corresponding fegetround inline function;
fegetround has an equivalent macro expansion in bits/fenvinline.h, but
that is disabled if __NO_MATH_INLINES (which is defined for building
libm).
I see no need for that disabling; it's not even clear that
__NO_MATH_INLINES should affect <fenv.h>, and the results of
fegetround are completely defined so there is no semantic effect of
that disabling at all outside glibc. The x86 inline feraiseexcept is
conditioned on __USE_EXTERN_INLINES not __NO_MATH_INLINES (but that's
an inline function rather than a macro).
This patch removes the __NO_MATH_INLINES conditional on that
fegetround macro, so resulting in it being expanded inline inside
glibc. In turn, this means that direct calls to __fegetround from C99
functions in ldbl-128ibm can be changed to calls to fegetround, so
that nofpu fenv_libc.h files don't need to define __fegetround at all
and, by changing ldbl-128ibm files to use <fenv.h> not <fenv_libc.h>,
non-e500 nofpu no longer needs an fenv_libc.h file.
The other macros in fenvinline.h are left conditional on
__NO_MATH_INLINES, although since the only case where this should make
a difference is one involving undefined behavior (if the argument to
the function is not a valid exception macro).
The out-of-line definition for fegetround uses __fegetround (the
inline function removed by this patch). So this continues to work,
the fenvinline.h header is made to define __fegetround, and then to
define fegetround to call __fegetround.
Tested for powerpc32 (hard float) that installed stripped shared
libraries are unchanged by this patch; also tested that powerpc-nofpu
build still works. (This patch does not itself fix any bugs; it
simply cleans things up in preparation for separate bug fixes.)
* sysdeps/powerpc/bits/fenvinline.h (fegetround): Rename macro to
__fegetround and redefine to call __fegetround. Remove condition
on [!__NO_MATH_INLINES].
* sysdeps/powerpc/fpu/fenv_libc.h (__fegetround): Remove inline
function.
* sysdeps/powerpc/nofpu/fenv_libc.h: Remove file.
* sysdeps/powerpc/powerpc32/e500/nofpu/fenv_libc.h (__fegetround):
Remove macro.
* sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Include <fenv.h>
instead of <fenv_libc.h>.
(__llrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise.
(__lrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
(__rintl): Call fegetround instead of __fegetround.
This patch helps some math functions performance by adding the libc_fexxx
variant of inline functions to handle both FPU round and exception set/restore
and by using them on the libc_fexxx_ctx functions. It is based on already coded
fexxx family functions for PPC with fpu.
Here is the summary of performance improvements due this patch (measured on a
POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
Jakub Jelinek <jakub@redhat.com>
Roland McGrath <roland@redhat.com>
Steven Munroe <sjmunroe@us.ibm.com>
Alan Modra <amodra@bigpond.net.au>
* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Comment fix.
* sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* math/libm-test.inc (check_float_internal): Allow ulp <= 0.5.
(erfc_test): Don't run erfcl (27.0L) test if erfcl (27.0L) is
denormal.
[TEST_LDOUBLE] (ceil_test, floor_test, llrint_test, llround_test,
rint_test, round_test, trunc_test): Add new tests.
* sysdeps/powerpc/powerpc32/fpu/s_copysignl.S: New file.
* sysdeps/powerpc/powerpc32/fpu/s_fabs.S: New file.
* sysdeps/powerpc/powerpc32/fpu/s_fabsl.S: New file.
* sysdeps/powerpc/powerpc32/fpu/s_fdim.c: New file.
* sysdeps/powerpc/powerpc32/fpu/s_fmax.S: New file.
* sysdeps/powerpc/powerpc32/fpu/s_fmin.S: New file.
* sysdeps/powerpc/powerpc32/fpu/s_isnan.c: New file.
* sysdeps/powerpc/powerpc64/fpu/s_ceill.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_fabs.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_fdim.c: New file.
* sysdeps/powerpc/powerpc64/fpu/s_floorl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_fmax.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_fmin.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_isnan.c: New file.
* sysdeps/powerpc/powerpc64/fpu/s_llrintl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_llroundl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_lrintl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_lroundl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyintl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_rintl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_roundl.S: New file.
* sysdeps/powerpc/powerpc64/fpu/s_truncl.S: New file.
* sysdeps/unix/sysv/linux/powerpc/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/fpu/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/configure.in: New file.
* sysdeps/unix/sysv/linux/powerpc/configure: New file.
* sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h
(__LONG_DOUBLE_MATH_OPTIONAL): Define.
(__NO_LONG_DOUBLE_MATH): Define.
* sysdeps/unix/sysv/linux/powerpc/nldbl-abi.h: New file.
* sysdeps/powerpc/fpu/s_isnan.c: Include math_ldbl_opt.h.
* sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (ceill): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (copysignl): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_floor.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (floorl): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (llrintl, lrintl): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_llround.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (llroundl, lroundl): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_rint.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (rintl): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_round.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (roundl): Add compatibility symbols.
* sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (truncl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_ceil.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (ceill): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (copysignl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_floor.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (floorl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (lrintl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_llrint.c: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (llrintl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_lround.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (lroundl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_rint.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (rintl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_round.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (roundl): Add compatibility symbols.
* sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Include math_ldbl_opt.h.
[LONG_DOUBLE_COMPAT] (truncl): Add compatibility symbols.
* misc/qefgcvt_r.c [LDBL_MIN_10_EXP == -291] (FLOAT_MIN_10_NORM): New.
* sysdeps/powerpc/fpu/bits/mathdef.h (__NO_LONG_DOUBLE_MATH): Remove.
* sysdeps/powerpc/Implies: Add ieee754/ldbl-128ibm.
* sysdeps/powerpc/powerpc32/Implies: Remove powerpc/soft-fp.
* sysdeps/ieee754/ldbl-128ibm/Makefile: New file.
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_acosl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_atan2l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/e_sqrtl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h: New file.
* sysdeps/ieee754/ldbl-128ibm/k_cosl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c: New file.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: New file.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_atanl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_cosl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_fabsl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_finitel.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_fpclassifyl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_ilogbl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_isinfl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_isnanl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_logbl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_signbitl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_sincosl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_sinl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_tanl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_truncl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c: New file.
* sysdeps/ieee754/ldbl-128ibm/t_sincosl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/w_expl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_floorl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_roundl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_ceill.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: New file.
* sysdeps/ieee754/ldbl-128/e_powl.c: Fix old comment.