Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fesetround by making it a weak alias of
__fesetround and making the affected code call __fesetround. An
existing __fesetround function in fenv_libc.h for powerpc is renamed
to __fesetround_inline.
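
The aliasing pattern described above can be sketched as follows. This is
only an illustration, not the glibc source: demo_internal_setround and
demo_setround are hypothetical stand-ins for __fesetround and fesetround,
the rounding mode is a plain variable rather than a hardware register, and
the attribute syntax assumes GCC or Clang on an ELF target (glibc itself
uses its own weak_alias and libm_hidden_* macros).

```c
#include <assert.h>

/* Simulated rounding-mode state (assumption: four modes, numbered 0..3).  */
static int demo_rounding_mode;

/* Internal name: code inside the library calls this one directly, so it
   never depends on the public C99 symbol.  */
int
demo_internal_setround (int mode)
{
  if (mode < 0 || mode > 3)
    return 1;                   /* Nonzero: unsupported mode.  */
  demo_rounding_mode = mode;
  return 0;
}

/* Public name: a weak alias of the internal definition.  A program that
   defines its own demo_setround overrides only this alias.  */
extern int demo_setround (int mode)
  __attribute__ ((weak, alias ("demo_internal_setround")));

int
demo_current_mode (void)
{
  return demo_rounding_mode;
}

/* Usage: both names reach the same definition.  */
void
demo_alias_usage (void)
{
  assert (demo_setround (2) == 0);
  assert (demo_current_mode () == 2);
  assert (demo_internal_setround (1) == 0);
  assert (demo_current_mode () == 1);
}
```

Because the public symbol is weak, a C90 program that happens to define its
own function with that name does not clash with the library, while library
code keeps calling the internal name.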
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that fesetround failures disappear from the linknamespace
test results (feupdateenv remains to be addressed to complete fixing
bug 17748).
[BZ #17748]
* include/fenv.h (__fesetround): Declare. Use libm_hidden_proto.
* math/fesetround.c (fesetround): Rename to __fesetround and
define as weak alias of __fesetround. Use libm_hidden_weak.
* sysdeps/aarch64/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/alpha/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/arm/fesetround.c (fesetround): Likewise.
* sysdeps/hppa/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/i386/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/ia64/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/m68k/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/mips/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/powerpc/fpu/fenv_libc.h (__fesetround): Rename to
__fesetround_inline.
* sysdeps/powerpc/fpu/fenv_private.h (libc_fesetround_ppc): Call
__fesetround_inline instead of __fesetround.
* sysdeps/powerpc/fpu/fesetround.c (fesetround): Rename to
__fesetround and define as weak alias of __fesetround. Use
libm_hidden_weak. Call __fesetround_inline instead of
__fesetround.
* sysdeps/powerpc/nofpu/fesetround.c (fesetround): Rename to
__fesetround and define as weak alias of __fesetround. Use
libm_hidden_weak.
* sysdeps/powerpc/powerpc32/e500/nofpu/fesetround.c (fesetround):
Likewise.
* sysdeps/s390/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/sh/sh4/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/sparc/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/tile/math_private.h (__fesetround): New inline function.
* sysdeps/x86_64/fpu/fesetround.c (fesetround): Rename to
__fesetround and define as weak alias of __fesetround. Use
libm_hidden_weak.
* sysdeps/generic/math_private.h (default_libc_fesetround): Call
__fesetround instead of fesetround.
(default_libc_feholdexcept_setround): Likewise.
(libc_feholdsetround_ctx): Likewise.
(libc_feholdsetround_noex_ctx): Likewise.
The natural fix for some linknamespace test failures, where C90 libm
functions call C99 <fenv.h> functions, is to make fe* into weak
aliases for __fe* and call __fe* from within libm as needed.
To do this, the __fe* names need to be available for that purpose -
that is, they must not be used for something other than aliases of
fe*. On powerpc, however, __fegetround is an inline function in
fenv_libc.h, with no corresponding fegetround inline function;
fegetround has an equivalent macro expansion in bits/fenvinline.h, but
that is disabled if __NO_MATH_INLINES (which is defined for building
libm).
I see no need for that disabling; it's not even clear that
__NO_MATH_INLINES should affect <fenv.h>, and the results of
fegetround are completely defined so there is no semantic effect of
that disabling at all outside glibc. The x86 inline feraiseexcept is
conditioned on __USE_EXTERN_INLINES not __NO_MATH_INLINES (but that's
an inline function rather than a macro).
This patch removes the __NO_MATH_INLINES conditional on that
fegetround macro, so that it is expanded inline inside
glibc. In turn, this means that direct calls to __fegetround from C99
functions in ldbl-128ibm can be changed to calls to fegetround, so
that nofpu fenv_libc.h files don't need to define __fegetround at all
and, by changing ldbl-128ibm files to use <fenv.h> not <fenv_libc.h>,
non-e500 nofpu no longer needs an fenv_libc.h file.
The other macros in fenvinline.h are left conditional on
__NO_MATH_INLINES, although the only case where this should make
a difference is one involving undefined behavior (if the argument to
the function is not a valid exception macro).
The out-of-line definition for fegetround uses __fegetround (the
inline function removed by this patch). So that this continues to work,
the fenvinline.h header is made to define __fegetround, and then to
define fegetround to call __fegetround.
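
A minimal sketch of that header arrangement, under stated assumptions: the
names demo_getround_inline and demo_getround are hypothetical stand-ins for
__fegetround and fegetround, the status register is simulated by a
variable, and the rounding mode is assumed to live in its low two bits.
The real fenvinline.h reads the FPSCR and is not reproduced here.

```c
#include <assert.h>

/* Simulated status register (assumption: low two bits = rounding mode).  */
static unsigned int demo_status_reg;

/* The helper the header provides under the internal name.  */
static inline int
demo_getround_inline (void)
{
  return (int) (demo_status_reg & 3u);
}

/* Out-of-line definition, written in terms of the helper.  It is defined
   before the macro below so that its own name is not macro-expanded.  */
int
demo_getround (void)
{
  return demo_getround_inline ();
}

/* The public name as a function-like macro that expands to the helper,
   with no conditional around it.  */
#define demo_getround() demo_getround_inline ()

void
demo_set_status_reg (unsigned int value)
{
  demo_status_reg = value;
}

/* Usage: the macro path and the real function agree.  */
void
demo_macro_usage (void)
{
  demo_set_status_reg (0xf2u);
  assert (demo_getround () == 2);       /* Expands to the inline helper.  */
  assert ((demo_getround) () == 2);     /* Parentheses suppress the macro.  */
}
```

Callers that write the public name get the inline expansion, while the
out-of-line function still exists for anyone taking its address or
suppressing the macro.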
Tested for powerpc32 (hard float) that installed stripped shared
libraries are unchanged by this patch; also tested that powerpc-nofpu
build still works. (This patch does not itself fix any bugs; it
simply cleans things up in preparation for separate bug fixes.)
* sysdeps/powerpc/bits/fenvinline.h (fegetround): Rename macro to
__fegetround and redefine to call __fegetround. Remove condition
on [!__NO_MATH_INLINES].
* sysdeps/powerpc/fpu/fenv_libc.h (__fegetround): Remove inline
function.
* sysdeps/powerpc/nofpu/fenv_libc.h: Remove file.
* sysdeps/powerpc/powerpc32/e500/nofpu/fenv_libc.h (__fegetround):
Remove macro.
* sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Include <fenv.h>
instead of <fenv_libc.h>.
(__llrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise.
(__lrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
(__rintl): Call fegetround instead of __fegetround.
This patch optimizes the FPSCR update in the exception and rounding-mode
change functions by writing the register only if the new value differs
from the current one. It also optimizes fedisableexcept and
feenableexcept by removing an unnecessary FPSCR read.
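
The write-only-if-changed idea can be sketched like this. It is an
illustration under assumptions, not the patch itself: the register is
simulated by a variable with a write counter, the sim_* names are
hypothetical, and the rounding mode is assumed to occupy the low two bits.

```c
#include <assert.h>

static unsigned long long sim_fpscr;      /* Simulated register.  */
static unsigned int sim_fpscr_writes;     /* Counts (costly) writes.  */

static unsigned long long
sim_fpscr_read (void)
{
  return sim_fpscr;
}

static void
sim_fpscr_write (unsigned long long value)
{
  sim_fpscr = value;
  ++sim_fpscr_writes;
}

/* Set the rounding-mode field, touching the register only when the
   resulting value differs from the current one.  */
int
sim_setround (unsigned int mode)
{
  if (mode > 3u)
    return 1;                             /* Nonzero: unsupported mode.  */
  unsigned long long old_value = sim_fpscr_read ();
  unsigned long long new_value = (old_value & ~3ull) | mode;
  if (new_value != old_value)
    sim_fpscr_write (new_value);
  return 0;
}

unsigned int
sim_write_count (void)
{
  return sim_fpscr_writes;
}

/* Usage: repeating the same request costs no additional write.  */
void
sim_setround_usage (void)
{
  sim_setround (1);
  unsigned int writes = sim_write_count ();
  sim_setround (1);
  assert (sim_write_count () == writes);
}
```

The comparison is cheap relative to a register write that may serialize the
pipeline, which is presumably why skipping redundant writes pays off.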
This patch improves the performance of some math functions by adding libc_fexxx
variants of the inline functions that handle both FPU rounding-mode and exception
set/restore, and by using them in the libc_fexxx_ctx functions. It is based on the
existing fexxx family of functions for PPC with FPU.
Here is a summary of the performance improvements due to this patch (measured on a
POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
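
To read the two tables side by side, the throughput gain per function is
the ratio of the "calls/Mcy" figures after and before. A small sketch that
computes it from the numbers quoted above (the helper and table names are
hypothetical, and only the calls/Mcy column is used):

```c
#include <assert.h>
#include <stdio.h>

struct bench_row
{
  const char *name;
  double before;                /* calls/Mcy before the patch.  */
  double after;                 /* calls/Mcy after the patch.  */
};

/* Figures copied from the Before/After listings above.  */
static const struct bench_row bench_rows[] = {
  { "cos", 18744.0, 40158.5 },
  { "exp", 5450.01, 6461.23 },
  { "pow", 11994.9, 14847.2 },
  { "sin", 16982.0, 31048.4 },
  { "tan", 5761.63, 6836.68 },
};

/* Ratio of throughput after the patch to throughput before it.  */
double
throughput_ratio (double before, double after)
{
  assert (before > 0.0);
  return after / before;
}

void
print_throughput_ratios (void)
{
  for (size_t i = 0; i < sizeof bench_rows / sizeof bench_rows[0]; ++i)
    printf ("%s(): %.2fx\n", bench_rows[i].name,
            throughput_ratio (bench_rows[i].before, bench_rows[i].after));
}
```

By this measure cos and sin roughly double in throughput, while exp, pow
and tan gain around a fifth.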
2008-11-13 Ryan S. Arnold <rsa@us.ibm.com>
[BZ #6411]
* sysdeps/powerpc/fpu/Makefile: Added test case tst-setcontext-fpscr.
* sysdeps/powerpc/fpu/feholdexcpt.c (_FPU_MASK_ALL): Define to replace
magic numbers.
* sysdeps/powerpc/fpu/fenv_libc.h (fesetenv_register): Dynamically
choose mtfsf insn based on PPC_FEATURE_HAS_DFP.
(relax_fenv_state): Same as above.
(FPSCR_29): Reserve bit in ISA 2.05.
(FPSCR_NI): Provide define for compat.
* sysdeps/powerpc/fpu/fesetenv.c (_FPU_MASK_ALL): Define to replace
magic numbers.
* sysdeps/powerpc/fpu/feupdateenv.c (_FPU_MASK_ALL): Define to replace
magic numbers.
* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c: New file. Test case to
test setcontext and swapcontext with dynamic 64-bit FPSCR detection.
* sysdeps/powerpc/powerpc32/fpu/__longjmp-common.S (__longjmp): Adjust
access to hwcap to account for hwcap size increase to uint64_t.
* sysdeps/powerpc/powerpc32/fpu/setjmp-common.S (__sigsetjmp):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/getcontext-common.S
(*setcontext): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power6/fpu/setcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power6/fpu/swapcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
(*setcontext): Dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP. Adjust access to hwcap to account for hwcap size
increase to uint64_t.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S
(*swapcontext): Dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP. Adjust access to hwcap to account for hwcap size
increase to uint64_t.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power6/fpu/setcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power6/fpu/swapcontext.S:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S
(*setcontext): Dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S
(*swapcontext): Dynamically select mtfsf insn based on
PPC_FEATURE_HAS_DFP.
__fe_nomask_env.
* sysdeps/powerpc/fpu/fe_nomask.c: Add libm_hidden_def.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/fe_nomask.c: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/fpu/fe_nomask.c: Likewise.
* sysdeps/powerpc/bits/fenv.h: Make safe for C++.
* sysdeps/unix/sysv/linux/powerpc/bits/mathinline.h: New file.
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Rename
function from fegetexcept and make old name weak alias.
* include/fenv.h: Declare __fegetexcept.
* sysdeps/powerpc/fpu/fedisblxcpt.c: Use __fegetexcept instead of
fegetexcept.
* sysdeps/powerpc/fpu/feenablxcpt.c: Likewise.
* sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Avoid call
to fetestexcept.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Use __frexpl
instead of frexpl to avoid local PLT.
* math/s_significandl.c (__significandl): Use __ilogbl instead of
ilogbl to avoid local PLT.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __ldexpl
instead of ldexpl to avoid local PLT.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c (__ieee754_expl): Use
__roundl not roundl to avoid local PLT.
* sysdeps/ieee754/ldbl-128/e_j0l.c: Use function names which avoid
local PLTs. Use __sincosl instead of separate sinl and cosl
calls.
* sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise.
2001-07-06 Paul Eggert <eggert@twinsun.com>
* manual/argp.texi: Remove ignored LGPL copyright notice; it's
not appropriate for documentation anyway.
* manual/libc-texinfo.sh: "Library General Public License" ->
"Lesser General Public License".
2001-07-06 Andreas Jaeger <aj@suse.de>
* All files under GPL/LGPL version 2: Place under LGPL version
2.1.