sysdeps/powerpc/fpu/ has versions of llround and llroundf that are
actually used only for powerpc32 because
sysdeps/powerpc/powerpc64/fpu/ has its own versions of those
functions. This patch moves them into sysdeps/powerpc/powerpc32/fpu
to reflect where they are actually used (in preparation for fixing
other problems with those functions).
Tested for powerpc that installed stripped shared libraries are
unchanged by this patch.
* sysdeps/powerpc/fpu/s_llround.c: Move to ....
* sysdeps/powerpc/powerpc32/fpu/s_llround.c: ...here.
* sysdeps/powerpc/fpu/s_llroundf.c: Move to ....
* sysdeps/powerpc/powerpc32/fpu/s_llroundf.c: ...here.
Similar to various other bugs in this area, hypot functions can fail
to raise the underflow exception when the result is tiny and inexact
but one or more low bits of the intermediate result that is scaled
down (or, in the i386 case, converted from a wider evaluation format)
are zero. This patch forces the exception in a similar way to
previous fixes.
Note that this issue cannot arise for implementations of hypotf using
double (or wider) for intermediate evaluation (if hypotf should
underflow, that means the double square root is being computed of some
number of the form N*2^-298, for 0 < N < 2^46, which is exactly
represented as a double, and whatever the rounding mode such a square
root cannot have a mantissa with all zeroes after the initial 23
bits). Thus no changes are made to hypotf implementations in this
patch, only to hypot and hypotl.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18803]
* sysdeps/i386/fpu/e_hypot.S: Use DEFINE_DBL_MIN.
(MO): New macro.
(__ieee754_hypot) [PIC]: Load PIC register.
(__ieee754_hypot): Use DBL_NARROW_EVAL_UFLOW_NONNEG instead of
DBL_NARROW_EVAL.
* sysdeps/ieee754/dbl-64/e_hypot.c (__ieee754_hypot): Use
math_check_force_underflow_nonneg in case where result might be
tiny.
* sysdeps/ieee754/ldbl-128/e_hypotl.c (__ieee754_hypotl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl):
Likewise.
* sysdeps/ieee754/ldbl-96/e_hypotl.c (__ieee754_hypotl): Likewise.
* sysdeps/powerpc/fpu/e_hypot.c (__ieee754_hypot): Likewise.
* math/auto-libm-test-in: Add more tests of hypot.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some sin and sincos
implementations do not raise the underflow exception for subnormal
arguments, when the result is tiny and inexact. This patch forces the
exception in a similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16526]
[BZ #16538]
* sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>.
(__sin): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sin and sincos.
* math/auto-libm-test-out: Regenerated.
Similarly to sqrt in
<https://sourceware.org/ml/libc-alpha/2015-02/msg00353.html>, the
powerpc sqrtf implementation for when _ARCH_PPCSQ is not defined also
relies on a * b + c being contracted into a fused multiply-add.
Although this contraction is not explicitly disabled for e_sqrtf.c, it
still seems appropriate to make the file explicit about its
requirements by using __builtin_fmaf; this patch does so.
Furthermore, it turns out that doing so fixes the observed inaccuracy
and missing exceptions (that is, that without explicit __builtin_fmaf
usage, it was not being compiled as intended).
Tested for powerpc32 (hard float).
[BZ #17967]
* sysdeps/powerpc/fpu/e_sqrtf.c (__slow_ieee754_sqrtf): Use
__builtin_fmaf instead of relying on contraction of a * b + c.
As Adhemerval noted in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00451.html>, the
powerpc sqrt implementation for when _ARCH_PPCSQ is not defined is
inaccurate in some cases.
The problem is that this code relies on fused multiply-add, and relies
on the compiler contracting a * b + c to get a fused operation. But
sysdeps/ieee754/dbl-64/Makefile disables contraction for e_sqrt.c,
because the implementation in that directory relies on *not* having
contracted operations.
While it would be possible to arrange makefiles so that an earlier
sysdeps directory can disable the setting in
sysdeps/ieee754/dbl-64/Makefile, it seems a lot cleaner to make the
dependence on fused operations explicit in the .c file. GCC 4.6
introduced support for __builtin_fma on powerpc and other
architectures with such instructions, so we can rely on that; this
patch duly makes the code use __builtin_fma for all such fused
operations.
Tested for powerpc32 (hard float).
2015-02-12 Joseph Myers <joseph@codesourcery.com>
[BZ #17964]
* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
__builtin_fma instead of relying on contraction of a * b + c.
This patch fixes a bug introduced by 18f2945ae9, where it optimizes
the FPSCR set by just issuing a mtfs instruction if new flag is different
from older one. The issue is a typo, where the new flag should the the
new value, instead of the old one.
It fixes BZ#17885.
Some powerpc64 processors (e5500 core for instance) does not provide the
fsqrt instruction, however current check to use in math_private.h is
__WORDSIZE and _ARCH_PWR4 (ISA 2.02). This is patch change it to use
the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to
decide whether to generate fsqrt instruction).
It fixes BZ#16576.
Concluding the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of feupdateenv by making it a weak alias for
__feupdateenv and making the affected code call __feupdateenv.
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch). Also tested for ARM
(soft-float) that the math.h linknamespace tests now pass.
[BZ #17748]
* include/fenv.h (__feupdateenv): Use libm_hidden_proto.
* math/feupdateenv.c (__feupdateenv): Use libm_hidden_def.
* sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/arm/feupdateenv.c (feupdateenv): Rename to __feupdateenv
and define as weak alias of __feupdateenv. Use libm_hidden_weak.
* sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c
(__feupdateenv): Likewise.
* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/tile/math_private.h (__feupdateenv): New inline
function.
* sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/generic/math_private.h (default_libc_feupdateenv): Call
__feupdateenv instead of feupdateenv.
(default_libc_feupdateenv_test): Likewise.
(libc_feresetround_ctx): Likewise.
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fesetround by making it a weak alias of
__fesetround and making the affected code call __fesetround. An
existing __fesetround function in fenv_libc.h for powerpc is renamed
to __fesetround_inline.
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that fesetround failures disappear from the linknamespace
test results (feupdateenv remains to be addressed to complete fixing
bug 17748).
[BZ #17748]
* include/fenv.h (__fesetround): Declare. Use libm_hidden_proto.
* math/fesetround.c (fesetround): Rename to __fesetround and
define as weak alias of __fesetround. Use libm_hidden_weak.
* sysdeps/aarch64/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/alpha/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/arm/fesetround.c (fesetround): Likewise.
* sysdeps/hppa/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/i386/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/ia64/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/m68k/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/mips/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/powerpc/fpu/fenv_libc.h (__fesetround): Rename to
__fesetround_inline.
* sysdeps/powerpc/fpu/fenv_private.h (libc_fesetround_ppc): Call
__fesetround_inline instead of __fesetround.
* sysdeps/powerpc/fpu/fesetround.c (fesetround): Rename to
__fesetround and define as weak alias of __fesetround. Use
libm_hidden_weak. Call __fesetround_inline instead of
__fesetround.
* sysdeps/powerpc/nofpu/fesetround.c (fesetround): Rename to
__fesetround and define as weak alias of __fesetround. Use
libm_hidden_weak.
* sysdeps/powerpc/powerpc32/e500/nofpu/fesetround.c (fesetround):
Likewise.
* sysdeps/s390/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/sh/sh4/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/sparc/fpu/fesetround.c (fesetround): Likewise.
* sysdeps/tile/math_private.h (__fesetround): New inline function.
* sysdeps/x86_64/fpu/fesetround.c (fesetround): Rename to
__fesetround and define as weak alias of __fesetround. Use
libm_hidden_weak.
* sysdeps/generic/math_private.h (default_libc_fesetround): Call
__fesetround instead of fesetround.
(default_libc_feholdexcept_setround): Likewise.
(libc_feholdsetround_ctx): Likewise.
(libc_feholdsetround_noex_ctx): Likewise.
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fesetenv by making it a weak alias of
__fesetenv and making the affected code (including various copies of
feupdateenv which also gets called from C90 functions) call
__fesetenv.
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that fesetenv failures disappear from the linknamespace
test results (fsetround and feupdateenv remain to be addressed to
complete fixing bug 17748).
[BZ #17748]
* include/fenv.h (__fesetenv): Use libm_hidden_proto.
* math/fesetenv.c (__fesetenv): Use libm_hidden_def.
* sysdeps/aarch64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv
and define as weak alias of __fesetenv. Use libm_hidden_weak.
* sysdeps/alpha/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
* sysdeps/arm/fesetenv.c (fesetenv): Rename to __fesetenv and
define as weak alias of __fesetenv. Use libm_hidden_weak.
* sysdeps/hppa/fpu/fesetenv.c (fesetenv): Likewise.
* sysdeps/i386/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
* sysdeps/ia64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and
define as weak alias of __fesetenv. Use libm_hidden_weak.
* sysdeps/m68k/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
* sysdeps/mips/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and
define as weak alias of __fesetenv. Use libm_hidden_weak.
* sysdeps/powerpc/fpu/fesetenv.c (__fesetenv): Use
libm_hidden_def.
* sysdeps/powerpc/nofpu/fesetenv.c (__fesetenv): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/fesetenv.c (__fesetenv):
Likewise.
* sysdeps/s390/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and
define as weak alias of __fesetenv. Use libm_hidden_weak.
* sysdeps/sh/sh4/fpu/fesetenv.c (fesetenv): Likewise.
* sysdeps/sparc/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def.
* sysdeps/tile/math_private.h (__fesetenv): New inline function.
* sysdeps/x86_64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv
and define as weak alias of __fesetenv. Use libm_hidden_weak.
* sysdeps/generic/math_private.h (default_libc_fesetenv): Use
__fesetenv instead of fesetenv.
(libc_feresetround_noex_ctx): Likewise.
* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c
(__feupdateenv): Likewise.
* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Likewise.
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of feholdexcept by making it a weak alias of
__feholdexcept and making the affected code call __feholdexcept.
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that feholdexcept failures disappear from the
linknamespace test failures (fesetenv, fsetround and feupdateenv
remain to be addressed to complete fixing bug 17748).
[BZ #17748]
* include/fenv.h (__feholdexcept): Declare. Use
libm_hidden_proto.
* math/feholdexcpt.c (feholdexcept): Rename to __feholdexcept and
define as weak alias of __feholdexcept. Use libm_hidden_weak.
* sysdeps/aarch64/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/alpha/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/arm/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/hppa/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/i386/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/ia64/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/m68k/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/mips/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/powerpc/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/powerpc/nofpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/feholdexcpt.c
(feholdexcept): Likewise.
* sysdeps/s390/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/sh/sh4/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/sparc/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/x86_64/fpu/feholdexcpt.c (feholdexcept): Likewise.
* sysdeps/generic/math_private.h (default_libc_feholdexcept): Use
__feholdexcept instead of feholdexcept.
(default_libc_feholdexcept_setround): Likewise.
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fegetround by making it a weak alias of
__fegetround and making the affected code call __fegetround.
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that fegetround failures disappear from the linknamespace
test failures (feholdexcept, fesetenv, fesetround and feupdateenv
remain to be addressed before bug 17748 is fully fixed, although this
patch may suffice to fix the failures in some cases, when the libc_fe*
functions are implemented but there is no architecture-specific sqrt
implementation in use so there were failures from fegetround used by
sqrt but no other such failures).
[BZ #17748]
* include/fenv.h (__fegetround): Declare. Use libm_hidden_proto.
* math/fegetround.c (fegetround): Rename to __fegetround and
define as weak alias of __fegetround. Use libm_hidden_weak.
* sysdeps/aarch64/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/alpha/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/arm/fegetround.c (fegetround): Likewise.
* sysdeps/hppa/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/i386/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/ia64/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/m68k/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/mips/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/powerpc/fpu/fegetround.c (fegetround): Likewise.
Undefine after rather than before function definition; use
parentheses around function name in definition.
(__fegetround): Also undefine macro after function definition.
* sysdeps/powerpc/nofpu/fegetround.c (fegetround): Rename to
__fegetround and define as weak alias of __fegetround. Use
libm_hidden_weak. Do not undefine as macro.
* sysdeps/powerpc/powerpc32/e500/nofpu/fegetround.c (fegetround):
Likewise.
* sysdeps/s390/fpu/fegetround.c (fegetround): Rename to
__fegetround and define as weak alias of __fegetround. Use
libm_hidden_weak.
* sysdeps/sh/sh4/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/sparc/fpu/fegetround.c (fegetround): Likewise.
* sysdeps/tile/math_private.h (__fegetround): New inline function.
* sysdeps/x86_64/fpu/fegetround.c (fegetround): Rename to
__fegetround and define as weak alias of __fegetround. Use
libm_hidden_weak.
* sysdeps/ieee754/dbl-64/e_sqrt.c (__ieee754_sqrt): Use
__fegetround instead of fegetround.
Some C90 libm functions call fegetenv via libc_feholdsetround*
functions in math_private.h. This patch makes them call __fegetenv
instead, making fegetenv into a weak alias for __fegetenv as needed.
Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch). Also tested for ARM
(soft-float) that fegetenv failures disappear from the linknamespace
test failures (however, similar fixes will also be needed for
fegetround, feholdexcept, fesetenv, fesetround and feupdateenv before
this set of namespace issues covered by bug 17748 is fully fixed and
those linknamespace tests start passing).
[BZ #17748]
* include/fenv.h (__fegetenv): Use libm_hidden_proto.
* math/fegetenv.c (__fegetenv): Use libm_hidden_def.
* sysdeps/aarch64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv
and define as weak alias of __fegetenv. Use libm_hidden_weak.
* sysdeps/alpha/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
* sysdeps/arm/fegetenv.c (fegetenv): Rename to __fegetenv and
define as weak alias of __fegetenv. Use libm_hidden_weak.
* sysdeps/hppa/fpu/fegetenv.c (fegetenv): Likewise.
* sysdeps/i386/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
* sysdeps/ia64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and
define as weak alias of __fegetenv. Use libm_hidden_weak.
* sysdeps/m68k/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
* sysdeps/mips/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and
define as weak alias of __fegetenv. Use libm_hidden_weak.
* sysdeps/powerpc/fpu/fegetenv.c (__fegetenv): Use
libm_hidden_def.
* sysdeps/powerpc/nofpu/fegetenv.c (__fegetenv): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (__fegetenv):
Likewise.
* sysdeps/s390/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and
define as weak alias of __fegetenv. Use libm_hidden_weak.
* sysdeps/sh/sh4/fpu/fegetenv.c (fegetenv): Likewise.
* sysdeps/sparc/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def.
* sysdeps/tile/math_private.h (__fegetenv): New inline function.
* sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv
and define as weak alias of __fegetenv. Use libm_hidden_weak.
* sysdeps/generic/math_private.h (libc_feholdsetround_ctx): Use
__fegetenv instead of fegetenv.
(libc_feholdsetround_noex_ctx): Likewise.
The natural fix for some linknamespace test failures, where C90 libm
functions call C99 <fenv.h> functions, is to make fe* into weak
aliases for __fe* and call __fe* from within libm as needed.
To do this, the __fe* names need to be available for that purpose -
that is, they must not be used for something other than aliases of
fe*. On powerpc, however, __fegetround is an inline function in
fenv_libc.h, with no corresponding fegetround inline function;
fegetround has an equivalent macro expansion in bits/fenvinline.h, but
that is disabled if __NO_MATH_INLINES (which is defined for building
libm).
I see no need for that disabling; it's not even clear that
__NO_MATH_INLINES should affect <fenv.h>, and the results of
fegetround are completely defined so there is no semantic effect of
that disabling at all outside glibc. The x86 inline feraiseexcept is
conditioned on __USE_EXTERN_INLINES not __NO_MATH_INLINES (but that's
an inline function rather than a macro).
This patch removes the __NO_MATH_INLINES conditional on that
fegetround macro, so resulting in it being expanded inline inside
glibc. In turn, this means that direct calls to __fegetround from C99
functions in ldbl-128ibm can be changed to calls to fegetround, so
that nofpu fenv_libc.h files don't need to define __fegetround at all
and, by changing ldbl-128ibm files to use <fenv.h> not <fenv_libc.h>,
non-e500 nofpu no longer needs an fenv_libc.h file.
The other macros in fenvinline.h are left conditional on
__NO_MATH_INLINES, although since the only case where this should make
a difference is one involving undefined behavior (if the argument to
the function is not a valid exception macro).
The out-of-line definition for fegetround uses __fegetround (the
inline function removed by this patch). So this continues to work,
the fenvinline.h header is made to define __fegetround, and then to
define fegetround to call __fegetround.
Tested for powerpc32 (hard float) that installed stripped shared
libraries are unchanged by this patch; also tested that powerpc-nofpu
build still works. (This patch does not itself fix any bugs; it
simply cleans things up in preparation for separate bug fixes.)
* sysdeps/powerpc/bits/fenvinline.h (fegetround): Rename macro to
__fegetround and redefine to call __fegetround. Remove condition
on [!__NO_MATH_INLINES].
* sysdeps/powerpc/fpu/fenv_libc.h (__fegetround): Remove inline
function.
* sysdeps/powerpc/nofpu/fenv_libc.h: Remove file.
* sysdeps/powerpc/powerpc32/e500/nofpu/fenv_libc.h (__fegetround):
Remove macro.
* sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Include <fenv.h>
instead of <fenv_libc.h>.
(__llrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise.
(__lrintl): Call fegetround instead of __fegetround.
* sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Include <fenv.h>
instead of <fenv_libc.h>.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
(__rintl): Call fegetround instead of __fegetround.
Various C90 and UNIX98 libm functions call feraiseexcept, which is not
in those standards. This causes linknamespace test failures - except
on x86 / x86_64, where feraiseexcept is inline (for the relevant
constant arguments) in bits/fenv.h.
This patch fixes this by making those functions call __feraiseexcept
instead. All changes are applied to all architectures rather than
considering the possibility that some might not be needed in some
cases (e.g. x86) as it seems most maintainable to keep architectures
consistent.
Where __feraiseexcept does not exist, it is added, with feraiseexcept
made a weak alias; where it is a strong alias, it is made weak.
libm_hidden_def / libm_hidden_proto are used with __feraiseexcept
(this might in some cases improve code generation for existing calls
to __feraiseexcept in some code on some architectures). Where there
are dummy feraiseexcept macros (on architectures without
floating-point exceptions support, to avoid compile errors from
references to undefined FE_* macros), corresponding dummy
__feraiseexcept macros are added. And on x86, to ensure
__feraiseexcept calls still get inlined, the inline function in
bits/fenv.h is refactored so that most of it can be reused in an
inline __feraiseexcept in a separate include/bits/fenv.h.
Calls are changed in C90/UNIX98 functions, but generally not in
functions missing from those standards. They are also changed in
libc_fe* functions (on the basis that those might be used in any libm
function), and in feupdateenv (on the same basis - may be used, via
default libc_*, in any libm function - of course feupdateenv will need
changing to __feupdateenv in a subsequent patch to make that fully
namespace-clean).
No __feraiseexcept is added corresponding to the feraiseexcept in
powerpc bits/fenvinline.h, because that macro definition is
conditional on !defined __NO_MATH_INLINES, and glibc libm is built
with -D__NO_MATH_INLINES, so changing internal calls to use
__feraiseexcept should make no difference.
Tested for x86_64 (testsuite; the only change in disassembly of
installed shared libraries is a slight code reordering in clog10, of
no apparent significance). Also tested for MIPS, where (in the
configuration tested) it eliminates math.h linknamespace failures for
n32 and n64 (some for o32 remain because of other issues).
[BZ #17723]
* include/fenv.h (__feraiseexcept): Use libm_hidden_proto.
* math/fraiseexcpt.c (__feraiseexcept): Use libm_hidden_def.
* sysdeps/aarch64/fpu/fraiseexcpt.c (feraiseexcept): Rename to
__feraiseexcept and define as weak alias of __feraiseexcept. Use
libm_hidden_weak.
* sysdeps/arm/fraiseexcpt.c (feraiseexcept): Likewise.
* sysdeps/hppa/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
* sysdeps/i386/fpu/fraiseexcpt.c (__feraiseexcept): Use
libm_hidden_def.
* sysdeps/ia64/fpu/fraiseexcpt.c (feraiseexcept): Rename to
__feraiseexcept and define as weak alias of __feraiseexcept. Use
libm_hidden_weak.
* sysdeps/m68k/coldfire/fpu/fraiseexcpt.c (feraiseexcept):
Likewise.
* sysdeps/microblaze/math_private.h (__feraiseexcept): New macro.
* sysdeps/mips/fpu/fraiseexcpt.c (feraiseexcept): Rename to
__feraiseexcept and define as weak alias of __feraiseexcept. Use
libm_hidden_weak.
* sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Use
libm_hidden_def.
* sysdeps/powerpc/nofpu/fraiseexcpt.c (__feraiseexcept): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/fraiseexcpt.c
(__feraiseexcept): Likewise.
* sysdeps/s390/fpu/fraiseexcpt.c (feraiseexcept): Rename to
__feraiseexcept and define as weak alias of __feraiseexcept. Use
libm_hidden_weak.
* sysdeps/sh/sh4/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
* sysdeps/sparc/fpu/fraiseexcpt.c (__feraiseexcept): Use
libm_hidden_def.
* sysdeps/tile/math_private.h (__feraiseexcept): New macro.
* sysdeps/unix/sysv/linux/alpha/fraiseexcpt.S (__feraiseexcept):
Use libm_hidden_def.
* sysdeps/x86_64/fpu/fraiseexcpt.c (__feraiseexcept): Use
libm_hidden_def.
(feraiseexcept): Define as weak not strong alias. Use
libm_hidden_weak.
* sysdeps/x86/fpu/bits/fenv.h (__feraiseexcept_invalid_divbyzero):
New inline function. Factored out of ...
(feraiseexcept): ... here. Use __feraiseexcept_invalid_divbyzero.
* sysdeps/x86/fpu/include/bits/fenv.h: New file.
* math/e_scalb.c (invalid_fn): Call __feraiseexcept instead of
feraiseexcept.
* math/w_acos.c (__acos): Likewise.
* math/w_asin.c (__asin): Likewise.
* math/w_ilogb.c (__ilogb): Likewise.
* math/w_j0.c (y0): Likewise.
* math/w_j1.c (y1): Likewise.
* math/w_jn.c (yn): Likewise.
* math/w_log.c (__log): Likewise.
* math/w_log10.c (__log10): Likewise.
* sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/aarch64/fpu/math_private.h
(libc_feupdateenv_test_aarch64): Likewise.
* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/arm/fenv_private.h (libc_feupdateenv_test_vfp): Likewise.
* sysdeps/arm/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Likewise.
* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Likewise.
This patch fixes the __copysignf optimized macro meant to internal libm
usage when used with constant value. Without the explicit cast to
float, if it is used with const double value (for instance, on
s_casinhf.c) double constants will be used and it may lead to precision
issues in some algorithms.
It fixes the following failures on PPC64/POWER7:
Failure: Test: Real part of: cacos_downward (inf + 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf - 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf + 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf - 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf + 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf - 0 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf + 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf - 0.5 i)
Result:
is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23
should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
This patch optimizes the FPSCR update on exception and rounding change
functions by just updating its value if new value if different from
current one. It also optimizes fedisableexcept and feenableexcept by
removing an unecessary FPSCR read.
This patch fixes the powerpc32 optimized nearbyint/nearbyintf bogus
results for FE_DOWNWARD rounding mode. This is due wrong instructions
sequence used in the rounding calculation (two subtractions instead of
adition and a subtraction).
Fixes BZ#16815.
As recently discussed
<https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it
doesn't seem particularly useful for libm-test-ulps files to contain
huge amounts of data on ulps for individual tests; just the global
maximum observed ulps for each function, together with the
verification of exceptions, errno and special results such as
infinities and NaNs for each test, suffices to verify that a
function's behavior on the given test inputs is within the expected
accuracy. Removing this data reduces source tree churn caused by
updates to these files when libm tests are added, and reduces the
frequency with which testsuite additions actually need libm-test-ulps
changes at all.
Accordingly, this patch removes that data, so that individual tests
get checked against the global bounds for the given function and only
generate an error if those are exceeded. Tested x86_64 (including
verifying that if an ulps value is artificially reduced, the tests do
indeed fail as they should and "make regen-ulps" generates the
expected changes).
* math/libm-test.inc (struct ulp_data): Don't refer to ulps for
individual tests in comment.
(libm-test-ulps.h): Don't refer to test_ulps in #include comment.
(prev_max_error): New variable.
(prev_real_max_error): Likewise.
(prev_imag_max_error): Likewise.
(compare_ulp_data): Don't refer to test names in comment.
(find_test_ulps): Remove function.
(find_function_ulps): Likewise.
(find_complex_function_ulps): Likewise.
(init_max_error): Take function name as argument. Look up ulps
for that function.
(print_ulps): Remove function.
(print_max_error): Use prev_max_error instead of calling
find_function_ulps.
(print_complex_max_error): Use prev_real_max_error and
prev_imag_max_error instead of calling find_complex_function_ulps.
(check_float_internal): Take max_ulp parameter instead of calling
find_test_ulps. Don't call print_ulps.
(check_float): Update call to check_float_internal.
(check_complex): Update calls to check_float_internal.
(START): Pass argument to init_max_error.
* math/gen-libm-test.pl (%results): Don't include "kind"
information.
(parse_ulps): Don't handle ulps of individual tests.
(print_ulps_file): Likewise.
(output_ulps): Likewise.
* math/README.libm-test: Update.
* manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of
individual tests.
* sysdeps/aarch64/libm-test-ulps: Remove individual test ulps.
* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
* sysdeps/arm/libm-test-ulps: Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Likewise.
* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise.
* sysdeps/microblaze/libm-test-ulps: Likewise.
* sysdeps/mips/mips32/libm-test-ulps: Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
* sysdeps/s390/fpu/libm-test-ulps: Likewise.
* sysdeps/sh/libm-test-ulps: Likewise.
* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
* sysdeps/tile/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.
For PPC64, all the wrappers at sysdeps are superfluous: they are
basically the same implementation from math/w_sqrt.c with the
'#ifdef _IEEE_LIBM'. And the power4 version just force the 'fsqrt'
instruction utilization with an inline assembly, which is already
handled by math_private.h __ieee754_sqrt implementation.
This patch helps some math functions performance by adding the libc_fexxx
variant of inline functions to handle both FPU round and exception set/restore
and by using them on the libc_fexxx_ctx functions. It is based on already coded
fexxx family functions for PPC with fpu.
Here is the summary of performance improvements due this patch (measured on a
POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy