Commit Graph

18 Commits

Author SHA1 Message Date
Paul A. Clarke
e3d85df50b [powerpc] fenv_private.h clean up
fenv_private.h includes unused functions, magic macro constants, and
some replicated common code fragments.

Remove unused functions, replace magic constants with constants from
fenv_libc.h, and refactor replicated code.

Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
2019-09-27 08:48:56 -05:00
Paul A. Clarke
f1c56cdff0 [powerpc] SET_RESTORE_ROUND optimizations and bug fix
SET_RESTORE_ROUND brackets a block of code, temporarily setting and
restoring the rounding mode and letting everything else, including
exceptions generated within the block, pass through.

On powerpc, the current code clears the exception enables, which will hide
exceptions generated within the block.  This issue was introduced by me
in commit e905212627.

Fix this by not clearing exception enable bits in the prologue.

Also, since we are no longer changing the enable bits in either the
prologue or the epilogue, there is no need to test for entering/exiting
non-stop mode.

Also, optimize the prologue get/save/set rounding mode operations for
POWER9 and later by using 'mffscrn' when possible.

Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Fixes: e905212627

2019-09-19  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_and_set_rn): New.
	(__fe_mffscrn): New.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc_ctx):
	Do not clear enable bits, remove obsolete code, use
	fegetenv_and_set_rn.
	(libc_feresetround_ppc): Remove obsolete code, use
	fegetenv_and_set_rn.
2019-09-19 13:02:30 -05:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Paul A. Clarke
e905212627 [powerpc] SET_RESTORE_ROUND improvements
SET_RESTORE_ROUND uses libc_feholdsetround_ppc_ctx and
libc_feresetround_ppc_ctx to bracket a block of code where the floating point
rounding mode must be set to a certain value.

For the *prologue*, libc_feholdsetround_ppc_ctx is used and performs:
1. Read/save FPSCR.
2. Create new value for FPSCR with new rounding mode and enables cleared.
3. If new value is different than current value,
   a. If transitioning from a state where some exceptions enabled,
      enter "ignore exceptions / non-stop" mode.
   b. Write new value to FPSCR.
   c. Put a mark on the wall indicating the FPSCR was changed.

(1) uses the 'mffs' instruction.  On POWER9, the lighter weight 'mffsl'
instruction can be used, but it doesn't return all of the bits in the FPSCR.
fegetenv_status uses 'mffsl' on POWER9, 'mffs' otherwise, and can thus be
used instead of fegetenv_register.
(3b) uses 'mtfsf 0b11111111' to write the entire FPSCR, so it must
instead use 'mtfsf 0b00000011' to write just the enables and the mode,
because some of the rest of the bits are not valid if 'mffsl' was used.
fesetenv_mode uses 'mtfsf 0b00000011' on POWER9, 'mtfsf 0b11111111'
otherwise.

For the *epilogue*, libc_feresetround_ppc_ctx checks the mark on the wall, then
calls libc_feresetround_ppc, which just calls __libc_femergeenv_ppc with
parameters such that it performs:
1. Retreive saved value of FPSCR, saved in prologue above.
2. Read FPSCR.
3. Create new value of FPSCR where:
   - Summary bits and exception indicators = current OR saved.
   - Rounding mode and enables = saved.
   - Status bits = current.
4. If transitioning from some exceptions enabled to none,
   enter "ignore exceptions / non-stop" mode.
5. If transitioning from no exceptions enabled to some,
   enter "catch exceptions" mode.
6. Write new value to FPSCR.

The summary bits are hardwired to the exception indicators, so there is no
need to restore any saved summary bits.
The exception indicator bits, which are sticky and remain set unless
explicitly cleared, would only need to be restored if the code block
might explicitly clear any of them.  This is certainly not expected.

So, the only bits that need to be restored are the enables and the mode.
If it is the case that only those bits are to be restored, there is no need to
read the FPSCR.  Steps (2) and (3) are unnecessary, and step (6) only needs to
write the bits being restored.

We know we are transitioning out of "ignore exceptions" mode, so step (4) is
unnecessary, and in step (6), we only need to check the state we are
entering.
2019-08-28 13:51:10 -05:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Joseph Myers
ff6b24501f Split fenv_private.h out of math_private.h more consistently.
On some architectures, the parts of math_private.h relating to the
floating-point environment are in a separate file fenv_private.h
included from math_private.h.  As this is purely an
architecture-specific convention used by several architectures,
however, all such architectures still need their own math_private.h,
even if it has nothing to do beyond #include <fenv_private.h> and
peculiarity of including the i386 file directly instead of having a
shared file in sysdeps/x86.

This patch makes the fenv_private.h name an architecture-independent
convention in glibc.  The include of fenv_private.h from
math_private.h becomes architecture-independent (until callers are
updated to include fenv_private.h directly so the include from
math_private.h is no longer needed).  Some architecture math_private.h
headers are removed if no longer needed, or renamed to fenv_private.h
if all they define belongs in that header; architecture fenv_private.h
headers now do require #include_next <fenv_private.h>.  The i386
fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is
actually shared with x86_64.  The generic math_private.h gets a new
include of <stdbool.h>, as needed for bool in some prototypes in that
header (previously that was indirectly included via include/fenv.h,
which now only gets included too late in math_private.h, after those
prototypes).

Tested for x86_64 and x86, and tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.

	* sysdeps/aarch64/fpu/fenv_private.h: New file.  Based on ....
	* sysdeps/aarch64/fpu/math_private.h: ... this file.  All contents
	moved to fenv_private.h except for ...
	(TOINT_INTRINSICS): Kept in math_private.h.
	(roundtoint): Likewise.
	(converttoint): Likewise.
	* sysdeps/arm/fenv_private.h: Change multiple-include guard to
	[ARM_FENV_PRIVATE_H].  Include next <fenv_private.h>.
	* sysdeps/arm/math_private.h: Remove.
	* sysdeps/generic/fenv_private.h: New file.  Contents moved from
	....
	* sysdeps/generic/math_private.h: ... this file.  Include
	<stdbool.h>.  Do not include <fenv.h> or <get-rounding-mode.h>.
	Include <fenv_private.h>.  Remove functions and macros moved to
	fenv_private.h.
	* sysdeps/i386/fpu/math_private.h: Remove.
	* sysdeps/mips/math_private.h: Move to ....
	* sysdeps/mips/fpu/fenv_private.h: ... here.  Change
	multiple-include guard to [MIPS_FENV_PRIVATE_H].  Remove
	[__mips_hard_float] conditional.  Include next <fenv_private.h>.
	* sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include
	guard to [POWERPC_FENV_PRIVATE_H].  Include next <fenv_private.h>.
	* sysdeps/powerpc/fpu/math_private.h: Do not include
	<fenv_private.h>.
	* sysdeps/riscv/rvf/math_private.h: Move to ....
	* sysdeps/riscv/rvf/fenv_private.h: ... here.  Change
	multiple-include guard to [RISCV_FENV_PRIVATE_H].  Include next
	<fenv_private.h>.
	* sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard
	to [SPARC_FENV_PRIVATE_H].  Include next <fenv_private.h>.
	* sysdeps/sparc/fpu/math_private.h: Remove.
	* sysdeps/i386/fpu/fenv_private.h: Move to ....
	* sysdeps/x86/fpu/fenv_private.h: ... here.  Change
	multiple-include guard to [X86_FENV_PRIVATE_H].  Include next
	<fenv_private.h>.
	* sysdeps/x86_64/fpu/math_private.h: Do not include
	<sysdeps/i386/fpu/fenv_private.h>.
2018-08-28 20:48:49 +00:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Paul Clarke
346729f66b powerpc: fix check-before-set in SET_RESTORE_ROUND
A performance regression was introduced by commit
84d74e427a "powerpc: Cleanup fenv_private.h".

In the powerpc implementation of SET_RESTORE_ROUND, there is the
following code in the "SET" function (slightly simplified):
--
  old.fenv = fegetenv_register ();

  new.l = (old.l & _FPU_MASK_TRAPS_RN) | r; (1)

  if (new.l != old.l)                       (2)
    {
      if ((old.l & _FPU_ALL_TRAPS) != 0)
        (void) __fe_mask_env ();
      fesetenv_register (new.fenv);         (3)
--

Line (1) sets the value of "new" to the current value of FPSCR,
but masks off summary bits, exceptions, non-IEEE mode, and
rounding mode, then ORs in the new rounding mode.

Line (2) compares this new value to the current value in order to
avoid setting a new value in the FPSCR (line (3)) unless something
significant has changed (exception enables or rounding mode).

The summary bits are not germane to the comparison, but are cleared
in "new" and preserved in "old", resulting in false negative
comparisons, and unnecessarily setting the FPSCR in those cases
with associated negative performance impacts.

The solution is to treat the summaries identically for "new" and "old":
- save them in SET
- leave them alone otherwise
- restore the saved values in RESTORE

Also minor changes:
- expand _FPU_MASK_RN to 64bit hex, to match other MASKs
- treat bit 52 (left-to-right) as reserved (since it is)

	* sysdeps/powerpc/fpu/fenv_private.h (_FPU_MASK_TRAPS_RN):
	(_FPU_MASK_FRAC_INEX_RET_CC): Fix masks to more properly handle
	summary bits.
	(_FPU_MASK_RN): Expand _FPU_MASK_RN to 64bit hex.
	(_FPU_MASK_NOT_RN_NI): Treat bit 52 (left-to-right) as reserved.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
2017-10-18 12:08:28 -02:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Paul Murphy
84d74e427a powerpc: Cleanup fenv_private.h
Some of the masks are wrong, and the naming is confusing.

There are two basic cases we really care about:

1. Stacking a new rounding mode when running certain
   sections of code, and pausing exception handling.

2. Likewise, but discarding any exceptions which occur
   while running under the new rounding mode.

libc_feholdexcept_setround_ppc_ctx has been removed as it basically
does the same thing as libc_feholdsetround_ppc_ctx but also clearing
any sticky bits.  The restore behavior is what differentiates these
two cases as the SET_RESTORE_ROUND{,_NOEX} macros will either merge
or discard all exceptions occurring during scope of their usage.

Likewise, there are a number of routines to swap, replace,
or merge FP environments.  This change reduces much of
the common and sometimes wrong code.

Tested on ppc64le, with results before and after.
2016-10-21 16:40:03 -02:00
Joseph Myers
613c92b3b5 Fix ldbl-128ibm nearbyintl in non-default rounding modes (bug 19790).
The ldbl-128ibm implementation of nearbyintl uses logic that only
works in round-to-nearest mode.  This contrasts with rintl, which
works in all rounding modes.

Now, arguably nearbyintl could simply be aliased to rintl, given that
spurious "inexact" is generally allowed for ldbl-128ibm, even for the
underlying arithmetic operations.  But given that the only point of
nearbyintl is to avoid "inexact", this patch follows the more
conservative approach of adding conditionals to the rintl
implementation to make it suitable for use to implement nearbyintl,
then builds it for nearbyintl with USE_AS_NEARBYINTL defined.  The
test test-nearbyint-except-2 shows up issues when traps on "inexact"
are enabled, which turn out to be problems with the powerpc
fenv_private.h implementation (two functions that should disable
exception traps potentially failing to do so in some cases); this
patch duly fixes that as well (I don't see any other existing cases
where this would be user-visible; there isn't much use of *_NOEX,
*hold* etc. in libm that requires exceptions to be discarded and not
trapped on).

Tested for powerpc.

	[BZ #19790]
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c [USE_AS_NEARBYINTL]
	(rintl): Define as macro.
	[USE_AS_NEARBYINTL] (__rintl): Likewise.
	(__rintl) [USE_AS_NEARBYINTL]: Use SET_RESTORE_ROUND_NOEX instead
	of fesetround.  Ensure results are evaluated before end of scope.
	* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Define
	USE_AS_NEARBYINTL and include s_rintl.c.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc):
	Disable exception traps in new environment.
	(libc_feholdsetround_ppc_ctx): Likewise.
2016-03-09 00:30:59 +00:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Joseph Myers
01238691bb Fix libm fesetround namespace (bug 17748).
Continuing the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of fesetround by making it a weak alias of
__fesetround and making the affected code call __fesetround.  An
existing __fesetround function in fenv_libc.h for powerpc is renamed
to __fesetround_inline.

Tested for x86_64 (testsuite, and that disassembly of installed shared
libraries is unchanged by the patch).  Also tested for ARM
(soft-float) that fesetround failures disappear from the linknamespace
test results (feupdateenv remains to be addressed to complete fixing
bug 17748).

	[BZ #17748]
	* include/fenv.h (__fesetround): Declare.  Use libm_hidden_proto.
	* math/fesetround.c (fesetround): Rename to __fesetround and
	define as weak alias of __fesetround.  Use libm_hidden_weak.
	* sysdeps/aarch64/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/alpha/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/arm/fesetround.c (fesetround): Likewise.
	* sysdeps/hppa/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/i386/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/ia64/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/m68k/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/mips/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/powerpc/fpu/fenv_libc.h (__fesetround): Rename to
	__fesetround_inline.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_fesetround_ppc): Call
	__fesetround_inline instead of __fesetround.
	* sysdeps/powerpc/fpu/fesetround.c (fesetround): Rename to
	__fesetround and define as weak alias of __fesetround.  Use
	libm_hidden_weak.  Call __fesetround_inline instead of
	__fesetround.
	* sysdeps/powerpc/nofpu/fesetround.c (fesetround): Rename to
	__fesetround and define as weak alias of __fesetround.  Use
	libm_hidden_weak.
	* sysdeps/powerpc/powerpc32/e500/nofpu/fesetround.c (fesetround):
	Likewise.
	* sysdeps/s390/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/sh/sh4/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/sparc/fpu/fesetround.c (fesetround): Likewise.
	* sysdeps/tile/math_private.h (__fesetround): New inline function.
	* sysdeps/x86_64/fpu/fesetround.c (fesetround): Rename to
	__fesetround and define as weak alias of __fesetround.  Use
	libm_hidden_weak.
	* sysdeps/generic/math_private.h (default_libc_fesetround): Call
	__fesetround instead of fesetround.
	(default_libc_feholdexcept_setround): Likewise.
	(libc_feholdsetround_ctx): Likewise.
	(libc_feholdsetround_noex_ctx): Likewise.
2015-01-07 00:41:23 +00:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Adhemerval Zanella
2cd925f743 PowerPC: Add fenv macros for long double
This patch add the missing libc_<function>l_ctx macros for long
double.  Similar for float, they point to default double versions.
2014-04-17 14:01:51 -05:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Adhemerval Zanella
bd12ab55c0 PowerPC: Fix __fe_nomask_env missing symbol
This patch fix the missing symbol __fe_nomask_env from commit
41e8926aa4 for GLIBC_2.1.
2013-11-26 07:25:08 -06:00
Adhemerval Zanella
41e8926aa4 PowerPC: Set/restore rounding mode only when needed
This patch helps some math functions performance by adding the libc_fexxx
variant of inline functions to handle both FPU round and exception set/restore
and by using them on the libc_fexxx_ctx functions. It is based on already coded
fexxx family functions for PPC with fpu.

Here is the summary of performance improvements due this patch (measured on a
POWER7 machine):

Before:

cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy

After:

cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
2013-11-25 06:34:41 -06:00