glibc/sysdeps/ieee754/dbl-64/s_fmaf.c

/* Compute x * y + z as ternary operation.
   Copyright (C) 2010-2018 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Contributed by Jakub Jelinek <jakub@redhat.com>, 2010.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <http://www.gnu.org/licenses/>.  */

#include <math.h>
#include <fenv.h>
#include <ieee754.h>
#include <math-barriers.h>
#include <math_private.h>
#include <libm-alias-float.h>

/* This implementation relies on double being more than twice as
   precise as float and uses rounding to odd in order to avoid problems
   with double rounding.
   See a paper by Boldo and Melquiond:
   http://www.lri.fr/~melquion/doc/08-tc.pdf  */

float
__fmaf (float x, float y, float z)
{
  fenv_t env;

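  /* Multiplication is always exact: the product of two 24-bit float
     significands needs at most 48 bits, which fits in the 53-bit
     significand of double.  */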
  double temp = (double) x * (double) y;

  /* Ensure correct sign of an exact zero result by performing the
     addition in the original rounding mode in that case.  */
  if (temp == -z)
    return (float) temp + z;

  union ieee754_double u;

  libc_feholdexcept_setround (&env, FE_TOWARDZERO);

  /* Perform addition with round to odd.  */
  u.d = temp + (double) z;
  /* Ensure the addition is not scheduled after the
     libc_feupdateenv_test call below.  */
  math_force_eval (u.d);

  /* Reset rounding mode and test for inexact simultaneously.  */
  int j = libc_feupdateenv_test (&env, FE_INEXACT) != 0;

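  /* Round to odd: if the truncated sum was inexact and its lowest
     mantissa bit is 0, set that bit.  Infinities and NaNs (exponent
     0x7ff) are left untouched.  */
  if ((u.ieee.mantissa1 & 1) == 0 && u.ieee.exponent != 0x7ff)
    u.ieee.mantissa1 |= j;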

  /* Finally, narrow to float in the caller's rounding mode, which was
     restored above (round to nearest by default).  */
  return (float) u.d;
}
#ifndef __fmaf
libm_alias_float (__fma, fma)
#endif

Commits referenced by the blame annotations for this file, oldest first (commit date, subject):

2010-10-11  Correct implementation of fmaf.
2010-10-14  Implement accurate fma.
2011-10-18  Use new internal libc_fe* interfaces in more functions.
2012-02-09  Replace FSF snail mail address with URLs.
2012-03-10  Create and use libc_feupdateenv_test.  We can reduce the number of STMXCSR, and often we can avoid the call to __feraiseexcept.
2012-06-01  Ensure additions are not scheduled after fetestexcept in fmaf and fmal.
2012-09-29  Fix sign of exact zero return from fma (bug 14638).
2017-10-03  Use libm_alias_float for dbl-64 fmaf.
2018-01-01  Update copyright dates with scripts/update-copyrights.
2018-05-11  Do not include math-barriers.h in math_private.h.