glibc/sysdeps/x86/fpu/s_ffma.c
Joseph Myers b26901b26e Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor case
It turns out the __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c
does not cover all cases where the x86 fenv_private.h macros might
manipulate only one of the SSE and 387 floating-point states, while the
actual fma implementation uses the other.  Specifically, in the 32-bit
case, with a compiler not defaulting to -mfpmath=sse, but testing on a
processor with hardware FMA support, the multiarch fma function
implementations will end up using SSE, while the fenv_private.h macros
will use the 387 state for double.  Change the conditional to use the
default macros rather than the optimized ones in all cases except when
the compiler inlines an fma instruction.  In that case, all those
instructions are SSE instructions and -mfpmath=sse must be in effect
for them to be inlined, so the optimized macros only use the SSE state
and that is sufficient.

Tested for x86_64 and x86.  H.J. reports in
<https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html>
that it fixes the problems he observed.
2021-09-24 17:59:22 +00:00
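
The conditional described in the commit message keys off the compiler-predefined macro __FP_FAST_FMA. The following is a minimal sketch of that decision, not glibc code: the helper names compiler_inlines_fma and fenv_macro_choice are hypothetical and only report which set of fenv macros the conditional would select for the current compiler flags.

```c
#include <stdbool.h>

/* Hypothetical helper: true when the compiler predefines __FP_FAST_FMA,
   i.e. when it expands fma () inline as a hardware instruction.  */
static bool
compiler_inlines_fma (void)
{
#ifdef __FP_FAST_FMA
  return true;
#else
  return false;
#endif
}

/* Hypothetical helper: names the macro set that the conditional in
   s_ffma.c would pick.  Without __FP_FAST_FMA the default_* macros are
   used, so that both SSE and 387 state are handled.  */
static const char *
fenv_macro_choice (void)
{
  return compiler_inlines_fma ()
         ? "optimized (SSE state only)"
         : "default (SSE and 387 state)";
}
```

Building the same translation unit with and without an option such as -mfma is one way to see the selection change.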


/* Fused multiply-add of double value, narrowing the result to float.
   x86 version.
   Copyright (C) 2021 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <https://www.gnu.org/licenses/>.  */

#define f32fmaf64 __hide_f32fmaf64
#define f32fmaf32x __hide_f32fmaf32x
#define ffmal __hide_ffmal
#include <math.h>
#undef f32fmaf64
#undef f32fmaf32x
#undef ffmal
#include <math-narrow.h>
#ifndef __FP_FAST_FMA
/* Depending on the details of the glibc configuration, fma might use
   either SSE or 387 arithmetic; ensure that both parts of the
   floating-point state are handled in the round-to-odd code.  If
   __FP_FAST_FMA is defined, that implies that the compiler is using
   SSE floating point and that the fma call will be inlined, so the
   x86 macros will work with only the SSE state and that is
   sufficient.  */
# undef libc_feholdexcept_setround
# define libc_feholdexcept_setround default_libc_feholdexcept_setround
# undef libc_feupdateenv_test
# define libc_feupdateenv_test default_libc_feupdateenv_test
#endif
float
__ffma (double x, double y, double z)
{
  NARROW_FMA_ROUND_TO_ODD (x, y, z, float, union ieee754_double, , mantissa1,
                           false);
}
libm_alias_float_double (fma)