powerpc: refactor logb{f,l}

The POWER7 logb implementation does not show a performance gain on
ISA 2.07+ chips with faster floating-point to GPR instructions
(currently POWER8 and POWER9).

This patch moves the POWER7 implementation to the generic powerpc
directory and enables it only for POWER7.  It also cleans up the code
to use inline floating-point constants instead of defining them as
static const.
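
As a minimal sketch of the constant cleanup (the helper names
unbias_named and unbias_inline are hypothetical and exist only for
this illustration), the two styles below compute the same value.  The
C99 hexadecimal literal 0x1p-52 is exactly 2**-52, which is the value
the old two1div52 constant held:

```c
#include <assert.h>

/* Old style: named file-scope constants, as removed by this patch.  */
static const double two1div52 = 2.220446049250313e-16;	/* 1/2**52 */
static const double two10m1 = -1023.0;			/* -2**10 + 1 */

/* Hypothetical helper: SHIFTED is the biased exponent still scaled by
   2**52, already converted to double.  */
double
unbias_named (double shifted)
{
  return (shifted * two1div52) + two10m1;
}

/* New style: the literals appear at the point of use.  */
double
unbias_inline (double shifted)
{
  /* Both spellings must denote the same double.  */
  assert (two1div52 == 0x1p-52);
  return (shifted * 0x1p-52) - 1023.0;
}
```

For example, an input of 1023 * 2**52 (the exponent field of 1.0)
yields 0.0 with either helper.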

The performance difference on POWER9 is:

  - Without patch:
  "logb": {
   "subnormal": {
    "duration": 4.99202e+09,
    "iterations": 8.83662e+08,
    "max": 75.194,
    "min": 5.501,
    "mean": 5.64925
   },
   "normal": {
    "duration": 4.97063e+09,
    "iterations": 9.97094e+08,
    "max": 46.489,
    "min": 4.956,
    "mean": 4.98512
   }
  }

  - With patch:
  "logb": {
   "subnormal": {
    "duration": 4.97226e+09,
    "iterations": 9.92036e+08,
    "max": 77.209,
    "min": 4.892,
    "mean": 5.01218
   },
   "normal": {
    "duration": 4.96192e+09,
    "iterations": 1.07545e+09,
    "max": 12.361,
    "min": 4.593,
    "mean": 4.61382
   }
  }
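
To read the two result sets side by side, the relative change in the
"mean" fields can be computed as below.  This is only an illustrative
calculation on the figures quoted above; the helper name
relative_gain_percent is hypothetical and not part of the benchmark
harness:

```c
/* Percentage by which AFTER is lower than BEFORE.  */
double
relative_gain_percent (double before, double after)
{
  return (before - after) / before * 100.0;
}
```

Calling relative_gain_percent (5.64925, 5.01218) for the subnormal
case gives roughly 11.3, and relative_gain_percent (4.98512, 4.61382)
for the normal case gives roughly 7.4.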

The ifunc implementation is also enabled only for powerpc64.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/power7/fpu/s_logb.c: Move to ...
	* sysdeps/powerpc/fpu/s_logb.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ...
	* sysdeps/powerpc/fpu/s_logbf.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ...
	* sysdeps/powerpc/fpu/s_logbl.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_log* objects.
	(CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c,
	CFLAGS-s_logb-power7.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Adhemerval Zanella 2019-03-19 12:22:21 +00:00
parent 105f2ed368
commit 6ea21bfe43
21 changed files with 138 additions and 125 deletions

@ -1,5 +1,56 @@
2019-07-08 Adhemerval Zanella <adhemerval.zanella@linaro.org>
* sysdeps/powerpc/power7/fpu/s_logb.c: Move to ...
* sysdeps/powerpc/fpu/s_logb.c: ... here. Use inline FP constants.
* sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ...
* sysdeps/powerpc/fpu/s_logbf.c: ... here. Use inline FP constants.
* sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ...
* sysdeps/powerpc/fpu/s_logbl.c: ... here. Use inline FP constants.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c:
Adjust implementation path.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c:
Adjust implementation path.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c:
Adjust implementation path.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_log* objects.
(CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c,
CFLAGS-s_logb-power7.c): New fule.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file.
* sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file.
* sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Move to ...
* sysdeps/ieee754/dbl-64/s_logb.c: ... here. Add work around for
powerpc32 integer 0 converting to -0.

@ -1,4 +1,4 @@
/* logb(). PowerPC/POWER7 version.
/* Get exponent of a floating-point value. PowerPC version.
Copyright (C) 2012-2019 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@ -16,59 +16,49 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <math.h>
#include <math_private.h>
#include <math_ldbl_opt.h>
#include <libm-alias-double.h>
/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
generic implementation faster. */
#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR7)
# include <sysdeps/ieee754/dbl-64/s_logb.c>
#else
# include <math.h>
# include <math_private.h>
# include <math_ldbl_opt.h>
# include <libm-alias-double.h>
/* This implementation avoids FP to INT conversions by using VSX
bitwise instructions over FP values. */
static const double two1div52 = 2.220446049250313e-16; /* 1/2**52 */
static const double two10m1 = -1023.0; /* 2**10 -1 */
/* FP mask to extract the exponent. */
static const union {
unsigned long long mask;
double d;
} mask = { 0x7ff0000000000000ULL };
double
__logb (double x)
{
double ret;
if (__builtin_expect (x == 0.0, 0))
if (__glibc_unlikely (x == 0.0))
/* Raise FE_DIVBYZERO and return -HUGE_VAL[LF]. */
return -1.0 / __builtin_fabs (x);
return -1.0 / fabs (x);
/* ret = x & 0x7ff0000000000000; */
asm (
"xxland %x0,%x1,%x2\n"
"fcfid %0,%0"
: "=f" (ret)
: "f" (x), "f" (mask.d));
/* ret = (ret >> 52) - 1023.0; */
ret = (ret * two1div52) + two10m1;
if (__builtin_expect (ret > -two10m1, 0))
/* Mask to extract the exponent. */
asm ("xxland %x0,%x1,%x2\n"
"fcfid %0,%0"
: "=d" (ret)
: "d" (x), "d" (0x7ff0000000000000ULL));
ret = (ret * 0x1p-52) - 1023.0;
if (ret > 1023.0)
/* Multiplication is used to set logb (+-INF) = INF. */
return (x * x);
else if (__builtin_expect (ret == two10m1, 0))
else if (ret == -1023.0)
{
/* POSIX specifies that denormal numbers are treated as
though they were normalized. */
int32_t lx, ix;
int ma;
EXTRACT_WORDS (ix, lx, x);
ix &= 0x7fffffff;
if (ix == 0)
ma = __builtin_clz (lx) + 32;
else
ma = __builtin_clz (ix);
return (double) (-1023 - (ma - 12));
int64_t ix;
EXTRACT_WORDS64 (ix, x);
ix &= UINT64_C (0x7fffffffffffffff);
return (double) (-1023 - (__builtin_clzll (ix) - 12));
}
/* Test to avoid logb_downward (0.0) == -0.0. */
return ret == -0.0 ? 0.0 : ret;
}
# ifndef __logb
libm_alias_double (__logb, logb)
# endif
#endif
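
The following is a portable sketch of the same steps, assuming IEEE
754 binary64 doubles and the GCC/Clang builtin __builtin_clzll.  It is
not the code in this patch: the name logb_sketch is hypothetical, and
the xxland/fcfid pair is replaced by an integer mask followed by an
integer-to-double conversion, which is exactly the GPR round trip the
POWER7 version avoids:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; not a glibc symbol.  */
double
logb_sketch (double x)
{
  uint64_t ix;
  double ret;

  if (x == 0.0)
    /* Raise FE_DIVBYZERO and return -HUGE_VAL.  */
    return -1.0 / fabs (x);

  memcpy (&ix, &x, sizeof ix);
  /* Keep the 11 exponent bits in place (the xxland step), then convert
     the integer pattern to double (the fcfid step).  */
  ret = (double) (int64_t) (ix & UINT64_C (0x7ff0000000000000));
  /* Floating-point equivalent of (ret >> 52) - 1023.  */
  ret = (ret * 0x1p-52) - 1023.0;

  if (ret > 1023.0)
    /* Inf or NaN: x * x gives +Inf or propagates the NaN.  */
    return x * x;
  else if (ret == -1023.0)
    {
      /* Subnormal: treat it as though it were normalized by counting
	 the leading zeros of the significand.  */
      ix &= UINT64_C (0x7fffffffffffffff);
      return (double) (-1023 - (__builtin_clzll (ix) - 12));
    }
  /* Avoid returning -0.0 when rounding downward.  */
  return ret == -0.0 ? 0.0 : ret;
}
```

With this sketch, logb_sketch (8.0) is 3.0 and the smallest subnormal,
0x1p-1074, maps to -1074.0.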

@ -1,4 +1,4 @@
/* logbf(). PowerPC/POWER7 version.
/* Get exponent of a floating-point value. PowerPC version.
Copyright (C) 2012-2019 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@ -16,40 +16,33 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <libm-alias-float.h>
/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
generic implementation faster. */
#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR7)
# include <sysdeps/ieee754/flt-32/s_logbf.c>
#else
# include <math.h>
# include <libm-alias-float.h>
/* This implementation avoids FP to INT conversions by using VSX
bitwise instructions over FP values. */
static const double two1div52 = 2.220446049250313e-16; /* 1/2**52 */
static const double two10m1 = -1023.0; /* -2**10 + 1 */
static const double two7m1 = -127.0; /* -2**7 + 1 */
/* FP mask to extract the exponent. */
static const union {
unsigned long long mask;
double d;
} mask = { 0x7ff0000000000000ULL };
float
__logbf (float x)
{
/* VSX operation are all done internally as double. */
double ret;
if (__builtin_expect (x == 0.0, 0))
if (__glibc_unlikely (x == 0.0))
/* Raise FE_DIVBYZERO and return -HUGE_VAL[LF]. */
return -1.0 / __builtin_fabsf (x);
return -1.0 / fabs (x);
/* ret = x & 0x7f800000; */
asm (
"xxland %x0,%x1,%x2\n"
"fcfid %0,%0"
: "=f"(ret)
: "f" (x), "f" (mask.d));
/* mask to extract the exponent. */
asm ("xxland %x0,%x1,%x2\n"
"fcfid %0,%0"
: "=d"(ret)
: "d" (x), "d" (0x7ff0000000000000ULL));
/* ret = (ret >> 52) - 1023.0, since ret is double. */
ret = (ret * two1div52) + two10m1;
if (__builtin_expect (ret > -two7m1, 0))
ret = (ret * 0x1p-52) - 1023.0;
if (ret > 127.0)
/* Multiplication is used to set logb (+-INF) = INF. */
return (x * x);
/* Since operations are done with double we don't need
@ -57,4 +50,7 @@ __logbf (float x)
The test is to avoid logb_downward (0.0) == -0.0. */
return ret == -0.0 ? 0.0 : ret;
}
# ifndef __logbf
libm_alias_float (__logb, logb)
# endif
#endif
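
The float variant has no subnormal branch.  A plausible reading,
sketched below under the assumption of IEEE 754 binary32 and binary64
formats, is that every float subnormal is a normal number once it is
widened to double, so the double exponent field already holds the
right answer.  The name logbf_sketch is hypothetical, and the integer
mask stands in for the xxland/fcfid sequence:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; not a glibc symbol.  */
float
logbf_sketch (float x)
{
  double xd, ret;
  uint64_t ix;

  if (x == 0.0f)
    /* Raise FE_DIVBYZERO and return -HUGE_VALF.  */
    return -1.0f / fabsf (x);

  /* Widening is exact; a float subnormal becomes a normal double.  */
  xd = x;
  memcpy (&ix, &xd, sizeof ix);
  ret = (double) (int64_t) (ix & UINT64_C (0x7ff0000000000000));
  ret = (ret * 0x1p-52) - 1023.0;

  if (ret > 127.0)
    /* Inf or NaN: x * x gives +Inf or propagates the NaN.  */
    return x * x;
  /* Avoid returning -0.0 when rounding downward.  */
  return ret == -0.0 ? 0.0 : ret;
}
```

For the smallest float subnormal, 0x1p-149f, the sketch returns
-149.0f without any leading-zero count.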

@ -1,4 +1,4 @@
/* logbl(). PowerPC/POWER7 version.
/* Get exponent of a floating-point value. PowerPC version.
Copyright (C) 2012-2019 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@ -16,22 +16,17 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <math.h>
#include <math_private.h>
#include <math_ldbl_opt.h>
/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
generic implementation faster. */
#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR7)
# include <./sysdeps/ieee754/ldbl-128ibm/s_logbl.c>
#else
# include <math.h>
# include <math_private.h>
# include <math_ldbl_opt.h>
/* This implementation avoids FP to INT conversions by using VSX
bitwise instructions over FP values. */
static const double two1div52 = 2.220446049250313e-16; /* 1/2**52 */
static const double two10m1 = -1023.0; /* 2**10 -1 */
/* FP mask to extract the exponent. */
static const union {
unsigned long long mask;
double d;
} mask = { 0x7ff0000000000000ULL };
long double
__logbl (long double x)
{
@ -39,24 +34,23 @@ __logbl (long double x)
double ret;
int64_t hx;
if (__builtin_expect (x == 0.0L, 0))
if (__glibc_unlikely (x == 0.0))
/* Raise FE_DIVBYZERO and return -HUGE_VAL[LF]. */
return -1.0L / __builtin_fabsl (x);
ldbl_unpack (x, &xh, &xl);
EXTRACT_WORDS64 (hx, xh);
/* ret = x & 0x7ff0000000000000; */
asm (
"xxland %x0,%x1,%x2\n"
"fcfid %0,%0"
: "=f" (ret)
: "f" (xh), "f" (mask.d));
/* ret = (ret >> 52) - 1023.0; */
ret = (ret * two1div52) + two10m1;
if (__builtin_expect (ret > -two10m1, 0))
/* Mask to extract the exponent. */
asm ("xxland %x0,%x1,%x2\n"
"fcfid %0,%0"
: "=d" (ret)
: "d" (xh), "d" (0x7ff0000000000000ULL));
ret = (ret * 0x1p-52) - 1023.0;
if (ret > 1023.0)
/* Multiplication is used to set logb (+-INF) = INF. */
return (xh * xh);
else if (__builtin_expect (ret == two10m1, 0))
else if (ret == -1023.0)
{
/* POSIX specifies that denormal number is treated as
though it were normalized. */
@ -78,6 +72,7 @@ __logbl (long double x)
/* Test to avoid logb_downward (0.0) == -0.0. */
return ret == -0.0 ? 0.0 : ret;
}
#ifndef __logbl
# ifndef __logbl
long_double_symbol (libm, __logbl, logbl);
# endif
#endif

@ -16,16 +16,5 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <math.h>
#include <math_ldbl_opt.h>
#undef weak_alias
#define weak_alias(a, b)
#undef strong_alias
#define strong_alias(a, b)
#undef compat_symbol
#define compat_symbol(lib, name, alias, ver)
#define __logb __logb_power7
#include <sysdeps/powerpc/power7/fpu/s_logb.c>
#include <sysdeps/powerpc/fpu/s_logb.c>

@ -16,11 +16,5 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <math.h>
#undef weak_alias
#define weak_alias(a, b)
#define __logbf __logbf_power7
#include <sysdeps/powerpc/power7/fpu/s_logbf.c>
#include <sysdeps/powerpc/fpu/s_logbf.c>

@ -17,5 +17,4 @@
<http://www.gnu.org/licenses/>. */
#define __logbl __logbl_power7
#include <sysdeps/powerpc/power7/fpu/s_logbl.c>
#include <sysdeps/powerpc/fpu/s_logbl.c>

@ -32,6 +32,12 @@ libm-sysdep_routines += s_ceil-power5+ \
s_llround-power5+ \
s_llround-ppc64 \
s_llroundf-ppc64 \
s_logb-power7 \
s_logbf-power7 \
s_logbl-power7 \
s_logb-ppc64 \
s_logbf-ppc64 \
s_logbl-ppc64 \
$(sysdep_calls:s_%=m_%)
CFLAGS-s_ceil-power5+.c = -mcpu=power5+
@ -50,6 +56,10 @@ CFLAGS-s_llround-power5+.c += -mcpu=power5+
CFLAGS-s_modf-power5+.c += -mcpu=power5+
CFLAGS-s_modff-power5+.c += -mcpu=power5+
CFLAGS-s_logbf-power7.c = -mcpu=power7
CFLAGS-s_logbl-power7.c = -mcpu=power7
CFLAGS-s_logb-power7.c = -mcpu=power7
# These files quiet sNaNs in a way that is optimized away without
# -fsignaling-nans.
CFLAGS-s_modf-ppc64.c += -fsignaling-nans

@ -16,4 +16,5 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c>
#define __logb __logb_power7
#include <sysdeps/powerpc/fpu/s_logb.c>

@ -16,4 +16,5 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c>
#define __logbf __logbf_power7
#include <sysdeps/powerpc/fpu/s_logbf.c>

@ -16,4 +16,5 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#include <sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c>
#define __logbl __logbl_power7
#include <sysdeps/powerpc/fpu/s_logbl.c>

@ -1,11 +0,0 @@
ifeq ($(subdir),math)
sysdep_routines += $(sysdep_calls)
libm-sysdep_routines += s_logb-power7 s_logbf-power7 \
s_logbl-power7 s_logb-ppc64 s_logbf-ppc64 \
s_logbl-ppc64 \
$(sysdep_calls:s_%=m_%)
CFLAGS-s_logbf-power7.c = -mcpu=power7
CFLAGS-s_logbl-power7.c = -mcpu=power7
CFLAGS-s_logb-power7.c = -mcpu=power7
endif

@ -1 +0,0 @@
#include <sysdeps/powerpc/power7/fpu/s_logb.c>

@ -1 +0,0 @@
#include <sysdeps/powerpc/power7/fpu/s_logbf.c>

@ -1 +0,0 @@
#include <sysdeps/powerpc/power7/fpu/s_logbl.c>