Commit Graph

582 Commits

Author SHA1 Message Date
Raoni Fassina Firmino
066020c5e8 powerpc: Cleanup: use actual power8 assembly mnemonics
Some implementations in sysdeps/powerpc/powerpc64/power8/*.S still had
pre power8 compatible binutils hardcoded macros and were not using
.machine power8.

This patch should not have semantic changes, in fact it should have the
same exact code generated.

Tested that generated stripped shared objects are identical when
using "strip --remove-section=.note.gnu.build-id".

Checked on:
- powerpc64le, power9, build-many-glibcs.py, gcc 6.4.1 20180104, binutils 2.26.2.20160726
- powerpc64le, power8, debian 9, gcc 6.3.0 20170516, binutils 2.28
- powerpc64le, power9, ubuntu 19.04, gcc 8.3.0, binutils 2.32
- powerpc64le, power9, opensuse tumbleweed, gcc 9.1.1 20190527, binutils 2.32
- powerpc64, power9, debian 10, gcc 8.3.0, binutils 2.31.1

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-08-01 15:57:50 -03:00
Adhemerval Zanella
6ea21bfe43 powerpc: refactor logb{f,l}
The power7 logb implementation does not show a performance gain on
ISA 2.07+ chips with faster floating-point to GRP instructions
(currently POWER8 and POWER9).

This patch moves the POWER7 implementation to generic one and enables
it for POWER7.  It also add some cleanup to use inline floating-point
number instead of define them using static const.

The performance difference is for POWER9:

  - Without patch:
  "logb": {
   "subnormal": {
    "duration": 4.99202e+09,
    "iterations": 8.83662e+08,
    "max": 75.194,
    "min": 5.501,
    "mean": 5.64925
   },
   "normal": {
    "duration": 4.97063e+09,
    "iterations": 9.97094e+08,
    "max": 46.489,
    "min": 4.956,
    "mean": 4.98512
   }
  }

  - With patch:
  "logb": {
   "subnormal": {
    "duration": 4.97226e+09,
    "iterations": 9.92036e+08,
    "max": 77.209,
    "min": 4.892,
    "mean": 5.01218
   },
   "normal": {
    "duration": 4.96192e+09,
    "iterations": 1.07545e+09,
    "max": 12.361,
    "min": 4.593,
    "mean": 4.61382
   }
  }

The ifunc implementation is also enabled only for powerpc64.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/power7/fpu/s_logb.c: Move to ...
	* sysdeps/powerpc/fpu/s_logb.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ...
	* sysdeps/powerpc/fpu/s_logbf.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ...
	* sysdeps/powerpc/fpu/s_logbl.c: ... here.  Use inline FP constants.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c:
	Adjust implementation path.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_log* objects.
	(CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c,
	CFLAGS-s_logb-power7.c): New fule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:22:22 -03:00
Adhemerval Zanella
931c616eed powerpc: Refactor modf{f}
The modf{f} optimization is not an optimization for ISA 2.07+.  This
patch move the IFUNC for powerpc64 only, move the power5+ to generic
location, and include the generic implementation for ISA 2.07+.

The performance changes are based on modf benchtests:

  * POWER9 - ppc64
  "modf": {
   "": {
    "duration": 4.97057e+09,
    "iterations": 1.00688e+09,
    "max": 28.76,
    "min": 4.912,
    "mean": 4.9366
   }
  }
  * POWER9 - power5+
  "modf": {
   "": {
    "duration": 4.98291e+09,
    "iterations": 9.32818e+08,
    "max": 15.058,
    "min": 5.107,
    "mean": 5.34178
   }
  }

  * POWER8 - ppc64
   "modf": {
   "": {
    "duration": 5.05329e+09,
    "iterations": 8.38814e+08,
    "max": 518.051,
    "min": 5.79,
    "mean": 6.02433
   }
  }
  * POWER8 - power5+
  "modf": {
   "": {
    "duration": 5.05573e+09,
    "iterations": 8.35254e+08,
    "max": 63.141,
    "min": 5.873,
    "mean": 6.05293
   }
  }

  * POWER7 - ppc64
  "modf": {
   "": {
    "duration": 4.89818e+09,
    "iterations": 1.08408e+09,
    "max": 57.556,
    "min": 3.953,
    "mean": 4.51827
   }
  }
  * POWER7 - power5+
  "modf": {
   "": {
    "duration": 4.83789e+09,
    "iterations": 1.33409e+09,
    "max": 46.608,
    "min": 2.224,
    "mean": 3.62636
   }
  }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ...
	* sysdeps/powerpc/fpu/s_modf.c: ... here.  Add ISA 2.07 optimization.
	* sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ...
	* sysdeps/powerpc/fpu/s_modff.c: ... here.  Add ISA 2.07 optimization.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c:
	Adjust include.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls,
	sysdep_routines): Add s_modf* objects.
	(CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c,
	CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Movo
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: Move
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move
	to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c:
	... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:22:22 -03:00
Adhemerval Zanella
69461d9896 powerpc: hypot refactor and optimization
The powerpc hypot is slight optimized by:

  - Commit 8df4e219e4, both isnan and isinf are always inlined and thus
    the check TEST_INF_NAN does not make sense anymore.  The generic
    check for POWER7 should be faster on all powerpc configuration.

  - The redundant check 'y > two60factor && (x / y) > two60' is removed.

Both changes leads to unrequired ifunc especialization for power7 and
thus they are removed.  Finally The code is also cleanup a bit by inlining
the constants floating points.

The performance changes using the hypot benchtests are:

  - POWER9 without patch:
    "hypot": {
     "overflow": {
      "duration": 4.98585e+09,
      "iterations": 4.84932e+08,
      "max": 46.551,
      "min": 10.229,
      "mean": 10.2815
     },
     "higher_two500": {
      "duration": 5.00192e+09,
      "iterations": 4.24843e+08,
      "max": 33.319,
      "min": 11.606,
      "mean": 11.7736
     },
     "subnormal": {
      "duration": 5.0075e+09,
      "iterations": 4.06792e+08,
      "max": 22.178,
      "min": 12.15,
      "mean": 12.3097
     },
     "less_two500": {
      "duration": 5.00685e+09,
      "iterations": 4.08772e+08,
      "max": 22.784,
      "min": 12.052,
      "mean": 12.2485
     },
     "default": {
      "duration": 5.06002e+09,
      "iterations": 4.09894e+08,
      "max": 20.648,
      "min": 11.874,
      "mean": 12.3447
     }
    }

  - POWER9 with patch:
    "hypot": {
     "overflow": {
      "duration": 4.91848e+09,
      "iterations": 7.28039e+08,
      "max": 47.958,
      "min": 6.436,
      "mean": 6.75579
     },
     "higher_two500": {
      "duration": 4.9359e+09,
      "iterations": 6.63376e+08,
      "max": 20.783,
      "min": 7.321,
      "mean": 7.44057
     },
     "subnormal": {
      "duration": 4.9479e+09,
      "iterations": 6.19772e+08,
      "max": 18.856,
      "min": 7.817,
      "mean": 7.98341
     },
     "less_two500": {
      "duration": 4.94275e+09,
      "iterations": 6.3889e+08,
      "max": 17.452,
      "min": 7.597,
      "mean": 7.73647
     },
     "default": {
      "duration": 5.03645e+09,
      "iterations": 5.70718e+08,
      "max": 18.904,
      "min": 8.55,
      "mean": 8.82476
     }
    }

  - POWER7 without patch
    "hypot": {
     "overflow": {
      "duration": 4.86637e+09,
      "iterations": 6.43196e+08,
      "max": 53.958,
      "min": 7.328,
      "mean": 7.56592
     },
     "higher_two500": {
      "duration": 4.99842e+09,
      "iterations": 3.11012e+08,
      "max": 78.227,
      "min": 15.696,
      "mean": 16.0715
     },
     "subnormal": {
      "duration": 4.99841e+09,
      "iterations": 3.08935e+08,
      "max": 51.392,
      "min": 15.983,
      "mean": 16.1795
     },
     "less_two500": {
      "duration": 5.00108e+09,
      "iterations": 2.99464e+08,
      "max": 73.247,
      "min": 16.416,
      "mean": 16.7001
     },
     "default": {
      "duration": 5.04645e+09,
      "iterations": 3.52608e+08,
      "max": 70.073,
      "min": 13.38,
      "mean": 14.3118
     }
    }

  - POWER7 with patch
    "hypot": {
     "overflow": {
      "duration": 4.80785e+09,
      "iterations": 8.00001e+08,
      "max": 66.262,
      "min": 5.888,
      "mean": 6.00981
     },
     "higher_two500": {
      "duration": 4.9859e+09,
      "iterations": 3.39449e+08,
      "max": 5148.44,
      "min": 14.539,
      "mean": 14.6882
     },
     "subnormal": {
      "duration": 4.9905e+09,
      "iterations": 3.28874e+08,
      "max": 64.905,
      "min": 14.971,
      "mean": 15.1745
     },
     "less_two500": {
      "duration": 4.99494e+09,
      "iterations": 3.19755e+08,
      "max": 103.696,
      "min": 14.972,
      "mean": 15.6211
     },
     "default": {
      "duration": 5.03951e+09,
      "iterations": 4.02502e+08,
      "max": 61.008,
      "min": 12.368,
      "mean": 12.5205
     }
    }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/e_hypot.c (two60, two500, two600, two1022,
	twoM500, twoM600, two60factor, pdnum): Remove.
	(TEST_INFO_NAN, GET_TW0_HIGH_WORD): Remove macro.
	(__ieee754_hypot): Replace static variables with inline definition,
	remove ununsed branches.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove e_hypot-* objects.
	(CFLAGS-e_hypot-power7.c, CFLAGS-e_hypotf-power7.c): Remove rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot-power7.c: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf-power7.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-07-08 17:21:15 -03:00
Adhemerval Zanella
aa32f5bf0c powerpc: Use generic e_expf
Generic implementation is faster on both power8 and power9:

POWER9:
- sysdeps/ieee754/flt-32/e_expf.c
  "expf": {
   "workload-spec2017.wrf": {
    "duration": 5.1236e+09,
    "iterations": 7.53344e+08,
    "reciprocal-throughput": 5.9436,
    "latency": 7.65869,
    "max-throughput": 1.68248e+08,
    "min-throughput": 1.30571e+08
   }
  }

- sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S
  "expf": {
   "workload-spec2017.wrf": {
    "duration": 5.14429e+09,
    "iterations": 5.29248e+08,
    "reciprocal-throughput": 8.05372,
    "latency": 11.3863,
    "max-throughput": 1.24166e+08,
    "min-throughput": 8.78249e+07
   }
  }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove e_expf-power8 and expf-ppc64.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-power8.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-06-26 14:33:58 -03:00
Adhemerval Zanella
dee07df1a4 powerpc: Refactor powerpc64 lround/lroundf/llround/llroundf
This patches consolidates all the powerpc {l}lround{f} implementations
on the generic sysdeps/powerpc/fpu/s_{l}lround{f}.c.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llround-power8, s_llround-power6x,
	s_llround-power5+, s_llround-ppc64, and s_llroundf-ppc64.
	(CFLAGS-s_llround-power8.c, CFLAGS-s_llround-power6x.c,
	CFLAGS-s_llround-power5+.c): New rule.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power5+.c:
	New file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power6x.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power8.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lround.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lround.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/Makefile
	[$(subdir) == math] (CFLAGS-s_llround.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_llround-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power5+.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power6x.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power8.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lround.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lroundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llroundf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:28:21 -03:00
Adhemerval Zanella
2166283fcc powerpc: Refactor powerpc32 lrint/lrintf/llrint/llrintf
This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/powerpc32/fpu/s_llrint{f}.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_lrintf.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Move to ...
	* sysdeps/powerpc/fpu/s_lrintf.c: ... here.
	* sysdeps/powerpc/powerpc32/fpu/Makefile
	[$(subdir) == math] (CFLAGS-s_lrint.c): New rule.
	* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Add power4
	optimization.
	* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.c: New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(CFLAGS-s_llrintf-power6.c, CFLAGS-s_llrintf-ppc32.c,
	CFLAGS-s_llrint-power6.c, CFLAGS-s_llrint-ppc32.c,
	CFLAGS-s_lrint-ppc32.c): New rule.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.c:
	New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.c:
	Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:27:02 -03:00
Adhemerval Zanella
78049de0a9 powerpc: refactor powerpc64 lrint/lrintf/llrint/llrintf
This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/fpu/s_llrint{f}.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llrint-power8, s_llrint-power6x, and
	s_llrint-ppc64.
	(CFLAGS-s_llrint-power8.c, CFLAGS-s_llrint-power6x.c): New rule.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power6x.c: New
	file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power8.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lrint.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrintf.c: ... here.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/Makefile: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_llrint-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrint.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:26:21 -03:00
Adhemerval Zanella
1192696069 powerpc: Remove optimized finite
The powerpc finite optimization do not show much gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc64-linux-gnu-power7 (--with-cpu=power7):

    - Generic sysdeps/ieee754 implementation:
       "isfinite": {
        "": {
         "duration": 5.0082e+09,
         "iterations": 2.45299e+09,
         "max": 43.824,
         "min": 2.008,
         "mean": 2.04167
        },
        "INF": {
         "duration": 4.66554e+09,
         "iterations": 2.28288e+09,
         "max": 35.73,
         "min": 2.008,
         "mean": 2.04371
        },
        "NAN": {
         "duration": 4.66274e+09,
         "iterations": 2.28716e+09,
         "max": 34.161,
         "min": 2.009,
         "mean": 2.03866
        }
       }

    - power7 optimized one:
       "isfinite": {
        "": {
         "duration": 4.99111e+09,
         "iterations": 2.65566e+09,
         "max": 25.015,
         "min": 1.716,
         "mean": 1.87942
        },
        "INF": {
         "duration": 4.6783e+09,
         "iterations": 2.0999e+09,
         "max": 35.264,
         "min": 1.868,
         "mean": 2.22787
        },
        "NAN": {
         "duration": 4.67915e+09,
         "iterations": 2.08678e+09,
         "max": 38.099,
         "min": 1.869,
         "mean": 2.24228
        }
       }

     So it basically optimizes marginally for normal numbers while
     increasing the latency for other kind of FP.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_finite*
	objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
	Remove s_finite* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
6427a6ac8c powerpc: Remove optimized isinf
The powerpc isinf optimizations onyl adds complexity:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input pattern and branch
    implementation for INF and denormal that does:

    return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000)

    Although it does show slight better latency than generic algorithm
    (as below), it is only for power7 and requires it to override it
    for power8.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf*
	objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
	Remove s_isinf* and s_isinf* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
2666f96390 powerpc: Remove optimized isnan
The powerpc isnan optimizations are not really a gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power5, power6, and power6x are just micro-optimization to
    improve the Load-Hit-Store hazards from floating-point to general
    register transfer, and current GCC already has support to minimize
    it by inserting either extra nops or group dispatch instructions.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc-linux-gnu-power4 (which uses the hp-timing support on
    benchtests):

    - Generic sysdeps/ieee754 implementation:
      "isnan": {
       "": {
        "duration": 4.98415e+09,
        "iterations": 2.34516e+09,
        "max": 45.925,
        "min": 2.052,
        "mean": 2.12529
       },
       "INF": {
        "duration": 4.74057e+09,
        "iterations": 1.69761e+09,
        "max": 91.01,
        "min": 2.052,
        "mean": 2.79249
       },
       "NAN": {
        "duration": 4.74071e+09,
        "iterations": 1.68768e+09,
        "max": 282.343,
        "min": 2.052,
        "mean": 2.809
       }
      }

    - power7 optimized one:
    $ ./testrun.sh benchtests/bench-isnan
      "isnan": {
       "": {
        "duration": 4.96842e+09,
        "iterations": 2.56297e+09,
        "max": 50.048,
        "min": 1.872,
        "mean": 1.93854
       },
       "INF": {
        "duration": 4.76648e+09,
        "iterations": 1.54213e+09,
        "max": 373.408,
        "min": 2.661,
        "mean": 3.09084
       },
       "NAN": {
        "duration": 4.76845e+09,
        "iterations": 1.54515e+09,
        "max": 51.016,
        "min": 2.736,
        "mean": 3.08607
       }
      }

    So it basically optimizes marginally for normal numbers while
    increasing the latency for other kind of FP.

  - The generic implementation requires getting the floating point
    status, disable the invalid operation bit, and restore the
    floating-point status.  Each operation is costly and requires
    flushing the FP pipeline.

    Using the same scenarion for the previous analysis:

      "isnan": {
       "": {
        "duration": 5.08284e+09,
        "iterations": 6.2898e+08,
        "max": 41.844,
        "min": 8.057,
        "mean": 8.08108
       },
       "INF": {
        "duration": 4.97904e+09,
        "iterations": 6.16176e+08,
        "max": 39.661,
        "min": 8.057,
        "mean": 8.08055
       },
       "NAN": {
        "duration": 4.98695e+09,
        "iterations": 5.95866e+08,
        "max": 29.728,
        "min": 8.345,
        "mean": 8.36925
       }
      }

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_isnan.c: Remove file.
	* sysdeps/powerpc/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and
	s_isnanf-* objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S:
	Remove file
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls):
	Remove s_isnan-* and s_isnanf-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:36 -03:00
Adhemerval Zanella
e41d66e41a powerpc: copysign cleanup
GCC always expand copysign{f} for all possible cpus, so calling the libm
is only done if user explicitly states to disable the builtin (which is
done usually not for performance reason).  So to provide ifunc variant
for copysign is just unrequired complexity, since libm will be called
on non-performance critical code.

This patch removes both powerpc32 and powerpc64 ifunc variants and
consolidates the powerpc implementation on
sysdeps/powerpc/fpu/s_copysign{f}.c using compiler builtins.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_copysign.c: New file.
	* sysdeps/powerpc/fpu/s_copysignf.c: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_copysignf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdep_routines, libm-sysdep_routines): Remove s_copysign-power6 and
	s_copysign-ppc32.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-power6.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_copysignf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdeps_calls):
	Remove s_copysign-power6 s_copysign-ppc64.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S:
	Remove file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_copysignf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_copysignf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 11:46:26 -03:00
Adhemerval Zanella
21bd039bb4 powerpc: consolidate rint
This patches consolidates all the powerpc rint{f} implementations on
the generic sysdeps/powerpc/fpu/s_rint{f}.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode,
	round_to_integer_float, round_mode): Add RINT handling.
	(reset_fenv_mode): New symbol.
	* sysdeps/powerpc/fpu/s_rint.c (__rint): Use generic implementation.
	* sysdeps/powerpc/fpu/s_rintf.c (__rintf): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_rint.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 11:46:22 -03:00
Gabriel F. T. Gomes
9250e6610f powerpc: Fix build failures with current GCC
Since GCC commit 271500 (svn), also known as the following commit on the
git mirror:

commit 61edec870f9fdfb5df3fa4e40f28cbaede28a5b1
Author: amodra <amodra@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed May 22 04:34:26 2019 +0000

    [RS6000] Don't pass -many to the assembler

glibc builds are failing when an assembly implementation does not
declare the correct '.machine' directive, or when no such directive is
declared at all.  For example, when a POWER6 instruction is used, but
'.machine power6' is not declared, the assembler will fail with an error
similar to the following:

    ../sysdeps/powerpc/powerpc64/power8/strcmp.S: Assembler messages:
     24 ../sysdeps/powerpc/powerpc64/power8/strcmp.S:55: Error: unrecognized opcode: `cmpb'

This patch adds '.machine powerN' directives where none existed, as well
as it updates '.machine power7' directives on POWER8 files, because the
minimum binutils version required to build glibc (binutils 2.25) now
provides this machine version.  It also adds '-many' to the assembler
command used to build tst-set_ppr.c.

Tested for powerpc, powerpc64, and powerpc64le, as well as with
build-many-glibcs.py for powerpc targets.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-05-30 11:10:48 -03:00
Adhemerval Zanella
e47308c98d powerpc: generic nearbyint/nearbyintf
This patches consolidates all the powerpc nearbyint{f} implementations
on the generic sysdeps/powerpc/fpu/s_nearbyint{f}.

	* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add
	NEARBYINT handling.
	* sysdeps/powerpc/fpu/s_nearbyint.c: New file.
	* sysdeps/powerpc/fpu/s_nearbyintf.c: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.
2019-05-28 18:16:48 -03:00
Adhemerval Zanella
ae45cf84af powerpc: trunc/truncf refactor
This patches consolidates all the powerpc trunc{f} implementations on
the generic sysdeps/powerpc/fpu/s_trunc{f}.  The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/trunc_to_integer.h (set_fenv_mode): Add
	 TRUNC handling.
	(round_mode): Add definition for TRUNC.
	* sysdeps/powerpc/fpu/s_trunc.c: New file.
	* sysdeps/powerpc/fpu/s_truncf.c: New file.
	* sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.c: New
	file.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.c:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.c:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.c:
	Likewise.
	* sysdep/powerpc/powerpc32/power5+/fpu/s_trunc.S: Remove file.
	* sysdep/powerpc/powerpc32/power5+/fpu/s_truncf.S: Likewise.
	* sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_trunc-power5+, s_trunc-ppc64,
	s_truncf-power5+, and s_truncf-ppc64.
	(CFLAGS-s_trunc-power5+.c, CFLAGS-s_truncf-power5+.c): New rule.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-power5+.c: New
	file.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_trunc.c: ... here.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-power5+.c: New
	file.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_truncf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_trunc-power5+, s_trunc-ppc64,
	s_truncf-power5+, and s_truncf-ppc64.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Remove
	file.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S:
	Likewise.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
	* sysdep/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise.
	* sysdep/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-05-09 09:39:28 -03:00
Adhemerval Zanella
a1cb1888b7 powerpc: round/roundf refactor
This patches consolidates all the powerpc round{f} implementations on
the generic sysdeps/powerpc/fpu/s_round{f}.  The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add
	ROUND handling.
	(round_mode): Add definition for ROUND.
	(round_to_integer_float): Likewise.
	* sysdeps/powerpc/fpu/s_round.c: New file.
	* sysdeps/powerpc/fpu/s_roundf.c: New file.
	* sysdeps/powerpc/powerpc32/fpu/s_round.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.S:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.c: New
	file.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.c:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.c:
	Likewise.
	* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.c:
	Likewise.
	* sysdep/powerpc/powerpc32/power5+/fpu/s_round.S: Remove file.
	* sysdep/powerpc/powerpc32/power5+/fpu/s_roundf.S: Likewise.
	* sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_round-power5+, s_round-ppc64,
	s_roundf-power5+, and s_roundf-ppc64.
	(CFLAGS-s_round-power5+.c, CFLAGS-s_roundf-power5+.c): New rule.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-power5+.c: New
	file.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_round.c: ... here.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-power5+.c: New
	file.
	* sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_roundf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_round-power5+, s_round-ppc64,
	s_roundf-power5+, and s_roundf-ppc64.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Remove
	file.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S:
	Likewise.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
	* sysdep/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise.
	* sysdep/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-05-09 09:39:07 -03:00
Adhemerval Zanella
252296c625 powerpc: floor/floorf refactor
This patches consolidates all the powerpc floor{f} implementations on
the generic sysdeps/powerpc/fpu/s_floor{f}.  The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode):
	Add FLOOR option.
	(round_mode): Add definition for FLOOR.
	* sysdeps/powerpc/fpu/s_floor.c: New file.
	* sysdeps/powerpc/fpu/s_floorf.c: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_floor.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.S:
	Likewise
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.c:
	New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_floor.S: Remove file.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Remove file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_floor-power5+, s_floor-ppc64,
	s_floorf-power5+, and s_floorf-ppc64.
	(CFLAGS-s_floor-power5+.c, CFLAGS-s_floorf-power5+.c): New rule.
	* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-power5+.c: New
	file.
	* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floor.c: ... here.
	* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-power5+.c: New
	file.
	* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floorf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_floor-power5+, s_floor-ppc64,
	s_floorf-power5+, and s_floorf-ppc64.
	* sysdep/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-05-09 09:38:40 -03:00
Adhemerval Zanella
6cac323c8d powerpc: ceil/ceilf refactor
This patches consolidates all the powerpc ceil{f} implementations on
the generic sysdeps/powerpc/fpu/s_ceil{f}.  The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the frip
instruction) or a generic implementation which uses FP only operations.

It adds a generic implementation (round_to_integer.h) which is shared
with other rounding to integer routines.  The resulting code should be
similar in term os performance to previous assembly one.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/fenv_libc.h (__fesetround_inline_nocheck): New
	function.
	* sysdeps/powerpc/fpu/round_to_integer.h: New file.
	* sysdeps/powerpc/fpu/s_ceil.c: Likewise.
	* sysdeps/powerpc/fpu/s_ceilf.c: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_ceil.S: Remove file.
	* sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(CFLAGS-s_ceil-power5+.c, CFLAGS-s_ceilf-power5+.c): New rule.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.c:
	New file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceil.S: Remove file.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile: New file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil.c: ... here.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-power5+.c: New
	file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf.c: ...
	* here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_ceil-power5+, s_ceil-ppc64,
	s_ceilf-power5+, and s_ceilf-ppc64.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-power5+.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-power5+.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-04-29 08:43:37 -03:00
Adhemerval Zanella
f82ed45d7f powerpc: Use generic wcsrchr optimization
This patch removes the power6 wcsrchr optimization and uses generic
implementation instead.  Currently, both power6 and power7 IFUNC variant
resulting binary are essentially the same and the generic implementation
with unrolling loop set to 8 also results in similar performance.

Checked on powerpc64-linux-gnu.

	* sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcsrchr.c):
	New rule.
	* sysdeps/powerpc/power6/wcsrchr.c: Remove file.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-power6.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-power7.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcsrchr-power6.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcsrchr-power7.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcsrchr-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcsrchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/power6/wcsrchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
	[$(subdir) == wcsmbs] (sysdeps_routines): Remove wcsrchr-power6 and
	wcsrchr-power7.
	(CFLAGS-wcsrchr-power7.c, CFLAGS-wcsrchr-power6.c): Remove rule.
	* sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
	Remove wcsrchr optimizations.
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
2019-04-04 16:01:14 +07:00
Adhemerval Zanella
421e3005ca powerpc: Use generic wcschr optimization
This patch removes the power6 wcschr optimization and uses generic
implementation instead.  Currently, both power6 and power7 IFUNC variant
resulting binary are essentially the same and the generic implementation
with unrolling loop set to 8 also results in similar performance.

Checked on powerpc64-linux-gnu.

	* sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcschr.c):
	New rule.
	* sysdeps/powerpc/power6/wcschr.c: Remove file.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-power6.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-power7.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr-power6.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr-power7.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise.
	* sysdeps/powerpc/powerpc64/power6/wcschr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
	[$(subdir) == wcsmbs] (sysdeps_routines): Remove wcschr-power6 and
	wcschr-power7.
	(CFLAGS-wcschr-power7.c, CFLAGS-wcschr-power6.c): Remove rule.
	* sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
	Remove wcschr optimizations.
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
2019-04-04 16:01:14 +07:00
Adhemerval Zanella
447a1306c3 powerpc: Use generic wcscpy optimization
This patch removes the power6 wcscpy optimization and uses generic
implementation instead.  Currently, both power6 and power7 IFUNC variant
resulting binary are essentially the same and the generic implementation
with unrolling loop set to 8 also results in similar performance.

Checked on powerpc64-linux-gnu.

	* sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcscpy.c):
	New rule.
	* sysdeps/powerpc/power6/wcscpy.c: Remove file.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-power6.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-power7.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy-power6.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy-power7.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/power6/wcscpy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
	[$(subdir) == wcsmbs] (sysdeps_routines): Remove wcscpy-power6 and
	wcscpy-power7.
	(CFLAGS-wcscpy-power7.c, CFLAGS-wcscpy-power6.c): Remove rule.
	* sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
	Remove wcscpy optimizations.
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
2019-04-04 16:01:14 +07:00
Paul A. Clarke
10cce66930 [powerpc] Use __builtin_{mffs,mtfsf}
Replace inline asm uses of the "mffs" and "mtfsf" instructions with
the analogous GCC builtins.

__builtin_mffs and __builtin_mtfsf are both available in GCC 5 and above.
Given the minimum GCC level for GLibC is now GCC 6.2, it is safe to use
these builtins without restriction.

2019-03-29  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_register): Replace inline
	asm with builtin.
	* sysdeps/powerpc/powerpc64/le/fpu/sfp-machine.h (FP_INIT_ROUNDMODE):
	Likewise.
	* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (_GET_DI_FPSCR): Likewise.
	(_GET_SI_FPSCR): Likewise.
	(_SET_SI_FPSCR): Likewise.
2019-03-29 19:16:34 -05:00
Adhemerval Zanella
1e372ded4f Refactor hp-timing rtld usage
This patch refactor how hp-timing is used on loader code for statistics
report.  The HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed and
HP_TIMING_INLINE is used instead to check for hp-timing avaliability.
For alpha, which only defines HP_SMALL_TIMING_AVAIL, the HP_TIMING_INLINE
is set iff for IS_IN(rtld).

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all afected ABIs.

	* benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with
	HP_TIMING_INLINE.
	* nptl/descr.h: Likewise.
	* elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF,
	RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define.
	(dl_start_final_info, _dl_start_final, dl_main, print_statistics):
	Abstract hp-timing usage with RTLD_* macros.
	* sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld).
	(HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove.
	* sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL,
	HP_TIMING_NONAVAIL): Likewise.
	* sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
	Likewise.
	* sysdeps/generic/hp-timing-common.h: Update comment with
	HP_TIMING_AVAIL removal.
2019-03-22 17:30:44 -03:00
Joseph Myers
c5f65462a2 Break lines before not after operators, batch 4.
This patch fixes further coding style issues where code should have
broken lines before operators in accordance with the GNU Coding
Standards but instead was breaking lines after them.

Tested for x86_64, and with build-many-glibcs.py.

	* stdio-common/vfscanf-internal.c (ARG): Break lines before rather
	than after operators.
	* sysdeps/mach/hurd/setitimer.c (timer_thread): Likewise.
	(setitimer_locked): Likewise.
	* sysdeps/mach/hurd/sigaction.c (__sigaction): Likewise.
	* sysdeps/mach/hurd/sigaltstack.c (__sigaltstack): Likewise.
	* sysdeps/mach/pagecopy.h (PAGE_COPY_FWD): Likewise.
	* sysdeps/mach/thread_state.h (machine_get_basic_state): Likewise.
	* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c
	(PPC_CPU_SUPPORTED): Likewise.
	* sysdeps/unix/sysv/linux/alpha/a.out.h (N_TXTOFF): Likewise.
	* sysdeps/unix/sysv/linux/generic/wordsize-32/overflow.h
	(stat_overflow): Likewise.
	(statfs_overflow): Likewise.
	* sysdeps/unix/sysv/linux/tst-personality.c (do_test): Likewise.
	* sysdeps/unix/sysv/linux/tst-ttyname.c (eq_ttyname): Likewise.
	(eq_ttyname_r): Likewise.
	(run_chroot_tests): Likewise.
2019-03-07 20:20:25 +00:00
Gabriel F. T. Gomes
590675c079 powerpc: Fix build of wcscpy with --disable-multi-arch
Since the commit

commit 81a1443941
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Feb 5 17:35:12 2019 -0200

    wcsmbs: optimize wcscat

powerpc64 and powerpc64le builds fail when configured with
--disable-multi-arch and --with-cpu=power6 (or newer), due to an
undefined reference to __GI___wcscpy.  This patch fixes this on
sysdeps/powerpc/powerpc64/power6/wcscpy.c, which is only used when
multi-arch is disabled.

This patch does nothing for the failures on 32-bits powerpc builds,
because the file is under the powerpc64 subdirectory, however, powerpc
builds were already failing with --disable-multi-arch, with multiple
error messages, even before the aforementioned commit.

Tested for powerpc, powerpc64, and powerpc64le with multi-arch enabled
(all pass) and disabled (powerpc still fails as explained above).
2019-03-05 11:33:19 -03:00
Joseph Myers
462e83a4a0 Add more spaces before '('.
This patch fixes more places where a space should have been present
before '(' in accordance with the GNU Coding Standards (as with the
previous patch, mainly for calls to sizeof).

Tested with build-many-glibcs.py.

	* sysdeps/powerpc/powerpc32/dl-machine.c
	(__elf_machine_fixup_plt): Use space before '('.
	(__process_machine_rela): Likewise.
	* sysdeps/powerpc/powerpc32/register-dump.h (register_dump):
	Likewise.
	* sysdeps/powerpc/powerpc64/le/fpu/sfp-machine.h (TI_BITS):
	Likewise.
	* sysdeps/powerpc/powerpc64/register-dump.h (register_dump):
	Likewise.
	* sysdeps/powerpc/test-arith.c (union_t): Likewise.
	(pattern): Likewise.
	(delta): Likewise.
	(check_result): Likewise.
	(check_excepts): Likewise.
	(check_op): Likewise.
	(fail_xr): Likewise.
	* sysdeps/unix/alpha/sysdep.h (syscall_promote): Likewise.
	* sysdeps/unix/sysv/linux/alpha/a.out.h (AOUTHSZ): Likewise.
	(SCNHSZ): Likewise.
	* sysdeps/unix/sysv/linux/hppa/makecontext.c (FRAME_SIZE_BYTES):
	Likewise.
	(ARGS): Likewise.
	(__makecontext): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (ucontext_t):
	Likewise.
2019-02-28 15:02:09 +00:00
Adhemerval Zanella
81a1443941 wcsmbs: optimize wcscat
This patch rewrites wcscat using wcslen and wcscpy.  This is similar to
the optimization done on strcat by 6e46de42fe.

The strcpy changes are mainly to add the internal alias to avoid PLT
calls.

Checked on x86_64-linux-gnu and a build against the affected
architectures.

	* include/wchar.h (__wcscpy): New prototype.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c
	(__wcscpy): Route internal symbol to generic implementation.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy):
	Add internal __wcscpy alias.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise.
	* sysdeps/s390/wcscpy.c (wcscpy): Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise.
	* wcsmbs/wcscpy.c (wcscpy): Add
	* sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to
	use generic implementation.
	* wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
2019-02-27 10:00:37 -03:00
Joseph Myers
e0cb7b6131 Add and move fall-through comments in system-specific code.
This patch fixes -Wimplicit-fallthrough warnings in system-specific
code that show up building glibc with -Wextra, by adding fall-through
comments, or moving existing such comments to the place required for
them to work (immediately before the case label being fallen through).

Tested with build-many-glibcs.py.

	* sysdeps/i386/dl-machine.h (elf_machine_rela): Add fall-through
	comments.
	* sysdeps/m68k/m680x0/fpu/s_cexp_template.c (s(__cexp)): Likewise.
	* sysdeps/m68k/memcopy.h (WORD_COPY_FWD): Likewise.
	(WORD_COPY_BWD): Likewise.
	* sysdeps/mach/hurd/ioctl.c (__ioctl): Likewise.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela):
	Likewise.
	* sysdeps/s390/iso-8859-1_cp037_z900.c (TR_LOOP): Likewise.
	* sysdeps/mips/dl-machine.h (elf_machine_reloc): Move fall-through
	comment.
	* sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise.
2019-02-26 02:09:18 +00:00
Gabriel F. T. Gomes
4a2dd41cb5 powerpc64le: Remove test for GCC 6.2
The configure fragment for powerpc64le contains a test for the presence
of several compiler builtins and of the __float128 type, which are
provided by GCC 6.2 for powerpc64le.  Since this configure test was
added, the compiler version required to build glibc for powerpc64le was
different than that required for the other architectures.

Now that glibc requires GCC 6.2 globally (since commit ID 4dcbbc3b28),
this patch removes the powerpc64le-specific test.

Even tough the configure test checks for compiler features rather than
compiler version, the intent of the test was to stop build attempts at
early stages, if they had been configured with a too old compiler.  It
was not the intention of the test to detect compiler breakage (such as
the removal of the required compiler features in future GCC versions),
and glibc is not the place to test for compiler regressions, anyway.

Tested for powerpc64le with GCC 6.2 (built with build-many-glibcs.py).

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-02-20 11:06:51 -03:00
Paul Clarke
008b598e2a powerpc: Fix tiny bug in strncmp.c
A single underscore was omitted in
sysdeps/powerpc/powerpc64/multiarch/strncmp.c, resulting in use of
power8 version of strncmp instead of power9 version, with significant
performance degradation.

	* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Fix #ifdef.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-01-16 12:18:58 -02:00
Rogerio Alves
56054664cc powerpc: fix tst-ucontext-ppc64-vscr test for POWER 5/6.
An error "impossible register constraint in 'asm'" was raised on POWER
5 and due to __vector __int128_t being used as operands without passing the
option -msvx to gcc.
This patch replaces "__vector __int128_t" with "__vector unsigned int"
which requires only -maltivec, available since POWER ISA 2.03, and which
is already passed to the compiler.

	* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c:
	(do_test): Changed __vector __int128_t to __vector unsigned int.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-01-15 16:26:28 -02:00
Rogerio Alves
0bc9bdf159 powerpc: Fix VSCR position in ucontext (bug 24088)
This patch fix VSCR position on ucontext. VSCR was read in the wrong
position on ucontext structure because it was ignoring the machine
endianess.

	[BZ #24088]
	* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (vscr_t): Added
	ifdef to fix read of VSCR.
	* sysdeps/powerpc/powerpc64/Makefile [$subdir == stdlib]: Add
	tst-ucontext-ppc64-vscr.c to test list.
	* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c: New test file.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-01-11 15:17:25 -02:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Tulio Magno Quites Machado Filho
1d880d4a9b powerpc: Add missing CFI register information (bug #23614)
Add CFI information about the offset of registers stored in the stack
frame.

	[BZ #23614]
	* sysdeps/powerpc/powerpc64/addmul_1.S (FUNC): Add CFI offset for
	registers saved in the stack frame.
	* sysdeps/powerpc/powerpc64/lshift.S (__mpn_lshift): Likewise.
	* sysdeps/powerpc/powerpc64/mul_1.S (__mpn_mul_1): Likewise.

Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2018-12-12 10:56:51 -02:00
Joseph Myers
81dca813cc Use copysign functions not __copysign functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __copysign functions to call
the corresponding copysign names instead, with asm redirection to
__copysign when the calls are not inlined (all cases are inlined
except for IBM long double for powerpc soft-float / e500v1).  This
eliminates the need for an inline function defining __copysign in
terms of __builtin_copysign.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT]
	(MATH_REDIRECT_BINARY_ARGS): New macro.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT.
	* sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/alpha/fpu/s_copysignf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_copysign.c: Likewise.
	* sysdeps/ieee754/float128/s_copysignf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_copysignf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise.
	* sysdeps/riscv/rvd/s_copysign.c: Likewise.
	* sysdeps/riscv/rvf/s_copysignf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c:
	Likewise.
	* sysdeps/generic/math_private_calls.h
	[!__MATH_DECLARING_LONG_DOUBLE || !NO_LONG_DOUBLE] (__copysign):
	Do not declare and define as an inline function.
	* math/divtc3.c (__divtc3): Use copysign functions instead of
	__copysign variants.
	* math/multc3.c (__multc3): Likewise.
	* sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise.
	* sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise.
	* sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
	Likewise.
	* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
	(__ieee754_yn): Likewise.
	* sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise.
	* sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise.
	* sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise.
	* sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise.
	* sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise.
	(__sin): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint):
	Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln):
	Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn):
	Likewise.
	* sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
	(__ieee754_ynf): Likewise.
	* sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise.
	* sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise.
	* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
	(__ieee754_ynl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
	(__ieee754_ynl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
	(__ieee754_ynl)
	* sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
2018-09-27 20:04:48 +00:00
Joseph Myers
9755bc4686 Use round functions not __round functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __round functions to call the
corresponding round names instead, with asm redirection to __round
when the calls are not inlined.

An additional complication arises in
sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the
result converted to int, gets converted by the compiler to call
lroundl in the case of 32-bit long, so resulting in localplt test
failures.  It's logically correct to let the compiler make such an
optimization; an appropriate asm redirection of lroundl to __lroundl
is thus added to that file (it's not needed anywhere else).

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect
	using MATH_REDIRECT.
	* sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_roundf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_round.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise.
	* sysdeps/ieee754/float128/s_roundf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_roundf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
	* sysdeps/riscv/rvf/s_roundf.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise.
	(round): Redirect to __round.
	(__roundl): Call round instead of __round.
	* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round):
	Remove macro.
	[_ARCH_PWR5X] (__roundf): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round
	functions instead of __round variants.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive):
	Likewise.
	* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to
	__lroundl.
	(__ieee754_expl): Call roundl instead of __roundl.
2018-09-27 12:35:23 +00:00
Adhemerval Zanella
f0458cf4f9 powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC
Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
instead it suspend and resume it when leaving the kernel.  The
side-effects of the syscall will always remain visible, even if the
transaction is aborted.  This is an issue when transaction is used along
with futex syscall, on pthread_cond_wait for instance, where the futex
call might succeed but the transaction is rolled back leading the
pthread_cond object in an inconsistent state.

Glibc used to prevent it by always aborting a transaction before issuing
a syscall.  Linux 4.2 also decided to abort active transaction in
syscalls which makes the glibc workaround superfluous.  Worse, glibc
transaction abortion leads to a performance issue on recent kernels
where the HTM state is saved/restore lazily (v4.9).  By aborting a
transaction on every syscalls, regardless whether a transaction has being
initiated before, GLIBS makes the kernel always save/restore HTM state
(it can not even lazily disable it after a certain number of syscall
iterations).

Because of this shortcoming, Transactional Lock Elision is just enabled
when it has been explicitly set (either by tunables of by a configure
switch) and if kernel aborts HTM transactions on syscalls
(PPC_FEATURE2_HTM_NOSC).  It is reported that using simple benchmark [1],
the context-switch is about 5% faster by not issuing a tabort in every
syscall in newer kernels.

Checked on powerpc64le-linux-gnu with 4.4.0 kernel (Ubuntu 16.04).

	* NEWS: Add note about new TLE support on powerpc64le.
	* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
	* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
	__ununsed1.
	(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
	(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
	* sysdeps/powerpc/powerpc32/sysdep.h,
	sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
	ABORT_TRANSACTION): Remove macros.
	* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
	__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
	sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
	sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
	usage.
	* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.

Reported-by: Breno Leitão <leitao@debian.org>
2018-09-21 10:18:03 -07:00
Joseph Myers
7abf97bed9 Use trunc functions not __trunc functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __trunc functions to call the
corresponding trunc names instead, with asm redirection to __trunc
when the calls are not inlined.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect
	using MATH_REDIRECT.
	* sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_truncf.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise.
	* sysdeps/ieee754/float128/s_truncf128.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_trunc.c: Likewise.
	* sysdeps/ieee754/flt-32/s_truncf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise.
	* sysdeps/riscv/rvf/s_truncf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise.
	(ceil): Redirect to __ceil.
	(floor): Redirect to __floor.
	(trunc): Redirect to __trunc.
	(__truncl): Call trunc instead of __trunc.
	* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc):
	Remove macro.
	[_ARCH_PWR5X] (__truncf): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use
	trunc functions instead of __trunc variants.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
2018-09-20 21:11:10 +00:00
Joseph Myers
71223ef909 Use ceil functions not __ceil functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __ceil functions to call the
corresponding ceil names instead, with asm redirection to __ceil when
the calls are not inlined.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect
	using MATH_REDIRECT.
	* sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_ceilf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_ceil.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise.
	* sysdeps/ieee754/float128/s_ceilf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_ceilf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise.
	* sysdeps/riscv/rvf/s_ceilf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil):
	Remove macro.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil
	functions instead of __ceil variants.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive):
	Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
2018-09-17 20:42:06 +00:00
Joseph Myers
e44acb2063 Use floor functions not __floor functions in glibc libm.
Similar to the changes that were made to call sqrt functions directly
in glibc, instead of __ieee754_sqrt variants, so that the compiler
could inline them automatically without needing special inline
definitions in lots of math_private.h headers, this patch makes libm
code call floor functions directly instead of __floor variants,
removing the inlines / macros for x86_64 (SSE4.1) and powerpc
(POWER5).

The redirection used to ensure that __ieee754_sqrt does still get
called when the compiler doesn't inline a built-in function expansion
is refactored so it can be applied to other functions; the refactoring
is arranged so it's not limited to unary functions either (it would be
reasonable to use this mechanism for copysign - removing the inline in
math_private_calls.h but also eliminating unnecessary local PLT entry
use in the cases (powerpc soft-float and e500v1, for IBM long double)
where copysign calls don't get inlined).

The point of this change is that more architectures can get floor
calls inlined where they weren't previously (AArch64, for example),
without needing special inline definitions in their math_private.h,
and existing such definitions in math_private.h headers can be
removed.

Note that it's possible that in some cases an inline may be used where
an IFUNC call was previously used - this is the case on x86_64, for
example.  I think the direct calls to floor are still appropriate; if
there's any significant performance cost from inline SSE2 floor
instead of an IFUNC call ending up with SSE4.1 floor, that indicates
that either the function should be doing something else that's faster
than using floor at all, or it should itself have IFUNC variants, or
that the compiler choice of inlining for generic tuning should change
to allow for the possibility that, by not inlining, an SSE4.1 IFUNC
might be called at runtime - but not that glibc should avoid calling
floor internally.  (After all, all the same considerations would apply
to any user program calling floor, where it might either be inlined or
left as an out-of-line call allowing for a possible IFUNC.)

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT):
	New macro.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (floor): Likewise.
	* sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_floorf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_floor.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise.
	* sysdeps/ieee754/float128/s_floorf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_floorf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_floor.c: Likewise.
	* sysdeps/riscv/rvf/s_floorf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor):
	Remove macro.
	[_ARCH_PWR5X] (__floorf): Likewise.
	* sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove
	inline function.
	[__SSE4_1__] (__floorf): Likewise.
	* math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions
	instead of __floor variants.
	* math/w_lgamma_r_compat.c (__lgamma_r): Likewise.
	* math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise.
	* math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise.
	* math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise.
	* math/w_lgammal_r_compat.c (__lgammal_r): Likewise.
	* math/w_tgamma_compat.c (__tgamma): Likewise.
	* math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise.
	* math/w_tgammaf_compat.c (__tgammaf): Likewise.
	* math/w_tgammal_compat.c (__tgammal): Likewise.
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise.
	* sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2):
	Likewise.
	* sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise.
	* sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise.
	* sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise.
	* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise.
	* sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl):
	Likewise.
	* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise.
	* sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
2018-09-14 13:09:01 +00:00
Paul Pluzhnikov
a6e8926f8d [BZ #20271] Add newlines in __libc_fatal calls. 2018-08-31 18:04:32 -07:00
Rajalakshmi Srinivasaraghavan
fa78896b1f powerpc: Remove powerpc specific sinf and cosf optimization
New generic optimization of sinf and cosf introduced by commit
599cf39766 shows improvement
compared to powerpc specific assembly version.  Hence removing
the powerpc assembly versions to make use of generic code.
2018-08-20 08:47:43 +05:30
Rajalakshmi Srinivasaraghavan
7793ad7a2c powerpc: Rearrange little endian specific files
This patch moves little endian specific POWER9 optimization files to
sysdeps/powerpc/powerpc64/le and creates POWER9 ifunc functions
only for little endian.
2018-08-16 12:12:02 +05:30
Rogerio Alves
52b2a80fae powerpc64: Always restore TOC on longjmp [BZ #21895]
This patch changes longjmp to always restore the TOC pointer (r2 register)
to the caller frame on powerpc64 and powerpc64le.  This is related to bug
21895 that reports a situation where you have a static longjmp to a
shared object file.

	[BZ #21895]
	* sysdeps/powerpc/powerpc64/__longjmp-common.S: Remove condition code for
	restoring r2 in longjmp.
	* sysdeps/powerpc/powerpc64/Makefile: Added tst-setjmp-bug21895-static to
	test list.
	Added rules to build test tst-setjmp-bug21895-static.
	Added module setjmp-bug21895 and rules to build a shared object from it.
	* sysdeps/powerpc/powerpc64/setjmp-bug21895.c: New test file.
	* sysdeps/powerpc/powerpc64/tst-setjmp-bug21895-static.c: New test file.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2018-07-16 16:08:41 -03:00
Joseph Myers
809dc95d14 Fix powerpc64le build of nan-sign tests (bug 23303).
My recent nan-sign tests fail to build for powerpc64le with GCC 8
because of the special compile / link options needed there for any
test using _Float128.  This patch arranges for these tests to be
handled on powerpc64le similarly to other such tests.

Tested with build-many-glibcs.py for powerpc64le.

	[BZ #23303]
	* sysdeps/powerpc/powerpc64/le/Makefile
	(CFLAGS-tst-strtod-nan-sign.c): Add -mfloat128.
	(CFLAGS-tst-wcstod-nan-sign.c): Likewise.
	(gnulib-tests): Also add $(f128-loader-link) for
	tst-strtod-nan-sign abd tst-wcstod-nan-sign.
2018-06-18 11:27:51 +00:00
H.J. Lu
67c0579669 Mark _init and _fini as hidden [BZ #23145]
_init and _fini are special functions provided by glibc for linker to
define DT_INIT and DT_FINI in executable and shared library.  They
should never be put in dynamic symbol table.  This patch marks them as
hidden to remove them from dynamic symbol table.

Tested with build-many-glibcs.py.

	[BZ #23145]
	* elf/Makefile (tests-special): Add $(objpfx)check-initfini.out.
	($(all-built-dso:=.dynsym): New target.
	(common-generated): Add $(all-built-dso:$(common-objpfx)%=%.dynsym).
	($(objpfx)check-initfini.out): New target.
	(generated): Add check-initfini.out.
	* scripts/check-initfini.awk: New file.
	* sysdeps/aarch64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/alpha/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/arm/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/hppa/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/i386/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/ia64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/m68k/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/microblaze/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/mips/mips32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/mips/mips64/n32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/mips/mips64/n64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/nios2/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/powerpc/powerpc32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/powerpc/powerpc64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/s390/s390-32/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/s390/s390-64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/sh/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/sparc/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
	* sysdeps/x86_64/crti.S (_init): Mark as hidden.
	(_fini): Likewise.
2018-06-08 10:28:52 -07:00
Tulio Magno Quites Machado Filho
1c09524e4d powerpc64le: Fix TFtype in sqrtf128 when using -mabi=ieeelongdouble
When building with -mlong-double-128 or -mabi=ibmlongdouble, TFtype
represents the IBM 128-bit extended floating point type, while KFtype
represents the IEEE 128-bit floating point type.
The soft float implementation of e_sqrtf128 had to redefine TFtype and
TF in order to workaround this issue.  However, this behavior changes
when -mabi=ieeelongdouble is used and the macros are not necessary.

	* sysdeps/powerpc/powerpc64/le/fpu/e_sqrtf128.c
	[__HAVE_FLOAT128_UNLIKE_LDBL] (TFtype, TF): Restrict TFtype
	and TF redirection to KFtype and KF only when the default
	long double type is not the IEEE 128-bit floating point type.

Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2018-06-06 12:27:39 -03:00
Rajalakshmi Srinivasaraghavan
2c93fce76a powerpc: Add multiarch sqrtf128 for ppc64le
This patch creates ifunc for sqrtf128() to make use of new xssqrtqp
instruction for POWER9 when --enable-multi-arch and --with-cpu=power8
options are used on power9 system.  This is achieved by explicitly
adding -mcpu=power9 flag for sqrtf128-power9.
2018-05-30 21:31:27 +05:30
Tulio Magno Quites Machado Filho
c1dc1e1b34 powerpc: Move around math-related Implies
Currently, powerpc, powerpc64, and powerpc64le imply the same set of
subdirectories from sysdeps/ieee754: flt-32, dbl-64, ldbl-128ibm, and
ldbl-opt.  In preparation for the transition of the long double format -
from IBM Extended Precision to IEEE 754 128-bits floating-point - on
powerpc64le, this patch splits the shared Implies file into three
separate files (one for each of the powerpc architectures), without
changing their contents.  Future patches will modify powerpc64le.

	* sysdeps/powerpc/Implies: Removed.  Previous contents copied to...
	* sysdeps/powerpc/powerpc32/Implies-after: ... here.
	* sysdeps/powerpc/powerpc64/be/Implies-after: ... here.
	* sysdeps/powerpc/powerpc64/le/Implies-before: ... and here.
2018-05-24 22:49:10 -03:00