Commit Graph

77 Commits

Author SHA1 Message Date
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Raoni Fassina Firmino
066020c5e8 powerpc: Cleanup: use actual power8 assembly mnemonics
Some implementations in sysdeps/powerpc/powerpc64/power8/*.S still had
pre power8 compatible binutils hardcoded macros and were not using
.machine power8.

This patch should not have semantic changes, in fact it should have the
same exact code generated.

Tested that generated stripped shared objects are identical when
using "strip --remove-section=.note.gnu.build-id".

Checked on:
- powerpc64le, power9, build-many-glibcs.py, gcc 6.4.1 20180104, binutils 2.26.2.20160726
- powerpc64le, power8, debian 9, gcc 6.3.0 20170516, binutils 2.28
- powerpc64le, power9, ubuntu 19.04, gcc 8.3.0, binutils 2.32
- powerpc64le, power9, opensuse tumbleweed, gcc 9.1.1 20190527, binutils 2.32
- powerpc64, power9, debian 10, gcc 8.3.0, binutils 2.31.1

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-08-01 15:57:50 -03:00
Adhemerval Zanella
aa32f5bf0c powerpc: Use generic e_expf
Generic implementation is faster on both power8 and power9:

POWER9:
- sysdeps/ieee754/flt-32/e_expf.c
  "expf": {
   "workload-spec2017.wrf": {
    "duration": 5.1236e+09,
    "iterations": 7.53344e+08,
    "reciprocal-throughput": 5.9436,
    "latency": 7.65869,
    "max-throughput": 1.68248e+08,
    "min-throughput": 1.30571e+08
   }
  }

- sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S
  "expf": {
   "workload-spec2017.wrf": {
    "duration": 5.14429e+09,
    "iterations": 5.29248e+08,
    "reciprocal-throughput": 8.05372,
    "latency": 11.3863,
    "max-throughput": 1.24166e+08,
    "min-throughput": 8.78249e+07
   }
  }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove e_expf-power8 and expf-ppc64.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-power8.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
2019-06-26 14:33:58 -03:00
Adhemerval Zanella
dee07df1a4 powerpc: Refactor powerpc64 lround/lroundf/llround/llroundf
This patches consolidates all the powerpc {l}lround{f} implementations
on the generic sysdeps/powerpc/fpu/s_{l}lround{f}.c.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llround-power8, s_llround-power6x,
	s_llround-power5+, s_llround-ppc64, and s_llroundf-ppc64.
	(CFLAGS-s_llround-power8.c, CFLAGS-s_llround-power6x.c,
	CFLAGS-s_llround-power5+.c): New rule.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power5+.c:
	New file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power6x.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power8.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lround.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lround.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/Makefile
	[$(subdir) == math] (CFLAGS-s_llround.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_llround-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power5+.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power6x.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power8.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf-ppc64.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lround.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lroundf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llroundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llroundf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:28:21 -03:00
Adhemerval Zanella
78049de0a9 powerpc: refactor powerpc64 lrint/lrintf/llrint/llrintf
This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/fpu/s_llrint{f}.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llrint-power8, s_llrint-power6x, and
	s_llrint-ppc64.
	(CFLAGS-s_llrint-power8.c, CFLAGS-s_llrint-power6x.c): New rule.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power6x.c: New
	file.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power8.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-ppc64.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lrint.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrintf.c: ... here.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/Makefile: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Remove s_llrint-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Remove
	file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrint.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_lrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-17 09:26:21 -03:00
Adhemerval Zanella
1192696069 powerpc: Remove optimized finite
The powerpc finite optimization do not show much gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc64-linux-gnu-power7 (--with-cpu=power7):

    - Generic sysdeps/ieee754 implementation:
       "isfinite": {
        "": {
         "duration": 5.0082e+09,
         "iterations": 2.45299e+09,
         "max": 43.824,
         "min": 2.008,
         "mean": 2.04167
        },
        "INF": {
         "duration": 4.66554e+09,
         "iterations": 2.28288e+09,
         "max": 35.73,
         "min": 2.008,
         "mean": 2.04371
        },
        "NAN": {
         "duration": 4.66274e+09,
         "iterations": 2.28716e+09,
         "max": 34.161,
         "min": 2.009,
         "mean": 2.03866
        }
       }

    - power7 optimized one:
       "isfinite": {
        "": {
         "duration": 4.99111e+09,
         "iterations": 2.65566e+09,
         "max": 25.015,
         "min": 1.716,
         "mean": 1.87942
        },
        "INF": {
         "duration": 4.6783e+09,
         "iterations": 2.0999e+09,
         "max": 35.264,
         "min": 1.868,
         "mean": 2.22787
        },
        "NAN": {
         "duration": 4.67915e+09,
         "iterations": 2.08678e+09,
         "max": 38.099,
         "min": 1.869,
         "mean": 2.24228
        }
       }

     So it basically optimizes marginally for normal numbers while
     increasing the latency for other kind of FP.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_finite*
	objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
	Remove s_finite* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
6427a6ac8c powerpc: Remove optimized isinf
The powerpc isinf optimizations onyl adds complexity:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input pattern and branch
    implementation for INF and denormal that does:

    return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000)

    Although it does show slight better latency than generic algorithm
    (as below), it is only for power7 and requires it to override it
    for power8.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf*
	objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S:
	Remove file.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
	Remove s_isinf* and s_isinf* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:39 -03:00
Adhemerval Zanella
2666f96390 powerpc: Remove optimized isnan
The powerpc isnan optimizations are not really a gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power5, power6, and power6x are just micro-optimization to
    improve the Load-Hit-Store hazards from floating-point to general
    register transfer, and current GCC already has support to minimize
    it by inserting either extra nops or group dispatch instructions.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc-linux-gnu-power4 (which uses the hp-timing support on
    benchtests):

    - Generic sysdeps/ieee754 implementation:
      "isnan": {
       "": {
        "duration": 4.98415e+09,
        "iterations": 2.34516e+09,
        "max": 45.925,
        "min": 2.052,
        "mean": 2.12529
       },
       "INF": {
        "duration": 4.74057e+09,
        "iterations": 1.69761e+09,
        "max": 91.01,
        "min": 2.052,
        "mean": 2.79249
       },
       "NAN": {
        "duration": 4.74071e+09,
        "iterations": 1.68768e+09,
        "max": 282.343,
        "min": 2.052,
        "mean": 2.809
       }
      }

    - power7 optimized one:
    $ ./testrun.sh benchtests/bench-isnan
      "isnan": {
       "": {
        "duration": 4.96842e+09,
        "iterations": 2.56297e+09,
        "max": 50.048,
        "min": 1.872,
        "mean": 1.93854
       },
       "INF": {
        "duration": 4.76648e+09,
        "iterations": 1.54213e+09,
        "max": 373.408,
        "min": 2.661,
        "mean": 3.09084
       },
       "NAN": {
        "duration": 4.76845e+09,
        "iterations": 1.54515e+09,
        "max": 51.016,
        "min": 2.736,
        "mean": 3.08607
       }
      }

    So it basically optimizes marginally for normal numbers while
    increasing the latency for other kind of FP.

  - The generic implementation requires getting the floating point
    status, disable the invalid operation bit, and restore the
    floating-point status.  Each operation is costly and requires
    flushing the FP pipeline.

    Using the same scenarion for the previous analysis:

      "isnan": {
       "": {
        "duration": 5.08284e+09,
        "iterations": 6.2898e+08,
        "max": 41.844,
        "min": 8.057,
        "mean": 8.08108
       },
       "INF": {
        "duration": 4.97904e+09,
        "iterations": 6.16176e+08,
        "max": 39.661,
        "min": 8.057,
        "mean": 8.08055
       },
       "NAN": {
        "duration": 4.98695e+09,
        "iterations": 5.95866e+08,
        "max": 29.728,
        "min": 8.345,
        "mean": 8.36925
       }
      }

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_isnan.c: Remove file.
	* sysdeps/powerpc/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and
	s_isnanf-* objects.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S:
	Remove file
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls):
	Remove s_isnan-* and s_isnanf-* objects.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
2019-06-12 14:32:36 -03:00
Gabriel F. T. Gomes
9250e6610f powerpc: Fix build failures with current GCC
Since GCC commit 271500 (svn), also known as the following commit on the
git mirror:

commit 61edec870f9fdfb5df3fa4e40f28cbaede28a5b1
Author: amodra <amodra@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed May 22 04:34:26 2019 +0000

    [RS6000] Don't pass -many to the assembler

glibc builds are failing when an assembly implementation does not
declare the correct '.machine' directive, or when no such directive is
declared at all.  For example, when a POWER6 instruction is used, but
'.machine power6' is not declared, the assembler will fail with an error
similar to the following:

    ../sysdeps/powerpc/powerpc64/power8/strcmp.S: Assembler messages:
     24 ../sysdeps/powerpc/powerpc64/power8/strcmp.S:55: Error: unrecognized opcode: `cmpb'

This patch adds '.machine powerN' directives where none existed, as well
as it updates '.machine power7' directives on POWER8 files, because the
minimum binutils version required to build glibc (binutils 2.25) now
provides this machine version.  It also adds '-many' to the assembler
command used to build tst-set_ppr.c.

Tested for powerpc, powerpc64, and powerpc64le, as well as with
build-many-glibcs.py for powerpc targets.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2019-05-30 11:10:48 -03:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Rajalakshmi Srinivasaraghavan
fa78896b1f powerpc: Remove powerpc specific sinf and cosf optimization
New generic optimization of sinf and cosf introduced by commit
599cf39766 shows improvement
compared to powerpc specific assembly version.  Hence removing
the powerpc assembly versions to make use of generic code.
2018-08-20 08:47:43 +05:30
Gabriel F. T. Gomes
3a33b06969 powerpc64*: fix the order of implied sysdeps directories
The creation of the divergent sysdeps directory for powerpc64le

commit 2f7f3cd8cd
Author: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
Date:   Fri Jul 15 18:04:40 2016 -0500

    powerpc64le: Create divergent sysdep directory for powerpc64le.

allowed float128 to be enabled for powerpc64le (little-endian) and not
for powerpc64 (big-endian).  Since the only intended difference between
them was the presence or absence of the float128 interface, the sysdeps
directory for powerpc64le explicitly reused the files from powerpc64
(through the use of Implies files).

Although this works, it also means that files under the powerpc64
directory might be preferred over files under powerpc64le.  For
instance, on a build for powerpc64le with target set to power9, a file
from powerpc64/power5 might get built, even though a file with the same
name exists in powerpc64le/power8.  That happens because the processor
hierarchy was only defined in the sysdeps directory for powerpc64 (and
borrowed by powerpc64le).

This patch fixes this behavior, by creating new subdirectories under
powerpc64 (i.e.: powerpc64/be and powerpc64/le) and creating new Implies
files to provide the hierarchy of processors for powerpc64 and
powerpc64le separately.  These changes have no effect on installed,
stripped binaries (which remain unchanged).

Tested that installed stripped binaries are unchanged and that there are
no regressions on powerpc64 and powerpc64le.
2018-04-27 16:32:01 -03:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Rajalakshmi Srinivasaraghavan
1e36806fb8 powerpc: st{r,p}cpy optimization for aligned strings
This patch makes use of vectors for aligned inputs.  Improvements
upto 30% seen for larger aligned inputs.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
2017-12-15 10:55:58 +05:30
Joseph Myers
216933b242 Use libm_alias_float for powerpc.
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes powerpc libm function implementations use
libm_alias_float to define function aliases.

Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged for all its hard-float powerpc configurations.

	* sysdeps/powerpc/fpu/s_cosf.c: Include <libm-alias-float.h>.
	(cosf): Define using libm_alias_float.
	* sysdeps/powerpc/fpu/s_fabs.S: Include <libm-alias-float.h>.
	(fabsf): Define using libm_alias_float.
	* sysdeps/powerpc/fpu/s_fmaf.S: Include <libm-alias-float.h>.
	(fmaf): Define using libm_alias_float.
	* sysdeps/powerpc/fpu/s_rintf.c: Include <libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/powerpc/fpu/s_sinf.c: Include <libm-alias-float.h>.
	(sinf): Define using libm_alias_float.
	* sysdeps/powerpc/power5+/fpu/s_modff.c: Include
	<libm-alias-float.h>.
	(modff): Define using libm_alias_float.
	* sysdeps/powerpc/power7/fpu/s_logbf.c: Include
	<libm-alias-float.h>.
	(logbf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_llroundf.c: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Include
	<libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_lround.S: Include
	<libm-alias-float.h>.
	(lroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Include
	<libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Include
	<libm-alias-float.h>.
	(roundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c:
	Include <libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
	Include <libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c:
	Include <libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf.c:
	Include <libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llroundf.c:
	Include <libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf.c:
	Include <libm-alias-float.h>.
	(logbf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrintf.c:
	Include <libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lroundf.c:
	Include <libm-alias-float.h>.
	(lroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff.c:
	Include <libm-alias-float.h>.
	(modff): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c:
	Include <libm-alias-float.h>.
	(roundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c:
	Include <libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Include
	<libm-alias-float.h>.
	(lroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_roundf.S: Include
	<libm-alias-float.h>.
	(roundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_truncf.S: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power6x/fpu/s_lrint.S: Include
	<libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc32/power6x/fpu/s_lround.S: Include
	<libm-alias-float.h>.
	(lroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf.c: Include
	<libm-alias-float.h>.
	(cosf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Include
	<libm-alias-float.h>.
	(logbf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Include
	<libm-alias-float.h>.
	(modff): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Include
	<libm-alias-float.h>.
	(roundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf.c: Include
	<libm-alias-float.h>.
	(sinf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Include
	<libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Include
	<libm-alias-float.h>.
	(roundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: Include
	<libm-alias-float.h>.
	(roundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Include
	<libm-alias-float.h>.
	(cosf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Include
	<libm-alias-float.h>.
	(llroundf): Define using libm_alias_float.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S: Include
	<libm-alias-float.h>.
	(sinf): Define using libm_alias_float.
2017-12-05 00:26:26 +00:00
Joseph Myers
d17542d235 Use libm_alias_double for remaining powerpc functions.
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes the remaining double powerpc functions use
libm_alias_double to define function aliases (with consequent removal
of the need for local compat symbol handling).  Previous cleanups
avoid this patch changing installed stripped shared libraries for any
build-many-glibcs.py configuration (there are still some functions in
this patch for which the order of double and float aliases changes
within an individual source file, but in this case this doesn't result
in changes to the final library).

Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged for all its hard-float powerpc configurations.

	* sysdeps/powerpc/power7/fpu/s_logb.c: Include
	<libm-alias-double.h>.
	(logb): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/fpu/s_llrint.c: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/fpu/s_llround.c: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Include
	<libm-alias-double.h>.
	(lrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/fpu/s_lround.S: Include
	<libm-alias-double.h>.
	(lround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
	Include <libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint.c:
	Include <libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround.c:
	Include <libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb.c: Include
	<libm-alias-double.h>.
	(logb): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint.c:
	Include <libm-alias-double.h>.
	(lrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround.c:
	Include <libm-alias-double.h>.
	(lround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Include
	<libm-alias-double.h>.
	(lround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power6x/fpu/s_lrint.S: Include
	<libm-alias-double.h>.
	(lrint): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc32/power6x/fpu/s_lround.S: Include
	<libm-alias-double.h>.
	(lround): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	(lrint): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround.c: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	(lround): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Include
	<libm-alias-double.h>.
	(logb): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	(lrint): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	(lround): Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	(lround): Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	(lrint): Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	(lround): Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	(lrint): Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Include
	<libm-alias-double.h>.
	(llround): Define using libm_alias_double.
	(lround): Likewise.
2017-12-02 00:11:37 +00:00
Alan Modra
174935af03 PowerPC64 power8 strncpy cfi fixes
cfi info for stack adjust needs to be on the insn doing the adjust.
cfi describing register saves can be anywhere after the save insn but
before the reg is altered.  Fewer locations with cfi result in smaller
cfi programs and possibly slightly faster exception handling.  Thus
the LR cfi_offset move.

The idea behind ajusting sp after restoring regs is to break a
register dependency chain, in this case not be using r1 immediately
after it is modified.

The missing LR cfi_restore meant that code after the blr,
unaligned_lt_16 and other labels, would have cfi that said LR was at
cfa+16, but that code is reached without LR being saved.

	* sysdeps/powerpc/powerpc64/power8/strncpy.S: Move LR cfi.
	Adjust stack after restoring regs.  Add missing LR cfi_restore.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
2017-10-23 07:46:58 +10:30
Rajalakshmi Srinivasaraghavan
5a90716891 powerpc: Fix IFUNC for memrchr
Recent commit 59ba2d2b54 missed to add __memrchr_power8 in
ifunc list.  Also handled discarding unwanted bytes for
unaligned inputs in power8 optimization.

2017-10-05  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>

	* sysdeps/powerpc/powerpc64/multiarch/memrchr-ppc64.c: Revert
	back to powerpc32 file.
	* sysdeps/powerpc/powerpc64/multiarch/memrchr.c
	(memrchr): Add __memrchr_power8 to ifunc list.
	* sysdeps/powerpc/powerpc64/power8/memrchr.S: Mask
	extra bytes for unaligned inputs.
2017-10-06 10:04:52 +05:30
Szabolcs Nagy
f7a0b063e7 Do not wrap expf and exp2f
The new generic expf and exp2f code don't need wrappers any more, they
set errno inline, so only use the wrappers on targets that need it.
(If the wrapper is needed, then the top level wrapper code is included,
otherwise empty w_exp*f.c is used to suppress the wrapper.)

A powerpc64 expf implementation includes the expf c code directly which
needed some changes.

	* sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper.
	* sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise
	* sysdeps/ieee754/flt-32/w_exp2f.c: New file.
	* sysdeps/ieee754/flt-32/w_expf.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for
	the new expf code.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file.
	* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_expf.c: New file.
	* sysdeps/i386/fpu/w_exp2f.c: New file.
	* sysdeps/i386/fpu/w_expf.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file.
	* sysdeps/x86_64/fpu/w_expf.c: New file.
2017-10-02 14:38:54 +01:00
Rajalakshmi Srinivasaraghavan
59ba2d2b54 powerpc: Optimize memrchr for power8
Vectorized loops are used for sizes greater than 32B to improve
performance over power7 optimization.  This shows as an average
of 25% improvement depending on the position of search
character.  The performance is same for shorter strings.
2017-10-02 17:31:13 +05:30
Rajalakshmi Srinivasaraghavan
bd17ba29eb powerpc: Avoid misaligned stores in memset
As per the section "3.1.4.2 Alignment Interrupts" of the "POWER8 Processor
User's Manual for the Single-Chip Module", alignment interrupt is reported
for misaligned stores in  Caching-inhibited storage.  As memset is used in
some drivers for DMA (like xorg), this patch avoids misaligned stores for
sizes less than 8 in memset.
2017-09-19 13:55:49 +05:30
Joseph Myers
f17a42333f Do not use __ptr_t.
sys/cdefs.h has a macro __ptr_t, which a few places in glibc use
instead of void *.  void * is a well-understood standard type for that
purpose and in a post-C89 context there is no need for a macro for it;
this patch changes those places to use void * directly instead.

Unlike __long_double_t, __ptr_t is widely used outside glibc (or at
least has many hits on codesearch.debian.net).  I don't know how many
of those uses would break if sys/cdefs.h ceased to define the macro,
but there's enough risk that this patch leaves the definition and just
removes the uses within glibc; removal of the definition can be
considered separately if desired.

Tested for x86_64, and with build-many-glibcs.py.

	* malloc/mcheck.c (old_free_hook): Use void * instead of __ptr_t.
	(old_malloc_hook): Likewise.
	(old_memalign_hook): Likewise.
	(old_realloc_hook): Likewise.
	(struct hdr): Likewise.
	(flood): Likewise.
	(freehook): Likewise.
	(mallochook): Likewise.
	(memalignhook): Likewise.
	(reallochook): Likewise.
	(mprobe): Likewise.
	* malloc/mtrace.c (mallwatch): Likewise.
	(tr_old_free_hook): Likewise.
	(tr_old_malloc_hook): Likewise.
	(tr_old_realloc_hook): Likewise.
	(tr_old_memalign_hook): Likewise.
	(tr_where): Likewise.
	(lock_and_info): Likewise.
	(tr_freehook): Likewise.
	(tr_mallochook): Likewise.
	(tr_reallochook): Likewise.
	(tr_memalignhook): Likewise.
	* misc/err.h [!__GNUC_VA_LIST] (__gnuc_va_list): Likewise.
	* misc/mmap.c (__mmap): Likewise.
	* misc/mmap64.c (__mmap64): Likewise.
	* misc/mprotect.c (__mprotect): Likewise.
	* misc/msync.c (msync): Likewise.
	* misc/munmap.c (__munmap): Likewise.
	* posix/posix_madvise.c (posix_madvise): Likewise.
	* socket/send.c (__send): Likewise.
	* socket/sendto.c (__sendto): Likewise.
	* socket/setsockopt.c (__setsockopt): Likewise.
	* string/memcmp.c (__ptr_t): Remove macro.
	(MEMCMP): Use void * instead of ptr_t.
	* string/memrchr.c (__ptr_t): Remove macro.
	(__memrchr): Use void * instead of ptr_t.
	* sysdeps/mach/hurd/dl-sysdep.c (__mmap): Likewise.
	* sysdeps/mach/hurd/mmap.c (__mmap): Likewise.
	* sysdeps/mach/hurd/mmap64.c (__mmap64): Likewise.
	* sysdeps/mach/mprotect.c (__mprotect): Likewise.
	* sysdeps/mach/msync.c (msync): Likewise.
	* sysdeps/mach/munmap.c (__munmap): Likewise.
	* sysdeps/mips/bits/setjmp.h (struct __jmp_buf_internal_tag):
	Likewise.
	* sysdeps/posix/getcwd.c (__getcwd): Likewise.
	* sysdeps/powerpc/powerpc32/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc32/power4/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc32/power4/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc32/power6/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc32/power6/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc32/power7/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc32/power7/mempcpy.S (__mempcpy):
	Likewise.
	* sysdeps/powerpc/powerpc32/power7/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc64/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc64/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc64/power4/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc64/power4/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc64/power6/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc64/power6/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc64/power7/memcpy.S (memcpy): Likewise.
	* sysdeps/powerpc/powerpc64/power7/mempcpy.S (__mempcpy):
	Likewise.
	* sysdeps/powerpc/powerpc64/power7/memset.S (memset): Likewise.
	* sysdeps/powerpc/powerpc64/power8/memset.S (memset): Likewise.
	* sysdeps/tile/memcmp.c (__ptr_t): Remove macro.
	(MEMCMP): Use void * instead of ptr_t.
	* sysdeps/unix/sysv/linux/alpha/oldglob.c (old_glob_t): Likewise.
	* sysdeps/unix/sysv/linux/mmap.c (__mmap): Likewise.
2017-08-08 17:14:49 +00:00
Rajalakshmi Srinivasaraghavan
2572f356b1 powerpc: Clean up strlen and strnlen for power8
To align a quadword aligned address to 64 bytes, maximum of three
16 bytes load is needed for worst case instead of loading four times.
2017-07-03 10:46:13 +05:30
Rajalakshmi Srinivasaraghavan
12f50337ae powerpc: refactor strrchr IFUNC
As done in commit 6d15a5c2e9
clean up IFUNC implementation for power8 in order to remove
unneeded macro definitions.
2017-06-23 11:24:30 +05:30
Rajalakshmi Srinivasaraghavan
001b09a6a2 powerpc: Add optimized version of [l]lroundf
This patch makes use of optimized double version of llround for single
precision as both the versions return [long] long type.
2017-06-23 10:43:31 +05:30
Rajalakshmi Srinivasaraghavan
43e0ac24c8 powerpc: Optimize memchr for power8
Vectorized loops are used for sizes greater than 32B to improve
performance over power7 optimiztion.
2017-06-21 10:55:12 +05:30
Rajalakshmi Srinivasaraghavan
99c3eb0f73 powerpc: Add optimized version of [l]lrintf
This patch makes use of optimized double version of llrint for single
precision as both the versions return [long] long type.
2017-06-21 10:44:18 +05:30
Alan Modra
d5b411854f PowerPC64 ENTRY_TOCLESS
A number of functions in the sysdeps/powerpc/powerpc64/ tree don't use
or change r2, yet declare a global entry that sets up r2.  This patch
fixes that problem, and consolidates the ENTRY and EALIGN macros.

	* sysdeps/powerpc/powerpc64/sysdep.h: Formatting.
	(NOPS, ENTRY_3): New macros.
	(ENTRY): Rewrite.
	(ENTRY_TOCLESS): Define.
	(EALIGN, EALIGN_W_0, EALIGN_W_1, EALIGN_W_2, EALIGN_W_4, EALIGN_W_5,
	EALIGN_W_6, EALIGN_W_7, EALIGN_W_8): Delete.
	* sysdeps/powerpc/powerpc64/a2/memcpy.S: Replace EALIGN with ENTRY.
	* sysdeps/powerpc/powerpc64/dl-trampoline.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
	* sysdeps/powerpc/powerpc64/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
	* sysdeps/powerpc/powerpc64/addmul_1.S: Use ENTRY_TOCLESS.
	* sysdeps/powerpc/powerpc64/cell/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/lshift.S: Likewise.
	* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/mul_1.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/add_n.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strcasecmp.S (strcasecmp_l):
	Likewise.
	* sysdeps/powerpc/powerpc64/power7/strchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strncpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/memcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strnlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strrchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strspn.S: Likewise.
	* sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/strchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/strlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/ppc-mcount.S: Store LR earlier.  Don't
	add nop when SHARED.
	* sysdeps/powerpc/powerpc64/start.S: Fix comment.
	* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S (ENTRY): Don't
	define.
	(ENTRY_TOCLESS): Define.
	* sysdeps/powerpc/powerpc32/sysdep.h (ENTRY_TOCLESS): Define.
	* sysdeps/powerpc/fpu/s_fma.S: Use ENTRY_TOCLESS.
	* sysdeps/powerpc/fpu/s_fmaf.S: Likewise.
2017-06-14 10:45:50 +09:30
Alan Modra
de7ee73d6f PowerPC64 strncpy, stpncpy and strstr fixes
Makes __stpncpy_power8 call __memset_power8 directly rather than via an
IFUNC.  Fixes a missing _mcount, and removes some redundant NOPS.  The
*_is_local defines are also used in a followup patch.

	* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Define
	MEMSET_is_local.
	* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
	Define MEMSET.
	* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define
	STRLEN_is_local, STRNLEN_is_local, and STRCHR_is_local.
	* sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise.  Don't add
	nop after local calls.
	* sysdeps/powerpc/powerpc64/power7/strncpy.S: Define MEMSET_is_local.
	Don't add nop after local call.
	* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.  Add missing
	CALL_MCOUNT.
2017-06-14 10:44:59 +09:30
Rajalakshmi Srinivasaraghavan
dec4a7105e powerpc: Improve memcmp performance for POWER8
Vectorization improves performance over the current implementation.
Tested on powerpc64 and powerpc64le.
2017-05-18 11:21:20 +05:30
Paul Clarke
b2980e3c54 powerpc: Add a POWER8-optimized version of cosf()
This implementation is based on the one already used at
sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-power8.S.

	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	[$(subdir) = math] (libm-sysdep_routines): Add s_cosf-power8 and
	s_cosf-ppc64.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-power8.S: New file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf.c: Likewise.
	* sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Likewise.
2017-05-17 18:37:48 -03:00
Rajalakshmi Srinivasaraghavan
6c6ab1fc49 powerpc64: strrchr optimization for power8
P7 code is used for <=32B strings and for > 32B vectorized loops are used.
This shows as an average 25% improvement depending on the position of search
character.  The performance is same for shorter strings.
Tested on ppc64 and ppc64le.
2017-04-18 11:28:56 +05:30
Wainer dos Santos Moschetta
18e0054bf7 powerpc: refactor memset IFUNC.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.

Tested on ppc64le with and without --disable-multi-arch flag.

	* sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Define the
	implementation-specific function name and remove unneeded macros
	definition.
	* sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/memset.S: Set a default function name if
	not defined and pass as parameter to macros accordingly.
	* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
2017-04-11 17:13:55 -03:00
Wainer dos Santos Moschetta
f0748b70a8 powerpc: refactor strcasestr and strstr IFUNC.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.

Tested on ppc64le with and without --disable-multi-arch flag.

	* sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Define the
	strcasestr implementation name and remove unneeded macros definition.
	* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define
	strstr implementation name and remove unneeded macros definition.
	* sysdeps/powerpc/powerpc64/power7/strstr.S: Set a default function
	name if not defined and pass as parameter to macros accordingly.
	* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
2017-04-11 17:13:55 -03:00
Wainer dos Santos Moschetta
6d15a5c2e9 powerpc: refactor strchr, strchrnul, and strrchr IFUNC.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.

Tested on ppc64le with and without --disable-multi-arch flag.

	* sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Define the
	implementation-specific function name and remove unneeded macros
	definition.
	* sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strchr.S: Set a default
	function name if not defined and pass as parameter to macros
	accordingly.
	* sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/strchr.S: Likewise.
2017-04-11 17:13:54 -03:00
Wainer dos Santos Moschetta
001649fd18 powerpc: refactor strnlen and strlen IFUNC.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.

Tested on ppc64le with and without --disable-multi-arch flag.

	* sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Define
	the strlen implementation name and remove unneeded macros definition.
	* sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Define
	the strnlen implementation name and remove unneeded macros definition.
	* sysdeps/powerpc/powerpc64/power7/strlen.S: Set a default function
	name if not defined and pass as parameter to macros accordingly.
	* sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/strlen.S: Likewise.
2017-04-11 17:13:54 -03:00
Wainer dos Santos Moschetta
3bc426e156 powerpc: refactor strcasecmp, strcmp, and strncmp IFUNC.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.

Tested on ppc64le with and without --disable-multi-arch flag.

	* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Define
	the implementation-specific function name and remove unneeded
	macros definition.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/strncmp.S: Set a default function
	name if not defined and pass as parameter to macros accordingly.
	* sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/strcmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
2017-04-11 17:13:54 -03:00
Wainer dos Santos Moschetta
dbcc7d0893 powerpc: refactor stpcpy, stpncpy, strcpy, and strncpy IFUNC.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.

Tested on ppc64le with and without --disable-multi-arch flag.

	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Define the
	implementation-specific function name and remove unneeded macros
	definition.
	* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strncpy.S: Set a default
	function name if not defined.
	* sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
2017-04-11 17:13:31 -03:00
Wainer dos Santos Moschetta
18b6e2c86c powerpc64: Add POWER8 strnlen
Added strnlen POWER8 otimized for long strings. It delivers
same performance as POWER7 implementation for short strings.

This takes advantage of reasonably performing unaligned loads
and bit permutes to check the first 1-16 bytes until
quadword aligned, then checks in 64 bytes strides until unsafe,
then 16 bytes, truncating the count if need be.

Likewise, the POWER7 code is recycled for less than 32 bytes strings.

Tested on ppc64 and ppc64le.

	* sysdeps/powerpc/powerpc64/multiarch/Makefile
	(sysdep_routines): Add strnlen-power8.
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
	(strnlen): Add __strnlen_power8 to list of strnlen functions.
	* sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S:
	New file.
	* sysdeps/powerpc/powerpc64/multiarch/strnlen.c
	(__strnlen): Add __strnlen_power8 to ifunc list.
	* sysdeps/powerpc/powerpc64/power8/strnlen.S: New file.
2017-04-05 10:26:58 -03:00
Rajalakshmi Srinivasaraghavan
04f0fd640d powerpc: Improve strcmp performance for shorter strings
For strings >16B and <32B existing algorithm takes more time than default
implementation when strings are placed closed to end of page. This is due
to byte by byte access for handling page cross. This is improved by
following >32B code path where the address is adjusted to aligned memory
before doing load doubleword operation instead of loading bytes.

Tested on powerpc64 and powerpc64le.
2017-02-07 10:40:26 +05:30
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Rajalakshmi Srinivasaraghavan
9314d3545e powerpc64: strchr/strchrnul optimization for power8
The P7 code is used for <=32B strings and for > 32B vectorized loops are used.
This shows as an average 25% improvement depending on the position of search
character.  The performance is same for shorter strings.
Tested on ppc64 and ppc64le.
2016-12-28 11:44:31 -02:00
Rajalakshmi Srinivasaraghavan
30e4cc5413 powerpc: Fix return code of strcasecmp for unaligned inputs
If the input values are unaligned and if there are null characters in the
memory before the starting address of the input values, strcasecmp
gives incorrect return code. Fixed it by adding mask the bits that
are not part of the string.
2016-07-05 21:20:41 +05:30
Anton Blanchard
aa95fc13f5 powerpc: Add a POWER8-optimized version of sinf()
This uses the implementation of sinf() in sysdeps/x86_64/fpu/s_sinf.S
as inspiration.
2016-06-30 16:08:49 -03:00
Tulio Magno Quites Machado Filho
35da2541c3 powerpc: Add a POWER8-optimized version of expf()
This implementation is based on the one already used at
sysdeps/x86_64/fpu/e_expf.S.

This implementation improves the performance by ~14% on average in synthetic
benchmarks at the cost of decreasing accuracy to 1 ULP.
2016-06-30 14:56:14 -03:00
raji
c8376f3e07 powerpc: strcasecmp/strncasecmp optmization for power8
This implementation utilizes vectors to improve performance
compared to current byte by byte implementation for POWER7.
The performance improvement is upto 4x.  This patch is tested
on powerpc64 and powerpc64le.
2016-06-14 14:51:16 +05:30
Tulio Magno Quites Machado Filho
c24480ce3b powerpc: Fix --disable-multi-arch build on POWER8
Add missing symbols of stpncpy and strcasestr when multi-arch is
disabled.
Fix memset call from strncpy/stpncpy when multi-arch is disabled.
2016-06-06 16:03:29 -03:00
Gabriel F. T. Gomes
eb3b8a4924 powerpc: Fix operand prefixes
The file sysdeps/powerpc/sysdeps.h defines aliases for condition register
operands.  E.g.: 'cr7' means condition register 7.  On the one hand, this
increases readability, as it makes it easier for readers to know whether the
operand is a condition register, a general purpose register or an immediate.
On the other hand, this permits that condition registers be written as if they
were general purpose, and vice-versa, thus reducing the readability of the
code.

This commit removes some of these unintentional misuses.

The changes have no effect on the final code.  Checked with objdump.
2016-05-04 09:14:52 -03:00