Commit Graph

7 Commits

Author SHA1 Message Date
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Anton Youdkevitch
32e902a94e aarch64: thunderx2 memmove performance improvements
The performance improvement is about 20%-30% for
larger cases and about 1%-5% for smaller cases.

Used SIMD load/store instead of GPR for large
overlapping forward moves.

Reused existing memcpy implementation for smaller
or overlapping backward moves.

Fixed the existing memcpy implementation to allow it
to deal with the overlapping case.

Simplified loop tails in the memcpy implementation -
use branchless overlapping sequence of fixed length
load/stores instead of branching depending on the
size.

A cleanup/optimization converting str's to stp's.

Added __memmove_thunderx2 to the list of the
available implementations.
2019-05-03 11:01:34 -07:00
Anton Youdkevitch
94e358f6d4 aarch64: thunderx2 memcpy implementation cleanup and streamlining
Here is the updated patch for improving the long unaligned
code path (the one using "ext" instruction).

1. Always taken conditional branch at the beginning is
removed.

2. Epilogue code is placed after the end of the loop to
reduce the number of branches.

3. The redundant "mov" instructions inside the loop are
gone due to the changed order of the registers in the "ext"
instructions inside the loop,  the prologue has additional
"ext" instruction.

4.Updating count in the prologue was hoisted out as
it is the same update for each prologue.

5. Invariant code of the loop epilogue was hoisted out.

6. As the current size of the ext chunk is exactly 16
instructions long "nop" was added at the beginning
of the code sequence so that the loop entry for all the
chunks be aligned.

	* sysdeps/aarch64/multiarch/memcpy_thunderx2.S: Cleanup branching
	and remove redundant code.
2019-04-05 13:59:54 -07:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Steve Ellcey
f0da0bcf8b Remove extra space at end of line. 2018-10-16 11:02:03 -07:00
Anton Youdkevitch
75c1aee500 aarch64: optimized memcpy implementation for thunderx2
Since aligned loads and stores are huge performance
advantage the implementation always tries to do aligned
access. Among the cases when src and dst addresses are
aligned or unaligned evenly there are cases of not evenly
unaligned src and dst. For such cases (if the length is
big enough) ext instruction is used to merge-and-shift
two memory chunks loaded from two adjacent aligned
locations and then the adjusted chunk gets stored to
aligned address.

Performance gain against the current T2 implementation:
     memcpy-large: 65K-32M: +40% - +10%
     memcpy-walk:  128-32M: +20% - +2%
2018-10-16 11:00:27 -07:00
Steve Ellcey
e9537dddc7 IFUNC for Cavium ThunderX2
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
	Add memcpy_thunderx2.
	* sysdeps/aarch64/multiarch/ifunc-impl-list.c (MAX_IFUNC):
	Increment to 4.
	(__libc_ifunc_impl_list): Add __memcpy_thunderx2.
	* sysdeps/aarch64/multiarch/memcpy.c (libc_ifunc): Add IS_THUNDERX2
	and IS_THUNDERX2PA checks.
	* sysdeps/aarch64/multiarch/memcpy_thunderx.S (USE_THUNDERX2):
	Use macro to set name appropriately.
	(memcpy): Use USE_THUNDERX2 macro to modify prefetches.
	* sysdeps/aarch64/multiarch/memcpy_thunderx2.S: New file.
	* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_THUNDERX2PA):
	New macro.
	(IS_THUNDERX2): New macro.
2018-02-22 08:38:47 -08:00