glibc/sysdeps
Anton Youdkevitch 94e358f6d4 aarch64: thunderx2 memcpy implementation cleanup and streamlining
Here is the updated patch for improving the long unaligned
code path (the one using the "ext" instruction).

1. The always-taken conditional branch at the beginning is
removed.

2. Epilogue code is placed after the end of the loop to
reduce the number of branches.

3. The redundant "mov" instructions inside the loop are
gone because the order of the registers in the "ext"
instructions inside the loop was changed; the prologue has
an additional "ext" instruction.

4. The count update in the prologue was hoisted out, as
it is the same update for each prologue.

5. Invariant code of the loop epilogue was hoisted out.

6. As the current size of the ext chunk is exactly 16
instructions, a "nop" was added at the beginning
of the code sequence so that the loop entry for all the
chunks is aligned.
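
For readers unfamiliar with the "ext" code path, the idea can be sketched in portable C. This is an illustrative model only, not the code in memcpy_thunderx2.S: the names vec16, ext16 and copy_shifted are hypothetical, and the sketch assumes the source is read in aligned 16-byte chunks and that 0 <= shift <= 15. ext16 models "ext vd.16b, vn.16b, vm.16b, #imm", which takes 16 consecutive bytes starting at byte imm of the 32-byte concatenation of vn (low) and vm (high).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 128-bit vector register. */
typedef struct { uint8_t b[16]; } vec16;

/* Model of "ext vd.16b, vn.16b, vm.16b, #imm":
   result byte i is byte (imm + i) of the concatenation lo:hi,
   where lo supplies bytes 0..15 and hi supplies bytes 16..31. */
static vec16 ext16(vec16 lo, vec16 hi, unsigned imm)
{
    vec16 r;
    for (unsigned i = 0; i < 16; i++) {
        unsigned j = imm + i;
        r.b[i] = (j < 16) ? lo.b[j] : hi.b[j - 16];
    }
    return r;
}

/* Copy chunks * 16 bytes starting at src_aligned + shift into dst,
   loading from the source only in whole 16-byte chunks.
   Reads chunks + 1 chunks from src_aligned. */
static void copy_shifted(uint8_t *dst, const uint8_t *src_aligned,
                         unsigned shift, size_t chunks)
{
    vec16 prev, next, out;

    memcpy(&prev, src_aligned, 16);            /* prologue load */
    for (size_t i = 0; i < chunks; i++) {
        memcpy(&next, src_aligned + 16 * (i + 1), 16);
        out = ext16(prev, next, shift);        /* merge two chunks */
        memcpy(dst + 16 * i, &out, 16);
        prev = next;                           /* register-to-register move */
    }
}
```

In this naive form, the "prev = next" assignment plays the role of a "mov" on every iteration. An unrolled loop can avoid such moves by alternating which register holds the older chunk, which appears to be the kind of reordering item 3 describes.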

	* sysdeps/aarch64/multiarch/memcpy_thunderx2.S: Cleanup branching
	and remove redundant code.
2019-04-05 13:59:54 -07:00
aarch64 aarch64: thunderx2 memcpy implementation cleanup and streamlining 2019-04-05 13:59:54 -07:00
alpha alpha: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S 2019-04-01 16:00:37 +07:00
arm Break further lines before not after operators. 2019-02-26 15:01:50 +00:00
csky C-SKY: mark lr as undefined to stop unwinding 2019-03-11 09:51:14 +08:00
generic Add generic hp-timing support 2019-03-22 17:30:44 -03:00
gnu Add UDP_GRO from Linux 5.0 to netinet/udp.h. 2019-03-25 13:16:46 +00:00
hppa Add some spaces before '('. 2019-02-27 13:55:45 +00:00
htl hurd: advertise *_setpshared as not supported 2019-01-02 22:21:34 +01:00
hurd Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
i386 Add and move fall-through comments in system-specific code. 2019-02-26 02:09:18 +00:00
ia64 Refactor hp-timing rtld usage 2019-03-22 17:30:44 -03:00
ieee754 ldbl-opt: Reuse test cases from misc/ that check long double 2019-03-01 15:32:49 -03:00
init_array Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
m68k wcsmbs: optimize wcpcpy 2019-02-27 10:00:34 -03:00
mach nptl: Remove pthread_clock_gettime pthread_clock_settime 2019-03-22 15:37:43 -03:00
microblaze Break more lines before not after operators. 2019-02-25 13:19:19 +00:00
mips Add and move fall-through comments in system-specific code. 2019-02-26 02:09:18 +00:00
nios2 Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
nptl nptl: Remove pthread_clock_gettime pthread_clock_settime 2019-03-22 15:37:43 -03:00
posix Do not use HP_TIMING_NOW for random bits 2019-03-22 17:30:39 -03:00
powerpc powerpc: Use generic wcsrchr optimization 2019-04-04 16:01:14 +07:00
pthread Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
riscv RISC-V: Update nofpu ULPs 2019-01-24 13:55:05 -05:00
s390 S390: Add arch13 memmem ifunc variant. 2019-03-22 11:14:09 +01:00
sh Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
sparc Refactor hp-timing rtld usage 2019-03-22 17:30:44 -03:00
unix alpha: Do not redefine __NR_shmat or __NR_osf_shmat 2019-04-01 15:54:00 +07:00
wordsize-32 Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
wordsize-64 Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
x86 Refactor hp-timing rtld usage 2019-03-22 17:30:44 -03:00
x86_64 wcsmbs: optimize wcscat 2019-02-27 10:00:37 -03:00