Commit Graph

362 Commits

Author SHA1 Message Date
Adhemerval Zanella
a53fbd8e6c PowerPC: Fix gprof entry point for LE
This patch fixes the ELFv2 gprof entry point since the ABI
does not define function descriptors.  It fixes BZ#17213.
2014-07-30 09:01:25 -03:00
Adhemerval Zanella
27b75f56c9 PowerPC: Cleanup powerpc memmove
Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be defined in memcopy.h, there
is no need for a specialized powerpc memmove implementation.  This patch
moves the define to powerpc memcopy and cleans up its definition in
powerpc code.
2014-07-08 09:16:15 -05:00
Adhemerval Zanella
e7f95bb5f0 PowerPC: Fix compiler warnings
This patch fixes some compiler warnings due to trailing data in #undef
directives and due to missing prototypes.
2014-07-08 09:16:12 -05:00
Adhemerval Zanella
87868c2418 PowerPC: Align power7 memcpy using VSX to quadword
This patch changes power7 memcpy to use VSX instructions only when
memory is aligned to quadword.  It is to avoid unaligned kernel traps
on non-cacheable memory (for instance, memory-mapped I/O).
2014-07-07 15:41:27 -05:00
Adhemerval Zanella
17762f6625 PowerPC: optimized memmove for POWER7/PPC64
This patch adds an optimized memmove for POWER7/powerpc64.
Basically the idea is to use the memcpy for POWER7 on non-overlapped
memory regions and an optimized backward memcpy for memory regions
that overlap (similar to the idea of string/memmove.c).

The backward memcpy algorithm used is similar to the one used for memcpy for
POWER7, with adjustments done for alignment.  The difference is memory
is always aligned to 16 bytes before using VSX/altivec instructions.
2014-07-07 15:41:21 -05:00
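The forward/backward selection described in this commit can be sketched in portable C. This is only an illustration of the idea borrowed from string/memmove.c, not the POWER7 assembly: the name sketch_memmove is hypothetical, and the byte loops stand in for the optimized memcpy and backward memcpy paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: choose the copy direction from the
   distance between the two pointers.  */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    /* The unsigned difference is >= n when dst is below src (it wraps
       around) or when dst is at or past the end of the source, so a
       forward copy cannot overwrite unread source bytes.  */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        for (size_t i = 0; i < n; i++)      /* stand-in for forward memcpy */
            d[i] = s[i];
    } else {
        for (size_t i = n; i > 0; i--)      /* stand-in for backward memcpy */
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

Copying six bytes of "abcdefgh" two positions to the right within the same buffer takes the backward path and leaves "ababcdef".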
Richard Henderson
05502548e9 Always provide HP_SMALL_TIMING_AVAIL 2014-07-03 08:38:36 -07:00
Richard Henderson
86e1a7ff92 Unify hp-timing implementations
Provide an hp-timing-common.h for ports to use.
2014-07-03 08:38:30 -07:00
Richard Henderson
428dd03f5a Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead
Without HP_TIMING_ACCUM, dl_hp_timing_overhead is write-only.
If we remove it, there's no point in HP_TIMING_DIFF_INIT either.
2014-07-03 08:38:25 -07:00
Richard Henderson
c39323e9d2 Removing HP_TIMING_ACCUM as unused 2014-07-03 08:38:21 -07:00
Richard Henderson
850e0e032b Removing HP_TIMING_ZERO as unused 2014-07-03 08:38:18 -07:00
Vidya Ranganathan
bc8ea38590 PowerPC: strcat optimization for PPC64/POWER7
This patch adds an ifunc power7 strcat symbol that uses the logic in
sysdeps/powerpc/strcat.c but calls the power7 strlen/strcpy symbols instead
of the default ones.
2014-07-02 14:04:21 -05:00
Siddhesh Poyarekar
4cf5b6d0d7 Fix Wundef warning for ELF_MACHINE_NO_RELA
This patch defines ELF_MACHINE_NO_RELA on all architectures.  Tested
only on x86_64 to verify that the sources before and after are
identical except for two instructions that pass the current line
number in dl-machine.h to assert_fail.
2014-06-26 22:30:40 +05:30
Vidya Ranganathan
e23d3d2690 PowerPC: Optimized strcmp for PPC64/POWER7
Optimization is achieved on 8 byte aligned strings with double word
comparison using cmpb instruction. On unaligned strings loop unrolling
is applied for Power7 gain.
2014-06-11 08:39:31 -05:00
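For readers unfamiliar with cmpb, the doubleword comparison can be emulated in C. The sketch below assumes the usual cmpb semantics (each result byte is 0xFF where the two source bytes are equal and 0x00 otherwise); the helper names cmpb_emul and dword_needs_slow_path are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Emulates cmpb: each byte of the result is 0xFF when the matching
   bytes of a and b are equal, and 0x00 otherwise.  */
uint64_t cmpb_emul(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint64_t mask = 0xffULL << (8 * i);
        if (((a ^ b) & mask) == 0)
            result |= mask;
    }
    return result;
}

/* Returns nonzero when the 8-byte fast path must stop: either a
   contains a NUL byte (end of string) or the doublewords differ.  */
int dword_needs_slow_path(uint64_t a, uint64_t b)
{
    return cmpb_emul(a, 0) != 0 || cmpb_emul(a, b) != ~0ULL;
}
```

An aligned strcmp loop would load one doubleword from each string, call a check like dword_needs_slow_path, and fall back to byte-wise comparison only for the final doubleword.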
Adhemerval Zanella
ed36bfa18f PowerPC: Fix optimized strncat strlen call
This patch fixes the optimized ppc64/power7 strncat strlen call for
static build without ifunc enabled.  The strlen symbol to call in such
situation is just strlen, instead of __GI_strlen (since the __GI_
alias is just created for shared objects).
2014-06-06 09:37:07 -05:00
Adhemerval Zanella
af121e371d PowerPC: Fix multiarch hypotf PPC64 path
This patch moves the hypotf multiarch implementation to correct path.
2014-05-19 18:06:40 -05:00
Vidya Ranganathan
f360f94a05 PowerPC: strncpy/stpncpy optimization for PPC64/POWER7
The optimization is achieved by the following techniques:
  > data alignment [gain from aligned memory access on read/write]
  > POWER7 gains performance with loop unrolling/unwinding
    [gain by reduction of branch penalty].
  > zero padding done by calling optimized memset
2014-05-06 09:54:25 -05:00
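The zero-padding step listed above can be sketched in C as a copy followed by a memset over the rest of the destination. The name sketch_strncpy is hypothetical, and the plain memcpy/memset calls stand in for the alignment-aware, unrolled POWER7 code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration of strncpy semantics: copy at most n
   bytes of src, then pad the remainder of dst with zeros.  */
char *sketch_strncpy(char *dst, const char *src, size_t n)
{
    size_t len = 0;

    /* Length of src, capped at n (what strnlen would return).  */
    while (len < n && src[len] != '\0')
        len++;

    memcpy(dst, src, len);
    /* Zero padding delegated to memset, as in the commit message.  */
    memset(dst + len, 0, n - len);
    return dst;
}
```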
Adhemerval Zanella
19c4bec0f4 PowerPC: ifunc improvement for internal calls
This patch changes the default symbol redirection for internal calls of
memcpy, memset, memchr, and strlen to the IFUNC resolved ones.  The
performance improvement is noticeable in algorithms that use these
symbols extensively, like the regex functions.
2014-05-05 13:30:16 -05:00
Adhemerval Zanella
de21c33c06 PowerPC: Fix --disable-multi-arch builds
This patch fixes some powerpc32 and powerpc64 builds with
--disable-multi-arch option along with different --with-cpu=powerN.
It cleans up the Implies directories by removing the multiarch
folder for non multiarch config and also fixing two assembly
implementations: powerpc64/power7/strncat.S that is calling the
wrong strlen; and power8/fpu/s_isnan.S that misses the hidden_def and
weak_alias directives.
2014-04-09 06:22:53 -05:00
Alan Modra
af6b17973c Correct prefetch hint in power7 memrchr.
Typo fix.

	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Correct stream hint.
2014-04-02 13:42:27 +10:30
Alan Modra
483818d768 Fix reference to toc symbol.
https://sourceware.org/ml/binutils/2014-03/msg00033.html removes the
"magic" treatment of symbols defined in a .toc section.

	* sysdeps/powerpc/powerpc64/start.S: Add @toc to toc symbol reference.
2014-04-02 13:40:21 +10:30
Alan Modra
c859b32e9d Fix s_copysign stack temp for PowerPC64 ELFv2
[BZ #16786]
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Don't trash stack.
2014-04-01 14:10:22 +10:30
Adhemerval Zanella
757d9dd5c3 PowerPC: Fix little endian encoding for mfvsrd
This patch fixes the MFVSRD_R3_V1 macro that encodes 'mfvsrd  r3,vs1'
(to support old binutils) for little endian.
2014-03-31 08:00:38 -05:00
Adhemerval Zanella
6f23d0939e PowerPC: optimized strpbrk for POWER7
This patch adds an optimized strpbrk for POWER7 by using a different
algorithm than the default implementation: it constructs a table based on
the 'accept' argument and uses this table to check for any occurrence in
the input string. The idea is similar to what x86_64 uses.
For PowerPC some tunings were added, such as loop unrolling and memory
clearing using VSX instructions.
2014-03-20 19:46:13 -05:00
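The table-based approach described in this commit can be sketched in plain C. The name sketch_strpbrk is hypothetical; the 256-entry table is cleared by the initializer here, where the commit uses VSX stores, and the scan loop is not unrolled.

```c
#include <stddef.h>

/* Hypothetical illustration: mark every byte of accept in a lookup
   table, then scan s with one table lookup per character.  */
char *sketch_strpbrk(const char *s, const char *accept)
{
    unsigned char table[256] = { 0 };
    const unsigned char *p;

    for (p = (const unsigned char *)accept; *p != '\0'; p++)
        table[*p] = 1;

    for (p = (const unsigned char *)s; *p != '\0'; p++)
        if (table[*p])
            return (char *)p;   /* first character found in accept */

    return NULL;                /* no character of accept occurs in s */
}
```

Building the table costs one pass over accept, after which each input character needs a single lookup instead of a scan of accept.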
Adhemerval Zanella
6eaf95cbfa PowerPC: optimized strcspn for PPC64/POWER7
This patch adds an optimized strcspn for POWER7 by using a different
algorithm than the default implementation: it constructs a table based on
the 'accept' argument and uses this table to check for any occurrence
in the input string. The idea is similar to what x86_64 uses.
For PowerPC some tunings were added, such as loop unrolling and aligning
the table's stack memory to 16 bytes (so the VSX clear can run without
alignment issues).
2014-03-20 11:24:52 -05:00
Adhemerval Zanella
c7de502503 PowerPC: remove wrong roundl implementation for PowerPC64
The roundl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_roundl.S)
returns wrong results for some inputs where first double is an exact
integer and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_roundl.c instead fixes the failing math.

This fixes BZ#16707.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
98fb27a373 PowerPC: remove wrong nearbyintl implementation for PPC64
The nearbyintl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_nearbyintl.S)
returns wrong results for some inputs where first double is an exact
integer and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c instead fixes the failing
math.

Fixes BZ#16706.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
374f7f6121 PowerPC: remove wrong ceill implementation for PowerPC64
The ceill assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_ceill.S)
returns wrong results for some inputs where first double is an exact
integer and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_ceill.c instead fixes the failing math.

Fixes BZ#16701.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
27c7220a48 PowerPC: Fix strspn for static build
This patch makes the strspn ifunc selector build for static builds.
2014-03-12 06:54:44 -05:00
Adhemerval Zanella
4facea4730 PowerPC: Fix bzero definition for static libc for PPC64
This patch fixes an issue for powerpc64[le] static build where __bzero
is defined in multiple places (memset-ppc64.o and bzero.o). It is now
defined only in bzero.o, and memset-ppc64.o only defines __bzero_ppc for
both the dynamic and static library.

Fixes BZ#16683.
2014-03-11 09:31:59 -05:00
Vidya Ranganathan
e65caf1f1d PowerPC: strspn optimization for PPC64/POWER7
The optimization is achieved by the following techniques:
  > hashing of needle.
  > hashing avoids scanning of duplicate entries in needle across the string.
  > initializing the hash table with Vector instructions (VSX) by quadword access.
  > unrolling when scanning for character in string across hash table.
2014-03-11 08:54:33 -05:00
Adhemerval Zanella
ba9cc0714e PowerPC: strncat optimization for PPC64
The optimization is achieved by the following techniques:
1. Doubleword aligned memory access and compares using
   cmpb instruction.
2. Loop unrolling for byte load/store.
3. CPU pre-fetch to avoid cache miss.
2014-03-10 07:25:09 -05:00
Rajalakshmi Srinivasaraghavan
c7debbdfac PowerPC: strrchr optimization for POWER7/PPC64
This patch optimizes strrchr() for ppc64. It uses aligned memory
access along with cmpb instruction and CPU prefetch to avoid
cache misses for speed improvement.
2014-03-03 08:06:41 -06:00
Adhemerval Zanella
fe13a20c37 PowerPC: llround/llroundf POWER8 optimization
This patch adds an optimized llround/llroundf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
1ad8950a3e PowerPC: llrint/llrintf POWER8 optimization
This patch adds an optimized llrint/llrintf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
cac626d60a PowerPC: Optimized finite/finitef for POWER8
This patch adds an optimized finite/finitef implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
4393fc119c PowerPC: Optimized isinf/isinff for POWER8
This patch adds an optimized isinf/isinff implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
487972aea5 PowerPC: Optimized isnan/isnanf for POWER8
This patch adds an optimized isnan/isnanf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:32 -06:00
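The POWER8 classification routines above all move the floating-point bit pattern into a general-purpose register and test it with integer operations. A portable C sketch of that idea for isnan follows, assuming IEEE 754 binary64 doubles; the name sketch_isnan is hypothetical, and memcpy stands in for the register move that mfvsrd performs in the assembly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: classify a double by inspecting its
   integer bit pattern.  Assumes IEEE 754 binary64 doubles.  */
int sketch_isnan(double x)
{
    uint64_t bits;

    /* Reinterpret the FP value as an integer; the assembly does this
       with a direct register move instead of going through memory.  */
    memcpy(&bits, &x, sizeof bits);

    /* Drop the sign bit.  A NaN has all exponent bits set and a
       nonzero mantissa, so it compares above the infinity pattern.  */
    return (bits & 0x7fffffffffffffffULL) > 0x7ff0000000000000ULL;
}
```

Under the same assumptions, isinf and finite can be written as comparisons against the same infinity pattern (equal to it, and below it, respectively).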
Ondřej Bílka
a1ffb40e32 Use glibc_likely instead of __builtin_expect. 2014-02-10 15:07:12 +01:00
Adhemerval Zanella
38f3458175 PowerPC: remove wrong truncl implementation for PowerPC64
The truncl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_truncl.S)
returns wrong results for some inputs where first double is an exact integer
and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_truncl.c instead fixes tgammal
issues regarding the wrong result sign.
2014-01-08 08:14:48 -06:00
Adhemerval Zanella
d7ad2d9bad PowerPC: Fix compiler warnings
This patch fixes some compile warnings related to extra tokens at
end of #undef directive from multilib patchset.
2014-01-03 13:29:10 -06:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Andreas Schwab
83f5c32d21 Fix uses of CALL_MCOUNT in ppc64 assembler sources 2013-12-19 17:06:48 +01:00
Adhemerval Zanella
42fcb46ce6 PowerPC: multiarch hypot/hypotf for PowerPC64 2013-12-13 15:38:01 -05:00
Adhemerval Zanella
83efded424 PowerPC: multiarch modf/modff for PowerPC64 2013-12-13 15:37:23 -05:00
Adhemerval Zanella
43e246d2a6 PowerPC: multiarch logb/logbl/logbf for PowerPC64 2013-12-13 15:36:33 -05:00
Adhemerval Zanella
8fdad12379 PowerPC: multiarch isinf/isinff for PowerPC64 2013-12-13 15:35:44 -05:00
Adhemerval Zanella
1481d7066c PowerPC: multiarch finite/finitef for PowerPC64 2013-12-13 15:34:52 -05:00
Adhemerval Zanella
5ccd5fc893 PowerPC: multiarch llrint/lrint for PowerPC64 2013-12-13 15:33:54 -05:00
Adhemerval Zanella
2568f3fa69 PowerPC: multiarch copysign/copysignf for PowerPC64 2013-12-13 15:32:58 -05:00
Adhemerval Zanella
1cb341fd78 PowerPC: multiarch trunc/truncf for PowerPC64 2013-12-13 15:30:57 -05:00