Commit Graph

398 Commits

Author SHA1 Message Date
Adhemerval Zanella
af121e371d PowerPC: Fix multiarch hypotf PPC64 path
This patch moves the hypotf multiarch implementation to the correct path.
2014-05-19 18:06:40 -05:00
Vidya Ranganathan
f360f94a05 PowerPC: strncpy/stpncpy optimization for PPC64/POWER7
The optimization is achieved by the following techniques:
  > data alignment [gain from aligned memory access on read/write]
  > POWER7 gains performance with loop unrolling/unwinding
    [gain by reduction of branch penalty].
  > zero padding done by calling the optimized memset
2014-05-06 09:54:25 -05:00
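To make the padding step concrete, here is a minimal C sketch of the generic technique the entry above describes (plain C only; the actual optimization is POWER7 assembly, and the helper name is made up for illustration):

#include <string.h>

/* Illustrative sketch only -- the real optimization is POWER7 assembly.
   stpncpy copies at most n bytes and zero-pads the rest; delegating the
   padding to an optimized memset is the technique the commit mentions.  */
static char *
my_stpncpy (char *dst, const char *src, size_t n)
{
  size_t len = strnlen (src, n);      /* bytes actually copied */
  memcpy (dst, src, len);             /* aligned copy in the real code */
  memset (dst + len, '\0', n - len);  /* zero padding via memset */
  return dst + len;                   /* stpncpy returns past the last copied byte */
}
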
Adhemerval Zanella
19c4bec0f4 PowerPC: ifunc improvement for internal calls
This patch changes the default symbol redirection for internal calls of
memcpy, memset, memchr, and strlen to the IFUNC-resolved ones.  The
performance improvement is noticeable in algorithms that use these
symbols extensively, like the regex functions.
2014-05-05 13:30:16 -05:00
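For readers unfamiliar with IFUNCs, the sketch below shows the general mechanism in plain C using GCC's ifunc attribute; the function names and the hwcap check are hypothetical stand-ins, not glibc's actual resolver code:

#include <stddef.h>
#include <string.h>

/* Hypothetical optimized and generic variants (stand-ins for the real
   POWER7/default implementations).  */
static void *memcpy_power7 (void *d, const void *s, size_t n)
{ return memcpy (d, s, n); }
static void *memcpy_generic (void *d, const void *s, size_t n)
{ return memcpy (d, s, n); }

static int cpu_is_power7 (void) { return 0; }   /* hypothetical hwcap check */

/* The resolver runs once, at relocation time, and picks the variant.  */
static void *(*resolve_memcpy (void)) (void *, const void *, size_t)
{
  return cpu_is_power7 () ? memcpy_power7 : memcpy_generic;
}

/* Every call to my_memcpy -- including internal ones, once they are
   redirected to the IFUNC symbol as this commit does for glibc -- then
   goes straight to the selected implementation.  */
void *my_memcpy (void *, const void *, size_t)
  __attribute__ ((ifunc ("resolve_memcpy")));
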
Adhemerval Zanella
de21c33c06 PowerPC: Fix --disable-multi-arch builds
This patch fixes some powerpc32 and powerpc64 builds with
--disable-multi-arch option along with different --with-cpu=powerN.
It cleans up the Implies directories by removing the multiarch
folder for non-multiarch configurations and also fixes two assembly
implementations: powerpc64/power7/strncat.S, which calls the
wrong strlen; and power8/fpu/s_isnan.S, which is missing the hidden_def
and weak_alias directives.
2014-04-09 06:22:53 -05:00
Alan Modra
af6b17973c Correct prefetch hint in power7 memrchr.
Typo fix.

	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Correct stream hint.
2014-04-02 13:42:27 +10:30
Alan Modra
483818d768 Fix reference to toc symbol.
https://sourceware.org/ml/binutils/2014-03/msg00033.html removes the
"magic" treatment of symbols defined in a .toc section.

	* sysdeps/powerpc/powerpc64/start.S: Add @toc to toc symbol reference.
2014-04-02 13:40:21 +10:30
Alan Modra
c859b32e9d Fix s_copysign stack temp for PowerPC64 ELFv2
[BZ #16786]
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Don't trash stack.
2014-04-01 14:10:22 +10:30
Adhemerval Zanella
757d9dd5c3 PowerPC: Fix little endian encoding for mfvsrd
This patch fixes the MFVSRD_R3_V1 macro that encodes 'mfvsrd  r3,vs1'
(to support old binutils) for little endian.
2014-03-31 08:00:38 -05:00
Adhemerval Zanella
6f23d0939e PowerPC: optimized strpbrk for POWER7
This patch adds an optimized strpbrk for POWER7 using a different
algorithm than the default implementation: it constructs a table based on
the 'accept' argument and uses this table to check for any occurrence in
the input string.  The idea is similar to the one x86_64 uses.
For PowerPC some tunings were added, such as loop unrolling and memory
clearing using VSX instructions.
2014-03-20 19:46:13 -05:00
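A minimal C sketch of the table-based approach described above (illustration only; the real POWER7 code unrolls the loops and clears the table with VSX stores):

#include <string.h>

/* Build a 256-entry membership table from 'accept', then scan the input
   for the first byte that is in the table.  */
static char *
my_strpbrk (const char *s, const char *accept)
{
  unsigned char table[256] = { 0 };        /* cleared with VSX stores in asm */
  for (; *accept != '\0'; ++accept)
    table[(unsigned char) *accept] = 1;    /* mark every byte in 'accept' */
  for (; *s != '\0'; ++s)
    if (table[(unsigned char) *s])         /* first occurrence of any marked byte */
      return (char *) s;
  return NULL;
}
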
Adhemerval Zanella
6eaf95cbfa PowerPC: optimized strcspn for PPC64/POWER7
This patch adds an optimized strcspn for POWER7 using a different
algorithm than the default implementation: it constructs a table based on
the 'accept' argument and uses this table to check for any occurrence
in the input string.  The idea is similar to the one x86_64 uses.
For PowerPC some tunings were added, such as loop unrolling and aligning
the table's stack memory to 16 bytes (so the VSX clear can run without
alignment issues).
2014-03-20 11:24:52 -05:00
Adhemerval Zanella
c7de502503 PowerPC: remove wrong roundl implementation for PowerPC64
The roundl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_roundl.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second double.

Checking the implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_roundl.c instead fixes the failing math.

This fixes BZ#16707.
2014-03-14 12:54:47 -05:00
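An illustrative input of the failing class, assuming an ldbl-128ibm long double (the specific value is a made-up example, not taken from the bug report): the number is stored as a high double that is an exact integer plus a low double carrying the fraction, so the rounding decision depends entirely on the low part.

#include <math.h>
#include <stdio.h>

int
main (void)
{
  /* Hypothetical input: on ldbl-128ibm this is stored as hi = 0x1p+60
     (an exact integer) and lo = 0.75, so roundl must round based on the
     low double alone.  */
  long double x = 0x1p60L + 0.75L;
  printf ("%.1Lf\n", roundl (x));   /* expect 0x1p60 + 1 */
  return 0;
}
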
Adhemerval Zanella
98fb27a373 PowerPC: remove wrong nearbyintl implementation for PPC64
The nearbyintl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_nearbyintl.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second double.

Checking the implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c instead fixes the failing
math.

Fixes BZ#16706.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
374f7f6121 PowerPC: remove wrong ceill implementation for PowerPC64
The ceill assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_ceill.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second double.

Checking the implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_ceill.c instead fixes the failing math.

Fixes BZ#16701.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
27c7220a48 PowerPC: Fix strspn for static build
This patch makes the strspn ifunc selector build for static builds.
2014-03-12 06:54:44 -05:00
Adhemerval Zanella
4facea4730 PowerPC: Fix bzero definition for static libc for PPC64
This patch fixes an issue for the powerpc64[le] static build where __bzero
is defined in multiple places (memset-ppc64.o and bzero.o).  It is now
defined only in bzero.o, and memset-ppc64.o defines only __bzero_ppc, for
both the dynamic and the static library.

Fixes BZ#16683.
2014-03-11 09:31:59 -05:00
Vidya Ranganathan
e65caf1f1d PowerPC: strspn optimization for PPC64/POWER7
The optimization is achieved by the following techniques:
  > hashing of needle.
  > hashing avoids scanning of duplicate entries in needle across the string.
  > initializing the hash table with Vector instructions (VSX) by quadword access.
  > unrolling when scanning for character in string across hash table.
2014-03-11 08:54:33 -05:00
Adhemerval Zanella
ba9cc0714e PowerPC: strncat optimization for PPC64
The optimization is achieved by the following techniques:
1. Doubleword aligned memory access and compares using
   cmpb instruction.
2. Loop unrolling for byte load/store.
3. CPU pre-fetch to avoid cache miss.
2014-03-10 07:25:09 -05:00
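The cmpb-based compares boil down to examining a whole doubleword at a time instead of byte by byte; the portable C bit trick below performs the equivalent "does this doubleword contain a zero byte?" test (an analogy for illustration, not the assembly itself):

#include <stdint.h>

/* Returns nonzero if any of the eight bytes in 'dword' is zero.  cmpb
   produces a per-byte 0x00/0xff mask directly; this is the portable
   equivalent of the zero-byte test.  */
static inline int
has_zero_byte (uint64_t dword)
{
  return ((dword - 0x0101010101010101ULL)
          & ~dword
          & 0x8080808080808080ULL) != 0;
}
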
Rajalakshmi Srinivasaraghavan
c7debbdfac PowerPC: strrchr optimization for POWER7/PPC64
This patch optimizes strrchr() for ppc64. It uses aligned memory
access along with the cmpb instruction and CPU prefetch to avoid
cache misses for a speed improvement.
2014-03-03 08:06:41 -06:00
Adhemerval Zanella
fe13a20c37 PowerPC: llround/llroundf POWER8 optimization
This patch adds an optimized llround/llroundf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
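A hedged sketch of the register-move idea behind these POWER8 patches: mfvsrd lets the 64-bit contents of a VSX/floating-point register be moved straight into a GPR, where older code had to store to the stack and reload. The inline-asm constraints below are the usual GCC VSX idioms and are shown as an assumption about how such a move could be written, not as glibc's code:

#include <stdint.h>

/* Move the raw 64-bit contents of a floating-point/VSX register into a
   GPR without going through memory.  Requires a VSX-enabled compile
   (e.g. -mcpu=power8).  In llround the register would already hold the
   result of the FP-to-integer conversion.  */
static inline int64_t
fp_to_gpr (double d)
{
  int64_t r;
  __asm__ ("mfvsrd %0,%x1" : "=r" (r) : "wa" (d));
  return r;
}
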
Adhemerval Zanella
1ad8950a3e PowerPC: llrint/llrintf POWER8 optimization
This patch adds an optimized llrint/llrintf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
cac626d60a PowerPC: Optimized finite/finitef for POWER8
This patch adds an optimized finite/finitef implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
4393fc119c PowerPC: Optimized isinf/isinff for POWER8
This patch adds an optimized isinf/isinff implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
487972aea5 PowerPC: Optimized isnan/isnanf for POWER8
This patch adds an optimized isnan/isnanf implementation for POWER8
using the new Move From VSR Doubleword instruction to gain some
cycles on the FP to GPR register move.
2014-02-27 12:58:32 -06:00
Ondřej Bílka
a1ffb40e32 Use glibc_likely instead of __builtin_expect. 2014-02-10 15:07:12 +01:00
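For reference, the glibc macros in question (from <sys/cdefs.h>) are thin wrappers around __builtin_expect, roughly:

#define __glibc_unlikely(cond) __builtin_expect ((cond), 0)
#define __glibc_likely(cond)   __builtin_expect ((cond), 1)

/* So a check that was written as
     if (__builtin_expect (len == 0, 0)) return;
   becomes the more readable
     if (__glibc_unlikely (len == 0)) return;  */
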
Adhemerval Zanella
38f3458175 PowerPC: remove wrong truncl implementation for PowerPC64
The truncl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_truncl.S)
returns wrong results for some inputs where the first double is an exact
integer and the precision is determined by the second double.

Checking the implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

Just removing the implementation and making the build select
sysdeps/ieee754/ldbl-128ibm/s_truncl.c instead fixes tgammal issues
regarding a wrong result sign.
2014-01-08 08:14:48 -06:00
Adhemerval Zanella
d7ad2d9bad PowerPC: Fix compiler warnings
This patch fixes some compile warnings related to extra tokens at
the end of #undef directives from the multilib patchset.
2014-01-03 13:29:10 -06:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Andreas Schwab
83f5c32d21 Fix uses of CALL_MCOUNT in ppc64 assembler sources 2013-12-19 17:06:48 +01:00
Adhemerval Zanella
42fcb46ce6 PowerPC: multiarch hypot/hypotf for PowerPC64 2013-12-13 15:38:01 -05:00
Adhemerval Zanella
83efded424 PowerPC: multiarch modf/modff for PowerPC64 2013-12-13 15:37:23 -05:00
Adhemerval Zanella
43e246d2a6 PowerPC: multiarch logb/logbl/logbf for PowerPC64 2013-12-13 15:36:33 -05:00
Adhemerval Zanella
8fdad12379 PowerPC: multiarch isinf/isinff for PowerPC64 2013-12-13 15:35:44 -05:00
Adhemerval Zanella
1481d7066c PowerPC: multiarch finite/finitef for PowerPC64 2013-12-13 15:34:52 -05:00
Adhemerval Zanella
5ccd5fc893 PowerPC: multiarch llrint/lrint for PowerPC64 2013-12-13 15:33:54 -05:00
Adhemerval Zanella
2568f3fa69 PowerPC: multiarch copysign/copysignf for PowerPC64 2013-12-13 15:32:58 -05:00
Adhemerval Zanella
1cb341fd78 PowerPC: multiarch trunc/truncf for PowerPC64 2013-12-13 15:30:57 -05:00
Adhemerval Zanella
59a3e194f7 PowerPC: multiarch round/roundf for PowerPC64 2013-12-13 15:06:01 -05:00
Adhemerval Zanella
357fd3b40a PowerPC: multiarch floor/floorf for PowerPC64 2013-12-13 15:04:04 -05:00
Adhemerval Zanella
96770f12b0 PowerPC: multiarch ceil/ceilf for PowerPC64 2013-12-13 15:02:32 -05:00
Adhemerval Zanella
c3627f6e96 PowerPC: multiarch llround/lround for PowerPC64 2013-12-13 15:01:54 -05:00
Adhemerval Zanella
b2284ad7cf PowerPC: multiarch isnan/isnanf for PowerPC64 2013-12-13 15:01:10 -05:00
Adhemerval Zanella
69bbc63d88 PowerPC: Adjust multiarch Implies for PowerPC64
This patch adds Implies files in the multiarch folders for POWER chips so
multiarch is enabled when building with the --with-cpu=powerN
option.
2013-12-13 14:58:02 -05:00
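An Implies file simply lists, one per line, other sysdeps directories that the containing directory should inherit from. The path and contents below are an illustrative assumption, not copied from the commit:

sysdeps/powerpc/powerpc64/power7/multiarch/Implies:
    powerpc/powerpc64/power6/multiarch

so a --with-cpu=power7 build also picks up the power6 multiarch implementations.
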
Adhemerval Zanella
c24517c9dd PowerPC: Cleaning up unneeded sqrt routines
For PPC64, all the sqrt wrappers in sysdeps are superfluous: they are
basically the same implementation as math/w_sqrt.c with the
'#ifdef _IEEE_LIBM'. And the power4 version just forces use of the 'fsqrt'
instruction with inline assembly, which is already
handled by the math_private.h __ieee754_sqrt implementation.
2013-12-13 14:56:09 -05:00
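Roughly what the removed power4 wrapper did, as a hedged C sketch (an assumption, not the deleted source): force the hardware fsqrt instruction through inline assembly, something __ieee754_sqrt from math_private.h already arranges on CPUs that have the instruction.

/* Force the fsqrt instruction; redundant when the compiler/math_private.h
   already emits it for the target CPU.  */
static inline double
hw_sqrt (double x)
{
  double r;
  __asm__ ("fsqrt %0,%1" : "=f" (r) : "f" (x));
  return r;
}
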
Adhemerval Zanella
a52374e82b PowerPC: multiarch stpcpy for PowerPC64 2013-12-13 14:55:22 -05:00
Adhemerval Zanella
7f5ec11336 PowerPC: multiarch strcpy for PowerPC64 2013-12-13 14:54:41 -05:00
Adhemerval Zanella
e28bcd427b PowerPC: multiarch wordcopy for PowerPC64 2013-12-13 14:54:08 -05:00
Adhemerval Zanella
92cacfce7d PowerPC: multiarch wcscpy for PowerPC64. 2013-12-13 14:53:25 -05:00
Adhemerval Zanella
7b714620a7 PowerPC: multiarch wcsrchr for PowerPC64 2013-12-13 14:52:48 -05:00
Adhemerval Zanella
16fd2ae37c PowerPC: multiarch wcschr for PowerPC64 2013-12-13 14:51:36 -05:00
Adhemerval Zanella
9ee2969b05 PowerPC: multiarch strchrnul for PowerPC64 2013-12-13 14:50:26 -05:00
Adhemerval Zanella
372dc060e0 PowerPC: multiarch strchr for PowerPC64 2013-12-13 14:49:54 -05:00
Adhemerval Zanella
24c2c3b996 PowerPC: multiarch strncmp for PowerPC64 2013-12-13 14:48:48 -05:00
Adhemerval Zanella
1c92d9a0e0 PowerPC: multiarch strncasecmp for PowerPC64 2013-12-13 14:40:28 -05:00
Adhemerval Zanella
17de3ee3c1 PowerPC: multiarch strcasecmp for PowerPC64 2013-12-13 14:39:51 -05:00
Adhemerval Zanella
62982bf978 PowerPC: multiarch strnlen for PowerPC64 2013-12-13 14:38:50 -05:00
Adhemerval Zanella
a65f4904ab PowerPC: multiarch strlen for PowerPC64 2013-12-13 14:38:17 -05:00
Adhemerval Zanella
1fd005ad2f PowerPC: multiarch rawmemchr for PowerPC64 2013-12-13 14:37:26 -05:00
Adhemerval Zanella
cd05ba9135 PowerPC: multiarch memrchr for PowerPC64 2013-12-13 14:36:50 -05:00
Adhemerval Zanella
870f867648 PowerPC: multiarch memchr for PowerPC64 2013-12-13 14:35:28 -05:00
Adhemerval Zanella
f00be62b08 PowerPC: multiarch mempcpy for PowerPC64 2013-12-13 14:34:06 -05:00
Adhemerval Zanella
8a29a3d00b PowerPC: multiarch memset/bzero for PowerPC64 2013-12-13 14:33:16 -05:00
Adhemerval Zanella
07253fcf7b PowerPC: multiarch memcmp for PowerPC64 2013-12-13 14:32:31 -05:00
Adhemerval Zanella
b5beafbcee PowerPC: multiarch memcpy for PowerPC64 2013-12-13 14:31:41 -05:00
Adhemerval Zanella
5e6a4d4b9e PowerPC: Adjust multiarch Implies for PowerPC64
This patch adds Implies files in the multiarch folders for POWER chips so
multiarch is enabled when building with the --with-cpu=powerN
option.
2013-12-13 14:29:27 -05:00
Adhemerval Zanella
24eeafdb44 PowerPC: Optimized mpn functions for PowerPC64/POWER7
This patch adds optimized __mpn_add_n/__mpn_sub_n for PowerPC64/POWER7.
They were originally taken from GMP, with adjustments for GLIBC.
2013-12-06 11:52:31 -06:00
Adhemerval Zanella
4a2c0fd44d PowerPC: Optimized mpn functions for PowerPC64
This patch adds optimized __mpn_addmul, __mpn_addsub, __mpn_lshift, and
__mpn_mul_1 implementations for PowerPC64. They were originally taken from
GMP, with adjustments for GLIBC.
2013-12-06 11:52:22 -06:00
Adhemerval Zanella
2d9470b2ae PowerPC: multiarch logb/logbf/logbl for PowerPC32 2013-12-06 05:47:05 -06:00
Adhemerval Zanella
ea5a72f882 PowerPC: multiarch wordcopy routines for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
93be09e725 PowerPC: multiarch wcscpy for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
bb04e529f6 PowerPC: multiarch wcsrchr for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
05b5cd1ce5 PowerPC: multiarch wcschr for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
eb5ad6b9bc PowerPC: Add systemtap static probe points in setjmp/longjmp
This patch adds static probes for setjmp/longjmp in the way GDB expects,
fixing the gdb.base/longjmp.exp GDB test cases.

It changes the symbol_name and uses macros so as to avoid changing the
probe names and ending up adding more logic in GDB (since with the
expected names GDB works seamlessly).
2013-12-05 07:44:07 -06:00
Ulrich Weigand
61cd8fe401 PowerPC64 ELFv2 ABI 5/6: LD_AUDIT interface changes
The ELFv2 ABI changes the calling convention by passing and returning
structures in registers in more cases than the old ABI:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01145.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01147.html

For the most part, this does not affect glibc, since glibc assembler
files do not use structure parameters / return values.  However, one
place is affected: the LD_AUDIT interface provides a structure to
the audit routine that contains all registers holding function
argument and return values for the intercepted PLT call.

Since the new ABI now sometimes uses registers to return values
that were never used for this purpose in the old ABI, this structure
has to be extended.  To force audit routines to be modified for the
new ABI if necessary, the patch defines v2 variants of the la_ppc64
types and routines.

In addition, the patch contains two unrelated changes to the
PLT trampoline routines: it fixes a bug where FPR return values
were stored in the wrong place, and it removes the unnecessary
save/restore of CR.
2013-12-04 07:41:39 -06:00
Ulrich Weigand
8b8a692cfd PowerPC64 ELFv2 ABI 4/6: Stack frame layout changes
This updates glibc for the changes in the ELFv2 relating to the
stack frame layout.  These are described in more detail here:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01149.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01146.html

Specifically, the "compiler and linker doublewords" were removed,
which has the effect that the save slot for the TOC register is
now at offset 24 rather than 40 to the stack pointer.

In addition, a function may now no longer necessarily assume that
its caller has set up a 64-byte register save area its use.

To address the first change, the patch goes through all assembler
files and replaces immediate offsets in instructions accessing the
ABI-defined stack slots by symbolic offsets.  Those already were
defined in ucontext_i.sym and used in some of the context routines,
but that doesn't really seem like the right place for those defines.

The patch instead defines those symbolic offsets in sysdeps.h,
in two variants for the old and new ABI, and uses them systematically
in all assembler files, not just the context routines.

The second change only affected a few assembler files that used
the save area to temporarily store some registers.  In those
cases where this happens within a leaf function, this patch
changes the code to store those registers to the "red zone"
below the stack pointer.  Otherwise, the functions already allocate
a stack frame, and the patch changes them to add extra space in
these frames as temporary space for the ELFv2 ABI.
2013-12-04 07:41:39 -06:00
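A hedged sketch of the symbolic stack-frame offsets the patch moves into sysdep.h (the 24-vs-40 TOC slot and the 64-byte save area are from the text above; the exact macro names and the ELFv1 frame size are an assumption):

/* _CALL_ELF == 2 identifies an ELFv2 compile.  */
#if defined __powerpc64__ && _CALL_ELF == 2
# define FRAME_MIN_SIZE  32   /* backchain, CR, LR, TOC save doublewords */
# define FRAME_LR_SAVE   16
# define FRAME_TOC_SAVE  24   /* TOC save slot moved up by 16 bytes */
#else
# define FRAME_MIN_SIZE  112  /* includes the 64-byte register save area */
# define FRAME_LR_SAVE   16
# define FRAME_TOC_SAVE  40
#endif
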
Ulrich Weigand
122b66defd PowerPC64 ELFv2 ABI 3/6: PLT local entry point optimization
This is a follow-on to the previous patch to support the ELFv2 ABI in the
dynamic loader, split off into its own patch since it is just an optional
optimization.

In the ELFv2 ABI, most functions define both a global and a local entry
point; the local entry requires r2 to be already set up by the caller
to point to the callee's TOC; while the global entry does not require
the caller to know about the callee's TOC, but it needs to set up r12
to the callee's entry point address.

Now, when setting up a PLT slot, the dynamic linker will usually need
to enter the target function's global entry point.  However, if the
linker can prove that the target function is in the same DSO as the
PLT slot itself, and the whole DSO only uses a single TOC (which the
linker will let ld.so know via a DT_PPC64_OPT entry), then it is
possible to actually enter the local entry point address into the
PLT slot, for a slight improvement in performance.

Note that this uncovered a problem on the first call via _dl_runtime_resolve,
because that routine neglected to restore the caller's TOC before calling
the target function for the first time, since it assumed that function
would always reload its own TOC anyway ...
2013-12-04 07:41:38 -06:00
Ulrich Weigand
696caf1d00 PowerPC64 ELFv2 ABI 2/6: Remove function descriptors
This patch adds support for the ELFv2 ABI feature to remove function
descriptors.  See this GCC patch for in-depth discussion:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01141.html

This mostly involves two types of changes: updating assembler source
files to the new logic, and updating the dynamic loader.

After the refactoring in the previous patch, most of the assembler source
changes can be handled simply by providing ELFv2 versions of the
macros in sysdep.h.   One somewhat non-obvious change is in __GI__setjmp:
this used to "fall through" to the immediately following __setjmp ENTRY
point.  This is no longer safe in the ELFv2 since ENTRY defines both
a global and a local entry point, and you cannot simply fall through
to a global entry point as it requires r12 to be set up.

Also, makecontext needs to be updated to set up registers according to
the new ABI for calling into the context's start routine.

The dynamic linker changes mostly consist of removing special code
to handle function descriptors.  We also need to support the new PLT
and glink format used by the ELFv2 linker, see:
https://sourceware.org/ml/binutils/2013-10/msg00376.html

In addition, the dynamic linker now verifies that the dynamic libraries
it loads match its own ABI.

The hack in VDSO_IFUNC_RET to "synthesize" a function descriptor
for vDSO routines is also no longer necessary for ELFv2.
2013-12-04 07:41:38 -06:00
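For context, this is what an ELFv1 function descriptor looks like; ELFv2 drops it, so a function pointer becomes just the global entry point address (the struct below is the standard ELFv1 layout, shown only for illustration):

#include <stdint.h>

/* Under ELFv1 a function pointer does not point at code but at a
   three-doubleword descriptor in the .opd section.  */
typedef struct
{
  uint64_t entry;   /* address of the function's code */
  uint64_t toc;     /* TOC (r2) value the function expects */
  uint64_t env;     /* environment pointer (unused by C) */
} elfv1_func_desc;
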
Ulrich Weigand
d31beafa8e PowerPC64 ELFv2 ABI 1/6: Code refactoring
This is the first patch to support the new ELFv2 ABI in glibc.

As preparation, this patch simply refactors some of the powerpc64 assembler
code to move all code related to creating function descriptors (.opd section)
or using function descriptors (function pointer call) into a central place
in sysdep.h.

Note that most locations creating .opd entries were already using macros
in sysdep.h, this patch simply extends this to the remaining places.

No relevant change in generated code expected.
2013-12-04 07:41:38 -06:00
Alan Modra
7ec07d9a7b PowerPC64: Report overflow on @h and @ha relocations
This patch updates glibc in accordance with the binutils patch checked in here:
https://sourceware.org/ml/binutils/2013-10/msg00372.html

This changes the various R_PPC64_..._HI and _HA relocations to report
32-bit overflows.  The motivation is that existing uses of @h / @ha
are to build up 32-bit offsets (for the "medium model" TOC access
that GCC now defaults to), and we'd really like to see failures at
link / load time rather than silent truncations.

For those rare cases where a modifier is needed to build up a 64-bit
constant, new relocations _HIGH / _HIGHA are supported.

The patch also fixes a bug in overflow checking for the R_PPC64_ADDR30
and R_PPC64_ADDR32 relocations.
2013-12-04 07:41:37 -06:00
Mike Frysinger
cb8a6dbd17 rename configure.in to configure.ac
Autoconf has been deprecating configure.in for quite a long time.
Rename all our configure.in and preconfigure.in files to .ac.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-10-30 17:32:08 +10:00
Adhemerval Zanella
69f13dbf06 PowerPC: strcpy/stpcpy optimization for PPC64/POWER7
This patch intends to unify both the strcpy and stpcpy implementations
for PPC64 and PPC64/POWER7. The idea for the default powerpc64
implementation is to provide both doubleword and word aligned memory
access.

The PPC64/POWER7 version also provides doubleword and word memory access,
removes the branch hints, uses the cmpb instruction to compare
doublewords/words, and adds an optimization for inputs of the same
alignment.
2013-10-25 13:28:24 -05:00
Alan Modra
4cb81307b3 Use stdint.h types in union unaligned.
* sysdeps/powerpc/powerpc32/dl-machine.c (__process_machine_rela):
	Use stdint types rather than __attribute__((mode())).
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
2013-10-04 12:51:11 +09:30
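A sketch of the change the log entry describes (the exact glibc declaration may differ slightly): the relocation code reads possibly misaligned 2-, 4- and 8-byte fields, and <stdint.h> types express this more portably than __attribute__ ((mode (...))):

#include <stdint.h>

/* Possibly misaligned relocation targets; packed drops the alignment
   requirement so the fields can be read at any address.  */
union unaligned
{
  uint16_t u2;
  uint32_t u4;
  uint64_t u8;
} __attribute__ ((packed));
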
Alan Modra
f8e3e9f31b Correct little-endian relocation of UADDR64,32,16.
* sysdeps/powerpc/powerpc32/dl-machine.c (__process_machine_rela):
	Correct handling of unaligned relocs for little-endian.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
2013-10-04 11:33:12 +09:30
Alan Modra
466b039332 PowerPC LE memchr and memrchr
http://sourceware.org/ml/libc-alpha/2013-08/msg00105.html

Like strnlen, memchr and memrchr had a number of defects fixed by this
patch as well as adding little-endian support.  The first one I
noticed was that the entry to the main loop needlessly checked for
"are we done yet?" when we know the size is large enough that we can't
be done.  The second defect I noticed was that the main loop count was
wrong, which in turn meant that the small loop needed to handle an
extra word.  Thirdly, there is nothing to say that the string can't
wrap around zero, except of course that we'd normally hit a segfault
on trying to read from address zero.  Fixing that simplified a number
of places:

-	/* Are we done already?  */
-	addi    r9,r8,8
-	cmpld	r9,r7
-	bge	L(null)

becomes

+	cmpld	r8,r7
+	beqlr

However, the exit gets an extra test because I test for being on the
last word and then, if so, whether the byte offset is less than the end.
Overall, the change is a win.

Lastly, memrchr used the wrong cache hint.

	* sysdeps/powerpc/powerpc64/power7/memchr.S: Replace rlwimi with
	insrdi.  Make better use of reg selection to speed exit slightly.
	Schedule entry path a little better.  Remove useless "are we done"
	checks on entry to main loop.  Handle wrapping around zero address.
	Correct main loop count.  Handle single left-over word from main
	loop inline rather than by using loop_small.  Remove extra word
	case in loop_small caused by wrong loop count.  Add little-endian
	support.
	* sysdeps/powerpc/powerpc32/power7/memchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise.  Use proper
	cache hint.
	* sysdeps/powerpc/powerpc32/power7/memrchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Add little-endian
	support.  Avoid rlwimi.
	* sysdeps/powerpc/powerpc32/power7/rawmemchr.S: Likewise.
2013-10-04 10:41:46 +09:30
Alan Modra
3be87c77d2 PowerPC LE memset
http://sourceware.org/ml/libc-alpha/2013-08/msg00104.html

One of the things I noticed when looking at power7 timing is that rlwimi
is cracked and the two resulting insns have a register dependency.
That makes it a little slower than the equivalent rldimi.

	* sysdeps/powerpc/powerpc64/memset.S: Replace rlwimi with
        insrdi.  Formatting.
	* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/memset.S: Likewise.
2013-10-04 10:41:35 +09:30
Alan Modra
759cfef3ac PowerPC LE memcpy
http://sourceware.org/ml/libc-alpha/2013-08/msg00103.html

Little-endian support for memcpy.  I spent some time cleaning up the
64-bit power7 memcpy, in order to avoid the extra alignment traps
power7 takes for little-endian.  It probably would have been better
to copy the linux kernel version of memcpy.

	* sysdeps/powerpc/powerpc32/power4/memcpy.S: Add little endian support.
	* sysdeps/powerpc/powerpc32/power6/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/mempcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.  Make better
	use of regs.  Use power7 mtocrf.  Tidy function tails.
2013-10-04 10:41:24 +09:30
Alan Modra
fe6e95d717 PowerPC LE memcmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00102.html

This is a rather large patch due to formatting and renaming.  The
formatting changes were to make it possible to compare power7 and
power4 versions of memcmp.  Using different register defines came
about while I was wrestling with the code, trying to find spare
registers at one stage.  I found it much simpler if we refer to a reg
by the same name throughout a function, so it's better if short-term
multiple use regs like rTMP are referred to using their register
number.  I made the cr field usage changes when attempting to reload
rWORDn regs in the exit path to byte swap before comparing when
little-endian.  That proved a bad idea due to the pipelining involved
in the main loop;  Offsets to reload the regs were different first
time around the loop..  Anyway, I left the cr field usage changes in
place for consistency.

Aside from these more-or-less cosmetic changes, I fixed a number of
places where an early exit path restores regs unnecessarily, removed
some dead code, and optimised one or two exits.

	* sysdeps/powerpc/powerpc64/power7/memcmp.S: Add little-endian support.
	Formatting.  Consistently use rXXX register defines or rN defines.
	Use early exit labels that avoid restoring unused non-volatile regs.
	Make cr field use more consistent with rWORDn compares.  Rename
	regs used as shift registers for unaligned loop, using rN defines
	for short lifetime/multiple use regs.
	* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/memcmp.S: Likewise.  Exit with
	addi 1,1,64 to pop stack frame.  Simplify return value code.
	* sysdeps/powerpc/powerpc32/power4/memcmp.S: Likewise.
2013-10-04 10:40:56 +09:30
Alan Modra
664318c3eb PowerPC LE strchr
http://sourceware.org/ml/libc-alpha/2013-08/msg00101.html

Adds little-endian support to optimised strchr assembly.  I've also
tweaked the big-endian code a little.  In power7/strchr.S there's a
check in the tail of the function that we didn't match 0 before
finding a c match, done by comparing leading zero counts.  It's just
as valid, and quicker, to compare the raw output from cmpb.

Another little tweak is to use rldimi/insrdi in place of rlwimi for
the power7 strchr functions.  Since rlwimi is cracked, it is a few
cycles slower.  rldimi can be used on the 32-bit power7 functions
too.

	* sysdeps/powerpc/powerpc64/power7/strchr.S (strchr): Add little-endian
	support.  Correct typos, formatting.  Optimize tail.  Use insrdi
	rather than rlwimi.
	* sysdeps/powerpc/powerpc32/power7/strchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strchrnul.S (__strchrnul): Add
	little-endian support.  Correct typos.
	* sysdeps/powerpc/powerpc32/power7/strchrnul.S: Likewise.  Use insrdi
	rather than rlwimi.
	* sysdeps/powerpc/powerpc64/strchr.S (rTMP4, rTMP5): Define.  Use
	in loop and entry code to keep "and." results.
	(strchr): Add little-endian support.  Comment.  Move cntlzd
	earlier in tail.
	* sysdeps/powerpc/powerpc32/strchr.S: Likewise.
2013-10-04 10:40:22 +09:30
Alan Modra
43b8401371 PowerPC LE strcpy
http://sourceware.org/ml/libc-alpha/2013-08/msg00100.html

The strcpy changes for little-endian are quite straight-forward, just
a matter of rotating the last word differently.

I'll note that the powerpc64 version of stpcpy is just begging to be
converted to use 64-bit loads and stores..

	* sysdeps/powerpc/powerpc64/strcpy.S: Add little-endian support:
	* sysdeps/powerpc/powerpc32/strcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/stpcpy.S: Likewise.
	* sysdeps/powerpc/powerpc32/stpcpy.S: Likewise.
2013-10-04 10:40:11 +09:30
Alan Modra
8a7413f9b0 PowerPC LE strcmp and strncmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00099.html

More little-endian support.  I leave the main strcmp loops unchanged,
(well, except for renumbering rTMP to something other than r0 since
it's needed in an addi insn) and modify the tail for little-endian.

I noticed some of the big-endian tail code was a little untidy so have
cleaned that up too.

	* sysdeps/powerpc/powerpc64/strcmp.S (rTMP2): Define as r0.
	(rTMP): Define as r11.
	(strcmp): Add little-endian support.  Optimise tail.
	* sysdeps/powerpc/powerpc32/strcmp.S: Similarly.
	* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/strncmp.S: Likewise.
2013-10-04 10:39:52 +09:30
Alan Modra
33ee81de05 PowerPC LE strnlen
http://sourceware.org/ml/libc-alpha/2013-08/msg00098.html

The existing strnlen code has a number of defects, so this patch is more
than just adding little-endian support.  The changes here are similar to
those for memchr.

	* sysdeps/powerpc/powerpc64/power7/strnlen.S (strnlen): Add
	little-endian support.  Remove unnecessary "are we done" tests.
	Handle "s" wrapping around zero and extremely large "size".
	Correct main loop count.  Handle single left-over word from main
	loop inline rather than by using small_loop.  Correct comments.
	Delete "zero" tail, use "end_max" instead.
	* sysdeps/powerpc/powerpc32/power7/strnlen.S: Likewise.
2013-10-04 10:39:42 +09:30
Alan Modra
db9b4570c5 PowerPC LE strlen
http://sourceware.org/ml/libc-alpha/2013-08/msg00097.html

This is the first of nine patches adding little-endian support to the
existing optimised string and memory functions.  I did spend some
time with a power7 simulator looking at cycle by cycle behaviour for
memchr, but most of these patches have not been run on cpu simulators
to check that we are going as fast as possible.  I'm sure PowerPC can
do better.  However, the little-endian support mostly leaves main
loops unchanged, so I'm banking on previous authors having done a
good job on big-endian..  As with most code you stare at long enough,
I found some improvements for big-endian too.

Little-endian support for strlen.  Like most of the string functions,
I leave the main word or multiple-word loops substantially unchanged,
just needing to modify the tail.

Removing the branch in the power7 functions is just a tidy.  .align
produces a branch anyway.  Modifying regs in the non-power7 functions
is to suit the new little-endian tail.

	* sysdeps/powerpc/powerpc64/power7/strlen.S (strlen): Add little-endian
	support.  Don't branch over align.
	* sysdeps/powerpc/powerpc32/power7/strlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/strlen.S (strlen): Add little-endian support.
	Rearrange tmp reg use to suit.  Comment.
	* sysdeps/powerpc/powerpc32/strlen.S: Likewise.
2013-10-04 10:39:32 +09:30
Alan Modra
9b874b2f1e PowerPC ugly symbol versioning
http://sourceware.org/ml/libc-alpha/2013-08/msg00090.html

This patch fixes symbol versioning in setjmp/longjmp.  The existing
code uses raw versions, which results in wrong symbol versioning when
you want to build glibc with a base version of 2.19 for LE.

Note that merging the 64-bit and 32-bit versions in novmx-longjmp.c
and pt-longjmp.c doesn't result in GLIBC_2.0 versions for 64-bit, due
to the base in shlib_versions.

	* sysdeps/powerpc/longjmp.c: Use proper symbol versioning macros.
	* sysdeps/powerpc/novmx-longjmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/bsd-_setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/bsd-setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/__longjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/mcount.c: Likewise.
	* sysdeps/powerpc/powerpc32/setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/setjmp.S: Likewise.
	* nptl/sysdeps/unix/sysv/linux/powerpc/pt-longjmp.c: Likewise.
2013-10-04 10:38:28 +09:30
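The "proper symbol versioning macros" are glibc's internal <shlib-compat.h> helpers; a hedged illustration follows, with a made-up function and version pair rather than the actual setjmp/longjmp symbols (this only builds inside the glibc tree):

#include <shlib-compat.h>

void __new_longjmp (void *env, int val);
void __old_longjmp (void *env, int val);

/* The default symbol gets versioned_symbol; older binaries keep finding
   the compat one.  */
versioned_symbol (libc, __new_longjmp, longjmp, GLIBC_2_19);
#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_19)
compat_symbol (libc, __old_longjmp, longjmp, GLIBC_2_0);
#endif
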
Anton Blanchard
be1e5d3113 PowerPC LE setjmp/longjmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00089.html

Little-endian fixes for setjmp/longjmp.  When writing these I noticed
the setjmp code corrupts the non-volatile VMX registers when using an
unaligned buffer.  Anton fixed this, and also simplified it quite a
bit.

The current code uses boilerplate for the case where we want to store
16 bytes to an unaligned address.  For that we have to do a
read/modify/write of two aligned 16 byte quantities.  In our case we
are storing a bunch of back-to-back data (consecutive VMX registers),
and only the start and end of the region need the read/modify/write.

	[BZ #15723]
	* sysdeps/powerpc/jmpbuf-offsets.h: Comment fix.
	* sysdeps/powerpc/powerpc32/fpu/__longjmp-common.S: Correct
	_dl_hwcap access for little-endian.
	* sysdeps/powerpc/powerpc32/fpu/setjmp-common.S: Likewise.  Don't
	destroy vmx regs when saving unaligned.
	* sysdeps/powerpc/powerpc64/__longjmp-common.S: Correct CR load.
	* sysdeps/powerpc/powerpc64/setjmp-common.S: Likewise CR save.  Don't
	destroy vmx regs when saving unaligned.
2013-10-04 10:37:59 +09:30
Anton Blanchard
76a66d510a PowerPC floating point little-endian [14 of 15]
http://sourceware.org/ml/libc-alpha/2013-07/msg00205.html

These all wrongly specified float constants in a 64-bit word.

	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Correct float constants
	for little-endian.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
2013-10-04 10:36:24 +09:30
Alan Modra
7b88401f3b PowerPC floating point little-endian [12 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00087.html

Fixes for little-endian in 32-bit assembly.

	* sysdeps/powerpc/sysdep.h (LOWORD, HIWORD, HISHORT): Define.
	* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Load little-endian
	words of double from correct stack offsets.
	* sysdeps/powerpc/powerpc32/fpu/s_copysignl.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Use HISHORT.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
2013-10-04 10:35:43 +09:30
Adhemerval Zanella
5ebbff8fd1 PowerPC: Fix POINTER_CHK_GUARD thread register for PPC64 2013-09-25 13:43:04 -05:00
Carlos O'Donell
c61b4d41c9 BZ #15754: CVE-2013-4788
The pointer guard used for pointer mangling was not initialized for
static applications resulting in the security feature being disabled.
The pointer guard is now correctly initialized to a random value for
static applications. Existing static applications need to be
recompiled to take advantage of the fix.

The tests tst-ptrguard1-static and tst-ptrguard1 add regression
coverage to ensure the pointer guards are sufficiently random
and initialized to a default value.
2013-09-23 00:52:09 -04:00
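A minimal sketch of what the pointer guard protects, assuming the generic mangling scheme (the real PTR_MANGLE is per-architecture and usually also rotates the value): saved code pointers such as jmp_buf entries are XORed with a secret, so if the guard is left at zero, as it was for static binaries before this fix, the mangling degenerates to a no-op.

#include <stdint.h>

/* Set to a random value at startup; the bug left it zero in static
   binaries, disabling the protection.  */
extern uintptr_t __pointer_chk_guard_local;

static inline uintptr_t
ptr_mangle (uintptr_t p)
{
  return p ^ __pointer_chk_guard_local;   /* stored, mangled form */
}

static inline uintptr_t
ptr_demangle (uintptr_t p)
{
  return p ^ __pointer_chk_guard_local;   /* recover the real pointer */
}
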
Adhemerval Zanella
5430fc65a1 PowerPC: fix POWER7 memrchr for some large inputs 2013-09-05 09:32:56 -05:00
Ondřej Bílka
f24a6d086b Fix then/than typos. 2013-08-30 18:10:31 +02:00
Ondřej Bílka
c0c3f78afb Fix typos. 2013-08-21 19:48:48 +02:00