Commit Graph

656 Commits

Author SHA1 Message Date
Rajalakshmi Srinivasaraghavan
d89060d603 powerpc: strncmp optimization for power9
Vectorized loops are used for strings > 32B when compared
to power8 optimization.

Tested on power9 ppc64le simulator.
2016-12-13 10:53:42 +05:30
Adhemerval Zanella
5cd94e67d0 powerpc: Remove stpcpy internal clash with IFUNC
Commit c7debbdfac redirected the internal strrch to default powerpc64
implementation by redefining the weak_alias at
sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.c:

  #undef weak_alias
  #define weak_alias(name, aliasname) \
    extern __typeof (__strrchr_ppc) aliasname \
      __attribute__ ((weak, alias ("__strrchr_ppc")));

This creates a __GI_strchr alias that clashes with the IFUNC symbol in
stprchr.os.  There is not need to define the default version for internal
version, since ifunc should work internally for powerpc64.  This patch
removes the weak_alias indirection.

Checked on powerpc64le.

	* sysdeps/powerpc/powerpc64/multiarch/strrchr-ppc64.c (weak_alias):
	Remove redirection to __strrchr_ppc.
2016-12-01 15:53:16 -02:00
Rajalakshmi Srinivasaraghavan
80ab6401a9 powerpc: strcmp optimization for power9
Vectorized loops are used for strings > 32B when compared
to power8 optimization.

Tested on power9 ppc64le simulator.
2016-12-01 11:35:43 +05:30
Adhemerval Zanella
8072373ea9 powerpc: Remove stpcpy internal clash with IFUNC
Commit 142e0a9953 redirected the internal stpcpy to default powerpc64
implementation by redefining the weak_alias at
sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c:

  #undef weak_alias
  #define weak_alias(name, aliasname) \
    extern __typeof (__stpcpy_ppc) aliasname \
      __attribute__ ((weak, alias ("__stpcpy_ppc")));

This creates a __GI_stpcpy alias that clashes with the IFUNC symbol in
stpcpy.os.  There is not need to define the default version for internal
version, since ifunc should work internally for powerpc64.  This patch
removes the weak_alias indirection.

Checked on powerpc64le.

	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c (weak_alias):
	Remove redirection to __stpcpy_ppc.
2016-11-30 15:13:26 -02:00
Florian Weimer
b365289364 powerpc: Add hidden definition for __sigsetjmp
There already is a hidden prototype for __sigsetjmp, but the
architecture-specific definition was missing.
2016-11-29 10:16:35 +01:00
Steve Ellcey
d060cd002d Define wordsize.h macros everywhere
* bits/wordsize.h: Add documentation.
	* sysdeps/aarch64/bits/wordsize.h : New file
	* sysdeps/generic/stdint.h (PTRDIFF_MIN, PTRDIFF_MAX): Update
	definitions.
	(SIZE_MAX): Change ifdef to if in __WORDSIZE32_SIZE_ULONG check.
	* sysdeps/gnu/bits/utmp.h (__WORDSIZE_TIME64_COMPAT32): Check
	with #if instead of #ifdef.
	* sysdeps/gnu/bits/utmpx.h (__WORDSIZE_TIME64_COMPAT32): Ditto.
	* sysdeps/mips/bits/wordsize.h (__WORDSIZE32_SIZE_ULONG,
	__WORDSIZE32_PTRDIFF_LONG, __WORDSIZE_TIME64_COMPAT32):
	Add or change defines.
	* sysdeps/powerpc/powerpc32/bits/wordsize.h: Likewise.
	* sysdeps/powerpc/powerpc64/bits/wordsize.h: Likewise.
	* sysdeps/s390/s390-32/bits/wordsize.h: Likewise.
	* sysdeps/s390/s390-64/bits/wordsize.h: Likewise.
	* sysdeps/sparc/sparc32/bits/wordsize.h: Likewise.
	* sysdeps/sparc/sparc64/bits/wordsize.h: Likewise.
	* sysdeps/tile/tilegx/bits/wordsize.h: Likewise.
	* sysdeps/tile/tilepro/bits/wordsize.h: Likewise.
	* sysdeps/unix/sysv/linux/alpha/bits/wordsize.h: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h: Likewise.
	* sysdeps/unix/sysv/linux/sparc/bits/wordsize.h: Likewise.
	* sysdeps/wordsize-32/bits/wordsize.h: Likewise.
	* sysdeps/wordsize-64/bits/wordsize.h: Likewise.
	* sysdeps/x86/bits/wordsize.h: Likewise.
2016-11-04 09:37:44 -07:00
Joseph Myers
78b7adbaea Fix cmpli usage in power6 memset.
Building glibc for powerpc64 with recent (2.27.51.20161012) binutils,
with multi-arch enabled, I get the error:

../sysdeps/powerpc/powerpc64/power6/memset.S: Assembler messages:
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (5 is not between 0 and 1)
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (128 is not between 0 and 31)
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: missing operand

Indeed, cmpli is documented as a four-operand instruction, and looking
at nearby code it seems likely cmpldi was intended.  This patch fixes
this powerpc64 code accordingly, and makes a corresponding change to
the powerpc32 code.

Tested for powerpc, powerpc64 and powerpc64le by Tulio Magno Quites
Machado Filho

	* sysdeps/powerpc/powerpc32/power6/memset.S (memset): Use cmplwi
	instead of cmpli.
	* sysdeps/powerpc/powerpc64/power6/memset.S (memset): Use cmpldi
	instead of cmpli.
2016-10-25 15:54:16 +00:00
Joseph Myers
05f3ed0a79 Stop powerpc copysignl raising "invalid" for sNaN argument (bug 20718).
The powerpc (hard-float) implementations of copysignl, both 32-bit and
64-bit, raise spurious "invalid" exceptions when the first argument is
a signaling NaN.  copysign functions should never raise exceptions
even for signaling NaNs.

The problem is the use of an fcmpu instruction to test the sign of the
high part of the long double argument.  This patch fixes the functions
to use fsel instead (as used for fabsl following my fixes for a
similar bug there), or to examine the integer representation for older
32-bit processors without fsel.

Tested for powerpc64 and powerpc32 (configurations with and without
fsel used).

	[BZ #20718]
	* sysdeps/powerpc/powerpc32/fpu/s_copysignl.S (__copysignl): Do
	not use floating-point comparisons to test sign.
	* sysdeps/powerpc/powerpc64/fpu/s_copysignl.S (__copysignl):
	Likewise.
2016-10-19 22:58:34 +00:00
Stefan Liebler
00980d845f Use gcc attribute ifunc in libc_ifunc macro instead of inline assembly due to false debuginfo.
The current s390 ifunc resolver for vector optimized functions and the common
libc_ifunc macro in include/libc-symbols.h uses something like that to generate ifunc'ed functions:
extern void *__resolve___strlen(unsigned long int dl_hwcap) asm (strlen);
asm (".type strlen, %gnu_indirect_function");

This leads to false debug information:
objdump --dwarf=info libc.so:
...
<1><1e6424>: Abbrev Number: 43 (DW_TAG_subprogram)
    <1e6425>   DW_AT_external    : 1
    <1e6425>   DW_AT_name        : (indirect string, offset: 0x1146e): __resolve___strlen
    <1e6429>   DW_AT_decl_file   : 1
    <1e642a>   DW_AT_decl_line   : 23
    <1e642b>   DW_AT_linkage_name: (indirect string, offset: 0x1147a): strlen
    <1e642f>   DW_AT_prototyped  : 1
    <1e642f>   DW_AT_type        : <0x1e4ccd>
    <1e6433>   DW_AT_low_pc      : 0x998e0
    <1e643b>   DW_AT_high_pc     : 0x16
    <1e6443>   DW_AT_frame_base  : 1 byte block: 9c     (DW_OP_call_frame_cfa)
    <1e6445>   DW_AT_GNU_all_call_sites: 1
    <1e6445>   DW_AT_sibling     : <0x1e6459>
 <2><1e6449>: Abbrev Number: 44 (DW_TAG_formal_parameter)
    <1e644a>   DW_AT_name        : (indirect string, offset: 0x1845): dl_hwcap
    <1e644e>   DW_AT_decl_file   : 1
    <1e644f>   DW_AT_decl_line   : 23
    <1e6450>   DW_AT_type        : <0x1e4c8d>
    <1e6454>   DW_AT_location    : 0x122115 (location list)
...

The debuginfo for the ifunc-resolver function contains the DW_AT_linkage_name
field, which names the real function name "strlen". If you perform an inferior
function call to strlen in lldb, then it fails due to something like that:
"error: no matching function for call to 'strlen'
candidate function not viable: no known conversion from 'const char [6]'
to 'unsigned long' for 1st argument"

The unsigned long is the dl_hwcap argument of the resolver function.
The strlen function itself has no debufinfo.

The s390 ifunc resolver for memset & co uses something like that:
asm (".globl FUNC"
     ".type FUNC, @gnu_indirect_function"
     ".set FUNC, __resolve_FUNC");

This way the debuginfo for the ifunc-resolver function does not conain the
DW_AT_linkage_name field and the real function has no debuginfo, too.

Using this strategy for the vector optimized functions leads to some troubles
for functions like strnlen. Here we have __strnlen and a weak alias strnlen.
The __strnlen function is the ifunc function, which is realized with the asm-
statement above. The weak_alias-macro can't be used here due to undefined symbol:
gcc ../sysdeps/s390/multiarch/strnlen.c -c ...
In file included from <command-line>:0:0:
../sysdeps/s390/multiarch/strnlen.c:28:24: error: ‘strnlen’ aliased to undefined symbol ‘__strnlen’
 weak_alias (__strnlen, strnlen)
                        ^
./../include/libc-symbols.h:111:26: note: in definition of macro ‘_weak_alias’
   extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));
                          ^
../sysdeps/s390/multiarch/strnlen.c:28:1: note: in expansion of macro ‘weak_alias’
 weak_alias (__strnlen, strnlen)
 ^
make[2]: *** [build/string/strnlen.o] Error 1

As the __strnlen function is defined with asm-statements the function name
__strnlen isn't known by gcc. But the weak alias can also be done with an
asm statement to resolve this issue:
__asm__ (".weak  strnlen\n\t"
         ".set   strnlen,__strnlen\n");

In order to use the weak_alias macro, gcc needs to know the ifunc function. The
minimum gcc to build glibc is currently 4.7, which supports attribute((ifunc)).
See https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Function-Attributes.html.
It is only supported if gcc is configured with --enable-gnu-indirect-function
or gcc supports it by default for at least intel and s390x architecture.
This patch uses the old behaviour if gcc support is not available.
Usage of attribute ifunc is something like that:
__typeof (FUNC) FUNC __attribute__ ((ifunc ("__resolve_FUNC")));

Then gcc produces the same .globl, .type, .set assembler instructions like above.
And the debuginfo does not contain the DW_AT_linkage_name field and there is no
debuginfo for the real function, too.

But in order to get it work, there is also some extra work to do.
Currently, the glibc internal symbol on s390x e.g. __GI___strnlen is not the
ifunc symbol, but the fallback __strnlen_c symbol. Thus I have to omit the
libc_hidden_def macro in strnlen.c (here is the ifunc function __strnlen)
because it is already handled in strnlen-c.c (here is __strnlen_c).

Due to libc_hidden_proto (__strnlen) in string.h, compiling fails:
gcc ../sysdeps/s390/multiarch/strnlen.c -c ...
In file included from <command-line>:0:0:
../sysdeps/s390/multiarch/strnlen.c:53:24: error: ‘strnlen’ aliased to undefined symbol ‘__strnlen’
 weak_alias (__strnlen, strnlen)
                        ^
./../include/libc-symbols.h:111:26: note: in definition of macro ‘_weak_alias’
   extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));
                          ^
../sysdeps/s390/multiarch/strnlen.c:53:1: note: in expansion of macro ‘weak_alias’
 weak_alias (__strnlen, strnlen)
 ^
make[2]: *** [build/string/strnlen.os] Error 1

I have to redirect the prototypes for __strnlen in string.h and create a copy
of the prototype for using as ifunc function:
__typeof (__redirect___strnlen) __strnlen __attribute__ ((ifunc ("__resolve_strnlen")));
weak_alias (__strnlen, strnlen)

This way there is no trouble with the internal __GI_* symbols.
Glibc builds fine with this construct and the debuginfo is "correct".
For functions without a __GI_* symbol like memccpy this redirection is not needed.

This patch adjusts the common libc_ifunc and libm_ifunc macro to use gcc
attribute ifunc. Due to this change, the macro users where the __GI_* symbol
does not target the ifunc symbol have to be prepared with the redirection
construct.
Furthermore a configure check to test gcc support is added. If it is not supported,
the old behaviour is used.

This patch also prepares the libc_ifunc macro to be useable in s390-ifunc-macro.
The s390 ifunc-resolver-functions do have an hwcaps parameter and not all
resolvers need the same initialization code. The next patch in this series
changes the s390 ifunc macros to use this common one.

ChangeLog:

	* include/libc-symbols.h (__ifunc_resolver):
	New macro is used by __ifunc* macros.
	(__ifunc): New macro uses gcc attribute ifunc or inline assembly
	depending on HAVE_GCC_IFUNC.
	(libc_ifunc, libm_ifunc): Use __ifunc as base macro.
	(libc_ifunc_redirected, libc_ifunc_hidden, libm_ifunc_init): New macro.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c:
	Redirect ifunced function in header for using as type for ifunc function.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memcmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memcpy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memmove.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memset.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/rawmemchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strncmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memcmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpncpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcat.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strnlen.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strrchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strstr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c:
	Add libc_hidden_def() and use libc_ifunc_hidden() macro
	instead of libc_ifunc() macro.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Likewise.
2016-10-07 10:03:20 +02:00
Tulio Magno Quites Machado Filho
1850ce5a2e powerpc: Fix POWER9 implies
Fix multiarch build for POWER9 by correcting the order of the
directories listed at sysnames configure variable.
2016-09-19 09:35:38 -03:00
Aurelien Jarno
6bcc7ced4f ppc: Fix modf (sNaN) for pre-POWER5+ CPU (bug 20240).
Commit a6a4395d fixed modf implementation by compiling s_modf.c and
s_modff.c with -fsignaling-nans. However these files are also included
from the pre-POWER5+ implementation, and thus these files should also
be compiled with -fsignaling-nans.

Changelog:
	[BZ #20240]
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
	(CFLAGS-s_modf-ppc32.c): New variable.
	(CFLAGS-s_modff-ppc32.c): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
	(CFLAGS-s_modf-ppc64.c): Likewise.
	(CFLAGS-s_modff-ppc64.c): Likewise.
2016-07-08 11:24:34 +02:00
Rajalakshmi Srinivasaraghavan
30e4cc5413 powerpc: Fix return code of strcasecmp for unaligned inputs
If the input values are unaligned and if there are null characters in the
memory before the starting address of the input values, strcasecmp
gives incorrect return code. Fixed it by adding mask the bits that
are not part of the string.
2016-07-05 21:20:41 +05:30
Anton Blanchard
aa95fc13f5 powerpc: Add a POWER8-optimized version of sinf()
This uses the implementation of sinf() in sysdeps/x86_64/fpu/s_sinf.S
as inspiration.
2016-06-30 16:08:49 -03:00
Tulio Magno Quites Machado Filho
35da2541c3 powerpc: Add a POWER8-optimized version of expf()
This implementation is based on the one already used at
sysdeps/x86_64/fpu/e_expf.S.

This implementation improves the performance by ~14% on average in synthetic
benchmarks at the cost of decreasing accuracy to 1 ULP.
2016-06-30 14:56:14 -03:00
Torvald Riegel
76a0b73e81 Remove atomic_compare_and_exchange_bool_rel.
atomic_compare_and_exchange_bool_rel and
catomic_compare_and_exchange_bool_rel are removed and replaced with the
new C11-like atomic_compare_exchange_weak_release.  The concurrent code
in nscd/cache.c has not been reviewed yet, so this patch does not add
detailed comments.

	* nscd/cache.c (cache_add): Use new C11-like atomic operation instead
	of atomic_compare_and_exchange_bool_rel.
	* nptl/pthread_mutex_unlock.c (__pthread_mutex_unlock_full): Likewise.
	* include/atomic.h (atomic_compare_and_exchange_bool_rel,
	catomic_compare_and_exchange_bool_rel): Remove.
	* sysdeps/aarch64/atomic-machine.h
	(atomic_compare_and_exchange_bool_rel): Likewise.
	* sysdeps/alpha/atomic-machine.h
	(atomic_compare_and_exchange_bool_rel): Likewise.
	* sysdeps/arm/atomic-machine.h
	(atomic_compare_and_exchange_bool_rel): Likewise.
	* sysdeps/mips/atomic-machine.h
	(atomic_compare_and_exchange_bool_rel): Likewise.
	* sysdeps/tile/atomic-machine.h
	(atomic_compare_and_exchange_bool_rel): Likewise.
2016-06-24 23:04:40 +03:00
Joseph Myers
f4015c8a86 Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256).
Some architectures have their own versions of fdim functions, which
are missing errno setting (bug 6796) and may also return sNaN instead
of qNaN for sNaN input, in the case of the x86 / x86_64 long double
versions (bug 20256).

These versions are not actually doing anything that a compiler
couldn't generate, just straightforward comparisons / arithmetic (and,
in the x86 / x86_64 case, testing for NaNs with fxam, which isn't
actually needed once you use an unordered comparison and let the NaNs
pass through the same subtraction as non-NaN inputs).  This patch
removes the x86 / x86_64 / powerpc versions, so that those
architectures use the generic C versions, which correctly handle
setting errno and deal properly with sNaN inputs.  This seems better
than dealing with setting errno in lots of .S versions.

The i386 versions also return results with excess range and precision,
which is not appropriate for a function exactly defined by reference
to IEEE operations.  For errno setting to work correctly on overflow,
it's necessary to remove excess range with math_narrow_eval, which
this patch duly does in the float and double versions so that the
tests can reliably pass on x86.  For float, this avoids any double
rounding issues as the long double precision is more than twice that
of float.  For double, double rounding issues will need to be
addressed separately, so this patch does not fully fix bug 20255.

Tested for x86_64, x86 and powerpc.

	[BZ #6796]
	[BZ #20255]
	[BZ #20256]
	* math/s_fdim.c: Include <math_private.h>.
	(__fdim): Use math_narrow_eval on result.
	* math/s_fdimf.c: Include <math_private.h>.
	(__fdimf): Use math_narrow_eval on result.
	* sysdeps/i386/fpu/s_fdim.S: Remove file.
	* sysdeps/i386/fpu/s_fdimf.S: Likewise.
	* sysdeps/i386/fpu/s_fdiml.S: Likewise.
	* sysdeps/i386/i686/fpu/s_fdim.S: Likewise.
	* sysdeps/i386/i686/fpu/s_fdimf.S: Likewise.
	* sysdeps/i386/i686/fpu/s_fdiml.S: Likewise.
	* sysdeps/powerpc/fpu/s_fdim.c: Likewise.
	* sysdeps/powerpc/fpu/s_fdimf.c: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise.
	* sysdeps/x86_64/fpu/s_fdiml.S: Likewise.
	* math/libm-test.inc (fdim_test_data): Expect errno setting on
	overflow.  Add sNaN tests.
2016-06-14 16:04:19 +00:00
raji
c8376f3e07 powerpc: strcasecmp/strncasecmp optmization for power8
This implementation utilizes vectors to improve performance
compared to current byte by byte implementation for POWER7.
The performance improvement is upto 4x.  This patch is tested
on powerpc64 and powerpc64le.
2016-06-14 14:51:16 +05:30
Tulio Magno Quites Machado Filho
c24480ce3b powerpc: Fix --disable-multi-arch build on POWER8
Add missing symbols of stpncpy and strcasestr when multi-arch is
disabled.
Fix memset call from strncpy/stpncpy when multi-arch is disabled.
2016-06-06 16:03:29 -03:00
Joseph Myers
f6ef0657e4 Fix powerpc64 ceil, rint etc. on sNaN input (bug 20160).
The powerpc64 versions of ceil, floor, round, trunc, rint, nearbyint
and their float versions return sNaN for sNaN input when they should
return qNaN.  This patch fixes them to add a NaN argument to itself to
quiet sNaNs before returning.

Tested for powerpc64.

	[BZ #20160]
	* sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Add NaN
	argument to itself before returning the result.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint):
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rint.S (__rint): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S (__rintf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
2016-05-27 17:47:54 +00:00
Joseph Myers
24e9ae1bc2 Avoid "invalid" exceptions from powerpc fabsl (sNaN) (bug 20157).
The powerpc implementations of fabsl for ldbl-128ibm (both powerpc32
and powerpc64) wrongly raise the "invalid" exception for sNaN
arguments.  fabs functions should be quiet for all inputs including
signaling NaNs.  The problem is the use of a comparison instruction
fcmpu to determine if the high part of the argument is negative and so
the low part needs to be negated; such instructions raise "invalid"
for sNaNs.

There is a pure integer implementation of fabsl in
sysdeps/ieee754/ldbl-128ibm/s_fabsl.c.  However, it's not necessary to
use it to avoid such exceptions.  The fsel instruction does not raise
exceptions for sNaNs, and can be used in place of the original
comparison.  (Note that if the high part is zero or a NaN, it does not
matter whether the low part is negated; the choice of whether the low
part of a zero is +0 or -0 does not affect the value, and the low part
of a NaN does not affect the value / payload either.)

The condition in GCC for fsel to be available is TARGET_PPC_GFXOPT,
corresponding to the _ARCH_PPCGR predefined macro.  fsel is available
on all 64-bit processors supported by GCC.  A few 32-bit processors
supported by GCC do not have TARGET_PPC_GFXOPT despite having hard
float support.  To support those processors, integer code (similar to
that in copysignl) is included for the !_ARCH_PPCGR case for
powerpc32.

Tested for powerpc32 (configurations with and without _ARCH_PPCGR) and
powerpc64.

	[BZ #20157]
	* sysdeps/powerpc/powerpc32/fpu/s_fabsl.S (__fabsl): Use fsel to
	determine whether to negate low half if [_ARCH_PPCGR], and integer
	comparison otherwise.
	* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S (__fabsl): Use fsel to
	determine whether to negate low half.
2016-05-27 15:29:31 +00:00
Joseph Myers
b4d80349bb Do not raise "inexact" from powerpc64 ceil, floor, trunc (bug 15479).
Continuing fixes for ceil, floor and trunc functions not to raise the
"inexact" exception, this patch fixes the versions used on older
powerpc64 processors.  As was done with the round implementations some
time ago, the save of floating-point state is moved after the first
floating-point operation on the input to ensure that any "invalid"
exception from signaling NaN input is included in the saved state, and
then the whole state gets restored rather than just the rounding mode.

This has no effect on configurations using the power5+ code, since
such processors can do these operations with a single instruction (and
those instructions do not set "inexact", so are correct for TS 18661-1
semantics).

Tested for powerpc64.

	[BZ #15479]
	* sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Move save of
	floating-point state after first floating-point operation on
	input.  Restore full floating-point state instead of just rounding
	mode.
	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
2016-05-25 17:42:22 +00:00
Gabriel F. T. Gomes
eb3b8a4924 powerpc: Fix operand prefixes
The file sysdeps/powerpc/sysdeps.h defines aliases for condition register
operands.  E.g.: 'cr7' means condition register 7.  On the one hand, this
increases readability, as it makes it easier for readers to know whether the
operand is a condition register, a general purpose register or an immediate.
On the other hand, this permits that condition registers be written as if they
were general purpose, and vice-versa, thus reducing the readability of the
code.

This commit removes some of these unintentional misuses.

The changes have no effect on the final code.  Checked with objdump.
2016-05-04 09:14:52 -03:00
Gabriel F. T. Gomes
72c11b353e powerpc: Zero pad using memset in strncpy/stpncpy
Call __memset_power8 to pad, with zeros, the remaining bytes in the
dest string on __strncpy_power8 and __stpncpy_power8.  This improves
performance when n is larger than the input string, giving ~30% gain for
larger strings without impacting much shorter strings.
2016-04-29 10:05:33 -03:00
Paul E. Murphy
8f1b841e45 powerpc: Add optimized strcspn for P8
A few minor adjustments to the P8 strspn gives us
an almost equally optimized P8 strcspn.
2016-04-25 09:11:02 -05:00
Rajalakshmi Srinivasaraghavan
e413b14e18 powerpc: strcasestr optmization for power8
This patch optimizes strcasestr function for power >= 8 systems.  The average
improvement of this optimization is ~40% and compares 16 bytes at a time
using vector instructions.  This patch is tested on powerpc64 and powerpc64le.
2016-04-22 19:23:13 +05:30
Carlos Eduardo Seo
1b045ee53e powerpc: Optimization for strlen for POWER8.
This implementation takes advantage of vectorization to improve performance of
the loop over the current strlen implementation for POWER7.
2016-04-15 17:19:19 -03:00
Paul E. Murphy
25dba0ad05 powerpc: Add optimized P8 strspn
This utilizes vectors and bitmasks.  For small needle, large
haystack, the performance improvement is upto 8x.  For short
strings (0-4B), the cost of computing the bitmask dominates,
and is a tad slower.
2016-04-07 15:51:28 -05:00
Adhemerval Zanella
528ffb3a04 Remove powerpc64 strspn, strcspn, and strpbrk implementation
This patch removes the powerpc64 optimized strspn, strcspn, and
strpbrk assembly implementation now that the default C one
implements the same strategy.  On internal glibc benchtests
current implementations shows similar performance with -O2.

Tested on powerpc64le (POWER8).

	* sysdeps/powerpc/powerpc64/strcspn.S: Remove file.
	* sysdeps/powerpc/powerpc64/strpbrk.S: Remove file.
	* sysdeps/powerpc/powerpc64/strspn.S: Remove file.
2016-04-01 10:44:45 -03:00
Rajalakshmi Srinivasaraghavan
869d7180dd powerpc: Rearrange cfi_offset calls
This patch rearranges cfi_offset() calls after the last store
so as to avoid extra DW_CFA_advance opcodes in unwind information.
2016-03-11 11:31:58 -03:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Carlos Eduardo Seo
b1f19b8ef1 powerpc: Add basic support for POWER9 sans hwcap.
This patch adds the minimum changes for supporting the POWER9 processor.
2015-12-22 14:45:55 -02:00
Carlos Eduardo Seo
67385a01d2 powerpc: Add hwcap/hwcap2/platform data to TCB.
This patch adds a new feature for powerpc.  In order to get faster access to
the HWCAP/HWCAP2 bits and platform number (i.e. for implementing
__builtin_cpu_is () / __builtin_cpu_supports () in GCC) without the overhead of
reading from the auxiliary vector, we now reserve space for them in the TCB.
This is an ABI change for GLIBC 2.23.

A new versioned symbol '__parse_hwcap_and_convert_at_platform' is available to
get the data from the auxiliary vector and parse it, and store it for later use
in the TLS initialization code.  This function is called very early
(in _dl_sysdep_start () via DL_PLATFORM_INFO for the dynamic linking case, and
in __libc_start_main () for the static linking case) to make sure the data is
available at the time of TLS initialization.

	* sysdeps/powerpc/Makefile (sysdep-dl-routines): Add hwcapinfo.
	(sysdep_routines): Likewise.
	(sysdep-rtld-routines): Likewise.
	[$(subdir) = nptl](tests): Add test-get_hwcap and test-get_hwcap-static
	[$(subdir) = nptl](tests-static): test-get_hwcap-static
	* sysdeps/powerpc/Versions: Added new
	__parse_hwcap_and_convert_at_platform symbol to GLIBC-2.23.
	* sysdeps/powerpc/hwcapinfo.c: New file.
	(__tcb_parse_hwcap_and_convert_at_platform): New function to initialize
	and parse hwcap, hwcap2 and platform number information.
	* sysdeps/powerpc/hwcapinfo.h: New file.  Creates global variables
	to store HWCAP+HWCAP2 and platform number.
	* sysdeps/powerpc/nptl/tcb-offsets.sym: Added new offsets
	for HWCAP+HWCAP2 and platform number in the TCB.
	* sysdeps/powerpc/nptl/tls.h: New functionality.  Stores
	the HWCAP, HWCAP2 and platform number in the TCB.
	(dtv): Added new fields for HWCAP+HWCAP2 and platform number.
	(TLS_INIT_TP): Included calls to add the hwcap and
	at_platform values in the TCB in TP initialization.
	(TLS_DEFINE_INIT_TP): Likewise.
	(THREAD_GET_HWCAP): New macro.
	(THREAD_SET_HWCAP): Likewise.
	(THREAD_GET_AT_PLATFORM): Likewise.
	(THREAD_SET_AT_PLATFORM): Likewise.
	* sysdeps/powerpc/powerpc32/dl-machine.h:
	(dl_platform_init): New function that calls
	__parse_hwcap_and_convert_at_platform for the dymanic linking case for
	powerpc32.
	* sysdeps/powerpc/powerpc64/dl-machine.h: Likewise, for powerpc64.
	* sysdeps/powerpc/test-get_hwcap-static.c: New file.  Testcase for
	this functionality, static linking case.
	* sysdeps/powerpc/test-get_hwcap.c: New file.  Likewise, dynamic
	linking case.
	* sysdeps/unix/sysv/linux/powerpc/libc-start.c: Added call to
	__parse_hwcap_and_convert_at_platform for the static linking case.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist:
	Included the new __parse_hwcap_and_convert_at_platform symbol in the
	ABI list for GLIBC 2.23.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist:
	Likewise.
2015-12-03 13:56:13 -02:00
Joseph Myers
21378ae0d3 Fix powerpc round, roundf spurious "inexact" (bug 19238).
The powerpc hard-float round and roundf functions, both 32-bit and
64-bit, raise spurious "inexact" exceptions for integer arguments from
adding 0.5 and rounding to integer toward zero.

Since these functions already save and restore the rounding mode, it's
natural to make them restore the full floating-point state instead to
fix this bug, which this patch does.  The save of the state is moved
after the first floating-point operation on the input so that any
"invalid" exceptions from signaling NaN inputs are properly
preserved.  As a consequence of this approach to the fix, "inexact"
for noninteger arguments (disallowed by TS 18661-1 but not by C99/C11,
see bug 15479) is also avoided for these implementations; this is
*not* a general fix for bug 15479 since plenty of other
implementations of various functions still raise spurious "inexact"
for noninteger arguments.

This issue and fix do not apply to builds using power5+ versions of
round and roundf, which use the frin instruction and avoid "inexact"
exceptions that way.

This patch should get hard-float powerpc32 and powerpc64 (default
function implementations) back to a state where test-float and
test-double will pass after ulps regeneration.

Tested for powerpc32 and powerpc64.

	[BZ #15479]
	[BZ #19238]
	* sysdeps/powerpc/powerpc32/fpu/s_round.S (__round): Save
	floating-point state after first operation on input.  Restore full
	state rather than just rounding mode.
	* sysdeps/powerpc/powerpc32/fpu/s_roundf.S (__roundf): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise.
2015-11-12 19:00:06 +00:00
Joseph Myers
32b71ad358 Fix powerpc64 lround, lroundf, llround, llroundf spurious "inexact" exceptions (bug 19235).
Similar to bug 19134 for powerpc32, the powerpc64 implementations of
lround, lroundf, llround, llroundf can raise spurious "inexact"
exceptions for integer arguments from adding 0.5 then converting to
integer (this does not apply to the power5+ version for double, which
uses the frin instruction which is defined never to raise "inexact"; I
don't know why power5+ doesn't use that version for float as well).

This patch fixes the bug in a similar way to the powerpc32 bug, by
testing for integers (adding and subtracting 2^52 and comparing with
the value before that addition and subtraction) and not adding 0.5 in
that case.

The powerpc maintainers may wish to look at making power5+ / power6x /
power8 use frin for float lround / llround as well as for double,
unless there's some reason I've missed that this isn't beneficial.

Tested for powerpc64.

	[BZ #19235]
	* sysdeps/powerpc/powerpc64/fpu/s_llround.S (__llround): Do not
	add 0.5 to integer arguments.
	* sysdeps/powerpc/powerpc64/fpu/s_llroundf.S (__llroundf):
	Likewise.
	(.LC2): New object.
2015-11-12 16:24:00 +00:00
Joseph Myers
71d1b0166b Fix powerpc nearbyint wrongly clearing "inexact" and leaving traps disabled (bug 19228).
Similar to bug 15491 recently fixed for x86_64 / x86, the powerpc
(both powerpc32 and powerpc64) hard-float implementations of
nearbyintf and nearbyint wrongly clear an "inexact" exception that was
raised before the function was called; this shows up as failure of the
test math/test-nearbyint-except added when that bug was fixed.  They
also wrongly leave traps on "inexact" disabled if they were enabled
before the function was called.

This patch fixes the bugs similar to how the x86 bug was fixed: saving
and restoring the whole floating-point state, both to restore the
original "inexact" flag state and to restore the original state of
whether traps on "inexact" were enabled.  Because there's a convenient
point in the powerpc implementations to save state after any sNaN
arguments will have raised "invalid" but before "inexact" traps need
to be disabled, no special handling for "invalid" is needed as in the
x86 version.

Tested for powerpc64 and powerpc32, where it fixes the
math/test-nearbyint-except failure as well as fixing the new test
math/test-nearbyint-except-2 added by this patch.  Also tested for
x86_64 and x86 that the new test passes.

If powerpc experts see a more efficient way of doing this
(e.g. instruction positioning that's better for pipelines on typical
processors) then of course followups optimizing the fix are welcome.

	[BZ #19228]
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S (__nearbyint): Save
	and restore full floating-point state.
	* sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint):
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
	* math/test-nearbyint-except-2.c: New file.
	* math/Makefile (tests): Add test-nearbyint-except-2.
2015-11-11 00:06:09 +00:00
Gabriel F. T. Gomes
b0f81637d5 PowerPC: Add comments to optimized strncpy
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Added comments to some
	assembly instructions.
2015-10-01 17:36:55 -03:00
Gabriel F. T. Gomes
850713336e PowerPC: Fix operand prefixes
The file sysdeps/powerpc/sysdeps.h defines aliases for register operands,
which add the letter 'r' as a prefix to a register name.  E.g.: register 20
can be written as 'r20', instead of '20'.  On the one hand, this increases
readability, as it makes it easier for readers to know whether the operand is a
register or an immediate.  On the other hand, this permits that immediate
operands be written as if they were registers, and vice-versa, thus reducing
the readability of the code.

This commit removes some of these unintentional misuses.

This commit also increases readability of the code by adding the prefix 'cr' to
some uses of the control register.

Both changes have no effect on the final code.  Checked with objdump.

	* sysdeps/powerpc/powerpc64/power8/strncpy.S: Remove or add register
	prefix from operands.
2015-10-01 17:36:46 -03:00
Joseph Myers
de071d199a Move bits/atomic.h to atomic-machine.h (bug 14912).
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/atomic.h to atomic-machine.h to follow that
convention.

This is the only change in this series that needs to change the
filename rather than simply removing a directory level (because both
atomic.h and bits/atomic.h exist at present).

Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).

	[BZ #14912]
	* sysdeps/aarch64/bits/atomic.h: Move to ...
	* sysdeps/aarch64/atomic-machine.h: ...here.
	(_AARCH64_BITS_ATOMIC_H): Rename macro to
	_AARCH64_ATOMIC_MACHINE_H.
	* sysdeps/alpha/bits/atomic.h: Move to ...
	* sysdeps/alpha/atomic-machine.h: ...here.
	* sysdeps/arm/bits/atomic.h: Move to ...
	* sysdeps/arm/atomic-machine.h: ...here.  Update comments.
	* bits/atomic.h: Move to ...
	* sysdeps/generic/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/i386/bits/atomic.h: Move to ...
	* sysdeps/i386/atomic-machine.h: ...here.
	* sysdeps/ia64/bits/atomic.h: Move to ...
	* sysdeps/ia64/atomic-machine.h: ...here.
	* sysdeps/m68k/coldfire/bits/atomic.h: Move to ...
	* sysdeps/m68k/coldfire/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ...
	* sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here.
	* sysdeps/microblaze/bits/atomic.h: Move to ...
	* sysdeps/microblaze/atomic-machine.h: ...here.
	* sysdeps/mips/bits/atomic.h: Move to ...
	* sysdeps/mips/atomic-machine.h: ...here.
	(_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H.
	* sysdeps/powerpc/bits/atomic.h: Move to ...
	* sysdeps/powerpc/atomic-machine.h: ...here.  Update comments.
	* sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ...
	* sysdeps/powerpc/powerpc32/atomic-machine.h: ...here.  Update
	comments.  Include <atomic-machine.h> instead of <bits/atomic.h>.
	* sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ...
	* sysdeps/powerpc/powerpc64/atomic-machine.h: ...here.  Include
	<atomic-machine.h> instead of <bits/atomic.h>.
	* sysdeps/s390/bits/atomic.h: Move to ...
	* sysdeps/s390/atomic-machine.h: ...here.
	* sysdeps/sparc/sparc32/bits/atomic.h: Move to ...
	* sysdeps/sparc/sparc32/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ...
	* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here.
	* sysdeps/sparc/sparc64/bits/atomic.h: Move to ...
	* sysdeps/sparc/sparc64/atomic-machine.h: ...here.
	* sysdeps/tile/bits/atomic.h: Move to ...
	* sysdeps/tile/atomic-machine.h: ...here.
	* sysdeps/tile/tilegx/bits/atomic.h: Move to ...
	* sysdeps/tile/tilegx/atomic-machine.h: ...here.  Include
	<sysdeps/tile/atomic-machine.h> instead of
	<sysdeps/tile/bits/atomic.h>.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/tile/tilepro/bits/atomic.h: Move to ...
	* sysdeps/tile/tilepro/atomic-machine.h: ...here.  Include
	<sysdeps/tile/atomic-machine.h> instead of
	<sysdeps/tile/bits/atomic.h>.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here.  Include
	<sysdeps/arm/atomic-machine.h> instead of
	<sysdeps/arm/bits/atomic.h>.
	* sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here.
	(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here.
	(_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H.
	* sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ...
	* sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here.
	* sysdeps/x86_64/bits/atomic.h: Move to ...
	* sysdeps/x86_64/atomic-machine.h: ...here.
	* include/atomic.h: Include <atomic-machine.h> instead of
	<bits/atomic.h>.
2015-09-11 20:00:19 +00:00
Paul E. Murphy
18173559a2 powerpc: Fix tabort usage in syscalls
Fix usage of tabort in generated syscalls.  r0 has special meaning
when used with this instruction, thus it will not generate
persistent errors, nor return an error code.  This mitigates poor
CPU usage when performing elided critical sections.

Additionally, transactions should be aborted when entering a user
invoked syscall.  Otherwise the results of the transaction may be
undefined.

2015-08-25  Paul E. Murphy  <murphyp@linux.vnet.ibm.com>

	* sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION): Use
	register other than r0 for tabort, it has special meaning.
	* sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION): Likewise
	* sysdeps/unix.sysv/linux/powerpc/syscall.S (syscall): Abort
	transaction before starting syscall.
2015-08-25 13:45:56 -03:00
Rajalakshmi Srinivasaraghavan
fe7faec3e5 powerpc: Handle worstcase behavior in strstr() for POWER7
Instead of checking needle length, constant 'n' number of comparisons
is checked to fall back to default implementation.  This patch is tested
on powerpc64 and powerpc64le.

2015-08-25  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>

	* sysdeps/powerpc/powerpc64/power7/strstr.S: Handle worst case.
2015-08-25 13:45:56 -03:00
Carlos Eduardo Seo
502b91de14 powerpc: make memchr use memchr-power7.
In powerpc64, memchr was always pointing to the internal __GI_memchr
implementation.  This patch fixes that and makes it use the
optimized POWER7 version when adequate.

	* sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c: Make
	memchr not point to the internal __GI_memchr implementation.
2015-08-21 17:05:40 -03:00
Ondrej Bilka
5011051da3 powerpc: Fix stpcpy performance for power8
This patch fixes the missing enablement for stpcpy on POWER8.

	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Fix ifunc.
2015-08-11 10:03:10 -03:00
Adhemerval Zanella
6f714aa4ad powerpc: Fix PPC64/POWER7 conform tests
When building with --disable-multi-arch the memmove and strstr POWER7
optimization create and uses symbols that conflict with expect conform
tests.

	* sysdeps/powerpc/powerpc64/power7/memmove.S (bcopy): Changing to
	__bcopy and add a weak_alias to bcopy.
	* sysdeps/powerpc/powerpc64/power7/strstr.S (strstr): Use __strnlen
	for static build.
2015-08-11 10:03:10 -03:00
Adhemerval Zanella
142e0a9953 powerpc: Use default strcpy optimization for POWER7
This patches uses the default strcpy/stpcpy implementation for
POWER7/PPC64.  This is faster in mostly inputs for benchtests
and for multiarch the implementation uses the POWER7 strlen and
memcpy.

	* string/stpcpy.c (__stpcpy): Use STPCPY to redefine symbol name and
	cleanup macro usage.
	* string/strcpy.c (strcpt): Use STRCPY to redefine symbol name.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.S: Remove file.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/stpcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/stpcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/strcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
	[SHARED && IS_IN (libc)]: Include <string/strcpy.c>.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
	[SHARED && IS_IN (libc)]: Include <string/stpcpy.c>.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.c: New file.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.
2015-08-11 10:03:10 -03:00
Adhemerval Zanella
14362ef154 powerpc: Fix strnlen/power7 build
This patch fixes the strnlen.S build with --disable-multi-arch option.
2015-08-11 10:03:09 -03:00
Adhemerval Zanella
357bb400f1 powerpc: Fix strstr/power7 build
This patch fixes the strstr build with --disable-multi-arch option.
The optimization calls the __strstr_ppc symbol, which always build
for multiarch config but not if it is disable.  This patch fixes it
by adding the default C implementation object with the expected
symbol name.

	* sysdeps/powerpc/powerpc64/power7/Makefile [$(subdir) = string]
	(sysdep_routines): Add strstr-ppc64.
	* sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c: New file.
2015-08-11 10:03:09 -03:00
Rajalakshmi Srinivasaraghavan
b42f8cad52 powerpc: strstr optimization
This patch optimizes strstr function for power >= 7 systems.  Performance
gain is obtained using aligned memory access and usage of cmpb
instruction for quicker comparison.  The average improvement of this
optimization is ~40%.  Tested on ppc64 and ppc64le.

2015-07-16  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>

	* sysdeps/powerpc/powerpc64/multiarch/Makefile: Add strstr().
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strstr.S: New File.
	* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: New File.
	* sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c: New File.
	* sysdeps/powerpc/powerpc64/multiarch/strstr.c: New File.
2015-07-16 13:43:51 -03:00
Adhemerval Zanella
7bf8fb1042 libc-vdso.h place consolidation
This patch moves the libc-vdso.h internal header from bits folder to
default architecture one and also corrects the remaning includes in
the files.
2015-04-20 08:51:17 -03:00
Alan Modra
19a6a3acd1 Harden powerpc64 elf_machine_fixup_plt
IFUNC is difficult to correctly implement on any target needing a GOT
to support position independent code, due to the dependency on order
of dynamic relocations.  ld.so should be changed to apply IFUNC
relocations last, globally, because without that it is actually
impossible to write an IFUNC resolver in C that works in all
situations.  Case in point, vfork in libpthread.so is an IFUNC with
the resolver returning &__libc_vfork.  (system and fork are similar.)
If another shared library, libA say, uses vfork then it is quite
possible that libpthread.so hasn't been dynamically relocated before
the unfortunate libA is dynamically relocated.  In that case the GOT
entry for &__libc_vfork is still zero, so the IFUNC resolver returns
NULL.  LD_BIND_NOW=1 results in libA PLT dynamic relocations being
applied using this NULL value and ld.so segfaults.

This patch hardens ld.so to not segfault on a NULL from an IFUNC
resolver.  It also fixes a problem with undefined weak.  If you leave
the plt entry as-is for undefined weak then if the entry is ever
called it will loop in ld.so rather than segfaulting.

	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_fixup_plt):
	Don't segfault if ifunc resolver returns a NULL.  Do set plt to
	zero for undefined weak.
	(elf_machine_plt_conflict): Similarly.
2015-03-26 12:30:45 +10:30
Alan Modra
afcd9480fe powerpc __tls_get_addr call optimization
This patch is glibc support for a PowerPC TLS optimization, inspired
by Alexandre Oliva's TLS optimization for other processors,
http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt

In essence, this optimization uses a zero module id in the tls_index
GOT entry to indicate that a TLS variable is allocated space in the
static TLS area.  A special plt call linker stub for __tls_get_addr
checks for such a tls_index and if found, returns the offset
immediately.  The linker communicates the fact that the special
__tls_get_addr stub is used by setting a bit in the dynamic tag
DT_PPC64_OPT/DT_PPC_OPT.  glibc communicates to the linker that this
optimization is available by the presence of __tls_get_addr_opt.

tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for
tst-tls-dlinfo, which otherwise would fail since it tests that no
static tls is allocated.  The ld option --no-tls-get-addr-optimize has
been available since binutils-2.20 so doesn't need a configure test.

	* NEWS: Advertise TLS optimization.
	* elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define.
	(DT_PPC_NUM): Increment.
	* elf/dynamic-link.h (HAVE_STATIC_TLS): Define.
	(CHECK_STATIC_TLS): Use here.
	* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize
	TLS descriptors.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/powerpc/dl-tls.c: New file.
	* sysdeps/powerpc/Versions: Add __tls_get_addr_opt.
	* sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test.
	* sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test.
	Build tst-tlsmod2.so with --no-tls-get-addr-optimize.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.
2015-03-25 15:53:47 +10:30
Alan Modra
da9f333410 powerpc64 configure message
This feature doesn't depend on the linker, as can be seen from the
actual test.  It's a compiler feature.

	* sysdeps/powerpc/powerpc64/configure.ac: Correct "linker support
	for overlapping .opd entries" to "support...".
	* sysdeps/powerpc/powerpc64/configure: Regenerate
2015-03-25 15:45:36 +10:30
Adhemerval Zanella
5ca10a0c9a powerpc: Remove HAVE_ASM_GLOBAL_DOT_NAME define
With AIX port deprecated there is no need to check/define
HAVE_ASM_GLOBAL_DOT_NAME anymore since the current minimum binutils
supported (2.22) does not emit global symbol with dot.

This patch removes all the HAVE_ASM_GLOBAL_DOT_NAME definition and
checks for powerpc64 port.
2015-03-11 09:01:05 -04:00
H.J. Lu
209826bcf2 Replace ELF_RTYPE_CLASS_NOCOPY with ELF_RTYPE_CLASS_COPY
ELF_RTYPE_CLASS_NOCOPY in comments is a typo.  It should be
ELF_RTYPE_CLASS_COPY.

	[BZ #18082]
	* sysdeps/alpha/dl-machine.h (elf_machine_type_class): Replace
	ELF_RTYPE_CLASS_NOCOPY with ELF_RTYPE_CLASS_COPY in comments.
	* sysdeps/arm/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/hppa/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/i386/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/ia64/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/m68k/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/microblaze/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/nios2/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/s390/s390-32/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/s390/s390-64/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/sh/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_type_class):
	Likewise.
	* sysdeps/tile/dl-machine.h (elf_machine_type_class): Likewise.
	* sysdeps/x86_64/dl-machine.h (elf_machine_type_class): Likewise.
2015-03-05 08:40:41 -08:00
Adhemerval Zanella
115e0de72a powerpc: Fix memmove static build
This patch fixes the missing "__memcpy_ppc" symbol for memmove-ppc64
object in static builds.  Since memcpy ifunc is not enabled in static
mode, the specialized symbols are not provided.  The patch changed the
it to just "__memcpy" instead.
2015-02-25 13:25:54 -05:00
Rajalakshmi Srinivasaraghavan
98408b95b1 powerpc: POWER7 strncpy optimization for unaligned string
This patch optimizes strncpy for power7 for unaligned source or
destination address. The source or destination address is aligned
to doubleword and data is shifted based on the alignment and
added with the previous loaded data to be written as a doubleword.
For each load, cmpb instruction is used for faster null check.

The new optimization shows 10 to 70% of performance improvement
for longer string though it does not show big difference on string
size less than 16 due to additional checks.Hence this new algorithm
is restricted to string greater than 16.
2015-02-12 13:16:08 -05:00
Adhemerval Zanella
10169938b1 powerpc: wordcopy/memmove cleanup for ppc32
This patch cleanup some multiarch code related to memmmove
optimization. Initial IFUNC support added specialized wordcopy
symbols which turned in local IFUNC calls used by memmove default
implementation.  The patch removes the internal IFUNC for wordcopy
symbols and uses local branches in the memmmove optimization instead.
2015-02-09 06:42:28 -05:00
Adhemerval Zanella
b269211467 powerpc: wordcopy/memmove cleanup for ppc64
This patch cleanup some multiarch code related to memmmove
optimization. Initial IFUNC support added specialized wordcopy
symbols which turned in local IFUNC calls used by memmove default
implementation.

This change by removing then and used the optimized memmove instead
for supported chips.
2015-02-09 06:42:28 -05:00
Adhemerval Zanella
18e270aada powerpc: Remove POWER7 wordcopy ifunc
This patch remove the POWER7 ifunc wordcopy function
(_wordcopy_*_power7), since now GLIBC provides a optimized memmove/bcopy
for POWER7.
2015-02-09 06:42:28 -05:00
Adhemerval Zanella
6f0993a638 powerpc: Simplify bcopy default implementation
This patch simplify the default bcopy symbol for powerpc64 by just using
memmove instead of implementing using the default bcopy.  Since the
symbol is deprecated, it trades speed by code size.
2015-02-09 06:42:28 -05:00
Adhemerval Zanella
3001e54c57 powerpc: multiarch Makefile cleanup for powerpc64
This patch cleanups the multiarch Makefile by putting the wide chars
implementation to correct wcsmbs rule.
2015-02-09 06:42:27 -05:00
Adhemerval Zanella
08cee2a464 powerpc: Fix fsqrt build in libm [BZ#16576]
Some powerpc64 processors (e5500 core for instance) does not provide the
fsqrt instruction, however current check to use in math_private.h is
__WORDSIZE and _ARCH_PWR4 (ISA 2.02).  This is patch change it to use
the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to
decide whether to generate fsqrt instruction).

It fixes BZ#16576.
2015-01-28 05:59:16 -05:00
Adhemerval Zanella
bea5801360 powerpc: Fix powerpc64 build failure with binutils 2.22
GLIBC memset optimization for POWER8 uses the '.machine power8'
directive, which is only supported officially on binutils 2.24+.  This
causes a build failure on older binutils.

Since the requirement of .machine power8 is to correctly assembly the
'mtvsrd' instruction and it is already handled by the MTVSRD_V1_R4
macro, there is no really needed of using it.

The patch replaces the power8 with power7 for .machine directive.

It fixes BZ#17869.
2015-01-24 08:40:04 -05:00
Adhemerval Zanella
0e87343e20 powerpc: Fix ifuncmain6pie failure with GCC 4.9
This patch fix the elf/ifuncmain6pie failure when building with GCC
4.9+.  For some reason, the compiler removes the branch taken code at
resolve_ifunc (sysdeps/powerpc/powerpc64/dl-machine.h) as dead-code
and thus the testcase fails because the ifunc resolves branches to an
invalid memory location.  It fixes by explicit adding a dependency of
value based on odp variable to avoid compiler optimization.

It fixes BZ#17868.
2015-01-24 08:38:39 -05:00
Adhemerval Zanella
ce6615c9c6 powerpc: Fix POWER7/PPC64 performance regression on LE
This patch fixes a performance regression on the POWER7/PPC64 memcmp
porting for Little Endian.  The LE code uses 'ldbrx' instruction to read
the memory on byte reversed form, however ISA 2.06 just provide the indexed
form which uses a register value as additional index, instead of a fixed value
enconded in the instruction.

And the port strategy for LE uses r0 index value and update the address
value on each compare loop interation.  For large compare size values,
it adds 8 more instructions plus some more depending of trailing
size.  This patch fixes it by adding pre-calculate indexes to remove the
address update on loops and tailing sizes.

For large sizes it shows a considerable gain, with double performance
pairing with BE.
2015-01-13 14:35:40 -05:00
Adhemerval Zanella
d3b00f468b powerpc: Optimized strncmp for POWER8/PPC64
This patch adds an optimized POWER8 strncmp.  The implementation focus
on speeding up unaligned cases follwing the ideas of power8 strcmp.

The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
(where sources alignment are different).
2015-01-13 14:35:40 -05:00
Rajalakshmi Srinivasaraghavan
72607db038 powerpc: Optimize POWER7 strcmp trailing checks
This patch optimized the POWER7 trailing check by avoiding using byte
read operations and instead use the doubleword already readed with
bitwise operations.
2015-01-13 14:35:40 -05:00
Adhemerval Zanella
8bedcb5f03 powerpc: Optimized strcmp for POWER8/PPC64
This patch adds an optimized POWER8 strcmp using unaligned accesses.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
2015-01-13 11:28:58 -05:00
Adhemerval Zanella
f06a4faf8a powerpc: Optimized st{r,p}ncpy for POWER8/PPC64
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses.
It shows 10%-80% improvement over the optimized POWER7 one that uses
only aligned accesses, specially on unaligned inputs.

The algorithm first read and check 16 bytes (if inputs do not cross a 4K
page size).  The it realign source to 16-bytes and issue a 16 bytes read
and compare loop to speedup null byte checks for large strings.  Also,
different from POWER7 optimization, the null pad is done inline in the
implementation using possible unaligned accesses, instead of realying on
a memset call.  Special case is added for page cross reads.
2015-01-13 11:28:44 -05:00
Adhemerval Zanella
9f2f36e5a9 powerpc: Optimized strncat for POWER7/PPC64
With 3eb38795db (Simplify strncat) the generic algorithms uses
strlen, strnlen, and memcpy.  This is faster than POWER7 current
implementation, especially for unaligned strings (where POWER7 code
uses byte-byte operations).

This patch removes the assembly implementation and uses a multiarch
specialization based on default algorithm calling optimized POWER7
symbols.
2015-01-13 11:28:40 -05:00
Adhemerval Zanella
94c9680945 powerpc: Optimized strcat for POWER8/PPC64
With new optimized strcpy for POWER8, this patch adds an optimized
strcat which uses it along with default implementation at strings/.
2015-01-13 11:28:36 -05:00
Adhemerval Zanella
96d6fd6c40 powerpc: Optimized st{r,p}cpy for POWER8/PPC64
This patch adds an optimized POWER8 strcpy using unaligned accesses.
For strings up to 16 bytes the implementation first calculate the
string size, like strlen, and issues a memcpy.  For larger strings,
source is first aligned to 16 bytes and then tested over a loop that
reads 16 bytes am combine the cmpb results for speedup.  Special case is
added for page cross reads.

It shows 30%-60% improvement over the optimized POWER7 one that uses
only aligned accesses.
2015-01-13 11:28:30 -05:00
Adhemerval Zanella
56cf276381 powerpc: abort transaction in syscalls
Linux kernel powerpc documentation states issuing a syscall inside a
transaction is not recommended and may lead to undefined behavior. It
also states syscalls does not abort transactoin neither they run in
transactional state.

To avoid side-effects being visible outside transactions, GLIBC with
lock elision enabled will issue a transaction abort instruction just
before all syscalls if hardware supports hardware transactions.
2015-01-12 06:32:08 -05:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Rajalakshmi Srinivasaraghavan
f59ad976ed powerpc: POWER7 strcpy optimization for unaligned strings
This patch optimizes strcpy for ppc64/power7 for unaligned source or
destination address.  The source or destination address is aligned
to doubleword and data is shifted based on the alignment and
added with the previous loaded data to be written as a doubleword.
For each load, cmpb instruction is used for faster null check.

The word aligned optimization is also removed, since the new unaligned
code path shows better results handling word-aligned strings.

More combination of unaligned inputs is also added in benchtest
to measure the improvement.The new optimization shows 2 to 80% of
performance improvement for longer string though it does not show
big difference on string size less than 16 due to additional checks.
2014-12-31 14:35:59 -05:00
Joseph Myers
2cfbdb9a27 Fix strftime wcschr namespace (bug 17634).
Use of strftime, a C90 function, ends up bringing in wcschr, which is
not a C90 function.  Although not a conformance bug (C90 reserves
wcs*), this is still contrary to glibc practice of avoiding relying on
those reservations; this patch arranges for the internal uses to use
__wcschr instead, with wcschr being a weak alias.  This is more
complicated than some such patches because of the various IFUNC
definitions of wcschr (which include code redefining libc_hidden_def
in a way that involves creating __GI_wcschr manually and so also needs
to create __GI___wcschr after the change of internal uses to use
__wcschr).

Tested for x86_64 and 32-bit x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).

2014-12-10  Joseph Myers  <joseph@codesourcery.com>
	    Adhemerval Zanella  <azanella@linux.vnet.ibm.com>

	[BZ #17634]
	* wcsmbs/wcschr.c [!WCSCHR] (wcschr): Define as __wcschr.
	Undefine after defining function.  Define as weak alias of
	__wcschr.  Use libc_hidden_weak.
	* include/wchar.h (__wcschr): Declare.  Use libc_hidden_proto.
	* sysdeps/i386/i686/multiarch/wcschr-c.c [IS_IN (libc) && SHARED]
	(libc_hidden_def): Also define __GI___wcschr alias.
	* sysdeps/i386/i686/multiarch/wcschr.S (wcschr): Rename to
	__wcschr and define as weak alias of __wcschr.
	* sysdeps/powerpc/power6/wcschr.c [!WCSCHR] (WCSCHR): Define as
	__wcschr.
	[!WCSCHR] (DEFAULT_WCSCHR): Define.
	[DEFAULT_WCSCHR] (__wcschr): Use libc_hidden_def.
	[DEFAULT_WCSCHR] (wcschr): Define as weak alias of __wcschr.  Use
	libc_hidden_weak.  Do not use libc_hidden_def.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c
	[IS_IN (libc) && SHARED] (libc_hidden_def): Also define
	__GI___wcschr alias.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c
	[IS_IN (libc)] (wcschr): Define as macro expanding to
	__redirect_wcschr.
	[IS_IN (libc)] (__wcschr_ppc): Use __redirect_wcschr in typeof.
	[IS_IN (libc)] (__wcschr_power6): Likewise.
	[IS_IN (libc)] (__wcschr_power7): Likewise.
	[IS_IN (libc)] (__libc_wcschr): New.  Define with libc_ifunc
	instead of wcschr.
	[IS_IN (libc)] (wcschr): Undefine and define as weak alias of
	__libc_wcschr.
	[!IS_IN (libc)] (libc_hidden_def): Do not undefine and redefine.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr.c (wcschr): Rename to
	__wcschr and define as weak alias of __wcschr.  Use
	libc_hidden_builtin_def.
	* sysdeps/x86_64/wcschr.S (wcschr): Rename to __wcschr and define
	as weak alias of __wcschr.  Use libc_hidden_weak.
	* time/alt_digit.c (_nl_get_walt_digit): Use __wcschr instead of
	wcschr.
	* time/era.c (_nl_init_era_entries): Likewise.
	* conform/Makefile (test-xfail-ISO/time.h/linknamespace): Remove
	variable.
	(test-xfail-XPG3/time.h/linknamespace): Likewise.
	(test-xfail-XPG4/time.h/linknamespace): Likewise.
2014-12-10 16:59:02 +00:00
Adhemerval Zanella
0f0a1c82f5 powerpc: Add powerpc64 strpbrk optimization
This patch makes the POWER7 optimized strpbrk generic by using
default doubleword stores to zero the hash, instead of VSX
instructions.  Performance on POWER7/POWER8 does not change.
2014-12-02 13:34:02 -05:00
Adhemerval Zanella
bb2542e0ae powerpc: Add powerpc64 strcspn optimization
This patch makes the POWER7 optimized strcspn generic by using
default doubleword stores to zero the hash, instead of VSX
instructions.  Performance on POWER7/POWER8 does not change.
2014-12-02 07:16:24 -05:00
Adhemerval Zanella
2e8a2de2da powerpc: Add powerpc64 strspn optimization
This patch makes the POWER7 optimized strspn generic by using
default doubleword stores to zero the hash, instead of VSX
instructions. Performance on POWER7/POWER8 machines does not changed.
2014-12-02 07:15:58 -05:00
Rajalakshmi Srinivasaraghavan
a8a7d7d212 powerpc: strtok{_r} optimization for powerpc64
This patch optimizes strtok and strtok_r for POWERPC64.
A table of 256 characters is created and marked based on
the 'accept' argument and used to check for any occurance on
the input string.Loop unrolling is also used to gain improvements.
2014-12-01 09:03:58 -05:00
Adhemerval Zanella
704f794714 powerpc: Fix missing barriers in atomic_exchange_and_add_{acq,rel}
On powerpc, atomic_exchange_and_add is implemented without any
barriers.  This patchs adds the missing instruction and memory barrier
for acquire and release semanthics.
2014-11-26 07:06:28 -05:00
Anton Blanchard
5fbb569186 powerpc: Fix __arch_compare_and_exchange_bool_64_rel
Fix a typo in the inline assembly.
2014-11-25 07:28:28 -05:00
Siddhesh Poyarekar
4f41c682f3 Remove NOT_IN_libc
Replace with !IS_IN (libc).  This completes the transition from
the IS_IN/NOT_IN macros to the IN_MODULE macro set.

The generated code is unchanged on x86_64.

	* stdlib/isomac.c (fmt): Replace NOT_IN_libc with IN_MODULE.
	(get_null_defines): Adjust.
	* sunrpc/Makefile: Adjust comment.
	* Makerules (CPPFLAGS-nonlib): Remove NOT_IN_libc.
	* elf/Makefile (CPPFLAGS-sotruss-lib): Likewise.
	(CFLAGS-interp.c): Likewise.
	(CFLAGS-ldconfig.c): Likewise.
	(CPPFLAGS-.os): Likewise.
	* elf/rtld-Rules (rtld-CPPFLAGS): Likewise.
	* extra-lib.mk (CPPFLAGS-$(lib)): Likewise.
	* extra-modules.mk (extra-modules.mk): Likewise.
	* iconv/Makefile (CPPFLAGS-iconvprogs): Likewise.
	* locale/Makefile (CPPFLAGS-locale_programs): Likewise.
	* malloc/Makefile (CPPFLAGS-memusagestat): Likewise.
	* nscd/Makefile (CPPFLAGS-nscd): Likewise.
	* nss/Makefile (CPPFLAGS-nss_test1): Likewise.
	* stdlib/Makefile (CFLAGS-tst-putenvmod.c): Likewise.
	* sysdeps/gnu/Makefile ($(objpfx)errlist-compat.c): Likewise.
	* sysdeps/unix/sysv/linux/Makefile (CPPFLAGS-lddlibc4): Likewise.
	* iconvdata/Makefile (CPPFLAGS): Likewise.
	(cpp-srcs-left): Add libof for all iconvdata routines.
	* bits/stdio-lock.h: Replace NOT_IN_libc with IS_IN.
	* include/assert.h: Likewise.
	* include/ctype.h: Likewise.
	* include/errno.h: Likewise.
	* include/libc-symbols.h: Likewise.
	* include/math.h: Likewise.
	* include/netdb.h: Likewise.
	* include/resolv.h: Likewise.
	* include/stdio.h: Likewise.
	* include/stdlib.h: Likewise.
	* include/string.h: Likewise.
	* include/sys/stat.h: Likewise.
	* include/wctype.h: Likewise.
	* intl/l10nflist.c: Likewise.
	* libidn/idn-stub.c: Likewise.
	* libio/libioP.h: Likewise.
	* nptl/libc_multiple_threads.c: Likewise.
	* nptl/pthreadP.h: Likewise.
	* posix/regex_internal.h: Likewise.
	* resolv/res_hconf.c: Likewise.
	* sysdeps/arm/armv7/multiarch/memcpy.S: Likewise.
	* sysdeps/arm/memmove.S: Likewise.
	* sysdeps/arm/sysdep.h: Likewise.
	* sysdeps/generic/_itoa.h: Likewise.
	* sysdeps/generic/symbol-hacks.h: Likewise.
	* sysdeps/gnu/errlist.awk: Likewise.
	* sysdeps/gnu/errlist.c: Likewise.
	* sysdeps/i386/i586/memcpy.S: Likewise.
	* sysdeps/i386/i586/memset.S: Likewise.
	* sysdeps/i386/i686/memcpy.S: Likewise.
	* sysdeps/i386/i686/memmove.S: Likewise.
	* sysdeps/i386/i686/mempcpy.S: Likewise.
	* sysdeps/i386/i686/memset.S: Likewise.
	* sysdeps/i386/i686/multiarch/bcopy.S: Likewise.
	* sysdeps/i386/i686/multiarch/bzero.S: Likewise.
	* sysdeps/i386/i686/multiarch/memchr-sse2-bsf.S: Likewise.
	* sysdeps/i386/i686/multiarch/memchr-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/memchr.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcmp-sse4.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcmp-ssse3.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcmp.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcpy-ssse3.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/memrchr-c.c: Likewise.
	* sysdeps/i386/i686/multiarch/memrchr-sse2-bsf.S: Likewise.
	* sysdeps/i386/i686/multiarch/memrchr-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/memrchr.S: Likewise.
	* sysdeps/i386/i686/multiarch/memset-sse2-rep.S: Likewise.
	* sysdeps/i386/i686/multiarch/memset-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/memset.S: Likewise.
	* sysdeps/i386/i686/multiarch/memset_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/rawmemchr.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcat-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcat-ssse3.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcat.S: Likewise.
	* sysdeps/i386/i686/multiarch/strchr-sse2-bsf.S: Likewise.
	* sysdeps/i386/i686/multiarch/strchr-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/strchr.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcmp-sse4.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcmp-ssse3.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcmp.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcpy-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcpy-ssse3.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/strcspn.S: Likewise.
	* sysdeps/i386/i686/multiarch/strlen-sse2-bsf.S: Likewise.
	* sysdeps/i386/i686/multiarch/strlen-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/strlen.S: Likewise.
	* sysdeps/i386/i686/multiarch/strnlen.S: Likewise.
	* sysdeps/i386/i686/multiarch/strrchr-sse2-bsf.S: Likewise.
	* sysdeps/i386/i686/multiarch/strrchr-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/strrchr.S: Likewise.
	* sysdeps/i386/i686/multiarch/strspn.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcschr-c.c: Likewise.
	* sysdeps/i386/i686/multiarch/wcschr-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcschr.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcscmp-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcscmp.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcscpy-c.c: Likewise.
	* sysdeps/i386/i686/multiarch/wcscpy-ssse3.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcscpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcslen-c.c: Likewise.
	* sysdeps/i386/i686/multiarch/wcslen-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcslen.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcsrchr-c.c: Likewise.
	* sysdeps/i386/i686/multiarch/wcsrchr-sse2.S: Likewise.
	* sysdeps/i386/i686/multiarch/wcsrchr.S: Likewise.
	* sysdeps/i386/i686/multiarch/wmemcmp-c.c: Likewise.
	* sysdeps/i386/i686/multiarch/wmemcmp.S: Likewise.
	* sysdeps/ia64/fpu/libm-symbols.h: Likewise.
	* sysdeps/nptl/bits/libc-lock.h: Likewise.
	* sysdeps/nptl/bits/libc-lockP.h: Likewise.
	* sysdeps/nptl/bits/stdio-lock.h: Likewise.
	* sysdeps/posix/closedir.c: Likewise.
	* sysdeps/posix/opendir.c: Likewise.
	* sysdeps/posix/readdir.c: Likewise.
	* sysdeps/posix/rewinddir.c: Likewise.
	* sysdeps/powerpc/novmx-sigjmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/__longjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/bsd-_setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/__longjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memcmp-ppc32.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memcmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memcpy-ppc32.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memcpy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memmove.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memrchr-ppc32.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memrchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memset-ppc32.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/memset.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/rawmemchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strcasecmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strcasecmp_l.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strlen-ppc32.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strncase.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strncase_l.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strncmp-ppc32.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strncmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-ppc32.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/wordcopy.c: Likewise.
	* sysdeps/powerpc/powerpc32/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/__longjmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/bzero.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memcmp-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memcmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memcpy-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memmove-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memmove.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memrchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memset.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpncpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcasecmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcat.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strchrnul.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strcspn.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strlen.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncase.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncase_l.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncat.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncpy-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strncpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strnlen.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strpbrk.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strrchr-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strrchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strspn-ppc64.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/strspn.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wcsrchr.c: Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/wordcopy.c: Likewise.
	* sysdeps/powerpc/powerpc64/setjmp.S: Likewise.
	* sysdeps/s390/s390-32/multiarch/ifunc-resolve.c: Likewise.
	* sysdeps/s390/s390-32/multiarch/memcmp.S: Likewise.
	* sysdeps/s390/s390-32/multiarch/memcpy.S: Likewise.
	* sysdeps/s390/s390-32/multiarch/memset.S: Likewise.
	* sysdeps/s390/s390-64/multiarch/ifunc-resolve.c: Likewise.
	* sysdeps/s390/s390-64/multiarch/memcmp.S: Likewise.
	* sysdeps/s390/s390-64/multiarch/memcpy.S: Likewise.
	* sysdeps/s390/s390-64/multiarch/memset.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy-niagara1.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy-niagara2.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy-niagara4.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memset-niagara1.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memset-niagara4.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memset.S: Likewise.
	* sysdeps/unix/alpha/sysdep.S: Likewise.
	* sysdeps/unix/alpha/sysdep.h: Likewise.
	* sysdeps/unix/make-syscalls.sh: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/alpha/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise.
	* sysdeps/unix/sysv/linux/arm/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/arm/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/getpid.c: Likewise.
	* sysdeps/unix/sysv/linux/hppa/nptl/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/hppa/nptl/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/i386/i486/lowlevellock.S: Likewise.
	* sysdeps/unix/sysv/linux/i386/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/i386/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/i386/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/ia64/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/ia64/sysdep.S: Likewise.
	* sysdeps/unix/sysv/linux/ia64/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/lowlevellock-futex.h: Likewise.
	* sysdeps/unix/sysv/linux/m68k/bits/m68k-vdso.h: Likewise.
	* sysdeps/unix/sysv/linux/m68k/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/m68k/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/mips/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/not-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/longjmp_chk.c: Likewise.
	* sysdeps/unix/sysv/linux/s390/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/sysdep.S: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/sysdep.S: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise.
	* sysdeps/unix/sysv/linux/sh/lowlevellock.S: Likewise.
	* sysdeps/unix/sysv/linux/sh/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/sh/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/sh/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/sh/vfork.S: Likewise.
	* sysdeps/unix/sysv/linux/sparc/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/brk.S: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/tile/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/tile/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/tile/sysdep.h: Likewise.
	* sysdeps/unix/sysv/linux/tile/waitpid.S: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/sysdep-cancel.h: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/sysdep.h: Likewise.
	* sysdeps/wordsize-32/symbol-hacks.h: Likewise.
	* sysdeps/x86_64/memcpy.S: Likewise.
	* sysdeps/x86_64/memmove.c: Likewise.
	* sysdeps/x86_64/memset.S: Likewise.
	* sysdeps/x86_64/multiarch/init-arch.h: Likewise.
	* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
	* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/memcmp.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/memmove.c: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/memset-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/memset.S: Likewise.
	* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
	* sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S: Likewise.
	* sysdeps/x86_64/multiarch/strcat-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/strcat.S: Likewise.
	* sysdeps/x86_64/multiarch/strchr-sse2-no-bsf.S: Likewise.
	* sysdeps/x86_64/multiarch/strchr.S: Likewise.
	* sysdeps/x86_64/multiarch/strcmp-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/strcmp.S: Likewise.
	* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Likewise.
	* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/strcpy.S: Likewise.
	* sysdeps/x86_64/multiarch/strcspn.S: Likewise.
	* sysdeps/x86_64/multiarch/strspn.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscpy-c.c: Likewise.
	* sysdeps/x86_64/multiarch/wcscpy-ssse3.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscpy.S: Likewise.
	* sysdeps/x86_64/multiarch/wmemcmp-c.c: Likewise.
	* sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.
	* sysdeps/x86_64/strcmp.S: Likewise.
2014-11-24 15:03:45 +05:30
Siddhesh Poyarekar
a38484851a Remove IS_IN_rtld
Replace with IS_IN (rtld).  Generated code is unchanged on
x86_64.

        * elf/Makefile (CPPFLAGS-.os): Remove IS_IN_rtld.
        * elf/dl-open.c: Use IS_IN (rtld) instead if IS_IN_rtld.
        * elf/rtld-Rules: Likewise.
        * elf/setup-vdso.h: Likewise.
        * include/assert.h: Likewise.
        * include/bits/stdlib-float.h: Likewise.
        * include/errno.h: Likewise.
        * include/sys/stat.h: Likewise.
        * include/unistd.h: Likewise.
        * sysdeps/aarch64/setjmp.S: Likewise.
        * sysdeps/alpha/setjmp.S: Likewise.
        * sysdeps/arm/__longjmp.S: Likewise.
        * sysdeps/arm/aeabi_unwind_cpp_pr1.c: Likewise.
        * sysdeps/arm/setjmp.S: Likewise.
        * sysdeps/arm/sysdep.h: Likewise.
        * sysdeps/generic/_itoa.h: Likewise.
        * sysdeps/generic/dl-sysdep.h: Likewise.
        * sysdeps/generic/ldsodefs.h: Likewise.
        * sysdeps/i386/dl-tls.h: Likewise.
        * sysdeps/i386/setjmp.S: Likewise.
        * sysdeps/m68k/setjmp.c: Likewise.
        * sysdeps/mach/hurd/dl-execstack.c: Likewise.
        * sysdeps/mach/hurd/opendir.c: Likewise.
        * sysdeps/posix/getcwd.c: Likewise.
        * sysdeps/posix/opendir.c: Likewise.
        * sysdeps/posix/profil.c: Likewise.
        * sysdeps/powerpc/dl-procinfo.h: Likewise.
        * sysdeps/powerpc/powerpc32/fpu/__longjmp-common.S: Likewise.
        * sysdeps/powerpc/powerpc32/fpu/setjmp-common.S: Likewise.
        * sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h: Likewise.
        * sysdeps/powerpc/powerpc32/setjmp-common.S: Likewise.
        * sysdeps/powerpc/powerpc64/__longjmp-common.S: Likewise.
        * sysdeps/powerpc/powerpc64/setjmp-common.S: Likewise.
        * sysdeps/s390/dl-tls.h: Likewise.
        * sysdeps/s390/s390-32/setjmp.S: Likewise.
        * sysdeps/s390/s390-64/setjmp.S: Likewise.
        * sysdeps/sh/sh3/setjmp.S: Likewise.
        * sysdeps/sh/sh4/setjmp.S: Likewise.
        * sysdeps/unix/alpha/sysdep.h: Likewise.
        * sysdeps/unix/arm/sysdep.S: Likewise.
        * sysdeps/unix/i386/sysdep.S: Likewise.
        * sysdeps/unix/sysv/linux/aarch64/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/getcwd.c: Likewise.
        * sysdeps/unix/sysv/linux/hppa/nptl/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/i386/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/i386/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/ia64/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/ia64/setjmp.S: Likewise.
        * sysdeps/unix/sysv/linux/ia64/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/lowlevellock-futex.h: Likewise.
        * sysdeps/unix/sysv/linux/m68k/bits/m68k-vdso.h: Likewise.
        * sysdeps/unix/sysv/linux/m68k/m68k-helpers.S: Likewise.
        * sysdeps/unix/sysv/linux/microblaze/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/powerpc/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/s390/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/sh/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/sh/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/sparc/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/sparc/sparc64/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/tile/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/tile/sysdep.h: Likewise.
        * sysdeps/unix/sysv/linux/x86_64/lowlevellock.h: Likewise.
        * sysdeps/unix/sysv/linux/x86_64/sysdep.h: Likewise.
        * sysdeps/unix/x86_64/sysdep.S: Likewise.
        * sysdeps/x86_64/setjmp.S: Likewise.
2014-11-24 11:41:48 +05:30
Siddhesh Poyarekar
a109996ef9 Remove IS_IN_libm
Replace with IS_IN (libm). Generated code unchanged on x86_64.

        * include/math.h: Use IS_IN instead of IS_IN_libm.
        * sysdeps/alpha/fpu/s_copysign.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_finitel.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_frexpl.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_isinfl.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_isnanl.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_modfl.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c: Likewise.
        * sysdeps/ieee754/ldbl-128ibm/s_signbitl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_copysignl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_frexpl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_modfl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_scalbnl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise.
        * sysdeps/ieee754/ldbl-64-128/w_scalblnl.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_copysign.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_finite.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_frexp.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_isinf.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_isnan.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_ldexp.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_ldexpl.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_modf.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_scalbln.c: Likewise.
        * sysdeps/ieee754/ldbl-opt/s_scalbn.c: Likewise.
        * sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise.
        * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Likewise.
        * sysdeps/powerpc/powerpc32/fpu/s_copysignl.S: Likewise.
        * sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise.
        * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
        * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
        * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
        * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf.c: Likewise.
        * sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise.
        * sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
        * sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
        * sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
        * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
        * sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
        * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
        * sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
        * sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
        * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
        * sysdeps/sparc/sparc32/fpu/s_signbitl.S: Likewise.
        * sysdeps/sparc/sparc32/sparcv9/fpu/s_isnan.S: Likewise.
        * sysdeps/unix/sysv/linux/alpha/fraiseexcpt.S: Likewise.
2014-11-24 11:41:47 +05:30
Torvald Riegel
1ea339b697 Add arch-specific configuration for C11 atomics support.
This sets __HAVE_64B_ATOMICS if provided.  It also sets
USE_ATOMIC_COMPILER_BUILTINS to true if the existing atomic ops use the
__atomic* builtins (aarch64, mips partially) or if this has been
tested (x86_64); otherwise, this is set to false so that C11 atomics will
be based on the existing atomic operations.
2014-11-20 11:57:38 +01:00
Joseph Myers
c1b0aadcdf Fix build of C mempcpy and stpcpy.
This patch fixes the build of C mempcpy and stpcpy by disabling the
redirection to __mempcpy and __stpcpy asm names if
NO_MEMPCPY_STPCPY_REDIRECT is defined, and defining that macro in the
relevant source files.

Tested for powerpc32 that the build is fixed.

	* include/string.h [NO_MEMPCPY_STPCPY_REDIRECT] (mempcpy): Do not
	redeclare with asm name.
	[NO_MEMPCPY_STPCPY_REDIRECT] (stpcpy): Likewise.
	* string/mempcpy.c (NO_MEMPCPY_STPCPY_REDIRECT): Define before
	including <string.h>.
	* string/stpcpy.c (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c
	[!NOT_IN_libc] (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c
	[!NOT_IN_libc] (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
	[SHARED && !NOT_IN_libc] (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
2014-11-14 13:48:39 +00:00
Joseph Myers
9cf27b8d09 Remove INTDEF / INTUSE / INTVARDEF (bug 14132).
Completing the removal of the obsolete INTDEF / INTUSE mechanism, this
patch removes the final use - that for _dl_starting_up - replacing it
by rtld_hidden_def / rtld_hidden_proto.  Having removed the last use,
the mechanism itself is also removed.

Tested for x86_64 that installed stripped shared libraries are
unchanged by the patch.  (This is not much of a test since this
variable is only defined and used in the !HAVE_INLINED_SYSCALLS case.)

	[BZ #14132]
	* include/libc-symbols.h (INTUSE): Remove macro.
	(INTDEF): Likewise.
	(INTVARDEF): Likewise.
	(_INTVARDEF): Likewise.
	(INTDEF2): Likewise.
	(INTVARDEF2): Likewise.
	* elf/rtld.c [!HAVE_INLINED_SYSCALLS] (_dl_starting_up): Use
	rtld_hidden_def instead of INTVARDEF.
	* sysdeps/generic/ldsodefs.h [IS_IN_rtld]
	(_dl_starting_up_internal): Remove declaration.
	(_dl_starting_up): Use rtld_hidden_proto.
	* elf/dl-init.c [!HAVE_INLINED_SYSCALLS] (_dl_starting_up): Remove
	declaration.
	[!HAVE_INLINED_SYSCALLS] (_dl_starting_up_internal): Likewise.
	(_dl_init) [!HAVE_INLINED_SYSCALLS]: Don't use INTUSE with
	_dl_starting_up.
	* elf/dl-writev.h (_dl_writev): Likewise.
	* sysdeps/powerpc/powerpc64/dl-machine.h [!HAVE_INLINED_SYSCALLS]
	(DL_STARTING_UP_DEF): Use __GI__dl_starting_up instead of
	_dl_starting_up_internal.
2014-11-05 23:35:36 +00:00
Adhemerval Zanella
7110166d4f powerpc: Simplify encoding of POWER8 instruction 2014-11-05 08:01:09 -05:00
Joseph Myers
4243cbea6d Don't use INTDEF/INTUSE with _dl_argv (bug 14132).
Continuing the removal of the obsolete INTDEF / INTUSE mechanism, this
patch replaces its use for _dl_argv with rtld_hidden_data_def and
rtld_hidden_proto.  Some places in .S files that previously used
_dl_argv_internal or INTUSE(_dl_argv) now use __GI__dl_argv directly
(there are plenty of existing examples of such direct use of __GI_*).

A single place in rtld.c previously used _dl_argv without INTUSE,
apparently accidentally, while the rtld_hidden_proto mechanism avoids
such accidential omissions.  As a consequence, this patch *does*
change the contents of stripped ld.so.  However, the installed
stripped shared libraries are identical to those you get if instead of
this patch you change that single _dl_argv use to use INTUSE, without
any other changes.

Tested for x86_64 (testsuite as well as comparison of installed
stripped shared libraries as described above).

	[BZ #14132]
	* sysdeps/generic/ldsodefs.h (_dl_argv): Use rtld_hidden_proto.
	[IS_IN_rtld] (_dl_argv_internal): Do not declare.
	(rtld_progname): Make macro definition unconditional.
	* elf/rtld.c (_dl_argv): Use rtld_hidden_data_def instead of
	INTDEF.
	(dlmopen_doit): Do not use INTUSE with _dl_argv.
	(dl_main): Likewise.
	* elf/dl-sysdep.c (_dl_sysdep_start): Likewise.
	* sysdeps/alpha/dl-machine.h (RTLD_START): Use __GI__dl_argv
	instead of _dl_argv_internal.
	* sysdeps/powerpc/powerpc32/dl-start.S (_dl_start_user): Use
	__GI__dl_argv instead of INTUSE(_dl_argv).
	* sysdeps/powerpc/powerpc64/dl-machine.h (RTLD_START): Use
	__GI__dl_argv instead of _dl_argv_internal.
2014-11-04 17:39:39 +00:00
Adhemerval Zanella
5e4df2848d powerpc: Fix encoding of POWER8 instruction
This patch adds a binary encoding for 'mtvsrd' instruction to avoid
build failures when assembler does not support POWER8.
2014-11-03 07:26:33 -05:00
Torvald Riegel
7f981fc24a powerpc: Change atomic_write_barrier to have release semantics. 2014-10-31 23:26:22 +01:00
Adhemerval Zanella
71ae86478e PowerPC: memset optimization for POWER8/PPC64
This patch adds an optimized memset implementation for POWER8.  For
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to
POWER7 optimized one is used.

For size higher than 255 two strategies are used:

1. If the constant is different than 0, the memory is written with
   altivec vector instruction;

2. If constant is 0, dbcz instructions are used.  The loop is unrolled
   to clear 512 byte at time.

Using vector instructions increases throughput considerable, with a
double performance for sizes larger than 1024.  The dcbz loops unrolls
also shows performance improvement, by doubling throughput for sizes
larger than 8192 bytes.
2014-09-10 07:39:46 -04:00
Adhemerval Zanella
3b473fecdf PowerPC: multiarch bzero cleanup for PPC64
This patch cleanups the multiarch bzero for powerpc64 by remove
the multiarch objects and use instead the the memset embedded
implementation presented in each multiarch optimization.  The
code generate is essentially the same, but the TB_TOCLESS (which
is not essential).
2014-09-10 07:39:46 -04:00
Siddhesh Poyarekar
eb72478a28 Remove unnecessary uses of NOT_IN_libc
If a IS_IN_* macro is defined, then NOT_IN_libc is always defined,
except obviously for IS_IN_libc.  There's no need to check for both.
Verified on x86_64 and i686 that the source is unchanged.

       * include/libc-symbols.h: Remove unnecessary check for
       NOT_IN_libc.
       * nptl/pthreadP.h: Likewise.
       * sysdeps/aarch64/setjmp.S: Likewise.
       * sysdeps/alpha/setjmp.S: Likewise.
       * sysdeps/arm/sysdep.h: Likewise.
       * sysdeps/i386/setjmp.S: Likewise.
       * sysdeps/m68k/setjmp.c: Likewise.
       * sysdeps/posix/getcwd.c: Likewise.
       * sysdeps/powerpc/powerpc32/setjmp-common.S: Likewise.
       * sysdeps/powerpc/powerpc64/setjmp-common.S: Likewise.
       * sysdeps/s390/s390-32/setjmp.S: Likewise.
       * sysdeps/s390/s390-64/setjmp.S: Likewise.
       * sysdeps/sh/sh3/setjmp.S: Likewise.
       * sysdeps/sh/sh4/setjmp.S: Likewise.
       * sysdeps/unix/alpha/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/aarch64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/i386/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/ia64/setjmp.S: Likewise.
       * sysdeps/unix/sysv/linux/ia64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/sh/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/sparc/sparc64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/tile/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/x86_64/sysdep.h: Likewise.
       * sysdeps/x86_64/setjmp.S: Likewise.
2014-08-21 10:26:46 +05:30
Adhemerval Zanella
a53fbd8e6c PowerPC: Fix gprof entry point for LE
This patch fixes the ELFv2 gprof entry point since the ABI
does not define function descriptors.  It fixes BZ#17213.
2014-07-30 09:01:25 -03:00
Adhemerval Zanella
27b75f56c9 PowerPC: Cleanup powerpc memmove
Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be define on memcopy.h there
is no need to specialized powerpc memmove implementation.  This patch
moves the define set to powerpc memcopy and cleanup its definition on
powerpc code.
2014-07-08 09:16:15 -05:00
Adhemerval Zanella
e7f95bb5f0 PowerPC: Fix compiler warnings
This patch fixes some compiler due trailing data in #undef directives
and due missing prototypes.
2014-07-08 09:16:12 -05:00
Adhemerval Zanella
87868c2418 PowerPC: Align power7 memcpy using VSX to quadword
This patch changes power7 memcpy to use VSX instructions only when
memory is aligned to quardword.  It is to avoid unaligned kernel traps
on non-cacheable memory (for instance, memory-mapped I/O).
2014-07-07 15:41:27 -05:00
Adhemerval Zanella
17762f6625 PowerPC: optimized memmove for POWER7/PPC64
This patch adds an optimized memmove optimization for POWER7/powerpc64.
Basically the idea is to use the memcpy for POWER7 on non-overlapped
memory regions and a optimized backward memcpy for memory regions
that overlap (similar to the idea of string/memmove.c).

The backward memcpy algorithm used is similar the one use for memcpy for
POWER7, with adjustments done for alignment.  The difference is memory
is always aligned to 16 bytes before using VSX/altivec instructions.
2014-07-07 15:41:21 -05:00
Richard Henderson
05502548e9 Always provide HP_SMALL_TIMING_AVAIL 2014-07-03 08:38:36 -07:00
Richard Henderson
86e1a7ff92 Unify hp-timing implementations
Provide an hp-timing-common.h for ports to use.
2014-07-03 08:38:30 -07:00
Richard Henderson
428dd03f5a Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead
Without HP_TIMING_ACCUM, dl_hp_timing_overhead is write-only.
If we remove it, there's no point in HP_TIMING_DIFF_INIT either.
2014-07-03 08:38:25 -07:00
Richard Henderson
c39323e9d2 Removing HP_TIMING_ACCUM as unused 2014-07-03 08:38:21 -07:00
Richard Henderson
850e0e032b Removing HP_TIMING_ZERO as unused 2014-07-03 08:38:18 -07:00
Vidya Ranganathan
bc8ea38590 PowerPC: strcat optimization for PPC64/POWER7
This patch adds an ifunc power7 strcat symbol that uses the logic on
sysdeps/powerpc/strcat.c but call power7 strlen/strcpy symbols instead
of default ones.
2014-07-02 14:04:21 -05:00
Siddhesh Poyarekar
4cf5b6d0d7 Fix Wundef warning for ELF_MACHINE_NO_RELA
This patch defines ELF_MACHINE_NO_RELA on all architectures.  Tested
only on x86_64 to verify that the sources before and after are
identical except for two instructions that pass the current line
number in dl-machine.h to assert_fail.
2014-06-26 22:30:40 +05:30
Vidya Ranganathan
e23d3d2690 PowerPC: Optimized strcmp for PPC64/POWER7
Optimization is achieved on 8 byte aligned strings with double word
comparison using cmpb instruction. On unaligned strings loop unrolling
is applied for Power7 gain.
2014-06-11 08:39:31 -05:00
Adhemerval Zanella
ed36bfa18f PowerPC: Fix optimized strncat strlen call
This patch fixes the optimized ppc64/power7 strncat strlen call for
static build without ifunc enabled.  The strlen symbol to call in such
situation is just strlen, instead of __GI_strlen (since the __GI_
alias is just created for shared objects).
2014-06-06 09:37:07 -05:00
Adhemerval Zanella
af121e371d PowerPC: Fix multiarch hypotf PPC64 path
This patch moves the hypotf multiarch implementation to correct path.
2014-05-19 18:06:40 -05:00
Vidya Ranganathan
f360f94a05 PowerPC: strncpy/stpncpy optimization for PPC64/POWER7
The optimization is achieved by following techniques:
  > data alignment [gain from aligned memory access on read/write]
  > POWER7 gains performance with loop unrolling/unwinding
    [gain by reduction of branch penalty].
  > zero padding done by calling optimized memset
2014-05-06 09:54:25 -05:00
Adhemerval Zanella
19c4bec0f4 PowerPC: ifunc improvement for internal calls
This patch changes de default symbol redirection for internal call of
memcpy, memset, memchr, and strlen to the IFUNC resolved ones.  The
performance improvement is noticeable in algorithms that uses these
symbols extensible, like the regex functions.
2014-05-05 13:30:16 -05:00
Adhemerval Zanella
de21c33c06 PowerPC: Fix --disable-multi-arch builds
This patch fixes some powerpc32 and powerpc64 builds with
--disable-multi-arch option along with different --with-cpu=powerN.
It cleanups the Implies directories by removing the multiarch
folder for non multiarch config and also fixing two assembly
implementations: powerpc64/power7/strncat.S that is calling the
wrong strlen; and power8/fpu/s_isnan.S that misses the hidden_def and
weak_alias directives.
2014-04-09 06:22:53 -05:00
Alan Modra
af6b17973c Correct prefetch hint in power7 memrchr.
Typo fix.

	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Correct stream hint.
2014-04-02 13:42:27 +10:30
Alan Modra
483818d768 Fix reference to toc symbol.
https://sourceware.org/ml/binutils/2014-03/msg00033.html removes the
"magic" treatment of symbols defined in a .toc section.

	* sysdeps/powerpc/powerpc64/start.S: Add @toc to toc symbol reference.
2014-04-02 13:40:21 +10:30
Alan Modra
c859b32e9d Fix s_copysign stack temp for PowerPC64 ELFv2
[BZ #16786]
	* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Don't trash stack.
2014-04-01 14:10:22 +10:30
Adhemerval Zanella
757d9dd5c3 PowerPC: Fix little endian enconding for mfvsrd
This patch fixes the MFVSRD_R3_V1 macro that encodes 'mfvsrd  r3,vs1'
(to support old binutils) for little endian.
2014-03-31 08:00:38 -05:00
Adhemerval Zanella
6f23d0939e PowerPC: optimized strpbrk for POWER7
This patch add an optimized strpbrk for POWER7 by using a different
algorithm than default implementation: it constructs a table based on
the 'accept' argument and use this table to check for any occurance on
the input string. The idea is similar as x86_64 uses.
For PowerPC some tunings were added, such as unroll loops and memory
clear using VSX instructions.
2014-03-20 19:46:13 -05:00
Adhemerval Zanella
6eaf95cbfa PowerPC: optimized strcspn for PPC64/POWER7
This patch add a optimized strcspn for POWER7 by using a different
algorithm than default implementation: it constructs a table based on
the 'accept' argument and use this table to check for any occurance
on the input string. The idea is similar as x86_64 uses.
For PowerPC some tunings were added, such as unroll loops and align
stack memory to table to 16 bytes (so VSX clean can ran without
alignment issues).
2014-03-20 11:24:52 -05:00
Adhemerval Zanella
c7de502503 PowerPC: remove wrong roundl implementation for PowerPC64
The roundl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_roundl.S)
returns wrong results for some inputs where first double is a exact
integer and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

By just removing the implementation and make the build select
sysdeps/ieee754/ldbl-128ibm/s_roundl.c instead fixes the failing math.

This fixes 16707.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
98fb27a373 PowerPC: remove wrong nearbyintl implementation for PPC64
The nearbyintl assembly implementation
(sysdeps/powerpc/powerpc64/fpu/s_nearbyintl.S)
returns wrong results for some inputs where first double is a exact
integer and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

By just removing the implementation and make the build select
sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c instead fixes the failing
math.

Fixes BZ#16706.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
374f7f6121 PowerPC: remove wrong ceill implementation for PowerPC64
The ceill assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_ceill.S)
returns wrong results for some inputs where first double is a exact
integer and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

By just removing the implementation and make the build select
sysdeps/ieee754/ldbl-128ibm/s_ceill.c instead fixes the failing math.

Fixes BZ#16701.
2014-03-14 12:54:47 -05:00
Adhemerval Zanella
27c7220a48 PowerPC: Fix strspn for static build
This patch makes the strspn ifunc selector build for static builds.
2014-03-12 06:54:44 -05:00
Adhemerval Zanella
4facea4730 PowerPC: Fix bzero definition for static libc for PPC64
This patch fixes an issue for powerpc64[le] static build where __bzero
is definied in multiple places (memset-ppc64.o and bzero.o). It is now
defined only in bzero.o and memset-ppc64.o only defined __bzero_ppc for
both dynamic and static library.

Fixes BZ#16683.
2014-03-11 09:31:59 -05:00
Vidya Ranganathan
e65caf1f1d PowerPC: strspn optimization for PPC64/POWER7
The optimization is achieved by following techniques:
  > hashing of needle.
  > hashing avoids scanning of duplicate entries in needle across the string.
  > initializing the hash table with Vector instructions (VSX) by quadword access.
  > unrolling when scanning for character in string across hash table.
2014-03-11 08:54:33 -05:00
Adhemerval Zanella
ba9cc0714e PowerPC: strncat optimization for PPC64
The optimization is achieved by following techniques:
1. Doubleword aligned memory access and compares using
   cmpb instruction.
2. Loop unrolling for byte load/store.
3. CPU pre-fetch to avoid cache miss.
2014-03-10 07:25:09 -05:00
Rajalakshmi Srinivasaraghavan
c7debbdfac PowerPC: strrchr optimization for POWER7/PPC64
This patch optimizes strrchr() for ppc64. It uses aligned memory
access along with cmpb instruction and CPU prefetch to avoid
cache misses for speed improvement.
2014-03-03 08:06:41 -06:00
Adhemerval Zanella
fe13a20c37 PowerPC: llround/llroundf POWER8 optimization
This patch add a optimized llround/llroundf implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
1ad8950a3e PowerPC: llrint/llrintf POWER8 optimization
This patch add a optimized llrint/llrintf implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
cac626d60a PowerPC: Optimized finite/finitef for POWER8
This patch add a optimized finite/finitef implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
4393fc119c PowerPC: Optimized isinf/isinff for POWER8
This patch add a optimized isinf/isinff implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
2014-02-27 12:58:33 -06:00
Adhemerval Zanella
487972aea5 PowerPC: Optimized isnan/isnanf for POWER8
This patch add a optimized isnan/isnanf implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
2014-02-27 12:58:32 -06:00
Ondřej Bílka
a1ffb40e32 Use glibc_likely instead __builtin_expect. 2014-02-10 15:07:12 +01:00
Adhemerval Zanella
38f3458175 PowerPC: remove wrong truncl implementation for PowerPC64
The truncl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_truncl.S)
returns wrong results for some inputs where first double is a exact integer
and the precision is determined by second long double.

Checking on implementation comments and history, I am very confident the
assembly implementation was based on a version before commit
5c68d40169 that fixes BZ#2423 (Errors in
long double (ldbl-128ibm) rounding functions in glibc-2.4).

By just removing the implementation and make the build select
sysdeps/ieee754/ldbl-128ibm/s_truncl.c instead it fixes tgammal
issues regarding wrong result sign.
2014-01-08 08:14:48 -06:00
Adhemerval Zanella
d7ad2d9bad PowerPC: Fix compiler warnings
This patch fixes some compile warnings related to extra tokens at
end of #undef directive from multilib patchset.
2014-01-03 13:29:10 -06:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Andreas Schwab
83f5c32d21 Fix uses of CALL_MCOUNT in ppc64 assembler sources 2013-12-19 17:06:48 +01:00
Adhemerval Zanella
42fcb46ce6 PowerPC: multiarch hypot/hypotf for PowerPC64 2013-12-13 15:38:01 -05:00
Adhemerval Zanella
83efded424 PowerPC: multiarch modf/modff for PowerPC64 2013-12-13 15:37:23 -05:00
Adhemerval Zanella
43e246d2a6 PowerPC: multiarch logb/logbl/logbf for PowerPC64 2013-12-13 15:36:33 -05:00
Adhemerval Zanella
8fdad12379 PowerPC: multiarch isinf/isinff for PowerPC64 2013-12-13 15:35:44 -05:00
Adhemerval Zanella
1481d7066c PowerPC: multiarch finite/finitef for PowerPC64 2013-12-13 15:34:52 -05:00
Adhemerval Zanella
5ccd5fc893 PowerPC: multiarch llrint/lrint for PowerPC64 2013-12-13 15:33:54 -05:00
Adhemerval Zanella
2568f3fa69 PowerPC: multiarch copysign/copysignf for PowerPC64 2013-12-13 15:32:58 -05:00
Adhemerval Zanella
1cb341fd78 PowerPC: multiarch trunc/truncf for PowerPC64 2013-12-13 15:30:57 -05:00
Adhemerval Zanella
59a3e194f7 PowerPC: multiarch round/roundf for PowerPC64 2013-12-13 15:06:01 -05:00
Adhemerval Zanella
357fd3b40a PowerPC: multiarch floor/floorf for PowerPC64 2013-12-13 15:04:04 -05:00
Adhemerval Zanella
96770f12b0 PowerPC: multiarch ceil/ceilf for PowerPC64 2013-12-13 15:02:32 -05:00
Adhemerval Zanella
c3627f6e96 PowerPC: multiarch llround/lround for PowerPC64 2013-12-13 15:01:54 -05:00
Adhemerval Zanella
b2284ad7cf PowerPC: multiarch isnan/isnanf for PowerPC64 2013-12-13 15:01:10 -05:00
Adhemerval Zanella
69bbc63d88 PowerPC: Adjust multiarch Implies for PowerPC64
This patch adds Implies files on multiarch folder for POWER chips so
multirach is enabled when building with --with-cpu and powerN
option.
2013-12-13 14:58:02 -05:00
Adhemerval Zanella
c24517c9dd PowerPC: Cleaning up uneeded sqrt routines
For PPC64, all the wrappers at sysdeps are superfluous: they are
basically the same implementation from math/w_sqrt.c with the
'#ifdef _IEEE_LIBM'. And the power4 version just force the 'fsqrt'
instruction utilization with an inline assembly, which is already
handled by math_private.h __ieee754_sqrt implementation.
2013-12-13 14:56:09 -05:00
Adhemerval Zanella
a52374e82b PowerPC: multiarch stpcpy for PowerPC64 2013-12-13 14:55:22 -05:00
Adhemerval Zanella
7f5ec11336 PowerPC: multiarch strcpy for PowerPC64 2013-12-13 14:54:41 -05:00
Adhemerval Zanella
e28bcd427b PowerPC: multiarch wordcopy for PowerPC64 2013-12-13 14:54:08 -05:00
Adhemerval Zanella
92cacfce7d PowerPC: multiarch wcscpy for PowerPC64. 2013-12-13 14:53:25 -05:00
Adhemerval Zanella
7b714620a7 PowerPC: multiarch wcsrchr for PowerPC64 2013-12-13 14:52:48 -05:00
Adhemerval Zanella
16fd2ae37c PowerPC: multiarch wcschr for PowerPC64 2013-12-13 14:51:36 -05:00
Adhemerval Zanella
9ee2969b05 PowerPC: multiarch strchrnul for PowerPC64 2013-12-13 14:50:26 -05:00
Adhemerval Zanella
372dc060e0 PowerPC: multiarch strchr for PowerPC64 2013-12-13 14:49:54 -05:00
Adhemerval Zanella
24c2c3b996 PowerPC: multiarch strncmp for PowerPC64 2013-12-13 14:48:48 -05:00
Adhemerval Zanella
1c92d9a0e0 PowerPC: multiarch strncasecmp for PowerPC64 2013-12-13 14:40:28 -05:00
Adhemerval Zanella
17de3ee3c1 PowerPC: multiarch strcasecmp for PowerPC64 2013-12-13 14:39:51 -05:00
Adhemerval Zanella
62982bf978 PowerPC: multiarch strnlen for PowerPC64 2013-12-13 14:38:50 -05:00
Adhemerval Zanella
a65f4904ab PowerPC: multiarch strlen for PowerPC64 2013-12-13 14:38:17 -05:00
Adhemerval Zanella
1fd005ad2f PowerPC: multiarch rawmemchr for PowerPC64 2013-12-13 14:37:26 -05:00
Adhemerval Zanella
cd05ba9135 PowerPC: multiarch memrchr for PowerPC64 2013-12-13 14:36:50 -05:00
Adhemerval Zanella
870f867648 PowerPC: multiarch memchr for PowerPC64 2013-12-13 14:35:28 -05:00
Adhemerval Zanella
f00be62b08 PowerPC: multiarch mempcpy for PowerPC64 2013-12-13 14:34:06 -05:00
Adhemerval Zanella
8a29a3d00b PowerPC: multiarch memset/bzero for PowerPC64 2013-12-13 14:33:16 -05:00
Adhemerval Zanella
07253fcf7b PowerPC: multirach memcmp for PowerPC64 2013-12-13 14:32:31 -05:00
Adhemerval Zanella
b5beafbcee PowerPC: multiarch memcpy for PowerPC64 2013-12-13 14:31:41 -05:00
Adhemerval Zanella
5e6a4d4b9e PowerPC: Adjust multiarch Implies for PowerPC64
This patch adds Implies files on multiarch folder for POWER chips so
multirach is enabled when building with --with-cpu and powerN
option.
2013-12-13 14:29:27 -05:00
Adhemerval Zanella
24eeafdb44 PowerPC: Optimized mpn functions for PowerPC64/POWER7
This patch add optimized __mpn_add_n/__mpn_sub_n for PowerPC64/POWER7.
They are originally from GMP with adjustments for GLIBC.
2013-12-06 11:52:31 -06:00
Adhemerval Zanella
4a2c0fd44d PowerPC: Optimized mpn functions for PowerPC64
This patch add optimized __mpn_addmul, __mpn_addsub, __mpn_lshift, and
__mpn_mul_1 implementations for PowerPC64. They are originally from GMP
with adjustments for GLIBC.
2013-12-06 11:52:22 -06:00
Adhemerval Zanella
2d9470b2ae PowerPC: multiarch logb/logbf/logbl for PowerPC32 2013-12-06 05:47:05 -06:00
Adhemerval Zanella
ea5a72f882 PowerPC: multiarch wordcopy routines for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
93be09e725 PowerPC: multiarch wcscpy for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
bb04e529f6 PowerPC: multiarch wcsrchr for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
05b5cd1ce5 PowerPC: multiarch wcschr for PowerPC32 2013-12-06 05:47:02 -06:00
Adhemerval Zanella
eb5ad6b9bc PowerPC: Add systemtap static probe points in setjmp/longjmp
This patch add static probes for setjmp/longjmp in the way gdb expects,fixing
the gdb.base/longjmp.exp gdb testcases.

It changes the symbol_name and use macros to to avoid change the probe names
and ending up adding more logic on GDB (since with the expected name
GDB work seamlessly).
2013-12-05 07:44:07 -06:00
Ulrich Weigand
61cd8fe401 PowerPC64 ELFv2 ABI 5/6: LD_AUDIT interface changes
The ELFv2 ABI changes the calling convention by passing and returning
structures in registers in more cases than the old ABI:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01145.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01147.html

For the most part, this does not affect glibc, since glibc assembler
files do not use structure parameters / return values.  However, one
place is affected: the LD_AUDIT interface provides a structure to
the audit routine that contains all registers holding function
argument and return values for the intercepted PLT call.

Since the new ABI now sometimes uses registers to return values
that were never used for this purpose in the old ABI, this structure
has to be extended.  To force audit routines to be modified for the
new ABI if necessary, the patch defines v2 variants of the la_ppc64
types and routines.

In addition, the patch contains two unrelated changes to the
PLT trampoline routines: it fixes a bug where FPR return values
were stored in the wrong place, and it removes the unnecessary
save/restore of CR.
2013-12-04 07:41:39 -06:00
Ulrich Weigand
8b8a692cfd PowerPC64 ELFv2 ABI 4/6: Stack frame layout changes
This updates glibc for the changes in the ELFv2 relating to the
stack frame layout.  These are described in more detail here:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01149.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01146.html

Specifically, the "compiler and linker doublewords" were removed,
which has the effect that the save slot for the TOC register is
now at offset 24 rather than 40 to the stack pointer.

In addition, a function may now no longer necessarily assume that
its caller has set up a 64-byte register save area its use.

To address the first change, the patch goes through all assembler
files and replaces immediate offsets in instructions accessing the
ABI-defined stack slots by symbolic offsets.  Those already were
defined in ucontext_i.sym and used in some of the context routines,
but that doesn't really seem like the right place for those defines.

The patch instead defines those symbolic offsets in sysdeps.h,
in two variants for the old and new ABI, and uses them systematically
in all assembler files, not just the context routines.

The second change only affected a few assembler files that used
the save area to temporarily store some registers.  In those
cases where this happens within a leaf function, this patch
changes the code to store those registers to the "red zone"
below the stack pointer.  Otherwise, the functions already allocate
a stack frame, and the patch changes them to add extra space in
these frames as temporary space for the ELFv2 ABI.
2013-12-04 07:41:39 -06:00
Ulrich Weigand
122b66defd PowerPC64 ELFv2 ABI 3/6: PLT local entry point optimization
This is a follow-on to the previous patch to support the ELFv2 ABI in the
dynamic loader, split off into its own patch since it is just an optional
optimization.

In the ELFv2 ABI, most functions define both a global and a local entry
point; the local entry requires r2 to be already set up by the caller
to point to the callee's TOC; while the global entry does not require
the caller to know about the callee's TOC, but it needs to set up r12
to the callee's entry point address.

Now, when setting up a PLT slot, the dynamic linker will usually need
to enter the target function's global entry point.  However, if the
linker can prove that the target function is in the same DSO as the
PLT slot itself, and the whole DSO only uses a single TOC (which the
linker will let ld.so know via a DT_PPC64_OPT entry), then it is
possible to actually enter the local entry point address into the
PLT slot, for a slight improvement in performance.

Note that this uncovered a problem on the first call via _dl_runtime_resolve,
because that routine neglected to restore the caller's TOC before calling
the target function for the first time, since it assumed that function
would always reload its own TOC anyway ...
2013-12-04 07:41:38 -06:00
Ulrich Weigand
696caf1d00 PowerPC64 ELFv2 ABI 2/6: Remove function descriptors
This patch adds support for the ELFv2 ABI feature to remove function
descriptors.  See this GCC patch for in-depth discussion:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01141.html

This mostly involves two types of changes: updating assembler source
files to the new logic, and updating the dynamic loader.

After the refactoring in the previous patch, most of the assembler source
changes can be handled simply by providing ELFv2 versions of the
macros in sysdep.h.   One somewhat non-obvious change is in __GI__setjmp:
this used to "fall through" to the immediately following __setjmp ENTRY
point.  This is no longer safe in the ELFv2 since ENTRY defines both
a global and a local entry point, and you cannot simply fall through
to a global entry point as it requires r12 to be set up.

Also, makecontext needs to be updated to set up registers according to
the new ABI for calling into the context's start routine.

The dynamic linker changes mostly consist of removing special code
to handle function descriptors.  We also need to support the new PLT
and glink format used by the the ELFv2 linker, see:
https://sourceware.org/ml/binutils/2013-10/msg00376.html

In addition, the dynamic linker now verifies that the dynamic libraries
it loads match its own ABI.

The hack in VDSO_IFUNC_RET to "synthesize" a function descriptor
for vDSO routines is also no longer necessary for ELFv2.
2013-12-04 07:41:38 -06:00
Ulrich Weigand
d31beafa8e PowerPC64 ELFv2 ABI 1/6: Code refactoring
This is the first patch to support the new ELFv2 ABI in glibc.

As preparation, this patch simply refactors some of the powerpc64 assembler
code to move all code related to creating function descriptors (.opd section)
or using function descriptors (function pointer call) into a central place
in sysdep.h.

Note that most locations creating .opd entries were already using macros
in sysdep.h, this patch simply extends this to the remaining places.

No relevant change in generated code expected.
2013-12-04 07:41:38 -06:00
Alan Modra
7ec07d9a7b PowerPC64: Report overflow on @h and @ha relocations
This patch updates glibc in accordance with the binutils patch checked in here:
https://sourceware.org/ml/binutils/2013-10/msg00372.html

This changes the various R_PPC64_..._HI and _HA relocations to report
32-bit overflows.  The motivation is that existing uses of @h / @ha
are to build up 32-bit offsets (for the "medium model" TOC access
that GCC now defaults to), and we'd really like to see failures at
link / load time rather than silent truncations.

For those rare cases where a modifier is needed to build up a 64-bit
constant, new relocations _HIGH / _HIGHA are supported.

The patch also fixes a bug in overflow checking for the R_PPC64_ADDR30
and R_PPC64_ADDR32 relocations.
2013-12-04 07:41:37 -06:00
Mike Frysinger
cb8a6dbd17 rename configure.in to configure.ac
Autoconf has been deprecating configure.in for quite a long time.
Rename all our configure.in and preconfigure.in files to .ac.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-10-30 17:32:08 +10:00
Adhemerval Zanella
69f13dbf06 PowerPC: strcpy/stpcpy optimization for PPC64/POWER7
This patch intends to unify both strcpy and stpcpy implementationsi
for PPC64 and PPC64/POWER7. The idead default powerpc64 implementation
is to provide both doubleword and word aligned memory access.

For PPC64/POWER7 is also provide doubleword and word memory access,
remove the branch hints, use the cmpb instruction for compare
doubleword/words, and add an optimization for inputs of same alignment.
2013-10-25 13:28:24 -05:00
Alan Modra
4cb81307b3 Use stdint.h types in union unaligned.
* sysdeps/powerpc/powerpc32/dl-machine.c (__process_machine_rela):
	Use stdint types in rather than __attribute__((mode())).
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
2013-10-04 12:51:11 +09:30
Alan Modra
f8e3e9f31b Correct little-endian relocation of UADDR64,32,16.
* sysdeps/powerpc/powerpc32/dl-machine.c (__process_machine_rela):
	Correct handling of unaligned relocs for little-endian.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
2013-10-04 11:33:12 +09:30
Alan Modra
466b039332 PowerPC LE memchr and memrchr
http://sourceware.org/ml/libc-alpha/2013-08/msg00105.html

Like strnlen, memchr and memrchr had a number of defects fixed by this
patch as well as adding little-endian support.  The first one I
noticed was that the entry to the main loop needlessly checked for
"are we done yet?" when we know the size is large enough that we can't
be done.  The second defect I noticed was that the main loop count was
wrong, which in turn meant that the small loop needed to handle an
extra word.  Thirdly, there is nothing to say that the string can't
wrap around zero, except of course that we'd normally hit a segfault
on trying to read from address zero.  Fixing that simplified a number
of places:

-	/* Are we done already?  */
-	addi    r9,r8,8
-	cmpld	r9,r7
-	bge	L(null)

becomes

+	cmpld	r8,r7
+	beqlr

However, the exit gets an extra test because I test for being on the
last word then if so whether the byte offset is less than the end.
Overall, the change is a win.

Lastly, memrchr used the wrong cache hint.

	* sysdeps/powerpc/powerpc64/power7/memchr.S: Replace rlwimi with
	insrdi.  Make better use of reg selection to speed exit slightly.
	Schedule entry path a little better.  Remove useless "are we done"
	checks on entry to main loop.  Handle wrapping around zero address.
	Correct main loop count.  Handle single left-over word from main
	loop inline rather than by using loop_small.  Remove extra word
	case in loop_small caused by wrong loop count.  Add little-endian
	support.
	* sysdeps/powerpc/powerpc32/power7/memchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise.  Use proper
	cache hint.
	* sysdeps/powerpc/powerpc32/power7/memrchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Add little-endian
	support.  Avoid rlwimi.
	* sysdeps/powerpc/powerpc32/power7/rawmemchr.S: Likewise.
2013-10-04 10:41:46 +09:30
Alan Modra
3be87c77d2 PowerPC LE memset
http://sourceware.org/ml/libc-alpha/2013-08/msg00104.html

One of the things I noticed when looking at power7 timing is that rlwimi
is cracked and the two resulting insns have a register dependency.
That makes it a little slower than the equivalent rldimi.

	* sysdeps/powerpc/powerpc64/memset.S: Replace rlwimi with
        insrdi.  Formatting.
	* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/memset.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/memset.S: Likewise.
2013-10-04 10:41:35 +09:30
Alan Modra
759cfef3ac PowerPC LE memcpy
http://sourceware.org/ml/libc-alpha/2013-08/msg00103.html

LIttle-endian support for memcpy.  I spent some time cleaning up the
64-bit power7 memcpy, in order to avoid the extra alignment traps
power7 takes for little-endian.  It probably would have been better
to copy the linux kernel version of memcpy.

	* sysdeps/powerpc/powerpc32/power4/memcpy.S: Add little endian support.
	* sysdeps/powerpc/powerpc32/power6/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/mempcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.  Make better
	use of regs.  Use power7 mtocrf.  Tidy function tails.
2013-10-04 10:41:24 +09:30
Alan Modra
fe6e95d717 PowerPC LE memcmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00102.html

This is a rather large patch due to formatting and renaming.  The
formatting changes were to make it possible to compare power7 and
power4 versions of memcmp.  Using different register defines came
about while I was wrestling with the code, trying to find spare
registers at one stage.  I found it much simpler if we refer to a reg
by the same name throughout a function, so it's better if short-term
multiple use regs like rTMP are referred to using their register
number.  I made the cr field usage changes when attempting to reload
rWORDn regs in the exit path to byte swap before comparing when
little-endian.  That proved a bad idea due to the pipelining involved
in the main loop;  Offsets to reload the regs were different first
time around the loop..  Anyway, I left the cr field usage changes in
place for consistency.

Aside from these more-or-less cosmetic changes, I fixed a number of
places where an early exit path restores regs unnecessarily, removed
some dead code, and optimised one or two exits.

	* sysdeps/powerpc/powerpc64/power7/memcmp.S: Add little-endian support.
	Formatting.  Consistently use rXXX register defines or rN defines.
	Use early exit labels that avoid restoring unused non-volatile regs.
	Make cr field use more consistent with rWORDn compares.  Rename
	regs used as shift registers for unaligned loop, using rN defines
	for short lifetime/multiple use regs.
	* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/memcmp.S: Likewise.  Exit with
	addi 1,1,64 to pop stack frame.  Simplify return value code.
	* sysdeps/powerpc/powerpc32/power4/memcmp.S: Likewise.
2013-10-04 10:40:56 +09:30
Alan Modra
664318c3eb PowerPC LE strchr
http://sourceware.org/ml/libc-alpha/2013-08/msg00101.html

Adds little-endian support to optimised strchr assembly.  I've also
tweaked the big-endian code a little.  In power7/strchr.S there's a
check in the tail of the function that we didn't match 0 before
finding a c match, done by comparing leading zero counts.  It's just
as valid, and quicker, to compare the raw output from cmpb.

Another little tweak is to use rldimi/insrdi in place of rlwimi for
the power7 strchr functions.  Since rlwimi is cracked, it is a few
cycles slower.  rldimi can be used on the 32-bit power7 functions
too.

	* sysdeps/powerpc/powerpc64/power7/strchr.S (strchr): Add little-endian
	support.  Correct typos, formatting.  Optimize tail.  Use insrdi
	rather than rlwimi.
	* sysdeps/powerpc/powerpc32/power7/strchr.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strchrnul.S (__strchrnul): Add
	little-endian support.  Correct typos.
	* sysdeps/powerpc/powerpc32/power7/strchrnul.S: Likewise.  Use insrdi
	rather than rlwimi.
	* sysdeps/powerpc/powerpc64/strchr.S (rTMP4, rTMP5): Define.  Use
	in loop and entry code to keep "and." results.
	(strchr): Add little-endian support.  Comment.  Move cntlzd
	earlier in tail.
	* sysdeps/powerpc/powerpc32/strchr.S: Likewise.
2013-10-04 10:40:22 +09:30
Alan Modra
43b8401371 PowerPC LE strcpy
http://sourceware.org/ml/libc-alpha/2013-08/msg00100.html

The strcpy changes for little-endian are quite straight-forward, just
a matter of rotating the last word differently.

I'll note that the powerpc64 version of stpcpy is just begging to be
converted to use 64-bit loads and stores..

	* sysdeps/powerpc/powerpc64/strcpy.S: Add little-endian support:
	* sysdeps/powerpc/powerpc32/strcpy.S: Likewise.
	* sysdeps/powerpc/powerpc64/stpcpy.S: Likewise.
	* sysdeps/powerpc/powerpc32/stpcpy.S: Likewise.
2013-10-04 10:40:11 +09:30
Alan Modra
8a7413f9b0 PowerPC LE strcmp and strncmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00099.html

More little-endian support.  I leave the main strcmp loops unchanged,
(well, except for renumbering rTMP to something other than r0 since
it's needed in an addi insn) and modify the tail for little-endian.

I noticed some of the big-endian tail code was a little untidy so have
cleaned that up too.

	* sysdeps/powerpc/powerpc64/strcmp.S (rTMP2): Define as r0.
	(rTMP): Define as r11.
	(strcmp): Add little-endian support.  Optimise tail.
	* sysdeps/powerpc/powerpc32/strcmp.S: Similarly.
	* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power4/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/strncmp.S: Likewise.
2013-10-04 10:39:52 +09:30
Alan Modra
33ee81de05 PowerPC LE strnlen
http://sourceware.org/ml/libc-alpha/2013-08/msg00098.html

The existing strnlen code has a number of defects, so this patch is more
than just adding little-endian support.  The changes here are similar to
those for memchr.

	* sysdeps/powerpc/powerpc64/power7/strnlen.S (strnlen): Add
	little-endian support.  Remove unnecessary "are we done" tests.
	Handle "s" wrapping around zero and extremely large "size".
	Correct main loop count.  Handle single left-over word from main
	loop inline rather than by using small_loop.  Correct comments.
	Delete "zero" tail, use "end_max" instead.
	* sysdeps/powerpc/powerpc32/power7/strnlen.S: Likewise.
2013-10-04 10:39:42 +09:30
Alan Modra
db9b4570c5 PowerPC LE strlen
http://sourceware.org/ml/libc-alpha/2013-08/msg00097.html

This is the first of nine patches adding little-endian support to the
existing optimised string and memory functions.  I did spend some
time with a power7 simulator looking at cycle by cycle behaviour for
memchr, but most of these patches have not been run on cpu simulators
to check that we are going as fast as possible.  I'm sure PowerPC can
do better.  However, the little-endian support mostly leaves main
loops unchanged, so I'm banking on previous authors having done a
good job on big-endian..  As with most code you stare at long enough,
I found some improvements for big-endian too.

Little-endian support for strlen.  Like most of the string functions,
I leave the main word or multiple-word loops substantially unchanged,
just needing to modify the tail.

Removing the branch in the power7 functions is just a tidy.  .align
produces a branch anyway.  Modifying regs in the non-power7 functions
is to suit the new little-endian tail.

	* sysdeps/powerpc/powerpc64/power7/strlen.S (strlen): Add little-endian
	support.  Don't branch over align.
	* sysdeps/powerpc/powerpc32/power7/strlen.S: Likewise.
	* sysdeps/powerpc/powerpc64/strlen.S (strlen): Add little-endian support.
	Rearrange tmp reg use to suit.  Comment.
	* sysdeps/powerpc/powerpc32/strlen.S: Likewise.
2013-10-04 10:39:32 +09:30
Alan Modra
9b874b2f1e PowerPC ugly symbol versioning
http://sourceware.org/ml/libc-alpha/2013-08/msg00090.html

This patch fixes symbol versioning in setjmp/longjmp.  The existing
code uses raw versions, which results in wrong symbol versioning when
you want to build glibc with a base version of 2.19 for LE.

Note that the merging the 64-bit and 32-bit versions in novmx-lonjmp.c
and pt-longjmp.c doesn't result in GLIBC_2.0 versions for 64-bit, due
to the base in shlib_versions.

	* sysdeps/powerpc/longjmp.c: Use proper symbol versioning macros.
	* sysdeps/powerpc/novmx-longjmp.c: Likewise.
	* sysdeps/powerpc/powerpc32/bsd-_setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/bsd-setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/__longjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc32/mcount.c: Likewise.
	* sysdeps/powerpc/powerpc32/setjmp.S: Likewise.
	* sysdeps/powerpc/powerpc64/setjmp.S: Likewise.
	* nptl/sysdeps/unix/sysv/linux/powerpc/pt-longjmp.c: Likewise.
2013-10-04 10:38:28 +09:30
Anton Blanchard
be1e5d3113 PowerPC LE setjmp/longjmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00089.html

Little-endian fixes for setjmp/longjmp.  When writing these I noticed
the setjmp code corrupts the non volatile VMX registers when using an
unaligned buffer.  Anton fixed this, and also simplified it quite a
bit.

The current code uses boilerplate for the case where we want to store
16 bytes to an unaligned address.  For that we have to do a
read/modify/write of two aligned 16 byte quantities.  In our case we
are storing a bunch of back to back data (consective VMX registers),
and only the start and end of the region need the read/modify/write.

	[BZ #15723]
	* sysdeps/powerpc/jmpbuf-offsets.h: Comment fix.
	* sysdeps/powerpc/powerpc32/fpu/__longjmp-common.S: Correct
	_dl_hwcap access for little-endian.
	* sysdeps/powerpc/powerpc32/fpu/setjmp-common.S: Likewise.  Don't
	destroy vmx regs when saving unaligned.
	* sysdeps/powerpc/powerpc64/__longjmp-common.S: Correct CR load.
	* sysdeps/powerpc/powerpc64/setjmp-common.S: Likewise CR save.  Don't
	destroy vmx regs when saving unaligned.
2013-10-04 10:37:59 +09:30
Anton Blanchard
76a66d510a PowerPC floating point little-endian [14 of 15]
http://sourceware.org/ml/libc-alpha/2013-07/msg00205.html

These all wrongly specified float constants in a 64-bit word.

	* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Correct float constants
	for little-endian.
	* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
2013-10-04 10:36:24 +09:30
Alan Modra
7b88401f3b PowerPC floating point little-endian [12 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00087.html

Fixes for little-endian in 32-bit assembly.

	* sysdeps/powerpc/sysdep.h (LOWORD, HIWORD, HISHORT): Define.
	* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Load little-endian
	words of double from correct stack offsets.
	* sysdeps/powerpc/powerpc32/fpu/s_copysignl.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
	* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Use HISHORT.
	* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
2013-10-04 10:35:43 +09:30
Adhemerval Zanella
5ebbff8fd1 PowerPC: Fix POINTER_CHK_GUARD thread register for PPC64 2013-09-25 13:43:04 -05:00
Carlos O'Donell
c61b4d41c9 BZ #15754: CVE-2013-4788
The pointer guard used for pointer mangling was not initialized for
static applications resulting in the security feature being disabled.
The pointer guard is now correctly initialized to a random value for
static applications. Existing static applications need to be
recompiled to take advantage of the fix.

The test tst-ptrguard1-static and tst-ptrguard1 add regression
coverage to ensure the pointer guards are sufficiently random
and initialized to a default value.
2013-09-23 00:52:09 -04:00
Adhemerval Zanella
5430fc65a1 PowerPC: fix POWER7 memrchr for some large inputs 2013-09-05 09:32:56 -05:00
Ondřej Bílka
f24a6d086b Fix then/than typos. 2013-08-30 18:10:31 +02:00
Ondřej Bílka
c0c3f78afb Fix typos. 2013-08-21 19:48:48 +02:00
Adhemerval Zanella
d400dcac5e PowerPC: fix backtrace to handle signal trampolines
This patch fixes backtrace for PPC32 and PPC64 to correctly handle
signal trampolines. The 'debug/tst-backtrace6.c' also check for
SA_SIGINFO handling, where is triggers another vDSO symbols for PPC32.
2013-08-20 15:05:49 -05:00
Ryan S. Arnold
2f063a6e84 PowerPC: Enable POWER8 platform sans hwcap bits. 2013-06-24 15:33:32 -05:00
Joseph Myers
9c84384cc1 Remove trailing whitespace. 2013-06-05 20:44:03 +00:00
Siddhesh Poyarekar
b937534868 Avoid crashing in LD_DEBUG when program name is unavailable
Resolves: #15465

The program name may be unavailable if the user application tampers
with argc and argv[].  Some parts of the dynamic linker caters for
this while others don't, so this patch consolidates the check and
fallback into a single macro and updates all users.
2013-05-29 21:34:12 +05:30
Adhemerval Zanella
aa630f590c PowerPC: modf optimization fix
This patch fix the 3c0265394d commits
by correctly setting minimum architecture for modf PPC optimization
to power5+ instead of power5 (since only on power5+ round/ceil will
be inline to inline assembly).
2013-04-26 13:00:56 -05:00
Adhemerval Zanella
3c0265394d PowerPC: modf optimization
This patch implements modf/modff optimization for POWER by focus
on FP operations instead of relying in integer ones.
2013-04-23 13:38:52 -05:00
Adhemerval Zanella
60c414c346 PowerPC: remove branch prediction from rint implementation
The branch prediction hints is actually hurts performance in this case.
The assembly implementation make two assumptions: 1. 'fabs (x) < 2^52'
is unlikely and 2. 'x > 0.0' is unlike (if 1. is true). Since it a
general floating point function, expected input is not bounded and then
it is better to let the hardware handle the branches.
2013-04-01 06:36:51 -05:00
Alan Modra
b0f1246ab4 PowerPC: .eh_frame info in crt1.o isn't useful and triggers gold bug 14675.
The .eh_frame info in crt1.o isn't useful and this patch prevents it from
being generated on PowerPC.  It triggers the following gold bug:

http://sourceware.org/bugzilla/show_bug.cgi?id=14675
2013-03-28 12:16:28 -05:00
Siddhesh Poyarekar
6d9145d817 Consolidate copies of mp code in powerpc
Retain a single copy of the mp code in power4 instead of the two
identical copies in powerpc32 and powerpc64.
2013-03-08 11:38:41 +05:30
Siddhesh Poyarekar
ce544b5bda Merge powerpc slowexp.c into generic code 2013-03-07 13:25:02 +05:30
Siddhesh Poyarekar
4cc149fd8e Merge powerpc slowpow.c into generic code 2013-03-07 13:23:07 +05:30
Siddhesh Poyarekar
e6ebd4a7d5 Use an intermediate variable to sum exponents in powerpc __mul and __sqr 2013-03-07 13:18:56 +05:30
Siddhesh Poyarekar
82a9811d29 Use generic mpa.c code for everything except __mul and __sqr 2013-03-07 12:23:29 +05:30
Joseph Myers
2d67d91ac0 Remove powerpc64 bounded-pointers code. 2013-03-06 00:10:21 +00:00
Siddhesh Poyarekar
8d19fe64ee Sync up ppc add_magnitudes and sub_magnitudes with default code 2013-02-28 11:13:05 +05:30
Siddhesh Poyarekar
60f5a8b534 Sync up powerpc __mp_dbl with default code 2013-02-25 12:01:45 +05:30
Siddhesh Poyarekar
8094523147 Mark __inv as static in powerpc 2013-02-21 15:05:28 +05:30
Siddhesh Poyarekar
bab8a695ee Fix whitespace differences between generic and powerpc mpa.c 2013-02-21 14:31:42 +05:30
Siddhesh Poyarekar
4c7a4263af Mark ZERO inputs to __mul as unlikely on powerpc
Syncs up with generic code.
2013-02-21 12:17:29 +05:30
Siddhesh Poyarekar
20cd7fb3ae Copy comment about inner loop from powerpc mpa.c to the default one 2013-02-20 18:56:20 +05:30
Siddhesh Poyarekar
cb57ce6031 Remove redundant return keyword 2013-02-14 15:43:25 +05:30
Siddhesh Poyarekar
d6752ccd69 New __sqr function as a faster special case of __mul 2013-02-14 10:31:09 +05:30
Joseph Myers
70d9946a44 Remove __ptrvalue, __bounded and __unbounded. 2013-02-13 23:30:40 +00:00
Joseph Myers
e782a927c2 Remove BOUNDED_N and BOUNDED_1. 2013-02-01 06:35:29 +00:00
Andreas Schwab
ed689c2f74 Remove use of mpa2.h 2013-01-20 02:05:53 +01:00
Siddhesh Poyarekar
1066a53440 Fix code formatting in mpa.c
This includes the overridden mpa.c in power4.
2013-01-14 21:53:43 +05:30
Siddhesh Poyarekar
e34ab70550 Remove unnecessary local variable mptwo 2013-01-14 21:23:47 +05:30
Andreas Schwab
557eead076 Revert "Use ieee754/dbl-64/wordsize-64 on powerpc64"
This reverts commit 7a9d2c3971.
2013-01-10 10:44:05 +01:00
Andreas Schwab
7a9d2c3971 Use ieee754/dbl-64/wordsize-64 on powerpc64
* sysdeps/ieee754/ldbl-opt/wordsize-64/s_ceil.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_finite.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_floor.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_frexp.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_isinf.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_isnan.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_llround.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_logb.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_lround.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_modf.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_nearbyint.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_remquo.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_rint.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_round.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_scalbln.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_scalbn.c: New file.
	* sysdeps/ieee754/ldbl-opt/wordsize-64/s_trunc.c: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/Implies: Add
	ieee754/ldbl-opt/wordsize-64.
	* sysdeps/powerpc/powerpc64/Implies: Add
	ieee754/dbl-64/wordsize-64.
2013-01-10 09:59:58 +01:00
Siddhesh Poyarekar
950c99ca90 Update comments in mpa.c
Fixed comment style and clearer wording in some cases.
2013-01-09 19:07:15 +05:30
Anton Blanchard
2ccdea26f2 Fix spelling errors in sysdeps/powerpc files. 2013-01-07 11:20:53 -06:00
Siddhesh Poyarekar
fffb407f46 Remove unused __cr and __cpymn 2013-01-04 22:52:12 +05:30
Siddhesh Poyarekar
b18decba11 Fix build failure on power4 processors
The power4-specific mpa.c depended on some global variables that were
removed by earlier patches.  Also, it did not define mpone and mptwo.
2013-01-04 22:05:49 +05:30
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Joseph Myers
f4cf5f2d8b Add script to update copyright notices and reformat some to facilitate its use. 2013-01-01 16:29:10 +00:00
Roland McGrath
48085d142e Fix type-punning warning in powerpc64 gmon-start. 2012-11-30 13:48:39 -08:00
Roland McGrath
b8493de0ec Add missing magic to GLIBC_PROVIDES. 2012-10-09 15:41:30 -07:00
Will Schmidt
15d0da8cb3 Add versions of wcscpy, wcschr, wcsrchr for power6/power7.
Initially based on the versions found in wcsmbs/* ; these files have
been changed by hand unrolling, and adding some additional variables
to allow some read-ahead to occur, which then relieves some of the
wait-for-increment/wait-for-load/wait-for-compare-results pressure
that was slowing down every iteration through the while-loop.

For 64-bit Power7, These changes give an approx 20% throughput boost
for the wcschr and wcsrchr functions; and approx 40% boost for the
wcscpy function.  32-bit improvements appear to be slightly better
with ~ %30 and ~ %45 respectively.  Results for Power6 closely match
those for power7.
2012-08-22 11:04:42 -05:00
Will Schmidt
14a50c9d23 [Powerpc] Tune/optimize powerpc{32,64}/power7/memchr.S.
Assorted tweaking, twisting and tuning to squeeze a few additional cycles
out of the memchr code.   Changes include bypassing the shift pairs
(sld,srd) when they are not required, and unrolling the small_loop that
handles short and trailing strings.

Per scrollpipe data measuring aligned strings for 64-bit, these changes
save between five and eight cycles (9-13% overall) for short strings (<32),
Longer aligned strings see slight improvement of 1-3% due to bypassing the
shifts and the instruction rearranging.
2012-08-21 14:20:55 -05:00
Joseph Myers
3129cfc6ec Move testsuite audit definitions to sysdeps tst-audit.h files. 2012-07-26 11:29:07 +00:00
Adhemerval Zanella
d37cbdaa86 Split tls-macros.h in sysdeps directories.
Split PowerPC definitions in PPC32 and PPC64 headers.
2012-07-19 17:04:04 -03:00
Marek Polacek
3b05db33f6 Remove TLS configure checks. 2012-07-17 23:57:43 +02:00