Commit Graph

909 Commits

Author SHA1 Message Date
Adhemerval Zanella
71ae86478e PowerPC: memset optimization for POWER8/PPC64
This patch adds an optimized memset implementation for POWER8.  For
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to
POWER7 optimized one is used.

For size higher than 255 two strategies are used:

1. If the constant is different than 0, the memory is written with
   altivec vector instruction;

2. If constant is 0, dbcz instructions are used.  The loop is unrolled
   to clear 512 byte at time.

Using vector instructions increases throughput considerable, with a
double performance for sizes larger than 1024.  The dcbz loops unrolls
also shows performance improvement, by doubling throughput for sizes
larger than 8192 bytes.
2014-09-10 07:39:46 -04:00
Adhemerval Zanella
3b473fecdf PowerPC: multiarch bzero cleanup for PPC64
This patch cleanups the multiarch bzero for powerpc64 by remove
the multiarch objects and use instead the the memset embedded
implementation presented in each multiarch optimization.  The
code generate is essentially the same, but the TB_TOCLESS (which
is not essential).
2014-09-10 07:39:46 -04:00
Khem Raj
a78b712d40 Define __GI_fegetenv for e500 libm
generic HAVE_RM_CTX implementation which is used for ppc/e500 as well
has introduced calls to fegetenv which should be resolved internally
with in libm

Signed-off-by: Khem Raj <raj.khem@gmail.com>

	* sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (fegetenv): Add
	libm_hidden_ver.
2014-09-02 21:39:04 +00:00
Siddhesh Poyarekar
eb72478a28 Remove unnecessary uses of NOT_IN_libc
If a IS_IN_* macro is defined, then NOT_IN_libc is always defined,
except obviously for IS_IN_libc.  There's no need to check for both.
Verified on x86_64 and i686 that the source is unchanged.

       * include/libc-symbols.h: Remove unnecessary check for
       NOT_IN_libc.
       * nptl/pthreadP.h: Likewise.
       * sysdeps/aarch64/setjmp.S: Likewise.
       * sysdeps/alpha/setjmp.S: Likewise.
       * sysdeps/arm/sysdep.h: Likewise.
       * sysdeps/i386/setjmp.S: Likewise.
       * sysdeps/m68k/setjmp.c: Likewise.
       * sysdeps/posix/getcwd.c: Likewise.
       * sysdeps/powerpc/powerpc32/setjmp-common.S: Likewise.
       * sysdeps/powerpc/powerpc64/setjmp-common.S: Likewise.
       * sysdeps/s390/s390-32/setjmp.S: Likewise.
       * sysdeps/s390/s390-64/setjmp.S: Likewise.
       * sysdeps/sh/sh3/setjmp.S: Likewise.
       * sysdeps/sh/sh4/setjmp.S: Likewise.
       * sysdeps/unix/alpha/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/aarch64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/i386/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/ia64/setjmp.S: Likewise.
       * sysdeps/unix/sysv/linux/ia64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/sh/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/sparc/sparc64/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/tile/sysdep.h: Likewise.
       * sysdeps/unix/sysv/linux/x86_64/sysdep.h: Likewise.
       * sysdeps/x86_64/setjmp.S: Likewise.
2014-08-21 10:26:46 +05:30
Joseph Myers
898c62f488 Fix powerpc-nofpu __fe_enabled_env and __fe_nonieee_env (bug 17261).
On powerpc, floating-point environment macros are defined as pointers
to constants in the library that contain the bit-patterns of the
desired environment, instead of being magic constants cast to pointer
type.

For soft-float, the bit-patterns used for fenv_t are not laid out the
same as for hard-float.  (e500 has a third layout used; that's not an
ABI issue because these values are only meaningful within a single
process, all of whose glibc libraries must come from the same build of
glibc.)  While the __fe_dfl_env value for soft-float was appropriate
for the soft-float fenv_t representation, the other two constants had
the same bit-patterns as for hard-float.  Those bit patterns had the
effect of having exceptions already raised, causing
math/test-fenv-return to fail; this patch fixes the patterns used.
(__fe_nonieee_env also had exceptions unmasked, though they should be
masked to match hard-float semantics.  Since there is no separate
non-IEEE mode for soft-float, it's most appropriate for
__fe_nonieee_env to be the same as __fe_dfl_env; this patch makes it
an alias.)

Tested for powerpc-nofpu.

	[BZ #17261]
	* sysdeps/powerpc/nofpu/fenv_const.c (__fe_enabled_env): Change
	value to 0.
	(__fe_nonieee_env): Define as an alias for __fe_dfl_env.
2014-08-12 20:31:54 +00:00
Adhemerval Zanella
a53fbd8e6c PowerPC: Fix gprof entry point for LE
This patch fixes the ELFv2 gprof entry point since the ABI
does not define function descriptors.  It fixes BZ#17213.
2014-07-30 09:01:25 -03:00
Andreas Schwab
4a2552c3eb Fix missing newline in test output 2014-07-09 11:07:24 +02:00
Adhemerval Zanella
27b75f56c9 PowerPC: Cleanup powerpc memmove
Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be define on memcopy.h there
is no need to specialized powerpc memmove implementation.  This patch
moves the define set to powerpc memcopy and cleanup its definition on
powerpc code.
2014-07-08 09:16:15 -05:00
Adhemerval Zanella
e7f95bb5f0 PowerPC: Fix compiler warnings
This patch fixes some compiler due trailing data in #undef directives
and due missing prototypes.
2014-07-08 09:16:12 -05:00
Adhemerval Zanella
91f4b564bd PowerPC: Add ifunc tests for memmove
This patch add the missing ifunc tests definition for memmove ppc32
optimization patch (commit 07aedd7).
2014-07-08 09:16:09 -05:00
Adhemerval Zanella
87868c2418 PowerPC: Align power7 memcpy using VSX to quadword
This patch changes power7 memcpy to use VSX instructions only when
memory is aligned to quardword.  It is to avoid unaligned kernel traps
on non-cacheable memory (for instance, memory-mapped I/O).
2014-07-07 15:41:27 -05:00
Adhemerval Zanella
07aedd78b0 PowerPC: optimized memmove for POWER7/PPC32
This patch adds a optimized memmove for power7 by using the optimized
power7 memcpy for forward copying.
2014-07-07 15:41:27 -05:00
Adhemerval Zanella
17762f6625 PowerPC: optimized memmove for POWER7/PPC64
This patch adds an optimized memmove optimization for POWER7/powerpc64.
Basically the idea is to use the memcpy for POWER7 on non-overlapped
memory regions and a optimized backward memcpy for memory regions
that overlap (similar to the idea of string/memmove.c).

The backward memcpy algorithm used is similar the one use for memcpy for
POWER7, with adjustments done for alignment.  The difference is memory
is always aligned to 16 bytes before using VSX/altivec instructions.
2014-07-07 15:41:21 -05:00
Adhemerval Zanella
d6f68bbef4 PowerPC: memmove default implementation cleanup
This patch removes the powerpc specific logic in memmove and instead
include default implementation with MEMCPY_OK_FOR_FWD_MEMMOVE defined.
This lead in a increase performance, since the constraints to use
memcpy in powerpc code are too restrictive and memcpy can be used for
any forward memmove.
2014-07-07 14:46:44 -05:00
Adhemerval Zanella
3f17b03b09 PowerPC: Guard CALL_ELF check for ppc64 only in link.h
This patch fixes powerpc32 undef compiler warnings for _CALL_ELF,
since it is defined only for powerpc64.
2014-07-07 14:46:22 -05:00
Richard Henderson
05502548e9 Always provide HP_SMALL_TIMING_AVAIL 2014-07-03 08:38:36 -07:00
Richard Henderson
86e1a7ff92 Unify hp-timing implementations
Provide an hp-timing-common.h for ports to use.
2014-07-03 08:38:30 -07:00
Richard Henderson
428dd03f5a Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead
Without HP_TIMING_ACCUM, dl_hp_timing_overhead is write-only.
If we remove it, there's no point in HP_TIMING_DIFF_INIT either.
2014-07-03 08:38:25 -07:00
Richard Henderson
c39323e9d2 Removing HP_TIMING_ACCUM as unused 2014-07-03 08:38:21 -07:00
Richard Henderson
850e0e032b Removing HP_TIMING_ZERO as unused 2014-07-03 08:38:18 -07:00
Richard Henderson
7db48f6aab powerpc: Remove dummy hp-timing.h
It's the same as the generic dummy version.
2014-07-03 08:38:15 -07:00
Siddhesh Poyarekar
99f8dc9220 Fix -Wundef warning on PAGE_COPY_THRESHOLD
The PAGE_COPY_THRESHOLD macro is meant to be overridden by
architecture-specific pagecopy.h, but it is currently done only by
mach; all other architectures use the default.  Check to see if the
macro is defined in addition to whether it is set to a non-zero value.
2014-07-03 01:49:43 +05:30
Vidya Ranganathan
bc8ea38590 PowerPC: strcat optimization for PPC64/POWER7
This patch adds an ifunc power7 strcat symbol that uses the logic on
sysdeps/powerpc/strcat.c but call power7 strlen/strcpy symbols instead
of default ones.
2014-07-02 14:04:21 -05:00
Adhemerval Zanella
9b71d0e38c Update powerpc-fpu ULPs. 2014-06-30 17:38:43 -04:00
Joseph Myers
a7672a2f81 Regenerate powerpc-nofpu libm-test-ulps.
This patch regenerates libm-test-ulps for powerpc-nofpu.

	* sysdeps/powerpc/nofpu/libm-test-ulps: Regenerated.
2014-06-30 21:26:49 +00:00
Joseph Myers
f1eafb41fa Remove shlib-versions ABI names support.
shlib-versions files can contain ABI lines that map triplets to a
canonical ABI name.  This name was once used for various purposes
where test baseline files for different ABIs went in a single
directory; now these purposes use sysdeps files, generation of headers
which have per-ABI variants uses abi-variants and related Makefile
variables and the shlib-versions ABI names are unused.  This patch
duly removes those lines and associated build system support for them.

Tested for x86_64 (both a full testsuite run and confirming the
installed shared libraries are unchanged by the patch).

	* Makeconfig ($(common-objpfx)soversions.mk): Do not generate
	abi-name definition.
	* scripts/soversions.awk: Do not handle or generate ABI lines.
	* shlib-versions: Remove ABI entries.
	* sysdeps/powerpc/nofpu/shlib-versions: Remove file.
	* sysdeps/x86_64/x32/shlib-versions: Remove ABI entry.
2014-06-27 20:24:23 +00:00
Siddhesh Poyarekar
4cf5b6d0d7 Fix Wundef warning for ELF_MACHINE_NO_RELA
This patch defines ELF_MACHINE_NO_RELA on all architectures.  Tested
only on x86_64 to verify that the sources before and after are
identical except for two instructions that pass the current line
number in dl-machine.h to assert_fail.
2014-06-26 22:30:40 +05:30
Joseph Myers
3e239be647 Move base_machine and machine settings from configure.ac to sysdeps preconfigure fragments.
This patch makes non-ex-ports architectures set base_machine and
machine based on the original configured machine value in preconfigure
fragments, like ex-ports architectures, rather than in the toplevel
configure.ac.

Tested x86 that the disassembly of installed shared libraries is
unchanged by the patch.

	* configure.ac (base_machine): Do not set specially for particular
	machines here.
	* configure: Regenerated.
	* sysdeps/powerpc/preconfigure: Move machine and base_machine
	settings from configure.ac.
	* sysdeps/i386/preconfigure: New file.
	* sysdeps/s390/preconfigure: Likewise.
	* sysdeps/sh/preconfigure: Likewise.
	* sysdeps/sparc/preconfigure: Likewise.
2014-06-25 17:52:56 +00:00
Adhemerval Zanella
6eaa65cefb Update powerpc-fpu ULPs. 2014-06-25 09:57:39 -05:00
Adhemerval Zanella
db22400947 PowerPC: sync hwcap.h capabilities
Linux commit dd58a092c4202f2bd490adab7285b3ff77f8e467 added the
PPC_FEATURE2_VEC_CRYPTO auvx capability to indicate whether to
hardware supports vector crypto hardware instructions.  This patch
adds its definition to powerpc hwcap bits.
2014-06-23 09:40:05 -05:00
Joseph Myers
9bc6103d04 Include <kernel-features.h> explicitly where required.
This patch makes files using __ASSUME_* macros include
<kernel-features.h> explicitly, rather than relying on some other
header (such as tls.h, lowlevellock.h or pthreadP.h) to include it
implicitly.  (I omitted cases where I've already posted or am testing
the patch that stops the file from needing __ASSUME_* at all.)  This
accords with the general principle of making source files include the
headers for anything they use, and also helps make it safe to remove
<kernel-features.h> includes from any file that doesn't use
__ASSUME_* (some of those may be stray includes left behind after
increasing the minimum kernel version, others may never have been
needed or may have become obsolete after some other change).

Tested x86_64 that the disassembly of installed shared libraries is
unchanged by this patch.

	* nptl/pthread_cond_wait.c: Include <kernel-features.h>.
	* nptl/pthread_rwlock_timedrdlock.c: Likewise.
	* nptl/pthread_rwlock_timedwrlock.c: Likewise.
	* nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c: Likewise.
	* nscd/nscd.c: Likewise.
	* sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
	* sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
	* sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
	* sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
2014-06-20 23:24:00 +00:00
Adhemerval Zanella
556f529dab PowerPC: Move powerpc code out of nptl/ subdirectory 2014-06-17 07:54:22 -05:00
Adhemerval Zanella
31c44fea31 Update powerpc-fpu ULPs. 2014-06-11 21:22:49 -05:00
Vidya Ranganathan
e23d3d2690 PowerPC: Optimized strcmp for PPC64/POWER7
Optimization is achieved on 8 byte aligned strings with double word
comparison using cmpb instruction. On unaligned strings loop unrolling
is applied for Power7 gain.
2014-06-11 08:39:31 -05:00
Adhemerval Zanella
ed36bfa18f PowerPC: Fix optimized strncat strlen call
This patch fixes the optimized ppc64/power7 strncat strlen call for
static build without ifunc enabled.  The strlen symbol to call in such
situation is just strlen, instead of __GI_strlen (since the __GI_
alias is just created for shared objects).
2014-06-06 09:37:07 -05:00
Adhemerval Zanella
bab900166e Update powerpc-fpu ULPs. 2014-05-26 12:40:08 -05:00
Adhemerval Zanella
d298c41635 PowerPC: Remove 64 bits instructions in PPC32 code
This patch replaces the insrdi by insrwi in powerpc32 assembly.
2014-05-26 09:09:21 -05:00
Adhemerval Zanella
32999d63fd PowerPC: Remove unneeded copysign[f] macros
This patch remove the unneeded copysign[f] macro from powerpc
math_private.h, since they are already covered in generic version.
2014-05-22 16:05:19 -05:00
Adhemerval Zanella
3d2badacf1 PowerPC: Fix memchr ifunc hidden symbol for PPC32
This patch fixes a similar issue to
736c304a1a, where for PPC32 if the symbol
is defined as hidden (memchr) then compiler will create a local branc
(symbol@local) and the linker will not create a required PLT call to
make the ifunc work.  It changes the default hidden symbol (__GI_memchr)
to default memchr symbol for powerpc32 (__memchr_ppc32).
2014-05-22 07:53:44 -05:00
Adhemerval Zanella
7c112a3812 Update powerpc-fpu ULPs. 2014-05-20 16:21:51 -05:00
Adhemerval Zanella
e13bccd3de PowerPC: Fix copysignf optimization macro
This patch fixes the __copysignf optimized macro meant to internal libm
usage when used with constant value.  Without the explicit cast to
float, if it is used with const double value (for instance, on
s_casinhf.c) double constants will be used and it may lead to precision
issues in some algorithms.

It fixes the following failures on PPC64/POWER7:

Failure: Test: Real part of: cacos_downward (inf + 0 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf - 0 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf + 0.5 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_downward (inf - 0.5 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf + 0 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf - 0 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf + 0.5 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
Failure: Test: Real part of: cacos_towardzero (inf - 0.5 i)
Result:
 is:          1.19209289550781250000e-07   0x1.00000000000000000000p-23
 should be:   0.00000000000000000000e+00   0x0.00000000000000000000p+0
2014-05-20 16:07:49 -05:00
Adhemerval Zanella
af121e371d PowerPC: Fix multiarch hypotf PPC64 path
This patch moves the hypotf multiarch implementation to correct path.
2014-05-19 18:06:40 -05:00
Vidya Ranganathan
f360f94a05 PowerPC: strncpy/stpncpy optimization for PPC64/POWER7
The optimization is achieved by following techniques:
  > data alignment [gain from aligned memory access on read/write]
  > POWER7 gains performance with loop unrolling/unwinding
    [gain by reduction of branch penalty].
  > zero padding done by calling optimized memset
2014-05-06 09:54:25 -05:00
Adhemerval Zanella
19c4bec0f4 PowerPC: ifunc improvement for internal calls
This patch changes de default symbol redirection for internal call of
memcpy, memset, memchr, and strlen to the IFUNC resolved ones.  The
performance improvement is noticeable in algorithms that uses these
symbols extensible, like the regex functions.
2014-05-05 13:30:16 -05:00
Adhemerval Zanella
dc041bd4db Fix 2014-04-29 07:45:05 -05:00
Adhemerval Zanella
18f2945ae9 PowerPC: Suppress unnecessary FPSCR write
This patch optimizes the FPSCR update on exception and rounding change
functions by just updating its value if new value if different from
current one.  It also optimizes fedisableexcept and feenableexcept by
removing an unecessary FPSCR read.
2014-04-29 07:05:39 -05:00
Adhemerval Zanella
2cd925f743 PowerPC: Add fenv macros for long double
This patch add the missing libc_<function>l_ctx macros for long
double.  Similar for float, they point to default double versions.
2014-04-17 14:01:51 -05:00
Adhemerval Zanella
de21c33c06 PowerPC: Fix --disable-multi-arch builds
This patch fixes some powerpc32 and powerpc64 builds with
--disable-multi-arch option along with different --with-cpu=powerN.
It cleanups the Implies directories by removing the multiarch
folder for non multiarch config and also fixing two assembly
implementations: powerpc64/power7/strncat.S that is calling the
wrong strlen; and power8/fpu/s_isnan.S that misses the hidden_def and
weak_alias directives.
2014-04-09 06:22:53 -05:00
Adhemerval Zanella
8bd70862e1 PowerPC: Fix nearbyint/nearbyintf result for FE_DOWNWARD
This patch fixes the powerpc32 optimized nearbyint/nearbyintf bogus
results for FE_DOWNWARD rounding mode.  This is due wrong instructions
sequence used in the rounding calculation (two subtractions instead of
adition and a subtraction).

Fixes BZ#16815.
2014-04-06 14:58:05 -05:00
Alan Modra
af6b17973c Correct prefetch hint in power7 memrchr.
Typo fix.

	* sysdeps/powerpc/powerpc64/power7/memrchr.S: Correct stream hint.
2014-04-02 13:42:27 +10:30