glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-26 12:41:05 +00:00

Author	SHA1	Message	Date
Carlos Eduardo Seo	67385a01d2	powerpc: Add hwcap/hwcap2/platform data to TCB. This patch adds a new feature for powerpc. In order to get faster access to the HWCAP/HWCAP2 bits and platform number (i.e. for implementing __builtin_cpu_is () / __builtin_cpu_supports () in GCC) without the overhead of reading from the auxiliary vector, we now reserve space for them in the TCB. This is an ABI change for GLIBC 2.23. A new versioned symbol '__parse_hwcap_and_convert_at_platform' is available to get the data from the auxiliary vector and parse it, and store it for later use in the TLS initialization code. This function is called very early (in _dl_sysdep_start () via DL_PLATFORM_INFO for the dynamic linking case, and in __libc_start_main () for the static linking case) to make sure the data is available at the time of TLS initialization. * sysdeps/powerpc/Makefile (sysdep-dl-routines): Add hwcapinfo. (sysdep_routines): Likewise. (sysdep-rtld-routines): Likewise. [$(subdir) = nptl](tests): Add test-get_hwcap and test-get_hwcap-static [$(subdir) = nptl](tests-static): test-get_hwcap-static * sysdeps/powerpc/Versions: Added new __parse_hwcap_and_convert_at_platform symbol to GLIBC-2.23. * sysdeps/powerpc/hwcapinfo.c: New file. (__tcb_parse_hwcap_and_convert_at_platform): New function to initialize and parse hwcap, hwcap2 and platform number information. * sysdeps/powerpc/hwcapinfo.h: New file. Creates global variables to store HWCAP+HWCAP2 and platform number. * sysdeps/powerpc/nptl/tcb-offsets.sym: Added new offsets for HWCAP+HWCAP2 and platform number in the TCB. * sysdeps/powerpc/nptl/tls.h: New functionality. Stores the HWCAP, HWCAP2 and platform number in the TCB. (dtv): Added new fields for HWCAP+HWCAP2 and platform number. (TLS_INIT_TP): Included calls to add the hwcap and at_platform values in the TCB in TP initialization. (TLS_DEFINE_INIT_TP): Likewise. (THREAD_GET_HWCAP): New macro. (THREAD_SET_HWCAP): Likewise. (THREAD_GET_AT_PLATFORM): Likewise. (THREAD_SET_AT_PLATFORM): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h: (dl_platform_init): New function that calls __parse_hwcap_and_convert_at_platform for the dymanic linking case for powerpc32. * sysdeps/powerpc/powerpc64/dl-machine.h: Likewise, for powerpc64. * sysdeps/powerpc/test-get_hwcap-static.c: New file. Testcase for this functionality, static linking case. * sysdeps/powerpc/test-get_hwcap.c: New file. Likewise, dynamic linking case. * sysdeps/unix/sysv/linux/powerpc/libc-start.c: Added call to __parse_hwcap_and_convert_at_platform for the static linking case. * sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Included the new __parse_hwcap_and_convert_at_platform symbol in the ABI list for GLIBC 2.23. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.	2015-12-03 13:56:13 -02:00
Joseph Myers	21378ae0d3	Fix powerpc round, roundf spurious "inexact" (bug 19238). The powerpc hard-float round and roundf functions, both 32-bit and 64-bit, raise spurious "inexact" exceptions for integer arguments from adding 0.5 and rounding to integer toward zero. Since these functions already save and restore the rounding mode, it's natural to make them restore the full floating-point state instead to fix this bug, which this patch does. The save of the state is moved after the first floating-point operation on the input so that any "invalid" exceptions from signaling NaN inputs are properly preserved. As a consequence of this approach to the fix, "inexact" for noninteger arguments (disallowed by TS 18661-1 but not by C99/C11, see bug 15479) is also avoided for these implementations; this is not a general fix for bug 15479 since plenty of other implementations of various functions still raise spurious "inexact" for noninteger arguments. This issue and fix do not apply to builds using power5+ versions of round and roundf, which use the frin instruction and avoid "inexact" exceptions that way. This patch should get hard-float powerpc32 and powerpc64 (default function implementations) back to a state where test-float and test-double will pass after ulps regeneration. Tested for powerpc32 and powerpc64. [BZ #15479] [BZ #19238] * sysdeps/powerpc/powerpc32/fpu/s_round.S (__round): Save floating-point state after first operation on input. Restore full state rather than just rounding mode. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S (__roundf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise.	2015-11-12 19:00:06 +00:00
Joseph Myers	32b71ad358	Fix powerpc64 lround, lroundf, llround, llroundf spurious "inexact" exceptions (bug 19235). Similar to bug 19134 for powerpc32, the powerpc64 implementations of lround, lroundf, llround, llroundf can raise spurious "inexact" exceptions for integer arguments from adding 0.5 then converting to integer (this does not apply to the power5+ version for double, which uses the frin instruction which is defined never to raise "inexact"; I don't know why power5+ doesn't use that version for float as well). This patch fixes the bug in a similar way to the powerpc32 bug, by testing for integers (adding and subtracting 2^52 and comparing with the value before that addition and subtraction) and not adding 0.5 in that case. The powerpc maintainers may wish to look at making power5+ / power6x / power8 use frin for float lround / llround as well as for double, unless there's some reason I've missed that this isn't beneficial. Tested for powerpc64. [BZ #19235] * sysdeps/powerpc/powerpc64/fpu/s_llround.S (__llround): Do not add 0.5 to integer arguments. * sysdeps/powerpc/powerpc64/fpu/s_llroundf.S (__llroundf): Likewise. (.LC2): New object.	2015-11-12 16:24:00 +00:00
Joseph Myers	71d1b0166b	Fix powerpc nearbyint wrongly clearing "inexact" and leaving traps disabled (bug 19228). Similar to bug 15491 recently fixed for x86_64 / x86, the powerpc (both powerpc32 and powerpc64) hard-float implementations of nearbyintf and nearbyint wrongly clear an "inexact" exception that was raised before the function was called; this shows up as failure of the test math/test-nearbyint-except added when that bug was fixed. They also wrongly leave traps on "inexact" disabled if they were enabled before the function was called. This patch fixes the bugs similar to how the x86 bug was fixed: saving and restoring the whole floating-point state, both to restore the original "inexact" flag state and to restore the original state of whether traps on "inexact" were enabled. Because there's a convenient point in the powerpc implementations to save state after any sNaN arguments will have raised "invalid" but before "inexact" traps need to be disabled, no special handling for "invalid" is needed as in the x86 version. Tested for powerpc64 and powerpc32, where it fixes the math/test-nearbyint-except failure as well as fixing the new test math/test-nearbyint-except-2 added by this patch. Also tested for x86_64 and x86 that the new test passes. If powerpc experts see a more efficient way of doing this (e.g. instruction positioning that's better for pipelines on typical processors) then of course followups optimizing the fix are welcome. [BZ #19228] * sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S (__nearbyint): Save and restore full floating-point state. * sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * math/test-nearbyint-except-2.c: New file. * math/Makefile (tests): Add test-nearbyint-except-2.	2015-11-11 00:06:09 +00:00
Gabriel F. T. Gomes	b0f81637d5	PowerPC: Add comments to optimized strncpy * sysdeps/powerpc/powerpc64/power8/strncpy.S: Added comments to some assembly instructions.	2015-10-01 17:36:55 -03:00
Gabriel F. T. Gomes	850713336e	PowerPC: Fix operand prefixes The file sysdeps/powerpc/sysdeps.h defines aliases for register operands, which add the letter 'r' as a prefix to a register name. E.g.: register 20 can be written as 'r20', instead of '20'. On the one hand, this increases readability, as it makes it easier for readers to know whether the operand is a register or an immediate. On the other hand, this permits that immediate operands be written as if they were registers, and vice-versa, thus reducing the readability of the code. This commit removes some of these unintentional misuses. This commit also increases readability of the code by adding the prefix 'cr' to some uses of the control register. Both changes have no effect on the final code. Checked with objdump. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Remove or add register prefix from operands.	2015-10-01 17:36:46 -03:00
Joseph Myers	de071d199a	Move bits/atomic.h to atomic-machine.h (bug 14912). It was noted in <https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the bits/.h naming scheme should only be used for installed headers. This patch renames bits/atomic.h to atomic-machine.h to follow that convention. This is the only change in this series that needs to change the filename rather than simply removing a directory level (because both atomic.h and bits/atomic.h exist at present). Tested for x86_64 (testsuite, and that installed stripped shared libraries are unchanged by the patch). [BZ #14912] sysdeps/aarch64/bits/atomic.h: Move to ... * sysdeps/aarch64/atomic-machine.h: ...here. (_AARCH64_BITS_ATOMIC_H): Rename macro to _AARCH64_ATOMIC_MACHINE_H. * sysdeps/alpha/bits/atomic.h: Move to ... * sysdeps/alpha/atomic-machine.h: ...here. * sysdeps/arm/bits/atomic.h: Move to ... * sysdeps/arm/atomic-machine.h: ...here. Update comments. * bits/atomic.h: Move to ... * sysdeps/generic/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/i386/bits/atomic.h: Move to ... * sysdeps/i386/atomic-machine.h: ...here. * sysdeps/ia64/bits/atomic.h: Move to ... * sysdeps/ia64/atomic-machine.h: ...here. * sysdeps/m68k/coldfire/bits/atomic.h: Move to ... * sysdeps/m68k/coldfire/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ... * sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here. * sysdeps/microblaze/bits/atomic.h: Move to ... * sysdeps/microblaze/atomic-machine.h: ...here. * sysdeps/mips/bits/atomic.h: Move to ... * sysdeps/mips/atomic-machine.h: ...here. (_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H. * sysdeps/powerpc/bits/atomic.h: Move to ... * sysdeps/powerpc/atomic-machine.h: ...here. Update comments. * sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ... * sysdeps/powerpc/powerpc32/atomic-machine.h: ...here. Update comments. Include <atomic-machine.h> instead of <bits/atomic.h>. * sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ... * sysdeps/powerpc/powerpc64/atomic-machine.h: ...here. Include <atomic-machine.h> instead of <bits/atomic.h>. * sysdeps/s390/bits/atomic.h: Move to ... * sysdeps/s390/atomic-machine.h: ...here. * sysdeps/sparc/sparc32/bits/atomic.h: Move to ... * sysdeps/sparc/sparc32/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ... * sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here. * sysdeps/sparc/sparc64/bits/atomic.h: Move to ... * sysdeps/sparc/sparc64/atomic-machine.h: ...here. * sysdeps/tile/bits/atomic.h: Move to ... * sysdeps/tile/atomic-machine.h: ...here. * sysdeps/tile/tilegx/bits/atomic.h: Move to ... * sysdeps/tile/tilegx/atomic-machine.h: ...here. Include <sysdeps/tile/atomic-machine.h> instead of <sysdeps/tile/bits/atomic.h>. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/tile/tilepro/bits/atomic.h: Move to ... * sysdeps/tile/tilepro/atomic-machine.h: ...here. Include <sysdeps/tile/atomic-machine.h> instead of <sysdeps/tile/bits/atomic.h>. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here. Include <sysdeps/arm/atomic-machine.h> instead of <sysdeps/arm/bits/atomic.h>. * sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here. (_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here. * sysdeps/x86_64/bits/atomic.h: Move to ... * sysdeps/x86_64/atomic-machine.h: ...here. * include/atomic.h: Include <atomic-machine.h> instead of <bits/atomic.h>.	2015-09-11 20:00:19 +00:00
Paul E. Murphy	18173559a2	powerpc: Fix tabort usage in syscalls Fix usage of tabort in generated syscalls. r0 has special meaning when used with this instruction, thus it will not generate persistent errors, nor return an error code. This mitigates poor CPU usage when performing elided critical sections. Additionally, transactions should be aborted when entering a user invoked syscall. Otherwise the results of the transaction may be undefined. 2015-08-25 Paul E. Murphy <murphyp@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION): Use register other than r0 for tabort, it has special meaning. * sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION): Likewise * sysdeps/unix.sysv/linux/powerpc/syscall.S (syscall): Abort transaction before starting syscall.	2015-08-25 13:45:56 -03:00
Rajalakshmi Srinivasaraghavan	fe7faec3e5	powerpc: Handle worstcase behavior in strstr() for POWER7 Instead of checking needle length, constant 'n' number of comparisons is checked to fall back to default implementation. This patch is tested on powerpc64 and powerpc64le. 2015-08-25 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/power7/strstr.S: Handle worst case.	2015-08-25 13:45:56 -03:00
Carlos Eduardo Seo	502b91de14	powerpc: make memchr use memchr-power7. In powerpc64, memchr was always pointing to the internal __GI_memchr implementation. This patch fixes that and makes it use the optimized POWER7 version when adequate. * sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c: Make memchr not point to the internal __GI_memchr implementation.	2015-08-21 17:05:40 -03:00
Ondrej Bilka	5011051da3	powerpc: Fix stpcpy performance for power8 This patch fixes the missing enablement for stpcpy on POWER8. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Fix ifunc.	2015-08-11 10:03:10 -03:00
Adhemerval Zanella	6f714aa4ad	powerpc: Fix PPC64/POWER7 conform tests When building with --disable-multi-arch the memmove and strstr POWER7 optimization create and uses symbols that conflict with expect conform tests. * sysdeps/powerpc/powerpc64/power7/memmove.S (bcopy): Changing to __bcopy and add a weak_alias to bcopy. * sysdeps/powerpc/powerpc64/power7/strstr.S (strstr): Use __strnlen for static build.	2015-08-11 10:03:10 -03:00
Adhemerval Zanella	142e0a9953	powerpc: Use default strcpy optimization for POWER7 This patches uses the default strcpy/stpcpy implementation for POWER7/PPC64. This is faster in mostly inputs for benchtests and for multiarch the implementation uses the POWER7 strlen and memcpy. * string/stpcpy.c (__stpcpy): Use STPCPY to redefine symbol name and cleanup macro usage. * string/strcpy.c (strcpt): Use STRCPY to redefine symbol name. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.S: Remove file. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/power7/stpcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise. * sysdeps/powerpc/powerpc64/stpcpy.S: Likewise. * sysdeps/powerpc/powerpc64/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c [SHARED && IS_IN (libc)]: Include <string/strcpy.c>. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c [SHARED && IS_IN (libc)]: Include <string/stpcpy.c>. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.c: New file. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.	2015-08-11 10:03:10 -03:00
Adhemerval Zanella	14362ef154	powerpc: Fix strnlen/power7 build This patch fixes the strnlen.S build with --disable-multi-arch option.	2015-08-11 10:03:09 -03:00
Adhemerval Zanella	357bb400f1	powerpc: Fix strstr/power7 build This patch fixes the strstr build with --disable-multi-arch option. The optimization calls the __strstr_ppc symbol, which always build for multiarch config but not if it is disable. This patch fixes it by adding the default C implementation object with the expected symbol name. * sysdeps/powerpc/powerpc64/power7/Makefile [$(subdir) = string] (sysdep_routines): Add strstr-ppc64. * sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c: New file.	2015-08-11 10:03:09 -03:00
Rajalakshmi Srinivasaraghavan	b42f8cad52	powerpc: strstr optimization This patch optimizes strstr function for power >= 7 systems. Performance gain is obtained using aligned memory access and usage of cmpb instruction for quicker comparison. The average improvement of this optimization is ~40%. Tested on ppc64 and ppc64le. 2015-07-16 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/multiarch/Makefile: Add strstr(). * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/powerpc/powerpc64/power7/strstr.S: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr.c: New File.	2015-07-16 13:43:51 -03:00
Adhemerval Zanella	7bf8fb1042	libc-vdso.h place consolidation This patch moves the libc-vdso.h internal header from bits folder to default architecture one and also corrects the remaning includes in the files.	2015-04-20 08:51:17 -03:00
Alan Modra	19a6a3acd1	Harden powerpc64 elf_machine_fixup_plt IFUNC is difficult to correctly implement on any target needing a GOT to support position independent code, due to the dependency on order of dynamic relocations. ld.so should be changed to apply IFUNC relocations last, globally, because without that it is actually impossible to write an IFUNC resolver in C that works in all situations. Case in point, vfork in libpthread.so is an IFUNC with the resolver returning &__libc_vfork. (system and fork are similar.) If another shared library, libA say, uses vfork then it is quite possible that libpthread.so hasn't been dynamically relocated before the unfortunate libA is dynamically relocated. In that case the GOT entry for &__libc_vfork is still zero, so the IFUNC resolver returns NULL. LD_BIND_NOW=1 results in libA PLT dynamic relocations being applied using this NULL value and ld.so segfaults. This patch hardens ld.so to not segfault on a NULL from an IFUNC resolver. It also fixes a problem with undefined weak. If you leave the plt entry as-is for undefined weak then if the entry is ever called it will loop in ld.so rather than segfaulting. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_fixup_plt): Don't segfault if ifunc resolver returns a NULL. Do set plt to zero for undefined weak. (elf_machine_plt_conflict): Similarly.	2015-03-26 12:30:45 +10:30
Alan Modra	afcd9480fe	powerpc __tls_get_addr call optimization This patch is glibc support for a PowerPC TLS optimization, inspired by Alexandre Oliva's TLS optimization for other processors, http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt In essence, this optimization uses a zero module id in the tls_index GOT entry to indicate that a TLS variable is allocated space in the static TLS area. A special plt call linker stub for __tls_get_addr checks for such a tls_index and if found, returns the offset immediately. The linker communicates the fact that the special __tls_get_addr stub is used by setting a bit in the dynamic tag DT_PPC64_OPT/DT_PPC_OPT. glibc communicates to the linker that this optimization is available by the presence of __tls_get_addr_opt. tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for tst-tls-dlinfo, which otherwise would fail since it tests that no static tls is allocated. The ld option --no-tls-get-addr-optimize has been available since binutils-2.20 so doesn't need a configure test. * NEWS: Advertise TLS optimization. * elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define. (DT_PPC_NUM): Increment. * elf/dynamic-link.h (HAVE_STATIC_TLS): Define. (CHECK_STATIC_TLS): Use here. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize TLS descriptors. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/powerpc/dl-tls.c: New file. * sysdeps/powerpc/Versions: Add __tls_get_addr_opt. * sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test. * sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test. Build tst-tlsmod2.so with --no-tls-get-addr-optimize. * sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.	2015-03-25 15:53:47 +10:30
Alan Modra	da9f333410	powerpc64 configure message This feature doesn't depend on the linker, as can be seen from the actual test. It's a compiler feature. * sysdeps/powerpc/powerpc64/configure.ac: Correct "linker support for overlapping .opd entries" to "support...". * sysdeps/powerpc/powerpc64/configure: Regenerate	2015-03-25 15:45:36 +10:30
Adhemerval Zanella	5ca10a0c9a	powerpc: Remove HAVE_ASM_GLOBAL_DOT_NAME define With AIX port deprecated there is no need to check/define HAVE_ASM_GLOBAL_DOT_NAME anymore since the current minimum binutils supported (2.22) does not emit global symbol with dot. This patch removes all the HAVE_ASM_GLOBAL_DOT_NAME definition and checks for powerpc64 port.	2015-03-11 09:01:05 -04:00
H.J. Lu	209826bcf2	Replace ELF_RTYPE_CLASS_NOCOPY with ELF_RTYPE_CLASS_COPY ELF_RTYPE_CLASS_NOCOPY in comments is a typo. It should be ELF_RTYPE_CLASS_COPY. [BZ #18082] * sysdeps/alpha/dl-machine.h (elf_machine_type_class): Replace ELF_RTYPE_CLASS_NOCOPY with ELF_RTYPE_CLASS_COPY in comments. * sysdeps/arm/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/hppa/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/i386/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/ia64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/m68k/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/microblaze/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/nios2/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/s390/s390-64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/sh/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/tile/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_type_class): Likewise.	2015-03-05 08:40:41 -08:00
Adhemerval Zanella	115e0de72a	powerpc: Fix memmove static build This patch fixes the missing "__memcpy_ppc" symbol for memmove-ppc64 object in static builds. Since memcpy ifunc is not enabled in static mode, the specialized symbols are not provided. The patch changed the it to just "__memcpy" instead.	2015-02-25 13:25:54 -05:00
Rajalakshmi Srinivasaraghavan	98408b95b1	powerpc: POWER7 strncpy optimization for unaligned string This patch optimizes strncpy for power7 for unaligned source or destination address. The source or destination address is aligned to doubleword and data is shifted based on the alignment and added with the previous loaded data to be written as a doubleword. For each load, cmpb instruction is used for faster null check. The new optimization shows 10 to 70% of performance improvement for longer string though it does not show big difference on string size less than 16 due to additional checks.Hence this new algorithm is restricted to string greater than 16.	2015-02-12 13:16:08 -05:00
Adhemerval Zanella	10169938b1	powerpc: wordcopy/memmove cleanup for ppc32 This patch cleanup some multiarch code related to memmmove optimization. Initial IFUNC support added specialized wordcopy symbols which turned in local IFUNC calls used by memmove default implementation. The patch removes the internal IFUNC for wordcopy symbols and uses local branches in the memmmove optimization instead.	2015-02-09 06:42:28 -05:00
Adhemerval Zanella	b269211467	powerpc: wordcopy/memmove cleanup for ppc64 This patch cleanup some multiarch code related to memmmove optimization. Initial IFUNC support added specialized wordcopy symbols which turned in local IFUNC calls used by memmove default implementation. This change by removing then and used the optimized memmove instead for supported chips.	2015-02-09 06:42:28 -05:00
Adhemerval Zanella	18e270aada	powerpc: Remove POWER7 wordcopy ifunc This patch remove the POWER7 ifunc wordcopy function (_wordcopy_*_power7), since now GLIBC provides a optimized memmove/bcopy for POWER7.	2015-02-09 06:42:28 -05:00
Adhemerval Zanella	6f0993a638	powerpc: Simplify bcopy default implementation This patch simplify the default bcopy symbol for powerpc64 by just using memmove instead of implementing using the default bcopy. Since the symbol is deprecated, it trades speed by code size.	2015-02-09 06:42:28 -05:00
Adhemerval Zanella	3001e54c57	powerpc: multiarch Makefile cleanup for powerpc64 This patch cleanups the multiarch Makefile by putting the wide chars implementation to correct wcsmbs rule.	2015-02-09 06:42:27 -05:00
Adhemerval Zanella	08cee2a464	powerpc: Fix fsqrt build in libm [BZ#16576] Some powerpc64 processors (e5500 core for instance) does not provide the fsqrt instruction, however current check to use in math_private.h is __WORDSIZE and _ARCH_PWR4 (ISA 2.02). This is patch change it to use the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to decide whether to generate fsqrt instruction). It fixes BZ#16576.	2015-01-28 05:59:16 -05:00
Adhemerval Zanella	bea5801360	powerpc: Fix powerpc64 build failure with binutils 2.22 GLIBC memset optimization for POWER8 uses the '.machine power8' directive, which is only supported officially on binutils 2.24+. This causes a build failure on older binutils. Since the requirement of .machine power8 is to correctly assembly the 'mtvsrd' instruction and it is already handled by the MTVSRD_V1_R4 macro, there is no really needed of using it. The patch replaces the power8 with power7 for .machine directive. It fixes BZ#17869.	2015-01-24 08:40:04 -05:00
Adhemerval Zanella	0e87343e20	powerpc: Fix ifuncmain6pie failure with GCC 4.9 This patch fix the elf/ifuncmain6pie failure when building with GCC 4.9+. For some reason, the compiler removes the branch taken code at resolve_ifunc (sysdeps/powerpc/powerpc64/dl-machine.h) as dead-code and thus the testcase fails because the ifunc resolves branches to an invalid memory location. It fixes by explicit adding a dependency of value based on odp variable to avoid compiler optimization. It fixes BZ#17868.	2015-01-24 08:38:39 -05:00
Adhemerval Zanella	ce6615c9c6	powerpc: Fix POWER7/PPC64 performance regression on LE This patch fixes a performance regression on the POWER7/PPC64 memcmp porting for Little Endian. The LE code uses 'ldbrx' instruction to read the memory on byte reversed form, however ISA 2.06 just provide the indexed form which uses a register value as additional index, instead of a fixed value enconded in the instruction. And the port strategy for LE uses r0 index value and update the address value on each compare loop interation. For large compare size values, it adds 8 more instructions plus some more depending of trailing size. This patch fixes it by adding pre-calculate indexes to remove the address update on loops and tailing sizes. For large sizes it shows a considerable gain, with double performance pairing with BE.	2015-01-13 14:35:40 -05:00
Adhemerval Zanella	d3b00f468b	powerpc: Optimized strncmp for POWER8/PPC64 This patch adds an optimized POWER8 strncmp. The implementation focus on speeding up unaligned cases follwing the ideas of power8 strcmp. The algorithm first check the initial 16 bytes, then align the first function source and uses unaligned loads on second argument only. Aditional checks for page boundaries are done for unaligned cases (where sources alignment are different).	2015-01-13 14:35:40 -05:00
Rajalakshmi Srinivasaraghavan	72607db038	powerpc: Optimize POWER7 strcmp trailing checks This patch optimized the POWER7 trailing check by avoiding using byte read operations and instead use the doubleword already readed with bitwise operations.	2015-01-13 14:35:40 -05:00
Adhemerval Zanella	8bedcb5f03	powerpc: Optimized strcmp for POWER8/PPC64 This patch adds an optimized POWER8 strcmp using unaligned accesses. The algorithm first check the initial 16 bytes, then align the first function source and uses unaligned loads on second argument only. Aditional checks for page boundaries are done for unaligned cases	2015-01-13 11:28:58 -05:00
Adhemerval Zanella	f06a4faf8a	powerpc: Optimized st{r,p}ncpy for POWER8/PPC64 This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses. It shows 10%-80% improvement over the optimized POWER7 one that uses only aligned accesses, specially on unaligned inputs. The algorithm first read and check 16 bytes (if inputs do not cross a 4K page size). The it realign source to 16-bytes and issue a 16 bytes read and compare loop to speedup null byte checks for large strings. Also, different from POWER7 optimization, the null pad is done inline in the implementation using possible unaligned accesses, instead of realying on a memset call. Special case is added for page cross reads.	2015-01-13 11:28:44 -05:00
Adhemerval Zanella	9f2f36e5a9	powerpc: Optimized strncat for POWER7/PPC64 With `3eb38795db` (Simplify strncat) the generic algorithms uses strlen, strnlen, and memcpy. This is faster than POWER7 current implementation, especially for unaligned strings (where POWER7 code uses byte-byte operations). This patch removes the assembly implementation and uses a multiarch specialization based on default algorithm calling optimized POWER7 symbols.	2015-01-13 11:28:40 -05:00
Adhemerval Zanella	94c9680945	powerpc: Optimized strcat for POWER8/PPC64 With new optimized strcpy for POWER8, this patch adds an optimized strcat which uses it along with default implementation at strings/.	2015-01-13 11:28:36 -05:00
Adhemerval Zanella	96d6fd6c40	powerpc: Optimized st{r,p}cpy for POWER8/PPC64 This patch adds an optimized POWER8 strcpy using unaligned accesses. For strings up to 16 bytes the implementation first calculate the string size, like strlen, and issues a memcpy. For larger strings, source is first aligned to 16 bytes and then tested over a loop that reads 16 bytes am combine the cmpb results for speedup. Special case is added for page cross reads. It shows 30%-60% improvement over the optimized POWER7 one that uses only aligned accesses.	2015-01-13 11:28:30 -05:00
Adhemerval Zanella	56cf276381	powerpc: abort transaction in syscalls Linux kernel powerpc documentation states issuing a syscall inside a transaction is not recommended and may lead to undefined behavior. It also states syscalls does not abort transactoin neither they run in transactional state. To avoid side-effects being visible outside transactions, GLIBC with lock elision enabled will issue a transaction abort instruction just before all syscalls if hardware supports hardware transactions.	2015-01-12 06:32:08 -05:00
Joseph Myers	b168057aaa	Update copyright dates with scripts/update-copyrights.	2015-01-02 16:29:47 +00:00
Rajalakshmi Srinivasaraghavan	f59ad976ed	powerpc: POWER7 strcpy optimization for unaligned strings This patch optimizes strcpy for ppc64/power7 for unaligned source or destination address. The source or destination address is aligned to doubleword and data is shifted based on the alignment and added with the previous loaded data to be written as a doubleword. For each load, cmpb instruction is used for faster null check. The word aligned optimization is also removed, since the new unaligned code path shows better results handling word-aligned strings. More combination of unaligned inputs is also added in benchtest to measure the improvement.The new optimization shows 2 to 80% of performance improvement for longer string though it does not show big difference on string size less than 16 due to additional checks.	2014-12-31 14:35:59 -05:00
Joseph Myers	2cfbdb9a27	Fix strftime wcschr namespace (bug 17634). Use of strftime, a C90 function, ends up bringing in wcschr, which is not a C90 function. Although not a conformance bug (C90 reserves wcs), this is still contrary to glibc practice of avoiding relying on those reservations; this patch arranges for the internal uses to use __wcschr instead, with wcschr being a weak alias. This is more complicated than some such patches because of the various IFUNC definitions of wcschr (which include code redefining libc_hidden_def in a way that involves creating __GI_wcschr manually and so also needs to create __GI___wcschr after the change of internal uses to use __wcschr). Tested for x86_64 and 32-bit x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). 2014-12-10 Joseph Myers <joseph@codesourcery.com> Adhemerval Zanella <azanella@linux.vnet.ibm.com> [BZ #17634] wcsmbs/wcschr.c [!WCSCHR] (wcschr): Define as __wcschr. Undefine after defining function. Define as weak alias of __wcschr. Use libc_hidden_weak. * include/wchar.h (__wcschr): Declare. Use libc_hidden_proto. * sysdeps/i386/i686/multiarch/wcschr-c.c [IS_IN (libc) && SHARED] (libc_hidden_def): Also define __GI___wcschr alias. * sysdeps/i386/i686/multiarch/wcschr.S (wcschr): Rename to __wcschr and define as weak alias of __wcschr. * sysdeps/powerpc/power6/wcschr.c [!WCSCHR] (WCSCHR): Define as __wcschr. [!WCSCHR] (DEFAULT_WCSCHR): Define. [DEFAULT_WCSCHR] (__wcschr): Use libc_hidden_def. [DEFAULT_WCSCHR] (wcschr): Define as weak alias of __wcschr. Use libc_hidden_weak. Do not use libc_hidden_def. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c [IS_IN (libc) && SHARED] (libc_hidden_def): Also define __GI___wcschr alias. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c [IS_IN (libc)] (wcschr): Define as macro expanding to __redirect_wcschr. [IS_IN (libc)] (__wcschr_ppc): Use __redirect_wcschr in typeof. [IS_IN (libc)] (__wcschr_power6): Likewise. [IS_IN (libc)] (__wcschr_power7): Likewise. [IS_IN (libc)] (__libc_wcschr): New. Define with libc_ifunc instead of wcschr. [IS_IN (libc)] (wcschr): Undefine and define as weak alias of __libc_wcschr. [!IS_IN (libc)] (libc_hidden_def): Do not undefine and redefine. * sysdeps/powerpc/powerpc64/multiarch/wcschr.c (wcschr): Rename to __wcschr and define as weak alias of __wcschr. Use libc_hidden_builtin_def. * sysdeps/x86_64/wcschr.S (wcschr): Rename to __wcschr and define as weak alias of __wcschr. Use libc_hidden_weak. * time/alt_digit.c (_nl_get_walt_digit): Use __wcschr instead of wcschr. * time/era.c (_nl_init_era_entries): Likewise. * conform/Makefile (test-xfail-ISO/time.h/linknamespace): Remove variable. (test-xfail-XPG3/time.h/linknamespace): Likewise. (test-xfail-XPG4/time.h/linknamespace): Likewise.	2014-12-10 16:59:02 +00:00
Adhemerval Zanella	0f0a1c82f5	powerpc: Add powerpc64 strpbrk optimization This patch makes the POWER7 optimized strpbrk generic by using default doubleword stores to zero the hash, instead of VSX instructions. Performance on POWER7/POWER8 does not change.	2014-12-02 13:34:02 -05:00
Adhemerval Zanella	bb2542e0ae	powerpc: Add powerpc64 strcspn optimization This patch makes the POWER7 optimized strcspn generic by using default doubleword stores to zero the hash, instead of VSX instructions. Performance on POWER7/POWER8 does not change.	2014-12-02 07:16:24 -05:00
Adhemerval Zanella	2e8a2de2da	powerpc: Add powerpc64 strspn optimization This patch makes the POWER7 optimized strspn generic by using default doubleword stores to zero the hash, instead of VSX instructions. Performance on POWER7/POWER8 machines does not changed.	2014-12-02 07:15:58 -05:00
Rajalakshmi Srinivasaraghavan	a8a7d7d212	powerpc: strtok{_r} optimization for powerpc64 This patch optimizes strtok and strtok_r for POWERPC64. A table of 256 characters is created and marked based on the 'accept' argument and used to check for any occurance on the input string.Loop unrolling is also used to gain improvements.	2014-12-01 09:03:58 -05:00
Adhemerval Zanella	704f794714	powerpc: Fix missing barriers in atomic_exchange_and_add_{acq,rel} On powerpc, atomic_exchange_and_add is implemented without any barriers. This patchs adds the missing instruction and memory barrier for acquire and release semanthics.	2014-11-26 07:06:28 -05:00
Anton Blanchard	5fbb569186	powerpc: Fix __arch_compare_and_exchange_bool_64_rel Fix a typo in the inline assembly.	2014-11-25 07:28:28 -05:00

1 2 3 4 5 ...

425 Commits