The definition of SHARED is tested with #ifdef pretty much everywhere
apart from these few places. The tlsdesc.c code seems to be copy and
pasted to a few architectures and there is one instance in the hppa
startup code.
ChangeLog:
2014-07-09 Will Newton <will.newton@linaro.org>
* sysdeps/aarch64/tlsdesc.c (_dl_unmap): Test SHARED with #ifdef.
* sysdeps/arm/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/i386/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/x86_64/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/hppa/start.S (_start): Likewise.
While we're at fixing build warnings, here's one unnecessary warning
that can be fixed fairly easily. The SIZE variable is never actually
use uninitialized, but the compiler cannot make that out and thinks
(correctly) that there is a potential for accessing SIZE without
initializing it. Make this safe by initializing SIZE to 0.
Tested on x86_64.
There was a typo in the previous patch due to which resplen2 was
checked for non-zero instead of the value at resplen2. Fix that and
improve the condition by checking resplen2 for non-NULL (instead of
answerp2) and also adding the check in a third place.
Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be define on memcopy.h there
is no need to specialized powerpc memmove implementation. This patch
moves the define set to powerpc memcopy and cleanup its definition on
powerpc code.
This patch changes power7 memcpy to use VSX instructions only when
memory is aligned to quardword. It is to avoid unaligned kernel traps
on non-cacheable memory (for instance, memory-mapped I/O).
This patch adds an optimized memmove optimization for POWER7/powerpc64.
Basically the idea is to use the memcpy for POWER7 on non-overlapped
memory regions and a optimized backward memcpy for memory regions
that overlap (similar to the idea of string/memmove.c).
The backward memcpy algorithm used is similar the one use for memcpy for
POWER7, with adjustments done for alignment. The difference is memory
is always aligned to 16 bytes before using VSX/altivec instructions.
This patch removes the powerpc specific logic in memmove and instead
include default implementation with MEMCPY_OK_FOR_FWD_MEMMOVE defined.
This lead in a increase performance, since the constraints to use
memcpy in powerpc code are too restrictive and memcpy can be used for
any forward memmove.
Merge most of the gnulib implementation of memchr. The changes that
remain are:
- copyright header
- bp-sym.h removed
- reg_char removed
- allow MEMCHR to be redefined
- non-conforming whitespace changes
The merged code fixes a number of -Wundef warnings and also introduces
an optimized algorithm. I haven't detected any performance difference
in the new code which I believe is down to the quite specific
circumstances required to hit it. However the new code is approximately
half the size of the old code on AArch64 (which uses generic memchr).
ChangeLog:
2014-07-04 Will Newton <will.newton@linaro.org>
* string/memchr.c: Merge from gnulib.
[_LIBC]: Remove conditionals.
(__ptr_t): Remove define.
(LONG_MAX_32_BITS): Likewise.
(LONG_MAX): Likewise.
(MEMCHR): Use ANSI prototype and optimize algorithm.
The original implementation was written for EV5, which does not
record inexact in the status register for /SU (but no /I) insns.
But EV6 does record the inexact status; the lack of /I simply
means that the exception is suppressed.
Adding feholdexcept becomes the bulk of the overhead, so we might
as well use the default implementation.
Two bugs in these implementations: First is that the add of 0.5
was not done in chopped rounding mode (easily fixable). Second
is that the method generates incorrect inexact exceptions for
small integral values (not easily fixable).
This test case is very, especially on targets using soft-float or QEMU
(where soft-float is used internally), and appears to be the only such
outlier. Therefore rather than requiring to have TIMEOUTFACTOR set
large enough globally, bump up the local scaling factor instead.
* stdlib/tst-strtod-overflow.c (TIMEOUT): Bump up to 30.
c4c4124473 added the _Noreturn macro for
pre-C11 compilers, but it now throws a new Wundef warning during `make
check` for __STDC_VERSION__ which gcc does not define by default. The
following patch fixes this in line with other uses of __STDC_VERSION__
in the file.
The PAGE_COPY_THRESHOLD macro is meant to be overridden by
architecture-specific pagecopy.h, but it is currently done only by
mach; all other architectures use the default. Check to see if the
macro is defined in addition to whether it is set to a non-zero value.
This patch adds an ifunc power7 strcat symbol that uses the logic on
sysdeps/powerpc/strcat.c but call power7 strlen/strcpy symbols instead
of default ones.