Summary of changes:
- Use of !_LIBC instead of HAVE_CONFIG_H
- Code changes in [!_LIBC] that don't affect us
- Minor formatting changes
- Use __builtin_expect in shared code
- Define some macros in [_LIBC] that are used in gnulib but never
defined in glibc
- Flip macro check for STRERROR_R_CHAR_P so that it does not throw a
warning
Here's an updated patch to fix the crash in bug-ga2 when the system
has no configured ipv6 address. I have taken a different approach of
using libc_freeres_fn instead of the libc_freeres_ptr since the former
gives better control over what is freed; we need that since cache may
or may not be allocated using malloc.
Verified that bug-ga2 works correctly in both cases and does not have
memory leaks in either of them.
The definition of SHARED is tested with #ifdef pretty much everywhere
apart from these few places. The tlsdesc.c code seems to be copy and
pasted to a few architectures and there is one instance in the hppa
startup code.
ChangeLog:
2014-07-09 Will Newton <will.newton@linaro.org>
* sysdeps/aarch64/tlsdesc.c (_dl_unmap): Test SHARED with #ifdef.
* sysdeps/arm/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/i386/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/x86_64/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/hppa/start.S (_start): Likewise.
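The difference between the two preprocessor tests can be sketched as
follows. This is only an illustration: BUILT_SHARED and built_shared are
hypothetical names, and the sketch assumes SHARED is simply left
undefined in static builds, in which case "#if SHARED" would trigger a
-Wundef warning while "#ifdef SHARED" would not.

```c
/* Assumption: SHARED is either defined (shared build) or not defined
   at all (static build).  Testing it with "#if SHARED" evaluates an
   undefined identifier in the static case, which -Wundef reports.
   "#ifdef SHARED" asks only whether the macro exists.  */
#ifdef SHARED
# define BUILT_SHARED 1   /* Hypothetical macro for this sketch.  */
#else
# define BUILT_SHARED 0
#endif

/* Hypothetical helper: nonzero when compiled with SHARED defined.  */
static int
built_shared (void)
{
  return BUILT_SHARED;
}
```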
While we're at fixing build warnings, here's one unnecessary warning
that can be fixed fairly easily. The SIZE variable is never actually
used uninitialized, but the compiler cannot make that out and thinks
(correctly) that there is a potential for accessing SIZE without
initializing it. Make this safe by initializing SIZE to 0.
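A minimal sketch of the pattern, not the patched function itself:
payload_size and its parameters are hypothetical. The variable is only
assigned on one branch and only read under the same condition, which a
compiler may fail to correlate; initializing it to 0 removes the
warning without changing behaviour.

```c
#include <stddef.h>

/* Hypothetical helper: returns a payload length when HAVE_PAYLOAD is
   nonzero, and 0 otherwise.  */
static size_t
payload_size (int tag, int have_payload)
{
  /* Initialized to 0 so that no path can read an indeterminate value,
     even one the compiler cannot prove unreachable.  */
  size_t size = 0;

  if (have_payload)
    size = (tag == 1) ? 16 : 32;

  /* SIZE is only consulted when it was assigned above.  */
  return have_payload ? size : 0;
}
```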
Tested on x86_64.
There was a typo in the previous patch due to which resplen2 was
checked for non-zero instead of the value at resplen2. Fix that and
improve the condition by checking resplen2 for non-NULL (instead of
answerp2) and also adding the check in a third place.
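The corrected condition can be sketched as below. The helper name
have_second_response is hypothetical and the surrounding resolver code
is omitted; the point is only that the pointer is checked for non-NULL
before the value it points to is checked for non-zero.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero when a second response length was
   requested and is non-zero.  The typo described above amounts to
   testing "resplen2 != 0" (the pointer) where "*resplen2 != 0"
   (the value) was intended.  */
static int
have_second_response (const int *resplen2)
{
  return resplen2 != NULL && *resplen2 != 0;
}
```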
Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be defined in memcopy.h,
there is no need for a specialized powerpc memmove implementation.
This patch moves the define to powerpc memcopy and cleans up its
definition in the powerpc code.
This patch changes power7 memcpy to use VSX instructions only when
memory is aligned to a quadword. This avoids unaligned kernel traps
on non-cacheable memory (for instance, memory-mapped I/O).
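The alignment test itself is simple; a C sketch is shown below. The
real change is in POWER7 assembly, so is_quadword_aligned is a
hypothetical helper that only illustrates what "aligned to a quadword"
(16 bytes) means for a pointer.

```c
#include <stdint.h>

/* Hypothetical helper, not the POWER7 assembly: nonzero when P lies on
   a 16-byte (quadword) boundary, i.e. its low four address bits are
   all zero.  */
static int
is_quadword_aligned (const void *p)
{
  return ((uintptr_t) p & 15) == 0;
}
```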
This patch adds an optimized memmove implementation for POWER7/powerpc64.
The idea is to use the POWER7 memcpy for non-overlapping memory
regions and an optimized backward memcpy for memory regions that
overlap (similar to the idea of string/memmove.c).
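The forward/backward choice can be sketched in portable C as below.
This is a byte-at-a-time illustration under assumed names
(sketch_memmove is hypothetical), not the POWER7 code: the optimized
version would use memcpy for the forward case and a vectorized loop
for the backward case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of the direction choice in memmove.  */
static void *
sketch_memmove (void *dest, const void *src, size_t len)
{
  unsigned char *d = dest;
  const unsigned char *s = src;

  /* Unsigned compare: true when DEST is below SRC (the difference
     wraps to a large value) or at least LEN bytes above it.  In both
     cases a forward copy never overwrites unread source bytes.  */
  if ((uintptr_t) d - (uintptr_t) s >= len)
    {
      for (size_t i = 0; i < len; i++)
        d[i] = s[i];
    }
  else
    {
      /* DEST overlaps the tail of SRC: copy from the end backward.  */
      for (size_t i = len; i > 0; i--)
        d[i - 1] = s[i - 1];
    }

  return dest;
}
```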
The backward memcpy algorithm is similar to the one used for memcpy on
POWER7, with adjustments for alignment. The difference is that memory
is always aligned to 16 bytes before using VSX/altivec instructions.
This patch removes the powerpc-specific logic in memmove and instead
includes the default implementation with MEMCPY_OK_FOR_FWD_MEMMOVE
defined. This leads to increased performance, since the constraints
for using memcpy in the powerpc code are too restrictive and memcpy
can be used for any forward memmove.
Merge most of the gnulib implementation of memchr. The changes that
remain are:
- copyright header
- bp-sym.h removed
- reg_char removed
- allow MEMCHR to be redefined
- non-conforming whitespace changes
The merged code fixes a number of -Wundef warnings and also introduces
an optimized algorithm. I haven't detected any performance difference
in the new code which I believe is down to the quite specific
circumstances required to hit it. However the new code is approximately
half the size of the old code on AArch64 (which uses generic memchr).
ChangeLog:
2014-07-04 Will Newton <will.newton@linaro.org>
* string/memchr.c: Merge from gnulib.
[_LIBC]: Remove conditionals.
(__ptr_t): Remove define.
(LONG_MAX_32_BITS): Likewise.
(LONG_MAX): Likewise.
(MEMCHR): Use ANSI prototype and optimize algorithm.
The original implementation was written for EV5, which does not
record inexact in the status register for /SU (but no /I) insns.
But EV6 does record the inexact status; the lack of /I simply
means that the exception is suppressed.
Adding feholdexcept becomes the bulk of the overhead, so we might
as well use the default implementation.
Two bugs in these implementations: First is that the add of 0.5
was not done in chopped rounding mode (easily fixable). Second
is that the method generates incorrect inexact exceptions for
small integral values (not easily fixable).
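The first bug can be reproduced in portable C, independently of the
Alpha assembly. In the sketch below, round_by_adding_half is a
hypothetical helper that adds 0.5 in the default round-to-nearest mode
and then truncates. For the largest double below 0.5 the sum is exactly
halfway between two doubles and rounds up to 1.0, so the result is 1
where correct rounding gives 0.

```c
/* Hypothetical helper: rounds a non-negative X by adding 0.5 and
   truncating.  The addition happens in the current rounding mode
   (round-to-nearest by default), not in chopped mode.  */
static long
round_by_adding_half (double x)
{
  double y = x + 0.5;
  return (long) y;
}

/* The largest double strictly below 0.5 (0.5 - 2^-54), written as a
   hexadecimal floating constant so the value is exact.  */
static const double just_below_half = 0x1.fffffffffffffp-2;
```

With these definitions, round_by_adding_half (just_below_half) returns
1, while ordinary inputs such as 2.25 and 2.5 give 2 and 3 as expected.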