Initially based on the versions found in wcsmbs/*; these files have
been changed by hand-unrolling the loops and adding some additional
variables to allow some read-ahead to occur, which relieves some of the
wait-for-increment/wait-for-load/wait-for-compare-results pressure
that was slowing down every iteration through the while-loop.
For 64-bit Power7, these changes give an approximately 20% throughput
boost for the wcschr and wcsrchr functions, and an approximately 40%
boost for the wcscpy function. 32-bit improvements appear to be
slightly better, at ~30% and ~45% respectively. Results for Power6
closely match those for Power7.
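As a rough illustration of the unrolling and read-ahead idea, here is a
minimal C sketch. The function name wcschr_unrolled is hypothetical and
this is not the code from the patch; it only shows two elements being
handled per iteration, with the next load issued before the loop comes
around to compare it.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical sketch of a two-way unrolled wcschr with read-ahead.
   Not the glibc implementation.  */
wchar_t *
wcschr_unrolled (const wchar_t *s, wchar_t c)
{
  wchar_t cur = *s;             /* Prime the read-ahead.  */

  for (;;)
    {
      wchar_t next;

      if (cur == c)
        return (wchar_t *) s;
      if (cur == L'\0')
        return NULL;

      /* Safe to read s[1]: s[0] was not the terminator.  */
      next = s[1];
      if (next == c)
        return (wchar_t *) (s + 1);
      if (next == L'\0')
        return NULL;

      /* Safe to read s[2]: s[1] was not the terminator.  Loading it
         here means the value is in flight before the next iteration
         needs to compare it.  */
      cur = s[2];
      s += 2;
    }
}
```

Note that each load is only issued once the previous element is known
not to be the terminator, so the sketch never reads past the end of the
string.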
Assorted tweaking, twisting and tuning to squeeze a few additional cycles
out of the memchr code. Changes include bypassing the shift pairs
(sld,srd) when they are not required, and unrolling the small_loop that
handles short and trailing strings.
Per scrollpipe data measuring aligned strings for 64-bit, these changes
save between five and eight cycles (9-13% overall) for short strings (<32).
Longer aligned strings see a slight improvement of 1-3% due to bypassing
the shifts and the instruction rearranging.
These internal knobs are not exposed as part of the public ABI, so mark
them hidden to avoid generating relocations against them.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
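For reference, marking an internal variable hidden might look like the
sketch below. The names example_knob and example_read_knob are
hypothetical, and the macro is spelled out locally here; glibc has its
own attribute_hidden macro for this purpose.

```c
/* Hypothetical sketch; assumes a GCC-compatible compiler.  */
#define EXAMPLE_HIDDEN __attribute__ ((visibility ("hidden")))

/* Hidden symbols are not exported from the shared object, so
   references from inside it can be resolved at link time rather
   than through a dynamic relocation.  */
int example_knob EXAMPLE_HIDDEN = 0;

int
example_read_knob (void)
{
  return example_knob;
}
```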
We can't assume sock_cloexec and pipe2 are bound together: the defines
for the former are found in glibc only, while the latter depends on a
combination of kernel headers and glibc. So if we do a runtime detection
of SOCK_CLOEXEC, but pipe2() is a stub inside of glibc, we hit a problem.
For example:
	#include <grp.h>
	#include <stdio.h>

	int main(void)
	{
		getgrnam("portage");
		if (!popen("ls", "r"))
			perror("popen()");
		return 0;
	}
getgrnam() will detect that the kernel supports SOCK_CLOEXEC and then set
both __have_sock_cloexec and __have_pipe2 to true. But if glibc was built
against older kernel headers where __NR_pipe2 does not exist, glibc will
have an ENOSYS stub for it. So popen() will always fail as glibc assumes
pipe2() works.
While this isn't too much of an issue for some arches as they added the
functionality to the kernel at the same time, not all arches are that
lucky.
Since the code already has dedicated names for each feature, delete the
defines wiring these three features together and make each one a proper
dedicated knob.
We've been carrying this in Gentoo since glibc-2.9.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
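The idea of one dedicated knob per feature, as described in the message
above, can be sketched roughly as follows. All names here (probe_fn,
detect, feature_sock_cloexec, feature_pipe2, and the fake probes) are
hypothetical and exist only for illustration; the tri-state encoding is
an assumption of this sketch, not a quote of the glibc code.

```c
#include <errno.h>

/* Hypothetical probe: returns 0 on success, -1 with errno set.  */
typedef int (*probe_fn) (void);

/* One knob per feature: 0 = not yet probed, 1 = works, -1 = missing.
   Neither knob is derived from the other.  */
static int have_sock_cloexec;
static int have_pipe2;

static int
detect (int *knob, probe_fn probe)
{
  if (*knob == 0)
    *knob = (probe () == 0 || errno != ENOSYS) ? 1 : -1;
  return *knob > 0;
}

int
feature_sock_cloexec (probe_fn probe)
{
  return detect (&have_sock_cloexec, probe);
}

int
feature_pipe2 (probe_fn probe)
{
  return detect (&have_pipe2, probe);
}

/* Fake probes standing in for real system calls.  */
int
probe_ok (void)
{
  return 0;
}

int
probe_enosys (void)
{
  errno = ENOSYS;
  return -1;
}
```

With separate knobs, a successful SOCK_CLOEXEC probe says nothing about
pipe2(): in this sketch feature_pipe2 (probe_enosys) still reports the
feature as missing after feature_sock_cloexec (probe_ok) has succeeded.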
When unmapping the first object in a namespace, the runtime linker
did not update the externally visible pointer. This resulted in
debuggers seeing pointers to memory that had been freed.
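The class of bug described here can be illustrated with a minimal
doubly linked list. The types obj and debug_view and the function
unlink_obj are hypothetical stand-ins, not the runtime linker's actual
data structures; the point is only that removing the first element must
also republish the head pointer that outside observers read.

```c
#include <stddef.h>

struct obj
{
  struct obj *prev;
  struct obj *next;
};

/* Hypothetical stand-in for the pointer a debugger would follow.  */
struct debug_view
{
  struct obj *head;
};

void
unlink_obj (struct debug_view *dbg, struct obj *o)
{
  if (o->prev != NULL)
    o->prev->next = o->next;
  else
    /* Removing the first object: update the externally visible
       pointer too, or observers keep a pointer to memory that is
       about to be freed.  */
    dbg->head = o->next;

  if (o->next != NULL)
    o->next->prev = o->prev;

  o->prev = NULL;
  o->next = NULL;
}
```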
The original runtime linker auditing interface described
by Solaris allows the 5th argument of la_pltenter() to be
modified. This patch cleans up the ldsodefs.h definitions
such that the 5th argument is not constant.
At one point the 5th argument *was* constant but this was
changed with commit 2413fdba7a.
This patch updates alpha, ia64, mips, sh and sparc with similar
changes.