Add systemtap probes to various slow paths in libm so that application
developers may use systemtap to find out if their applications are
hitting these slow paths. We have added probes for pow, exp, log,
tan, atan and atan2.
Partially revert commits 2b766585f9 and
de2fd463b1, which were intended to fix BZ#11741
but caused another, likely worse bug, namely that fwrite() and fputs() could,
in an error path, read data beyond the end of the specified buffer, and
potentially even write this data to the file.
Fix BZ#11741 properly by checking the return value from _IO_padn() in
stdio-common/vfprintf.c.
A large value of bytes passed to memalign_check can cause an integer
overflow in _int_memalign and heap corruption. This issue can be
exposed by running tst-memalign with MALLOC_CHECK_=3.
ChangeLog:
2013-10-10 Will Newton <will.newton@linaro.org>
* malloc/hooks.c (memalign_check): Ensure the value of bytes
passed to _int_memalign does not overflow.
This adds the "include-sources" directive to scripts/bench.pl. This
allows for including source code (vs including headers, which might get
a different search path) after the inclusion of any headers.
This patch adds some more directives to the benchmark inputs file,
moving functionality from the Makefile and making the code generation
script a bit cleaner. The function argument and return types that
were earlier added as variables in the makefile and passed to the
script via command line arguments are now the 'args' and 'ret'
directive respectively. 'args' should be a colon separated list of
argument types (skipped if the function doesn't accept any arguments)
and 'ret' should be the return type.
Additionally, an 'includes' directive may have a comma separated list
of headers to include in the source. For example, the pow input file
now looks like this:
42.0, 42.0
1.0000000000000020, 1.5
I did this to unclutter the benchtests Makefile a bit and eventually
eliminate dependency of the tests on the Makefile and have tests
depend on their respective include files only.
Add some comments and call free on all potentially allocated pointers.
Also remove duplicate check for NULL pointer.
ChangeLog:
2013-10-04 Will Newton <will.newton@linaro.org>
* malloc/tst-valloc.c: Add comments.
(do_test): Add comments and call free on all potentially
allocated pointers. Remove duplicate check for NULL pointer.
Add space after cast.
Add some comments and call free on all potentially allocated pointers.
Also remove duplicate check for NULL pointer.
ChangeLog:
2013-10-04 Will Newton <will.newton@linaro.org>
* malloc/tst-pvalloc.c: Add comments.
(do_test): Add comments and call free on all potentially
allocated pointers. Remove duplicate check for NULL pointer.
Add space after cast.
Add some comments and call free on all potentially allocated pointers.
ChangeLog:
2013-10-04 Will Newton <will.newton@linaro.org>
* malloc/tst-posix_memalign.c: Add comments.
(do_test): Add comments and call free on all potentially
allocated pointers. Add space after cast.
* sysdeps/powerpc/powerpc32/dl-machine.c (__process_machine_rela):
Use stdint types in rather than __attribute__((mode())).
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
http://sourceware.org/ml/libc-alpha/2013-08/msg00096.html
This adds the basic configury bits for powerpc64le and powerpcle.
* configure.in: Map powerpc64le and powerpcle to base_machine/machine.
* configure: Regenerate.
* nptl/shlib-versions: Powerpc*le starts at 2.18.
* shlib-versions: Likewise.
http://sourceware.org/ml/libc-alpha/2013-08/msg00095.html
I found this useful at one stage when I was seeing a huge number of
memrchr failures all of test number 10.
* string/tester.c (test_memrchr): Increment reported test cycle.
http://sourceware.org/ml/libc-alpha/2013-08/msg00094.html
Using plain %s here runs the risk of segfaulting when displaying the
string. src and dst aren't zero terminated strings.
* string/test-memcpy.c (do_one_test): When reporting errors, print
string address and don't overrun end of string.
http://sourceware.org/ml/libc-alpha/2013-08/msg00105.html
Like strnlen, memchr and memrchr had a number of defects fixed by this
patch as well as adding little-endian support. The first one I
noticed was that the entry to the main loop needlessly checked for
"are we done yet?" when we know the size is large enough that we can't
be done. The second defect I noticed was that the main loop count was
wrong, which in turn meant that the small loop needed to handle an
extra word. Thirdly, there is nothing to say that the string can't
wrap around zero, except of course that we'd normally hit a segfault
on trying to read from address zero. Fixing that simplified a number
of places:
- /* Are we done already? */
- addi r9,r8,8
- cmpld r9,r7
- bge L(null)
becomes
+ cmpld r8,r7
+ beqlr
However, the exit gets an extra test because I test for being on the
last word then if so whether the byte offset is less than the end.
Overall, the change is a win.
Lastly, memrchr used the wrong cache hint.
* sysdeps/powerpc/powerpc64/power7/memchr.S: Replace rlwimi with
insrdi. Make better use of reg selection to speed exit slightly.
Schedule entry path a little better. Remove useless "are we done"
checks on entry to main loop. Handle wrapping around zero address.
Correct main loop count. Handle single left-over word from main
loop inline rather than by using loop_small. Remove extra word
case in loop_small caused by wrong loop count. Add little-endian
support.
* sysdeps/powerpc/powerpc32/power7/memchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise. Use proper
cache hint.
* sysdeps/powerpc/powerpc32/power7/memrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Add little-endian
support. Avoid rlwimi.
* sysdeps/powerpc/powerpc32/power7/rawmemchr.S: Likewise.
http://sourceware.org/ml/libc-alpha/2013-08/msg00104.html
One of the things I noticed when looking at power7 timing is that rlwimi
is cracked and the two resulting insns have a register dependency.
That makes it a little slower than the equivalent rldimi.
* sysdeps/powerpc/powerpc64/memset.S: Replace rlwimi with
insrdi. Formatting.
* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memset.S: Likewise.
http://sourceware.org/ml/libc-alpha/2013-08/msg00103.html
LIttle-endian support for memcpy. I spent some time cleaning up the
64-bit power7 memcpy, in order to avoid the extra alignment traps
power7 takes for little-endian. It probably would have been better
to copy the linux kernel version of memcpy.
* sysdeps/powerpc/powerpc32/power4/memcpy.S: Add little endian support.
* sysdeps/powerpc/powerpc32/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/mempcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise. Make better
use of regs. Use power7 mtocrf. Tidy function tails.
http://sourceware.org/ml/libc-alpha/2013-08/msg00102.html
This is a rather large patch due to formatting and renaming. The
formatting changes were to make it possible to compare power7 and
power4 versions of memcmp. Using different register defines came
about while I was wrestling with the code, trying to find spare
registers at one stage. I found it much simpler if we refer to a reg
by the same name throughout a function, so it's better if short-term
multiple use regs like rTMP are referred to using their register
number. I made the cr field usage changes when attempting to reload
rWORDn regs in the exit path to byte swap before comparing when
little-endian. That proved a bad idea due to the pipelining involved
in the main loop; Offsets to reload the regs were different first
time around the loop.. Anyway, I left the cr field usage changes in
place for consistency.
Aside from these more-or-less cosmetic changes, I fixed a number of
places where an early exit path restores regs unnecessarily, removed
some dead code, and optimised one or two exits.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Add little-endian support.
Formatting. Consistently use rXXX register defines or rN defines.
Use early exit labels that avoid restoring unused non-volatile regs.
Make cr field use more consistent with rWORDn compares. Rename
regs used as shift registers for unaligned loop, using rN defines
for short lifetime/multiple use regs.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memcmp.S: Likewise. Exit with
addi 1,1,64 to pop stack frame. Simplify return value code.
* sysdeps/powerpc/powerpc32/power4/memcmp.S: Likewise.