Some math functions have distinct performance characteristics in
specific domains of inputs, where some inputs return via a fast path
while other inputs require multiple precision calculations, that too
at different precision levels. The way to implement different domains
was to have a separate source file and benchmark definition, resulting
in separate programs.
This clutters up the benchmark, so this change allows these domains to
be consolidated into the same input file. To do this, the input file
format is now enhanced to allow comments with a preceding # and
directives with two # at the begining of a line. A directive that
looks like:
tells the benchmark generation script that what follows is a different
domain of inputs. The value of the 'name' directive (in this case,
foo) is used in the output. The two input domains are then executed
sequentially and their results collated separately. with the above
directive, there would be two lines in the result that look like:
func(): ....
func(foo): ...
The idea to run benchmarks for a constant number of iterations is
problematic. While the benchmarks may run for 10 seconds on x86_64,
they could run for about 30 seconds on powerpc and worse, over 3
minutes on arm. Besides that, adding a new benchmark is cumbersome
since one needs to find out the number of iterations needed for a
sufficient runtime.
A better idea would be to run each benchmark for a specific amount of
time. This patch does just that. The run time defaults to 10 seconds
and it is configurable at command line:
make BENCH_DURATION=5 bench
This patch fix the 3c0265394d commits
by correctly setting minimum architecture for modf PPC optimization
to power5+ instead of power5 (since only on power5+ round/ceil will
be inline to inline assembly).
Use the most accurate hex literals possible for the answers to the
cos and sincos tests that vary according to the error in the rounding
of PI/2.
---
2013-04-24 Carlos O'Donell <carlos@redhat.com>
* math/libm-test.inc (cos_test): Use accurate hex constants.
(sincost_test): Likewise.
Resolves#14888.
This only really manifests itself when there are no spaces between
format specifiers, which is not allowed by POSIX, but is allowed by
the glibc implementation.
Kay Sievers reported that coreutils' stat tool has a problem with
s390's statfs[64] definition:
> The definition of struct statfs::f_type needs a fix. s390 is the only
> architecture in the kernel that uses an int and expects magic
> constants lager than INT_MAX to fit into.
>
> A fix is needed to make Fedora boot on s390, it currently fails to do
> so. Userspace does not want to add code to paper-over this issue.
[...]
> Even coreutils cannot handle it:
> #define RAMFS_MAGIC 0x858458f6
> # stat -f -c%t /
> ffffffff858458f6
>
> #define BTRFS_SUPER_MAGIC 0x9123683E
> # stat -f -c%t /mnt
> ffffffff9123683e
The bug is caused by an implicit sign extension within the stat tool:
out_uint_x (pformat, prefix_len, statfsbuf->f_type);
where the format finally will be "%lx".
A similar problem can be found in the 'tail' tool.
s390 is the only architecture which has an int type f_type member in
struct statfs[64]. Other architectures have either unsigned ints or
long values, so that the problem doesn't occur there.
Therefore change the type of the f_type member to unsigned int, so
that we get zero extension instead sign extension when assignment to
a long value happens.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
We no longer support configuring for i386, nor do we
elide such a configuration to i686. Configuring with
i386-* is a failure, and we provide an example of
how to fix that.
---
2013-04-17 Carlos O'Donell <carlos@redhat.com>
* configure.in: Remove i386 configure warning. Remove i386 case.
* configure: Regenerate.
* sysdeps/i386/configure.in: Raise error if config_machine is i386.
Add example to error message.
* sysdeps/i386/configure: Regenerate.
Appending benchmark program output on every run could result in a case
where the benchmark run was cancelled, resulting in a partially
written file. This file gets used again on the next run, resulting in
results being appended to old results.
It could have been possible to remove the file before every benchmark
run, but it is easier to just write the output to bench.out-tmp only
once.
Benchmark programs are generated using parameters from the Makefile,
so it is necessary to rebuild them whenever the parameters in the
Makefile are updated. Hence, added a dependency for the generated C
source on the Makefile so that it gets regenerated when the Makefile
is updated.
The value of PI is never exactly PI in any floating point representation,
and the value of PI/2 is never PI/2. It is wrong to expect cos(M_PI_2l)
to return 0, instead it will return an answer that is non-zero because
M_PI_2l doesn't round to exactly PI/2 in the type used.
That is to say that the correct answer is to do the following:
* Take PI or PI/2.
* Round to the floating point representation.
* Take the rounded value and compute an infinite precision cos or sin.
* Use the rounded result of the infinite precision cos or sin as the
answer to the test.
I used printf to do the type rounding, and Wolfram's Alpha to do the
infinite precision cos calculations.
The following changes bring x86-64 and x86 to 1/2 ulp for two tests.
It shows that the x86 cos implementation is quite good, and that
our test are flawed.
Unfortunately given that the rounding errors are type dependent we
need to fix this for each type. No regressions on x86-64 or x86.
---
2013-04-11 Carlos O'Donell <carlos@redhat.com>
* math/libm-test.inc (cos_test): Fix PI/2 test.
(sincos_test): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerate.
* sysdeps/i386/fpu/libm-test-ulps: Regenerate.
run-via-rtld-prefix checks whether the program to be run is a static
test and skips if it is. This is fine, except that it assumes that
the program to be run is the second $^, which is true only for tests.
This change creates an rtld-prefix, which is simply the dynamic linker
prefix with the necessary arguments and uses that in the non-test
targets.
Fixes#15346.
The POSIX description of getdate allows for extra spaces in the
getdate input string. __getdate_r uses strptime internally, which
works fine with extra spaces between format strings (and hence within
an input string) but not with leading and trailing spaces. So we trim
off the leading and trailing spaces before we pass it on to strptime.
Document the use of the convenience testrun.sh script for
running the libm test.
---
2013-04-06 Carlos O'Donell <carlos@redhat.com>
* math/README.libm-test (How can I generate "libm-test-ulps"?):
Use testrun.sh to run libm tests.
The seen array was doubled in size recently, but the memset to clear
the array was not adjusted. We adjust the memset to always be correct
regardless of the size of seen.
---
2013-04-06 Carlos O'Donell <carlos@redhat.com>
[BZ #15309]
* elf/dl-open.c (dl_open_worker): memset all of seen array.
Define yesstr/nostr in fi_FI (as "Kyllä" and "Ei").
Fixes part of BZ#15264.
---
2013-04-06 Marko Myllynen <myllynen@redhat.com>
[BZ #15264]
* locales/fi_FI (LC_MESSAGES): Define yesstr and nostr.
The wiki "Regeneration" page has this to say about update ULPs.
"The libm-test-ulps files are semiautomatically updated. To
update an ulps baseline, run each of the failing tests (test-float,
test-double, etc.) with -u; this will generate a file called ULPs;
concatenate each of those files with the existing libm-test-ulps
file, after removing any entries for particularly huge numbers of
ulps that you do not want to mark as expected. Then run
gen-libm-test.pl -n -u FILE where FILE is the concatenated file
produced in the previous step. This generates a file called
NewUlps which is the new sorted version of libm-test-ulps."
The same information is listed in math/README.libm-test, and is a
lot of manual work that you often want to run over-and-over again
while working on a particular test.
The `regen-ulps' convenience target does this automatically for
developers.
We strictly assume the source tree is readonly and add a
new --output-dir option to libm-test.inc to allow for writing
out ULPs to $(objpfx).
When run the new target does the following:
* Starts with the baseline ULPs file.
* Runs each of the libm math tests with -u.
* Adds new changes seen with -u to the baseline.
* Sorts and prepares the test output with gen-libm-test.pl.
* Leaves math/NewUlps in your build tree to copy to your source
tree, cleanup, and checkin.
The math test documentation in math/README.libm-test is updated
document the new Makefile target.
---
2013-04-06 Carlos O'Donell <carlos@redhat.com>
* Makefile.in (regen-ulps): New target.
* math/Makefile [ifneq (no,$(PERL)]: Declare regen-ulps with .PHONY.
[ifneq (no,$(PERL)] (run-regen-ulps): New variable.
[ifneq (no,$(PERL)] (regen-ulps): New target.
[ifeq (no,$(PERL)] (regen-ulps): New target.
* math/libm-test.inc (ulps_file_name): Define.
(output_dir): New variable.
(options): Add "output-dir" option.
(parse_opt): Handle 'o' case.
(main): If output_dir is non-NULL use it as a prefix
otherwise use "".
* math/README.libm-test: Update `How can I generate "libm-test-ulps"?'
This change does two things:
* Treats a target i386-* as if it were i686.
* Fails configure if the user is generating code
for i386.
We no longer support i386 code-generation because the i386
lacks the atomic operations we need in glibc.
You can still configure for i386-*, but you get i686 code.
You can't build with --march=i386, --mtune=i386 or a compiler
that defaults to i386 code-generation.
I've added two i386 entries in the master todo list to discuss
merging and renaming:
http://sourceware.org/glibc/wiki/Development_Todo/Master#i386
The failure modes are fail-safe here. You compile for i386,
get i686, and try to run on i386 and it fails. The configure
log has a warning saying we elided to i686. There is no situation
that I can see where we run into any serious problems.
The patch makes the current state better in that we get less
confused users and we build successfully in more default
configurations.
The next enhancement would be to add --march=i?86
as suggested in #c20 of BZ#10062 for any i?86-* builds, which
would solve the problem of a 32-bit compiler that defaults to
i386 code-gen and glibc configured for i686-* target. Which
previously failed at build time, and now will fail at configure
time (requires adding --march=i686).
Updated NEWS with BZ #10060 and #10062.
No regressions.
---
2013-04-06 Carlos O'Donell <carlos@redhat.com>
[BZ #10060, #10062]
* aclocal.m4 (LIBC_COMPILER_BUILTIN_INLINED): New macro.
* sysdeps/i386/configure.in: Use LIBC_COMPILER_BUILTIN_INLINED and
fail configure if __sync_val_compare_and_swap is not inlined.
* sysdeps/i386/configure: Regenerate.
* configure.in: Build for i686 when configured for i386.
* configure: Regenerate.
* README: Remove i386 reference.
Write output from the currently running benchmark into a temporary
file and move files around only once the current run is complete.
That way we don't lose data from the last two runs due to an
incomplete run.
Fix BZ #15305.
On kernel versions earlier than 2.6.29, the Linux kernel exported a
sysctl called restrict_chown for xfs, which could be used to allow
chown to users other than the owner. 2.6.29 removed this support,
causing the open_not_cancel_2 to fail and thus modify errno. The fix
is to save and restore errno so that the caller sees it as unmodified.
Additionally, since the code to check the sysctl is not useful on
newer kernels, we add an ifdef so that in future the code block gets
rmeoved completely.
Separate benchmarks for the fast and slow implementations of pow and
exp since measuring both together doesn't make sense. Adjust the
iterations for pow and exp accordingly so that they run long enough
for the measurements to be meaningful.
Fixes BZ #15304.
The *initgroups_dyn functions are called with a group argument. This
group gid is usually skipped while populating the grouplist since the
caller adds that group id in advance.
The hesiod initgroups_dyn implementation however adds the group gid to
the list if it does not already exist. While it works fine for the
usual initgroups, it breaks nscd since it calls initgroups_dyn with -1
as the gid (to have all groups included).
The compiler is smart enough to convert those into double for powerpc,
but if we put them as doubles, it adds overhead by performing those
operations in floating point mode.
The mantissa of mp_no is intended to take only integral values. This
is a relatively good choice for powerpc due to its 4 fpus, but not for
other architectures, which suffer due to this choice. This change
makes the default mantissa a long integer and allows powerpc to
override it. Additionally, some operations have been optimized for
integer manipulation, resulting in a significant improvement in
performance.
The patch increase the high value to check if expl overflows. Current
high mark value is not really correct, the algorithm accepts high values.
It also adds a correct wrapper function to check for overflow and underflow.
Due to a typo repeated several times, this bug hasn't been fixed yet,
despite being marked as resolved in glibc 2.12.
* sysdeps/x86_64/strcmp.S: Replace all occurrences of NOT_IN_lib
with NOT_IN_libc.
This allows us to define custom functions in C code files and
benchmark scenarios rather than just functions. The main current use
of this is to separate the slow and fast path benchmarks for math
functions.
This adds the base chapter for POSIX threads and also documentation
for thread-specific data, along with a note on its interaction with
C++11 thread_local variables.
Fixes BZ #12723
The variable pipe buffer size does nothing to the value of PIPE_BUF,
since the number of bytes that are atomically written is still
PIPE_BUF on Linux.
* sysdeps/unix/sysv/linux/bits/mman-linux.h (MAP_ANONYMOUS):
Allow definition via __MAP_ANONYMOUS.
* sysdeps/unix/sysv/linux/mips/bits/mman.h: Remove all defines
provided by bits/mman-linux.h and include <bits/mman-linux.h>.
(__MAP_ANONYMOUS): Define.
* sysdeps/unix/sysv/linux/s390/bits/mman.h: Include
<bits/mman-linux.h>.
(MCL_CURRENT, MCL_FUTURE): Do not define here, the generic value
is fine.
* sysdeps/unix/sysv/linux/sh/bits/mman.h: Move include of
<bits/mman-linux.h> to end of file.
(MCL_CURRENT, MCL_FUTURE): Do not define here, the generic value
is fine.
* sysdeps/unix/sysv/linux/x86/bits/mman.h: Move include of
<bits/mman-linux.h> to end of file.
(MCL_CURRENT, MCL_FUTURE): Do not define here, the generic value
is fine.
* sysdeps/unix/sysv/linux/sparc/bits/mman.h: Move include of
<bits/mman-linux.h> to end of file.
* sysdeps/unix/sysv/linux/bits/mman-linux.h [!MCL_CURRENT]
(MCL_CURRENT, MCL_FUTURE): Define here.
* sysdeps/unix/sysv/linux/bits/mman-linux.h: New file, with
Linux common definitions.
* sysdeps/unix/sysv/linux/sh/bits/mman.h: Remove all defines
provided by bits/mman-linux.h and include <bits/mman-linux.h>.
* sysdeps/unix/sysv/linux/x86/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/mman.h: Likewise.
This reverts the change that allows the POSIX Thread default stack size
to be changed by the environment variable
GLIBC_PTHREAD_DEFAULT_STACKSIZE. It has been requested that more
discussion happen before this change goes into 2.18.