http://sourceware.org/ml/libc-alpha/2013-08/msg00081.html
This is the first of a series of patches to ban ieee854_long_double
and the ieee854_long_double macros when using IBM long double. union
ieee854_long_double just isn't correct for IBM long double, especially
when little-endian, and pretending it is OK has allowed a number of
bugs to remain undetected in sysdeps/ieee754/ldbl-128ibm/.
This changes the few places in generic code that use it.
* stdio-common/printf_size.c (__printf_size): Don't use
union ieee854_long_double in fpnum union.
* stdio-common/printf_fphex.c (__printf_fphex): Likewise. Use
signbit macro to retrieve sign from long double.
* stdio-common/printf_fp.c (___printf_fp): Use signbit macro to
retrieve sign from long double.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Adjust for fpnum change.
* sysdeps/ieee754/ldbl-128/printf_fphex.c: Likewise.
* sysdeps/ieee754/ldbl-96/printf_fphex.c: Likewise.
* sysdeps/x86_64/fpu/printf_fphex.c: Likewise.
* math/test-misc.c (main): Don't use union ieee854_long_double.
ports/
* sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change.
http://sourceware.org/ml/libc-alpha/2013-06/msg00919.html
I discovered a number of places where denormals and other corner cases
were being handled wrongly.
- printf_fphex.c: Testing for the low double exponent being zero is
unnecessary. If the difference in exponents is less than 53 then the
high double exponent must be nearing the low end of its range, and the
low double exponent hit rock bottom.
- ldbl2mpn.c: A denormal (ie. exponent of zero) value is treated as
if the exponent was one, so shift mantissa left by one. Code handling
normalisation of the low double mantissa lacked a test for shift count
greater than bits in type being shifted, and lacked anything to handle
the case where the difference in exponents is less than 53 as in
printf_fphex.c.
- math_ldbl.h (ldbl_extract_mantissa): Same as above, but worse, with
code testing for exponent > 1 for some reason, probably a typo for >= 1.
- math_ldbl.h (ldbl_insert_mantissa): Round the high double as per
mpn2ldbl.c (hi is odd or explicit mantissas non-zero) so that the
number we return won't change when applying ldbl_canonicalize().
Add missing overflow checks and normalisation of high mantissa.
Correct misleading comment: "The hidden bit of the lo mantissa is
zero" is not always true as can be seen from the code rounding the hi
mantissa. Also by inspection, lzcount can never be less than zero so
remove that test. Lastly, masking bitfields to their widths can be
left to the compiler.
- mpn2ldbl.c: The overflow checks here on rounding of high double were
just plain wrong. Incrementing the exponent must be accompanied by a
shift right of the mantissa to keep the value unchanged. Above notes
for ldbl_insert_mantissa are also relevant.
[BZ #15680]
* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c: Comment fix.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
(PRINT_FPHEX_LONG_DOUBLE): Tidy code by moving -53 into ediff
calculation. Remove unnecessary test for denormal exponent.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c (__mpn_extract_long_double):
Correct handling of denormals. Avoid undefined shift behaviour.
Correct normalisation of low mantissa when low double is denormal.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
(ldbl_extract_mantissa): Likewise. Comment. Use uint64_t* for hi64.
(ldbl_insert_mantissa): Make both hi64 and lo64 parms uint64_t.
Correct normalisation of low mantissa. Test for overflow of high
mantissa and normalise.
(ldbl_nearbyint): Use more readable constant for two52.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
(__mpn_construct_long_double): Fix test for overflow of high
mantissa and correct normalisation. Avoid undefined shift.
http://sourceware.org/ml/libc-alpha/2013-07/msg00001.html
This patch starts the process of supporting powerpc64 little-endian
long double in glibc. IBM long double is an array of two ieee
doubles, so making union ibm_extended_long_double reflect this fact is
the correct way to access fields of the doubles.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
(union ibm_extended_long_double): Define as an array of ieee754_double.
(IBM_EXTENDED_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Update all references
to ibm_extended_long_double and IBM_EXTENDED_LONG_DOUBLE_BIAS.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Likewise.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise.
[BZ #15915] As described in the bug, the pattern rule for lib%.so files
in Makerules includes linkobj/libc.so as a dependency. However, the
explicit rule for linkobj/libc.so is in the top-level Makefile.
Thus, the subdirectory makefiles that include Makerules end up with an
erroneous makefile pattern rule for linkobj/libc.so that includes
itself as a dependency. The result is make warnings whenever rules
for other .so files are resolved -- and, on occasion, actual makefile
failures when a race condition causes the implicit rule to actually be
used.
This patch moves the explicit rules for linkobj/libc.so into Makerules
to clear up this problem. It also elaborates a couple of comments
that I'd initially found confusing.
Colin Watson reported that some versions of gcc warn about
attribute leaf used on a static function, since it has no
effect on anything but external functions.
* posix/glob.c (next_brace_sub, prefix_array, collated_compare):
Use __THROWNL rather than __THROW on static functions.
Signed-off-by: Eric Blake <eblake@redhat.com>
sysdeps/unix/make-syscalls.sh and sysdeps/unix/Makefile use GNU Bash's
${parameter/pattern/string} parameter expansion. Non-Bash shells (e.g.
dash or BusyBox ash when built with CONFIG_ASH_BASH_COMPAT disabled)
don't support this expansion syntax. So glibc will fail to build when
$(SHELL) expands to a path that isn't provided by Bash.
An example build failure:
for dir in [...]; do \
test -f $dir/syscalls.list && \
{ sysdirs='[...]' \
asm_CPP='gcc -c -I[...] -D_LIBC_REENTRANT -include include/libc-symbols.h -DASSEMBLER -g -Wa,--noexecstack -E -x assembler-with-cpp' \
/bin/sh sysdeps/unix/make-syscalls.sh $dir || exit 1; }; \
test $dir = sysdeps/unix && break; \
done > [build-dir]/sysd-syscallsT
sysdeps/unix/make-syscalls.sh: line 273: syntax error: bad substitution
This patch simply replaces the three instances of the Bash-only syntax
in these files with an echo and sed command substitution.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This define was removed from the rest of the tree eight years ago.
ChangeLog:
2013-09-24 Will Newton <will.newton@linaro.org>
* sysdeps/mach/hurd/i386/tls.h (TLS_INIT_TP_EXPENSIVE): Remove
macro.
strcoll is implemented using a cache for indices and weights of
collation sequences in the strings so that subsequent passes do not
have to search through collation data again. For very large string
inputs, the cache size computation could overflow. In such a case,
use the fallback function that does not cache indices and weights of
collation sequences.
Fixes CVE-2012-4412.
strcoll currently falls back to alloca if malloc fails, resulting in a
possible stack overflow. This patch implements sequence traversal and
comparison without caching indices and rules.
Fixes CVE-2012-4424.
Statically built binaries use __pointer_chk_guard_local,
while dynamically built binaries use __pointer_chk_guard.
Provide the right definition depending on the test case
we are building.
The pointer guard used for pointer mangling was not initialized for
static applications resulting in the security feature being disabled.
The pointer guard is now correctly initialized to a random value for
static applications. Existing static applications need to be
recompiled to take advantage of the fix.
The test tst-ptrguard1-static and tst-ptrguard1 add regression
coverage to ensure the pointer guards are sufficiently random
and initialized to a default value.
for ChangeLog
* malloc/arena.c (new_heap): New memory_heap_new probe.
(grow_heap): New memory_heap_more probe.
(shrink_heap): New memory_heap_less probe.
(heap_trim): New memory_heap_free probe.
* malloc/malloc.c (sysmalloc): New memory_sbrk_more probe.
(systrim): New memory_sbrk_less probe.
* manual/probes.texi: Document them.
It has been a long practice for software using IEEE 754 floating-point
arithmetic run on MIPS processors to use an encoding of Not-a-Number
(NaN) data different to one used by software run on other processors.
And as of IEEE 754-2008 revision [1] this encoding does not follow one
recommended in the standard, as specified in section 6.2.1, where it
is stated that quiet NaNs should have the first bit (d1) of their
significand set to 1 while signalling NaNs should have that bit set to
0, but MIPS software interprets the two bits in the opposite manner.
As from revision 3.50 [2][3] the MIPS Architecture provides for
processors that support the IEEE 754-2008 preferred NaN encoding format.
As the two formats (further referred to as "legacy NaN" and "2008 NaN")
are incompatible to each other, tools have to provide support for the
two formats to help people avoid using incompatible binary modules.
The change is comprised of two functional groups of features, both of
which are required for correct support.
1. Dynamic linker support.
To enforce the NaN encoding requirement in dynamic linking a new ELF
file header flag has been defined. This flag is set for 2008-NaN
shared modules and executables and clear for legacy-NaN ones. The
dynamic linker silently ignores any incompatible modules it
encounters in dependency processing.
To avoid unnecessary processing of incompatible modules in the
presence of a shared module cache, a set of new cache flags has been
defined to mark 2008-NaN modules for the three ABIs supported.
Changes to sysdeps/unix/sysv/linux/mips/readelflib.c have been made
following an earlier code quality suggestion made here:
http://sourceware.org/ml/libc-ports/2009-03/msg00036.html
and are therefore a little bit more extensive than the minimum
required.
Finally a new name has been defined for the dynamic linker so that
2008-NaN and legacy-NaN binaries can coexist on a single system that
supports dual-mode operation and that a legacy dynamic linker that
does not support verifying the 2008-NaN ELF file header flag is not
chosen to interpret a 2008-NaN binary by accident.
2. Floating environment support.
IEEE 754-2008 features are controlled in the Floating-Point Control
and Status (FCSR) register and updates are needed to floating
environment support so that the 2008-NaN flag is set correctly and
the kernel default, inferred from the 2008-NaN ELF file header flag
at the time an executable is loaded, respected.
As the NaN encoding format is a property of GCC code generation that is
both a user-selected GCC configuration default and can be overridden
with GCC options, code that needs to know what NaN encoding standard it
has been configured for checks for the __mips_nan2008 macro that is
defined internally by GCC whenever the 2008-NaN mode has been selected.
This mode is determined at the glibc configuration time and therefore a
few consistency checks have been added to catch cases where compilation
flags have been overridden by the user.
The 2008 NaN set of features relies on kernel support as the in-kernel
floating-point emulator needs to be aware of the NaN encoding used even
on hard-float processors and configure the FPU context according to the
value of the 2008 NaN ELF file header flag of the executable being
started. As at this time work on kernel support is still in progress
and the relevant changes have not made their way yet to linux.org master
repository.
Therefore the minimum version supported has been artificially set to
10.0.0 so that 2008-NaN code is not accidentally run on a Linux kernel
that does not suppport it. It is anticipated that the version is
adjusted later on to the actual initial linux.org kernel version to
support this feature. Legacy NaN encoding support is unaffected, older
kernel versions remain supported.
[1] "IEEE Standard for Floating-Point Arithmetic", IEEE Computer
Society, IEEE Std 754-2008, 29 August 2008
[2] "MIPS Architecture For Programmers, Volume I-A: Introduction to the
MIPS32 Architecture", MIPS Technologies, Inc., Document Number:
MD00082, Revision 3.50, September 20, 2012
[3] "MIPS Architecture For Programmers, Volume I-A: Introduction to the
MIPS64 Architecture", MIPS Technologies, Inc., Document Number:
MD00083, Revision 3.50, September 20, 2012
The TIMING_INIT macro currently sets the number of loop iterations
to 1000, which limits usefulness. Make the argument a clock
resolution value and multiply by 1000 in bench-skeleton.c instead
to allow easier reuse.
ChangeLog:
2013-09-11 Will Newton <will.newton@linaro.org>
* benchtests/bench-timing.h (TIMING_INIT): Rename ITERS
parameter to RES. Remove hardcoded 1000 value.
* benchtests/bench-skeleton.c (main): Pass RES parameter
to TIMING_INIT and multiply result by 1000.
A large bytes parameter to memalign could cause an integer overflow
and corrupt allocator internals. Check the overflow does not occur
before continuing with the allocation.
ChangeLog:
2013-09-11 Will Newton <will.newton@linaro.org>
[BZ #15857]
* malloc/malloc.c (__libc_memalign): Check the value of bytes
does not overflow.
A large bytes parameter to valloc could cause an integer overflow
and corrupt allocator internals. Check the overflow does not occur
before continuing with the allocation.
ChangeLog:
2013-09-11 Will Newton <will.newton@linaro.org>
[BZ #15856]
* malloc/malloc.c (__libc_valloc): Check the value of bytes
does not overflow.
A large bytes parameter to pvalloc could cause an integer overflow
and corrupt allocator internals. Check the overflow does not occur
before continuing with the allocation.
ChangeLog:
2013-09-11 Will Newton <will.newton@linaro.org>
[BZ #15855]
* malloc/malloc.c (__libc_pvalloc): Check the value of bytes
does not overflow.
The end of the "Parsing of Floats" subsection currently reads:
The GNU C Library also provides '_l' versions of these functions,
which take an additional argument, the locale to use in conversion.
*Note Parsing of Integers::.
Split the final note as it is unrelated to the above comment and
reference it with "See also" instead.
The pt-chown binary is discussed in the "Running make install" section
without clarification of the needed configure option. Clarify this
and simplfy the discription which is already covered in the "Configuring
and compiling" section.
Long ago static startup did not parse the auxiliary vector and therefore
could not get at any `AT_FPUCW' tag to check whether upon FPU context
allocation the kernel would use a FPU control word setting different to
that provided by the `__fpu_control' variable. Static startup therefore
always initialized the FPU control word, forcing immediate FPU context
allocation even for binaries that otherwise never used the FPU.
As from GIT commit f8f900ecb9 static
startup supports parsing the auxiliary vector, so now it can avoid
explicit initialization of the FPU control word, just as can dynamic
startup, in the usual case where the setting written to the FPU control
word would be the same as the kernel uses. This defers FPU context
allocation until the binary itself actually pokes at the FPU.
Note that the `AT_FPUCW' tag is usually absent from the auxiliary vector
in which case _FPU_DEFAULT is assumed to be the kernel default.
The current tests don't test the functionality of realloc in detail.
Add a new test for realloc that exercises some of the corner cases
that are not otherwise tested.
ChangeLog:
2013-09-09 Will Newton <will.newton@linaro.org>
* malloc/Makefile: Add tst-realloc to tests.
* malloc/tst-realloc.c: New file.
The benchmark for memcpy got disabled accidentally. Re-enable it.
ChangeLog:
2013-09-06 Will Newton <will.newton@linaro.org>
* benchtests/Makefile (string-bench): Add memcpy.
This change synchronizes the glibc headers with the Linux kernel
headers and arranges to coordinate the definition of structures
already defined the Linux kernel UAPI headers.
It is now safe to include glibc's netinet/in.h or Linux's linux/in6.h
in any order in a userspace application and you will get the same
ABI. The ABI is guaranteed by UAPI and glibc.
Since fanotify_init requires CAP_SYS_ADMIN in order to work (which usually
means running as root), we need to handle that error case too.
Reported-by: Andreas Jaeger <aj@suse.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Switch the string benchmarks to using bench-timing.h instead
of hp-timing.h directly. This allows the string benchmarks to
be run usefully on architectures such as ARM that do not have
support for hp-timing.h.
In order to do this the tests have been changed from timing each
individual call and picking the lowest execution time recorded to
timing a number of calls and taking the mean execution time.
ChangeLog:
2013-09-04 Will Newton <will.newton@linaro.org>
* benchtests/bench-timing.h (TIMING_PRINT_MEAN): New macro.
* benchtests/bench-string.h: Include bench-timing.h instead
of including hp-timing.h directly. (INNER_LOOP_ITERS): New
define. (HP_TIMING_BEST): Delete macro. (test_init): Remove
call to HP_TIMING_DIFF_INIT.
* benchtests/bench-memccpy.c: Use bench-timing.h macros
instead of hp-timing.h macros.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memcpy.c: Likewise.
* benchtests/bench-memmem.c: Likewise.
* benchtests/bench-memmove.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcasecmp.c: Likewise.
* benchtests/bench-strcasestr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcpy_chk.c: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncasecmp.c: Likewise.
* benchtests/bench-strncat.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
* benchtests/bench-strstr.c: Likewise.
LDFLAGS puts the library too early in the command line if --as-needed
is being used. Use LDLIBS instead.
ChangeLog:
2013-09-04 Will Newton <will.newton@linaro.org>
* benchtests/Makefile: Use LDLIBS instead of LDFLAGS.
Another example of all the 64bit arches getting the definition via a
common file, but the 32bit ones all adding it by themselves and hppa
was missed.
I'm not entirely sure about the usage of GLIBC_2.19 symbols here.
We'd like to backport this so people can use it, but it means we'd
be releasing a glibc-2.17/glibc-2.18 with a GLIBC_2.19 symbol in it.
But maybe it won't be a big deal since you'd only get that 2.19 ref
if you actually used the symbol ?
There hasn't been a glibc release where hppa worked w/out a bunch of
patches, so in reality there's only two distros that matter -- Gentoo
and Debian.
Reported-by: Jeroen Roovers <jer@gentoo.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
These have helped me find and fix type conversion issues in QEMU's MIPS
hardware emulation. While certainly glibc is not the best place for such
tests, they're just an enhancement of tests already present.
Since the dlopen funcs might invoke a constructor that calls a func
that is in the same compilation unit as the caller, we cannot mark
them as leaf funcs.
Similarly, dlclose might invoke a destructor that calls a func that
is in the same compilation unit as the caller.
URL: https://sourceware.org/bugzilla/show_bug.cgi?id=15897
Reportedy-by: Fabrice Bauzac <libnoon@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This patch fixes backtrace for PPC32 and PPC64 to correctly handle
signal trampolines. The 'debug/tst-backtrace6.c' also check for
SA_SIGINFO handling, where is triggers another vDSO symbols for PPC32.
This patch fixes dlfcn/tststatic5 for PowerPC where pagesize
variable was not properly initialized in certain cases. This patch
is based on other architecture code.
The helper binary pt_chown tricked into granting access to another
user's pseudo-terminal.
Pre-conditions for the attack:
* Attacker with local user account
* Kernel with FUSE support
* "user_allow_other" in /etc/fuse.conf
* Victim with allocated slave in /dev/pts
Using the setuid installed pt_chown and a weak check on whether a file
descriptor is a tty, an attacker could fake a pty check using FUSE and
trick pt_chown to grant ownership of a pty descriptor that the current
user does not own. It cannot access /dev/pts/ptmx however.
In most modern distributions pt_chown is not needed because devpts
is enabled by default. The fix for this CVE is to disable building
and using pt_chown by default. We still provide a configure option
to enable hte use of pt_chown but distributions do so at their own
risk.
The generated header is compiled with `-ffreestanding' to avoid any
circular dependencies against the installed implementation headers.
Such a dependency would require the implementation header to be
installed before the generated header could be built (See bug 15711).
In current practice the generated header dependencies do not include
any of the implementation headers removed by the use of `-ffreestanding'.
---
2013-07-15 Carlos O'Donell <carlos@redhat.com>
[BZ #15711]
* sysdeps/unix/sysv/linux/Makefile ($(objpfx)bits/syscall%h):
Avoid system header dependency with -ffreestanding.
($(objpfx)bits/syscall%d): Likewise.
This change creates a link map in static executables to serve as the
global search list for dlopen. It fixes a problem with the inability
to access the global symbol object and a crash on an attempt to map a
DSO into the global scope. Some code that has become dead after the
addition of this link map is removed too and test cases are provided.
Many Linux arches require fixed mmaps to be aligned higher than pagesize,
so use the SHMLBA define as it represents this quantity exactly.
This fixes spurious errors seen on those arches like:
cannot map archive header: Invalid argument
URL: http://sourceware.org/bugzilla/show_bug.cgi?id=10283
Reported-by: CHIKAMA Masaki <masaki.chikama@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Rather than open coding the masks, add helper macros to do the magic.
This makes code easier to read.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Check wheter the compiler has the option -fno-tree-loop-distribute-patterns
to inhibit loop transformation to library calls and uses it on memset
and memmove default implementation to avoid recursive calls.
This patch introduces two new convenience functions to set the default
thread attributes used for creating threads. This allows a programmer
to set the default thread attributes just once in a process and then
run pthread_create without additional attributes.
GCC 4.8 enables -ftree-loop-distribute-patterns at -O3 by default and
this optimization may transform loops into memset/memmove calls. Without
proper handling this may generate unexpected PLT calls on GLIBC.
This patch fixes by create memset/memmove alias to internal GLIBC
__GI_memset/__GI_memmove symbols.
The most common use case of math functions is with default rounding
mode, i.e. rounding to nearest. Setting and restoring rounding mode
is an unnecessary overhead for this, so I've added support for a
context, which does the set/restore only if the FP status needs a
change. The code is written such that only x86 uses these. Other
architectures should be unaffected by it, but would definitely benefit
if the set/restore has as much overhead relative to the rest of the
code, as the x86 bits do.
Here's a summary of the performance improvement due to these
improvements; I've only mentioned functions that use the set/restore
and have benchmark inputs for x86_64:
Before:
cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy
exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy
pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy
sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy
tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy
After:
cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy
exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy
pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy
sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy
tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy
So the improvements are:
cos: 27.9089%
exp: 22.6919%
pow: 4.01564%
sin: 19.1585%
tan: 1.96086%
The downside of the change is that it will have an adverse performance
impact on non-default rounding modes, but I think the tradeoff is
justified.
This is the initial support for string function performance tests,
along with copying tests for memcpy and memcpy-ifunc as proof of
concept. The string function benchmarks perform operations at
different alignments and for different sizes and compare performance
between plain operations and the optimized string operations. Due to
this their output is incompatible with the function benchmarks where
we're interested in fastest time, throughput, etc.
In future, the correctness checks in the benchmark tests can be
removed. Same goes for the performance measurements in the
string/test-*.
__clock_gettime and other __clock_* functions could result in an extra
PLT reference within libc.so if it actually gets used. None of the
code currently uses them, which is why this probably went unnoticed.
When setting BENCH_DURATION in CPPFLAGS-nonlib, append to the variable
instead of assigning to it, to avoid overwriting earlier set flags,
notably the -DNOT_IN_libc=1 flag.
In 128-bit IBM long double the precision of the type
decreases as you approach subnormal numbers, equaling
that of a double for subnormal numbers. Therefore
adjust the computation in ulp to use 2^(MIN_EXP - MANT_DIG)
which is correct for FP_SUBNORMAL for all types.
Resolves: #15465
The program name may be unavailable if the user application tampers
with argc and argv[]. Some parts of the dynamic linker caters for
this while others don't, so this patch consolidates the check and
fallback into a single macro and updates all users.
Added descriptive titles to the Belarusian,
English (American), and Chinese (simplified)
po/pot files.
---
2013-05-28 Carlos O'Donell <carlos@redhat.com>
* po/be.po: Add descriptive title.
* po/zh_CN.po: Likewise.
* po/header.pot: Likewise.
When mkstemp fails, the error message the user gets back is:
cannot create temporary file: No such file or directory
That isn't terribly useful in figuring out why, so include the full
filename we tried to create in the error output.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
It is the magnitude of the return value which lies
in [0.5, 1), not the return value itself.
---
2013-05-28 Ben North <ben@redfrontdoor.org>
* manual/arith.texi (frexp): It is the magnitude of the return
value which lies in [0.5, 1), not the return value itself.
The current value used for ulp near zero is wrong,
and this commit fixes it such that ulp(0) is the smallest
subnormal value nearest to zero, which makes the most
sense for testing values near zero. Note that this is not
what Java does; they use the nearest normal value, which
is less accurate than what we want for glibc. Note that
there is no correct implementation of ulp since there
is no strict mathmatical definition that is accepted by
all groups using IEEE 754.
Previously with the large ulp values near zero there
were tests that previously passed, but were in fact
billions of ulp away from the precise answer. With this
commit we now need to disable one of the cpow tests which
is revealed to be inaccurate (bug 14473).
---
2013-05-24 Carlos O'Donell <carlos@redhat.com>
* math/libm-test.inc (MAX_EXP): Define.
(ULPDIFF): Define.
(ulp): New function.
(check_float_internal): Use ULPDIFF.
(cpow_test): Disable failing test.
(check_ulp): Test ulp() implemetnation.
(main): Call check_ulp before starting tests.
Fixes 15381.
Using wide character function is on byte oriented memstream is undefined
behaviour. This behaviour was masked by not initializing wide struct
info. We now initialize it to cause a predictable crash.
In dl-hwcaps.c the comment read that rounding was done
to ElfW(Addr), but it's actually rounded to ElfW(Word).
In ldconfig.c we make each comment a sentence and
mention that the "tls" pseudo-hwcap is just for legacy
installations where TLS was optional.
---
2013-05-22 Carlos O'Donell <carlos@redhat.com>
* elf/ldconfig.c (is_hwcap_platform): Make comments full setences.
(main): Mention "tls" pseudo-hwcap is legacy.
* elf/dl-hwcaps.c (_dl_important_hwcaps): Correct rounding comment.
This patch fixes two issues, and perhaps should be two distinct commits,
but I present it here as one for the sake of completeness.
Commit 006dd86111 fails to check malloc's
return in intl/dcigettext.c (_nl_find_msg):
~~~
freemem_size = INITIAL_BLOCK_SIZE;
newmem = (transmem_block_t *) malloc (freemem_size);
...
newmem->next = transmem_list;
transmem_list = newmem;
~~~
If malloc fails then newmem is NULL then newmem->next results in a
fault.
The fix is easy enough, check for newmem != NULL, and fall through to
the error condition below which returns (char *) -1 e.g. resource error.
The problem is that returning (char *) -1 will break all sorts of other
code, so while what we did is correct, the real failure case fix is
slightly broader.
There are 4 other places where _nl_find_msg is called, one is OK, the
other three are fixed to handle -1 error return value.
No regressions on x86-64 or x86.
However, no regressions isn't really a useful metric for this code.
The change was tested as documented here:
http://sourceware.org/glibc/wiki/Testing/WhiteBox
using SystemTap for fault injection to simulate malloc failure.
---
2013-05-03 Carlos O'Donell <carlos at redhat.com>
[BZ #15441]
* intl/dcigettext.c (DCIGETTEXT): Skip translating if _nl_find_msg
returns -1.
(_nl_find_msg): Return -1 if recursive call returned -1. If newmem is
null return -1.
* intl/loadmsgcat.c (_nl_load_domain): If _nl_find_msg returns -1 abort
loading the domain.
This helps testing for regression of BZ#15339. Creation of network
isolated environments is a privileged operation and therefore is not
included to the test.
Fixes BZ #15339.
NSS_STATUS_UNAVAIL may mean that a necessary input resource is not
available. This could occur in a number of cases including when the
network is down, system runs out of file descriptors, etc. The
correct differentiator in such a case is the h_errno, which gives the
nature of failure. In case of failures other than a simple 'not
found', we set h_errno as NETDB_INTERNAL and let errno be the
identifier for the exact error.
This implementation speed up memset in several ways. First is avoiding
expensive computed jump. Second is using fact that arguments of memset
are most of time aligned to 8 bytes.
Benchmark results on:
kam.mff.cuni.cz/~ondra/benchmark_string/memset_profile_result27_04_13.tar.bz2
We add new memcpy version that uses unaligned loads which are fast
on modern processors. This allows second improvement which is avoiding
computed jump which is relatively expensive operation.
Tests available here:
http://kam.mff.cuni.cz/~ondra/memcpy_profile_result27_04_13.tar.bz2