The previous code used to evaluate the preprocessor token is_lock_free to
a variable before starting a transaction. This behavior can cause an
error if another thread got the lock (without using a transaction)
between the evaluation of the token and the beginning of the transaction.
This bug can be triggered with the following order of events:
1. The lock accessed by is_lock_free is free.
2. Thread T1 evaluates is_lock_free and stores into register R1 that the
lock is free.
3. Thread T2 acquires the same lock used in is_lock_free.
4. T1 begins the transaction, creating a memory barrier where is_lock_free
is false, but R1 is true.
5. T1 reads R1 and doesn't abort the transaction.
6. T1 calls ELIDE_UNLOCK, which reads false from is_lock_free and decides
to unlock a lock acquired by T2, leading to undefined behavior.
This patch delays the evaluation of is_lock_free to inside a transaction
by moving this part of the code to the macro ELIDE_LOCK.
[BZ #18743]
* sysdeps/powerpc/nptl/elide.h (__elide_lock): Move most of this
code to...
(ELIDE_LOCK): ...here.
(__get_new_count): New function with part of the code from
__elide_lock that updates the value of adapt_count after a
transaction abort.
(__elided_trylock): Moved this code to...
(ELIDE_TRYLOCK): ...here.
The previous (11th) version of the Hungarian spelling rules (released
in 1984) said that the separator had to be a dot, e.g. 10.35 meaning
10 o'clock 35 minutes. glibc correctly implements this.
The brand new (12th) version, in effect since September 1, 2015 adopts
to the common use of colon (especially in the digital world) and
allows to use either separator, without even expressing a preference.
For computer systems, using colons is way more typical and probably
easier to recognize. Dot is typically used in printed materials.
It also avoids an almost ambiguous situation where a space makes a
difference, e.g. "10.15-ig" means "until 10 o'clock 15 minutes"
whereas "10. 15-ig" means "until 15th of October". So I believe using
the colon as the separator is not only more frequent in the computer
world, but is also easier and quicker to recognize for the brain that
it's about hour:minute rather than month and day. And luckily it's now
equally correct according to the official rules.
11th edition: http://helyesiras.mta.hu/helyesiras/default/akh11
12th edition: http://helyesiras.mta.hu/helyesiras/default/akh12
In both editions it's the very last (299th and 300th, respectively) rule.
Microsoft also uses and recommends a colon since at least May 2011:
http://download.microsoft.com/download/e/6/1/e61266b2-d8b4-4fe0-a553-f01dc3976675/hun-hun-StyleGuide.pdf
The time format is different in common language and in the language of
IT. In common texts we usually do not abbreviate, so the full forms are
used: “7 óra 10 perckor csörgött a telefon”. However, the short format,
consisting of numerals only, can also be used. In this case a period
must be used between the two numbers and there must not be a space
between them: “találkozzunk 10.45-kor”.
However, in software mostly the short format is used, and the numbers
are separated by a colon. An obvious example is the clock in the bottom
right corner of your screen, thus 18:31.
Only i386 implements epoll_pwait in assembly code withot cancellation
support. All other architectures implement epoll_pwait in epoll_pwait.c
with
int epoll_pwait (int epfd, struct epoll_event *events,
int maxevents, int timeout,
const sigset_t *set)
{
return SYSCALL_CANCEL (epoll_pwait, epfd, events, maxevents,
timeout, set, _NSIG / 8);
}
Although there is no test for epoll_pwait in glibc, since SYSCALL_CANCEL
works on i386 and epoll_pwait.c works for other architectures, it is
safe to assume that epoll_pwait.c with SYSCALL_CANCEL also works on
i386.
[BZ #19137]
* sysdeps/unix/sysv/linux/i386/Makefile (CFLAGS-epoll_pwait.c):
Add -fomit-frame-pointer.
* sysdeps/unix/sysv/linux/i386/epoll_pwait.S: Remove file.
Honoring the LD_POINTER_GUARD environment variable in AT_SECURE mode
has security implications. This commit enables pointer guard
unconditionally, and the environment variable is now ignored.
[BZ #18928]
* sysdeps/generic/ldsodefs.h (struct rtld_global_ro): Remove
_dl_pointer_guard member.
* elf/rtld.c (_rtld_global_ro): Remove _dl_pointer_guard
initializer.
(security_init): Always set up pointer guard.
(process_envvars): Do not process LD_POINTER_GUARD.
The powerpc32 implementation of lround and lroundf can produce
spurious exceptions from adding 0.5 then converting to integer. This
includes "inexact" from the conversion to integer (not allowed for
integer arguments to these functions), and, for larger integer
arguments, "inexact", and "overflow" when rounding upward, from the
addition. In addition, "inexact" is not allowed together with
"invalid" and so inexact addition must be avoided when the integer
will be out of range of 32-bit long, whether or not the argument is an
integer.
This patch fixes these problems. As in the powerpc64 llround
implementation, a check is added for too-large arguments; in the
powerpc64 case that means arguments at least 2^52 in magnitude (so
that 0.5 cannot be added exactly), while in this case it means
arguments for which the result would overflow "long". In those cases
a suitable overflowing value is used for the integer conversion
without adding 0.5, while for smaller arguments it's tested whether
the argument is an integer (by adding and subtracting 2^52 to the
absolute value and comparing with the original absolute value) to
avoid adding 0.5 to integers and generating spurious "inexact".
This code is not used when the power5+ sysdeps directories are used,
as there's a separate power5+ version of these functions..
Tested for powerpc. This gets test-float (for a default powerpc32
hard-float build without any --with-cpu) back to the point where it
should pass once powerpc ulps are regenerated; test-double still needs
another problem with exceptions fixed to get back to that point (and I
haven't looked lately at what default powerpc64 results are like).
[BZ #19134]
* sysdeps/powerpc/powerpc32/fpu/s_lround.S (.LC1): New object.
(.LC2): Likewise.
(.LC3): Likewise.
(__lround): Do not add 0.5 to integer or out-of-range arguments.
_dl_tlsdesc_resolve_hold calls into a C function that clobbers r0,
but it assumes the original argument is still in r0 after the call.
This can cause crash in case of concurrent TLS access when TLSDESC
is in use (-mtls-dialect=gnu2).
Run into this while fixing BZ 18572.
Both r0 and r1 are saved/restored so the stack remains 8 byte aligned.
[BZ #19129]
* sysdeps/arm/dl-tlsdesc.S (_dl_tlsdesc_resolve_hold): Save and restore
r0 and r1.
Linker in binutils 2.26 and newer generate GOT references instead
PLT references when -z now is passed to linker. We need to extend
scripts/localplt.awk to allow PLT or GOT references.
[BZ #19007]
* scripts/localplt.awk: Also allow GOT references.
* sysdeps/unix/sysv/linux/i386/localplt.data: Mark
_Unwind_Find_FDE, calloc, memalign, realloc and __libc_memalign
with "+ REL R_386_GLOB_DAT".
* sysdeps/x86_64/localplt.data: Mark calloc, memalign, realloc
and __libc_memalign with "+ RELA R_X86_64_GLOB_DAT".
The powerpc32 implementations of llroundf and llround produce spurious
and missing exceptions (some arising from such exceptions from
conversions to long long, some present even when fctidz is used).
This patch fixes those problems in a similar way to the llrint /
llrintf fixes. The spurious exceptions in the fctidz case for large
arguments arise from a converted value that saturated as LLONG_MAX
being converted back to float or double (the conversion back being
inexact, but "inexact" must not be raised together with "invalid"),
and from the subtraction x - xrf also being inexact for sufficiently
large arguments (whether the saturation was to LLONG_MAX or
LLONG_MIN); those are fixed by returning early if the argument is
large enough that no rounding is needed.
This code is not used for --with-cpu=power4 builds (I suspect the code
used in that case may also produce spurious "inexact" exceptions, but
that's something to investigate later).
Tested for powerpc.
[BZ #19125]
* sysdeps/powerpc/powerpc32/fpu/s_llround.c: Include <limits.h>,
<math_private.h> and <stdint.h>.
(__llround): Avoid conversions to and from long long int, and
subtractions, where those might raise spurious exceptions.
* sysdeps/powerpc/powerpc32/fpu/s_llroundf.c: Include
<math_private.h> and <stdint.h>.
(__llroundf): Avoid conversions to and from long long int, and
subtractions, where those might raise spurious exceptions.
When x86-64 assmebler doesn't support AVX512, we should make
_dl_runtime_resolve_avx512/_dl_runtime_profile_avx512 as aliases of
_dl_runtime_resolve_avx/_dl_runtime_profile_avx. Tested on x86-64
using GCC 5.2 with binutils 20151008 and GCC 4.8 with binutils 20130219.
There are no differences in ld.so with binutils 20151008. There are no
unexpected failures with binutils 20130219 and 20151008.
[BZ #19124]
* sysdeps/x86_64/dl-trampoline.S [!HAVE_AVX512_ASM_SUPPORT]
(_dl_runtime_resolve_avx512): Make it a hidden alias of
_dl_runtime_resolve_avx.
(_dl_runtime_profile_avx512): Make it a hidden alias of
_dl_runtime_profile_avx.
The versions of llrint and llrintf for older powerpc32 processors
convert the results of __rint / __rintf to long long int, resulting in
spurious exceptions from such casts in certain cases. This patch
makes glibc work around the problems with the libgcc conversions when
the compiler used to build glibc doesn't use the fctidz instruction
for them.
Tested for powerpc.
[BZ #16422]
* sysdeps/powerpc/powerpc32/fpu/configure.ac (libc_cv_ppc_fctidz):
New configure test.
* sysdeps/powerpc/powerpc32/fpu/configure: Regenerated.
* config.h.in [_LIBC] (HAVE_PPC_FCTIDZ): New macro.
* sysdeps/powerpc/powerpc32/fpu/s_llrint.c: Include <limits.h>,
<math_private.h> and <stdint.h>.
(__llrint): Avoid conversions to long long int where those might
raise spurious exceptions.
* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c: Include
<math_private.h> and <stdint.h>.
(__llrintf): Avoid conversions to long long int where those might
raise spurious exceptions.
Similar to the recent fix for MIPS, ARM is also missing correct
exceptions on overflow from llrint and llround functions because casts
from floating-point types to long long do not result in correct
exceptions on overflow. This patch enables the fix for this for ARM.
Tested for ARM.
[BZ #15470]
* sysdeps/arm/fix-fp-int-convert-overflow.h: New file.
For 32-bit MIPS and some other systems, various of the lrint, llrint,
lround, llround functions can be missing exceptions on overflow
because casts do not (in current GCC) result in the proper
exceptions. In the MIPS case there are two problems here: MIPS I code
generation uses an assembler macro that doesn't raise exceptions,
while the libgcc conversions of floating-point values to long long
also do not raise "invalid" on all overflow cases (and can raise
spurious "inexact").
This patch adds support in the generic code (only the functions for
which this problem has actually been seen) for forcing the "invalid"
exception in the problem cases, and enables that support for the
affected MIPS cases.
Tested for MIPS; also tested for x86_64 and x86 that installed
stripped shared libraries are unchanged by this patch.
[BZ #16399]
* sysdeps/generic/fix-fp-int-convert-overflow.h: New file.
* sysdeps/ieee754/dbl-64/s_llrint.c: Include <fenv.h>, <limits.h>
and <fix-fp-int-convert-overflow.h>.
(__llrint) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/dbl-64/s_llround.c: Include <fenv.h>, <limits.h>
and <fix-fp-int-convert-overflow.h>.
(__llround) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/dbl-64/s_lrint.c: Include
<fix-fp-int-convert-overflow.h>.
(__lrint) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/dbl-64/s_lround.c: Include
<fix-fp-int-convert-overflow.h>.
(__lround) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/flt-32/s_llrintf.c: Include <fenv.h>, <limits.h>
and <fix-fp-int-convert-overflow.h>.
(__llrintf) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/flt-32/s_llroundf.c: Include <fenv.h>,
<limits.h> and <fix-fp-int-convert-overflow.h>.
(__llroundf) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/flt-32/s_lrintf.c: Include <fenv.h>, <limits.h>
and <fix-fp-int-convert-overflow.h>.
(__lrintf) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/ieee754/flt-32/s_lroundf.c: Include <fenv.h>, <limits.h>
and <fix-fp-int-convert-overflow.h>.
(__lroundf) [FE_INVALID]: Force FE_INVALID exception as needed if
FIX_DBL_LLONG_CONVERT_OVERFLOW.
* sysdeps/mips/mips32/fpu/fix-fp-int-convert-overflow.h: New file.
The dbl-64 implementation of lrint produces incorrect results for some
arguments with 64-bit long because a 32-bit (unsigned) low part of the
mantissa is shifted left, losing high bits in the process. This patch
fixes this by casting to long int before shifting, as in lround (as
this case only applies for 64-bit long, there are no issues with
sign-extension).
Tested for mips64 (n64).
[BZ #19095]
* sysdeps/ieee754/dbl-64/s_lrint.c (__lrint): Cast low part of
mantissa to long int before shifting left.
The dbl-64, ldbl-96 and ldbl-128 implementations of lrint and llrint
fail to produce "invalid" exceptions in cases where the rounded result
overflows the target type, but truncating the floating-point argument
to the next integer towards zero does not overflow it (so in
particular casts do not produce such exceptions). (This issue cannot
arise for float, or for double with 64-bit target type, or for ldbl-96
with 64-bit target type and negative arguments, because of
insufficient precision in the floating-point type for arguments with
the relevant property to exist. It also obviously cannot arise in
FE_TOWARDZERO mode.)
This patch fixes these problems by inserting checks for the special
cases that can occur in each implementation, and explicitly raising
FE_INVALID (and avoiding the cast if it might raise spurious
FE_INEXACT, while raising FE_INEXACT explicitly in the cases where it
is needed; unlike lround and llround, FE_INEXACT is required, not
optional, for these functions for a within-range inexact result).
The fixes are conditional on FE_INVALID or FE_INEXACT being defined.
If any future architecture supports one but not both of those
exceptions, the code will fail to compile and need fixing to handle
that case (this seemed better than conditioning on both macros being
defined, resulting in code that would compile but quietly miss
exceptions on such a system).
Tested for x86_64, x86 and mips64. Tested the ldbl-96 changes (only
relevant for ia64, it appears) on x86_64 by removing the x86_64
versions of lrintl / llrintl.
[BZ #19094]
* sysdeps/ieee754/dbl-64/s_lrint.c: Include <fenv.h> and
<limits.h>.
(__lrint) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception
when result overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-128/s_llrintl.c: Include <fenv.h> and
<limits.h>.
(__llrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception
when result overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-128/s_lrintl.c: Include <fenv.h> and
<limits.h>.
(__lrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception
when result overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-96/s_llrintl.c: Include <fenv.h> and
<limits.h>.
(__llrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception
when result overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-96/s_lrintl.c: Include <fenv.h> and
<limits.h>.
(__lrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception
when result overflows but exception would not result from cast.
* math/libm-test.inc (lrint_test_data): Add more tests.
(llrint_test_data): Likewise.
The optimization introduced in commit
f13c2a8dff, causes regressions in
sorting for languages that have digraphs that change sort order, like
cs_CZ which sorts ch between h and i.
My analysis shows the fast-forwarding optimization in STRCOLL advances
through a digraph while possibly stopping in the middle which results
in a subsequent skipping of the digraph and incorrect sorting. The
optimization is incorrect as implemented and because of that I'm
removing it for 2.23, and I will also commit this fix for 2.22 where
it was originally introduced.
This patch reverts the optimization, introduces a new bug-strcoll2.c
regression test that tests both cs_CZ.UTF-8 and da_DK.ISO-8859-1 and
ensures they sort one digraph each correctly. The optimization can't be
applied without regressing this test.
Checked on x86_64, bug-strcoll2.c fails without this patch and passes
after. This will also get a fix on 2.22 which has the same bug.
The dbl-64, ldbl-96 and ldbl-128 implementations of lround and llround
fail to produce "invalid" exceptions in cases where the rounded result
overflows the target type, but truncating the floating-point argument
to the next integer towards zero does not overflow it (so in
particular casts do not produce such exceptions). (This issue cannot
arise for float, or for double with 64-bit target type, or for ldbl-96
with 64-bit target type and negative arguments, because of
insufficient precision in the floating-point type for arguments with
the relevant property to exist.)
This patch fixes these problems by inserting checks for the special
cases that can occur in each implementation, and explicitly raising
FE_INVALID (and avoiding the cast if it might raise spurious
FE_INEXACT).
Tested for x86_64, x86 and mips64.
[BZ #19088]
* sysdeps/ieee754/dbl-64/s_lround.c: Include <fenv.h> and
<limits.h>.
(__lround) [FE_INVALID]: Force FE_INVALID exception when result
overflows but exception would not result from cast.
* sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Include <fenv.h>
and <limits.h>.
(__lround) [FE_INVALID]: Force FE_INVALID exception when result
overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-128/s_llroundl.c: Include <fenv.h> and
<limits.h>.
(__llroundl) [FE_INVALID]: Force FE_INVALID exception when result
overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-128/s_lroundl.c: Include <fenv.h> and
<limits.h>.
(__lroundl) [FE_INVALID]: Force FE_INVALID exception when result
overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-96/s_llroundl.c: Include <fenv.h> and
<limits.h>.
(__llroundl) [FE_INVALID]: Force FE_INVALID exception when result
overflows but exception would not result from cast.
* sysdeps/ieee754/ldbl-96/s_lroundl.c: Include <fenv.h> and
<limits.h>.
(__lroundl) [FE_INVALID]: Force FE_INVALID exception when result
overflows but exception would not result from cast.
* math/libm-test.inc (lround_test_data): Add more tests.
(llround_test_data): Likewise.
The ldbl-128 implementations of lrintl and lroundl miss "invalid"
exceptions on systems with 32-bit long for arguments that overflow
long but have exponent below 48. This patch fixes this by rearranging
the sequence of tests in the code so the exponent < 48 case is only
used for exponents that don't overflow long.
Tested for mips64 (n32 and n64).
[BZ #19085]
* sysdeps/ieee754/ldbl-128/s_lrintl.c (__lrintl): Move test for
exponent below 48 inside case for non-overflowing exponent.
* sysdeps/ieee754/ldbl-128/s_lroundl.c (__lroundl): Likewise.
The implementation of lround in dbl-64/wordsize-64 as an alias or
wrapper for llround is always incorrect when long is not 64-bit,
because it misses required exceptions in overflow cases, as shown by
my recently added tests. This patch removes that alias / wrapper in
the non-LP64 case, together with the REGISTER_CAST_INT32_TO_INT64
macro, restoring the previous version of lround for dbl-64/wordsize-64
(newly conditioned on !_LP64).
Tested for x86_64, and for mips64 with use of dbl-64/wordsize-64
enabled.
[BZ #19079]
* sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Restore previous
file, conditioned on [!_LP64].
* sysdeps/ieee754/dbl-64/wordsize-64/s_llround.c
[!_LP64] (__lround): Do not define as function or alias.
[!_LP64] (lround): Likewise.
[!_LP64] (__lroundl): Likewise.
[!_LP64] (lroundl): Likewise.
* sysdeps/tile/sysdep.h (REGISTER_CAST_INT32_TO_INT64): Remove
macro.
* sysdeps/x86_64/x32/sysdep.h (REGISTER_CAST_INT32_TO_INT64):
Likewise.
The ldbl-128ibm expl wrapper checks the argument to determine when to
call __kernel_standard_l, thereby overriding overflowing results from
__ieee754_expl that could otherwise (given appropriately patched
libgcc) be correct for the rounding mode. This patch changes it to
check the result of __ieee754_expl instead, as other versions of this
wrapper do.
Tested for powerpc.
[BZ #19078]
* sysdeps/ieee754/ldbl-128ibm/w_expl.c (o_thres): Remove variable.
(u_thres): Likewise.
(__expl): Determine whether to call __kernel_standard_l based on
value of result, not argument.
The ldbl-128ibm implementation of logl produces a zero with the wrong
sign for logl (1) in FE_DOWNWARD mode. This patch makes it explicitly
return 0.0L in that case, as in e.g. the ldbl-128 implementation.
Tested for powerpc.
[BZ #19077]
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Return
0.0L for argument 1.0L.
The ldbl-128ibm implementation of log1pl produces an infinity with the
wrong sign for log1pl (-1) in FE_DOWNWARD mode. This patch fixes this
by changing a division (-1.0L / (x - x)) (incorrect in FE_DOWNWARD
mode) to (-1.0L / 0.0L) (correct in all rounding modes).
Tested for powerpc.
[BZ #19076]
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Divide by
constant 0.0L when computing infinite result.
The ldbl-96 version of lroundl is incorrect for systems with 64-bit
long when the argument's absolute value is just below a power of 2,
2^32 or more, and rounds up to the next integer; in such cases, it
returns 0. The problem is incrementing the high part of the mantissa
loses the high bit of the value (which is not an issue for any other
floating-point format, and is handled specially in lround when the bit
corresponding to 0.5 was in the high part rather than the low part).
This patch fixes this in a similar way to that used in llroundl:
storing the high part in an unsigned long variable before incrementing
it, so problems cannot occur in the case when this code is reachable.
I improved test coverage for both lround and llround by making them
use the same test inputs (appropriately conditioned on the size of
long in the lround case) - complete with the same comments, to make
comparison as easy as possible. (This test coverage improvement was
how I found the lroundl bug.)
Tested for x86_64 and x86.
[BZ #19071]
* sysdeps/ieee754/ldbl-96/s_lroundl.c (__lroundl): Use unsigned
long int variable to store possibly incremented high part of
mantissa.
* math/libm-test.inc (lround_test_data): Add tests used for
llround. Use [LONG_MAX > 0x7fffffff] consistently as condition
for tests requiring 64-bit long. Do not condition tests on
[TEST_FLOAT] unnecessarily.
(llround_test_data): Add tests used for lround. Add another
expectation for the "inexact" exception. Do not condition tests
on [TEST_FLOAT] unnecessarily.
On powerpc32 hard-float, older processors (ones where fcfid is not
available for 32-bit code), GCC generates conversions from integers to
floating point that wrongly convert integer 0 to -0 instead of +0 in
FE_DOWNWARD mode. This in turn results in logb and a few other
functions wrongly returning -0 when they should return +0.
This patch works around this issue in glibc as I proposed in
<https://sourceware.org/ml/libc-alpha/2015-09/msg00728.html>, so that
the affected functions can be correct and the affected tests pass in
the absence of a GCC fix for this longstanding issue (GCC bug 67771 -
if fixed, of course we can put in GCC version conditionals, and
eventually phase out the workarounds). A new macro
FIX_INT_FP_CONVERT_ZERO is added in a new sysdeps header
fix-int-fp-convert-zero.h, and the powerpc32/fpu version of that
header defines the macro based on the results of a configure test for
whether such conversions use the fcfid instruction.
Tested for x86_64 (that installed stripped shared libraries are
unchanged by the patch) and powerpc (that HAVE_PPC_FCFID comes out to
0 as expected and that the relevant tests are fixed). Also tested a
build with GCC configured for -mcpu=power4 and verified that
HAVE_PPC_FCFID comes out to 1 in that case.
There are still some other issues to fix to get test-float and
test-double passing cleanly for older powerpc32 processors (apart from
the need for an ulps regeneration for powerpc). (test-ldouble will be
harder to get passing cleanly, but with a combination of selected
fixes to ldbl-128ibm code that don't involve significant performance
issues, allowing spurious underflow and inexact exceptions for that
format, and lots of XFAILing for the default case of unpatched libgcc,
it should be doable.)
[BZ #887]
[BZ #19049]
[BZ #19050]
* sysdeps/generic/fix-int-fp-convert-zero.h: New file.
* sysdeps/ieee754/dbl-64/e_log10.c: Include
<fix-int-fp-convert-zero.h>.
(__ieee754_log10): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/dbl-64/e_log2.c: Include
<fix-int-fp-convert-zero.h>.
(__ieee754_log2): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/dbl-64/s_erf.c: Include
<fix-int-fp-convert-zero.h>.
(__erfc): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/dbl-64/s_logb.c: Include
<fix-int-fp-convert-zero.h>.
(__logb): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/flt-32/e_log10f.c: Include
<fix-int-fp-convert-zero.h>.
(__ieee754_log10f): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/flt-32/e_log2f.c: Include
<fix-int-fp-convert-zero.h>.
(__ieee754_log2f): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/flt-32/s_erff.c: Include
<fix-int-fp-convert-zero.h>.
(__erfcf): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/flt-32/s_logbf.c: Include
<fix-int-fp-convert-zero.h>.
(__logbf): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c: Include
<fix-int-fp-convert-zero.h>.
(__erfcl): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/ieee754/ldbl-128ibm/s_logbl.c: Include
<fix-int-fp-convert-zero.h>.
(__logbl): Adjust signs as needed if FIX_INT_FP_CONVERT_ZERO.
* sysdeps/powerpc/powerpc32/fpu/configure.ac: New file.
* sysdeps/powerpc/powerpc32/fpu/configure: New generated file.
* sysdeps/powerpc/powerpc32/fpu/fix-int-fp-convert-zero.h: New
file.
* config.h.in [_LIBC] (HAVE_PPC_FCFID): New macro.
ISO C requires overflowing results from nexttoward to be the
appropriate infinity independent of the rounding mode, but some
implementations use a rounding-mode-dependent result (this is the same
issue as was fixed for nextafter in bug 16677). This patch fixes the
problem by making the nexttoward implementations discard the result
from the floating-point computation that forced an overflow exception
and then return the infinity previously computed with integer
arithmetic.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #19059]
* math/s_nexttowardf.c (__nexttowardf): Do not return value from
overflowing computation.
* sysdeps/i386/fpu/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/i386/fpu/s_nexttowardf.c (__nexttowardf): Likewise.
* sysdeps/ieee754/ldbl-128/s_nexttoward.c (__nexttoward):
Likewise.
* sysdeps/ieee754/ldbl-128/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-96/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/ieee754/ldbl-96/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c (__nldbl_nexttowardf):
Likewise.
* math/libm-test.inc (nexttoward_test_data): Add more tests.
This prevents injection of ':' and '\n' into output functions which
use the NSS files database syntax. Critical fields (user/group names
and file system paths) are checked strictly. For backwards
compatibility, the GECOS field is rewritten instead.
The getent program is adjusted to use the put*ent functions in libc,
instead of local copies. This changes the behavior of getent if user
names start with '-' or '+'.
The ldbl-128 / ldbl-128ibm implementation of lgamma has problems with
its handling of large arguments. It has an overflow threshold that is
correct only for ldbl-128, despite being used for both types - with
diagnostic control macros as a temporary measure to disable warnings
about that constant overflowing for ldbl-128ibm - and it has a
calculation that's roughly x * log(x) - x, resulting in overflows for
arguments that are roughly at most a factor 1/log(threshold) below the
overflow threshold.
This patch fixes both issues, using an overflow threshold appropriate
for the type in question and adding another case for large arguments
that avoids the possible intermediate overflow.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16347]
[BZ #19046]
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c: Do not include
<libc-internal.h>.
(MAXLGM): Do not use diagnostic control macros.
[LDBL_MANT_DIG == 106] (MAXLGM): Change value to overflow
threshold for ldbl-128ibm.
(__ieee754_lgammal_r): For large arguments, multiply by log - 1
instead of multiplying by log then subtracting.
* math/auto-libm-test-in: Add more tests of lgamma.
* math/auto-libm-test-out: Regenerated.
The ldbl-128ibm implementation of exp10l uses a version of log(10)
split into high and low parts - but the low part is negative, so
causing spurious overflows from __ieee754_expl (exp_high) in cases
close to the overflow threshold (I added relevant tests close to the
overflow threshold to the testsuite earlier today). The same issue
applies close to the underflow threshold as well (except that spurious
underflows in IBM long double arithmetic are harder to fix than the
other deficiencies, so we might end up permitting those for IBM long
double in the libm testsuite, as permitted by ISO C).
This patch fixes it to use a low part rounded downward to 48 bits
instead. (The choice of 48 instead of 53 bits is to make it more
obviously safe even when the low part of the argument is negative.)
Tested for powerpc. (Note that because of libgcc bugs with
multiplication very close to LDBL_MAX, libgcc also needs patching for
all the problem cases to be fixed, but this patch is still safe and
correct in the absence of such libgcc fixes.)
[BZ #16620]
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c (log10_high): Use value
of log (10) rounded downward to 48 bits.
(log10_low): Use corresponding low part of log (10).
The i386 versions of acoshf and acosh raise a spurious "invalid"
exception for an argument that is a quiet NaN with the sign bit set.
The integer arithmetic to detect arguments < 1 also detects -NaN, and
then the computation 0 / 0 in that case raises the exception. This
patch fixes this by using (x - x) / (x - x) as the computation in that
case instead, which will always raise the exception for non-NaN
arguments reaching that code, but not for quiet NaN arguments.
Tested for x86_64 and x86.
[BZ #19032]
* sysdeps/i386/fpu/e_acosh.S (__ieee754_acosh): For arguments < 1,
compute result as (x - x) / (x - x) not as 0 / 0.
* sysdeps/i386/fpu/e_acoshf.S (__ieee754_acoshf): Likewise.
* math/libm-test.inc (acosh_test_data): Add another test of acosh.
For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large
errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids
cancellation error and then using log1p.
However, the thresholds for using that approach still result in log
being used on argument as large as sqrt(13/16) > 0.9, leading to
significant errors, in some cases above the 9ulp maximum allowed in
glibc libm. This patch arranges for the approach using log1p to be
used in any cases where |X|, |Y| < 1 and X^2 + Y^2 >= 0.5 (with the
existing allowance for cases where one of X and Y is very small),
adjusting the __x2y2m1 functions to work with the wider range of
inputs. This way, log only gets used on arguments below sqrt(1/2) (or
substantially above 1), where the error involved is much less.
Tested for x86_64, x86, mips64 and powerpc. For the ulps regeneration
I removed the existing clog and clog10 ulps before regenerating to
allow any reduced ulps to appear. Tests added include those found by
random test generation to produce large ulps either before or after
the patch, and some found by trying inputs close to the (0.75, 0.5)
threshold where the potential errors from using log are largest.
[BZ #19016]
* sysdeps/generic/math_private.h (__x2y2m1f): Update comment to
allow more cases with X^2 + Y^2 >= 0.5.
* sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise. Add -1 as
normal element in sum instead of special-casing based on values of
arguments.
* sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment.
* sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise. Add
-1 as normal element in sum instead of special-casing based on
values of arguments.
* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise.
* sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0]
(__x2y2m1): Update comment.
* sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise. Add -1
as normal element in sum instead of special-casing based on values
of arguments.
* math/s_clog.c (__clog): Handle more cases using log1p without
hypot.
* math/s_clog10.c (__clog10): Likewise.
* math/s_clog10f.c (__clog10f): Likewise.
* math/s_clog10l.c (__clog10l): Likewise.
* math/s_clogf.c (__clogf): Likewise.
* math/s_clogl.c (__clogl): Likewise.
* math/auto-libm-test-in: Add more tests of clog and clog10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The flt-32 version of powf can be inaccurate because of bugs in the
extra-precision calculation of (x-1)/(x+1) or (x-1.5)/(x+1.5) as part
of calculating log(x) with extra precision: a constant used (as part
of adding 1 or 1.5 through integer arithmetic) is incorrect, and then
the code fails to mask a computed high part before using it in
arithmetic that relies on s_h*t_h being exactly representable. This
patch fixes these bugs.
Tested for x86_64 and x86. x86_64 ulps for powf removed and
regenerated to reflect reduced ulps from the increased accuracy for
existing tests.
[BZ #18956]
* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Add 0x00400000
not 0x0040000 for high bit of mantissa. Mask with 0xfffff000 when
extracting high part.
* math/auto-libm-test-in: Add another test of pow.
* math/auto-libm-test-out: Regenerated.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
Similar to various other bugs in this area, pow functions can fail to
raise the underflow exception when the result is tiny and inexact but
one or more low bits of the intermediate result that is scaled down
(or, in the i386 case, converted from a wider evaluation format) are
zero. This patch forces the exception in a similar way to previous
fixes, thereby concluding the fixes for known bugs with missing
underflow exceptions currently filed in Bugzilla.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18825]
* sysdeps/i386/fpu/i386-math-asm.h (FLT_NARROW_EVAL_UFLOW_NONNAN):
New macro.
(DBL_NARROW_EVAL_UFLOW_NONNAN): Likewise.
(LDBL_CHECK_FORCE_UFLOW_NONNAN): Likewise.
* sysdeps/i386/fpu/e_pow.S: Use DEFINE_DBL_MIN.
(__ieee754_pow): Use DBL_NARROW_EVAL_UFLOW_NONNAN instead of
DBL_NARROW_EVAL, reloading the PIC register as needed.
* sysdeps/i386/fpu/e_powf.S: Use DEFINE_FLT_MIN.
(__ieee754_powf): Use FLT_NARROW_EVAL_UFLOW_NONNAN instead of
FLT_NARROW_EVAL. Use separate return path for case when first
argument is NaN.
* sysdeps/i386/fpu/e_powl.S: Include <i386-math-asm.h>. Use
DEFINE_LDBL_MIN.
(__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN, reloading the
PIC register.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Use
math_check_force_underflow_nonneg.
* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Force
underflow for subnormal result.
* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Use
math_check_force_underflow_nonneg.
* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Use
math_check_force_underflow.
* sysdeps/x86_64/fpu/x86_64-math-asm.h
(LDBL_CHECK_FORCE_UFLOW_NONNAN): New macro.
* sysdeps/x86_64/fpu/e_powl.S: Include <x86_64-math-asm.h>. Use
DEFINE_LDBL_MIN.
(__ieee754_powl): Use LDBL_CHECK_FORCE_UFLOW_NONNAN.
* math/auto-libm-test-in: Add more tests of pow.
* math/auto-libm-test-out: Regenerated.
Fix a regression introduced with commit 0d23a5c1 [Static dlopen
correction fallout fixes] that caused the default library search path to
be ignored for modules loaded with dlopen from static executables.
[BZ #17250]
* elf/dl-support.c (_dl_main_map): Don't initialize l_flags_1
member.
Similar to various other bugs in this area, hypot functions can fail
to raise the underflow exception when the result is tiny and inexact
but one or more low bits of the intermediate result that is scaled
down (or, in the i386 case, converted from a wider evaluation format)
are zero. This patch forces the exception in a similar way to
previous fixes.
Note that this issue cannot arise for implementations of hypotf using
double (or wider) for intermediate evaluation (if hypotf should
underflow, that means the double square root is being computed of some
number of the form N*2^-298, for 0 < N < 2^46, which is exactly
represented as a double, and whatever the rounding mode such a square
root cannot have a mantissa with all zeroes after the initial 23
bits). Thus no changes are made to hypotf implementations in this
patch, only to hypot and hypotl.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18803]
* sysdeps/i386/fpu/e_hypot.S: Use DEFINE_DBL_MIN.
(MO): New macro.
(__ieee754_hypot) [PIC]: Load PIC register.
(__ieee754_hypot): Use DBL_NARROW_EVAL_UFLOW_NONNEG instead of
DBL_NARROW_EVAL.
* sysdeps/ieee754/dbl-64/e_hypot.c (__ieee754_hypot): Use
math_check_force_underflow_nonneg in case where result might be
tiny.
* sysdeps/ieee754/ldbl-128/e_hypotl.c (__ieee754_hypotl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl):
Likewise.
* sysdeps/ieee754/ldbl-96/e_hypotl.c (__ieee754_hypotl): Likewise.
* sysdeps/powerpc/fpu/e_hypot.c (__ieee754_hypot): Likewise.
* math/auto-libm-test-in: Add more tests of hypot.
* math/auto-libm-test-out: Regenerated.
The x86_64 fma4 version of pow fails to disable contraction of
operations other than those explicitly intended to use fma
instructions, so resulting in large ulps errors on processors with
fma4 instructions, as in bug 18104 (165ulp for the test added for that
bug; error originally reported by "blaaa" on #glibc). This patch adds
$(config-cflags-nofma) for e_pow-fma4.c, corresponding to the use for
e_pow.c in sysdeps/ieee754/dbl-64/Makefile.
Tested for x86_64 on a processor with fma4.
[BZ #19003]
* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma4.c): Add
$(config-cflags-nofma).
i386 exp, hypot and pow functions can return overflowing and
underflowing values with excess range and precision; ; Wilco
Dijkstra's patches to make isfinite etc. expand inline cause this
pre-existing issue to result in test failures.
This patch fixes those functions to avoid excess range and precision
in their return values. Appropriate macros are added for the repeated
code sequences; in future I'll add more such macros and refactor
existing code forcing underflow (with or without also eliminating
excess range and precision from the return value) to use such macros.
Tested for x86. If, after this patch, you still see x86 libm test
failures with excess range or precision, please file bugs in Bugzilla.
[BZ #18980]
* sysdeps/i386/fpu/i386-math-asm.h (DEFINE_FLT_MIN): New macro.
(DEFINE_DBL_MIN): Likewise.
(FLT_NARROW_EVAL_UFLOW_NONNEG_NAN): Likewise.
(DBL_NARROW_EVAL_UFLOW_NONNEG_NAN): Likewise.
(FLT_NARROW_EVAL_UFLOW_NONNEG): Likewise.
(DBL_NARROW_EVAL_UFLOW_NONNEG): Likewise.
* sysdeps/i386/fpu/e_exp.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_exp): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
(__exp_finite): Use DBL_NARROW_EVAL_UFLOW_NONNEG.
* sysdeps/i386/fpu/e_exp10.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_exp10): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_exp10f.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_exp10f): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_exp2.S: Include <i386-math-asm.h>.
(dbl_min): Replace with use of DEFINE_DBL_MIN.
(__ieee754_exp2): Use DBL_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_exp2f.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_exp2f): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
* sysdeps/i386/fpu/e_expf.S: Include <i386-math-asm.h>.
(flt_min): Replace with use of DEFINE_FLT_MIN.
(__ieee754_expf): Use FLT_NARROW_EVAL_UFLOW_NONNEG_NAN.
(__expf_finite): Use FLT_NARROW_EVAL_UFLOW_NONNEG.
* sysdeps/i386/fpu/e_hypot.S: Include <i386-math-asm.h>.
(__ieee754_hypot): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/e_hypotf.S: Include <i386-math-asm.h>.
(__ieee754_hypotf): Use FLT_NARROW_EVAL.
* sysdeps/i386/fpu/e_pow.S: Include <i386-math-asm.h>.
(__ieee754_pow): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/e_powf.S: Include <i386-math-asm.h>.
(__ieee754_powf): Use FLT_NARROW_EVAL.
* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S
(__ieee754_expf_sse2): Convert double-precision result to single
precision.
* sysdeps/i386/fpu/libm-test-ulps: Update.
i386 scalb / scalbn / scalbln (and thus ldexp) functions for float and
double can return results with excess range (and consequently excess
precision for subnormal results). As the results of these functions
are fully determined by reference to IEEE 754 operations, this is
unambiguously a bug, apart from the testsuite failures it causes.
This patch makes those functions store their results on the stack and
load them back to eliminate the excess range. Double rounding is not
a problem, as the only cases where it could occur are when the result
overflows or underflows for extended precision, and then the
double-rounded results are the same as the single-rounded results.
The new macros will be used for more functions, more such macros
added, and existing code refactored to use such macros, in subsequent
patches.
Tested for x86. Committed.
[BZ #18981]
* sysdeps/i386/fpu/i386-math-asm.h: New file.
* sysdeps/i386/fpu/e_scalb.S: Include <i386-math-asm.h>.
(__ieee754_scalb): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/e_scalbf.S: Include <i386-math-asm.h>.
(__ieee754_scalbf): Use FLT_NARROW_EVAL.
* sysdeps/i386/fpu/s_scalbn.S: Include <i386-math-asm.h>.
(__scalbn): Use DBL_NARROW_EVAL.
* sysdeps/i386/fpu/s_scalbnf.S: Include <i386-math-asm.h>.
(__scalbnf): Use FLT_NARROW_EVAL.
built-ins when available. Since going through the PLT is expensive for these small functions,
inlining results in major speedups (about 7x on Cortex-A57 for isinf). The GCC built-ins are not
correct if signalling NaN support is required, and thus are turned off in that case (see GCC bug
66462). The test-snan.c tests sNaNs and so must be explicitly built with -fsignaling-nans.
2015-09-18 Wilco Dijkstra <wdijkstr@arm.com>
[BZ #15367]
[BZ #17441]
* math/Makefile: Build test-snan.c with -fsignaling-nans.
* math/math.h (fpclassify): Use __builtin_fpclassify when
available. (signbit): Use __builtin_signbit(f/l).
(isfinite): Use__builtin_isfinite. (isnormal): Use
__builtin_isnormal. (isnan): Use __builtin_isnan.
(isinf): Use __builtin_isinf_sign.
In ISO 8601, +03:30 is a valid time zone. Currently, strptime() only
parses it as a 2-digit time zone an believes this is +03:00. This change
makes it accept a single colon.
C99/C11 Annex G specifies the sign of the zero part of the result of
ctan (x +/- i * Inf) and ctanh (+/-Inf + i * y). This patch fixes glibc
to follow that specification, along the lines I described in my review
of Andreas's previous patch for this issue
<https://sourceware.org/ml/libc-alpha/2014-08/msg00142.html>.
Tested for x86_64.
2015-09-17 Joseph Myers <joseph@codesourcery.com>
Andreas Schwab <schwab@suse.de>
[BZ #17118]
* math/s_ctan.c (__ctan): Determine sign of zero real part of
result when imaginary part of argument is infinite using sine and
cosine.
* math/s_ctanf.c (__ctanf): Likewise.
* math/s_ctanl.c (__ctanl): Likewise.
* math/s_ctanh.c (__ctanh): Determine sign of zero imaginary part
of result when real part of argument is infinite using sine and
cosine.
* math/s_ctanhf.c (__ctanhf): Likewise.
* math/s_ctanhl.c (__ctanhl): Likewise.
* math/libm-test.inc (ctan_test_data): Add more tests of ctan.
(ctanh_test_data): Add more tests of ctanh.
Bug 15384 notes that in __finite, two different constants are used
that could be the same constant (the result only depends on the
exponent of the floating-point representation), and that using the
same constant is better for architectures where constants need loading
from a constant pool. This patch implements that change.
Tested for x86_64, mips64 and powerpc.
[BZ #15384]
* sysdeps/ieee754/dbl-64/s_finite.c (FINITE): Use same constant as
bit-mask as in subtraction.
* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c (__finite):
Likewise.
* sysdeps/ieee754/flt-32/s_finitef.c (FINITEF): Likewise.
* sysdeps/ieee754/ldbl-128/s_finitel.c (__finitel): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_finitel.c (__finitel): Likewise.
Similar to various other bugs in this area, tgamma functions can fail
to raise the underflow exception when the result is tiny and inexact
but one or more low bits of the intermediate result that is scaled
down are zero. This patch forces the exception in a similar way to
previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18951]
* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Force
underflow exception for small results.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* math/auto-libm-test-in: Add more tests of tgamma.
* math/auto-libm-test-out: Regenerated.
As noted in bug 6803, scalbn fails to set errno on overflow and
underflow. This patch fixes this by making scalbn an alias of ldexp,
which has exactly the same semantics (for floating-point types with
radix 2) and already has wrappers that deal with setting errno,
instead of an alias of the internal __scalbn (which ldexp calls).
Notes:
* Where compat symbols were defined for scalbn functions, I didn't
change what they point to (to keep the patch minimal), so such
compat symbols continue to go directly to the non-errno-setting
functions.
* Mike, I didn't do anything with the IA64 versions of these
functions, where I think both the ldexp and scalbn functions already
deal with setting errno. As a cleanup (not needed to fix this bug)
however you might want to make those functions into aliases for
IA64; there is no need for them to be separate function
implementations at all.
* This concludes the fix for bug 6803 since the scalb and scalbln
cases of that bug were fixed some time ago.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #6803]
* math/s_ldexp.c (scalbn): Define as weak alias of __ldexp.
[NO_LONG_DOUBLE] (scalbnl): Define as weak alias of __ldexp.
* math/s_ldexpf.c (scalbnf): Define as weak alias of __ldexpf.
* math/s_ldexpl.c (scalbnl): Define as weak alias of __ldexpl.
* sysdeps/i386/fpu/s_scalbn.S (scalbn): Remove alias.
* sysdeps/i386/fpu/s_scalbnf.S (scalbnf): Likewise.
* sysdeps/i386/fpu/s_scalbnl.S (scalbnl): Likewise.
* sysdeps/ieee754/dbl-64/s_scalbn.c (scalbn): Likewise.
[NO_LONG_DOUBLE] (scalbnl): Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (scalbn):
Likewise.
[NO_LONG_DOUBLE] (scalbnl): Likewise.
* sysdeps/ieee754/flt-32/s_scalbnf.c (scalbnf): Likewise.
* sysdeps/ieee754/ldbl-128/s_scalbnl.c (scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (scalbnl): Remove
long_double_symbol calls.
* sysdeps/ieee754/ldbl-64-128/s_scalbnl.c (scalbnl): Likewise.
* sysdeps/ieee754/ldbl-opt/s_ldexpl.c (__ldexpl_2): Define as
strong alias of __ldexpl.
(scalbnl): Define using long_double_symbol.
* sysdeps/m68k/m680x0/fpu/s_scalbn.c (__CONCATX(scalbn,suffix)):
Remove alias.
* sysdeps/sparc/sparc64/soft-fp/s_scalbnl.c (scalbnl): Likewise.
* sysdeps/x86_64/fpu/s_scalbnl.S (scalbnl): Likewise.
* math/libm-test.inc (scalbn_test_data): Add errno expectations.
(scalbln_test_data): Add more errno expectations.
The ldbl-128 and ldbl-128ibm expm1l implementations have code to
handle +Inf and finite arguments above an overflow threshold. Since
they now use __expl for large positive arguments to fix other
problems, this code is unreachable; this patch removes it.
Tested for mips64 and powerpc.
[BZ #16415]
* sysdeps/ieee754/ldbl-128/s_expm1l.c (maxlog): Remove variable.
(__expm1l): Remove code to handle positive infinity and overflow.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (maxlog): Remove
variable.
(__expm1l): Remove code to handle positive infinity and overflow.
math.h incorrectly declares various functions for XSI POSIX 2001 and
2008 editions. gamma was removed in the 2001 edition but is still
declared, along with gammaf and gammal which were never standard
functions. isnan is still declared as a function, along with isnanf
and isnanl which were never standard functions, although in 2001 the
function was replaced by the type-generic macro. scalbf and scalbl
are declared although never standard, and scalb was removed in the
2008 edition but is still declared. The scalb type-generic macro in
tgmath.h shouldn't be present for any POSIX version, since POSIX never
had such a type-generic macro.
This patch disables all those declarations in the relevant cases (as a
minimal fix, it leaves them enabled for __USE_MISC). For the matter
of declaring scalb but not scalbf or scalbl for the 2001 edition, a
new macro __MATH_DECLARING_DOUBLE is added, defined by math.h around
includes of bits/mathcalls.h, for bits/mathcalls.h to use to test
which type's functions are being declared.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18967]
* math/math.h (__MATH_DECLARING_DOUBLE): New macro. Define and
undefine around includes of <bits/mathcalls.h>.
* math/bits/mathcalls.h [!__USE_MISC && __USE_XOPEN2K] (isnan): Do
not declare function.
[!__USE_MISC && __USE_XOPEN2K] (gamma): Likewise.
[!__USE_MISC && (!__MATH_DECLARING_DOUBLE || __USE_XOPEN2K8)]
(scalb): Likewise.
* math/tgmath.h [!__USE_MISC && __USE_XOPEN_EXTENDED] (scalb): Do
not define macro.
* conform/Makefile (test-xfail-XOPEN2K/math.h/conform): Remove
variable.
(test-xfail-XOPEN2K/tgmath.h/conform): Likewise.
(test-xfail-XOPEN2K8/math.h/conform): Likewise.
(test-xfail-XOPEN2K8/tgmath.h/conform): Likewise.
The ldbl-128ibm implementation of nearbyintl wrongly uses signaling
comparisons such as "if (fabs (u.d[0].d) < TWO52)" on arguments that
might be NaNs, when "invalid" exceptions should not be raised. (For
hard float, this issue may be hidden by
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684>, powerpc GCC
wrongly only using unordered comparison instructions.) This patch
fixes this by just returning the argument if it is not finite (because
of the arbitrary value of the low part of a NaN in IBM long double,
there are quite a lot of comparisons that could end up involving a NaN
when the argument to nearbyintl is a NaN, so excluding NaN arguments
at the start is the simplest and safest fix).
Tested for powerpc-nofpu, where it removes failures for spurious
"invalid" exceptions from nearbyintl.
[BZ #18857]
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c (__nearbyintl): Just
return non-finite argument without doing ordered comparisons on
it.
Bug 16296 notes that fegetround is a pure function and should be
marked as such in fenv.h. This patch implements that.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by this patch).
[BZ #16296]
* math/fenv.h (fegetround): Use __attribute_pure__.
* include/fenv.h (__fegetround): Likewise.
Similar to various other bugs in this area, ctan and ctanh can fail to
raise the underflow exception for some cases of results that are tiny
and inexact. This patch forces the exception in a similar way to
previous fixes.
Tested for x86_64 and x86.
[BZ #18595]
* math/s_ctan.c (__ctan): Force underflow exception for results
whose real or imaginary part has small absolute value.
* math/s_ctanf.c (__ctanf): Likewise.
* math/s_ctanh.c (__ctanh): Likewise.
* math/s_ctanhf.c (__ctanhf): Likewise.
* math/s_ctanhl.c (__ctanhl): Likewise.
* math/s_ctanl.c (__ctanl): Likewise.
* math/auto-libm-test-in: Do not allow missing underflow for ctan
and ctanh. Add more tests of ctan and ctanh.
Bug 15918 points out that the handling of infinities in hypotf can be
simplified: it's enough to return the absolute value of the infinite
argument without first comparing it to the other argument and possibly
returning that other argument's absolute value. This patch makes that
cleanup (which should not change how hypotf behaves on any input).
Tested for x86_64.
[BZ #15918]
* sysdeps/ieee754/flt-32/e_hypotf.c (__ieee754_hypotf): Simplify
handling of cases where one argument is an infinity.
On i386, the double version of exp10 can miss underflow exceptions if
the result is in the subnormal range for double but the last 11 bits
of the 64-bit extended-precision mantissa happen to be zero. This
patch forces the exception in a similar way to previous fixes.
As with the exp2 and exp fixes, the exp10f changes may in fact not be
needed to ensure underflow exceptions, but are included for
consistency and to fix the exp10 part of bug 18875 by ensuring that
excess range and precision is removed from underflowing return values.
Tested for x86_64 and x86.
[BZ #18875]
[BZ #18966]
* sysdeps/i386/fpu/e_exp10.S (dbl_min): New object.
(MO): New macro.
(__ieee754_exp10): For small results, force underflow exception
and remove excess range and precision from return value.
* sysdeps/i386/fpu/e_exp10f.S (flt_min): New object.
(MO): New macro.
(__ieee754_exp10f): For small results, force underflow exception
and remove excess range and precision from return value.
* math/auto-libm-test-in: Add more tests of exp10.
* math/auto-libm-test-out: Regenerated.
On i386, the double version of exp can miss underflow exceptions if
the result is in the subnormal range for double but the last 11 bits
of the 64-bit extended-precision mantissa happen to be zero. This
patch forces the exception in a similar way to previous fixes.
As with the exp2 fixes, the expf changes may in fact not be needed to
ensure underflow exceptions, but are included for consistency and to
fix the exp part of bug 18875 by ensuring that excess range and
precision is removed from underflowing return values.
Tested for x86_64 and x86.
[BZ #18875]
[BZ #18961]
* sysdeps/i386/fpu/e_exp.S (dbl_min): New object.
(MO): New macro.
(__ieee754_exp): For small results, force underflow exception and
remove excess range and precision from return value.
(__exp_finite): Likewise.
* sysdeps/i386/fpu/e_expf.S (flt_min): New object.
(MO): New macro.
(__ieee754_expf): For small results, force underflow exception and
remove excess range and precision from return value.
(__expf_finite): Likewise.
* math/auto-libm-test-in: Add more tests of exp.
* math/auto-libm-test-out: Regenerated.
Various exp2 implementations in glibc can miss underflow exceptions
when the scaling down part of the calculation is exact (or, in the x86
case, when the conversion from extended precision to the target
precision is exact). This patch forces the exception in a similar way
to previous fixes.
The x86 exp2f changes may in fact not be needed for this purpose -
it's likely to be the case that no argument of type float has an exp2
result so close to an exact subnormal float value that it equals that
value when rounded to 64 bits (even taking account of variation
between different x86 implementations). However, they are included
for consistency with the changes to exp2 and so as to fix the exp2f
part of bug 18875 by ensuring that excess range and precision is
removed from underflowing return values.
Tested for x86_64, x86 and mips64.
[BZ #16521]
[BZ #18875]
* math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for
small results.
* sysdeps/i386/fpu/e_exp2.S (dbl_min): New object.
(MO): New macro.
(__ieee754_exp2): For small results, force underflow exception and
remove excess range and precision from return value.
* sysdeps/i386/fpu/e_exp2f.S (flt_min): New object.
(MO): New macro.
(__ieee754_exp2f): For small results, force underflow exception
and remove excess range and precision from return value.
* sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object.
(MO): New macro.
(__ieee754_exp2l): Force underflow exception for small results.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object.
(MO): New macro.
(__ieee754_exp2l): Force underflow exception for small results.
* math/auto-libm-test-in: Add more tests or exp2.
* math/auto-libm-test-out: Regenerated.
If you pass in a path that fails to be opened, then output_path is set to
NULL, and an error is flagged. Then at the end, we use both of those:
cannot write output files to `(null)': No such file or directory
Tweak the message to use the user's input when output_path is NULL.
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/atomic.h to atomic-machine.h to follow that
convention.
This is the only change in this series that needs to change the
filename rather than simply removing a directory level (because both
atomic.h and bits/atomic.h exist at present).
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).
[BZ #14912]
* sysdeps/aarch64/bits/atomic.h: Move to ...
* sysdeps/aarch64/atomic-machine.h: ...here.
(_AARCH64_BITS_ATOMIC_H): Rename macro to
_AARCH64_ATOMIC_MACHINE_H.
* sysdeps/alpha/bits/atomic.h: Move to ...
* sysdeps/alpha/atomic-machine.h: ...here.
* sysdeps/arm/bits/atomic.h: Move to ...
* sysdeps/arm/atomic-machine.h: ...here. Update comments.
* bits/atomic.h: Move to ...
* sysdeps/generic/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/i386/bits/atomic.h: Move to ...
* sysdeps/i386/atomic-machine.h: ...here.
* sysdeps/ia64/bits/atomic.h: Move to ...
* sysdeps/ia64/atomic-machine.h: ...here.
* sysdeps/m68k/coldfire/bits/atomic.h: Move to ...
* sysdeps/m68k/coldfire/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ...
* sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here.
* sysdeps/microblaze/bits/atomic.h: Move to ...
* sysdeps/microblaze/atomic-machine.h: ...here.
* sysdeps/mips/bits/atomic.h: Move to ...
* sysdeps/mips/atomic-machine.h: ...here.
(_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H.
* sysdeps/powerpc/bits/atomic.h: Move to ...
* sysdeps/powerpc/atomic-machine.h: ...here. Update comments.
* sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ...
* sysdeps/powerpc/powerpc32/atomic-machine.h: ...here. Update
comments. Include <atomic-machine.h> instead of <bits/atomic.h>.
* sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ...
* sysdeps/powerpc/powerpc64/atomic-machine.h: ...here. Include
<atomic-machine.h> instead of <bits/atomic.h>.
* sysdeps/s390/bits/atomic.h: Move to ...
* sysdeps/s390/atomic-machine.h: ...here.
* sysdeps/sparc/sparc32/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc32/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here.
* sysdeps/sparc/sparc64/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc64/atomic-machine.h: ...here.
* sysdeps/tile/bits/atomic.h: Move to ...
* sysdeps/tile/atomic-machine.h: ...here.
* sysdeps/tile/tilegx/bits/atomic.h: Move to ...
* sysdeps/tile/tilegx/atomic-machine.h: ...here. Include
<sysdeps/tile/atomic-machine.h> instead of
<sysdeps/tile/bits/atomic.h>.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/tile/tilepro/bits/atomic.h: Move to ...
* sysdeps/tile/tilepro/atomic-machine.h: ...here. Include
<sysdeps/tile/atomic-machine.h> instead of
<sysdeps/tile/bits/atomic.h>.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here. Include
<sysdeps/arm/atomic-machine.h> instead of
<sysdeps/arm/bits/atomic.h>.
* sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here.
(_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here.
* sysdeps/x86_64/bits/atomic.h: Move to ...
* sysdeps/x86_64/atomic-machine.h: ...here.
* include/atomic.h: Include <atomic-machine.h> instead of
<bits/atomic.h>.
The ldbl-128 / ldbl-128ibm implementation of lgammal converts (the
floor of minus) non-integer negative arguments to int to determine the
value of signgam. When those values are outside the range of int,
this produces spurious "invalid" exceptions and incorrect values of
signgam. This patch fixes this by instead determining signgam through
comparing half the integer in question to floor of half the integer.
Tested for mips64, x86_64 and x86.
[BZ #18952]
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Do
not convert non-integer negative arguments to int to determine the
value of signgam.
* math/auto-libm-test-in: Add more tests of lgamma.
* math/auto-libm-test-out: Regenerated.
The existing implementations of lgamma functions (except for the ia64
versions) use the reflection formula for negative arguments. This
suffers large inaccuracy from cancellation near zeros of lgamma (near
where the gamma function is +/- 1).
This patch fixes this inaccuracy. For arguments above -2, there are
no zeros and no large cancellation, while for sufficiently large
negative arguments the zeros are so close to integers that even for
integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation
is not significant. Thus, it is only necessary to take special care
about cancellation for arguments around a limited number of zeros.
Accordingly, this patch uses precomputed tables of relevant zeros,
expressed as the sum of two floating-point values. The log of the
ratio of two sines can be computed accurately using log1p in cases
where log would lose accuracy. The log of the ratio of two gamma(1-x)
values can be computed using Stirling's approximation (the difference
between two values of that approximation to lgamma being computable
without computing the two values and then subtracting), with
appropriate adjustments (which don't reduce accuracy too much) in
cases where 1-x is too small to use Stirling's approximation directly.
In the interval from -3 to -2, using the ratios of sines and of
gamma(1-x) can still produce too much cancellation between those two
parts of the computation (and that interval is also the worst interval
for computing the ratio between gamma(1-x) values, which computation
becomes more accurate, while being less critical for the final result,
for larger 1-x). Because this can result in errors slightly above
those accepted in glibc, this interval is instead dealt with by
polynomial approximations. Separate polynomial approximations to
(|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8
from -3 to -2, where n (-3 or -2) is the nearest integer to the
1/8-interval and x0 is the zero of lgamma in the relevant half-integer
interval (-3 to -2.5 or -2.5 to -2).
Together, the two approaches are intended to give sufficient accuracy
for all negative arguments in the problem range. Outside that range,
the previous implementation continues to be used.
Tested for x86_64, x86, mips64 and powerpc. The mips64 and powerpc
testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm
with large negative arguments giving spurious "invalid" exceptions
(exposed by newly added tests for cases this patch doesn't affect the
logic for); I'll address those problems separately.
[BZ #2542]
[BZ #2543]
[BZ #2558]
* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call
__lgamma_neg for arguments from -28.0 to -2.0.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call
__lgamma_negf for arguments from -15.0 to -2.0.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0.
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r):
Call __lgamma_negl for arguments from -33.0 to -2.0.
* sysdeps/ieee754/dbl-64/lgamma_neg.c: New file.
* sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise.
* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
* sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise.
* sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise.
* sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise.
* sysdeps/generic/math_private.h (__lgamma_negf): New prototype.
(__lgamma_neg): Likewise.
(__lgamma_negl): Likewise.
(__lgamma_product): Likewise.
(__lgamma_productl): Likewise.
* math/Makefile (libm-calls): Add lgamma_neg and lgamma_product.
* math/auto-libm-test-in: Add more tests of lgamma.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Topic: strptime supports a %z input field descriptor, which parses a
time zone offset from UTC time into the broken-out time field tm_gmtoff.
Problems:
1) In the current implementation, the minutes portion calculation is
correct only for minutes evenly divisible by 3. This is because the
minutes value is converted to decimal time, but inadequate precision
leads to rounding which calculates results that are too low for
some values.
For example, due to rounding, a +1159 offset string results in an
incorrect tm_gmtoff of 43128 (== 11 * 3600 + 58.8 * 60) seconds,
instead of 43140 (== 11 * 3600 + 59 * 60) seconds. In contrast,
a +1157 offset (minutes divisible by 3) does not cause the bug,
and results in a correct tm_gmtoff of 43020.
2) strptime's %z specifier will not parse time offsets less than
-1200 or greater than +1200, or if only hour digits are present, less
than -12 or greater than +12. It will return NULL for offsets outside
that range. These limits do not meet historical and modern use cases:
* Present day exceeds the +1200 limit:
- Pacific/Auckland (New Zealand) summer time is +1300.
- Pacific/Kiritimati (Christmas Island) is +1400.
- Pacific/Apia (Samoa) summer time is +1400.
* Historical offsets exceeded +1500/-1500.
* POSIX supports -2459 to +2559.
* Offsets up to +/-9959 may occasionally be useful.
* Paul Eggert's notes provide additional detail:
- https://sourceware.org/ml/libc-alpha/2014-12/msg00068.html
- https://sourceware.org/ml/libc-alpha/2014-12/msg00072.html
3) tst-strptime2, part of the 'make check' test suite, does not test
for the above problems.
Corrective actions:
1) In time/strptime_l.c, calculate the offset from the hour and
minute portions directly, without the rounding errors introduced by
decimal time.
2) Remove the +/-1200 range limit, permitting strptime to parse offsets
from -9959 through +9959.
3) Add zone offset values to time/tst-strptime2.c.
* Test minutes evenly divisible by three (+1157) and not evenly
divisible by three (+1158 and +1159).
* Test offsets near the old and new range limits (-1201, -1330, -2459,
-2500, -99, -9959, +1201, +1330, +1400, +1401, +2559, +2600, +99,
and +9959)
The revised strptime passes all old and new tst-strptime2 tests.
This patch fixes the default wordsize-32 mmap implementation offset
calculation for negative values. Current code uses signed shift
operation to calculate the multiple size to use with syscall and
it is implementation defined. Change it to use a division base
on mmap page size (default being as before, 4096).
Tested on armv7hf.
[BZ #18877]
* posix/Makefile (tests): Add tst-mmap-offset.
* posix/tst-mmap.c: New file.
* sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c (__mmap): Fix
offset calculation for negative values.
This patch set introduces optimized string, wcsmbs and memory functions for
S390/S390x. The functions are accelerated by the usage of the new z13 vector
instructions.
The Principles of Operations manual for IBM z13 is publically available:
http://publibfi.boulder.ibm.com/epubs/pdf/dz9zr010.pdf
The support for these instructions in assembler was introduced by commits:
-"[Committed] S/390: Add support for IBM z13."
(https://sourceware.org/ml/binutils/2015-01/msg00197.html)
-"[Committed] S/390: Add more IBM z13 instructions"
(https://sourceware.org/ml/binutils/2015-03/msg00088.html)
The first patches do preparation for the latter optimization patches.
The floating point exception handling - fetestexcept(), ... - is fixed and
the platform and hwcap strings are extended.
The current ifunc routines memset, memcpy and memcmp are refactored and the
ifunc test-framework is now enabled.
A S390 specific configure-check tests if the used binutils supports the new
vector instructions. The optimized functions are provided via ifunc if the
binutils supports the vector instructions. Otherwise a message is dumped to
configure output and only the currently used common code functions are
available.
The optimized functions are implemented in common for s390-32 and s390-64
and the few differences are handled via #ifdef.
The ifunc-resolvers are defined in files sysdeps/s390/multiarch/<func>.c,
which choose either the current implementation __<func>_c() or the vector
implementation __<func>_vx() depending on the HWCAP_S390_VX flag bit in
AT_HWCAP field. If the bit is set, the hardware and the kernel are supporting
vector registers and instructions. If the used binutils lacks vector-support,
then the default implementation in string or wcsmbs directory is included
here instead.
The file sysdeps/s390/multiarch/<func>-c.c includes the current implementation
and defines the function name __<func>_c.
The assembler files sysdeps/s390/multiarch/<func>-vx.S with the vector
instructions are using the directive '.machine "z13"' to allow building glibc
without option '-march=z13'. Additionally the directive '.machinemode
"zarch_nohighgprs"' is needed for the 31bit glibc. This mode does not set the
highgprs flag in ELF header, which would lead to an unloadable libc on a 31bit
kernel.
The most optimized string functions are structured in the same way:
The first 16 bytes of the string is loaded unaligned via vlbb - vector load
to block boundary (e.g. 4k). This instruction loads 16 bytes if possible.
In case of a page cross, it only loads the last bytes of the current page
without a segmentation fault.
Afterwards these first part of string is processed. If e.g. for strlen the end
of string is reached within this first part, the function returns. Otherwise
the pointer is aligned to 16 byte, so i can load a full vector register with vl
without checking for a page cross. Afterwards the first part of string is
processed. If e.g. for strlen the end of string is reached within this first
part, the function returns. Otherwise the pointer is aligned to 16 byte, so
a full vector register can be loaded with vl - vector load - without checking
for a page cross. The remaining string is processed in a four times unrolled
loop, because benchmark results measured improvements compared to a non
unrolled loop.
The optimized wide string functions can only handle 4byte aligned string
pointers. Although a wchar_t pointer should always be 4byte aligned, the most
current common code wide string functions can handle non aligned strings.
Thus the optimized functions will fall back to the common code functions in
case of a non aligned wide string to behave the same as before this patch.
Some string tests can test the string and the wide string version of a function.
The remaining ones are extended and new wide string tests are added.
This is the same in case of the benchtests.
ChangeLog:
* NEWS: New item for IBM z13 string optimizations.
The csqrt implementations in glibc can miss underflow exceptions when
the real or imaginary part of the result becomes tiny in the course of
scaling down (in particular, multiplication by 0.5) and that scaling
is exact although the relevant part of the mathematical result isn't.
This patch forces the exception in a similar way to previous fixes.
Tested for x86_64 and x86.
[BZ #18370]
* math/s_csqrt.c (__csqrt): Force underflow exception for results
whose real or imaginary part has small absolute value.
* math/s_csqrtf.c (__csqrtf): Likewise.
* math/s_csqrtl.c (__csqrtl): Likewise.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
lang_lib (which reflects ISO 639-2/B (bibliographic) codes) and
lang_term (which reflects ISO 639-2/T (terminology) codes) should be
identical except for those languages for which ISO 639-2 specifies
separate bibliographic/terminology values.
I used this Library of Congress page as the source:
http://www.loc.gov/standards/iso639-2/php/code_list.php
The csqrt functions scale up small arguments to avoid underflows when
calling hypot functions. However, even when hypot does not underflow,
a subsequent calculation of 0.5 * hypot can underflow. This patch
duly increases the threshold and scale factor to avoid such underflows
as well.
Tested for x86_64, x86 and mips64.
[BZ #18823]
* math/s_csqrt.c (__csqrt): Increase threshold and scale factor
for scaling up small arguments.
* math/s_csqrtf.c (__csqrtf): Likewise.
* math/s_csqrtl.c (__csqrtl): Likewise.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
I think the last clause of the conditional,
|| __n <= __bos (__dest)
may be backward. The code should call the runtime-checking function
if __n is not constant, or if __n is known to be LARGER than the size
of the destination.
Various fma implementations have logic that, when computing fma (x, y,
z) where z is large (so care needs taking to avoid internal overflow)
but x * y is small, scale x * y up instead of down to avoid internal
underflows resulting from scaling down. (In these cases, x * y is
small enough that only its sign actually matters rather than the exact
value.)
The threshold for scaling up instead of down was correct for "if the
unscaled values were multiplied, the low part of the multiplication
could underflow", and the scaling was sufficient to ensure that the
low part of the multiplication did not underflow (given that cases of
very small x * y - less than half the least subnormal - were
previously dealt with). However, the choice in the functions wasn't
between scaling up or no scaling, but between scaling up and scaling
down (scaling down actually being needed when x * y isn't so small
compared to z and so the exact value does matter). Thus a larger
threshold is needed to ensure that scaling down doesn't produce values
the multiplication of whose low parts underflows. This patch
increases the thresholds accordingly.
Tested for x86_64, x86 and mips64 (with the MIPS version of s_fmal.c
removed so that the ldbl-128 version gets tested instead of the
soft-fp one).
[BZ #18824]
* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Increase threshold for
scaling x * y up instead of down.
* sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
* sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
* math/auto-libm-test-in: Add more tests of fma.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some tanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16520]
* sysdeps/ieee754/dbl-64/s_tanh.c: Include <float.h>.
(__tanh): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_tanhf.c: Include <float.h>.
(__tanhf): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_tanhl.c: Include <float.h>.
(__tanhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: Include <float.h>.
(__tanhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-96/s_tanhl.c: Include <float.h>.
(__tanhl): Force underflow exception for arguments with small
absolute value.
* math/auto-libm-test-in: Add more tests of tanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
https://sourceware.org/bugzilla/show_bug.cgi?id=18778
If dlopen fails to load an object that has triggered loading libpthread it
causes ld.so to unload libpthread because its DF_1_NODELETE flags has been
forcefully cleared. The next call to __rtdl_unlock_lock_recursive will crash
since pthread_mutex_unlock no longer exists.
This patch moves l->l_flags_1 &= ~DF_1_NODELETE out of loop through all loaded
libraries and performs the action only on inconsistent one.
[BZ #18778]
* elf/Makefile (tests): Add Add tst-nodelete2.
(modules-names): Add tst-nodelete2mod.
(tst-nodelete2mod.so-no-z-defs): New.
($(objpfx)tst-nodelete2): Likewise.
($(objpfx)tst-nodelete2.out): Likewise.
(LDFLAGS-tst-nodelete2): Likewise.
* elf/dl-close.c (_dl_close_worker): Move DF_1_NODELETE clearing
out of loop through all loaded libraries.
* elf/tst-nodelete2.c: New file.
* elf/tst-nodelete2mod.c: Likewise.
ldbl-128ibm tanhl uses a too-small threshold to decide when to return
+/-1, resulting in large errors. This patch changes it to a more
appropriate threshold (the requirement is for 2*exp(-2|x|) to be small
in terms of ulps of 1).
Tested for x86_64, x86 and powerpc.
[BZ #18790]
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Increase
threshold for returning +/- 1.
* math/auto-libm-test-in: Add more tests of tanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
ldbl-128ibm sinhl uses a too-big threshold to decide when to return
the argument, resulting in large errors. This patch fixes it to use a
more appropriate threshold.
Tested for x86_64, x86 and powerpc.
[BZ #18789]
* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c (__ieee754_sinhl): Use
smaller threshold for returning the argument.
* math/auto-libm-test-in: Add more tests of sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
The attached change fixes the miscompilation of sched_setaffinity() on
hppa. This is an old problem that was fixed on other architectures using
a similar approach to the attached change. See:
https://sourceware.org/ml/libc-hacker/2004-04/msg00016.html
Build tested on trunk. Patch has been applied to debian glibc for some time.
As noted in the bug, the asm operands need to be copied to register
variables to avoid operand reloads in the principal asm of the macro.
See the arm implementation for reference. Otherwise we get:
../sysdeps/unix/sysv/linux/hppa/bits/atomic.h:68:6: error:
can't find a register in class 'R1_REGS' while reloading 'asm'
Build tested on trunk with gcc-4.8. Similar patch has been tested
with 2.19 on Debian hppa-unknown-linux-gnu.
Similar to various other bugs in this area, some tan implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16517]
* sysdeps/ieee754/dbl-64/s_tan.c: Include <float.h>.
(tan): Force underflow exception for arguments with small absolute
value.
* sysdeps/ieee754/flt-32/k_tanf.c: Include <float.h>.
(__kernel_tanf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_tanl.c: Include <float.h>.
(__kernel_tanl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Include <float.h>.
(__kernel_tanl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_tanl.c: Include <float.h>.
(__kernel_tanl): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some sinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16519]
* sysdeps/ieee754/dbl-64/e_sinh.c: Include <float.h>.
(__ieee754_sinh): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/flt-32/e_sinhf.c: Include <float.h>.
(__ieee754_sinhf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_sinhl.c: Include <float.h>.
(__ieee754_sinhl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: Include <float.h>.
(__ieee754_sinhl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_sinhl.c: Include <float.h>.
(__ieee754_sinhl): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
In the "Kill regexp.h" thread, Joseph dug up more accurate information
about exactly which editions of the Single Unix Standard included and
deprecated this header.
The flt-32 implementation of powf wrongly uses x-1 instead of |x|-1
when computing log (x) for the case where |x| is close to 1 and y is
large. This patch fixes the logic accordingly. Relevant tests
existed for x close to 1, and corresponding tests are added for x
close to -1, as well as for some new variant cases.
Tested for x86_64 and x86.
[BZ #18647]
* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): For large y
and |x| close to 1, use absolute value of x when computing log.
* math/auto-libm-test-in: Add more tests of pow.
* math/auto-libm-test-out: Regenerated.
as discussed in the thread starting at
https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
it looks like the best options is to remove locale timezone information
from locales which currently provide it (in incomplete or incorrect
fashion) rather than to start duplicating tzdata info in glibc.
This patch adds __nonnull annotations for wcscat, wcsncat, wcscmp and wcsncmp.
These added annotations match the annoations for strcat, strncat, strcmp, strncmp in glibc.
<regexp.h> (not to be confused with <regex.h>) is an obsolete and
frankly horrible regular expression-matching API. It was part of SVID
but was withdrawn in Issue 5 (for reference, we're on Issue 7 now).
It doesn't do anything you can't do with <regex.h>, and using it
involves defining a bunch of macros before including the header.
Moreover, the code in regexp.h that uses those macros has been buggy
since its creation (in 1996) and no one has noticed, which indicates
to me that there are no users. (Specifically, RETURN() is used in a
whole bunch of cases where it should have been ERROR().)
The header is given a warning and marked deprecated for 2.22.
See:
https://sourceware.org/ml/libc-alpha/2015-07/msg00862.html and
https://sourceware.org/ml/libc-alpha/2015-07/msg00871.html.
On x86, linker in binutils 2.26 and newer consolidates R_*_JUMP_SLOT with
R_*_GLOB_DAT relocation against the same symbol. This patch extends
local PLT reference check to support alternate relocations.
[BZ #18078]
* scripts/check-localplt.awk: Support alternate relocations.
* scripts/localplt.awk: Also check relocations in DT_RELA/DT_REL
sections.
* sysdeps/unix/sysv/linux/i386/localplt.data: Mark free and
malloc entries with + REL R_386_GLOB_DAT.
* sysdeps/x86_64/localplt.data: New file.
Changes in support of -fno-plt also cause the elf/tst-audit* tests to
start passing on MIPS. This patch duly marks the relevant bug as
fixed in ChangeLog and NEWS.
The recently introduced TLS variables in the thread-local destructor
implementation (__cxa_thread_atexit_impl) used the default GD access
model, resulting in a call to __tls_get_addr. This causes a deadlock
with recent changes to the way TLS is initialized because DTV
allocations are delayed and hence despite knowing the offset to the
variable inside its TLS block, the thread has to take the global rtld
lock to safely update the TLS offset.
This causes deadlocks when a thread is instantiated and joined inside
a destructor of a dlopen'd DSO. The correct long term fix is to
somehow not take the lock, but that will need a lot deeper change set
to alter the way in which the big rtld lock is used.
Instead, this patch just eliminates the call to __tls_get_addr for the
thread-local variables inside libc.so, libpthread.so and rtld by
building all of their units with -mtls-model=initial-exec.
There were concerns that the static storage for TLS is limited and
hence we should not be using it. Additionally, dynamically loaded
modules may result in libc.so looking for this static storage pretty
late in static binaries. Both concerns are valid when using TLSDESC
since that is where one may attempt to allocate a TLS block from
static storage for even those variables that are not IE. They're not
very strong arguments for the traditional TLS model though, since it
assumes that the static storage would be used sparingly and definitely
not by default. Hence, for now this would only theoretically affect
ARM architectures.
The impact is hence limited to statically linked binaries that dlopen
modules that in turn load libc.so, all that on arm hardware. It seems
like a small enough impact to justify fixing the larger problem that
currently affects everything everywhere.
This still does not solve the original problem completely. That is,
it is still possible to deadlock on the big rtld lock with a small
tweak to the test case attached to this patch. That problem is
however not a regression in 2.22 and hence could be tackled as a
separate project. The test case is picked up as is from Alex's patch.
This change has been tested to verify that it does not cause any
issues on x86_64.
ChangeLog:
[BZ #18457]
* nptl/Makefile (tests): New test case tst-join7.
(modules-names): New test case module tst-join7mod.
* nptl/tst-join7.c: New file.
* nptl/tst-join7mod.c: New file.
* Makeconfig (tls-model): Pass -ftls-model=initial-exec for
all translation units in libc.so, libpthread.so and rtld.
When an TLS destructor is registered, we set the DF_1_NODELETE flag to
signal that the object should not be destroyed. We then clear the
DF_1_NODELETE flag when all destructors are called, which is wrong -
the flag could have been set by other means too.
This patch replaces this use of the flag by using l_tls_dtor_count
directly to determine whether it is safe to unload the object. This
change has the added advantage of eliminating the lock taking when
calling the destructors, which could result in a deadlock. The patch
also fixes the test case tst-tls-atexit - it was making an invalid
dlclose call, which would just return an error silently.
I have also added a detailed note on concurrency which also aims to
justify why I chose the semantics I chose for accesses to
l_tls_dtor_count. Thanks to Torvald for his help in getting me
started on this and (literally) teaching my how to approach the
problem.
Change verified on x86_64; the test suite does not show any
regressions due to the patch.
ChangeLog:
[BZ #18657]
* elf/dl-close.c (_dl_close_worker): Don't unload DSO if there
are pending TLS destructor calls.
* include/link.h (struct link_map): Add concurrency note for
L_TLS_DTOR_COUNT.
* stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl):
Don't touch the link map flag. Atomically increment
l_tls_dtor_count.
(__call_tls_dtors): Atomically decrement l_tls_dtor_count.
Avoid taking the load lock and don't touch the link map flag.
* stdlib/tst-tls-atexit-nodelete.c: New test case.
* stdlib/Makefile (tests): Use it.
* stdlib/tst-tls-atexit.c (do_test): dlopen
tst-tls-atexit-lib.so again before dlclose. Add conditionals
to allow tst-tls-atexit-nodelete test case to use it.
Commit a059d359d8 changed the sigaction
struct to pass conform tests, but it ended up also changing the ABI for
32 bit builds. For 64 bit builds, changing the long to two ints works,
but for 32 bit builds, it inserts 4 extra bytes. This leads to many
packages randomly failing like bash that spews things like:
configure: line 471: wait_for: No record of process 0
Bracket the new member by a wordsize check to fix the ABI for 32bit.
X86 struct siginfo in kernel 3.19 has been changed by
commit ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Author: Qiaowei Ren <qiaowei.ren@intel.com>
Date: Fri Nov 14 07:18:19 2014 -0800
mpx: Extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.
This patch updates x86 struct siginfo to enable GDB with MPX support.
[BZ #18696]
* sysdeps/unix/sysv/linux/x86/bits/siginfo.h (_sigfault): Add
si_addr_bnd.
(si_lower): New.
(si_upper): Likewise.
The DF_1_NODELETE flag is set too late when opening a DSO, due to
which, if a DSO is already open, subsequently opening it with
RTLD_NODELETE fails to set the DF_1_NODELETE flag. This patch fixes
this by setting the flag immediately after bumping the opencount.
Verified on x86_64.
[BZ #18676]
* elf/tst-nodelete-opened.c: New test case.
* elf/tst-nodelete-opened-lib.c: New test case module.
* elf/Makefile (tests, modules-names): Use them.
* elf/dl-open.c (dl_open_worker): Set DF_1_NODELETE flag
early.
Bhili [1] and Tulu [2] language does not have iso-639-1 codes. Patch
moves locale file with correct code and also fix iso-639.def.
1. http://www-01.sil.org/iso639-3/documentation.asp?id=bhb
2. http://www-01.sil.org/iso639-3/documentation.asp?id=tcy
localedata/ChangeLog:
2015-07-02 Pravin Satpute <psatpute@redhat.com>
[BZ #17475]
* locales/tu_IN: renamed to tcy_IN
* locales/bh_IN: renamed to bhb_IN
Changelog:
2015-03-05 Pravin Satpute <psatpute@redhat.com>
[BZ #17475]
* locale/iso-639.def: Update Bhili and Tulu language codes as
per iso639-3.
and also powerpc64 and powerpc64le. See the discussion in the thread
below for details. This change reverts the problematic bits leaving
the added test in place and marking XFAIL in anticipation of fixing
the bug in the near future.
https://sourceware.org/ml/libc-alpha/2015-07/msg00141.html
[BZ #18435]
* nptl/pthreadP.h (pthread_cleanup_push, pthread_cleanup_pop):
Revert commit ed225df3ad.
* nptl/Makefile (test-xfail-tst-once5): Define.
We need to save/restore bound registers and add a BND prefix before
branches in _dl_runtime_profile so that bound registers for pointer
pass and return are preserved when LD_AUDIT is used.
[BZ #18134]
* sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
* sysdeps/i386/configure: Regenerated.
* sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_profile): Save and restore Intel MPX return bound
registers when calling _dl_call_pltexit. Add
PRESERVE_BND_REGS_PREFIX before return.
* sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
(LRV_BND1_OFFSET): Likewise.
* sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and
lrv_bnd1.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
typo in bndmov encoding.
* sysdeps/x86_64/dl-trampoline.h: Properly save and restore
Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before
branch instructions to preserve bounds.
This is an ABI breaking change, but
typedef int greg_t;
is not a useful definition on aarch64.
greg_t is usually used for defining gregset_t which is used
in mcontext_t. The general registers in mcontext_t can only
be accessed by target specific code and on aarch64 greg_t
is not needed for that so this change is not supposed to break
existing code, just fix the definition.
[BZ #18648]
* sysdeps/unix/sysv/linux/aarch64/sys/ucontext.h (greg_t): Change the
definition to elf_greg_t.
(Added another BZ entry that was missed in the previous commit).
This patch added a new fmemopen version, for glibc 2.22, that aims to be
POSIX complaint. It fixes some long-stading glibc fmemopen issues, such
as:
* it changes the way fseek with SEEK_END works on fmemopen to seek
relative to buffer size instead of first '\0'. This is default mode and
'b' opening mode does not change internal behavior (bz#6544).
* fix apending opening mode to use as start position either first null
byte of len specified in function call (bz#13152 and #13151).
* remove binary option 'b' and internal different handling (bz#12836)
* fix seek/SEE_END with negative values (bz#14292).
A compatibility symbol is provided to with old behavior for older symbols
version (2.2.5).
* include/stdio.h (fmemopen): Remove hidden prototype.
(__fmemopen): Add new hidden prototype.
* libio/Makefile: Add oldfmemopen object.
* libio/Versions [GLIBC_2.22]: Add new fmemopen symbol.
* libio/fmemopen.c (__fmemopen): Function rewrite to be POSIX
compliance.
* libio/oldfmemopen.c: New file: old fmemopen implementation for
symbol compatibility.
* stdio-common/Makefile [tests]: Add new tst-fmemopen3.
* stdio-common/psiginfo.c [psiginfo]: Call __fmemopen instead of
fmemopen.
* stdio-common/tst-fmemopen3.c: New file: more fmemopen tests, focus
on append and read mode.
* sysdeps/unix/sysv/linux/aarch64/libc.abilist [GLIBC_2.22]: Add
fmemopen.
* sysdeps/unix/sysv/linux/alpha/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/arm/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/i386/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/ia64/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/microblaze/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/sh/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
[GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist [GLIBC_2.22]:
Likewise.
* sysdeps/unix/sysv/linux/hppa/libc.abilist [GLIBC_2.22]: Likewise.
* sysdeps/unix/sysv/linux/nios2/libc.abilist [GLIBC_2.22]: Likewise.
Mark all the functions that don't handle NULL pointers as __nonnull.
POSIX does not require either behavior, so the prototypes should match
the reality of the codebase.
Fixes bug 18557.
The ruserok API does hosts checks first while it walks the
user's ~/.rhosts file. This results in lots of DNS queries
that could have been skipped if we short-circuit test the
user portion first to see if would have had a failed match.
This supports configurations where rlogin is used on internal
secure networks with large numbers of users and machines.
The Red Hat QE team did extensive testing on various rlogin
combinations to validate this change, and in fact we found
a defect in the first version which is fixed in this version.
https://sourceware.org/bugzilla/show_bug.cgi?id=17833
I've a shared library that contains both undefined and unique symbols.
Then I try to call the following sequence of dlopen:
1. dlopen("./libfoo.so", RTLD_NOW)
2. dlopen("./libfoo.so", RTLD_LAZY | RTLD_GLOBAL)
First dlopen call terminates with error because of undefined symbols,
but STB_GNU_UNIQUE ones set DF_1_NODELETE flag and hence block library
in the memory.
The library goes into inconsistent state as several structures remain
uninitialized. For instance, relocations for GOT table were not performed.
By the time of second dlopen call this library looks like as it would be
fully initialized but this is not true: any call through incorrect GOT
table leads to segmentation fault. On some systems this inconsistency
triggers assertions in the dynamic linker.
This patch adds a parameter to _dl_close_worker to implement forced object
deletion in case of dlopen() failure:
1. Clears DF_1_NODELETE bit if forced, to allow library to be removed from
memory.
2. For each unique symbol that is defined in this object clears
appropriate entry in _ns_unique_sym_table.
[BZ #17833]
* elf/Makefile (tests): Add tst-nodelete.
(modules-names): Add tst-nodelete-uniquemod.
(tst-nodelete-uniquemod.so-no-z-defs): New.
(tst-nodelete-rtldmod.so-no-z-defs): Likewise.
(tst-nodelete-zmod.so-no-z-defs): Likewise.
($(objpfx)tst-nodelete): Likewise.
($(objpfx)tst-nodelete.out): Likewise.
(LDFLAGS-tst-nodelete): Likewise.
(LDFLAGS-tst-nodelete-zmod.so): Likewise.
* elf/dl-close.c (_dl_close_worker): Add a parameter to
implement forced object deletion.
(_dl_close): Pass false to _dl_close_worker.
* elf/dl-open.c (_dl_open): Pass true to _dl_close_worker.
* elf/tst-nodelete.cc: New file.
* elf/tst-nodeletelib.cc: Likewise.
* elf/tst-znodeletelib.cc: Likewise.
* include/dlfcn.h (_dl_close_worker): Add a new parameter.
On s390/s390x backtrace(buffer, size) returns the series of called functions until
"makecontext_ret" and additional entries (up to "size") with "makecontext_ret".
GDB-backtrace is also warning:
"Backtrace stopped: previous frame identical to this frame (corrupt stack?)"
To reproduce this scenario you have to setup a new context with makecontext()
and activate it with setcontext(). See e.g. cf() function in testcase stdlib/tst-makecontext.c.
Or see bug in libgo "Bug 66303 - runtime.Caller() returns infinitely deep stack frames
on s390x " (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66303).
This patch omits the cfi_startproc/cfi_endproc directives in ENTRY/END macro of
__makecontext_ret. Thus no frame information is generated in .eh_frame and backtrace
stops after __makecontext_ret. There is also no .eh_frame info for _start or
thread_start functions.
ChangeLog:
[BZ #18508]
* stdlib/Makefile ($(objpfx)tst-makecontext3):
Depend on $(libdl).
* stdlib/tst-makecontext.c (cf): Test if _Unwind_Backtrace
is not called infinitely times.
(backtrace_helper): New function.
(trace_arg): New struct.
(st1): Enlarge stack size.
* sysdeps/unix/sysv/linux/s390/s390-32/__makecontext_ret.S:
(__makecontext_ret): Omit cfi_startproc and cfi_endproc.
* sysdeps/unix/sysv/linux/s390/s390-64/__makecontext_ret.S:
Likewise.
Some of the x86 string functions create pointers based on input strings
that may be outside of the input strings. When this happens in C code,
the compiler can potentially detect this, leading to warnings in
application code when those string functions are inlined. Perform those
operations in the assembly code instead of the C code to fix this.
In the ldbl-128 implementation of expm1l, when expm1l's result should
underflow to 0 (argument minus the least subnormal, in some rounding
modes), it can be a zero of the wrong sign. This patch fixes this in
the same way previously used for the x86 / x86_64 versions.
Tested for mips64.
[BZ #18619]
* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Force underflow
and return argument in case of subnormal argument.
the initialization routine to exit by throwing an exception.
Such an execution, termed exceptional, requires call_once to
propagate the exception to its caller. A program may contain
any number of exceptional executions but only one returning
execution (which, if it exists, must be the last execution
with the same once flag).
On POSIX systems such as Linux, std::call_once is implemented
in terms of pthread_once. However, as discussed in libstdc++
bug 66146 - "call_once not C++11-compliant on ppc64le," GLIBC's
pthread_once hangs when the initialization function exits by
throwing an exception on at least arm and ppc64 (though
apparently not on x86_64). This effectively prevents call_once
from conforming to the C++ requirements since there doesn't
appear to be a thread-safe way to work around this problem in
libstdc++.
This patch changes pthread_once to handle gracefully init
functions that exit by throwing exceptions. It was successfully
tested on ppc64, ppc64le, and x86_64.
[BZ #18435]
* nptl/Makefile: Add tst-once5.cc.
* nptl/pthreadP.h (pthread_cleanup_push, pthread_cleanup_pop):
Remove macro redefinitions.
* nptl/tst-once5.cc: New test.
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.
Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations. However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).
Some additional complications also arose. Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow. (libm-test.inc also doesn't handle being lax about whether
the result in cases very close to the overflow threshold is infinity
or a finite value close to overflow, but that doesn't cause problems
in this case though I've seen it cause problems with random test
generation for some functions.) Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc). And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.
I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.
This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #18613]
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
X_ADJ not X when adjusting exponent.
(__ieee754_gamma_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
of X_ADJ not X when adjusting exponent.
(__ieee754_gammaf_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
log of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
log of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
of X_ADJ not X when adjusting exponent.
(__ieee754_gammal_r): Do intermediate computations in
round-to-nearest then adjust overflowing and underflowing results
as needed. Use 1.0L not 1.0f as numerator of division.
* math/libm-test.inc (tgamma_test_data): Remove one test. Moved
to auto-libm-test-in.
(tgamma_test): Use ALL_RM_TEST.
* math/auto-libm-test-in: Add one test of tgamma. Mark some other
tests of tgamma with spurious-overflow.
* math/auto-libm-test-out: Regenerated.
* math/gen-libm-have-vector-test.sh: Do not check for START.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The ldbl-128 implementation of j1l produces spurious underflow
exceptions for some small arguments, as a result of squaring the
argument. This patch fixes it just to use a linear approximation for
sufficiently small arguments, and then to force an underflow exception
only in the cases where it is required.
Tested for mips64.
[BZ #18612]
* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): For small
arguments, just return 0.5 times the argument, with underflow
forced as needed.
* math/auto-libm-test-in: Add more tests of j1.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, j1 and jn implementations
can fail to raise the underflow exception when the internal
computation is exact although the actual function is inexact. This
patch forces the exception in a similar way to other such fixes. (The
ldbl-128 / ldbl-128ibm j1l implementation is different and doesn't
need a change for this until spurious underflows in it are fixed.)
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16559]
* sysdeps/ieee754/dbl-64/e_j1.c: Include <float.h>.
(__ieee754_j1): Force underflow exception for small results.
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
* sysdeps/ieee754/flt-32/e_j1f.c: Include <float.h>.
(__ieee754_j1f): Force underflow exception for small results.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_j1l.c: Include <float.h>.
(__ieee754_j1l): Force underflow exception for small results.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
* math/auto-libm-test-in: Add more tests of j1 and jn.
* math/auto-libm-test-out: Regenerated.
mksquashfs was reported in openSUSE to be causing segmentation faults when
creating installation images. Testing showed that mksquashfs sometimes
failed and could be reproduced within 10 attempts. The core dump looked
like the heap top was corrupted and was pointing to an unmapped area. In
other cases, this has been due to an application corrupting glibc structures
but mksquashfs appears to be fine in this regard.
The problem is that heap_trim is "growing" the top into unmapped space.
If the top chunk == MINSIZE then top_area is -1 and this check does not
behave as expected due to a signed/unsigned comparison
if (top_area <= pad)
return 0;
The next calculation extra = ALIGN_DOWN(top_area - pad, pagesz) calculates
extra as a negative number which also is unnoticed due to a signed/unsigned
comparison. We then call shrink_heap(heap, negative_number) which crashes
later. This patch adds a simple check against MINSIZE to make sure extra
does not become negative. It adds a cast to hint to the reader that this
is a signed vs unsigned issue.
Without the patch, mksquash fails within 10 attempts. With it applied, it
completed 1000 times without error. The standard test suite "make check"
showed no changes in the summary of test results.
Some existing jn tests, if run in non-default rounding modes, produce
errors above those accepted in glibc, which causes problems for moving
tests of jn to use ALL_RM_TEST. This patch makes jn set rounding
to-nearest internally, as was done for yn some time ago, then computes
the appropriate underflowing value for results that underflowed to
zero in to-nearest, and moves the tests to ALL_RM_TEST. It does
nothing about the general inaccuracy of Bessel function
implementations in glibc, though it should make jn more accurate on
average in non-default rounding modes through reduced error
accumulation. The recomputation of results that underflowed to zero
should as a side-effect fix some cases of bug 16559, where jn just
used an exact zero, but that is *not* the goal of this patch and other
cases of that bug remain unfixed.
(Most of the changes in the patch are reindentation to add new scopes
for SET_RESTORE_ROUND*.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16559]
[BZ #18602]
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set
round-to-nearest internally then recompute results that
underflowed to zero in the original rounding mode.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise
* math/libm-test.inc (jn_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
To support building glibc with GCC 6 configured with --enable-default-pie,
which generates PIE by default, we need to build programs as PIE. But
elf/tst-dlopen-aout must not be built as PIE since it tests dlopen on
ET_EXEC file and PIE is ET_DYN.
[BZ #17841]
* Makeconfig (no-pie-ldflag): New.
(+link): Set to $(+link-pie) if default to PIE.
(+link-tests): Set to $(+link-pie-tests) if default to PIE.
* config.make.in (build-pie-default): New.
* configure.ac (libc_cv_pie_default): New. Set to yes if -fPIE
is default. AC_SUBST.
* configure: Regenerated.
* elf/Makefile (LDFLAGS-tst-dlopen-aout): New.
cexp, ccos, ccosh, csin and csinh have spurious underflows in cases
where they compute sin of the smallest normal, that produces an
underflow exception (depending on which sin implementation is in use)
but the final result does not underflow. ctan and ctanh may also have
such underflows, or they may be latent (the issue there is that
e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value
above DBL_MIN, which under glibc's accuracy goals may not have an
underflow exception, but the intermediate computation of sin (DBL_MIN)
would legitimately underflow on before-rounding architectures).
This patch fixes all those functions so they use plain comparisons (>
DBL_MIN etc.) instead of comparing the result of fpclassify with
FP_SUBNORMAL (in all these cases, we already know the number being
compared is finite). Note that in the case of csin / csinf / csinl,
there is no need for fabs calls in the comparison because the real
part has already been reduced to its absolute value.
As the patch fixes the failures that previously obstructed moving
tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST
by the patch (two functions remain yet to be converted).
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18594]
* math/s_ccosh.c (__ccosh): Compare with least normal value
instead of comparing class with FP_SUBNORMAL.
* math/s_ccoshf.c (__ccoshf): Likewise.
* math/s_ccoshl.c (__ccoshl): Likewise.
* math/s_cexp.c (__cexp): Likewise.
* math/s_cexpf.c (__cexpf): Likewise.
* math/s_cexpl.c (__cexpl): Likewise.
* math/s_csin.c (__csin): Likewise.
* math/s_csinf.c (__csinf): Likewise.
* math/s_csinh.c (__csinh): Likewise.
* math/s_csinhf.c (__csinhf): Likewise.
* math/s_csinhl.c (__csinhl): Likewise.
* math/s_csinl.c (__csinl): Likewise.
* math/s_ctan.c (__ctan): Likewise.
* math/s_ctanf.c (__ctanf): Likewise.
* math/s_ctanh.c (__ctanh): Likewise.
* math/s_ctanhf.c (__ctanhf): Likewise.
* math/s_ctanhl.c (__ctanhl): Likewise.
* math/s_ctanl.c (__ctanl): Likewise.
* math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp,
csin, csinh, ctan and ctanh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (cexp_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Many packages, including GCC, install Python files for GDB in library
diretory. ldconfig reads them and issue errors since they aren't ELF
files:
ldconfig: /usr/gcc-5.1.1/lib/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
ldconfig: /usr/gcc-5.1.1/libx32/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
ldconfig: /usr/gcc-5.1.1/lib64/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
This patch silences ldconfig on GDB Python files by checking filenames
with -gdb.py suffix.
[BZ #18585]
* elf/readlib.c (is_gdb_python_file): New.
(process_file): Don't issue errors on filenames with -gdb.py
suffix.
csin and csinh can produce bad results when overflowing in directed
rounding modes, because a multiplication that can overflow is followed
by a possible negation. This patch fixes this by negating one of the
arguments of the multiplication before the multiplication instead of
negating the result.
The new tests for this issue are added to auto-libm-test-in, starting
use of that file for csin and csinh. The issue was found in the
course of moving existing tests for csin and csinh (existing tests, by
being enabled in more cases than previously, showed the issue for
float and double but not for long double); that move will now be done
separately.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18593]
* math/s_csin.c (__csin): Negate before rather than after possibly
overflowing multiplication.
* math/s_csinf.c (__csinf): Likewise.
* math/s_csinh.c (__csinh): Likewise.
* math/s_csinhf.c (__csinhf): Likewise.
* math/s_csinhl.c (__csinhl): Likewise.
* math/s_csinl.c (__csinl): Likewise.
* math/auto-libm-test-in: Add some tests of csin and csinh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c.
(csinh_test_data): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
Similar to various other bugs in this area, the ldbl-128 expl
implementation does not raise the underflow exception for all
subnormal results, if the scaling down is exact although the actual
result is inexact. This patch fixes this by forcing the exception in
this case (the tests that failed before and pass after the test are
already in the testsuite).
Tested for mips64.
[BZ #18586]
* sysdeps/ieee754/ldbl-128/e_expl.c (__ieee754_expl): Force
underflow exception for small results.
Similar to various other bugs in this area, some sin and sincos
implementations do not raise the underflow exception for subnormal
arguments, when the result is tiny and inexact. This patch forces the
exception in a similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16526]
[BZ #16538]
* sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>.
(__sin): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sin and sincos.
* math/auto-libm-test-out: Regenerated.
__kernel_standard_l converts long double arguments to double for use
in SVID "struct exception". This has special-case handling for when
that conversion would overflow or underflow but the original long
double function wouldn't. However, it turns out that "inexact"
exceptions can be spurious here as well, when the function is exactly
determined and __kernel_standard_l is being called for a domain error.
This patch fixes this by using feholdexcept / fesetenv to avoid
exceptions from the conversion, replacing the previous special-case
logic for overflow and underflow (this covers all functions using
__kernel_standard_l, not just those that actually need a change, since
there doesn't seem to be much point in restricting things just to the
functions that mustn't get "inexact" here).
Tested for x86_64 and x86.
[BZ #18245]
[BZ #18583]
* sysdeps/ieee754/k_standardl.c: Include <fenv.h>.
(__kernel_standard_l): Use feholdexcept and fesetenv around
conversion to double instead of special-casing overflow and
underflow.
* math/libm-test.inc (fmod_test_data): Add more tests.
(remainder_test_data): Likewise.
(sqrt_test_data): Likewise.
This fixes BZ #17403 by defining atomic_full_barrier,
atomic_read_barrier, and atomic_write_barrier on x86 and x86_64. A full
barrier is implemented through an atomic idempotent modification to the
stack and not through using mfence because the latter can supposedly be
somewhat slower due to having to provide stronger guarantees wrt.
self-modifying code, for example.
The csqrt implementations in glibc can cause spurious underflows in
some cases as a side-effect of the scaling for large arguments (when
underflow is correct for the square root of the argument that was
scaled down to avoid overflow, but not for the original argument).
This patch arranges to avoid the underflowing intermediate computation
(eliminating a multiplication in 0.5 in the problem cases where a
subsequent scaling by 2 would follow).
Tested for x86_64 and x86 and ulps updated accordingly (only needed
for x86).
[BZ #18371]
* math/s_csqrt.c (__csqrt): Avoid multiplication by 0.5 where
intermediate but not final result might underflow.
* math/s_csqrtf.c (__csqrtf): Likewise.
* math/s_csqrtl.c (__csqrtl): Likewise.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
The dbl-64 and flt-32 implementations of exp2 functions produce
spurious underflow exceptions. The underlying reason is the same in
both cases: the computation works as (2^a - 1)*2^b + 2^b for suitably
chosen a and b, where a has small magnitude so 2^a - 1 can be computed
with a low-degree polynomial approximation, and (2^a - 1)*2^b can
underflow even when the final result does not. This patch fixes this
by adjusting the threshold for when scaling is used to avoid
intermediate underflow so it works for any possible value of a where
the final result would not underflow.
Tested for x86_64 and x86.
[BZ #18219]
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Reduce
threshold on absolute value of exponent for which scaling is used.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
When "reorder" resolver option is enabled, threads of a multi-threaded process
could hang in gethostbyaddr_r, gethostbyname_r, or gethostbyname2_r.
Due to a trivial bug in _res_hconf_reorder_addrs, simultaneous
invocations of this function in a multi-threaded process could result to
_res_hconf_reorder_addrs returning without releasing the lock it holds,
causing other threads to block indefinitely while waiting for the lock
that is not going to be released.
[BZ #17977]
* resolv/res_hconf.c (_res_hconf_reorder_addrs): Fix unlocking
when initializing interface list, based on the bug analysis
and the patch proposed by Eric Newton.
* resolv/tst-res_hconf_reorder.c: New test.
* resolv/Makefile [$(have-thread-library) = yes] (tests): Add
tst-res_hconf_reorder.
($(objpfx)tst-res_hconf_reorder): Depend on $(libdl)
and $(shared-thread-library).
(tst-res_hconf_reorder-ENV): New variable.
Similar to various other bugs in this area, some expm1 implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
(The issue does not apply to the ldbl-* implementations or to those
for x86 / x86_64 long double. The change to
sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c is one I missed when
previously fixing bug 16354; the bug in that implementation was
previously latent, but the expm1 fixes stopped it being latent and so
required it to be fixed to avoid spurious underflows from cosh.)
Tested for x86_64 and x86.
[BZ #16353]
* sysdeps/i386/fpu/s_expm1.S (dbl_min): New object.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/i386/fpu/s_expm1f.S (flt_min): New object.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/s_expm1.c: Include <float.h>.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_expm1f.c: Include <float.h>.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c (__ieee754_cosh):
Check for small arguments before calling __expm1.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16353.
* math/auto-libm-test-out: Regenerated.
In the x86 / x86_64 implementations of expm1l, when expm1l's result
should underflow to 0 (argument minus the least subnormal, in some
rounding modes), it can be a zero of the wrong sign. This patch fixes
this by returning the argument with underflow forced in that case
(this is a 1ulp error relative to the correctly rounded result of -0,
which is OK in terms of the documented accuracy goals, whereas a
result with the wrong sign never is).
Tested for x86_64 and x86.
[BZ #18569]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force
underflow and return argument in case of subnormal argument.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of expm1.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, the x86 and x86_64
implementations of expl / exp10l can fail to produce underflow
exceptions when the unscaled result has trailing 0 bits so the scaling
down to subnormal precision is exact. This patch fixes this by
forcing the exception in the case of tiny results.
Tested for x86_64 and x86.
[BZ #16361]
* sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
tiny results.
* sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
tiny results.
* math/auto-libm-test-in: Add more tests of exp and exp10. Do not
mark underflow exceptions as possibly missing for bug 16361.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86 and mips64.
[BZ #16350]
* sysdeps/i386/fpu/s_asinh.S (__asinh): Force underflow exception
for arguments with small absolute value.
* sysdeps/i386/fpu/s_asinhf.S (__asinhf): Likewise.
* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Likewise.
* sysdeps/ieee754/dbl-64/s_asinh.c: Include <float.h>.
(__asinh): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_asinhf.c: Include <float.h>.
(__asinhf): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16350.
* math/auto-libm-test-out: Regenerated.
sysdeps/unix/sysv/linux/bits/in.h (as included in netinet/in.h, and
via that in netdb.h and arpa/inet.h) defines a series of MCAST_*
macros, both under __USE_MISC and then again unconditionally. These
are not POSIX macros, nor in any of the namespaces listed in POSIX as
reserved for this header, so should not be defined unconditionally.
This patch duly removes the unconditional definitions, leaving the
ones conditional on __USE_MISC.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18558]
* sysdeps/unix/sysv/linux/bits/in.h (MCAST_JOIN_GROUP): Remove
unconditional definition.
(MCAST_BLOCK_SOURCE): Likewise.
(MCAST_UNBLOCK_SOURCE): Likewise.
(MCAST_LEAVE_GROUP): Likewise.
(MCAST_JOIN_SOURCE_GROUP): Likewise.
(MCAST_LEAVE_SOURCE_GROUP): Likewise.
(MCAST_MSFILTER): Likewise.
* conform/Makefile (test-xfail-XOPEN2K/arpa/inet.h/conform):
Remove variable.
(test-xfail-XOPEN2K/netdb.h/conform): Likewise.
(test-xfail-XOPEN2K/netinet/in.h/conform): Likewise.
(test-xfail-XOPEN2K8/arpa/inet.h/conform): Likewise.
(test-xfail-XOPEN2K8/netdb.h/conform): Likewise.
(test-xfail-XOPEN2K8/netinet/in.h/conform): Likewise.
nice (XPG3) calls getpriority and setpriority (in XPG4 but not XPG3,
i.e. UX-shaded in XPG4). This patch fixes this by making those
functions into weak aliases of __* functions and calling the __*
versions as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by this patch).
This completes cleaning up the unsorted linknamespace test XFAILs.
[BZ #18553]
* resource/getpriority.c (getpriority): Rename to __getpriority
and define as weak alias of __getpriority.
* resource/setpriority.c (setpriority): Rename to __setpriority
and define as weak alias of __setpriority.
* sysdeps/mach/hurd/getpriority.c (getpriority): Rename to
__getpriority and define as weak alias of __getpriority.
* sysdeps/mach/hurd/setpriority.c (setpriority): Rename to
__setpriority and define as weak alias of __setpriority.
* sysdeps/unix/syscalls.list (getpriority): Use __getpriority as
strong name.
(setpriority): Use __setpriority as strong name.
* sysdeps/unix/sysv/linux/getpriority.c (getpriority): Rename to
__getpriority and define as weak alias of __getpriority.
* include/sys/resource.h (__getpriority): Declare. Use
libc_hidden_proto.
(__setpriority): Likewise.
(getpriority): Don't use libc_hidden_proto.
(setpriority): Likewise.
* sysdeps/posix/nice.c (nice): Call __getpriority instead of
getpriority. Call __setpriority instead of setpriority.
* conform/Makefile (test-xfail-XPG3/unistd.h/linknamespace):
Remove variable.
ttyslot (XPG4) calls the non-XPG4 functions endttyent, getttyent and
setttyent, which in turn bring in references to fgets_unlocked and
getttynam. This patch fixes this by making these functions into weak
aliases and calling the __* names as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18547]
* misc/getttyent.c (getttynam): Rename to __getttynam and define
as weak alias of __getttynam. Use prototype function definition.
Call __setttyent, __getttyent and __endttyent instead of
setttyent, getttyent and endttyent.
(getttyent): Rename to __getttyent and define as weak alias of
__getttyent. Call __setttyent instead of setttyent. Call
__fgets_unlocked instead of fgets_unlocked.
(setttyent): Rename to __setttyent and define as weak alias of
__setttyent.
(endttyent): Rename to __endttyent and define as weak alias of
__endttyent.
* include/ttyent.h (__getttyent): Declare. Use libc_hidden_proto.
(__setttyent): Likewise.
(__endttyent): Likewise.
(getttyent): Don't use libc_hidden_proto.
(setttyent): Likewise.
(endttyent): Likewise.
* misc/ttyslot.c (ttyslot): Call __setttyent, __getttyent and
__endttyent instead of setttyent, getttyent and endttyent.
* conform/Makefile (test-xfail-XPG4/unistd.h/linknamespace):
Remove variable.
mq_notify (in the 1996 edition of POSIX) brings in references to recv
and socket (not in POSIX until the 2001 edition). This patch fixes
this by using __recv and __socket, exporting them from libc at version
GLIBC_PRIVATE.
Tested for x86_64 and x86 (testsuite and comparison of installed
stripped shared libraries; PLT / dynamic symbol table changes render
the comparison not particularly useful for libc).
[BZ #18546]
* socket/recv.c (__recv): Use libc_hidden_def.
* socket/socket.c (__socket): Likewise.
* sysdeps/mach/hurd/recv.c (__recv): Likewise.
* sysdeps/mach/hurd/socket.c (__socket): Likewise.
* sysdeps/unix/sysv/linux/generic/recv.c (__recv): Likewise.
* sysdeps/unix/sysv/linux/recv.c (__recv): Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/socket.c (__socket): Use
libc_hidden_def.
* sysdeps/unix/sysv/linux/x86_64/recv.c (__recv): Use
libc_hidden_weak.
* include/sys/socket.h (__socket): Do not use attribute_hidden.
Use libc_hidden_proto.
(__recv): Likewise.
* socket/Versions (libc): Export __recv and __socket at version
GLIBC_PRIVATE.
* sysdeps/unix/sysv/linux/mq_notify.c (helper_thread): Call __recv
instead of recv.
(init_mq_netlink): Call __socket instead of socket.
* conform/Makefile (test-xfail-POSIX/mqueue.h/linknamespace):
Remove variable.
mq_receive calls mq_timedreceive, and mq_send calls mq_timedsend. But
mq_receive and mq_send were in POSIX by 1996, while mq_timed* were
added in the 2001 edition of POSIX. This patch fixes this by making
mq_timed* into weak aliases for __mq_timed* and calling the
__mq_timed* names.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18545]
* rt/mq_timedreceive.c (mq_timedreceive): Rename to
__mq_timedreceive and define as alias of __mq_timedreceive. Use
hidden_weak.
* rt/mq_timedsend.c (mq_timedsend): Rename to __mq_timedsend and
define as alias of __mq_timedsend. Use hidden_weak.
* sysdeps/unix/sysv/linux/syscalls.list (mq_timedsend): Use
__mq_timedsend as strong name.
(mq_timedreceive): Use __mq_timedreceive as strong name.
* include/mqueue.h (__mq_timedsend): Declare. Use hidden_proto.
(__mq_timedreceive): Likewise.
* sysdeps/unix/sysv/linux/mq_receive.c (mq_receive): Call
__mq_timedreceive instead of mq_timedreceive.
* sysdeps/unix/sysv/linux/mq_send.c (mq_send): Call __mq_timedsend
instead of mq_timedsend.
* conform/Makefile (test-xfail-UNIX98/mqueue.h/linknamespace):
Remove variable.
mq_notify (present in POSIX by 1996) brings in references to
pthread_barrier_init and pthread_barrier_wait (new in the 2001 edition
of POSIX). This patch fixes this by making those functions into weak
aliases of __pthread_barrier_*, exporting the __pthread_barrier_*
names at version GLIBC_PRIVATE and using them in mq_notify.
Tested for x86_64 and x86 (testsuite, and comparison of installed
stripped shared libraries). Changes in addresses from dynamic symbol
table / PLT changes render most comparisons not particularly useful,
but when the addresses of subsequent code don't change there's no sign
of unexpected changes there. This patch does not remove any
linknamespace XFAILs because of other namespace issues remaining with
mqueue.h functions.
[BZ #18544]
* nptl/pthread_barrier_init.c (pthread_barrier_init): Rename to
__pthread_barrier_init and define as weak alias of
__pthread_barrier_init.
* sysdeps/sparc/nptl/pthread_barrier_init.c
(pthread_barrier_init): Likewise.
* nptl/pthread_barrier_wait.c (pthread_barrier_wait): Rename to
__pthread_barrier_wait and define as weak alias of
__pthread_barrier_wait.
* sysdeps/sparc/nptl/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/sparc/sparc32/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* nptl/Versions (libpthread): Export __pthread_barrier_init and
__pthread_barrier_wait at version GLIBC_PRIVATE.
* include/pthread.h (__pthread_barrier_init): Declare.
(__pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/mq_notify.c (notification_function):
Call __pthread_barrier_wait instead of pthread_barrier_wait.
(helper_thread): Likewise.
(init_mq_netlink): Call __pthread_barrier_init instead of
pthread_barrier_init.
swscanf (added in C90 Amendment 1, present in UNIX98) calls vswscanf
(added in C99, not in C90 Amendment 1 or UNIX98). This patch fixes
this by using __vswscanf instead and making vswscanf into a weak
alias.
(I intend to add conform/ test support for C90 Amendment 1 - and
various other standard versions supported by glibc but not yet by
conform/ tests - at some point, once the results for currently tested
standards are cleaner.)
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18542]
* libio/iovswscanf.c (__vswscanf): Use libc_hidden_def.
(vswscanf): Use ldbl_weak_alias instead of ldbl_strong_alias
* include/wchar.h (__vswscanf): Declare. Use libc_hidden_proto.
* libio/swscanf.c (__swscanf): Call __vswscanf instead of
vswscanf.
* conform/Makefile (test-xfail-UNIX98/wchar.h/linknamespace):
Remove variable.
The getpass function (XPG3 / XPG4 / UNIX98) calls fflush_unlocked (not
in any of those standards). This patch fixes this by making
fflush_unlocked into a weak alias for __fflush_unlocked and calling
__fflush_unlocked from getpass.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18540]
* libio/iofflush.c [!_IO_MTSAFE_IO] (__fflush_unlocked): Define as
strong alias of _IO_fflush. Use libc_hidden_def.
* libio/iofflush_u.c (fflush_unlocked): Rename to
__fflush_unlocked and define as weak alias of __fflush_unlocked.
Use libc_hidden_weak.
* include/stdio.h (__fflush_unlocked): Declare. Use
libc_hidden_proto.
* misc/getpass.c (getpass): Call __fflush_unlocked instead of
fflush_unlocked.
* conform/Makefile (test-xfail-UNIX98/unistd.h/linknamespace):
Remove variable.
Use of fmtmsg (XSI POSIX) brings in addseverity (non-POSIX). This
patch fixes this by making addseverity into a weak alias for
__addseverity.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18539]
* stdlib/fmtmsg.c (addseverity): Rename to __addseverity and
define as weak alias of __addseverity.
* conform/Makefile (test-xfail-XPG4/fmtmsg.h/linknamespace):
Remove variable.
(test-xfail-UNIX98/fmtmsg.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/fmtmsg.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/fmtmsg.h/linknamespace): Likewise.
The sem_* functions bring in references to tdelete, tfind, tsearch and
twalk. But the t* functions are XSI-shaded, while sem_* aren't. This
patch fixes this by using __t* instead, exporting those functions from
libc at version GLIBC_PRIVATE (since sem_* are in libpthread) and
using libc_hidden_* for the benefit of calls within libc.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). libpthread gets changes from
PLT reordering; addresses in libc change because of PLT / dynamic
symbol table changes.
[BZ #18536]
* misc/tsearch.c (__tsearch): Use libc_hidden_def.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* misc/Versions (libc): Add __tdelete, __tfind, __tsearch and
__twalk to GLIBC_PRIVATE.
* include/search.h (__tsearch): Use libc_hidden_proto.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* nptl/sem_close.c (sem_close): Call __twalk instead of twalk.
Call __tdelete instead of tdelete.
* nptl/sem_open.c (check_add_mapping): Call __tfind instead of
tfind. Call __tsearch instead of tsearch.
* sysdeps/sparc/sparc32/sem_open.c (check_add_mapping): Likewise.
* conform/Makefile (test-xfail-POSIX/semaphore.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/semaphore.h/linknamespace): Likewise.
syslog functions bring in references to dprintf, which wasn't added to
POSIX until the 2008 edition and so isn't in various standards
containing the syslog functions. This patch fixes this by making
dprintf into a weak alias of __dprintf and using __dprintf as
appropriate.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18534]
* stdio-common/dprintf.c (__dprintf): Use libc_hidden_def.
(dprintf): Define as a weak alias of __dprintf, not a strong
alias.
* include/stdio.h (__dprintf): Declare. Use libc_hidden_proto.
* misc/syslog.c (__vsyslog_chk): Call __dprintf instead of
dprintf.
* conform/Makefile (test-xfail-XPG4/syslog.h/linknamespace):
Remove variable.
(test-xfail-UNIX98/syslog.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/syslog.h/linknamespace): Likewise.
syslog functions (in POSIX) bring in the strong symbol vsyslog (not in
POSIX). This patch fixes this by changing this symbol from a strong
alias to a weak alias.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch). (vsyslog becomes weak
in the static libraries, which is what's needed; the particular macro
sequence in use leaves it as strong in the shared libraries, hence
those libraries being completely unchanged, but it doesn't generally
matter whether symbols exported from the shared libraries are weak or
strong.)
[BZ #18533]
* misc/syslog.c (vsyslog): Define as a weak alias of __vsyslog,
not a strong alias.
* conform/Makefile (test-xfail-XOPEN2K8/syslog.h/linknamespace):
Remove variable.
gethostbyaddr brings in references to in6addr_any and thereby
in6addr_loopback, which aren't in all the standards containing
gethostbyaddr (gethostbyaddr is in XPG4 and UNIX98, in6addr_any and
in6addr_loopback are new in POSIX.1:2001). This patch fixes this by
making those symbols into weak aliases (safe in this case, unlike for
most data symbols, because these data symbols are const).
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). Disassembly is unchanged for
x86_64; for x86, I see some changes of stack offsets, but no other
code generation changes or code size differences.
[BZ #18532]
* inet/in6_addr.c (in6addr_any): Rename to __in6addr_any and
define as weak alias of __in6addr_any. Use libc_hidden_data_weak.
(in6addr_loopback): Rename to __in6addr_loopback and define as
weak alias of __in6addr_loopback. Use libc_hidden_data_weak.
* include/netinet/in.h (__in6addr_loopback): Declare. Use
libc_hidden_proto.
(__in6addr_any): Likewise.
* inet/gethstbyad_r.c (PREPROCESS): Use __in6addr_any instead of
in6addr_any.
* conform/Makefile (test-xfail-XPG4/netdb.h/linknamespace): Remove
variable.
(test-xfail-UNIX98/netdb.h/linknamespace): Likewise.
Lazy TLSDESC initialization needs to be synchronized with concurrent TLS
accesses. The TLS descriptor contains a function pointer (entry) and an
argument that is accessed from the entry function. With lazy initialization
the first call to the entry function updates the entry and the argument to
their final value. A final entry function must make sure that it accesses an
initialized argument, this needs synchronization on systems with weak memory
ordering otherwise the writes of the first call can be observed out of order.
There are at least two issues with the current code:
tlsdesc.c (i386, x86_64, arm, aarch64) uses volatile memory accesses on the
write side (in the initial entry function) instead of C11 atomics.
And on systems with weak memory ordering (arm, aarch64) the read side
synchronization is missing from the final entry functions (dl-tlsdesc.S).
This patch only deals with aarch64.
* Write side:
Volatile accesses were replaced with C11 relaxed atomics, and a release
store was used for the initialization of entry so the read side can
synchronize with it.
* Read side:
TLS access generated by the compiler and an entry function code is roughly
ldr x1, [x0] // load the entry
blr x1 // call it
entryfunc:
ldr x0, [x0,#8] // load the arg
ret
Various alternatives were considered to force the ordering in the entry
function between the two loads:
(1) barrier
entryfunc:
dmb ishld
ldr x0, [x0,#8]
(2) address dependency (if the address of the second load depends on the
result of the first one the ordering is guaranteed):
entryfunc:
ldr x1,[x0]
and x1,x1,#8
orr x1,x1,#8
ldr x0,[x0,x1]
(3) load-acquire (ARMv8 instruction that is ordered before subsequent
loads and stores)
entryfunc:
ldar xzr,[x0]
ldr x0,[x0,#8]
Option (1) is the simplest but slowest (note: this runs at every TLS
access), options (2) and (3) do one extra load from [x0] (same address
loads are ordered so it happens-after the load on the call site),
option (2) clobbers x1 which is problematic because existing gcc does
not expect that, so approach (3) was chosen.
A new _dl_tlsdesc_return_lazy entry function was introduced for lazily
relocated static TLS, so non-lazy static TLS can avoid the synchronization
cost.
[BZ #18034]
* sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_return_lazy): Declare.
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Define.
(_dl_tlsdesc_undefweak): Guarantee TLSDESC entry and argument load-load
ordering using ldar.
(_dl_tlsdesc_dynamic): Likewise.
(_dl_tlsdesc_return_lazy): Likewise.
* sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_resolve_rela_fixup): Use
relaxed atomics instead of volatile and synchronize with release store.
(_dl_tlsdesc_resolve_hold_fixup): Use relaxed atomics instead of
volatile.
* elf/tlsdeschtab.h (_dl_tlsdesc_resolve_early_return_p): Likewise.
syslog (XSI POSIX) brings in references to fputs_unlocked (not
POSIX). This patch fixes this by making fputs_unlocked into a weak
alias for __fputs_unlocked and using __fputs_unlocked as needed. (No
linknamespace test XFAILs are removed because there are other failures
from syslog as well.)
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). Disassembly of installed
stripped shared libraries is unchanged on x86_64; on x86, I see some
small changes to instruction ordering and register choice, with no
apparent reason for such changes to be related to this patch, but they
also seem completely harmless with no change to code size.
[BZ #18530]
* libio/iofputs.c [!_IO_MTSAFE_IO] (__fputs_unlocked): Define as
strong alias of _IO_fputs. Use libc_hidden_def.
* libio/iofputs_u.c (fputs_unlocked): Rename to __fputs_unlocked
and define as weak alias of __fputs_unlocked. Use
libc_hidden_weak.
* include/stdio.h (__fputs_unlocked): Declare. Use
libc_hidden_proto.
* misc/syslog.c (__vsyslog_chk): Call __fputs_unlocked instead of
fputs_unlocked.
netdb.h declares interfaces such as getaddrinfo if __USE_POSIX,
i.e. POSIX.1:1990 or later. However, these interfaces were new in the
2001 edition of POSIX, although the header was in XPG4 and UNIX98, so
they should not be declared for XPG4 or UNIX98. (This produces
spurious linknamespace test failures, although there are other
failures for this header as well for the same standards so this patch
doesn't remove any XFAILs.) This patch corrects the condition, and
the conform/ test expectations which were similarly wrong.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18529]
* resolv/netdb.h [__USE_POSIX]: Change condition to
[__USE_XOPEN2K].
* conform/data/netdb.h-data [XPG4 || UNIX98] (struct addrinfo): Do
not expect.
[XPG4 || UNIX98] (AI_PASSIVE): Likewise.
[XPG4 || UNIX98] (AI_CANONNAME): Likewise.
[XPG4 || UNIX98] (AI_NUMERICHOST): Likewise.
[XPG4 || UNIX98] (AI_V4MAPPED): Likewise.
[XPG4 || UNIX98] (AI_ALL): Likewise.
[XPG4 || UNIX98] (AI_ADDRCONFIG): Likewise.
[XPG4 || UNIX98] (AI_NUMERICSERV): Likewise.
[XPG4 || UNIX98] (NI_NOFQDN): Likewise.
[XPG4 || UNIX98] (NI_NUMERICHOST): Likewise.
[XPG4 || UNIX98] (NI_NAMEREQD): Likewise.
[XPG4 || UNIX98] (NI_NUMERICSERV): Likewise.
[XPG4 || UNIX98] (NI_DGRAM): Likewise.
[XPG4 || UNIX98] (EAI_AGAIN): Likewise.
[XPG4 || UNIX98] (EAI_BADFLAGS): Likewise.
[XPG4 || UNIX98] (EAI_FAIL): Likewise.
[XPG4 || UNIX98] (EAI_FAMILY): Likewise.
[XPG4 || UNIX98] (EAI_MEMORY): Likewise.
[XPG4 || UNIX98] (EAI_NONAME): Likewise.
[XPG4 || UNIX98] (EAI_SERVICE): Likewise.
[XPG4 || UNIX98] (EAI_SOCKTYPE): Likewise.
[XPG4 || UNIX98] (EAI_SYSTEM): Likewise.
[XPG4 || UNIX98] (EAI_SYSTEM): Likewise.
[XPG4 || UNIX98] (freeaddrinfo): Likewise.
[XPG4 || UNIX98] (gai_strerror): Likewise.
[XPG4 || UNIX98] (getaddrinfo): Likewise.
[XPG4 || UNIX98] (getnameinfo): Likewise.
grp.h declares endgrent and getgrent if __USE_XOPEN2K8 (i.e. 2008
edition of POSIX, non-XSI). However, the 2013 Technical Corrigendum
corrected the grp.h specification to XSI-shade these functions as in
previous editions (see <http://austingroupbugs.net/view.php?id=24>),
so they should not be declared for non-XSI POSIX. This patch corrects
the conditions - using __USE_MISC || __USE_XOPEN_EXTENDED to match
setgrent - and the conform/ test expectations for this header, thereby
fixing the conform tests for this header for XPG3 (where the
expectations were wrong) and the linknamespace tests for it for
POSIX2008 (where the header bug meant it was wrongly considered a
problem for endgrent to bring in a reference to setgrent).
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18528]
* grp/grp.h (endgrent): Condition on [__USE_MISC ||
__USE_XOPEN_EXTENDED], not [__USE_XOPEN_EXTENDED ||
__USE_XOPEN2K8].
(getgrent): Likewise.
* conform/data/grp.h-data [XPG3 || POSIX2008] (getgrent): Do not
expect.
[XPG3 || POSIX2008] (endgrent): Likewise.
[XPG3] (setgrent): Likewise.
* conform/Makefile (test-xfail-XPG3/grp.h/conform): Remove
variable.
(test-xfail-POSIX2008/grp.h/linknamespace): Likewise.
Various functions in XPG4 bring in references to getlogin_r, which is
not in XPG4; this is also a bug for some older POSIX versions which
aren't yet covered by the linknamespace tests. This patch fixes this
by making getlogin_r into a weak alias for __getlogin_r and using
__getlogin_r as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18527]
* login/getlogin_r.c (getlogin_r): Rename to __getlogin_r and
define as weak alias of __getlogin_r. Use libc_hidden_weak.
* sysdeps/mach/hurd/getlogin_r.c (getlogin_r): Likewise.
* sysdeps/unix/getlogin_r.c (getlogin_r): Likewise.
* sysdeps/unix/sysv/linux/getlogin_r.c (getlogin_r): Likewise.
* include/unistd.h (__getlogin_r): Declare. Use
libc_hidden_proto.
* posix/glob.c (glob): Call __getlogin_r instead of getlogin_r.
* conform/Makefile (test-xfail-XPG3/glob.h/linknamespace): Remove
variable.
(test-xfail-XPG3/wordexp.h/linknamespace): Likewise.
(test-xfail-XPG4/glob.h/linknamespace): Likewise.
(test-xfail-XPG4/wordexp.h/linknamespace): Likewise.
a non-standard directory specified by the prefix make variable
fails with an error. Since this is an unsupported use case,
this change makes make install fail early and with a descriptive
error message when either the prefix or the exec_prefix make
variable is overridden on the command line.
aio_* bring in references to pread, which isn't in all the standards
containing aio_* (as a reference from one library to another, this is
a bug for dynamic as well as static linking). This patch fixes this
by using __libc_pread instead, exporting that function from libc at
symbol version GLIBC_PRIVATE; the code, with conditionals that may
call either __pread64 or __libc_pread, becomes exactly analogous to
that elsewhere in the same file that may call either __pwrite64 or
__libc_pwrite.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed shared libraries). libc changes because of the PLT entry
for the newly exported __libc_pread; librt changes because of
assertion line numbers and PLT rearrangement; other stripped installed
shared libraries do not change.
[BZ #18519]
* posix/Versions (libc): Export __libc_pread at version
GLIBC_PRIVATE.
* sysdeps/pthread/aio_misc.c (handle_fildes_io): Call __libc_pread
instead of pread.
* conform/Makefile (test-xfail-POSIX/aio.h/linknamespace): Remove
variable.
The functions ecvt, fcvt and gcvt, in some standards, bring in
references to ecvt_r and fcvt_r, which aren't in any of those
standards. The calls are correctly to __ecvt_r and __fcvt_r, but then
the names ecvt_r and fcvt_r are defined as strong aliases; this patch
changes them to weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18522]
* misc/efgcvt_r.c
[LONG_DOUBLE_COMPAT (libc, GLIBC_2_0) && !LONG_DOUBLE_CVT]
(cvt_symbol): Use weak_alias instead of strong_alias.
[LONG_DOUBLE_COMPAT (libc, GLIBC_2_0)] (cvt_symbol): Likewise.
* conform/Makefile (test-xfail-XPG4/stdlib.h/linknamespace):
Remove variable.
(test-xfail-UNIX98/stdlib.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/stdlib.h/linknamespace): Likewise.
The 2008 edition of POSIX removed h_errno, but some functions still
bring in references to the h_errno external symbol. As this symbol is
not a part of the public ABI (only __h_errno_location is), this patch
fixes this by renaming the GLIBC_PRIVATE TLS symbol to __h_errno.
Tested for x86_64 and x86 (testsuite, and comparison of installed
shared libraries). Disassembly of all shared libraries using h_errno
changes because of the renaming (and changes to associated TLS / GOT
offsets in some cases); disassembly of libpthread on x86_64 changes
more substantially because the enlargement of .dynsym affects
subsequent addresses.
[BZ #18520]
* inet/herrno.c (h_errno): Rename to __h_errno.
(__libc_h_errno): Define as alias of __h_errno not h_errno.
* include/netdb.h [IS_IN_LIB && !IS_IN (libc)] (h_errno): Define
to __h_errno instead of h_errno.
* nptl/herrno.c (h_errno): Rename to __h_errno.
(__h_errno_location): Refer to __h_errno not h_errno.
* resolv/Versions (h_errno): Rename to __h_errno.
* conform/Makefile (test-xfail-XOPEN2K8/grp.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/pwd.h/linknamespace): Likewise.
Here is implementation of vectorized sin containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
* bits/libm-simd-decl-stubs.h: Added stubs for sin.
* math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
* sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
* sysdeps/x86_64/fpu/Versions: New versions added.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin_data.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin_data.h: New file.
* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test.
* sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
* NEWS: Mention addition of x86_64 vector sin.
In commit 02657da2cf, .interp section
was removed from libpthread.so. This led to an error:
$ /lib64/libpthread.so.0
Native POSIX Threads Library by Ulrich Drepper et al
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Forced unwind support included.
Segmentation fault
(gdb) bt
#0 0x00000000000055a6 in _exit@plt ()
Unfortunately, there is no way to add a regression test for the bug
because .interp specifies the path to dynamic linker of the target
system.
[BZ #18479]
* nptl/pt-interp.c: New file.
* nptl/Makefile (libpthread-routines, libpthread-shared-only-routines):
Add pt-interp.
[$(build-shared) = yes] ($(objpfx)pt-interp.os): Depend on
$(common-objpfx)runtime-linker.h.
regcomp brings in references to wcscoll, which isn't in all the
standards that contain regcomp. In turn, wcscoll brings in references
to wcscmp, also not in all those standards. This patch fixes this by
making those functions into weak aliases of __wcscoll and __wcscmp and
calling those names instead as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18497]
* wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead
of wcscmp.
(wcscmp): Define as weak alias of WCSCMP.
* wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of
wcscoll.
(USE_HIDDEN_DEF): Define.
[!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of
__wcscoll. Don't use libc_hidden_weak.
* wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of
wcscmp.
* sysdeps/i386/i686/multiarch/wcscmp-c.c
[SHARED] (libc_hidden_def): Define __GI___wcscmp instead of
__GI_wcscmp.
(weak_alias): Undefine and redefine.
* sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to
__wcscmp and define as weak alias of __wcscmp.
* sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
* include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto.
(__wcscoll): Likewise.
(wcscmp): Don't use libc_hidden_proto.
(wcscoll): Likewise.
* posix/regcomp.c (build_range_exp): Call __wcscoll instead of
wcscoll.
* posix/regexec.c (check_node_accept_bytes): Likewise.
* conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove
variable.
(test-xfail-XPG4/regex.h/linknamespace): Likewise.
(test-xfail-POSIX/regex.h/linknamespace): Likewise.
pathconf uses __statvfs64, and fpathconf uses __fstatvfs64. On
systems using sysdeps/unix/sysv/linux/wordsize-64, __statvfs64 then
brings in the strong symbol statvfs, and __fstatvfs64 brings in the
strong symbol fstatvfs, which are not in all the standards that have
pathconf and fpathconf. This patch fixes this by making those symbols
into weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18507]
* sysdeps/unix/sysv/linux/fstatvfs.c (fstatvfs): Rename to
__fstatvfs and define as weak alias of __fstatvfs. Use
libc_hidden_weak.
* sysdeps/unix/sysv/linux/statvfs.c (statvs): Rename to __statvfs
and define as weak alias of __statvfs. Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/wordsize-64/fstatvfs.c (__fstatvfs64):
Define as alias of __fstatvfs, not fstatvfs.
(fstatvfs64): Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/statvfs.c (__statvfs64):
Define as alias of __statvfs, not statvfs.
(statvfs64): Likewise.
* conform/Makefile (test-xfail-POSIX/unistd.h/linknamespace):
Remove variable.
Here is implementation of vectorized cosf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
* sysdeps/x86_64/fpu/Versions: New versions added.
* sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file.
* sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file.
* sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
* sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf.
* NEWS: Mention addition of x86_64 vector cosf.
Here is implementation of cos containing SSE, AVX, AVX2 and AVX512
versions according to Vector ABI which had been discussed in
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
Vector math library build and ABI testing enabled by default for x86_64.
* sysdeps/x86_64/fpu/Makefile: New file.
* sysdeps/x86_64/fpu/Versions: New file.
* sysdeps/x86_64/fpu/svml_d_cos_data.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos_data.h: New file.
* sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos.
* math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC.
* sysdeps/x86_64/configure.ac: Options for libmvec build.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file.
* manual/install.texi (Configuring and compiling): Document
--disable-mathvec.
* INSTALL: Regenerated.
* NEWS: Mention addition of libmvec and x86_64 vector cos.
open_memstream is new in the 2008 edition of POSIX. However, the
older functions getopt, closelog and fmtmsg all bring in references to
it. This patch fixes this in the usual way, making open_memstream
into a weak alias of __open_memstream and calling __open_memstream
from the relevant places.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch). 32-bit builds
produce an XPASS for conform/POSIX/unistd.h/linknamespace after this
patch (because the only cause of failure left there now is 64-bit
specific); that will disappear once the 64-bit failure is resolved and
the XFAIL removed at that time.
[BZ #18498]
* libio/memstream.c (open_memstream): Rename to __open_memstream
and define as weak alias of __open_memstream.
* include/stdio.h (__open_memstream): Declare. Use
libc_hidden_proto.
(open_memstream): Don't use libc_hidden_proto.
* misc/syslog.c (__vsyslog_chk): Call __open_memstream instead of
open_memstream.
* posix/getopt.c (_getopt_internal_r): Likewise.
* conform/Makefile (test-xfail-XPG3/stdio.h/linknamespace): Remove
variable.
(test-xfail-XPG4/stdio.h/linknamespace): Likewise.
(test-xfail-UNIX98/stdio.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/unistd.h/linknamespace): Likewise.
The regex code brings in references to wcrtomb, which isn't in all the
standards that contain regex. This patch makes it call __wcrtomb
instead (in fact some places already called __wcrtomb, so this patch
makes it internally consistent about which name is used).
Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.
[BZ #18496]
* posix/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb
instead of wcrtomb.
signal.h declares psignal and psiginfo if __USE_XOPEN2K - that is, for
the 2001 edition of POSIX. These functions were actually added in the
2008 edition (as indicated in the header comments). This patch fixes
the header conditionals. This fixes some linknamespace test failures
because psiginfo uses fmemopen, which is also new in the 2008 edition,
so before the header fix this appeared to the linknamespace tests as a
2001 function bringing in references to a 2008 function. The problem
also appeared in conformtest header namespace test results (the
conformtest data has correct conditionals for when these functions
should be visible), but the affected headers still have other
namespace problems so this doesn't fix any of those XFAILs.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18483]
* signal/signal.h [__USE_XOPEN2K] (psignal): Change condition to
[__USE_XOPEN2K8]. Remove redundant #endif.
[__USE_XOPEN2K] (psiginfo): Change condition to [__USE_XOPEN2K8].
Remove redundant #if.
* conform/Makefile (test-xfail-XOPEN2K/signal.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K/sys/wait.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/ucontext.h/linknamespace): Likewise.
regcomp brings in references to various wctype functions that aren't
in all the standards including regcomp. This patch fixes this in the
usual way by using the __* versions of these functions (which already
exist, but some didn't have libc_hidden_proto / libc_hidden_def
before).
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch). (Other wide character
function references from the regex code mean that this patch by itself
doesn't fix any XFAILed linknamespace test failures; further patches
will be needed for that.)
[BZ #18495]
* wctype/wcfuncs.c (__iswalnum): Use libc_hidden_def.
(__iswlower): Likewise.
* include/wctype.h (__iswalnum): Declare. Use libc_hidden_proto.
(__iswlower): Likewise.
* posix/regcomp.c (re_compile_fastmap_iter): Call __towlower
instead of towlower.
* posix/regex_internal.c (build_wcs_upper_buffer): Call __iswlower
instead of iswlower. Call __towupper instead of towupper.
* posix/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum
instead of iswalnum.
This adds wake-ups that would be missing if assuming that for a
non-writer-preferring rwlock, if one thread has acquired a rdlock and
does not release it, another thread will eventually acquire a rdlock too
despite concurrent write lock acquisition attempts. BZ 14958 is about
supporting this assumption. Strictly speaking, this isn't a valid
test case, but nonetheless worth supporting (see comment 7 of BZ 14958).
If we set up a rwlock to prefer writers (and disallow recursive rdlock
acquisitions), then readers will block for writers that are blocked to
acquire the lock (otherwise, readers could constantly enter and exit,
and the writer would never get the lock). However, the existing
implementation did not wake such readers when the writer timed out.
This patch adds the missing wake-up.
There's no similar case for writers being blocked on readers.
fnmatch brings in references to strnlen, which isn't in all the
standards that contain fnmatch (not added until the 2008 edition of
POSIX), resulting in linknamespace test failures. (This is contrary
to glibc conventions, rather than a standards conformance issue,
because of the str* reservation.) This patch fixes this in the usual
way, using __strnlen instead of strnlen.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18470]
* posix/fnmatch.c (fnmatch) [_LIBC]: Call __strnlen instead of
strnlen.
* conform/Makefile (test-xfail-XPG3/fnmatch.h/linknamespace):
Remove variable.
(test-xfail-XPG4/fnmatch.h/linknamespace): Likewise.
(test-xfail-POSIX/fnmatch.h/linknamespace): Likewise.
(test-xfail-POSIX/glob.h/linknamespace): Likewise.
(test-xfail-POSIX/wordexp.h/linknamespace): Likewise.
(test-xfail-UNIX98/fnmatch.h/linknamespace): Likewise.
(test-xfail-UNIX98/glob.h/linknamespace): Likewise.
(test-xfail-UNIX98/wordexp.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/fnmatch.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/glob.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/wordexp.h/linknamespace): Likewise.
fnmatch brings in references to wmemchr, which isn't in all the
standards that contain fnmatch, resulting in linknamespace test
failures. This patch fixes this in the usual way, making wmemchr into
a weak alias for __wmemchr.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18468]
* wcsmbs/wmemchr.c (wmemchr): Rename to __wmemchr and define as
weak alias of __wmemchr. Use libc_hidden_weak.
* include/wchar.h (__wmemchr): Declare. Use libc_hidden_proto.
* posix/fnmatch.c [HANDLE_MULTIBYTE] (MEMCHR): Use __wmemchr
instead of wmemchr.
fnmatch brings in references to towlower (and thereby towupper), which
isn't in all the standards that contain fnmatch, resulting in
linknamespace test failures. (This is contrary to glibc conventions,
rather than a standards conformance issue, because of the to*
reservation.) This patch fixes this in the usual way, making those
functions into weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch). This is on top
of <https://sourceware.org/ml/libc-alpha/2015-06/msg00019.html>, but
the two patches should be independent.
(The __attribute_pure__ on the declarations in include/wctype.h comes
from GCC's built-in attributes for towlower and towupper, and is
needed to get the same code generation for fnmatch before and after
the patch. It seems likely there are cases where the declaration of
__foo in the internal headers is missing attributes from foo in the
public headers, built-in to GCC or both, but I don't know a good way
to detect such missing attributes.)
[BZ #18469]
* wctype/wcfuncs.c (towlower): Rename to __towlower and define as
weak alias of __towlower. Use libc_hidden_weak.
(towupper): Rename to __towupper and define as weak alias of
__towupper. Use libc_hidden_weak.
* include/wctype.h (__towlower): Declare. Use libc_hidden_proto.
(__towupper): Likewise.
* posix/fnmatch.c [HANDLE_MULTIBYTE && _LIBC] (FOLD): Use
__towlower instead of towlower.
PLT relocations aren't required when -z now used. Linker on master with:
commit 25070364b0ce33eed46aa5d78ebebbec6accec7e
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sat May 16 07:00:21 2015 -0700
Don't generate PLT relocations for now binding
There is no need for PLT relocations with -z now. We can use GOT
relocations, which take less space, instead and replace 16-byte .plt
entres with 8-byte .plt.got entries.
bfd/
* elf32-i386.c (elf_i386_check_relocs): Create .plt.got section
for now binding.
(elf_i386_allocate_dynrelocs): Use .plt.got section for now
binding.
* elf64-x86-64.c (elf_x86_64_check_relocs): Create .plt.got
section for now binding.
(elf_x86_64_allocate_dynrelocs): Use .plt.got section for now
binding.
won't generate PLT relocations with -z now. elf/tst-audit2.c expect
certain order of execution in ld.so. With PLT relocations, the GOTPLT
entry of calloc is update to calloc defined in tst-audit2:
(gdb) bt
skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
version=<optimized out>, sym=<optimized out>, map=<optimized out>)
at ../sysdeps/i386/dl-machine.h:329
out>,
nrelative=<optimized out>, relsize=<optimized out>,
reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
reloc_mode=reloc_mode@entry=0,
consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
start_argptr=start_argptr@entry=0xffffcfb0,
dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)
and then calloc is called:
(gdb) c
Continuing.
Breakpoint 4, calloc (n=n@entry=20, m=4) at tst-audit2.c:18
18 {
(gdb) bt
reloc_mode=reloc_mode@entry=0, consider_profiling=1,
consider_profiling@entry=0) at dl-reloc.c:272
user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
start_argptr=start_argptr@entry=0xffffcfb0,
dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)
With GOT relocation, calloc in ld.so is called first:
(gdb) bt
consider_profiling=1) at dl-reloc.c:272
user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2074
start_argptr=start_argptr@entry=0xffffcfa0,
dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)
and then the GOT entry of calloc is updated:
(gdb) bt
skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
version=<optimized out>, sym=<optimized out>, map=<optimized out>)
at ../sysdeps/i386/dl-machine.h:329
out>,
nrelative=<optimized out>, relsize=<optimized out>,
reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
reloc_mode=reloc_mode@entry=0,
consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2133
start_argptr=start_argptr@entry=0xffffcfa0,
dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)
After that, since calloc isn't called from ld.so nor any other modules,
magic in tst-audit2 isn't updated. Both orders are correct. This patch
makes sure that calloc in tst-audit2.c is called at least once from ld.so.
[BZ #18422]
* Makefile ($(objpfx)tst-audit2): Depend on $(libdl).
($(objpfx)tst-audit2.out): Also depend on
$(objpfx)tst-auditmod9b.so.
* elf/tst-audit2.c: Include <dlfcn.h>.
(calloc_called): New.
(calloc): Allow to be called more than once.
(do_test): dllopen/dlclose $ORIGIN/tst-auditmod9b.so.
In the introduction for the official orthography rules for Ukrainian
language (http://spelling.ulif.org.ua/peredmova.htm) there's a note
that only apostrophe does not affect order of the words when sorting.
As could be seen from the official alphabet the soft sign
(U+044C/U+042C) has its hard position and thus affects the order and
also letters "е" and "є" (CYR-IE: U+0435/U+0415 and UKR-IE:
U+0454/U+0404) have their own positions and should have separate place
when sorting.
This also corresponds to official Unicode collation chart for these
letters: http://unicode.org/charts/collation/chart_Cyrillic.html
On 21/05/15 05:29, Siddhesh Poyarekar wrote:
> On Wed, May 20, 2015 at 06:55:02PM +0100, Szabolcs Nagy wrote:
>> i guess it's ok for consistency if i fix struct stat64
>> too to use __USE_XOPEN2K8.
>>
>> i will run some tests and come back with a patch
>
> I also think it would be appropriate to change this code in other
> architectures (microblaze and nacl IIRC) to make all of them
> consistent. It is a mechanical enough change IMO that all arch
> maintainer acks is not necessary.
>
here is the patch with consistent __USE_XOPEN2K8
ok to commit?
2015-05-21 Szabolcs Nagy <szabolcs.nagy@arm.com>
[BZ #18234]
* conform/data/sys/stat.h-data (struct stat): Add tests for st_atim,
st_mtim and st_ctim members.
* sysdeps/nacl/bits/stat.h (struct stat, struct stat64): Make
st_atim, st_ctim, st_mtim visible under __USE_XOPEN2K8 only.
* sysdeps/unix/sysv/linux/generic/bits/stat.h (struct stat,):
(struct stat64): Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/stat.h (struct stat,):
(struct stat64): Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/stat.h (struct stat,):
(struct stat64): Likewise.
A shared object doesn't need PLT if there are no PLT relocations. It
shouldn't be an error if DT_PLTRELSZ is missing.
[BZ #18410]
* elf/dl-reloc.c (_dl_relocate_object): Don't issue an error
for missing DT_PLTRELSZ.
[BZ #18412]
* intl/locale.alias: Remove obsolete aliases "bokmål" and "français"
which caused 'locale -a' to output Latin-1 data in UTF-8 locales,
breaking some applications that use 'locale -a' output.
Change the encoding of this file from Latin-1 to ASCII to avoid
other potential problems with people grepping this file.
My review of conformtest expectations for POSIX showed up that the
_POSIX2_C_VERSION macro, required by POSIX and XPG standards before
2001, was missing in unistd.h, having been removed on 2003-04-03
despite those standards still being supported. This patch adds it
back. As it's in the implementation namespace, there's no need for it
to be conditional, and other such macros aren't conditional in this
header either.
Tested for x86_64 and x86 (testsuite). Note that this *does* change
the installed libraries, because it affects the sysconf support
(present all along) for _SC_2_C_VERSION.
[BZ #438]
* posix/unistd.h (_POSIX2_C_VERSION): New macro.
* conform/Makefile (test-xfail-POSIX/unistd.h/conform): Remove
variable.
pathconf (sysdeps/unix/sysv/linux/pathconf.c) uses basename. But
pathconf is in POSIX back to 1990 while basename is only reserved with
external linkage in those standards including XPG functions. This
patch fixes this namespace issue in the usual way, renaming basename
to __basename and making it into a weak alias.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18444]
* string/basename.c (basename): Rename to __basename and define as
weak alias of __basename. Use libc_hidden_weak.
* include/string.h (__basename): Declare. Use libc_hidden_proto.
* sysdeps/unix/sysv/linux/pathconf.c (distinguish_extX): Call
__basename instead of basename.
* conform/Makefile (test-xfail-POSIX2008/unistd.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/unistd.h/linknamespace): Likewise.
Remove use of ext.nsmap member of struct __res_state and always use
an identity mapping betwen the nsaddr_list array and the ext.nsaddrs
array. The fact that a nameserver has an IPv6 address is signalled by
setting nsaddr_list[].sin_family to zero.
ldbl-96 remquol wrongly handles the case where the first argument is
finite and the second infinite, because the check for the second
argument being a NaN fails to disregard the explicit high mantissa bit
and so wrongly interprets an infinity as being a NaN. This patch
fixes this by masking off that bit, and improves test coverage for
both remainder and remquo (various cases were missing tests, or, as in
the case of the bug, were tested only for one of the two functions).
Tested for x86_64 and x86.
[BZ #18244]
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Ignore explicit
high mantissa bit when testing whether P is a NaN.
* math/libm-test.inc (remainder_test_data): Add more tests.
(remquo_test_data): Likewise.
The i386 implementation of atanhl, for small arguments, does a
calculation that involves computing twice the square of the argument,
resulting in spurious underflows for some arguments. This patch fixes
this by just returning the argument when its exponent is below -32,
with underflow being forced as needed for subnormal arguments.
Tested for x86 and x86_64.
[BZ #18049]
* sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): For exponents
below -32, return the argument, with underflow if subnormal.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
[BZ #17581] The checking chain of unused chunks was terminated by a hash of
the block pointer, which was sometimes confused with the chunk length byte.
We now avoid using a length byte equal to the magic byte.
When the malloc subsystem detects some kind of memory corruption,
depending on the configuration it prints the error, a backtrace, a
memory map and then aborts the process. In this process, the
backtrace() call may result in a call to malloc, resulting in
various kinds of problematic behavior.
In one case, the malloc it calls may detect a corruption and call
backtrace again, and a stack overflow may result due to the infinite
recursion. In another case, the malloc it calls may deadlock on an
arena lock with the malloc (or free, realloc, etc.) that detected the
corruption. In yet another case, if the program is linked with
pthreads, backtrace may do a pthread_once initialization, which
deadlocks on itself.
In all these cases, the program exit is not as intended. This is
avoidable by marking the arena that malloc detected a corruption on,
as unusable. The following patch does that. Features of this patch
are as follows:
- A flag is added to the mstate struct of the arena to indicate if the
arena is corrupt.
- The flag is checked whenever malloc functions try to get a lock on
an arena. If the arena is unusable, a NULL is returned, causing the
malloc to use mmap or try the next arena.
- malloc_printerr sets the corrupt flag on the arena when it detects a
corruption
- free does not concern itself with the flag at all. It is not
important since the backtrace workflow does not need free. A free
in a parallel thread may cause another corruption, but that's not
new
- The flag check and set are not atomic and may race. This is fine
since we don't care about contention during the flag check. We want
to make sure that the malloc call in the backtrace does not trip on
itself and all that action happens in the same thread and not across
threads.
I verified that the test case does not show any regressions due to
this patch. I also ran the malloc benchmarks and found an
insignificant difference in timings (< 2%).
* malloc/Makefile (tests): New test case tst-malloc-backtrace.
* malloc/arena.c (arena_lock): Check if arena is corrupt.
(reused_arena): Find a non-corrupt arena.
(heap_trim): Pass arena to unlink.
* malloc/hooks.c (malloc_check_get_size): Pass arena to
malloc_printerr.
(top_check): Likewise.
(free_check): Likewise.
(realloc_check): Likewise.
* malloc/malloc.c (malloc_printerr): Add arena argument.
(unlink): Likewise.
(munmap_chunk): Adjust.
(ARENA_CORRUPTION_BIT): New macro.
(arena_is_corrupt): Likewise.
(set_arena_corrupt): Likewise.
(sysmalloc): Use mmap if there are no usable arenas.
(_int_malloc): Likewise.
(__libc_malloc): Don't fail if arena_get returns NULL.
(_mid_memalign): Likewise.
(__libc_calloc): Likewise.
(__libc_realloc): Adjust for additional argument to
malloc_printerr.
(_int_free): Likewise.
(malloc_consolidate): Likewise.
(_int_realloc): Likewise.
(_int_memalign): Don't touch corrupt arenas.
* malloc/tst-malloc-backtrace.c: New test case.
Similar to various other bugs in this area, some atanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (No change in this regard is needed
for the i386 implementation; special handling to force underflows in
these cases will only be needed there when the spurious underflows,
bug 18049, get fixed.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16352]
* sysdeps/i386/fpu/e_atanh.S (dbl_min): New object.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/i386/fpu/e_atanhf.S (flt_min): New object.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_atanh.c: Include <float.h>.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/flt-32/e_atanhf.c: Include <float.h>.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from atanh.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of tanf produces spurious underflow
exceptions for some small arguments, through computing values on the
order of x^5. This patch fixes this by adjusting the threshold for
returning x (or, as applicable, +/- 1/x) to 2**-13 (the next term in
the power series being x^3/3).
Tested for x86_64 and x86.
[BZ #18221]
* sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Use 2**-13 not
2**-28 as threshold for returning x or +/- 1/x.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of lgammaf produces spurious underflow
exceptions for some large arguments, because of calculations involving
x^-2 multiplied by small constants. This patch fixes this by
adjusting the threshold for a simpler computation to 2**26 (the error
in the simpler computation is on the order of 0.5 * log (x), for a
result on the order of x * log (x)).
Tested for x86_64 and x86.
[BZ #18220]
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Use
2**26 not 2**58 as threshold for returning x * (log (x) - 1).
* math/auto-libm-test-in: Add another test of lgamma.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of erfcf produces spurious underflow
exceptions for some arguments close to 0, because of calculations
squaring the argument and then multiplying by small constants. This
patch fixes this by adjusting the threshold for arguments for which
the result is so close to 1 that 1 - x will give the right result from
2**-56 to 2**-26. (If 1 - x * 2/sqrt(pi) were used, the errors would be
on the order of x^3 and a much larger threshold could be used.)
Tested for x86_64 and x86.
[BZ #18217]
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Use 2**-26 not 2**-56
as threshold for returning 1 - x.
* math/auto-libm-test-in: Add more tests of erfc.
* math/auto-libm-test-out: Regenerated.
The sysdeps/ieee754/flt-32 version of atanf produces spurious
underflow exceptions for some large arguments, because of computations
that compute x^-4. This patch fixes this by adjusting the threshold
for large arguments (for which +/- pi/2 can just be returned, the
correct result being roughly +/- pi/2 - 1/x) from 2^34 to 2^25.
Tested for x86_64 and x86.
[BZ #18196]
* sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Use 2^25 not 2^34 as
threshold for large arguments.
* math/auto-libm-test-in: Add another test of atan.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some log1p implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (The ldbl-128ibm implementation
doesn't currently need any change as it already generates this
exception, albeit through code that would generate spurious exceptions
in other cases; special code for this issue will only be needed there
when fixing the spurious exceptions.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16339]
* sysdeps/i386/fpu/s_log1p.S (dbl_min): New object.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/s_log1pf.S (flt_min): New object.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/dbl-64/s_log1p.c: Include <float.h>.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/s_log1pf.c: Include <float.h>.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_log1pl.c: Include <float.h>.
(__log1pl): Force underflow exception for results with small
absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from log1p.
* math/auto-libm-test-out: Regenerated.
Programs are supposed to be able to define the __fpu_control variable,
overriding the library's version to cause the floating-point control
word to be set to the chosen value at startup.
This is broken for mips16 for static linking because the library's
__fpu_control variable is in the same object file as the helper
functions used by fpu_control.h for mips16, so test-fpucw-ieee-static
fails to link with multiple definitions of __fpu_control.
This patch fixes this by putting the helpers in a separate file rather
than overriding fpu_control.c. Tested for mips16 that this fixes the
link failure and the ABI tests still pass.
[BZ #18397]
* sysdeps/mips/mips32/fpu/fpu_control.c: Move to ....
* sysdeps/mips/mips32/fpu/fpucw-helpers.c: ... here. Include
<fpu_control.h> instead of <math/fpu_control.c>.
* sysdeps/mips/mips32/fpu/Makefile: New file.
There appears to be a discrepancy among the implementations
of setcontext with regards to the function called once the last
linked-to context has finished executing via setcontext.
The POSIX standard says:
~~~
If the uc_link member of the ucontext_t structure pointed to by
the ucp argument is equal to 0, then this context is the main
context, and the thread will exit when this context returns.
~~~
It says "exit" not "exit immediately" nor "exit without running
functions registered with atexit or on_exit."
Therefore the AArch64, ARM, hppa and NIOS II implementations are
wrong and no test detects it.
It is questionable if this should even be fixed or just documented
that the above 4 targets are wrong. The functions are deprecated
and nobody should be using them, but at the same time it silly to
have cross-target differences that make it hard to port old
applications from say x86_64 to AArch64.
Therefore I will ix the 4 arches, and checkin a regression
test to prevent it from changing again.
https://sourceware.org/ml/libc-alpha/2015-03/msg00720.html
Robin Hack discovered Samba would enter an infinite loop processing
certain quota-related requests. We eventually tracked this down to a
glibc issue.
Running a (simplified) test case under strace shows that /etc/passwd
is continuously opened and closed:
…
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR) = 0
read(3, "root❌0:0:root:/root:/bin/bash\n"..., 4096) = 2717
lseek(3, 2717, SEEK_SET) = 2717
close(3) = 0
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR) = 0
lseek(3, 0, SEEK_SET) = 0
read(3, "root❌0:0:root:/root:/bin/bash\n"..., 4096) = 2717
lseek(3, 2717, SEEK_SET) = 2717
close(3) = 0
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR) = 0
…
The lookup function implementation in
nss/nss_files/files-XXX.c:DB_LOOKUP has code to prevent that. It is
supposed skip closing the input file if it was already open.
/* Reset file pointer to beginning or open file. */ \
status = internal_setent (keep_stream); \
\
if (status == NSS_STATUS_SUCCESS) \
{ \
/* Tell getent function that we have repositioned the file pointer. */ \
last_use = getby; \
\
while ((status = internal_getent (result, buffer, buflen, errnop \
H_ERRNO_ARG EXTRA_ARGS_VALUE)) \
== NSS_STATUS_SUCCESS) \
{ break_if_match } \
\
if (! keep_stream) \
internal_endent (); \
} \
keep_stream is initialized from the stayopen flag in internal_setent.
internal_setent is called from the set*ent implementation as:
status = internal_setent (stayopen);
However, for non-host database, this flag is always 0, per the
STAYOPEN magic in nss/getXXent_r.c.
Thus, the fix is this:
- status = internal_setent (stayopen);
+ status = internal_setent (1);
This is not a behavioral change even for the hosts database (where the
application can specify the stayopen flag) because with a call to
sethostent(0), the file handle is still not closed in the
implementation of gethostent.
The implementation of roundl for ldbl-128 involves undefined behavior
for arguments with exponents from 31 to 47 inclusive, from the shift:
u_int64_t i = -1ULL >> (j0 - 48);
For example, on mips64, this means roundl (0xffffffffffff.8p0L)
wrongly returns its argument, which is not an integer. A condition
checking for exponents < 31 should actually be checking for exponents
< 48, and this patch makes it do so. (That condition is for whether
the bit representing 0.5 is in the high 64-bit half of the
floating-point number. The value 31 might have arisen from an
incorrect conversion of the ldbl-96 version to handle ldbl-128.)
This was originally reported as a GCC libquadmath bug
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65757>.
Tested for mips64; also tested for x86_64 and x86 to make sure the new
tests pass there.
[BZ #18346]
* sysdeps/ieee754/ldbl-128/s_roundl.c (__roundl): Handle all
exponents less than 48 as cases where high part of mantissa needs
examining to determine whether argument is integral.
* math/libm-test.inc (round_test_data): Add more tests.
add_temp_file now makes a copy which is freed by delete_temp_files.
Callers to create_temp_file can now free the returned file name to
avoid the memory leak. These changes do not affect the leak behavior
of existing code.
Also address a NULL pointer derefence in tzset after a memoru allocation
failure, found during testing.
This patch adds support to query cache information on s390
via sysconf() function - e.g. with _SC_LEVEL1_ICACHE_SIZE.
The attributes size, linesize and assoc can be queried
for cache level 1 - 4 via "extract cpu attribute" instruction,
which was first available with z10.
* NEWS: Mention sysconf() cache information support for s390.
* sysdeps/unix/sysv/linux/s390/sysconf.c: New File.
[BZ #18206]
* wcsmbs/wcsncmp.c (wcsncmp): Compare as wchar_t, not wint_t.
Use signed comparision instead of substraction to avoid
overflow bug.
* localedata/tests-mbwc/tst_wcsncmp.c (tst_wcsncmp):
Take the sign of ret.
* localedata/tests-mbwc/dat_wcsncmp.c (tst_wcsncmp_loc):
Do not expect precise return values. Only the sign matters.
* wcsmbs/Makefile (strop-tests): Add wcsncmp.
* wcsmbs/test-wcsncmp.c: New File.
* string/test-strncmp.c: Add wcsncmp support.
According to bug 6792, errno is not set to ERANGE/EDOM
by calling log1p/log1pf/log1pl with x = -1 or x < -1.
This patch adds a wrapper which sets errno in those cases
and returns the value of the existing __log1p function.
The log1p is now an alias to the wrapper function
instead of __log1p.
The files in sysdeps are reflecting these changes.
The ia64 implementation sets errno by itself,
thus the wrapper-file is empty.
The libm-test is adjusted for log1p-tests to check errno.
[BZ #6792]
* math/w_log1p.c: New file.
* math/w_log1pf.c: Likewise.
* math/w_log1pl.c: Likewise.
* math/Makefile (libm-calls): Add w_log1p.
* math/s_log1pl.c (log1pl): Remove weak_alias.
* sysdeps/i386/fpu/s_log1p.S (log1p): Likewise.
* sysdeps/i386/fpu/s_log1pf.S (log1pf): Likewise.
* sysdeps/i386/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/x86_64/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/ieee754/dbl-64/s_log1p.c (log1p): Likewise.
[NO_LONG_DOUBLE] (log1pl): Likewise.
* sysdeps/ieee754/flt-32/s_log1pf.c (log1pf): Likewise.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/s_log1pl.c
(log1p): Remove long_double_symbol.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/w_log1pl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/w_log1pl.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1p.c: Define empty weak_alias to
remove weak_alias for corresponding log1p function.
* sysdeps/m68k/m680x0/fpu/s_log1pf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise.
* sysdeps/ia64/fpu/w_log1p.c: New file.
* sysdeps/ia64/fpu/w_log1pf.c: Likewise.
* sysdeps/ia64/fpu/w_log1pl.c: Likewise.
* math/libm-test.inc (log1p_test_data): Add errno expectations.
Bug 18247 is an off-by-one error in strtof's determination of a
decimal exponent such that any value with that decimal exponent is at
most half the least subnormal and so the appropriate underflowing
value for the rounding mode can be determined with no
multiple-precision computations. (Whether the value is in fact safe
despite the off-by-one depends on the floating-point format in
question. It's wrong for float and for m68k ldbl-96 but not for other
supported formats.) This patch corrects the computation of the
exponent in question to be safe in general, adding a comment
explaining the new computation.
Tested for x86_64.
[BZ #18247]
* stdlib/strtod_l.c (____STRTOF_INTERNAL): Decrease minimum
decimal exponent by 1.
* stdlib/tst-strtod-round-data: Add more tests.
* stdlib/tst-strtod-round.c (tests): Regenerated.
The dbl-64 implementation of atan2 does computations that expect to
run in round-to-nearest mode, and in other modes the errors can
accumulate to more than the maximum accepted 9ulp. This patch makes
it use FE_TONEAREST internally, similar to other functions with such
issues. Tests that previously produced large errors are added for
atan2 and the closely related carg, clog and clog10 functions.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18210]
[BZ #18211]
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <fenv.h>.
(__ieee754_atan2): Set FE_TONEAREST mode for internal
computations.
* math/auto-libm-test-in: Add more tests of atan2, carg, clog and
clog10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The dbl-64 implementation of atan does computations that expect to run
in round-to-nearest mode, and in other modes the errors can accumulate
to more than the maximum accepted 9ulp. This patch makes it use
FE_TONEAREST internally, similar to other functions with such issues.
Tested for x86_64 and x86; no ulps updates needed.
[BZ #18197]
* sysdeps/ieee754/dbl-64/s_atan.c: Include <fenv.h>.
(atan): Set FE_TONEAREST mode for internal computations.
* math/auto-libm-test-in: Add more tests of atan.
* math/auto-libm-test-out: Regenerated.
Trimming heaps is a balance between saving memory and the system overhead
required to update page tables and discard allocated pages. The malloc
option M_TRIM_THRESHOLD is a tunable that users are meant to use to decide
where this balance point is but it is only applied to the main arena.
For scalability reasons, glibc malloc has per-thread heaps but these are
shrunk with madvise() if there is one page free at the top of the heap.
In some circumstances this can lead to high system overhead if a thread
has a control flow like
while (data_to_process) {
buf = malloc(large_size);
do_stuff();
free(buf);
}
For a large size, the free() will call madvise (pagetable teardown, page
free and TLB flush) every time followed immediately by a malloc (fault,
kernel page alloc, zeroing and charge accounting). The kernel overhead
can dominate such a workload.
This patch allows the user to tune when madvise gets called by applying
the trim threshold to the per-thread heaps and using similar logic to the
main arena when deciding whether to shrink. Alternatively if the dynamic
brk/mmap threshold gets adjusted then the new values will be obeyed by
the per-thread heaps.
Bug 17195 was a test case motivated by a problem encountered in scientific
applications written in python that performance badly due to high page fault
overhead. The basic operation of such a program was posted by Julian Taylor
https://sourceware.org/ml/libc-alpha/2015-02/msg00373.html
With this patch applied, the overhead is eliminated. All numbers in this
report are in seconds and were recorded by running Julian's program 30
times.
pyarray
glibc madvise
2.21 v2
System min 1.81 ( 0.00%) 0.00 (100.00%)
System mean 1.93 ( 0.00%) 0.02 ( 99.20%)
System stddev 0.06 ( 0.00%) 0.01 ( 88.99%)
System max 2.06 ( 0.00%) 0.03 ( 98.54%)
Elapsed min 3.26 ( 0.00%) 2.37 ( 27.30%)
Elapsed mean 3.39 ( 0.00%) 2.41 ( 28.84%)
Elapsed stddev 0.14 ( 0.00%) 0.02 ( 82.73%)
Elapsed max 4.05 ( 0.00%) 2.47 ( 39.01%)
glibc madvise
2.21 v2
User 141.86 142.28
System 57.94 0.60
Elapsed 102.02 72.66
Note that almost a minutes worth of system time is eliminted and the
program completes 28% faster on average.
To illustrate the problem without python this is a basic test-case for
the worst case scenario where every free is a madvise followed by a an alloc
/* gcc bench-free.c -lpthread -o bench-free */
static int num = 1024;
void __attribute__((noinline,noclone)) dostuff (void *p)
{
}
void *worker (void *data)
{
int i;
for (i = num; i--;)
{
void *m = malloc (48*4096);
dostuff (m);
free (m);
}
return NULL;
}
int main()
{
int i;
pthread_t t;
void *ret;
if (pthread_create (&t, NULL, worker, NULL))
exit (2);
if (pthread_join (t, &ret))
exit (3);
return 0;
}
Before the patch, this resulted in 1024 calls to madvise. With the patch applied,
madvise is called twice because the default trim threshold is high enough to avoid
this.
This a more complex case where there is a mix of frees. It's simply a different worker
function for the test case above
void *worker (void *data)
{
int i;
int j = 0;
void *free_index[num];
for (i = num; i--;)
{
void *m = malloc ((i % 58) *4096);
dostuff (m);
if (i % 2 == 0) {
free (m);
} else {
free_index[j++] = m;
}
}
for (; j >= 0; j--)
{
free(free_index[j]);
}
return NULL;
}
glibc 2.21 calls malloc 90305 times but with the patch applied, it's
called 13438. Increasing the trim threshold will decrease the number of
times it's called with the option of eliminating the overhead.
ebizzy is meant to generate a workload resembling common web application
server workloads. It is threaded with a large working set that at its core
has an allocation, do_stuff, free loop that also hits this case. The primary
metric of the benchmark is records processed per second. This is running on
my desktop which is a single socket machine with an I7-4770 and 8 cores.
Each thread count was run for 30 seconds. It was only run once as the
performance difference is so high that the variation is insignificant.
glibc 2.21 patch
threads 1 10230 44114
threads 2 19153 84925
threads 4 34295 134569
threads 8 51007 183387
Note that the saving happens to be a concidence as the size allocated
by ebizzy was less than the default threshold. If a different number of
chunks were specified then it may also be necessary to tune the threshold
to compensate
This is roughly quadrupling the performance of this benchmark. The difference in
system CPU usage illustrates why.
ebizzy running 1 thread with glibc 2.21
10230 records/s 306904
real 30.00 s
user 7.47 s
sys 22.49 s
22.49 seconds was spent in the kernel for a workload runinng 30 seconds. With the
patch applied
ebizzy running 1 thread with patch applied
44126 records/s 1323792
real 30.00 s
user 29.97 s
sys 0.00 s
system CPU usage was zero with the patch applied. strace shows that glibc
running this workload calls madvise approximately 9000 times a second. With
the patch applied madvise was called twice during the workload (or 0.06
times per second).
2015-02-10 Mel Gorman <mgorman@suse.de>
[BZ #17195]
* malloc/arena.c (free): Apply trim threshold to per-thread heaps
as well as the main arena.
Silvermont and Knights Landing have a modular system design with two cores
sharing an L2 cache. If more than 2 cores are detected to shared L2 cache,
it should be adjusted for Silvermont and Knights Landing.
[BZ #18185]
* sysdeps/x86_64/cacheinfo.c (init_cacheinfo): Limit threads
sharing L2 cache to 2 for Silvermont/Knights Landing.
This patch is glibc support for a PowerPC TLS optimization, inspired
by Alexandre Oliva's TLS optimization for other processors,
http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt
In essence, this optimization uses a zero module id in the tls_index
GOT entry to indicate that a TLS variable is allocated space in the
static TLS area. A special plt call linker stub for __tls_get_addr
checks for such a tls_index and if found, returns the offset
immediately. The linker communicates the fact that the special
__tls_get_addr stub is used by setting a bit in the dynamic tag
DT_PPC64_OPT/DT_PPC_OPT. glibc communicates to the linker that this
optimization is available by the presence of __tls_get_addr_opt.
tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for
tst-tls-dlinfo, which otherwise would fail since it tests that no
static tls is allocated. The ld option --no-tls-get-addr-optimize has
been available since binutils-2.20 so doesn't need a configure test.
* NEWS: Advertise TLS optimization.
* elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define.
(DT_PPC_NUM): Increment.
* elf/dynamic-link.h (HAVE_STATIC_TLS): Define.
(CHECK_STATIC_TLS): Use here.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize
TLS descriptors.
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/powerpc/dl-tls.c: New file.
* sysdeps/powerpc/Versions: Add __tls_get_addr_opt.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test.
* sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test.
Build tst-tlsmod2.so with --no-tls-get-addr-optimize.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.
sem_timedwait converts absolute timeouts to relative to pass them to
the futex syscall. (Before the recent reimplementation, on x86_64 it
used FUTEX_CLOCK_REALTIME, but not on other architectures.)
Correctly implementing POSIX requirements, however, requires use of
FUTEX_CLOCK_REALTIME; passing a relative timeout to the kernel does
not conform to POSIX. The POSIX specification for sem_timedwait says
"The timeout shall be based on the CLOCK_REALTIME clock.". The POSIX
specification for clock_settime says "If the value of the
CLOCK_REALTIME clock is set via clock_settime(), the new value of the
clock shall be used to determine the time of expiration for absolute
time services based upon the CLOCK_REALTIME clock. This applies to the
time at which armed absolute timers expire. If the absolute time
requested at the invocation of such a time service is before the new
value of the clock, the time service shall expire immediately as if
the clock had reached the requested time normally.". If a relative
timeout is passed to the kernel, it is interpreted according to the
CLOCK_MONOTONIC clock, and so fails to meet that POSIX requirement in
the event of clock changes.
This patch makes sem_timedwait use lll_futex_timed_wait_bitset with
FUTEX_CLOCK_REALTIME when possible, as done in some other places in
NPTL. FUTEX_CLOCK_REALTIME is always available for supported Linux
kernel versions; unavailability of lll_futex_timed_wait_bitset is only
an issue for hppa (an issue noted in
<https://sourceware.org/glibc/wiki/PortStatus>, and fixed by the
unreviewed
<https://sourceware.org/ml/libc-alpha/2014-12/msg00655.html> that
removes the hppa lowlevellock.h completely).
In the FUTEX_CLOCK_REALTIME case, the glibc code still needs to check
for negative tv_sec and handle that as timeout, because the Linux
kernel returns EINVAL not ETIMEDOUT for that case, so resulting in
failures of nptl/tst-abstime and nptl/tst-sem13 in the absence of that
check. If we're trying to distinguish between Linux-specific and
generic-futex NPTL code, I suppose having this in an nptl/ file isn't
ideal, but there doesn't seem to be any better place at present.
It's not possible to add a testcase for this issue to the testsuite
because of the requirement to change the system clock as part of a
test (this is a case where testing would require some form of
container, with root in that container, and one whose CLOCK_REALTIME
is isolated from that of the host; I'm not sure what forms of
containers, short of a full virtual machine, provide that clock
isolation).
Tested for x86_64. Also tested for powerpc with the testcase included
in the bug.
[BZ #18138]
* nptl/sem_waitcommon.c: Include <kernel-features.h>.
(futex_abstimed_wait)
[__ASSUME_FUTEX_CLOCK_REALTIME && lll_futex_timed_wait_bitset]:
Use lll_futex_timed_wait_bitset with FUTEX_CLOCK_REALTIME instead
of lll_futex_timed_wait.
If xports is NULL in xprt_register we malloc it but if sock >
_rpc_dtablesize() that memory does not get initialised and may in theory
contain any value. Later we make a conditional jump in svc_getreq_common
based on the uninitialised memory and this caused a general protection
fault in rpc.statd on an older version of glibc but this code has not
changed since that version.
Following is the valgrind warning.
==26802== Conditional jump or move depends on uninitialised value(s)
==26802== at 0x5343A25: svc_getreq_common (in /lib64/libc-2.5.so)
==26802== by 0x534357B: svc_getreqset (in /lib64/libc-2.5.so)
==26802== by 0x10DE1F: ??? (in /sbin/rpc.statd)
==26802== by 0x10D0EF: main (in /sbin/rpc.statd)
==26802== Uninitialised value was created by a heap allocation
==26802== at 0x4C2210C: malloc (vg_replace_malloc.c:195)
==26802== by 0x53438BE: xprt_register (in /lib64/libc-2.5.so)
==26802== by 0x53450DF: svcudp_bufcreate (in /lib64/libc-2.5.so)
==26802== by 0x10FE32: ??? (in /sbin/rpc.statd)
==26802== by 0x10D13E: main (in /sbin/rpc.statd)
for ChangeLog
[BZ #17090]
[BZ #17620]
[BZ #17621]
[BZ #17628]
* NEWS: Update.
* elf/dl-tls.c (_dl_update_slotinfo): Clean up outdated DTV
entries with Static TLS too. Skip entries past the end of the
allocated DTV, from Alan Modra.
(tls_get_addr_tail): Update to glibc_likely/unlikely. Move
Static TLS DTV entry set up from...
(_dl_allocate_tls_init): ... here (fix modid assertion), ...
* elf/dl-reloc.c (_dl_nothread_init_static_tls): ... here...
* nptl/allocatestack.c (init_one_static_tls): ... and here...
* elf/dlopen.c (dl_open_worker): Drop l_tls_modid upper bound
for Static TLS.
* elf/tlsdeschtab.h (map_generation): Return size_t. Check
that the slot we find is associated with the given map before
using its generation count.
* nptl_db/db_info.c: Include ldsodefs.h.
(rtld_global, dtv_slotinfo_list, dtv_slotinfo): New typedefs.
* nptl_db/structs.def (DB_RTLD_VARIABLE): New macro.
(DB_MAIN_VARIABLE, DB_RTLD_GLOBAL_FIELD): Likewise.
(link_map::l_tls_offset): New struct field.
(dtv_t::counter): Likewise.
(rtld_global): New struct.
(_rtld_global): New rtld variable.
(dl_tls_dtv_slotinfo_list): New rtld global field.
(dtv_slotinfo_list): New struct.
(dtv_slotinfo): Likewise.
* nptl_db/td_symbol_list.c: Drop gnu/lib-names.h include.
(td_lookup): Rename to...
(td_mod_lookup): ... this. Use new mod parameter instead of
LIBPTHREAD_SO.
* nptl_db/td_thr_tlsbase.c: Include link.h.
(dtv_slotinfo_list, dtv_slotinfo): New functions.
(td_thr_tlsbase): Check DTV generation. Compute Static TLS
addresses even if the DTV is out of date or missing them.
* nptl_db/fetch-value.c (_td_locate_field): Do not refuse to
index zero-length arrays.
* nptl_db/thread_dbP.h: Include gnu/lib-names.h.
(td_lookup): Make it a macro implemented in terms of...
(td_mod_lookup): ... this declaration.
* nptl_db/db-symbols.awk (DB_RTLD_VARIABLE): Override.
(DB_MAIN_VARIABLE): Likewise.
In bug 14906 the user complains that the inotify support in nscd
is not sufficient when it comes to detecting changes in the
configurationfiles that should be watched for the various databases.
The current nscd implementation uses inotify to watch for changes in
the configuration files, but adds watches only for IN_DELETE_SELF and
IN_MODIFY. These watches are insufficient to cover even the most basic
uses by a system administrator. For example using emacs or vim to edit
a configuration file should trigger a reload but it might not if
the editors use move to atomically update the file. This atomic update
changes the inode and thus removes the notification on the file (as
inotify is based on inodes). Thus the inotify support in nscd for
configuration files is insufficient to account for the average use
cases of system administrators and users.
The inotify support is significantly enhanced and described here:
https://www.sourceware.org/ml/libc-alpha/2015-02/msg00504.html
Tested on x86_64 with and without inotify support.
ldconfig is using an aux-cache to speed up the ld.so.cache update. It
is read by mmaping the file to a structure which contains data offsets
used as pointers. As they are not checked, it is not hard to get
ldconfig to segfault with a corrupted file. This happens for instance if
the file is truncated, which is common following a filesystem check
following a system crash.
This can be reproduced for example by truncating the file to roughly
half of it's size.
There is already some code in elf/cache.c (load_aux_cache) to check
for a corrupted aux cache, but it happens to be broken and not enough.
The test (aux_cache->nlibs >= aux_cache_size) compares the number of
libs entry with the cache size. It's a non sense, as it basically
assumes that each library entry is a 1 byte... Instead this commit
computes the theoretical cache size using the headers and compares it
to the real size.
The function feupdateenv has been fixed to correctly handle FE_DFL_ENV
and FE_NOMASK_ENV.
The fesetexceptflag function has been fixed to correctly handle setting
the new flags instead of just OR-ing the existing flags.
This fixes the test-fenv-return and test-fenvinline failures on hppa.
The constraints in the inline assembly in feholdexcept and fesetenv
are incorrect. The assembly modifies the buffer pointer, but doesn't
express that in the constraints. The simple fix is to remove the
modification of the buffer pointer which is no longer required by
the existing code, and adjust the one constraint that did express
the modification of bufptr.
The change fixes test-fenv when glibc is compiled with recent gcc.
This patch fixes the inline feraiseexcept and feclearexcept macros for
powerpc by casting the input argument to integer before operation on it.
It fixes BZ#17776.
Since 2014-11-24 binutils git commit bb4d2ac2, readelf has appended
the symbol version to symbols shown in reloc dumps.
[BZ #16512]
* scripts/localplt.awk: Strip off symbol version.
* NEWS: Mention bug fix.
__ASSUME_PRLIMIT64 is defined in kernel-features.h for kernels 2.6.36
and later, but hppa, microblaze and sh did not add the prlimit64
syscall until 2.6.37. This patch adds corresponding undefines of
__ASSUME_PRLIMIT64 to those architectures' kernel-features.h files.
(This concludes the kernel-features.h fixes arising out of the review
- limited to macros defined in the architecture-independent
kernel-features.h file - I did in connection with the move to 2.6.32
minimum kernel version. For that subset of macros - I didn't check
any purely architecture-specific macros - I think they are now defined
for the correct kernel versions on each architecture after this
patch.)
[BZ #17779]
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Undefine.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Likewise.
Protocted symbol in shared library can only be accessed from PIE
or shared library. Linker in binutils 2.26 enforces it. We must
compile vismain with -fPIE and link it with -pie.
[BZ #17711]
* elf/Makefile (tests): Add vismain only if PIE is enabled.
(tests-pie): Add vismain.
(CFLAGS-vismain.c): New.
* elf/vismain.c: Add comments for PIE requirement.
The threshold in ldbl-96 atanhl for when to return the argument,
0x1p-28, is a bit too big, and that in ldbl-128ibm atanhl is much too
big (the relevant condition being x^3/3 being < 0.5ulp of x),
resulting in errors a bit above the limits of those considered
acceptable in glibc in the ldbl-96 case, and in large errors in the
ldbl-128ibm case. This patch changes those implementations to use
more appropriate thresholds and adds tests around the thresholds for
various formats.
Tested for x86_64, x86 and powerpc. x86_64 and x86 ulps updated
accordingly.
[BZ #18046]
[BZ #18047]
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl): Use
0x1p-56L as threshold for just returning the argument.
* sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Use
0x1p-32L as threshold for just returning the argument.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulp: Likewise.
The ldbl-128 and ldbl-128ibm implementations of acosl have similar
bugs, using a threshold of 0x1p-57L to determine when they just return
pi/2. Since the result pi/2 - asinl (x) is roughly pi/2 - x for small
x, the relevant cut-off is actually x being < 0.5ulp of 1. This patch
fixes the implementations to use that cut-off and adds tests of small
acos arguments.
Tested for powerpc and mips64. Also tested for x86_64 and x86; no
ulps updates needed.
[BZ #18038]
[BZ #18039]
* sysdeps/ieee754/ldbl-128/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-113L.
* sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-106L.
* math/auto-libm-test-in: Add more tests of acos.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asin implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, powerpc and mips64.
[BZ #16351]
* sysdeps/i386/fpu/e_asin.S (dbl_min): New object.
(MO): New macro.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/e_asinf.S (flt_min): New object.
(MO): New macro.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_asin.c: Include <float.h> and <math.h>.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/e_asinf.c: Include <float.h>.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]:
Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16351.
* math/auto-libm-test-out: Regenerated.
The ldbl-128ibm implementation of logbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, logbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and (fixed) bug 18029 for
ilogbl.) This patch adds checks for that case.
Tested for powerpc.
[BZ #18030]
* sysdeps/ieee754/ldbl-128ibm/s_logbl.c (__logbl): Adjust exponent
of power of 2 down when low part has opposite sign.
* math/libm-test.inc (logb_test_data): Add more tests.
The ldbl-128ibm implementation of ilogbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, ilogbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and bug 18030 for logbl.)
This patch adds checks for that case.
Tested for powerpc.
[BZ #18029]
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl):
Adjust exponent of power of 2 down when low part has opposite
sign.
* math/libm-test.inc (ilogb_test_data): Add more tests.
If a locale alias is defined in locale.alias but not in an archive,
and the referenced locale is only present in the archive, setlocale
will fail if given the alias name. This is unintuitive. This patch
fixes it, arranging for the locale archive to be searched again after
alias expansion.
for ChangeLog
[BZ #15969]
* locale/findlocale.c (_nl_find_locale): Retry archive search
after alias expansion.
The ldbl-128ibm implementation of asinhl uses cut-offs of 0x1p28 and
0x1p-29 to determine when to use simpler formulas that avoid possible
overflow / underflow. Both those cut-offs are inappropriate for this
format, resulting in large errors. This patch changes the code to use
more appropriate cut-offs of 0x1p56 and 0x1p-56, adding tests around
the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18020]
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use 2**56 and
2**-56 not 2**28 and 2**-29 as thresholds for simpler formulas.
* math/auto-libm-test-in: Add more tests of asinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The ldbl-128ibm implementation of acoshl uses a cut-off of 0x1p28 to
determine when to use log(x) + log(2) as a formula. That cut-off is
too small for this format, resulting in large errors. This patch
changes it to a more appropriate cut-off of 0x1p56, adding tests
around the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18019]
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Use
2**56 not 2**28 as threshold for log (2x) formula.
* math/auto-libm-test-in: Add more tests of acosh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Various x86 / x86_64 versions of scalb / scalbf / scalbl produce
spurious "invalid" exceptions for (qNaN, -Inf) arguments, because this
is wrongly handled like (+/-Inf, -Inf) which *should* raise such an
exception. (In fact the NaN case of the code determining whether to
quietly return a zero or a NaN for second argument -Inf was
accidentally dead since the code had been made to return a NaN with
exception.) This patch fixes the code to do the proper test for an
infinity as distinct from a NaN.
(Since the existing code does nothing to distinguish qNaNs and sNaNs
here, this patch doesn't either. If in future we systematically
implement proper sNaN semantics following TS 18661-1:2014, there will
be lots of bugs to address - Thomas found lots of issues with his
patch <https://sourceware.org/ml/libc-ports/2013-04/msg00008.html> to
add SNaN tests (which never went in and would now require significant
reworking).)
Tested for x86_64 and x86. Committed.
[BZ #16783]
* sysdeps/i386/fpu/e_scalb.S (__ieee754_scalb): Do not handle
arguments (NaN, -Inf) the same as (+/-Inf, -Inf).
* sysdeps/i386/fpu/e_scalbf.S (__ieee754_scalbf): Likewise.
* sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
* sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
* math/libm-test.inc (scalb_test_data): Add more tests.
Both open and openat load their last argument 'mode' lazily, using
va_arg() only if O_CREAT is found in oflag. This is wrong, mode is also
necessary if O_TMPFILE is in oflag.
By chance on x86_64, the problem wasn't evident when using O_TMPFILE
with open, as the 3rd argument of open, even when not loaded with
va_arg, is left untouched in RDX, where the syscall expects it.
However, openat was not so lucky, and O_TMPFILE couldn't be used: mode
is the 4th argument, in RCX, but the syscall expects its 4th argument in
a different register than the glibc wrapper, in R10.
Introduce a macro __OPEN_NEEDS_MODE (oflag) to test if either O_CREAT or
O_TMPFILE is set in oflag.
Tested on Linux x86_64.
[BZ #17523]
* io/fcntl.h (__OPEN_NEEDS_MODE): New macro.
* io/bits/fcntl2.h (open): Use it.
(openat): Likewise.
* io/open.c (__libc_open): Likewise.
* io/open64.c (__libc_open64): Likewise.
* io/open64_2.c (__open64_2): Likewise.
* io/open_2.c (__open_2): Likewise.
* io/openat.c (__openat): Likewise.
* io/openat64.c (__openat64): Likewise.
* io/openat64_2.c (__openat64_2): Likewise.
* io/openat_2.c (__openat_2): Likewise.
* sysdeps/mach/hurd/open.c (__libc_open): Likewise.
* sysdeps/mach/hurd/openat.c (__openat): Likewise.
* sysdeps/posix/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/dl-openat64.c (openat64): Likewise.
* sysdeps/unix/sysv/linux/generic/open.c (__libc_open): Likewise.
(__open_nocancel): Likewise.
* sysdeps/unix/sysv/linux/generic/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/openat.c (__OPENAT): Likewise.
DNSSEC defines a number of response types that one me expect when the
DO bit is set. We don't process any of them, but since we do allow
setting the DO bit, skip them without logging an error since it is
only a nuisance.
Tested on x86_64.
[BZ #14841]
* resolv/gethnamaddr.c (getanswer): Skip logging if
RES_USE_DNSSEC is set.
* resolv/nss_dns/dns-host.c (getanswer_r): Likewise.
We compile gcrt1.o with -fPIC to support both "gcc -pg" and "gcc -pie -pg".
[BZ #17836]
* csu/Makefile (extra-objs): Add gmon-start.o if not builing
shared library. Add gmon-start.os otherwise.
($(objpfx)g$(start-installed-name)): Use $(objpfx)S%
$(objpfx)gmon-start.os if builing shared library.
($(objpfx)g$(static-start-installed-name)): Likewise.
for localedata/ChangeLog
[BZ #17588]
[BZ #13064]
[BZ #14094]
[BZ #17998]
* unicode-gen/Makefile: New.
* unicode-gen/unicode-license.txt: New, from Unicode.
* unicode-gen/UnicodeData.txt: New, from Unicode.
* unicode-gen/DerivedCoreProperties.txt: New, from Unicode.
* unicode-gen/EastAsianWidth.txt: New, from Unicode.
* unicode-gen/gen_unicode_ctype.py: New generator, from Mike
FABIAN <mfabian@redhat.com>.
* unicode-gen/ctype_compatibility.py: New verifier, from
Pravin Satpute <psatpute@redhat.com> and Mike FABIAN.
* unicode-gen/ctype_compatibility_test_cases.py: New verifier
module, from Mike FABIAN.
* unicode-gen/utf8_gen.py: New generator, from Pravin Satpute
and Mike FABIAN.
* unicode-gen/utf8_compatibility.py: New verifier, from Pravin
Satpute and Mike FABIAN.
* charmaps/UTF-8: Update.
* locales/i18n: Update.
* gen-unicode-ctype.c: Remove.
* tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns
true for ordinal indicators.
The POSIX function scandir calls scandirat, which is not a POSIX
function. This patch fixes this by making it use __scandirat and
making scandirat a weak alias. There are no changes for scandir64 /
scandirat64 because those are both _GNU_SOURCE-only functions so no
namespace issue arises for them.
Tested for x86_64 that the disassembly of installed shared libraries
is unchanged by this patch.
[BZ #17999]
* dirent/scandir.c [!SCANDIR] (SCANDIRAT): Define to __scandirat
instead of scandirat.
* dirent/scandirat.c [!SCANDIRAT] (SCANDIRAT): Likewise.
[!SCANDIRAT] (SCANDIRAT_WEAK_ALIAS): Define.
[SCANDIRAT_WEAK_ALIAS] (scandirat): Define as weak alias of
__scandirat.
* include/dirent.h (scandirat): Do not use libc_hidden_proto.
(__scandirat): Declare. Use libc_hidden_proto.
* conform/Makefile (test-xfail-POSIX2008/dirent.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/dirent.h/linknamespace): Likewise.
This patch fixes bug 15319, missing underflows from atan / atan2 when
the result of atan is very close to its small argument (or that of
atan2 is very close to the ratio of its arguments, which may be an
exact division).
The usual approach of doing an underflowing computation if the
computed result is subnormal is followed. For 32-bit x86, there are
extra complications: the inline __ieee754_atan2 in bits/mathinline.h
needs to be disabled for float and double because other libm functions
using it generally rely on getting proper underflow exceptions from
it, while the out-of-line functions have to remove excess range and
precision from the underflowing result so as to return an exact 0 in
the case where errno should be set for underflow to 0. (The failures
I saw without that are similar to those Carlos reported for other
functions, where I haven't seen a response to
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>
confirming if my diagnosis is correct. Arguably all libm functions
with float and double returns should remove excess range and
precision, but that's a separate matter.)
The x86_64 long double case reported in a comment in bug 15319 is not
a bug (it's an argument of LDBL_MIN, and x86_64 is an after-rounding
architecture so the correct IEEE result is not to raise underflow in
the given rounding mode, in addition to treating the result as an
exact LDBL_MIN being within the newly clarified documentation of
accuracy goals). I'm presuming that the fpatan instruction can be
trusted to raise appropriate exceptions when the (long double) result
underflows (after rounding) and so no changes are needed for x86 /
x86_64 long double functions here; empirically this is the case for
the cases covered in the testsuite, on my system.
Tested for x86_64, x86, powerpc and mips64. Only 32-bit x86 needs
ulps updates (for the changes to inlines meaning some functions no
longer get excess precision from their __ieee754_atan2* calls).
[BZ #15319]
* sysdeps/i386/fpu/e_atan2.S (dbl_min): New object.
(MO): New macro.
(__ieee754_atan2): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/e_atan2f.S (flt_min): New object.
(MO): New macro.
(__ieee754_atan2f): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/s_atan.S (dbl_min): New object.
(MO): New macro.
(__atan): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/i386/fpu/s_atanf.S (flt_min): New object.
(MO): New macro.
(__atanf): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <float.h> and
<math.h>.
(__ieee754_atan2): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/s_atan.c: Include <float.h> and
<math_private.h>.
(atan): Force underflow exception for results with small absolute
value.
* sysdeps/ieee754/flt-32/s_atanf.c: Include <float.h>.
(__atanf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_atanl.c: Include <float.h> and
<math.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Include <float.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/x86/fpu/bits/mathinline.h
[!__SSE2_MATH__ && !__x86_64__ && __LIBC_INTERNAL_MATH_INLINES]
(__ieee754_atan2): Only define inline for long double.
* sysdeps/x86_64/fpu/multiarch/e_atan2.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 15319. Add more tests of atan2.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (casin_test_data): Do not mark underflow
exceptions as possibly missing for bug 15319.
(casinh_test_data): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
The implementation of the (XSI POSIX) functions hsearch / hcreate /
hdestroy uses hsearch_r / hcreate_r / hdestroy_r, which are not POSIX
functions. This patch makes those into weak aliases for __*_r and
uses those names for the calls within libc.
Tested for x86_64 that the disassembly of installed shared libraries
is unchanged by this patch.
[BZ #17996]
* include/search.h (hcreate_r): Don't use libc_hidden_proto.
(hdestroy_r): Likewise.
(hsearch_r): Likewise.
(__hcreate_r): Declare and use libc_hidden_proto.
(__hdestroy_r): Likewise.
(__hsearch_r): Likewise.
* misc/hsearch.c (hsearch): Call __hsearch_r instead of hsearch_r.
(hcreate): Call __hcreate_r instead of hcreate_r.
(__hdestroy): Call __hdestroy_r instead of hdestroy_r.
* misc/hsearch_r.c (hcreate_r): Rename to __hcreate_r and define
as weak alias of __hcreate_r.
(hdestroy_r): Rename to __hdestroy_r and define as weak alias of
__hdestroy_r.
(hsearch_r): Rename to __hsearch_r and define as weak alias of
__hsearch_r.
* conform/Makefile (test-xfail-XPG3/search.h/linknamespace):
Remove variable.
(test-xfail-XPG4/search.h/linknamespace): Likewise.
(test-xfail-UNIX98/search.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/search.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/search.h/linknamespace): Likewise.
posix_spawn (a standard POSIX function) brings in a use of getrlimit64
(not a standard POSIX function). This patch fixes this by using
__getrlimit64 and making getrlimit64 a weak alias.
This is more complicated than some such changes because of files that
define getrlimit64 in their own way using symbol versioning after
including the main sysdeps/unix/sysv/linux/getrlimit64.c with a
getrlimit macro defined. There are various existing patterns for such
cases in glibc; the one I've used here is that a getrlimit64 macro
disables the weak_alias / libc_hidden_weak calls, leaving it to the
including file to define the getrlimit64 name in whatever way is
appropriate.
Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by this patch.
[BZ #17991]
* include/sys/resource.h (__getrlimit64): Declare. Use
libc_hidden_proto.
* resource/getrlimit64.c (getrlimit64): Rename to __getrlimit64
and define as weak alias of __getrlimit64. Use libc_hidden_weak.
* sysdeps/posix/spawni.c (__spawni): Call __getrlimit64 instead of
getrlimit64.
* sysdeps/unix/sysv/linux/getrlimit64.c (getrlimit64): Rename to
__getrlimit64.
[!getrlimit64] (getrlimit64): Define as weak alias of
__getrlimit64. Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/i386/getrlimit64.c (getrlimit64): Define
using __getrlimit64 not __new_getrlimit64.
(__GI_getrlimit64): Likewise.
* sysdeps/unix/sysv/linux/mips/getrlimit64.c (getrlimit64):
Likewise.
(__GI_getrlimit64): Likewise.
(__old_getrlimit64): Use __getrlimit64 not __new_getrlimit64.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/syscalls.list
(getrlimit): Add __getrlimit64 alias.
* sysdeps/unix/sysv/linux/wordsize-64/syscalls.list (getrlimit):
Likewise.
* conform/Makefile (test-xfail-XOPEN2K/spawn.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/spawn.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/spawn.h/linknamespace): Likewise.
Various remquo implementations produce a zero remainder with the wrong
sign (a zero remainder should always have the sign of the first
argument, as specified in IEEE 754) in round-downward mode, resulting
from the sign of 0 - 0. This patch checks for zero results and fixes
their sign accordingly.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17987]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Ensure sign of
zero result does not depend on the sign resulting from
subtraction.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
Various remquo implementations, when computing the last three bits of
the quotient, have spurious overflows when 4 times the second argument
to remquo overflows. These overflows can in turn cause bad results in
rounding modes where that overflow results in a finite value. This
patch adds tests to avoid the problem multiplications in cases where
they would overflow, similar to those that control an earlier
multiplication by 8.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17978]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Do not form
products 4 * y and 2 * y where those would overflow.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
Remove IA64 PAGE_SIZE related macros as PAGE_SIZE is not defined.
Also remove macros that are only used for BFD's trad-core support
which is not relavant for IA64 according to the thread starting
here:
https://sourceware.org/ml/libc-ports/2013-11/msg00028.html
This patch is neither built nor tested but is equivalent to a MIPS
patch for the same fix.
The dbl-64/wordsize-64 remquo implementation follows similar logic to
various other implementations, but where that logic computes some
absolute values, it wrongly uses a previously computed bit-pattern for
the absolute value of the first argument, where actually it needs the
absolute value of the first argument mod 8 times the second. This
patch fixes it to compute the correct absolute value.
The integer quotient result of remquo is only specified mod 8
(including its sign); architecture-specific versions may well vary in
what results they give for higher bits of that result (and indeed bug
17569 gives an example correct result from __builtin_remquo giving 9
for that result, where the particular glibc implementation used in
that bug report would give 1 after this fix). Thus, this patch adapts
the tests of remquo to test that result only mod 8, to allow for such
variation when tests with higher quotient are included.
Tested for x86_64 and x86.
[BZ #17569]
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Compute absolute value of x as modified by fmod, not original
value of x.
* math/libm-test.inc (RUN_TEST_ffI_f1): Rename to
RUN_TEST_ffI_f1_mod8. Check extra return value mod 8.
(RUN_TEST_LOOP_ffI_f1): Rename to RUN_TEST_LOOP_ffI_f1_mod8. Call
RUN_TEST_ffI_f1_mod8.
(remquo_test_data): Add more tests.
Similarly to sqrt in
<https://sourceware.org/ml/libc-alpha/2015-02/msg00353.html>, the
powerpc sqrtf implementation for when _ARCH_PPCSQ is not defined also
relies on a * b + c being contracted into a fused multiply-add.
Although this contraction is not explicitly disabled for e_sqrtf.c, it
still seems appropriate to make the file explicit about its
requirements by using __builtin_fmaf; this patch does so.
Furthermore, it turns out that doing so fixes the observed inaccuracy
and missing exceptions (that is, that without explicit __builtin_fmaf
usage, it was not being compiled as intended).
Tested for powerpc32 (hard float).
[BZ #17967]
* sysdeps/powerpc/fpu/e_sqrtf.c (__slow_ieee754_sqrtf): Use
__builtin_fmaf instead of relying on contraction of a * b + c.
As Adhemerval noted in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00451.html>, the
powerpc sqrt implementation for when _ARCH_PPCSQ is not defined is
inaccurate in some cases.
The problem is that this code relies on fused multiply-add, and relies
on the compiler contracting a * b + c to get a fused operation. But
sysdeps/ieee754/dbl-64/Makefile disables contraction for e_sqrt.c,
because the implementation in that directory relies on *not* having
contracted operations.
While it would be possible to arrange makefiles so that an earlier
sysdeps directory can disable the setting in
sysdeps/ieee754/dbl-64/Makefile, it seems a lot cleaner to make the
dependence on fused operations explicit in the .c file. GCC 4.6
introduced support for __builtin_fma on powerpc and other
architectures with such instructions, so we can rely on that; this
patch duly makes the code use __builtin_fma for all such fused
operations.
Tested for powerpc32 (hard float).
2015-02-12 Joseph Myers <joseph@codesourcery.com>
[BZ #17964]
* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
__builtin_fma instead of relying on contraction of a * b + c.
The tv_sec is of type time_t in both struct timeval and struct timespec.
This matches the implementation and also the relevant standard (checked
C11 for timespec and opengroup for timeval).
This patch fixes the remaining part of bug 16560, spurious underflows
from exp2 of arguments close to 0 (when the result is close to 1, so
should not underflow), by just using 1+x instead of a more complicated
calculation when the argument is sufficiently small.
Tested for x86_64, x86 and mips64.
[BZ #16560]
* math/e_exp2l.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine
and redefine.
(__ieee754_exp2l): Do not multiply small fractional parts by
M_LN2l.
* sysdeps/i386/fpu/e_exp2l.S (__ieee754_exp2l): Just add 1 to
small argument.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* sysdeps/x86_64/fpu/e_exp2l.S (__ieee754_exp2l): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
pthread_mutexattr_settype adds PTHREAD_MUTEX_NO_ELISION_NP to kind,
which is an internal flag that pthread_mutexattr_gettype shouldn't
expose, since pthread_mutexattr_settype wouldn't accept it.
This patch makes sincos set errno to EDOM when passed an infinity,
similarly to sin and cos.
Tested for x86_64, x86, powerpc and mips64. I don't know if the
architecture-specific implementations for ia64 and m68k might need
corresponding fixes.
2015-02-11 Joseph Myers <joseph@codesourcery.com>
[BZ #15467]
* sysdeps/ieee754/dbl-64/s_sincos.c: Include <errno.h>.
(__sincos): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/flt-32/s_sincosf.c: Include <errno.h>.
(SINCOSF_FUNC): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128ibm/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-96/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* math/libm-test.inc (sincos_test_data): Test errno setting.
soft-fp's _FP_FMA fails to set the result's exponent for cases where
the result of the multiplication is 0, yielding incorrect (arbitrary,
depending on uninitialized values) results for those cases. This
affects libm for architectures using soft-fp to implement fma. This
patch adds the exponent setting and tests for this case.
Tested for ARM soft-float (which uses soft-fp fma), x86_64 and x86 (to
verify not introducing new libm test failures there).
(This bug showed up in testing my patch to move the Linux kernel to
current soft-fp. math/Makefile has "override CFLAGS +=
-Wno-uninitialized" which would have stopped compiler warnings from
showing up this problem, although I wouldn't be surprised if removing
that shows spurious warnings from this code, if the compiler fails to
follow that various cases where the exponent is uninitialized don't
need it initialized because the class is set to a value meaning the
uninitialized exponent isn't used.)
[BZ #17932]
* soft-fp/op-common.h (_FP_FMA): Set exponent of result in case
where multiplication results in zero and third argument is finite
and nonzero.
* math/auto-libm-test-in: Add more tests of fma.
* math/auto-libm-test-out: Regenerated.
BZ #16618
Under certain conditions wscanf can allocate too little memory for the
to-be-scanned arguments and overflow the allocated buffer. The
implementation now correctly computes the required buffer size when
using malloc.
A regression test was added to tst-sscanf.
memcpy with unaligned 256-bit AVX register loads/stores are slow on older
processorsl like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load
and sets it only when AVX2 is available.
[BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load):
New.
(index_AVX_Fast_Unaligned_Load): Likewise.
(HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the
bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace
HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
The padding bytes in the statsdata struct are not initialized, due to
which valgrind throws a warning:
==11384== Memcheck, a memory error detector
==11384== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11384== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==11384== Command: nscd -d
==11384==
Fri 25 Apr 2014 10:34:53 AM CEST - 11384: handle_request: request received (Version = 2) from PID 11396
Fri 25 Apr 2014 10:34:53 AM CEST - 11384: GETSTAT
==11384== Thread 6:
==11384== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==11384== at 0x4E4ACDC: send (in /lib64/libpthread-2.12.so)
==11384== by 0x11AF6B: send_stats (in /usr/sbin/nscd)
==11384== by 0x112F75: nscd_run_worker (in /usr/sbin/nscd)
==11384== by 0x4E439D0: start_thread (in /lib64/libpthread-2.12.so)
==11384== by 0x599AB6C: clone (in /lib64/libc-2.12.so)
==11384== Address 0x15708395 is on thread 6's stack
Fix the warning by initializing the structure.
This patch fixes a bug introduced by 18f2945ae9, where it optimizes
the FPSCR set by just issuing a mtfs instruction if new flag is different
from older one. The issue is a typo, where the new flag should the the
new value, instead of the old one.
It fixes BZ#17885.
Some powerpc64 processors (e5500 core for instance) does not provide the
fsqrt instruction, however current check to use in math_private.h is
__WORDSIZE and _ARCH_PWR4 (ISA 2.02). This is patch change it to use
the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to
decide whether to generate fsqrt instruction).
It fixes BZ#16576.
GLIBC memset optimization for POWER8 uses the '.machine power8'
directive, which is only supported officially on binutils 2.24+. This
causes a build failure on older binutils.
Since the requirement of .machine power8 is to correctly assembly the
'mtvsrd' instruction and it is already handled by the MTVSRD_V1_R4
macro, there is no really needed of using it.
The patch replaces the power8 with power7 for .machine directive.
It fixes BZ#17869.
This patch fix the elf/ifuncmain6pie failure when building with GCC
4.9+. For some reason, the compiler removes the branch taken code at
resolve_ifunc (sysdeps/powerpc/powerpc64/dl-machine.h) as dead-code
and thus the testcase fails because the ifunc resolves branches to an
invalid memory location. It fixes by explicit adding a dependency of
value based on odp variable to avoid compiler optimization.
It fixes BZ#17868.
This patch replaces unsigned long int and 1UL with uint64_t and
(uint64_t) 1 to support ILP32 targets like x32.
[BZ #17870]
* nptl/sem_post.c (__new_sem_post): Replace unsigned long int
with uint64_t.
* nptl/sem_waitcommon.c (__sem_wait_cleanup): Replace 1UL with
(uint64_t) 1.
(__new_sem_wait_slow): Replace unsigned long int with uint64_t.
Replace 1UL with (uint64_t) 1.
* sysdeps/nptl/internaltypes.h (new_sem): Replace unsigned long
int with uint64_t.
This patch fix powerpc __get_clockfreq racy and cancel-safe issues by
dropping internal static cache and by using nocancel file operations.
The vDSO failure check is also removed, since kernel code does not
return an error (it cleans cr0.so bit on function return) and the static
code (to read value /proc) now uses non-cancellable calls.
The ability to recursively call dlopen is useful for malloc
implementations that wish to load other dynamic modules that
implement reentrant/AS-safe functions to use in their own
implementation.
Given that a user malloc implementation may be called by an
ongoing dlopen to allocate memory the user malloc
implementation interrupts dlopen and if it calls dlopen again
that's a reentrant call.
This patch fixes the issues with the ld.so.cache mapping
and the _r_debug assertion which prevent this from working
as expected.
See:
https://sourceware.org/ml/libc-alpha/2014-12/msg00446.html
This commit fixes semaphore destruction by either using 64b atomic
operations (where available), or by using two separate fields when only
32b atomic operations are available. In the latter case, we keep a
conservative estimate of whether there are any waiting threads in one
bit of the field that counts the number of available tokens, thus
allowing sem_post to atomically both add a token and determine whether
it needs to call futex_wake.
See:
https://sourceware.org/ml/libc-alpha/2014-12/msg00155.html
This patch adds an optimized POWER8 strncmp. The implementation focus
on speeding up unaligned cases follwing the ideas of power8 strcmp.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
(where sources alignment are different).
This patch adds an optimized POWER8 strcmp using unaligned accesses.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses.
It shows 10%-80% improvement over the optimized POWER7 one that uses
only aligned accesses, specially on unaligned inputs.
The algorithm first read and check 16 bytes (if inputs do not cross a 4K
page size). The it realign source to 16-bytes and issue a 16 bytes read
and compare loop to speedup null byte checks for large strings. Also,
different from POWER7 optimization, the null pad is done inline in the
implementation using possible unaligned accesses, instead of realying on
a memset call. Special case is added for page cross reads.
This patch adds an optimized POWER8 strcpy using unaligned accesses.
For strings up to 16 bytes the implementation first calculate the
string size, like strlen, and issues a memcpy. For larger strings,
source is first aligned to 16 bytes and then tested over a loop that
reads 16 bytes am combine the cmpb results for speedup. Special case is
added for page cross reads.
It shows 30%-60% improvement over the optimized POWER7 one that uses
only aligned accesses.
[Modified from the original email by Siddhesh Poyarekar]
This patch solves bug #16009 by implementing an additional path in
strxfrm that does not depend on caching the weight and rule indices.
In detail the following changed:
* The old main loop was factored out of strxfrm_l into the function
do_xfrm_cached to be able to alternativly use the non-caching version
do_xfrm.
* strxfrm_l allocates a a fixed size array on the stack. If this is not
sufficiant to store the weight and rule indices, the non-caching path is
taken. As the cache size is not dependent on the input there can be no
problems with integer overflows or stack allocations greater than
__MAX_ALLOCA_CUTOFF. Note that malloc-ing is not possible because the
definition of strxfrm does not allow an oom errorhandling.
* The uncached path determines the weight and rule index for every char
and for every pass again.
* Passing all the locale data array by array resulted in very long
parameter lists, so I introduced a structure that holds them.
* Checking for zero src string has been moved a bit upwards, it is
before the locale data initialization now.
* To verify that the non-caching path works correct I added a test run
to localedata/sort-test.sh & localedata/xfrm-test.c where all strings
are patched up with spaces so that they are too large for the caching path.
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) uses
a condition k <= -63 to determine when a standard underflowing result
tiny*__copysignl(tiny,x) should be returned. However, that condition
corresponds to values with exponent -16446 or less, and in the case of
-16446, the correct result for round-to-nearest depends on whether the
value is exactly 0x1p-16446 (half the least subnormal) or more than
that. This patch fixes the bug by changing the condition to k <= -64
and accordingly adjusting the exponent by 64 not 63 when converting to
a normal value.
Tested for x86_64.
[BZ #17803]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (twom63): Rename to
twom64. Adjust value to 0x1p-64L.
(__scalblnl): Only return standard underflowing result for K <=
-64 not K <= -63; adjust exponent for underflowing result by 64
not 63.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) is
incorrect for subnormal arguments (this is a separate bug from bug
17803, which is about underflowing results). There are two problems
with the adjustments of subnormal arguments: the "two63" variable
multiplied by is actually 0x1p52L not 0x1p63L, so is insufficient to
make values normal, and then GET_LDOUBLE_EXP(es,x), used to extract
the new exponent, extracts it into a variable that isn't used, while
the value taken to by the new exponent is wrongly taken from the high
part of the mantissa before the adjustment (hx). This patch fixes
both those problems and adds appropriate tests.
Tested for x86_64.
[BZ #17834]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (two63): Change value to
0x1p63L.
(__scalblnl): Get new exponent of adjusted subnormal value from ES
not HX.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
This patch adds support for lock elision using ISA 2.07 hardware
transactional memory instructions for pthread_mutex primitives.
Similar to s390 version, the for elision logic defined in
'force-elision.h' is only enabled if ENABLE_LOCK_ELISION is defined.
Also, the lock elision code should be able to be built even with
a compiler that does not provide HTM support with builtins.
However I have noted the performance is sub-optimal due scheduling
pressures.
Microblaze apparently has a variable page size (see thread below) and
should not hard-code any page-size related macros.
Also remove macros that are only used for BFD's trad-core support
which is not relavant for microblaze also according to the thread
starting here:
https://sourceware.org/ml/libc-ports/2013-11/msg00028.html
This patch is neither built nor tested but mirrors a MIPS patch that
fixes the same issue.
Thanks,
Matthew
* sysdepsysdeps/unix/sysv/linux/microblaze/sys/user.h
(PAGE_SHIFT, PAGE_SIZE, PAGE_MASK, NBPG, UPAGES): Remove.
(HOST_TEXT_START_ADDR, HOST_STACK_END_ADDR): Remove.
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
Concluding the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of feupdateenv by making it a weak alias for
__feupdateenv and making the affected code call __feupdateenv.
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch). Also tested for ARM
(soft-float) that the math.h linknamespace tests now pass.
[BZ #17748]
* include/fenv.h (__feupdateenv): Use libm_hidden_proto.
* math/feupdateenv.c (__feupdateenv): Use libm_hidden_def.
* sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/arm/feupdateenv.c (feupdateenv): Rename to __feupdateenv
and define as weak alias of __feupdateenv. Use libm_hidden_weak.
* sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c
(__feupdateenv): Likewise.
* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/tile/math_private.h (__feupdateenv): New inline
function.
* sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/generic/math_private.h (default_libc_feupdateenv): Call
__feupdateenv instead of feupdateenv.
(default_libc_feupdateenv_test): Likewise.
(libc_feresetround_ctx): Likewise.