The i386 and x86_64 implementations of expl, exp10l and expm1l (code
shared between the functions) return sNaN for sNaN input. This patch
fixes them to add NaN inputs to themselves so that qNaN is returned in
this case.
Tested for x86_64 and x86.
[BZ #20226]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Add NaN argument to
itself.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise.
* math/libm-test.inc (exp_test_data): Add sNaN tests.
(exp10_test_data): Likewise.
(expm1_test_data): Likewise.
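The effect of adding a NaN argument to itself can be illustrated in portable C. This is only a sketch, not the assembly in e_expl.S: all example_* names are hypothetical, and it assumes IEEE 754 binary64 doubles in which a set top mantissa bit marks a quiet NaN (true on x86, not on legacy MIPS encodings).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_EXP_MASK  UINT64_C (0x7ff0000000000000)
#define EXAMPLE_QUIET_BIT UINT64_C (0x0008000000000000)

/* Hypothetical helper: reinterpret a 64-bit pattern as a double.  */
static double
example_from_bits (uint64_t bits)
{
  double d;
  assert (sizeof d == sizeof bits);
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Hypothetical helper: nonzero if X is a NaN whose quiet bit is set.
   Assumes the common encoding where the top mantissa bit means quiet.  */
static int
example_is_quiet_nan (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (bits & EXAMPLE_EXP_MASK) == EXAMPLE_EXP_MASK
	 && (bits & EXAMPLE_QUIET_BIT) != 0;
}

/* The technique described above: return x + x for a NaN input so that
   the hardware addition turns a signaling NaN into a quiet one.  The
   volatile copy keeps the compiler from folding the addition away.  */
static double
example_quiet_nan_input (double x)
{
  volatile double v = x;
  return v + v;
}
```

Passing the pattern 0x7ff0000000000001 (a signaling NaN under the stated assumption) through example_quiet_nan_input should yield a value for which example_is_quiet_nan is true.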
The wrapper implementations of ldexp / scalbn / scalbln
(architecture-independent), and their float / long double variants,
return sNaN for sNaN input. This patch fixes them to add relevant
arguments to themselves so that qNaN is returned in this case.
Tested for x86_64 and x86.
[BZ #20225]
* math/s_ldexp.c (__ldexp): Add non-finite or zero argument to
itself.
* math/s_ldexpf.c (__ldexpf): Likewise.
* math/s_ldexpl.c (__ldexpl): Likewise.
* math/w_scalbln.c (__w_scalbln): Likewise.
* math/w_scalblnf.c (__w_scalblnf): Likewise.
* math/w_scalblnl.c (__w_scalblnl): Likewise.
* math/libm-test.inc (scalbn_test_data): Add sNaN tests.
(scalbln_test_data): Likewise.
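A wrapper of the shape described above might look like the following sketch. It is not the code in math/s_ldexp.c: example_scale is a hypothetical stand-in for the real scaling routine (valid only for small exponents), and the bit helpers assume IEEE 754 binary64 doubles.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers to move between a double and its bit pattern.  */
static double
example_from_bits (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

static uint64_t
example_to_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Hypothetical stand-in for the real scaling routine: double or halve
   one step at a time.  Only meant for small exponents.  */
static double
example_scale (double x, int n)
{
  assert (n > -64 && n < 64);
  for (; n > 0; n--)
    x *= 2.0;
  for (; n < 0; n++)
    x *= 0.5;
  return x;
}

/* Wrapper in the style described above: a zero, infinite or NaN
   argument is added to itself and returned directly, which leaves
   zeros and infinities unchanged and quiets a signaling NaN.  */
static double
example_ldexp (double x, int n)
{
  if (!isfinite (x) || x == 0.0)
    {
      volatile double v = x;	/* Keep the addition from being folded.  */
      return v + v;
    }
  return example_scale (x, n);
}
```

Adding the argument to itself is harmless for zeros and infinities, so a single early return covers all three special cases.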
The i386 version of cbrtl returns sNaN (without raising any
exceptions) for sNaN input. This patch fixes it to add non-finite
arguments to themselves (the code path in question is also reached for
zero arguments, for which adding them to themselves is also harmless),
so that "invalid" is raised and qNaN returned.
Tested for x86_64 and x86.
[BZ #20224]
* sysdeps/i386/fpu/s_cbrtl.S (__cbrtl): Add non-finite or zero
argument to itself.
* math/libm-test.inc (cbrt_test_data): Add sNaN tests.
Since the new SSE2/AVX2 memcpy/memmove are faster than the previous ones,
we can remove the previous SSE2/AVX2 memcpy/memmove and replace them with
the new ones.
No change in IFUNC selection if SSE2 and AVX2 memcpy/memmove weren't used
before. If SSE2 or AVX2 memcpy/memmove were used, the new SSE2 or AVX2
memcpy/memmove optimized with Enhanced REP MOVSB will be used for
processors with ERMS. The new AVX512 memcpy/memmove will be used for
processors with AVX512 which prefer vzeroupper.
Since the new SSE2 memcpy/memmove are faster than the previous default
memcpy/memmove used in libc.a and ld.so, we also remove the previous
default memcpy/memmove and make them the default memcpy/memmove, except
that non-temporal store isn't used in ld.so.
Together, these changes reduce the size of libc.so by about 6 KB and the
size of ld.so by about 2 KB.
[BZ #19776]
* sysdeps/x86_64/memcpy.S: Make it dummy.
* sysdeps/x86_64/mempcpy.S: Likewise.
* sysdeps/x86_64/memmove.S: New file.
* sysdeps/x86_64/memmove_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memmove.S: Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.S: Likewise.
* sysdeps/x86_64/memmove.c: Removed.
* sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: Likewise.
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx-unaligned.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove.c: Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
memcpy-sse2-unaligned, memmove-avx-unaligned,
memcpy-avx-unaligned and memmove-sse2-unaligned-erms.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Replace
__memmove_chk_avx512_unaligned_2 with
__memmove_chk_avx512_unaligned. Remove
__memmove_chk_avx_unaligned_2. Replace
__memmove_chk_sse2_unaligned_2 with
__memmove_chk_sse2_unaligned. Remove __memmove_chk_sse2 and
__memmove_avx_unaligned_2. Replace __memmove_avx512_unaligned_2
with __memmove_avx512_unaligned. Replace
__memmove_sse2_unaligned_2 with __memmove_sse2_unaligned.
Remove __memmove_sse2. Replace __memcpy_chk_avx512_unaligned_2
with __memcpy_chk_avx512_unaligned. Remove
__memcpy_chk_avx_unaligned_2. Replace
__memcpy_chk_sse2_unaligned_2 with __memcpy_chk_sse2_unaligned.
Remove __memcpy_chk_sse2. Remove __memcpy_avx_unaligned_2.
Replace __memcpy_avx512_unaligned_2 with
__memcpy_avx512_unaligned. Remove __memcpy_sse2_unaligned_2
and __memcpy_sse2. Replace __mempcpy_chk_avx512_unaligned_2
with __mempcpy_chk_avx512_unaligned. Remove
__mempcpy_chk_avx_unaligned_2. Replace
__mempcpy_chk_sse2_unaligned_2 with
__mempcpy_chk_sse2_unaligned. Remove __mempcpy_chk_sse2.
Replace __mempcpy_avx512_unaligned_2 with
__mempcpy_avx512_unaligned. Remove __mempcpy_avx_unaligned_2.
Replace __mempcpy_sse2_unaligned_2 with
__mempcpy_sse2_unaligned. Remove __mempcpy_sse2.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Support
__memcpy_avx512_unaligned_erms and __memcpy_avx512_unaligned.
Use __memcpy_avx_unaligned_erms and __memcpy_sse2_unaligned_erms
if processor has ERMS. Default to __memcpy_sse2_unaligned.
(ENTRY): Removed.
(END): Likewise.
(ENTRY_CHK): Likewise.
(libc_hidden_builtin_def): Likewise.
Don't include ../memcpy.S.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Support
__memcpy_chk_avx512_unaligned_erms and
__memcpy_chk_avx512_unaligned. Use
__memcpy_chk_avx_unaligned_erms and
__memcpy_chk_sse2_unaligned_erms if processor has ERMS.
Default to __memcpy_chk_sse2_unaligned.
* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
Change function suffix from unaligned_2 to unaligned.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Support
__mempcpy_avx512_unaligned_erms and __mempcpy_avx512_unaligned.
Use __mempcpy_avx_unaligned_erms and __mempcpy_sse2_unaligned_erms
if processor has ERMS. Default to __mempcpy_sse2_unaligned.
(ENTRY): Removed.
(END): Likewise.
(ENTRY_CHK): Likewise.
(libc_hidden_builtin_def): Likewise.
Don't include ../mempcpy.S.
(mempcpy): New. Add a weak alias.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Support
__mempcpy_chk_avx512_unaligned_erms and
__mempcpy_chk_avx512_unaligned. Use
__mempcpy_chk_avx_unaligned_erms and
__mempcpy_chk_sse2_unaligned_erms if processor has ERMS.
Default to __mempcpy_chk_sse2_unaligned.
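The selection order described in the entries above can be sketched in plain C. This is a simplified illustration, not the IFUNC selector in memcpy.S: struct example_cpu and the enumerators are hypothetical, the two "prefer" flags stand for whatever feature checks make the AVX512 or AVX variant preferable, and variants that the text says are unaffected are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the CPU properties that matter here.  */
struct example_cpu
{
  bool prefer_avx512;	/* AVX512 variant is the preferred one.  */
  bool prefer_avx;	/* AVX variant is the preferred one.  */
  bool has_erms;	/* Enhanced REP MOVSB is available.  */
};

/* Hypothetical labels for the implementations named above.  */
enum example_impl
{
  EXAMPLE_SSE2_UNALIGNED,
  EXAMPLE_SSE2_UNALIGNED_ERMS,
  EXAMPLE_AVX_UNALIGNED,
  EXAMPLE_AVX_UNALIGNED_ERMS,
  EXAMPLE_AVX512_UNALIGNED,
  EXAMPLE_AVX512_UNALIGNED_ERMS
};

/* Pick the widest preferred vector variant, then its _erms flavour
   when the processor has ERMS; default to the SSE2 variant.  */
static enum example_impl
example_select_memcpy (const struct example_cpu *cpu)
{
  assert (cpu != NULL);
  if (cpu->prefer_avx512)
    return (cpu->has_erms
	    ? EXAMPLE_AVX512_UNALIGNED_ERMS : EXAMPLE_AVX512_UNALIGNED);
  if (cpu->prefer_avx)
    return (cpu->has_erms
	    ? EXAMPLE_AVX_UNALIGNED_ERMS : EXAMPLE_AVX_UNALIGNED);
  return (cpu->has_erms
	  ? EXAMPLE_SSE2_UNALIGNED_ERMS : EXAMPLE_SSE2_UNALIGNED);
}
```

A processor with none of the flags set ends up with the SSE2 unaligned variant, matching the "Default to" wording in the entries.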
Since the new SSE2/AVX2 memsets are faster than the previous ones, we
can remove the previous SSE2/AVX2 memsets and replace them with the
new ones. This reduces the size of libc.so by about 900 bytes.
No change in IFUNC selection if SSE2 and AVX2 memsets weren't used
before. If SSE2 or AVX2 memset was used, the new SSE2 or AVX2 memset
optimized with Enhanced REP STOSB will be used for processors with
ERMS. The new AVX512 memset will be used for processors with AVX512
which prefer vzeroupper.
[BZ #19881]
* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Folded
into ...
* sysdeps/x86_64/memset.S: This.
(__bzero): Removed.
(__memset_tail): Likewise.
(__memset_chk): Likewise.
(memset): Likewise.
(MEMSET_CHK_SYMBOL): New. Define only if MEMSET_SYMBOL isn't
defined.
(MEMSET_SYMBOL): Define only if MEMSET_SYMBOL isn't defined.
* sysdeps/x86_64/multiarch/memset-avx2.S: Removed.
(__memset_zero_constant_len_parameter): Check SHARED instead of
PIC.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
memset-avx2 and memset-sse2-unaligned-erms.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Remove __memset_chk_sse2,
__memset_chk_avx2, __memset_sse2 and __memset_avx2_unaligned.
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(__bzero): Enabled.
* sysdeps/x86_64/multiarch/memset.S (memset): Replace
__memset_sse2 and __memset_avx2 with __memset_sse2_unaligned
and __memset_avx2_unaligned. Use __memset_sse2_unaligned_erms
or __memset_avx2_unaligned_erms if processor has ERMS. Support
__memset_avx512_unaligned_erms and __memset_avx512_unaligned.
(memset): Removed.
(__memset_chk): Likewise.
(MEMSET_SYMBOL): New.
(libc_hidden_builtin_def): Replace __memset_sse2 with
__memset_sse2_unaligned.
* sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Replace
__memset_chk_sse2 and __memset_chk_avx2 with
__memset_chk_sse2_unaligned and __memset_chk_avx2_unaligned_erms.
Use __memset_chk_sse2_unaligned_erms or
__memset_chk_avx2_unaligned_erms if processor has ERMS. Support
__memset_chk_avx512_unaligned_erms and
__memset_chk_avx512_unaligned.
This converts the inclusion macro for each test to use
the format specific macro. In addition, the format
specifier is removed as it is applied via the LIT() macro
which is itself applied when converting the auto inputs and
libm-test.inc into libm-test.c.
Apply the following sed regexes to auto-libm-test-in in order:
s/flt-32/binary32/
s/dbl-64/binary64/
s/ldbl-96-intel/intel96/
s/ldbl-96-m68k/m68k96/
s/ldbl-128ibm/ibm128/
s/ldbl-128/binary128/
and fix up the ldbl-96 comment manually.
Use gen-libm-test.pl to generate a list of macros
mapping to libm-test-ulps.h as this simplifies adding new
types without having to modify a growing number of
static headers each time a type is added.
This also removes the final usage of the TEST_(DOUBLE|FLOAT|LDOUBLE)
macros. Thus, they too are removed.
With the exception of the second argument of nexttoward,
any suffixes should be stripped from the test input, and
the macro LIT(x) should be applied to use the correct
suffix for the type being tested.
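A minimal sketch of how a LIT(x) macro of this kind can attach the right suffix follows. The selector EXAMPLE_TEST_TYPE and the typedef are hypothetical, and the suffix chosen per type is an assumption about how such a macro would be written, not a quote from the test headers.

```c
#include <assert.h>

/* Hypothetical selector: 0 = float, 1 = double, 2 = long double.  */
#ifndef EXAMPLE_TEST_TYPE
# define EXAMPLE_TEST_TYPE 0
#endif

#if EXAMPLE_TEST_TYPE == 0
typedef float example_float_t;
# define LIT(x) (x ## f)	/* 0.1 becomes 0.1f.  */
#elif EXAMPLE_TEST_TYPE == 1
typedef double example_float_t;
# define LIT(x) (x)		/* No suffix needed.  */
#else
typedef long double example_float_t;
# define LIT(x) (x ## L)	/* 0.1 becomes 0.1L.  */
#endif

/* A test input written without any suffix in the source.  */
static const example_float_t example_input = LIT (0.1);

/* The literal now has exactly the type under test.  */
static void
example_check_literal_type (void)
{
  assert (sizeof (LIT (0.1)) == sizeof (example_float_t));
}
```

Because the suffix is pasted on at preprocessing time, the same unsuffixed input text can serve every type.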
This adds a new argument type "j" to gen-libm-test.pl
to signify an argument to a test input which does not
require fixup. The test cases of nexttoward have
been updated to use this new feature.
This applies post-processing to all of the test inputs
through gen-libm-test.pl to strip literal suffixes and
apply the LIT(x) macro, with one exception stated above.
This seems a bit cleaner than tossing the macro onto
everything, albeit slightly more obfuscated.
For regular mmapped chunks there are two size fields (hence a reduction
by 2 * SIZE_SZ bytes), but for fake chunks, we only have one size field,
so we need to subtract SIZE_SZ bytes.
This was initially reported as Emacs bug 23726.
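The arithmetic can be written out as a small sketch. The names are hypothetical and this is not the malloc source; EXAMPLE_SIZE_SZ stands in for SIZE_SZ and is assumed to be sizeof (size_t).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for SIZE_SZ; assumed to be the size of one size field.  */
#define EXAMPLE_SIZE_SZ (sizeof (size_t))

/* Usable bytes in an mmapped chunk of CHUNK_SIZE bytes.  A regular
   mmapped chunk carries two size fields, a fake chunk only one.  */
static size_t
example_usable_size (size_t chunk_size, int is_fake_chunk)
{
  size_t overhead = is_fake_chunk ? EXAMPLE_SIZE_SZ : 2 * EXAMPLE_SIZE_SZ;
  assert (chunk_size >= overhead);
  return chunk_size - overhead;
}
```

For the same chunk size, the fake chunk therefore reports one size field more of usable space than a regular mmapped chunk.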
The i386 version of atanhl returns sNaN for sNaN input. This patch
fixes it to add NaN arguments to themselves so it returns qNaN in this
case.
Tested for x86_64 and x86.
[BZ #20219]
* sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): Add NaN argument
to itself.
* math/libm-test.inc (atanh_test_data): Add sNaN tests.
The i386 version of asinhl returns sNaN (without raising any
exceptions) for sNaN input. This patch fixes it to add non-finite
arguments to themselves, so that "invalid" is raised and qNaN
returned.
Tested for x86_64 and x86.
[BZ #20218]
* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Add non-finite argument
to itself.
* math/libm-test.inc (asinh_test_data): Add sNaN tests.
Since the FMA4 bit is in COMMON_CPUID_INDEX_80000001 and FMA4 requires
AVX, determine if FMA4 is usable after COMMON_CPUID_INDEX_80000001 is
available and if AVX is usable.
[BZ #20195]
* sysdeps/x86/cpu-features.c (get_common_indeces): Move FMA4
check to ...
(init_cpu_features): Here.
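The ordering constraint can be sketched as a pure function. The function and parameter names are hypothetical and this is not the code in cpu-features.c; the only hardware fact relied on is that CPUID leaf 0x80000001 reports FMA4 in ECX bit 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0x80000001 reports FMA4 in ECX bit 16.  */
#define EXAMPLE_BIT_FMA4 (UINT32_C (1) << 16)

/* FMA4 is usable only if the extended leaf exists, AVX has already
   been found usable, and the FMA4 bit is set in that leaf.  */
static bool
example_fma4_usable (uint32_t max_extended_leaf, uint32_t ecx_80000001,
		     bool avx_usable)
{
  if (max_extended_leaf < UINT32_C (0x80000001))
    return false;	/* ECX value is not meaningful yet.  */
  if (!avx_usable)
    return false;	/* FMA4 requires AVX.  */
  return (ecx_80000001 & EXAMPLE_BIT_FMA4) != 0;
}

static void
example_check_fma4 (void)
{
  /* Bit set, leaf available, AVX usable.  */
  assert (example_fma4_usable (UINT32_C (0x8000001e), EXAMPLE_BIT_FMA4, true));
  /* Bit set, but AVX is not usable.  */
  assert (!example_fma4_usable (UINT32_C (0x8000001e), EXAMPLE_BIT_FMA4, false));
}
```

Both inputs to the decision must be known before it is made, which is why the check belongs after the extended leaf has been read.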
In: https://sourceware.org/glibc/wiki/Synchronizing_Headers
we explain how we synchronize our headers with Linux kernel
headers.
In order to synchronize with the Linux linux/in6.h and
linux/ipv6.h headers we checked for their guard macros and
then defined __USE_KERNEL_IPV6_DEFS and conditionalized code
on this macro.
In upstream kernel 56c176c9 the _UAPI prefix was stripped and
this broke our synchronized headers again. We now need to check
for _LINUX_IN6_H and _IPV6_H, and keep checking the old versions
of the header guard checks for maximum backwards compatibility
with older Linux headers (the history is actually a bit muddled
here, and it appears the upstream Linus kernel broke this 10 months
*before* our fix was ever applied to glibc, but without glibc
testing we didn't notice and distro kernels have their own
testing to fix this).
This patch fixes synchronization with linux/in6.h and
with netinet/in.h.
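The guard check can be sketched as below. The old guard names with the _UAPI prefix are inferred from the description above, EXAMPLE_USE_KERNEL_IPV6_DEFS is a hypothetical stand-in for the real macro, and the first define only simulates a newer kernel header having been included.

```c
#include <assert.h>

/* Simulate a newer kernel header having been included first; this
   define exists only so the sketch has something to detect.  */
#define _LINUX_IN6_H 1

/* Accept both the older (_UAPI-prefixed, inferred names) and the
   newer header guards, and always define the result to 0 or 1.  */
#if defined _UAPI_LINUX_IN6_H || defined _UAPI_IPV6_H \
    || defined _LINUX_IN6_H || defined _IPV6_H
# define EXAMPLE_USE_KERNEL_IPV6_DEFS 1
#else
# define EXAMPLE_USE_KERNEL_IPV6_DEFS 0
#endif

#if !EXAMPLE_USE_KERNEL_IPV6_DEFS
/* Only in this case would the C library header supply its own
   definitions; the struct here is a hypothetical placeholder.  */
struct example_in6_addr
{
  unsigned char bytes[16];
};
#endif

static int
example_kernel_defs_in_use (void)
{
  return EXAMPLE_USE_KERNEL_IPV6_DEFS;
}

static void
example_check_guards (void)
{
  assert (example_kernel_defs_in_use () == 1);
}
```

Checking every known guard name keeps the detection working whichever generation of kernel headers was included first.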
C++11 18.5.12 says "Objects shall not be destroyed as a
result of calling quick_exit." In C11 quick_exit is silent
about thread object destruction. Therefore to make glibc
C++ compliant we do not call any thread local destructors.
A new regression test verifies the fix.
I will note that C++11 18.5.3 makes it clear that C++
defines additional requirements for _Exit() to prevent it
from executing destructors.
Given that the point of _Exit() is to terminate the process
immediately, it makes sense that C and C++ should line up
and avoid calling destructors.
No failures. New regtest passes.
The dbl-64 version of asin returns sNaN for sNaN arguments. This
patch fixes it to add NaN arguments to themselves so that qNaN is
returned in this case.
Tested for x86_64 and x86.
[BZ #20213]
* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Add NaN
argument to itself.
* math/libm-test.inc (asin_test_data): Add sNaN tests.
This patch consolidates all the pwritev{64} implementations for Linux
into a single one (sysdeps/unix/sysv/linux/pwritev{64}.c). It also removes the
syscall from the auto-generation using assembly macros.
It is based on the previous pwrite/pwrite64 consolidation patch. The new macro
SYSCALL_LL{64} is used to handle the offset argument, and an alias is created
for __ASSUME_OFF_DIFF_OFF64 in case of pread64.
Checked on x86_64, i386, aarch64, and powerpc64le.
* misc/Makefile (CFLAGS-pwritev.c): New variable: add cancellation
required flags.
(CFLAGS-pwritev64.c): Likewise.
* sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev.c: Remove file.
* sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/pwritev64.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/pwritev.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/pwritev64.: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (pwritev): Remove
syscall from auto-generation.
* sysdeps/unix/sysv/linux/pwritev.c: Rewrite implementation.
[WORDSIZE == 64] (pwritev64): Remove macro.
[!PWRITEV] (PWRITEV): Likewise.
[!PWRITEV] (PWRITEV_REPLACEMENT): Likewise.
[!PWRITEV] (PWRITE): Likewise.
[!PWRITEV] (OFF_T): Likewise.
[!__ASSUME_PWRITEV] (PWRITEV_REPLACEMENT): Likewise.
(LO_HI_LONG): Remove macro.
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pwritev): Add function.
* sysdeps/unix/sysv/linux/pwritev64.c: Rewrite implementation.
(PWRITEV): Remove macro.
(PWRITEV_REPLACEMENT): Likewise.
(PWRITE): Likewise.
(OFF_T): Likewise.
(pwritev64): New function.
* nptl/tst-cancel4.c (tf_writev): Add test.
This patch consolidates all the preadv{64} implementations for Linux
into a single one (sysdeps/unix/sysv/linux/preadv{64}.c). It also removes the
syscall from the auto-generation using assembly macros.
It is based on the previous pread/pread64 consolidation patch. The new macro
SYSCALL_LL{64} is used to handle the offset argument, and an alias is created
for __ASSUME_OFF_DIFF_OFF64 in case of pread64.
Checked on x86_64, i386, aarch64, and powerpc64le.
* misc/Makefile (CFLAGS-preadv.c): New variable: add cancellation
required flags.
(CFLAGS-preadv64.c): Likewise.
* sysdeps/unix/sysv/linux/generic/wordsize-32/preadv.c: Remove file.
* sysdeps/unix/sysv/linux/generic/wordsize-32/preadv64.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/preadv64.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/preadv.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/preadv64.: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (preadv): Remove
syscall from auto-generation.
* sysdeps/unix/sysv/linux/preadv.c: Rewrite implementation.
[WORDSIZE == 64] (preadv64): Remove macro.
[!PREADV] (PREADV): Likewise.
[!PREADV] (PREADV_REPLACEMENT): Likewise.
[!PREADV] (PREAD): Likewise.
[!PREADV] (OFF_T): Likewise.
[!__ASSUME_PREADV] (PREADV_REPLACEMENT): Likewise.
(LO_HI_LONG): Remove macro.
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (preadv): Add function.
* sysdeps/unix/sysv/linux/preadv64.c: Rewrite implementation.
(PREADV): Remove macro.
(PREADV_REPLACEMENT): Likewise.
(PREAD): Likewise.
(OFF_T): Likewise.
(preadv64): New function.
* nptl/tst-cancel4.c (tf_preadv): Add test.
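From the caller's side, preadv and its pwritev counterpart behave as in the following round-trip sketch. It is a usage illustration only, not part of the patch or of tst-cancel4.c; the function name and the temporary file template are hypothetical, and a writable /tmp is assumed.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Write two buffers at OFFSET with one pwritev call, read them back
   with one preadv call into differently sized buffers, and return 0
   if the data survived the round trip, -1 otherwise.  */
static int
example_vectored_roundtrip (off_t offset)
{
  char path[] = "/tmp/example-pwritev-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);	/* The file vanishes once FD is closed.  */

  char head[] = "hello, ";
  char tail[] = "world";
  struct iovec out[2] = {
    { .iov_base = head, .iov_len = strlen (head) },
    { .iov_base = tail, .iov_len = strlen (tail) },
  };
  ssize_t total = (ssize_t) (out[0].iov_len + out[1].iov_len);
  int ok = pwritev (fd, out, 2, offset) == total;

  char a[4], b[8];
  assert (sizeof a + sizeof b == (size_t) total);
  struct iovec in[2] = {
    { .iov_base = a, .iov_len = sizeof a },
    { .iov_base = b, .iov_len = sizeof b },
  };
  ok = ok && preadv (fd, in, 2, offset) == total;
  ok = ok && memcmp (a, "hell", 4) == 0 && memcmp (b, "o, world", 8) == 0;

  close (fd);
  return ok ? 0 : -1;
}
```

Both calls take the file offset explicitly and leave the descriptor's own offset alone, which is why the offset argument needs the special handling mentioned above on 32-bit ABIs.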
The dbl-64 version of acos returns sNaN for sNaN arguments. This
patch fixes it to add NaN arguments to themselves so that qNaN is
returned in this case.
Tested for x86_64 and x86.
[BZ #20212]
* sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_acos): Add NaN
argument to itself.
* math/libm-test.inc (acos_test_data): Add sNaN tests.
On S390, I get a compile error for dlfcn/tst-rec-dlopen.c:
tst-rec-dlopen.c: In function ‘malloc’:
tst-rec-dlopen.c:101:4: error: implicit declaration of function ‘strlen’ [-Werror=implicit-function-declaration]
(void) write (STDOUT_FILENO, message, strlen (message));
^
tst-rec-dlopen.c:101:42: error: incompatible implicit declaration of built-in function ‘strlen’ [-Werror]
(void) write (STDOUT_FILENO, message, strlen (message));
^
tst-rec-dlopen.c:112:42: error: incompatible implicit declaration of built-in function ‘strlen’ [-Werror]
(void) write (STDOUT_FILENO, message, strlen (message));
^
This patch adds the missing "#include <string.h>" for strlen.
ChangeLog:
* dlfcn/tst-rec-dlopen.c: Include string.h.
When trying to compile regression tests that use
C++ and the threads header you get this failure:
In file included from /usr/include/c++/5.3.1/cwchar:44:0,
from /usr/include/c++/5.3.1/bits/postypes.h:40,
from /usr/include/c++/5.3.1/bits/char_traits.h:40,
from /usr/include/c++/5.3.1/string:40,
from /usr/include/c++/5.3.1/stdexcept:39,
from /usr/include/c++/5.3.1/array:38,
from /usr/include/c++/5.3.1/tuple:39,
from /usr/include/c++/5.3.1/functional:55,
from /usr/include/c++/5.3.1/thread:39,
from tst-thread-quick_exit.cc:19:
../include/wchar.h:105:23: error: invalid conversion from ‘wchar_t*
(*)(wchar_t*, wchar_t, size_t) throw () {aka wchar_t* (*)(wchar_t*,
wchar_t, long unsigned int) throw ()}’ to ‘int’ [-fpermissive]
extern typeof (wmemset) __wmemset;
^
../include/wchar.h:105:25: error: expected ‘,’ or ‘;’ before ‘__wmemset’
extern typeof (wmemset) __wmemset;
^
The simplest fix for C++ is to avoid the use of
typeof and just declare the prototype as expected.
No regressions on x86_64. Committed as obvious.
The include/wchar.h header is only for internal
build uses and therefore is not ever seen by any
external users and needs no bug #.
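The kind of change involved can be sketched as follows. example_wmemset is a hypothetical name with a simple loop body added so the sketch is complete, and the commented-out line mirrors the declaration shown in the error message rather than quoting the fixed header.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Before: the declaration borrowed its type from wmemset through the
   typeof extension, which the C++ build above rejected:

     extern typeof (wmemset) example_wmemset;  */

/* After: an ordinary prototype with the same parameter and return
   types, which both C and C++ compilers accept.  */
extern wchar_t *example_wmemset (wchar_t *s, wchar_t c, size_t n);

/* Hypothetical definition so that the sketch is self-contained.  */
wchar_t *
example_wmemset (wchar_t *s, wchar_t c, size_t n)
{
  assert (s != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    s[i] = c;
  return s;
}
```

The explicit prototype has to be kept in step with the public declaration by hand, which is the price of not depending on typeof.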
The x86 / x86_64 implementation of nextafterl (also used for
nexttowardl) produces incorrect results (NaNs) when negative
subnormals, the low 32 bits of whose mantissa are zero, are
incremented towards zero. This patch fixes this by disabling the
logic to decrement the exponent in that case.
Tested for x86_64 and x86.
[BZ #20205]
* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Do not adjust
exponent when incrementing negative subnormal with low mantissa
word zero.
* math/libm-test.inc (nextafter_test_data) [TEST_COND_intel96]:
Add another test.
The use of __USE_KERNEL_IPV6_DEFS with ifndef is bad
practice per: https://sourceware.org/glibc/wiki/Wundef.
This change moves it to use 'if' and always define the
macro.
Please note that this is not the only problem with this
code. I have a series of fixes after this one to resolve
breakage with this code and add regression tests for it
via compile-only source testing (to be discussed in another
thread).
Unfortunately __USE_KERNEL_XATTR_DEFS is set by the kernel
and not glibc, and uses 'define', so we can't fix that yet.
* scripts/check-local-headers.sh (exclude): Add hurd/ihash.h, and
include .*-.*/ in addition to .*-.*-.*/ (i.e. i386-gnu in addition to
i386-linux-gnu).
This patch calls _exit instead of exit in the failure case for the spawned
child in the Linux posix_spawn{p} implementation.
Tested on x86_64.
[BZ #20178]
* sysdeps/unix/sysv/linux/spawni.c (__spawni_child): Call _exit
on failure instead of exit.
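Why _exit is the right call in the child can be seen in a simplified sketch. It uses plain fork rather than whatever spawni.c does internally, the function name is hypothetical, and the status value 127 is assumed here as the conventional "could not execute" code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to execute PATH in a child process.  Return the child's exit
   status, or -1 if the child could not be created or waited for.  */
static int
example_spawn_status (const char *path)
{
  assert (path != NULL);
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      char *const argv[] = { (char *) path, NULL };
      execv (path, argv);
      /* exec failed.  Leave immediately: exit would run atexit
	 handlers and flush stdio buffers copied from the parent,
	 while _exit terminates the child without doing either.  */
      _exit (127);
    }
  int status;
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Calling it with a path that does not exist should report 127 without the child touching any state inherited from the parent.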
For Intel processors, when there are both L2 and L3 caches, SMT level
type should be used to count number of available logical processors
sharing L2 cache. If there is only L2 cache, core level type should
be used to count number of available logical processors sharing L2
cache. Number of available logical processors sharing L2 cache should
be used for non-inclusive L2 and L3 caches.
* sysdeps/x86/cacheinfo.c (init_cacheinfo): Count number of
available logical processors with SMT level type sharing L2
cache for Intel processors.
The powerpc64 versions of ceil, floor, round, trunc, rint, nearbyint
and their float versions return sNaN for sNaN input when they should
return qNaN. This patch fixes them to add a NaN argument to itself to
quiet sNaNs before returning.
Tested for powerpc64.
[BZ #20160]
* sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Add NaN
argument to itself before returning the result.
* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint):
Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf):
Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rint.S (__rint): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rintf.S (__rintf): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
The powerpc32 versions of ceil, floor, round, trunc, rint, nearbyint
and their float versions return sNaN for sNaN input when they should
return qNaN. This patch fixes them to add a NaN argument to itself to
quiet sNaNs before returning. The powerpc64 versions, which have the
same bug, will be addressed separately.
Tested for powerpc32.
[BZ #20160]
* sysdeps/powerpc/powerpc32/fpu/s_ceil.S (__ceil): Add NaN
argument to itself before returning the result.
* sysdeps/powerpc/powerpc32/fpu/s_ceilf.S (__ceilf): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_floor.S (__floor): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_floorf.S (__floorf): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S (__nearbyint):
Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S (__nearbyintf):
Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_rint.S (__rint): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_rintf.S (__rintf): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_round.S (__round): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_roundf.S (__roundf): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_trunc.S (__trunc): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_truncf.S (__truncf): Likewise.
This is useful in situations where the long double type is
less precise than the type under test. This adds a new
wrapper macro LITM(x) to each type to append the proper
suffix onto macro constants found in math.h.
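A sketch of the long double flavour of such a wrapper follows. The EXAMPLE_PI pair is hypothetical and stands in for a math.h pair such as M_PI and M_PIl, and the definition of LITM shown is an assumption about how the macro could be written.

```c
#include <assert.h>

/* Hypothetical constants standing in for a math.h pair such as
   M_PI (double) and M_PIl (long double).  */
#define EXAMPLE_PI  3.14159265358979323846
#define EXAMPLE_PIl 3.141592653589793238462643383279502884L

/* Long double flavour of the wrapper.  The operand of ## is not
   expanded before pasting, so the suffix lands on the macro NAME
   (EXAMPLE_PI becomes EXAMPLE_PIl) rather than on its value.  */
#define LITM(x) x ## l

static const long double example_pi = LITM (EXAMPLE_PI);

static void
example_check_litm (void)
{
  assert (sizeof (LITM (EXAMPLE_PI)) == sizeof (long double));
}
```

This differs from a literal-suffix macro in that the pasted result is itself a macro name, which is then expanded to the wider constant.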
These are local to the test suite. Rename them as a macro starting
with lit_pi and a series of postfix operations to give us a constant
starting with lit_pi.
The lit prefix is intended to enable easy substitutions via
gen-libm-test.pl if needed.
The powerpc implementations of fabsl for ldbl-128ibm (both powerpc32
and powerpc64) wrongly raise the "invalid" exception for sNaN
arguments. fabs functions should be quiet for all inputs including
signaling NaNs. The problem is the use of a comparison instruction
fcmpu to determine if the high part of the argument is negative and so
the low part needs to be negated; such instructions raise "invalid"
for sNaNs.
There is a pure integer implementation of fabsl in
sysdeps/ieee754/ldbl-128ibm/s_fabsl.c. However, it's not necessary to
use it to avoid such exceptions. The fsel instruction does not raise
exceptions for sNaNs, and can be used in place of the original
comparison. (Note that if the high part is zero or a NaN, it does not
matter whether the low part is negated; the choice of whether the low
part of a zero is +0 or -0 does not affect the value, and the low part
of a NaN does not affect the value / payload either.)
The condition in GCC for fsel to be available is TARGET_PPC_GFXOPT,
corresponding to the _ARCH_PPCGR predefined macro. fsel is available
on all 64-bit processors supported by GCC. A few 32-bit processors
supported by GCC do not have TARGET_PPC_GFXOPT despite having hard
float support. To support those processors, integer code (similar to
that in copysignl) is included for the !_ARCH_PPCGR case for
powerpc32.
Tested for powerpc32 (configurations with and without _ARCH_PPCGR) and
powerpc64.
[BZ #20157]
* sysdeps/powerpc/powerpc32/fpu/s_fabsl.S (__fabsl): Use fsel to
determine whether to negate low half if [_ARCH_PPCGR], and integer
comparison otherwise.
* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S (__fabsl): Use fsel to
determine whether to negate low half.
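The integer-only alternative mentioned above can be sketched in C. This is an illustration of the idea, not the code in s_fabsl.c or the assembly: struct example_ddouble is a hypothetical stand-in for an IBM long double, modelled as a high and a low IEEE 754 binary64 double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IBM long double: high + low double.  */
struct example_ddouble
{
  double hi;
  double lo;
};

#define EXAMPLE_SIGN_BIT UINT64_C (0x8000000000000000)

/* Absolute value using integer operations only, so no floating-point
   comparison is executed and no exception can be raised for a
   signaling NaN.  If the high part is negative, both parts have
   their sign flipped; otherwise nothing changes.  */
static struct example_ddouble
example_fabs_dd (struct example_ddouble x)
{
  uint64_t hi, lo;
  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&hi, &x.hi, sizeof hi);
  memcpy (&lo, &x.lo, sizeof lo);
  lo ^= hi & EXAMPLE_SIGN_BIT;	/* Flip low sign iff high is negative.  */
  hi &= ~EXAMPLE_SIGN_BIT;	/* Clear the high sign.  */
  memcpy (&x.hi, &hi, sizeof hi);
  memcpy (&x.lo, &lo, sizeof lo);
  return x;
}
```

For example, the pair (-1.0, 0x1p-60) becomes (1.0, -0x1p-60), while a pair with a non-negative high part is returned unchanged.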
This patch removes various no-longer-used macros from libm-test.inc.
NO_TEST_INLINE_FLOAT, NO_TEST_INLINE_DOUBLE and M_PI_6l would have
been used before relevant tests were moved to auto-libm-test-in.
TEST_COND_x86_64 and TEST_COND_x86 were for tests in auto-libm-test-in
XFAILed for x86, and are no longer relevant now the bugs in question
have been fixed and the XFAILing removed (if future x86-specific
XFAILs become needed, they can always be added back).
Tested for x86_64 and x86.
* math/libm-test.inc (NO_TEST_INLINE_FLOAT): Remove macro.
(NO_TEST_INLINE_DOUBLE): Likewise.
(TEST_COND_x86_64): Likewise.
(TEST_COND_x86): Likewise.
(M_PI_6l): Likewise.
Replace most of the type specific macros with the equivalent
type-generic macro using the sed replacement command below:
sed -ri -e 's/defined TEST_FLOAT/TEST_COND_binary32/' \
-e 's/ndef TEST_FLOAT/ !TEST_COND_binary32/' \
-e 's/def TEST_FLOAT/ TEST_COND_binary32/' \
-e 's/defined TEST_DOUBLE/TEST_COND_binary64/'\
-e 's/ndef TEST_DOUBLE/ !TEST_COND_binary64/' \
-e 's/def TEST_DOUBLE/ TEST_COND_binary64/' \
-e 's/defined TEST_LDOUBLE && //' \
-e 's/ifdef TEST_LDOUBLE/if MANT_DIG >= 64/' \
-e 's/defined TEST_LDOUBLE/MANT_DIG >= 64/' \
-e '/nexttoward_test_data\[\]/,/ };/!s/LDBL_(MIN_EXP|MAX_EXP|MANT_DIG)/\1/g' \
libm-test.inc
With a little extra manual cleanup to simplify the following case:
#if MANT_DIG >= 64
# if MANT_DIG >= 64
...
# endif
...
Note, TEST_LDOUBLE checks are replaced by MANT_DIG >= 64 excepting
where another property of the type is being tested. And, the final
regex is intended to avoid replacing LDBL_ macro usage within the
nexttoward tests which explicitly take argument 2 as long double.
Attempt to creatively redefine the macros
to choose tests based on the format being
tested, not the type.
Note, TS 18661 does not define any printf
modifiers, so we need to be a little more
verbose about constructing strings to
output.
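Choosing tests by format rather than by type can be sketched with conditions on the format parameters. The three defines that bind MANT_DIG, MIN_EXP and MAX_EXP to double are a hypothetical setup, the exact conditions are an assumption about how such macros could be written, and example_format_bits is a hypothetical helper.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical setup for a test run over double.  */
#define MANT_DIG DBL_MANT_DIG
#define MIN_EXP DBL_MIN_EXP
#define MAX_EXP DBL_MAX_EXP

/* Conditions on the format's parameters, not on the C type name.  */
#define TEST_COND_binary32 \
  (MANT_DIG == 24 && MIN_EXP == -125 && MAX_EXP == 128)
#define TEST_COND_binary64 \
  (MANT_DIG == 53 && MIN_EXP == -1021 && MAX_EXP == 1024)

/* Return the width of the detected format, or 0 if it is neither.  */
static int
example_format_bits (void)
{
#if TEST_COND_binary32
  return 32;
#elif TEST_COND_binary64
  return 64;
#else
  return 0;
#endif
}

static void
example_check_format (void)
{
  /* The two conditions are mutually exclusive.  */
  assert (!(TEST_COND_binary32 && TEST_COND_binary64));
}
```

A type whose format happens to match binary64, whatever it is called, would then pick up the binary64 tests automatically.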
The ldbl-128ibm implementations of ceill, floorl, roundl, truncl,
rintl and nearbyintl wrongly return an sNaN when given an sNaN
argument. This patch fixes them to add such an argument to itself to
turn it into a quiet NaN. (The code structure means this "else" case
applies to any argument which is zero or not finite; it's OK to do
this in all such cases.)
Tested for powerpc.
[BZ #20156]
* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (__ceill): Add high part
to itself when zero or not finite.
* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (__floorl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c (__rintl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_roundl.c (__roundl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.