Commit Graph

1131 Commits

Author SHA1 Message Date
Paul Zimmermann
9acda61d94 Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]
For j0f/j1f/y0f/y1f, the largest error for all binary32
inputs is reduced to at most 9 ulps for all rounding modes.

The new code is enabled only when there is a cancellation at the very end of
the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not
give any visible slowdown on average.  Two different algorithms are used:

* around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of
  degree 3 are used, computed using the Sollya tool (https://www.sollya.org/)

* for large inputs, an asymptotic formula from [1] is used

[1] Fast and Accurate Bessel Function Computation,
    John Harrison, Proceedings of Arith 19, 2009.

Inputs yielding the new largest errors are added to auto-libm-test-in,
and ulps are regenerated for various targets (thanks Adhemerval Zanella).

Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu.
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-04-02 06:15:48 +02:00
Siddhesh Poyarekar
abadbef5c8 Move __isnanf128 to libc.so
All of the isnan functions are in libc.so due to printf_fp, so move
__isnanf128 there too for consistency.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@ascii.art.br>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-03-30 14:58:19 +05:30
Wilco Dijkstra
92cfc9ad82 math: Remove mpa files (part 2) [BZ #15267]
Previous commit was missing deleted files in sysdeps/ieee754/dbl-64.

Finally remove all mpa related files, headers, declarations, probes, unused
tables and update makefiles.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 15:45:19 +00:00
Wilco Dijkstra
47ad14d789 math: Remove mpa files [BZ #15267]
Finally remove all mpa related files, headers, declarations, probes, unused
tables and update makefiles.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 14:26:36 +00:00
Wilco Dijkstra
4e1a870b9a math: Remove slow paths from atan2 [BZ #15267]
Remove slow paths from atan2. Add ULP annotations.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 14:26:36 +00:00
Wilco Dijkstra
e898cd1593 math: Remove slow paths from atan [BZ #15267]
Remove slow paths from atan. Add ULP annotations.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 14:26:36 +00:00
Wilco Dijkstra
476d692e8a math: Remove slow paths in tan [BZ #15267]
Remove slow paths in tan. Add ULP annotations. Merge 'number' into 'mynumber'.
Remove unused entries from tan constants.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 14:26:36 +00:00
Wilco Dijkstra
db3f7bb558 math: Remove slow paths from asin and acos [BZ #15267]
This patch series removes all remaining slow paths and related code.
First asin/acos, tan, atan, atan2 implementations are updated, and the final
patch removes the unused mpa files, headers and probes. Passes buildmanyglibc.

Remove slow paths from asin/acos. Add ULP annotations based on previous slow
path checks (which are approximate). Update AArch64 and x86_64 libm-test-ulps.

Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-03-11 14:26:36 +00:00
Adhemerval Zanella
bf7db6d369 math: Add BZ#18980 fix back on dbl-64 cosh
It is regression from 9e97f239ea (Remove dbl-64/wordsize-64
(part 2)) where is missed to add the BZ#18980 fix (9e97f239ea).

Checked on i686-linux-gnu.
2021-01-11 16:56:33 -03:00
Wilco Dijkstra
9e97f239ea Remove dbl-64/wordsize-64 (part 2)
Remove the wordsize-64 implementations by merging them into the main dbl-64
directory.  The second patch just moves all wordsize-64 files and removes a
few wordsize-64 uses in comments and Implies files.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-07 15:26:26 +00:00
Wilco Dijkstra
caa884dda7 Remove dbl-64/wordsize-64
Remove the wordsize-64 implementations by merging them into the main dbl-64
directory.  The first patch adds special cases needed for 32-bit targets
(FIX_INT_FP_CONVERT_ZERO and FIX_DBL_LONG_CONVERT_OVERFLOW) to the
wordsize-64 versions.  This has no effect on 64-bit targets since they don't
define these macros.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-07 15:02:51 +00:00
Siddhesh Poyarekar
8cc1e39a36 Drop nan-pseudo-number.h usage from tests
Make the tests use TEST_COND_intel96 to decide on whether to build the
unnormal tests instead of the macro in nan-pseudo-number.h and then
drop the header inclusion.  This unbreaks test runs on all
architectures that do not have ldbl-96.

Also drop the HANDLE_PSEUDO_NUMBERS macro since it is not used
anywhere.
2021-01-04 20:49:56 +05:30
Siddhesh Poyarekar
fee3b889d8 Move generic nan-pseudo-number.h to ldbl-96
The concept of pseudo number formats only exists in the realm of the
96 bit long double format.
2021-01-04 14:51:52 +05:30
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Siddhesh Poyarekar
7525c1c71d x86 long double: Consider pseudo numbers as signaling
Add support to treat pseudo-numbers specially and implement x86
version to consider all of them as signaling.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-12-30 10:52:45 +05:30
Anssi Hannula
69a7ca7705 ieee754: Remove unused __sin32 and __cos32
The __sin32 and __cos32 functions were only used in the now removed slow
path of asin and acos.
2020-12-18 12:10:31 +05:30
Anssi Hannula
f67f9c9af2 ieee754: Remove slow paths from asin and acos
asin and acos have slow paths for rounding the last bit that cause some
calls to be 500-1500x slower than average calls.

These slow paths are rare, a test of a trillion (1.000.000.000.000)
random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.

The slow paths claim correct rounding and use __sin32() and __cos32()
(which compare two result candidates and return the closest one) as the
final step, with the second result candidate (res1) having a small offset
applied from res. This suggests that res and res1 are intended to be 1
ULP apart (which makes sense for rounding), barring bugs, allowing us to
pick either one and still remain within 1 ULP of the exact result.

Remove the slow paths as the accuracy is better than 1 ULP even without
them, which is enough for glibc.

Also remove code comments claiming correctly rounded results.

After slow path removal, checking the accuracy of 14.400.000.000 random
asin() and acos() inputs showed only three incorrectly rounded
(error > 0.5 ULP) results:
- asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
- asin(-0x1.f692ba202abcp-4)  (0.500003 ULP, same as before)
- asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
The first two had the same error even before this commit, and they did
not use the slow path at all.

Checking 4934 known randomly found previously-slow-path asin inputs
shows 25 calls with incorrectly rounded results, with a maximum error of
0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
rounded all these inputs correctly (error < 0.5 ULP).
The observed average speed increase was 130x.

Checking 36240 known randomly found previously-slow-path acos inputs
shows 42 calls with incorrectly rounded results, with a maximum error of
0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
slow-path code showed 34 calls with incorrectly rounded results, with the
same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
The observed average speed increase was 130x.

The functions could likely be trimmed more while keeping acceptable
accuracy, but this at least gets rid of the egregiously slow cases.

Tested on x86_64.
2020-12-18 12:09:23 +05:30
Adhemerval Zanella
01bd62517c Remove tls.h inclusion from internal errno.h
The tls.h inclusion is not really required and limits possible
definition on more arch specific headers.

This is a cleanup to allow inline functions on sysdep.h, more
specifically on i386 and ia64 which requires to access some tls
definitions its own.

No semantic changes expected, checked with a build against all
affected ABIs.
2020-11-13 12:59:19 -03:00
Paul Zimmermann
b7dd366dbe math: Fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is tiny
Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-08-07 16:33:13 -03:00
Joseph Myers
6c010c5dde Use C2x return value from getpayload of non-NaN (bug 26073).
In TS 18661-1, getpayload had an unspecified return value for a
non-NaN argument, while C2x requires the return value -1 in that case.

This patch implements the return value of -1.  I don't think this is
worth having a new symbol version that's an alias of the old one,
although occasionally we do that in such cases where the new function
semantics are a refinement of the old ones (to avoid programs relying
on the new semantics running on older glibc versions but not behaving
as intended).

Tested for x86_64 and x86; also ran math/ tests for aarch64 and
powerpc.
2020-07-06 16:18:02 +00:00
Adhemerval Zanella
be668a8d78 New exp10f version without SVID compat wrapper
This patch changes the exp10f error handling semantics to only set
errno according to POSIX rules.  New symbol version is introduced at
GLIBC_2.32.  The old wrappers are kept for compat symbols.

There are some outliers that need special handling:

  - ia64 provides an optimized implementation of exp10f that uses ia64
    specific routines to set SVID compatibility.  The new symbol version
    is aliased to the exp10f one.

  - m68k also provides an optimized implementation, and the new version
    uses it instead of the sysdeps/ieee754/flt32 one.

  - riscv and csky uses the generic template implementation that
    does not provide SVID support.  For both cases a new exp10f
    version is not added, but rather the symbols version of the
    generic sysdeps/ieee754/flt32 is adjusted instead.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
powerpc64le-linux-gnu.
2020-06-19 12:08:47 -03:00
Paul Zimmermann
6e98983c09 math: Optimized generic exp10f with wrappers
It is inspired by expf and reuses its tables and internal functions.
The error checks are inlined and errno setting is in separate tail
called functions, but the wrappers are kept in this patch to handle
the _LIB_VERSION==_SVID_ case.

Double precision arithmetics is used which is expected to be faster on
most targets (including soft-float) than using single precision and it
is easier to get good precision result with it.

Result for x86_64 (i7-4790K CPU @ 4.00GHz) are:

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.0414e+09,
    "iterations": 1.00128e+08,
    "reciprocal-throughput": 26.6818,
    "latency": 54.043,
    "max-throughput": 3.74787e+07,
    "min-throughput": 1.85038e+07
   }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.11951e+09,
    "iterations": 1.23968e+08,
    "reciprocal-throughput": 21.0581,
    "latency": 45.4028,
    "max-throughput": 4.74876e+07,
    "min-throughput": 2.20251e+07
   }

Result for aarch64 (A72 @ 2GHz) are:

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.62362e+09,
    "iterations": 3.3376e+07,
    "reciprocal-throughput": 127.698,
    "latency": 149.365,
    "max-throughput": 7.831e+06,
    "min-throughput": 6.69501e+06
   }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.29108e+09,
    "iterations": 6.6752e+07,
    "reciprocal-throughput": 51.2111,
    "latency": 77.3568,
    "max-throughput": 1.9527e+07,
    "min-throughput": 1.29271e+07
   }

Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, aarch64-linux-gnu,
and sparc64-linux-gnu.
2020-06-19 10:48:15 -03:00
Vineet Gupta
e93c264336 ieee754/dbl-64: Reduce the scope of temporary storage variables
This came to light when adding hard-flaot support to ARC glibc port
without hardware sqrt support causing glibc build to fail:

| ../sysdeps/ieee754/dbl-64/e_sqrt.c: In function '__ieee754_sqrt':
| ../sysdeps/ieee754/dbl-64/e_sqrt.c:58:54: error: unused variable 'ty' [-Werror=unused-variable]
|   double y, t, del, res, res1, hy, z, zz, p, hx, tx, ty, s;

The reason being EMULV() macro uses the hardware provided
__builtin_fma() variant, leaving temporary variables 'p, hx, tx, hy, ty'
unused hence compiler warning and ensuing error.

The intent of the patch was to fix that error, but EMULV is pervasive
and used fair bit indirectly via othe rmacros, hence this patch.
Functionally it should not result in code gen changes and if at all
those would be better since the scope of those temporaries is greatly
reduced now

Built tested with aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf hppa-linux-gnu x86_64-linux-gnu arm-linux-gnueabihf riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64 powerpc-linux-gnu microblaze-linux-gnu nios2-linux-gnu hppa-linux-gnu

Also as suggested by Joseph [1] used --strip and compared the libs with
and w/o patch and they are byte-for-byte unchanged (with gcc 9).

| for i in `find . -name libm-2.31.9000.so`;
| do
|    echo $i; diff $i /SCRATCH/vgupta/gnu2/install/glibcs/$i ; echo $?;
| done

| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so

[1] https://sourceware.org/pipermail/libc-alpha/2019-November/108267.html
2020-06-15 13:09:21 -07:00
Vineet Gupta
628d90c5f9 ieee754: provide gcc builtins based generic fma functions
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-03 10:23:28 -07:00
Vineet Gupta
3374868668 ieee754: provide gcc builtins based generic sqrt functions
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-03 10:23:22 -07:00
Paul E. Murphy
a49e56a945 float128: use builtin_signbitf128 always
The minimum GCC version has been raised to 6.2 for building
glibc.  Therefore, follow the advice inside the implementation
and remove the GCC < 6 codepath.

Likewise, remove the hidden_proto as all internal usages should
inline now.
2020-05-04 13:18:45 -05:00
Paul E. Murphy
5c7ccc2983 powerpc64le: blacklist broken GCC compilers (e.g GCC 7.5.0)
GCC 7.5.0 (PR94200) will refuse to compile if both -mabi=% and
-mlong-double-128 are passed on the command line.  Surprisingly,
it will work happily if the latter is not.  For the sake of
maintaining status quo, test for and blacklist such compilers.

Tested with a GCC 8.3.1 and GCC 7.5.0 compiler for ppc64le.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Paul E. Murphy
e2239af353 Rename __LONG_DOUBLE_USES_FLOAT128 to __LDOUBLE_REDIRECTS_TO_FLOAT128_ABI
Improve the commentary to aid future developers who will stumble
upon this novel, yet not always perfect, mechanism to support
alternative formats for long double.

Likewise, rename __LONG_DOUBLE_USES_FLOAT128 to
__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI now that development work
has settled down.  The command used was

git grep -l __LONG_DOUBLE_USES_FLOAT128 ':!./ChangeLog*' | \
  xargs sed -i 's/__LONG_DOUBLE_USES_FLOAT128/__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI/g'

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Paul E. Murphy
bd98471eb2 ldbl-128ibm: simplify iscanonical.h
The test for enabling _Float128 or IEEE 128 long double can be
greatly simplified knowing that there is no ibm128, thus we require
no special cases, and everything is canonical.

This reverts the changes to ldbl-128ibm iscanonical.h from commit
8dbfea3a20 and extends the check
for __NO_LONG_DOUBLE_MATH to include a check for float128 redirects
to long double.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-06 10:23:58 -05:00
Paul E. Murphy
3618e5fece Makeconfig: sandwich gnulib-tests between libc/ld linking of tests
This better resembles the default linking process with the gnulibs,
and also resolves the increasingly difficult to maintain
f128-loader-link usage on powerpc64le as some libgcc symbols are
dependent on those found in the loader (ld).
2020-03-25 14:34:23 -05:00
Paul E. Murphy
45ae17dd7e ldbl-128ibm-compat: PLT redirects for using ldbl redirects internally
Tweak the PLT bypass magic when building glibc with long double
redirects.  This is made more difficult by the fact we only get
one chance to redirect functions.  This happens via the public
headers.

There are roughly three classes of redirect we need to attend to
today:

 1. Simple redirects, redirected via cdef macro overrides and
    and new libc_hidden_ldbl_proto macro.
 2. Internal usage of internal API, e.g __snprintf, which has
    no direct analogue.  This is bypassed directly on case-by-
    case basis.
 3. Double redirects, e.g sscanf and related.  These require
    a heavier handed approach of macro renaming to existing
    symbols.

Most simple redirects are handled via 1.  Ideally, the libc_*
macro would live in libc-symbols.h, but in practice the macros
needed for it to do anything useful live in cdefs.h, so they
are defined in the local override.

Notably, the internal name of the asprintf generated for ieee ldbl
redirects is renamed to work with internal prefixed usage.

This resolves the local plt usage introduced when building glibc
with ldbl == ieee128 on ppc64le.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-03-25 14:34:23 -05:00
Adhemerval Zanella
1c15464ca0 math: Remove inline math tests
With mathinline removal there is no need to keep building and testing
inline math tests.

The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries.  The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-03-19 11:45:44 -03:00
Tulio Magno Quites Machado Filho
d2b2acefa6 ldbl-128ibm: Let long double files have specific compiler flags
Soon, powerpc64le will need to provide extra compiler flags to the long
double files in order to continue to build using the IBM 128-bit
extended floating point type as long double.
2020-03-06 09:17:32 -06:00
Rajalakshmi Srinivasaraghavan
0059122aa0 ldbl-128ibm-compat: Add tests for IBM long double functions
This patch creates test-ibm128* tests from the long double function tests.
In order to explicitly test IBM long double functions -mabi=ibmlongdouble is
added to CFLAGS.

Likewise, update the test headers to correct choose ULPs when redirects
are enabled.

Co-authored-by: Tulio Magno Quites Machado Filho  <tuliom@linux.ibm.com>
Co-authored-by: Paul E. Murphy  <murphyp@linux.vnet.ibm.com>
2020-03-06 09:17:32 -06:00
Paul E. Murphy
4068b54c70 ldbl-128ibm-compat: link tst-ldbl-efgcvt against loader too
This also requires the linker workaround to ensure everything links
correctly.  See comment in sysdeps/powerpc/powerpc64/le/Makefile
for details.
2020-02-28 08:20:02 -06:00
Paul E. Murphy
77d5bddd8f ldbl-128ibm-compat: enforce ibm128 on compat tests
For lack of a more comprehensive solution, tack on the ibm128 ABI
compiler options for the totalorder{,mag}l compat tests which exist
prior to enabling this feature.
2020-02-28 08:20:02 -06:00
Gabriel F. T. Gomes
d967237443 ldbl-128ibm-compat: Provide nexttoward functions
The functions in the nexttoward family are special, in the sense that
they always have a long double argument, regardless of their suffix
(i.e.: nexttowardf and nexttoward have a long double argument, besides
the float and double arguments).

On top of that, they are also special because nexttoward functions are
not part of the _FloatN API, hence __nexttowardf128 do not exist.

This patch adds 4 new function implementations for the new long double
format:

  __nexttoward_to_ieee128
  __nexttowardf_to_ieee128
  __nexttowardieee128 (as an alias to __nextafterieee128)

Likewise, rename "long double" "_Float128" in shared ldbl-128
files to ensure correct type is used irrespective of ABI
switches.

Thank you to those who helped out with this patch:

Co-Authored-By: Tulio Magno Quites Machado Filho  <tuliom@linux.ibm.com>
2020-02-28 08:20:02 -06:00
Tulio Magno Quites Machado Filho
f1a166b0d7 ldbl-128ibm-compat: Provide a significand implementation
Reuse the template in order to provide the global symbol
__significandieee128.
2020-02-28 08:20:02 -06:00
Tulio Magno Quites Machado Filho
8dbfea3a20 ldbl-128ibm-compat: Redirect long double functions to f128/ieee128 functions
Modify the headers to redirect long double functions to global __*f128
symbols or to __*ieee128 otherwise.

Most of the functions in math.h benefit from the infrastructure already
available for __LDBL_COMPAT.  The only exceptions are nexttowardf and
nexttoward that need especial treatment.

Both math/bits/mathcalls-helper-functions.h and math/bits/mathcalls.h
were modified in order to provide alternative redirection destinations
that are essential to support functions that should not be redirected to
the same name pattern of the rest of the functions, i.e.: __fpclassify,
__signbit, __iseqsig, __issignaling, isinf, finite and isnan, which will
be redirected to __*f128 instead of __*ieee128 used for the rest.
2020-02-28 08:20:02 -06:00
Paul E. Murphy
a92ac9f494 ldbl-128ibm: make ieee754.h work with IEEE 128 long double
Instead of attempting something more creative, just copy
the small struct from ldbl-128 and enable it when IEEE
long double is present, and update the ibm long double
variant if supported.

Likewise, provide a shadow copy of math_ldbl.h to prevent
the ibm128 specific long double header from poisoning
unrelated files due to it's usage in math_private.h.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-21 08:24:36 -06:00
Paul E. Murphy
bb05281822 ldbl-128ibm-compat: fixup subdir location of several funcs
We want to ensure that if a second file is built to support
ieee128 long double, we built its companion implementation
with ibm128 long double.  The shared object versions of these
files build correctly because the aliasing is sufficiently
complex to prevent the redirects from applying when defining
them.

However, this does not prevent the static object variants
from becoming quietly broken due to redirects.  This is
intentionally avoided by marking such objects to be built
with -mabi=ibmlongdouble.

Shuffle the misplaced routines to build against the subdir
which defines the needed symbols.
2020-02-21 08:24:15 -06:00
Paul E. Murphy
74f614b70a ldbl-128ibm-compat: enforce correct abi flags on internal file
A number of utility files and helper objects should also be
explicitly configured to build with the ibm128 ABI to prevent
gremlins when enabling IEEE long double.
2020-02-21 08:24:04 -06:00
Tulio Magno Quites Machado Filho
218dad29e8 ldbl-128ibm-compat: Provide ieee128 symbols to narrow functions
Move the narrow math aliasing macros into a new sysdep header file
math-narrow-alias-float128.h.  Then, provide an override header
to supply the necessary changes to supply the *ieee128 aliases of
these symbols.

This adds ieee128 aliases for faddl, fdivl, fmull, fsubl, daddl, ddivl,
dmull, dsubl.
2020-02-20 17:12:14 -06:00
Tulio Magno Quites Machado Filho
77ad97356c Undefine redirections after long double definition on __LDBL_COMPAT [BZ #23294]
After defining the long double redirections to double, __MATHDECL_1 has
to be redefined to its previous state in order to avoid redirecting all
subsequent types.
2020-02-20 17:11:06 -06:00
Tulio Magno Quites Machado Filho
2b7dc93f82 ldbl-128ibm-compat: Provide a scalb implementation
Reuse the template in order to provide the redirect for
scalbl to __scalbieee128, but avoid any extra aliasing
as this is intended to support long double redirects only.
2020-02-14 08:24:56 -06:00
Tulio Magno Quites Machado Filho
c624d23260 Add a generic scalb implementation
This is a preparatory patch to enable building a _Float128
variant to ease reuse when building a _Float128 variant to
alias this long double only symbol.

Notably, stubs are added where missing to the native _Float128
sysdep dir to prevent building these newly templated variants
created inside the build directories.

Also noteworthy are the changes around LIBM_SVID_COMPAT.  These
changes are not intuitive.  The templated version is only
enabled when !LIBM_SVID_COMPAT, and the compat version is
predicated entirely on LIBM_SVID_COMPAT.  Thus, exactly one is
stubbed out entirely when building.  The nldbl scalb compat
files are updated to account for this.

Likewise, fixup the reuse of m68k's e_scalb{f,l}.c to include
it's override of e_scalb.c.  Otherwise, the search path finds
the templated copy in the build directory.  This could be
futher simplified by providing an overridden template, but I
lack the hardware to verify.
2020-02-14 08:24:56 -06:00
Joseph Myers
ad180676b8 Adjust thresholds in Bessel function implementations (bug 14469).
A recent discussion in bug 14469 notes that a threshold in float
Bessel function implementations, used to determine when to use a
simpler implementation approach, results in substantially inaccurate
results.

As I discussed in
<https://sourceware.org/ml/libc-alpha/2013-03/msg00345.html>, a
heuristic argument suggests 2^(S+P) as the right order of magnitude
for a suitable threshold, where S is the number of significand bits in
the floating-point type and P is the number of significant bits in the
representation of the floating-point type, and the float and ldbl-96
implementations use thresholds that are too small.  Some threshold
does need using, there or elsewhere in the implementation, to avoid
spurious underflow and overflow for large arguments.

This patch sets the thresholds in the affected implementations to more
heuristically justifiable values.  Results will still be inaccurate
close to zeroes of the functions (thus this patch does *not* fix any
of the bugs for Bessel function inaccuracy); fixing that would require
a different implementation approach, likely along the lines described
in <http://www.cl.cam.ac.uk/~jrh13/papers/bessel.ps.gz>.

So the justification for a change such as this would be statistical
rather than based on particular tests that had excessive errors and no
longer do so (no doubt such tests could be found, but would probably
be too fragile to add to the testsuite, as liable to give large errors
again from very small implementation changes or even from compiler
changes).  See
<https://sourceware.org/ml/libc-alpha/2020-02/msg00638.html> for such
statistics of the resulting improvements for float functions.

Tested (glibc testsuite) for x86_64.
2020-02-14 14:16:25 +00:00
Florian Weimer
c10acd4026 math/test-sinl-pseudo: Use stack protector only if available
This fixes commit 9333498794 ("Avoid ldbl-96 stack
corruption from range reduction of pseudo-zero (bug 25487).").
2020-02-13 17:01:15 +01:00
Joseph Myers
9333498794 Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).
Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (an representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).

Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.

This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process.  (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.)  Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.

Tested for x86_64.
2020-02-12 23:31:56 +00:00
Paul E. Murphy
cd7965bd97 ldbl-128ibm-compat: set PRINTF_CHK flag in {,v}sprintf_chk
This should be unconditionally set to match the common implementation,
and fixes multiple test failures related to sprintf.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-10 16:51:03 -06:00