1999-12-21 16:15:04 +00:00
|
|
|
# Begin of automatic generation
|
|
|
|
|
|
|
|
# Maximal error of functions:
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "acos":
|
2021-03-01 20:07:27 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
2012-05-04 11:05:57 +00:00
|
|
|
Function: "acos_downward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2013-12-02 11:16:42 +00:00
|
|
|
ldouble: 3
|
2012-05-04 11:05:57 +00:00
|
|
|
|
|
|
|
Function: "acos_towardzero":
|
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2013-12-02 11:16:42 +00:00
|
|
|
ldouble: 3
|
2012-05-04 11:05:57 +00:00
|
|
|
|
|
|
|
Function: "acos_upward":
|
2013-12-02 11:16:42 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
ldouble: 2
|
2012-05-04 11:05:57 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "acosh":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 4
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2014-05-20 21:21:51 +00:00
|
|
|
Function: "acosh_downward":
|
2017-01-13 11:33:42 +00:00
|
|
|
double: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
2014-05-20 21:21:51 +00:00
|
|
|
|
|
|
|
Function: "acosh_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 5
|
2014-05-20 21:21:51 +00:00
|
|
|
|
|
|
|
Function: "acosh_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 2
|
2021-03-01 20:07:27 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 4
|
2014-05-20 21:21:51 +00:00
|
|
|
|
Add narrowing add functions.
This patch adds the narrowing add functions from TS 18661-1 to glibc's
libm: fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64 for all
configurations; f32addf64x, f32addf128, f64addf64x, f64addf128,
f32xaddf64x, f32xaddf128, f64xaddf128 for configurations with
_Float64x and _Float128; __nldbl_daddl for ldbl-opt. As discussed for
the build infrastructure patch, tgmath.h support is deliberately
deferred, and FP_FAST_* macros are not applicable without optimized
function implementations.
Function implementations are added for all relevant pairs of formats
(including certain cases of a format and itself where more than one
type has that format). The main implementations use round-to-odd, or
a trivial computation in the case where both formats are the same or
where the wider format is IBM long double (in which case we don't
attempt to be correctly rounding). The sysdeps/ieee754/soft-fp
implementations use soft-fp, and are used automatically for
configurations without exceptions and rounding modes by virtue of
existing Implies files. As previously discussed, optimized versions
for particular architectures are possible, but not included.
i386 gets a special version of f32xaddf64 to avoid problems with
double rounding (similar to the existing fdim version), since this
function must round just once without an intermediate rounding to long
double. (No such special version is needed for any other function,
because the nontrivial functions use round-to-odd, which does the
intermediate computation with the rounding mode set to round-to-zero,
and double rounding is OK except in round-to-nearest mode, so is OK
for that intermediate round-to-zero computation.) mul and div will
need slightly different special versions for i386 (using round-to-odd
on long double instead of precision control) because of the
possibility of inexact intermediate results in the subnormal range for
double.
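
The double-rounding hazard and the round-to-odd remedy described above
can be sketched in portable C. This is an illustration only, not the
code added by this patch: the names narrow_add_sketch and
double_rounding_demo are hypothetical, and instead of switching the
rounding mode to round-to-zero the sketch emulates round-to-odd with an
error-free TwoSum. It assumes IEEE binary64 arithmetic evaluated
without excess precision (so not x87 long double evaluation) and
ignores overflow, NaN and errno handling.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a glibc function): add two doubles and
   round the exact sum to float once, via an emulated round-to-odd
   intermediate.  Assumes IEEE binary64 doubles evaluated without
   excess precision; overflow, NaN and errno are not handled.  */
static float
narrow_add_sketch (double x, double y)
{
  double s = x + y;                     /* sum rounded to nearest */
  double t = s - x;
  double e = (x - (s - t)) + (y - t);   /* TwoSum: exact rounding error */
  uint64_t bits;

  memcpy (&bits, &s, sizeof bits);
  /* e == e filters out NaN.  If the sum was inexact and s has an even
     significand, step one ulp toward the exact sum, which lands on the
     adjacent odd significand (round-to-odd).  */
  if (e == e && e != 0.0 && (bits & 1) == 0)
    {
      if ((e > 0.0) == (s > 0.0))
        bits += 1;                      /* exact sum is larger in magnitude */
      else
        bits -= 1;                      /* exact sum is smaller in magnitude */
      memcpy (&s, &bits, sizeof s);
    }
  return (float) s;                     /* the single rounding to float */
}

/* Returns 1 when the naive cast double-rounds but the sketch does not.  */
static int
double_rounding_demo (void)
{
  volatile double x = 1.0;
  volatile double y = 0x1p-24 + 0x1p-60;  /* just above half a float ulp */
  float naive = (float) (x + y);          /* rounds to double, then to float */
  float once = narrow_add_sketch (x, y);
  return naive == 1.0f && once == 1.0f + 0x1p-23f;
}
```

In the demo the exact sum lies just above the midpoint between two
adjacent floats. Rounding to double first discards the 2^-60 term and
leaves an exact tie, which then rounds to even; the round-to-odd
intermediate keeps a sticky low bit so the final conversion rounds up.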
To reduce duplication among the different function implementations,
math-narrow.h gets macros CHECK_NARROW_ADD, NARROW_ADD_ROUND_TO_ODD
and NARROW_ADD_TRIVIAL.
In the trivial cases and for any architecture-specific optimized
implementations, the overhead of the errno setting might be
significant, but I think that's best handled through compiler built-in
functions rather than providing separate no-errno versions in glibc
(and likewise no __*_finite entry points are provided for these
functions, __*_finite effectively being no-errno versions at present in
most cases).
Tested for x86_64 and x86, with both GCC 6 and GCC 7. Tested for
mips64 (all three ABIs, both hard and soft float) and powerpc with GCC
7. Tested with build-many-glibcs.py with both GCC 6 and GCC 7.
* math/Makefile (libm-narrow-fns): Add add.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing add functions.
* math/bits/mathcalls-narrow.h (add): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add add.
* math/math-narrow.h (CHECK_NARROW_ADD): New macro.
(NARROW_ADD_ROUND_TO_ODD): Likewise.
(NARROW_ADD_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__faddl): New
macro.
(__daddl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fadd and
dadd.
(CFLAGS-nldbl-dadd.c): New variable.
(CFLAGS-nldbl-fadd.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_daddl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_daddl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fadd, faddl,
daddl, fMaddfN, fMaddfNx, fMxaddfN and fMxaddfNx.
* math/auto-libm-test-in: Add tests of add.
* math/auto-libm-test-out-narrow-add: New generated file.
* math/libm-test-narrow-add.inc: New file.
* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fadd.c: Likewise.
* sysdeps/ieee754/float128/s_f32addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dadd.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_daddl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_faddl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-02-10 02:08:43 +00:00
|
|
|
Function: "add_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "asin":
|
2021-03-01 20:07:27 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
2012-05-04 11:05:57 +00:00
|
|
|
Function: "asin_downward":
|
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2013-12-02 11:16:42 +00:00
|
|
|
ldouble: 2
|
2012-05-04 11:05:57 +00:00
|
|
|
|
|
|
|
Function: "asin_towardzero":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
ldouble: 1
|
2012-05-04 11:05:57 +00:00
|
|
|
|
|
|
|
Function: "asin_upward":
|
2021-03-01 20:07:27 +00:00
|
|
|
double: 2
|
2012-05-04 11:05:57 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2013-12-02 11:16:42 +00:00
|
|
|
ldouble: 2
|
2012-05-04 11:05:57 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "asinh":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 2
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 2
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2013-12-23 13:40:10 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "asinh_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "asinh_towardzero":
|
2015-03-09 17:26:48 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "asinh_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 7
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2013-12-23 13:40:10 +00:00
|
|
|
Function: "atan":
|
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2001-04-22 03:36:36 +00:00
|
|
|
Function: "atan2":
|
2005-07-20 18:20:48 +00:00
|
|
|
float: 1
|
2021-03-01 20:07:27 +00:00
|
|
|
float128: 2
|
2013-12-17 16:23:00 +00:00
|
|
|
ldouble: 2
|
2001-04-22 03:36:36 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "atan2_downward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "atan2_towardzero":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "atan2_upward":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: "atan_downward":
|
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "atan_towardzero":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "atan_upward":
|
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "atanh":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "atanh_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "atanh_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "atanh_upward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "cabs":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "cabs_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "cabs_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "cabs_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: Real part of "cacos":
|
2012-03-10 17:20:51 +00:00
|
|
|
double: 1
|
2013-04-29 17:10:03 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacos":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 2
|
2013-04-29 17:10:03 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2014-05-20 21:21:51 +00:00
|
|
|
Function: Real part of "cacos_downward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
2014-05-20 21:21:51 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-05-20 21:21:51 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Imaginary part of "cacos_downward":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 6
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-05-20 21:21:51 +00:00
|
|
|
|
|
|
|
Function: Real part of "cacos_towardzero":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
2014-05-20 21:21:51 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 7
|
2014-05-20 21:21:51 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacos_towardzero":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-05-20 21:21:51 +00:00
|
|
|
|
|
|
|
Function: Real part of "cacos_upward":
|
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 7
|
2014-05-20 21:21:51 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacos_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 5
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 7
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 13
|
2014-05-20 21:21:51 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "cacosh":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacosh":
|
2006-01-30 22:29:44 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "cacosh_downward":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacosh_downward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Real part of "cacosh_towardzero":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacosh_towardzero":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 7
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "cacosh_upward":
|
|
|
|
double: 4
|
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 6
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 12
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cacosh_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2013-07-04 12:14:44 +00:00
|
|
|
Function: "carg":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2013-07-04 12:14:44 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "carg_downward":
|
|
|
|
double: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "carg_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "carg_upward":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "casin":
|
2002-09-02 20:04:55 +00:00
|
|
|
double: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: Imaginary part of "casin":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 2
|
2013-04-29 17:10:03 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "casin_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "casin_downward":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 6
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "casin_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 5
|
|
|
|
|
|
|
|
Function: Imaginary part of "casin_towardzero":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "casin_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "casin_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 5
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 7
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 13
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "casinh":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 2
|
2013-04-29 17:10:03 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "casinh":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2013-04-29 17:10:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "casinh_downward":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 6
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "casinh_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "casinh_towardzero":
|
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "casinh_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 5
|
|
|
|
|
|
|
|
Function: Real part of "casinh_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 5
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 7
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 13
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "casinh_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "catan":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 3
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "catan":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-04-06 19:50:11 +00:00
|
|
|
Function: Real part of "catan_downward":
|
|
|
|
double: 1
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 6
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "catan_downward":
|
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 7
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "catan_towardzero":
|
|
|
|
double: 1
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 7
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "catan_towardzero":
|
|
|
|
double: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 3
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "catan_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 1
|
2014-04-06 19:50:11 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 6
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "catan_upward":
|
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 8
|
2014-04-06 19:50:11 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "catanh":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2013-04-30 13:51:02 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2006-01-30 22:29:44 +00:00
|
|
|
Function: Imaginary part of "catanh":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 1
|
2013-04-30 13:51:02 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 3
|
2006-01-30 22:29:44 +00:00
|
|
|
|
2014-04-06 19:50:11 +00:00
|
|
|
Function: Real part of "catanh_downward":
|
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 5
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "catanh_downward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 6
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "catanh_towardzero":
|
|
|
|
double: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 3
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "catanh_towardzero":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 7
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "catanh_upward":
|
|
|
|
double: 4
|
2017-02-21 13:18:18 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2014-04-06 19:50:11 +00:00
|
|
|
ldouble: 8
|
|
|
|
|
|
|
|
Function: Imaginary part of "catanh_upward":
|
2017-02-21 13:18:18 +00:00
|
|
|
double: 1
|
2014-04-06 19:50:11 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2017-02-21 13:18:18 +00:00
|
|
|
ldouble: 6
|
2014-04-06 19:50:11 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "cbrt":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 4
|
2013-12-02 11:16:42 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "cbrt_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "cbrt_towardzero":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: "cbrt_upward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 1
|
2017-11-17 20:20:42 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "ccos":
|
|
|
|
double: 1
|
2002-09-02 20:04:55 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ccos":
|
2012-05-19 15:46:20 +00:00
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "ccos_downward":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Imaginary part of "ccos_downward":
|
Add new exp and exp2 implementations
Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2. There are several variants (see e_exp_data.c);
they can be selected by modifying math_config.h, allowing different
tradeoffs.
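
The table-driven approach can be sketched as follows. This is a
minimal illustration under stated assumptions, not the code in
e_exp.c: the name exp_table_sketch, the 4-entry table and the degree-5
Taylor polynomial are all hypothetical choices made for brevity, the
input is assumed to satisfy |x| < 700, and no overflow, underflow or
errno handling is attempted. The idea is to write
exp(x) = 2^q * 2^(i/N) * exp(r), look up 2^(i/N) in a table, and
approximate exp(r) on a small interval around zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 4   /* illustrative only; real variants choose their own */

/* 2^(i/TABLE_SIZE) for i = 0 .. TABLE_SIZE-1.  */
static const double pow2_frac[TABLE_SIZE] = {
  1.0,
  1.189207115002721,    /* 2^(1/4) */
  1.4142135623730951,   /* 2^(2/4) */
  1.681792830507429     /* 2^(3/4) */
};

/* Hypothetical sketch: exp(x) = 2^q * 2^(i/N) * exp(r) for |x| < 700.  */
static double
exp_table_sketch (double x)
{
  const double ln2 = 0.6931471805599453;
  double z = x * (TABLE_SIZE / ln2);
  int k = (int) (z < 0 ? z - 0.5 : z + 0.5);        /* nearest integer */
  double r = x - k * (ln2 / TABLE_SIZE);            /* small remainder */
  int i = ((k % TABLE_SIZE) + TABLE_SIZE) % TABLE_SIZE;  /* table index */
  int q = (k - i) / TABLE_SIZE;                     /* integer power of 2 */

  /* Degree-5 Taylor polynomial for exp(r) in Horner form.  */
  double p = 1.0 + r * (1.0 + r * (0.5 + r * (1.0 / 6
             + r * (1.0 / 24 + r * (1.0 / 120)))));

  /* Build 2^q directly from its IEEE binary64 exponent field.  */
  uint64_t bits = (uint64_t) (q + 1023) << 52;
  double scale;
  memcpy (&scale, &bits, sizeof scale);

  return scale * pow2_frac[i] * p;
}
```

With this small table and a plain Taylor polynomial the result is only
accurate to roughly nine significant digits, far from the sub-ULP
bounds quoted below; a larger table shrinks the interval for r so that
a short, carefully fitted polynomial suffices.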
The default selection should be acceptable as generic libm code.
Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2. On
aarch64 the rodata size is 2160 bytes, shared between exp and exp2.
On aarch64 .text + .rodata size decreased by 24912 bytes.
The non-nearest rounding error is less than 1 ULP even on targets
without efficient round implementation (although the error rate is
higher in that case). Targets with single-instruction,
rounding-mode-independent rounding to nearest integer and conversion
can use those instructions by setting TOINT_INTRINSICS and adding the
necessary code to their math_private.h.
The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.
New double precision error handling code was added following the
style of the single precision error handling code.
Improvements on Cortex-A72 compared to current glibc master:
exp throughput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp throughput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 throughput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]
For small (< 1) inputs the current exp code uses a separate algorithm,
so the speedup there is smaller.
Tested on
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction),
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction),
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets;
only non-nearest rounding ulp errors increase and they are within
acceptable bounds (ulp updates are in separate patches).
* NEWS: Mention exp and exp2 improvements.
* math/Makefile (libm-support): Remove t_exp.
(type-double-routines): Add math_err and e_exp_data.
* sysdeps/aarch64/libm-test-ulps: Update.
* sysdeps/arm/libm-test-ulps: Update.
* sysdeps/i386/fpu/e_exp_data.c: New file.
* sysdeps/i386/fpu/math_err.c: New file.
* sysdeps/i386/fpu/t_exp.c: Remove.
* sysdeps/ia64/fpu/e_exp_data.c: New file.
* sysdeps/ia64/fpu/math_err.c: New file.
* sysdeps/ia64/fpu/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
* sysdeps/ieee754/dbl-64/math_config.h: New file.
* sysdeps/ieee754/dbl-64/math_err.c: New file.
* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Real part of "ccos_towardzero":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Imaginary part of "ccos_towardzero":
|
Add new exp and exp2 implementations
Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2. There are several variants, see e_exp_data.c,
they can be selected by modifying math_config.h allowing different
tradeoffs.
The default selection should be acceptable as generic libm code.
Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on
aarch64 the rodata size is 2160 bytes, shared between exp and exp2.
On aarch64 .text + .rodata size decreased by 24912 bytes.
The non-nearest rounding error is less than 1 ULP even on targets
without efficient round implementation (although the error rate is
higher in that case). Targets with single instruction, rounding mode
independent, to nearest integer rounding and conversion can use them
by setting TOINT_INTRINSICS and adding the necessary code to their
math_private.h.
The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.
New double precision error handling code was added following the
style of the single precision error handling code.
Improvements on Cortex-A72 compared to current glibc master:
exp throughput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp throughput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 throughput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]
For small (< 1) inputs the current exp code uses a separate algorithm,
so the speedup there is smaller.
Tested on the following targets:
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction),
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction),
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction).
Only non-nearest rounding ulp errors increase, and they are within
acceptable bounds (ulp updates are in separate patches).
* NEWS: Mention exp and exp2 improvements.
* math/Makefile (libm-support): Remove t_exp.
(type-double-routines): Add math_err and e_exp_data.
* sysdeps/aarch64/libm-test-ulps: Update.
* sysdeps/arm/libm-test-ulps: Update.
* sysdeps/i386/fpu/e_exp_data.c: New file.
* sysdeps/i386/fpu/math_err.c: New file.
* sysdeps/i386/fpu/t_exp.c: Remove.
* sysdeps/ia64/fpu/e_exp_data.c: New file.
* sysdeps/ia64/fpu/math_err.c: New file.
* sysdeps/ia64/fpu/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
* sysdeps/ieee754/dbl-64/math_config.h: New file.
* sysdeps/ieee754/dbl-64/math_err.c: New file.
* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Real part of "ccos_upward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: Imaginary part of "ccos_upward":
|
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 4
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "ccosh":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ccosh":
|
2012-05-19 15:46:20 +00:00
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "ccosh_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Imaginary part of "ccosh_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Real part of "ccosh_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Imaginary part of "ccosh_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Real part of "ccosh_upward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: Imaginary part of "ccosh_upward":
|
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 4
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "cexp":
|
2012-03-20 00:02:29 +00:00
|
|
|
double: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cexp":
|
2012-03-20 00:02:29 +00:00
|
|
|
double: 1
|
2012-03-23 22:53:53 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-03-20 00:02:29 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2015-07-24 13:21:56 +00:00
|
|
|
Function: Real part of "cexp_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 11
|
|
|
|
|
|
|
|
Function: Imaginary part of "cexp_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 11
|
|
|
|
|
|
|
|
Function: Real part of "cexp_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 11
|
|
|
|
|
|
|
|
Function: Imaginary part of "cexp_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 11
|
|
|
|
|
|
|
|
Function: Real part of "cexp_upward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: Imaginary part of "cexp_upward":
|
Add new exp and exp2 implementations
Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2. There are several variants (see e_exp_data.c);
they can be selected by modifying math_config.h, allowing different
tradeoffs.
The default selection should be acceptable as generic libm code.
The worst-case error is 0.509 ULP for exp and 0.507 ULP for exp2. On
aarch64 the rodata size is 2160 bytes, shared between exp and exp2,
and the .text + .rodata size decreased by 24912 bytes.
The error in non-nearest rounding modes is less than 1 ULP even on
targets without an efficient round implementation (although the error
rate is higher in that case). Targets that have a single instruction
for rounding to the nearest integer and for conversion, independent of
the rounding mode, can use those instructions by setting
TOINT_INTRINSICS and adding the necessary code to their math_private.h.
The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.
New double precision error handling code was added following the
style of the single precision error handling code.
Improvements on Cortex-A72 compared to current glibc master:
exp throughput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp throughput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 throughput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]
For small (< 1) inputs the current exp code uses a separate algorithm,
so the speedup there is smaller.
Tested on the
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction),
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction),
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets.
Only non-nearest rounding ulp errors increase, and they are within
acceptable bounds (ulp updates are in separate patches).
* NEWS: Mention exp and exp2 improvements.
* math/Makefile (libm-support): Remove t_exp.
(type-double-routines): Add math_err and e_exp_data.
* sysdeps/aarch64/libm-test-ulps: Update.
* sysdeps/arm/libm-test-ulps: Update.
* sysdeps/i386/fpu/e_exp_data.c: New file.
* sysdeps/i386/fpu/math_err.c: New file.
* sysdeps/i386/fpu/t_exp.c: Remove.
* sysdeps/ia64/fpu/e_exp_data.c: New file.
* sysdeps/ia64/fpu/math_err.c: New file.
* sysdeps/ia64/fpu/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
* sysdeps/ieee754/dbl-64/math_config.h: New file.
* sysdeps/ieee754/dbl-64/math_err.c: New file.
* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
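The table-based approach described in the commit message above can be illustrated with a minimal sketch. This is not the glibc code: the table size `N = 128`, the function name `exp2_table`, and the degree-5 Taylor polynomial are all assumptions chosen for illustration, and the sketch ignores overflow, underflow, NaN handling, and non-nearest rounding modes. The idea is to split x = k + i/N + r, look up 2^(i/N) in a table, approximate 2^r with a short polynomial, and scale by 2^k.

```python
import math

N = 128  # table size; an assumption for this sketch
TABLE = [2.0 ** (i / N) for i in range(N)]  # 2^(i/N) for 0 <= i < N


def exp2_table(x: float) -> float:
    """Approximate 2**x for moderate finite x (hypothetical helper)."""
    # Split x = k + i/N + r with integer k, 0 <= i < N, |r| <= 1/(2N).
    kd = round(x * N)      # nearest integer to x*N
    r = x - kd / N         # reduced argument, tiny in magnitude
    k, i = divmod(kd, N)   # kd = k*N + i, with i in [0, N)

    # 2^r = exp(r * ln 2); a degree-5 Taylor polynomial is enough
    # because |r * ln 2| is below about 0.003.
    t = r * math.log(2.0)
    poly = 1.0 + t * (1.0 + t * (0.5 + t * (1 / 6 + t * (1 / 24 + t / 120))))

    # Combine table value and polynomial, then scale by 2^k.
    return math.ldexp(TABLE[i] * poly, k)
```

A larger table shrinks the reduced argument and allows a shorter polynomial at the cost of more read-only data, which is the kind of tradeoff the commit message mentions.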
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
2002-09-02 20:04:55 +00:00
|
|
|
Function: Real part of "clog":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2006-01-30 22:29:44 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 5
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: Imaginary part of "clog":
|
1999-12-21 16:15:04 +00:00
|
|
|
double: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-10-11 08:22:46 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: Real part of "clog10":
|
2014-04-06 19:50:11 +00:00
|
|
|
double: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 3
|
2014-04-06 19:50:11 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: Imaginary part of "clog10":
|
2016-10-05 11:57:47 +00:00
|
|
|
double: 2
|
2014-04-06 19:50:11 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
|
|
|
Function: Real part of "clog10_downward":
|
|
|
|
double: 6
|
|
|
|
float: 6
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 10
|
|
|
|
|
|
|
|
Function: Imaginary part of "clog10_downward":
|
|
|
|
double: 2
|
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-04-06 19:50:11 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: Real part of "clog10_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 5
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 9
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "clog10_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 8
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "clog10_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 8
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 10
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "clog10_upward":
|
|
|
|
double: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 7
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "clog_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 7
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 11
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "clog_downward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 5
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Real part of "clog_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 7
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 10
|
2014-04-06 19:50:11 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "clog_towardzero":
|
|
|
|
double: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 7
|
2014-04-06 19:50:11 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: Real part of "clog_upward":
|
|
|
|
double: 8
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 10
|
|
|
|
|
2014-04-06 19:50:11 +00:00
|
|
|
Function: Imaginary part of "clog_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 4
|
2014-04-06 19:50:11 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "cos":
|
2018-04-04 22:17:13 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 2
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 4
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2012-03-03 13:20:24 +00:00
|
|
|
Function: "cos_downward":
|
2013-12-02 11:16:42 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 5
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "cos_towardzero":
|
2013-12-02 11:16:42 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 4
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "cos_upward":
|
2013-12-02 11:16:42 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 5
|
2012-03-03 13:20:24 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "cosh":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 2
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 2
|
2021-03-01 20:07:27 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 3
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2012-03-05 19:20:15 +00:00
|
|
|
Function: "cosh_downward":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 3
|
2012-03-05 19:20:15 +00:00
|
|
|
float: 1
|
2021-03-01 20:07:27 +00:00
|
|
|
float128: 3
|
2020-12-22 18:20:56 +00:00
|
|
|
ldouble: 6
|
2012-03-05 19:20:15 +00:00
|
|
|
|
|
|
|
Function: "cosh_towardzero":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 3
|
2012-03-05 19:20:15 +00:00
|
|
|
float: 1
|
2021-03-01 20:07:27 +00:00
|
|
|
float128: 3
|
2020-12-22 18:20:56 +00:00
|
|
|
ldouble: 6
|
2012-03-05 19:20:15 +00:00
|
|
|
|
|
|
|
Function: "cosh_upward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 2
|
2013-12-23 13:40:10 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2012-03-05 19:20:15 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "cpow":
|
2002-09-02 20:04:55 +00:00
|
|
|
double: 2
|
2005-06-18 02:04:15 +00:00
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2012-10-11 08:22:46 +00:00
|
|
|
ldouble: 4
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "cpow":
|
|
|
|
float: 2
|
2017-08-08 21:52:03 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
2014-06-25 14:57:39 +00:00
|
|
|
Function: Real part of "cpow_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 5
|
2014-06-25 14:57:39 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 6
|
2014-06-25 14:57:39 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: Imaginary part of "cpow_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2014-06-25 14:57:39 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-06-25 14:57:39 +00:00
|
|
|
ldouble: 4
|
|
|
|
|
|
|
|
Function: Real part of "cpow_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 5
|
2014-06-25 14:57:39 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 6
|
2014-06-25 14:57:39 +00:00
|
|
|
ldouble: 8
|
|
|
|
|
|
|
|
Function: Imaginary part of "cpow_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2014-06-25 14:57:39 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-06-25 14:57:39 +00:00
|
|
|
ldouble: 4
|
|
|
|
|
|
|
|
Function: Real part of "cpow_upward":
|
|
|
|
double: 4
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-06-25 14:57:39 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: Imaginary part of "cpow_upward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-06-25 14:57:39 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: Real part of "csin":
|
2012-05-19 15:46:20 +00:00
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2012-05-19 15:46:20 +00:00
|
|
|
Function: Imaginary part of "csin":
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
ldouble: 1
|
2012-05-19 15:46:20 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "csin_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csin_downward":
|
|
|
|
double: 1
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "csin_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csin_towardzero":
|
2015-07-24 13:21:56 +00:00
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: Real part of "csin_upward":
|
2014-03-25 15:13:53 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csin_upward":
|
|
|
|
double: 1
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "csinh":
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csinh":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "csinh_downward":
|
2015-07-24 13:21:56 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csinh_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "csinh_towardzero":
|
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csinh_towardzero":
|
Add new exp and exp2 implementations

Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2.  There are several variants (see e_exp_data.c);
they can be selected by modifying math_config.h, allowing different
tradeoffs.

The default selection should be acceptable as generic libm code.
The worst-case error is 0.509 ULP for exp and 0.507 ULP for exp2.  On
aarch64 the rodata size is 2160 bytes, shared between exp and exp2,
and the .text + .rodata size decreased by 24912 bytes.

The non-nearest rounding error is less than 1 ULP even on targets
without an efficient round implementation (although the error rate is
higher in that case).  Targets with single-instruction, rounding-mode
independent, to-nearest integer rounding and conversion can use them
by setting TOINT_INTRINSICS and adding the necessary code to their
math_private.h.

The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.

New double precision error handling code was added following the
style of the single precision error handling code.

Improvements on Cortex-A72 compared to current glibc master:
exp throughput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp throughput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 throughput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]

For small (< 1) inputs the current exp code uses a separate algorithm,
so the speedup there is smaller.

Tested on the
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction),
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction),
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets;
only non-nearest rounding ulp errors increase and they are within
acceptable bounds (ulp updates are in separate patches).
* NEWS: Mention exp and exp2 improvements.
* math/Makefile (libm-support): Remove t_exp.
(type-double-routines): Add math_err and e_exp_data.
* sysdeps/aarch64/libm-test-ulps: Update.
* sysdeps/arm/libm-test-ulps: Update.
* sysdeps/i386/fpu/e_exp_data.c: New file.
* sysdeps/i386/fpu/math_err.c: New file.
* sysdeps/i386/fpu/t_exp.c: Remove.
* sysdeps/ia64/fpu/e_exp_data.c: New file.
* sysdeps/ia64/fpu/math_err.c: New file.
* sysdeps/ia64/fpu/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
* sysdeps/ieee754/dbl-64/math_config.h: New file.
* sysdeps/ieee754/dbl-64/math_err.c: New file.
* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
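The table-driven approach named in the message above can be sketched
roughly as follows.  This is only an illustration, not the code from
e_exp.c or e_exp2.c: the table size N = 128, the degree-5 Taylor
polynomial and the name exp2_sketch are all assumptions made for the
example, and special cases (NaN, overflow, underflow) are left out.

```python
import math

# Hypothetical table size; a power of two so that k / N is exact.
N = 128
# TABLE[i] approximates 2^(i/N), the "fractional powers of 2".
TABLE = [2.0 ** (i / N) for i in range(N)]
LN2 = math.log(2.0)


def exp2_sketch(x):
    """Approximate 2**x as 2^e * 2^(i/N) * 2^r with |r| <= 1/(2N).

    No handling of NaN, infinities, overflow or underflow.
    """
    k = round(x * N)      # nearest multiple of 1/N, as an integer
    r = x - k / N         # small remainder
    e, i = divmod(k, N)   # k = e*N + i with 0 <= i < N (also for k < 0)
    t = r * LN2           # 2^r = exp(t), |t| is below about 0.0028
    # Degree-5 Taylor polynomial for exp(t), evaluated by Horner's rule.
    poly = 1.0 + t * (1.0 + t * (0.5 + t * (1 / 6 + t * (1 / 24 + t / 120))))
    return math.ldexp(TABLE[i] * poly, e)


if __name__ == "__main__":
    for x in (0.0, 0.3, -7.77, 9.9):
        print(x, exp2_sketch(x), 2.0 ** x)
```

The sketch only shows how the work is split between a table lookup, a
short polynomial and an exponent adjustment.  It makes no attempt to
reach the 0.507 ULP bound quoted above.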
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "csinh_upward":
|
|
|
|
double: 1
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: Imaginary part of "csinh_upward":
|
|
|
|
double: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "csqrt":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2012-03-15 00:05:14 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csqrt":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2012-03-15 00:05:14 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: Real part of "csqrt_downward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 5
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csqrt_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 4
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "csqrt_towardzero":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csqrt_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 4
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Real part of "csqrt_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 5
|
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 12
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "csqrt_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 8
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "ctan":
|
2006-01-30 22:29:44 +00:00
|
|
|
double: 1
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 3
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ctan":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2012-07-11 12:19:27 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
|
|
|
Function: Real part of "ctan_downward":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 6
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ctan_downward":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2013-12-23 13:40:10 +00:00
|
|
|
ldouble: 9
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Real part of "ctan_towardzero":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 5
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ctan_towardzero":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 13
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Real part of "ctan_upward":
|
|
|
|
double: 2
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 7
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ctan_upward":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2012-07-11 12:19:27 +00:00
|
|
|
ldouble: 10
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: Real part of "ctanh":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 3
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ctanh":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
2012-05-31 22:51:03 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 3
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2012-07-11 12:19:27 +00:00
|
|
|
Function: Real part of "ctanh_downward":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 4
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2013-12-23 13:40:10 +00:00
|
|
|
ldouble: 9
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Imaginary part of "ctanh_downward":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 6
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
2012-07-11 12:19:27 +00:00
|
|
|
|
|
|
|
Function: Real part of "ctanh_towardzero":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2012-07-11 12:19:27 +00:00
|
|
|
ldouble: 13
|
|
|
|
|
|
|
|
Function: Imaginary part of "ctanh_towardzero":
|
2013-12-23 13:40:10 +00:00
|
|
|
double: 5
|
2012-07-11 12:19:27 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 10
|
2013-12-23 13:40:10 +00:00
|
|
|
|
|
|
|
Function: Real part of "ctanh_upward":
|
|
|
|
double: 2
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2012-07-11 12:19:27 +00:00
|
|
|
ldouble: 10
|
|
|
|
|
|
|
|
Function: Imaginary part of "ctanh_upward":
|
|
|
|
double: 2
|
2013-12-23 13:40:10 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2013-12-23 13:40:10 +00:00
|
|
|
ldouble: 10
|
2012-07-11 12:19:27 +00:00
|
|
|
|
Add narrowing divide functions.
This patch adds the narrowing divide functions from TS 18661-1 to
glibc's libm: fdiv, fdivl, ddivl, f32divf64, f32divf32x, f32xdivf64
for all configurations; f32divf64x, f32divf128, f64divf64x,
f64divf128, f32xdivf64x, f32xdivf128, f64xdivf128 for configurations
with _Float64x and _Float128; __nldbl_ddivl for ldbl-opt.
The changes are essentially the same as for the other narrowing
functions, so the description of those generally applies to this patch
as well.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
* math/Makefile (libm-narrow-fns): Add div.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing divide functions.
* math/bits/mathcalls-narrow.h (div): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add div.
* math/math-narrow.h (CHECK_NARROW_DIV): New macro.
(NARROW_DIV_ROUND_TO_ODD): Likewise.
(NARROW_DIV_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fdivl): New
macro.
(__ddivl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fdiv and
ddiv.
(CFLAGS-nldbl-ddiv.c): New variable.
(CFLAGS-nldbl-fdiv.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_ddivl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_ddivl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fdiv, fdivl,
ddivl, fMdivfN, fMdivfNx, fMxdivfN and fMxdivfNx.
* math/auto-libm-test-in: Add tests of div.
* math/auto-libm-test-out-narrow-div: New generated file.
* math/libm-test-narrow-div.inc: New file.
* sysdeps/i386/fpu/s_f32xdivf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xdivf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fdiv.c: Likewise.
* sysdeps/ieee754/float128/s_f32divf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64divf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xdivf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xdivf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-ddiv.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fdiv.c: Likewise.
* sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fdiv.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
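A narrowing divide such as fdiv takes double arguments and returns the
quotient rounded once to float, which can differ from dividing in
double and then converting.  The sketch below illustrates the
difference with exact rational arithmetic.  It is not the round-to-odd
technique that the NARROW_DIV_ROUND_TO_ODD macro name suggests glibc
uses; the names fdiv_sketch and to_f32 are hypothetical, and the sketch
covers only finite, nonzero-divisor inputs with results in the normal
binary32 range.

```python
import math
import struct
from fractions import Fraction


def to_f32(x):
    """Round a double to the nearest binary32 value (ties to even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fdiv_sketch(x, y):
    """Divide two doubles and round the exact quotient once to binary32.

    Assumes finite inputs, y != 0, and a result in the normal binary32
    range; subnormals, overflow and exceptions are not handled.
    """
    q = Fraction(x) / Fraction(y)  # exact quotient
    if q == 0:
        return 0.0
    sign = -1.0 if q < 0 else 1.0
    a = abs(q)
    # Find e such that 2^e <= a < 2^(e+1).
    e = a.numerator.bit_length() - a.denominator.bit_length()
    if Fraction(2) ** e > a:
        e -= 1
    # Scale to an integer significand in [2^23, 2^24]; round() on a
    # Fraction rounds half to even.
    n = round(a / Fraction(2) ** (e - 23))
    return sign * math.ldexp(n, e - 23)


if __name__ == "__main__":
    # The exact quotient lies just above the midpoint 1 + 2^-24.
    x = 1 + 2**-24 + 2**-28
    y = 1 + 2**-28 - 2**-52
    print(fdiv_sketch(x, y))  # rounded once
    print(to_f32(x / y))      # rounded twice: double, then binary32
```

For this pair of inputs the double quotient lands exactly on the
midpoint between two binary32 values, so the second rounding goes to
the even neighbour 1.0, while a single rounding gives 1 + 2^-23.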
2018-05-17 00:40:52 +00:00
|
|
|
Function: "div_ldouble":
|
|
|
|
float: 1
|
|
|
|
|
2021-09-22 12:35:44 +00:00
|
|
|
Function: "div_towardzero_ldouble":
|
|
|
|
double: 1
|
|
|
|
|
2002-09-02 20:04:55 +00:00
|
|
|
Function: "erf":
|
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
2002-09-02 20:04:55 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "erf_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "erf_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "erf_upward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "erfc":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 4
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 3
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "erfc_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 10
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "erfc_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 11
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "erfc_upward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 7
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "exp":
|
2016-01-28 07:26:59 +00:00
|
|
|
double: 1
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: "exp10":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2012-05-21 17:24:12 +00:00
|
|
|
ldouble: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2013-12-23 13:40:10 +00:00
|
|
|
Function: "exp10_downward":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-06-25 14:57:39 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 9
|
2013-12-23 13:40:10 +00:00
|
|
|
|
|
|
|
Function: "exp10_towardzero":
|
2018-02-12 18:16:03 +00:00
|
|
|
double: 3
|
2014-06-25 14:57:39 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 9
|
2013-12-23 13:40:10 +00:00
|
|
|
|
|
|
|
Function: "exp10_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2013-12-23 13:40:10 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 4
|
2013-12-23 13:40:10 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "exp2":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "exp2_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "exp2_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "exp2_upward":
|
2013-12-04 12:04:48 +00:00
|
|
|
double: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2012-03-03 13:20:24 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "exp_downward":
|
|
|
|
double: 1
|
2017-09-26 19:13:33 +00:00
|
|
|
float: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "exp_towardzero":
|
2013-12-04 12:04:48 +00:00
|
|
|
double: 1
|
2017-09-26 19:13:33 +00:00
|
|
|
float: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "exp_upward":
|
2013-12-04 12:04:48 +00:00
|
|
|
double: 1
|
2016-10-05 11:57:47 +00:00
|
|
|
float: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "expm1":
|
2002-09-02 20:04:55 +00:00
|
|
|
double: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
float: 1
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 2
|
2012-05-31 22:51:03 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2013-12-19 19:44:16 +00:00
|
|
|
Function: "expm1_downward":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 7
|
2013-12-19 19:44:16 +00:00
|
|
|
|
|
|
|
Function: "expm1_towardzero":
|
|
|
|
double: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 6
|
2013-12-19 19:44:16 +00:00
|
|
|
|
|
|
|
Function: "expm1_upward":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 6
|
2013-12-19 19:44:16 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "fma":
|
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "fma_downward":
|
|
|
|
ldouble: 1
|
|
|
|
|
Add narrowing fma functions
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as follows: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-22 21:25:31 +00:00
|
|
|
Function: "fma_downward_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
|
|
|
Function: "fma_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "fma_towardzero":
|
|
|
|
ldouble: 2
|
|
|
|
|
2021-09-22 21:25:31 +00:00
|
|
|
Function: "fma_towardzero_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "fma_upward":
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2021-09-22 21:25:31 +00:00
|
|
|
Function: "fma_upward_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
2015-12-22 13:06:36 +00:00
|
|
|
Function: "fmod":
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2015-12-22 13:06:36 +00:00
|
|
|
Function: "fmod_downward":
|
|
|
|
ldouble: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2015-12-22 13:06:36 +00:00
|
|
|
Function: "fmod_towardzero":
|
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "fmod_upward":
|
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "gamma":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 4
|
2019-07-11 14:48:28 +00:00
|
|
|
float128: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
2015-12-22 13:06:36 +00:00
|
|
|
Function: "gamma_downward":
|
|
|
|
double: 4
|
|
|
|
float: 4
|
2019-07-11 14:48:28 +00:00
|
|
|
float128: 8
|
2017-01-13 11:33:42 +00:00
|
|
|
ldouble: 15
|
2015-12-22 13:06:36 +00:00
|
|
|
|
|
|
|
Function: "gamma_towardzero":
|
|
|
|
double: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
float: 3
|
2019-07-11 14:48:28 +00:00
|
|
|
float128: 5
|
2017-01-13 11:33:42 +00:00
|
|
|
ldouble: 16
|
2015-12-22 13:06:36 +00:00
|
|
|
|
|
|
|
Function: "gamma_upward":
|
|
|
|
double: 4
|
|
|
|
float: 5
|
2019-07-11 14:48:28 +00:00
|
|
|
float128: 8
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 11
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "hypot":
|
2012-03-03 13:20:24 +00:00
|
|
|
double: 1
|
2023-02-23 17:23:39 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "hypot_downward":
|
|
|
|
double: 1
|
2023-02-23 17:23:39 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
|
|
|
Function: "hypot_towardzero":
|
|
|
|
double: 1
|
2023-02-23 17:23:39 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
|
|
|
Function: "hypot_upward":
|
|
|
|
double: 1
|
2023-02-23 17:23:39 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "j0":
|
Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]
For j0f/j1f/y0f/y1f, the largest error for all binary32
inputs is reduced to at most 9 ulps for all rounding modes.
The new code is enabled only when there is a cancellation at the very end of
the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not
give any visible slowdown on average. Two different algorithms are used:
* around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of
degree 3 are used, computed using the Sollya tool (https://www.sollya.org/)
* for large inputs, an asymptotic formula from [1] is used
[1] Fast and Accurate Bessel Function Computation,
John Harrison, Proceedings of Arith 19, 2009.
Inputs yielding the new largest errors are added to auto-libm-test-in,
and ulps are regenerated for various targets (thanks Adhemerval Zanella).
Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-04-01 06:14:10 +00:00
|
|
|
double: 3
|
|
|
|
float: 9
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 7
|
2021-04-01 06:14:10 +00:00
|
|
|
ldouble: 5
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "j0_downward":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 6
|
|
|
|
float: 9
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 9
|
2020-09-10 18:06:34 +00:00
|
|
|
ldouble: 12
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "j0_towardzero":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 7
|
|
|
|
float: 9
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 9
|
2020-09-10 18:06:34 +00:00
|
|
|
ldouble: 16
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "j0_upward":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 9
|
2021-10-06 13:49:28 +00:00
|
|
|
float: 9
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 7
|
2021-04-01 06:14:10 +00:00
|
|
|
ldouble: 14
|
2014-03-25 15:13:53 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "j1":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 4
|
|
|
|
float: 9
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2021-04-01 06:14:10 +00:00
|
|
|
ldouble: 6
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "j1_downward":
|
|
|
|
double: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: "j1_towardzero":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 4
|
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: "j1_upward":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 9
|
|
|
|
float: 9
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "jn":
|
2011-09-27 14:40:18 +00:00
|
|
|
double: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 7
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 4
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2015-07-24 13:21:56 +00:00
|
|
|
Function: "jn_downward":
|
|
|
|
double: 4
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 8
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: "jn_towardzero":
|
|
|
|
double: 4
|
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 8
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: "jn_upward":
|
|
|
|
double: 5
|
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 7
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 5
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "lgamma":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 4
|
2017-08-08 21:52:03 +00:00
|
|
|
float128: 5
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
|
|
|
Function: "lgamma_downward":
|
|
|
|
double: 4
|
|
|
|
float: 4
|
2017-08-08 21:52:03 +00:00
|
|
|
float128: 8
|
2017-01-13 11:33:42 +00:00
|
|
|
ldouble: 15
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "lgamma_towardzero":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 3
|
2017-08-08 21:52:03 +00:00
|
|
|
float128: 5
|
2017-01-13 11:33:42 +00:00
|
|
|
ldouble: 16
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "lgamma_upward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 4
|
|
|
|
float: 5
|
2017-08-08 21:52:03 +00:00
|
|
|
float128: 8
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 11
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "log":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 1
|
2013-12-05 17:20:06 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-01-08 13:45:54 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
|
|
|
Function: "log10":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
2002-09-02 20:04:55 +00:00
|
|
|
float: 2
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-05-26 17:40:08 +00:00
|
|
|
Function: "log10_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-05-26 17:40:08 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "log10_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2014-05-26 17:40:08 +00:00
|
|
|
|
|
|
|
Function: "log10_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 2
|
2014-05-26 17:40:08 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "log1p":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
float: 1
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "log1p_downward":
|
|
|
|
double: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "log1p_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "log1p_upward":
|
2014-06-12 02:22:49 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 3
|
2014-06-12 02:22:49 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "log2":
|
2014-06-12 02:22:49 +00:00
|
|
|
double: 1
|
|
|
|
float: 1
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 3
|
2014-06-12 02:22:49 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "log2_downward":
|
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
|
|
|
Function: "log2_towardzero":
|
2014-06-12 02:22:49 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 5
|
2014-06-12 02:22:49 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "log2_upward":
|
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "log_downward":
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "log_towardzero":
|
2017-01-13 11:33:42 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "log_upward":
|
2015-12-22 13:06:36 +00:00
|
|
|
double: 1
|
2017-01-13 11:33:42 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
Add narrowing multiply functions.
This patch adds the narrowing multiply functions from TS 18661-1 to
glibc's libm: fmul, fmull, dmull, f32mulf64, f32mulf32x, f32xmulf64
for all configurations; f32mulf64x, f32mulf128, f64mulf64x,
f64mulf128, f32xmulf64x, f32xmulf128, f64xmulf128 for configurations
with _Float64x and _Float128; __nldbl_dmull for ldbl-opt.
The changes are mostly the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well. f32xmulf64 for i386 cannot use precision control as used for
add and subtract, because that would result in double rounding for
subnormal results, so that uses round-to-odd with long double
intermediate result instead. The soft-fp support involves adding a
new FP_TRUNC_COOKED since soft-fp multiplication uses cooked inputs
and outputs.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
* math/Makefile (libm-narrow-fns): Add mul.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing multiply functions.
* math/bits/mathcalls-narrow.h (mul): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add mul.
* math/math-narrow.h (CHECK_NARROW_MUL): New macro.
(NARROW_MUL_ROUND_TO_ODD): Likewise.
(NARROW_MUL_TRIVIAL): Likewise.
* soft-fp/op-common.h (FP_TRUNC_COOKED): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fmull): New
macro.
(__dmull): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fmul and
dmul.
(CFLAGS-nldbl-dmul.c): New variable.
(CFLAGS-nldbl-fmul.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dmull.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dmull): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fmul, fmull,
dmull, fMmulfN, fMmulfNx, fMxmulfN and fMxmulfNx.
* math/auto-libm-test-in: Add tests of mul.
* math/auto-libm-test-out-narrow-mul: New generated file.
* math/libm-test-narrow-mul.inc: New file.
* sysdeps/i386/fpu/s_f32xmulf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xmulf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fmul.c: Likewise.
* sysdeps/ieee754/float128/s_f32mulf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64mulf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xmulf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xmulf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dmul.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fmul.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dmull.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fmul.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fmull.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-05-16 00:05:28 +00:00
|
|
|
Function: "mul_downward_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
|
|
|
Function: "mul_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
|
|
|
Function: "mul_towardzero_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
|
|
|
Function: "mul_upward_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
2016-01-28 07:26:59 +00:00
|
|
|
Function: "nextafter_downward":
|
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "nextafter_upward":
|
|
|
|
ldouble: 1
|
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "pow":
|
2018-02-12 14:10:21 +00:00
|
|
|
double: 1
|
2012-04-24 19:21:45 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
2012-03-05 19:20:15 +00:00
|
|
|
Function: "pow_downward":
|
2014-06-25 14:57:39 +00:00
|
|
|
double: 1
|
2012-03-05 19:20:15 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2012-03-05 19:20:15 +00:00
|
|
|
ldouble: 1
|
2013-12-17 16:23:00 +00:00
|
|
|
|
2012-03-05 19:20:15 +00:00
|
|
|
Function: "pow_towardzero":
|
2014-06-25 14:57:39 +00:00
|
|
|
double: 1
|
2012-03-05 19:20:15 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2012-03-05 19:20:15 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "pow_upward":
|
2014-06-25 14:57:39 +00:00
|
|
|
double: 1
|
2012-03-05 19:20:15 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2012-03-05 19:20:15 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "sin":
|
2018-04-04 22:17:13 +00:00
|
|
|
double: 1
|
2013-12-05 17:20:06 +00:00
|
|
|
float: 1
|
2021-03-03 17:39:17 +00:00
|
|
|
float128: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2012-03-03 13:20:24 +00:00
|
|
|
Function: "sin_downward":
|
2013-12-05 17:20:06 +00:00
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 4
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "sin_towardzero":
|
2013-12-05 17:20:06 +00:00
|
|
|
double: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2021-03-01 20:07:27 +00:00
|
|
|
ldouble: 5
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "sin_upward":
|
2013-12-05 17:20:06 +00:00
|
|
|
double: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 5
|
2012-03-03 13:20:24 +00:00
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "sincos":
|
2018-04-04 22:17:13 +00:00
|
|
|
double: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2013-07-04 12:14:44 +00:00
|
|
|
ldouble: 1
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "sincos_downward":
|
|
|
|
double: 1
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 4
|
|
|
|
|
|
|
|
Function: "sincos_towardzero":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 7
|
|
|
|
|
|
|
|
Function: "sincos_upward":
|
|
|
|
double: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-03-09 17:26:48 +00:00
|
|
|
ldouble: 7
|
2014-03-25 15:13:53 +00:00
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "sinh":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
2012-03-05 19:20:15 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "sinh_downward":
|
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 6
|
2012-03-05 19:20:15 +00:00
|
|
|
|
|
|
|
Function: "sinh_towardzero":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 3
|
2015-05-29 12:40:33 +00:00
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 6
|
2012-03-05 19:20:15 +00:00
|
|
|
|
|
|
|
Function: "sinh_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 6
|
2012-03-05 19:20:15 +00:00
|
|
|
|
2012-03-03 13:20:24 +00:00
|
|
|
Function: "sqrt":
|
2013-12-02 11:16:42 +00:00
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "sqrt_downward":
|
|
|
|
ldouble: 1
|
|
|
|
|
Add narrowing square root functions
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well. However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div. Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.
The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.
I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.
Tested as follows: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-10 20:56:22 +00:00
|
|
|
Function: "sqrt_ldouble":
|
|
|
|
double: 1
|
|
|
|
|
2013-12-02 11:16:42 +00:00
|
|
|
Function: "sqrt_towardzero":
|
|
|
|
ldouble: 1
|
|
|
|
|
|
|
|
Function: "sqrt_upward":
|
|
|
|
ldouble: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
|
Add narrowing subtract functions.
This patch adds the narrowing subtract functions from TS 18661-1 to
glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64
for all configurations; f32subf64x, f32subf128, f64subf64x,
f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations
with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt.
The changes are essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
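
A minimal sketch of why a narrowing subtract is not the same as subtracting in double and then converting to float. It is written in Python and simulates the arithmetic; it does not call glibc. The helper names (to_binary32, round_exact_to_binary32) and the chosen inputs are illustrative assumptions, and the exact-rounding helper only handles positive values in the normal range.

```python
import struct
from fractions import Fraction


def to_binary32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_exact_to_binary32(q: Fraction) -> float:
    """Round an exact positive rational to binary32 in a single step.

    Simplification for illustration: normal range only, no overflow
    or subnormal handling.
    """
    e = 0
    while q >= Fraction(2) ** (e + 1):
        e += 1
    while q < Fraction(2) ** e:
        e -= 1
    scale = Fraction(2) ** (23 - e)   # binary32 has a 24-bit significand
    n = round(q * scale)              # Fraction rounds half to even
    return float(Fraction(n) / scale)


# Hypothetical inputs: the exact difference is 1 + 2^-24 + 2^-64, which
# lies just above the midpoint between 1 and the next binary32 value.
a = 1.0 + 2.0 ** -24
b = -(2.0 ** -64)

# Two roundings: binary64 drops the 2^-64 term, leaving an exact tie that
# then rounds to even in binary32.
naive = to_binary32(a - b)

# One rounding of the exact difference, which is what a narrowing
# subtract is meant to deliver.
single = round_exact_to_binary32(Fraction(a) - Fraction(b))

print(naive, single)
```

With these inputs the two-step result is 1.0 while the single rounding gives 1 + 2^-23, so the two approaches differ by one float ulp.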
* math/Makefile (libm-narrow-fns): Add sub.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing subtract functions.
* math/bits/mathcalls-narrow.h (sub): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add sub.
* math/math-narrow.h (CHECK_NARROW_SUB): New macro.
(NARROW_SUB_ROUND_TO_ODD): Likewise.
(NARROW_SUB_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fsubl): New
macro.
(__dsubl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fsub and
dsub.
(CFLAGS-nldbl-dsub.c): New variable.
(CFLAGS-nldbl-fsub.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dsubl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dsubl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fsub, fsubl,
dsubl, fMsubfN, fMsubfNx, fMxsubfN and fMxsubfNx.
* math/auto-libm-test-in: Add tests of sub.
* math/auto-libm-test-out-narrow-sub: New generated file.
* math/libm-test-narrow-sub.inc: New file.
* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fsub.c: Likewise.
* sysdeps/ieee754/float128/s_f32subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dsub.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
2018-03-20 00:34:52 +00:00
|
|
|
Function: "sub_ldouble":
|
|
|
|
double: 1
|
|
|
|
float: 1
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "tan":
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-10-11 08:22:46 +00:00
|
|
|
ldouble: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
|
2012-03-03 13:20:24 +00:00
|
|
|
Function: "tan_downward":
|
2013-12-05 17:20:06 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 3
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "tan_towardzero":
|
2013-12-05 17:20:06 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 2
|
2012-03-03 13:20:24 +00:00
|
|
|
|
|
|
|
Function: "tan_upward":
|
2013-12-05 17:20:06 +00:00
|
|
|
double: 1
|
2017-05-02 16:38:29 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 1
|
2012-03-03 13:20:24 +00:00
|
|
|
ldouble: 3
|
|
|
|
|
2006-01-28 00:15:15 +00:00
|
|
|
Function: "tanh":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 2
|
2006-01-28 00:15:15 +00:00
|
|
|
ldouble: 1
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "tanh_downward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "tanh_towardzero":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 2
|
|
|
|
float: 2
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2016-10-05 11:57:47 +00:00
|
|
|
ldouble: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "tanh_upward":
|
2015-05-29 12:40:33 +00:00
|
|
|
double: 3
|
2013-05-08 20:06:56 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2015-12-22 13:06:36 +00:00
|
|
|
ldouble: 6
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2015-05-29 12:40:33 +00:00
|
|
|
Function: "tgamma":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 9
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 5
|
|
|
|
|
|
|
|
Function: "tgamma_downward":
|
2021-04-09 20:41:22 +00:00
|
|
|
double: 9
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 7
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 6
|
|
|
|
|
|
|
|
Function: "tgamma_towardzero":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 9
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 7
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2015-07-24 13:21:56 +00:00
|
|
|
ldouble: 5
|
|
|
|
|
|
|
|
Function: "tgamma_upward":
|
2020-12-22 18:20:56 +00:00
|
|
|
double: 9
|
2020-04-07 14:41:29 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 4
|
2015-05-29 12:40:33 +00:00
|
|
|
ldouble: 5
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "y0":
|
|
|
|
double: 2
|
Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]
For j0f/j1f/y0f/y1f, the largest error for all binary32
inputs is reduced to at most 9 ulps for all rounding modes.
The new code is enabled only when there is a cancellation at the very end of
the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not
give any visible slowdown on average. Two different algorithms are used:
* around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of
degree 3 are used, computed using the Sollya tool (https://www.sollya.org/)
* for large inputs, an asymptotic formula from [1] is used
[1] Fast and Accurate Bessel Function Computation,
John Harrison, Proceedings of Arith 19, 2009.
Inputs yielding the new largest errors are added to auto-libm-test-in,
and ulps are regenerated for various targets (thanks Adhemerval Zanella).
Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
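
As a rough illustration of what an error bound such as "at most 9 ulps" measures, the following Python sketch computes the distance between a computed value and an exact reference in units of the last place. It works on binary64 for simplicity, whereas the message above concerns binary32, and the function name ulp_error is an assumption; the glibc test harness may define the unit differently near powers of two.

```python
import math
from fractions import Fraction


def ulp_error(computed: float, exact: Fraction) -> Fraction:
    """Return |computed - exact| in binary64 ulps, measured at `exact`."""
    unit = Fraction(math.ulp(float(exact)))
    return abs(Fraction(computed) - exact) / unit


# A correctly rounded value is within half an ulp of the exact result.
third = Fraction(1, 3)
print(float(ulp_error(1 / 3, third)))

# Moving one representable step away from 1.0 gives an error of exactly 1.
print(ulp_error(1.0 + math.ulp(1.0), Fraction(1)))
```

The figures recorded in this file are upper bounds of this kind, one per function, rounding mode and floating-point type.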
2021-04-01 06:14:10 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
ldouble: 10
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "y0_downward":
|
|
|
|
double: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
float: 8
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 7
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 10
|
|
|
|
|
|
|
|
Function: "y0_towardzero":
|
|
|
|
double: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
float: 8
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
ldouble: 9
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "y0_upward":
|
|
|
|
double: 2
|
2021-04-01 06:14:10 +00:00
|
|
|
float: 8
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 4
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 9
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "y1":
|
|
|
|
double: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
float: 9
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 5
|
2006-01-31 21:32:11 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-03-25 15:13:53 +00:00
|
|
|
Function: "y1_downward":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 6
|
|
|
|
float: 8
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 5
|
2021-04-01 06:14:10 +00:00
|
|
|
ldouble: 11
|
2014-03-25 15:13:53 +00:00
|
|
|
|
|
|
|
Function: "y1_towardzero":
|
|
|
|
double: 3
|
2021-04-01 06:14:10 +00:00
|
|
|
float: 9
|
2021-04-09 20:41:22 +00:00
|
|
|
float128: 3
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 9
|
|
|
|
|
|
|
|
Function: "y1_upward":
|
2021-04-01 06:14:10 +00:00
|
|
|
double: 6
|
|
|
|
float: 9
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2014-03-25 15:13:53 +00:00
|
|
|
ldouble: 9
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
Function: "yn":
|
|
|
|
double: 3
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2006-01-31 21:32:11 +00:00
|
|
|
ldouble: 2
|
1999-12-21 16:15:04 +00:00
|
|
|
|
2014-06-30 21:38:43 +00:00
|
|
|
Function: "yn_downward":
|
|
|
|
double: 3
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 4
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2014-06-30 21:38:43 +00:00
|
|
|
ldouble: 10
|
|
|
|
|
|
|
|
Function: "yn_towardzero":
|
|
|
|
double: 3
|
|
|
|
float: 3
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2014-06-30 21:38:43 +00:00
|
|
|
ldouble: 8
|
|
|
|
|
|
|
|
Function: "yn_upward":
|
|
|
|
double: 4
|
2017-12-16 08:34:14 +00:00
|
|
|
float: 5
|
2016-08-09 21:48:54 +00:00
|
|
|
float128: 5
|
2014-06-30 21:38:43 +00:00
|
|
|
ldouble: 9
|
|
|
|
|
1999-12-21 16:15:04 +00:00
|
|
|
# end of automatic generation
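
The entries above follow a simple layout: a `Function: "name":` header followed by `type: value` lines. A minimal sketch of reading that layout into a dictionary is given below, in Python. The function name parse_ulps and the sample text are illustrative assumptions; the sketch only recognises the header form and the four type names that appear in this section, and it skips every other line (comments, timestamps, separators).

```python
import re

# Header and entry patterns, limited to the forms seen in this section.
HEADER = re.compile(r'^Function: "([^"]+)":$')
ENTRY = re.compile(r'^(double|float|float128|ldouble): (\d+)$')


def parse_ulps(text: str) -> dict[str, dict[str, int]]:
    """Map each function name to its per-type maximal error in ulps."""
    table: dict[str, dict[str, int]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = table.setdefault(header.group(1), {})
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            current[entry.group(1)] = int(entry.group(2))
    return table


# Sample text modelled on the "sin_downward" entry, with unrelated lines mixed in.
sample = """\
# Maximal error of functions:
Function: "sin_downward":
double: 1
|
float: 2
2016-08-09 21:48:54 +00:00
float128: 3
ldouble: 4
"""

ulps = parse_ulps(sample)
print(ulps)
```

A table built this way can then be queried directly, for example `ulps["sin_downward"]["ldouble"]` to look up the long double bound for one function.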
|