Remove the slow paths from pow. Like several other double precision math
functions, pow is exactly rounded. This is not required from math functions
and causes major overheads as it requires multiple fallbacks using higher
precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns
of up to 100000x have been reported when the highest precision path triggers.
All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1).
The worst case error is ~0.506ULP. A simple test over a few hundred million
values shows pow is 10% faster on average. This fixes BZ #13932.
[BZ #13932]
* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
* benchtests/pow-inputs: Update comment for slow path cases.
* manual/probes.texi (slowpow_p10): Delete removed probe.
(slowpow_p10): Likewise.
* math/Makefile: Remove halfulp.c and slowpow.c.
* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
* sysdeps/generic/math_private.h (__exp1): Remove error argument.
(__halfulp): Remove.
(__slowpow): Remove.
* sysdeps/i386/fpu/halfulp.c: Delete file.
* sysdeps/i386/fpu/slowpow.c: Likewise.
* sysdeps/ia64/fpu/halfulp.c: Likewise.
* sysdeps/ia64/fpu/slowpow.c: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
improve comments and add error analysis.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
(power1): Remove function:
(log1): Remove error argument, add error analysis.
(my_log2): Remove function.
* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
2001-03-11 Ulrich Drepper <drepper@redhat.com>
Last-bit accurate math library implementation by IBM Haifa.
Contributed by Abraham Ziv <ziv@il.ibm.com>, Moshe Olshansky
<olshansk@il.ibm.com>, Ealan Henis <ealan@il.ibm.com>, and
Anna Reitman <reitman@il.ibm.com>.
* math/Makefile (dbl-only-routines): New variable.
(libm-routines): Add $(dbl-only-routines).
* sysdeps/ieee754/dbl-64/e_acos.c: Empty, definition is in e_asin.c.
* sysdeps/ieee754/dbl-64/e_asin.c: Replaced with accurate asin
implementation.
* sysdeps/ieee754/dbl-64/e_atan2.c: Replaced with accurate atan2
implementation.
* sysdeps/ieee754/dbl-64/e_exp.c: Replaced with accurate exp
implementation.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c: Don't use __kernel_sin and
__kernel_cos.
* sysdeps/ieee754/dbl-64/e_log.c: Replaced with accurate log
implementation.
* sysdeps/ieee754/dbl-64/e_remainder.c: Replaced with accurate
remainder implementation.
* sysdeps/ieee754/dbl-64/e_pow.c: Replaced with accurate pow
implementation.
* sysdeps/ieee754/dbl-64/e_sqrt.c: Replaced with accurate sqrt
implementation.
* sysdeps/ieee754/dbl-64/k_cos.c: Empty, definition is in s_sin.c.
* sysdeps/ieee754/dbl-64/k_sin.c: Empty, definition is in s_sin.c.
* sysdeps/ieee754/dbl-64/s_atan.c: Replaced with accurate atan
implementation.
* sysdeps/ieee754/dbl-64/s_cos.c: Empty, definition is in s_sin.c.
* sysdeps/ieee754/dbl-64/s_sin.c: Replaced with accurate sin/cos
implementation.
* sysdeps/ieee754/dbl-64/s_sincos.c: Rewritten to not use __kernel_sin
and __kernel_cos.
* sysdeps/ieee754/dbl-64/s_tan.c: Replaced with accurate tan
implementation.
* sysdeps/ieee754/dbl-64/Dist: Add new non-code files.
* sysdeps/ieee754/dbl-64/MathLib.h: New file.
* sysdeps/ieee754/dbl-64/asincos.tbl: New file.
* sysdeps/ieee754/dbl-64/atnat.h: New file.
* sysdeps/ieee754/dbl-64/atnat2.h: New file.
* sysdeps/ieee754/dbl-64/branred.c: New file.
* sysdeps/ieee754/dbl-64/branred.h: New file.
* sysdeps/ieee754/dbl-64/dla.h: New file.
* sysdeps/ieee754/dbl-64/doasin.c: New file.
* sysdeps/ieee754/dbl-64/doasin.h: New file.
* sysdeps/ieee754/dbl-64/dosincos.c: New file.
* sysdeps/ieee754/dbl-64/dosincos.h: New file.
* sysdeps/ieee754/dbl-64/endian.h: New file.
* sysdeps/ieee754/dbl-64/halfulp.c: New file.
* sysdeps/ieee754/dbl-64/mpa.c: New file.
* sysdeps/ieee754/dbl-64/mpa.h: New file.
* sysdeps/ieee754/dbl-64/mpa2.h: New file.
* sysdeps/ieee754/dbl-64/mpatan.c: New file.
* sysdeps/ieee754/dbl-64/mpatan.h: New file.
* sysdeps/ieee754/dbl-64/mpatan2.c: New file.
* sysdeps/ieee754/dbl-64/mpexp.c: New file.
* sysdeps/ieee754/dbl-64/mpexp.h: New file.
* sysdeps/ieee754/dbl-64/mplog.c: New file.
* sysdeps/ieee754/dbl-64/mplog.h: New file.
* sysdeps/ieee754/dbl-64/mpsqrt.c: New file.
* sysdeps/ieee754/dbl-64/mpsqrt.h: New file.
* sysdeps/ieee754/dbl-64/mptan.c: New file.
* sysdeps/ieee754/dbl-64/mydefs.h: New file.
* sysdeps/ieee754/dbl-64/powtwo.tbl: New file.
* sysdeps/ieee754/dbl-64/root.tbl: New file.
* sysdeps/ieee754/dbl-64/sincos.tbl: New file.
* sysdeps/ieee754/dbl-64/sincos32.c: New file.
* sysdeps/ieee754/dbl-64/sincos32.h: New file.
* sysdeps/ieee754/dbl-64/slowexp.c: New file.
* sysdeps/ieee754/dbl-64/slowpow.c: New file.
* sysdeps/ieee754/dbl-64/uasncs.h: New file.
* sysdeps/ieee754/dbl-64/uatan.tbl: New file.
* sysdeps/ieee754/dbl-64/uexp.h: New file.
* sysdeps/ieee754/dbl-64/uexp.tbl: New file.
* sysdeps/ieee754/dbl-64/ulog.h: New file.
* sysdeps/ieee754/dbl-64/ulog.tbl: New file.
* sysdeps/ieee754/dbl-64/upow.h: New file.
* sysdeps/ieee754/dbl-64/upow.tbl: New file.
* sysdeps/ieee754/dbl-64/urem.h: New file.
* sysdeps/ieee754/dbl-64/uroot.h: New file.
* sysdeps/ieee754/dbl-64/usncs.h: New file.
* sysdeps/ieee754/dbl-64/utan.h: New file.
* sysdeps/ieee754/dbl-64/utan.tbl: New file.
* sysdeps/i386/fpu/branred.c: New file.
* sysdeps/i386/fpu/doasin.c: New file.
* sysdeps/i386/fpu/dosincos.c: New file.
* sysdeps/i386/fpu/halfulp.c: New file.
* sysdeps/i386/fpu/mpa.c: New file.
* sysdeps/i386/fpu/mpatan.c: New file.
* sysdeps/i386/fpu/mpatan2.c: New file.
* sysdeps/i386/fpu/mpexp.c: New file.
* sysdeps/i386/fpu/mplog.c: New file.
* sysdeps/i386/fpu/mpsqrt.c: New file.
* sysdeps/i386/fpu/mptan.c: New file.
* sysdeps/i386/fpu/sincos32.c: New file.
* sysdeps/i386/fpu/slowexp.c: New file.
* sysdeps/i386/fpu/slowpow.c: New file.
* sysdeps/ia64/fpu/branred.c: New file.
* sysdeps/ia64/fpu/doasin.c: New file.
* sysdeps/ia64/fpu/dosincos.c: New file.
* sysdeps/ia64/fpu/halfulp.c: New file.
* sysdeps/ia64/fpu/mpa.c: New file.
* sysdeps/ia64/fpu/mpatan.c: New file.
* sysdeps/ia64/fpu/mpatan2.c: New file.
* sysdeps/ia64/fpu/mpexp.c: New file.
* sysdeps/ia64/fpu/mplog.c: New file.
* sysdeps/ia64/fpu/mpsqrt.c: New file.
* sysdeps/ia64/fpu/mptan.c: New file.
* sysdeps/ia64/fpu/sincos32.c: New file.
* sysdeps/ia64/fpu/slowexp.c: New file.
* sysdeps/ia64/fpu/slowpow.c: New file.
* sysdeps/m68k/fpu/branred.c: New file.
* sysdeps/m68k/fpu/doasin.c: New file.
* sysdeps/m68k/fpu/dosincos.c: New file.
* sysdeps/m68k/fpu/halfulp.c: New file.
* sysdeps/m68k/fpu/mpa.c: New file.
* sysdeps/m68k/fpu/mpatan.c: New file.
* sysdeps/m68k/fpu/mpatan2.c: New file.
* sysdeps/m68k/fpu/mpexp.c: New file.
* sysdeps/m68k/fpu/mplog.c: New file.
* sysdeps/m68k/fpu/mpsqrt.c: New file.
* sysdeps/m68k/fpu/mptan.c: New file.
* sysdeps/m68k/fpu/sincos32.c: New file.
* sysdeps/m68k/fpu/slowexp.c: New file.
* sysdeps/m68k/fpu/slowpow.c: New file.
* iconvdata/gconv-modules: Add a number of alias, mostly for IBM
codepages.