glibc/sysdeps/i386
Szabolcs Nagy 424c4f60ed Add new pow implementation
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent).  The __exp1 internal symbol
is no longer necessary.

There is a separate code path when fma is not available, but the worst-case
error is about 0.54 ULP in both cases.  The lookup table and constants for
log take 4168 bytes.  The .rodata+.text size decreases by 37908 bytes on
aarch64.  In non-nearest rounding modes the error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
pow throughput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.
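
The sketch below shows one way a code path can be selected on
__FP_FAST_FMA, using an exact product as the example operation.  It is an
illustration, not the code in e_pow.c: the function name two_product is made
up here, the fallback is the classic Veltkamp/Dekker splitting, and it
assumes strict double arithmetic (no x87 excess precision, no unintended
contraction) and no overflow or underflow.

```c
#include <math.h>

/* Compute hi and lo such that hi + lo == a * b exactly. */
void two_product(double a, double b, double *hi, double *lo)
{
    *hi = a * b;
#ifdef __FP_FAST_FMA
    /* Fast hardware fma: the rounding error of a*b in one operation. */
    *lo = fma(a, b, -*hi);
#else
    /* No fast fma: split each operand into two halves of at most 26 bits
       so that every partial product below is exact. */
    const double split = 134217729.0; /* 2^27 + 1 */
    double ta = a * split, tb = b * split;
    double ah = ta - (ta - a), al = a - ah;
    double bh = tb - (tb - b), bl = b - bh;
    *lo = ((ah * bh - *hi) + ah * bl + al * bh) + al * bl;
#endif
}
```

For a = b = 1 + 2^-30 both paths give hi = 1 + 2^-29 and lo = 2^-60.  The
fallback costs many more floating-point operations than a single fma, which is
one plausible reason to keep two code paths.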

	* NEWS: Mention pow improvements.
	* math/Makefile (type-double-routines): Add e_pow_log_data.
	* sysdeps/generic/math_private.h (__exp1): Remove.
	* sysdeps/i386/fpu/e_pow_log_data.c: New file.
	* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
	contraction.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
	(exp_inline): Remove.
	(__ieee754_exp): Only a single double input is handled.
	* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
	(__pow_log_data): Define.
	* sysdeps/ieee754/dbl-64/upow.h: Remove.
	* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
	contraction.
	(CFLAGS-e_pow-fma4.c): Likewise.
2018-09-19 10:04:51 +01:00
fpu Add new pow implementation 2018-09-19 10:04:51 +01:00
htl hurd: Bump remaining LGPL2+ htl licences to LGPL 2.1+ 2018-04-02 16:37:36 +02:00
i586 Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
i686 x86: Don't include <init-arch.h> in assembly codes 2018-08-03 08:05:00 -07:00
i786 Also applying directories. 1999-01-24 10:39:22 +00:00
nptl x86: Rename __glibc_reserved2 to ssp_base in tcbhead_t 2018-07-25 04:39:39 -07:00
sys Drop fpregset unused symbol exposition 2018-04-20 01:27:13 +02:00
____longjmp_chk.S Add sigstack handling to Linux ____longjmp_chk on i386. 2009-07-30 21:50:14 -07:00
__longjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
abort-instr.h update from main archive 961220 1996-12-21 04:13:58 +00:00
add_n.S i386: Add _CET_ENDBR to indirect jump targets in add_n.S/sub_n.S 2018-07-17 16:11:44 -07:00
addmul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
asm-syntax.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
atomic-machine.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
backtrace.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bcopy.S Add i386 memset and memcpy assembly functions 2015-08-27 09:04:54 -07:00
bsd-_setjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
bsd-setjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
bzero.S Add i386 memset and memcpy assembly functions 2015-08-27 09:04:54 -07:00
cacheinfo.c Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86 2016-05-08 08:49:18 -07:00
configure Add --enable-static-pie configure option to build static PIE [BZ #19574] 2017-12-15 17:12:14 -08:00
configure.ac Add --enable-static-pie configure option to build static PIE [BZ #19574] 2017-12-15 17:12:14 -08:00
crti.S x86: Add _CET_ENDBR to functions in crti.S 2018-07-17 16:05:18 -07:00
crtn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-cet.c x86: Support IBT and SHSTK in Intel CET [BZ #21598] 2018-07-16 14:08:27 -07:00
dl-irel.h [BZ #20271] Add newlines in __libc_fatal calls. 2018-08-31 18:04:32 -07:00
dl-lookupcfg.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-machine.h elf: Unify symbol address run-time calculation [BZ #19818] 2018-04-04 23:09:37 +01:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tls.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tlsdesc.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tlsdesc.S x86: Add _CET_ENDBR to functions in dl-tlsdesc.S 2018-07-17 16:07:17 -07:00
dl-trampoline.S x86: Support IBT and SHSTK in Intel CET [BZ #21598] 2018-07-16 14:08:27 -07:00
ffs.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
gccframe.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
gmp-mparam.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
htonl.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
htons.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
i386-mcount.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Implies Add float128 support for x86_64, x86. 2017-06-26 22:02:24 +00:00
init-arch.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ldbl2mpn.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
link-defines.sym Preserve bound registers for pointer pass/return 2015-07-09 06:50:12 -07:00
lshift.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
machine-gmon.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Makefile math: Set 387 and SSE2 rounding mode for tgamma on i386 [BZ #23253] 2018-06-21 08:04:29 +02:00
malloc-alignment.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcopy.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcpy.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memmove_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memmove.S Add i386 memset and memcpy assembly functions 2015-08-27 09:04:54 -07:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mempcpy.S Add i386 memset and memcpy assembly functions 2015-08-27 09:04:54 -07:00
memset_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memset.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memusage.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mp_clz_tab.c Update. 2002-03-14 20:48:50 +00:00
mul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
preconfigure Move base_machine and machine settings from configure.ac to sysdeps preconfigure fragments. 2014-06-25 17:52:56 +00:00
pthread_spin_trylock.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
rawmemchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
rshift.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
setfpucw.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
setjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
stackguard-macros.h BZ #15754: CVE-2013-4788 2013-09-23 00:52:09 -04:00
stackinfo.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
start.S i386: Use ENTRY and END in start.S [BZ #23606] 2018-09-12 08:41:26 -07:00
stpcpy.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
stpncpy.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strcat.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strchrnul.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strcspn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
string-inlines.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strlen.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strlen.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strpbrk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strspn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sub_n.S i386: Add _CET_ENDBR to indirect jump targets in add_n.S/sub_n.S 2018-07-17 16:11:44 -07:00
submul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
symbol-hacks.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sysdep.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tls-macros.h i386: Fix build by GCC 5.0 2014-12-30 11:37:41 -08:00
tlsdesc.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tlsdesc.sym Introduce TLS descriptors for i386 and x86_64. 2008-05-13 05:41:30 +00:00
tst-audit3.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit3.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-auditmod3a.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-auditmod3b.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-ld-sse-use.sh Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Versions Move __fentry__ version definition to sysdeps/{i386,x86_64} 2018-08-10 09:07:44 +02:00