glibc/sysdeps
Szabolcs Nagy f41b0a43e4 Add new log implementation
Optimized log using a carefully generated lookup table with 1/c and
log(c) values for small intervals around 1.  Each log(c) is very near a
double precision value, so it has about 62 bits of precision.  The
algorithm is log(2^k x) = k log(2) + log(c) + log(x/c), where the last
term is approximated by a polynomial of x/c - 1.  Near 1 a single
polynomial of x - 1 is used.

There is a separate code path for when the fma instruction is not
available for computing x/c - 1 precisely, in which case the table size
is doubled.  The code uses __builtin_fma under __FP_FAST_FMA to ensure
it is inlined as an instruction.
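
A sketch of how the two code paths might differ is shown below.  The
function name sketch_reduce and its parameters are hypothetical, and the
non-fma branch assumes that the extra table data is a high/low split of c
(chi + clo); that is one plausible reading of "the table size is doubled",
not something stated in this message.

```c
#include <assert.h>
#include <math.h>

/* Compute r = z/c - 1 for z close to c, given invc ~ 1/c.
   chi and clo are an assumed high/low split of c used only when
   no fast fma is available.  */
static inline double
sketch_reduce (double z, double invc, double chi, double clo)
{
#ifdef __FP_FAST_FMA
  /* One rounding: z * invc - 1 is evaluated as a single fused op.  */
  (void) chi;
  (void) clo;
  return __builtin_fma (z, invc, -1.0);
#else
  /* z - chi is exact when z and chi are close, so the cancellation
     happens before the multiplication by invc.  */
  return (z - chi - clo) * invc;
#endif
}
```

For example, sketch_reduce (1.26, 0.8, 1.25, 0.0) is close to 0.008 on
either path.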

With the default configuration settings the worst case error is 0.519 ULP
(and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma).
The non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
log throughput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log throughput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)
targets.

	* NEWS: Mention log improvement.
	* math/Makefile (type-double-routines): Add e_log_data.
	* sysdeps/i386/fpu/e_log_data.c: New file.
	* sysdeps/ia64/fpu/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
	* sysdeps/ieee754/dbl-64/ulog.h: Remove.
	* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
2018-09-12 17:33:30 +01:00
..
aarch64 Add new exp and exp2 implementations 2018-09-05 16:22:00 +01:00
alpha Remove alpha math_private.h. 2018-09-05 12:42:51 +00:00
arm Add new exp and exp2 implementations 2018-09-05 16:22:00 +01:00
generic Move float128 inlines from sysdeps/generic/math_private.h to include/math.h. 2018-09-05 11:53:35 +00:00
gnu Update netinet/udp.h from Linux 4.18. 2018-08-27 13:43:05 +00:00
hppa Move SNAN_TESTS_PRESERVE_PAYLOAD out of math-tests.h. 2018-08-01 11:21:16 +00:00
htl hurd: Avoid PLTs for __pthread_get/setspecific 2018-08-09 01:28:55 +02:00
hurd Fix ISO C threads installed header and HURD assumption 2018-07-25 17:27:45 -03:00
i386 Add new log implementation 2018-09-12 17:33:30 +01:00
ia64 Add new log implementation 2018-09-12 17:33:30 +01:00
ieee754 Add new log implementation 2018-09-12 17:33:30 +01:00
init_array sysdeps/init_array: Add PREINIT_FUNCTION to crti.S 2018-01-29 10:22:26 -08:00
m68k Add new log implementation 2018-09-12 17:33:30 +01:00
mach hurd: Fix exec usage of mach_setup_thread 2018-08-01 00:10:03 +02:00
microblaze Mark _init and _fini as hidden [BZ #23145] 2018-06-08 10:28:52 -07:00
mips Split fenv_private.h out of math_private.h more consistently. 2018-08-28 20:48:49 +00:00
nios2 Move EXCEPTION_TESTS_* out of math-tests.h 2018-08-23 23:41:13 +00:00
nptl [BZ #20271] Add newlines in __libc_fatal calls. 2018-08-31 18:04:32 -07:00
posix Fix Linux fcntl OFD locks for non-LFS architectures (BZ#20251) 2018-06-26 13:22:53 -03:00
powerpc Add new exp and exp2 implementations 2018-09-05 16:22:00 +01:00
pthread hurd: fix sigevent's sigev_notify_attributes field type 2018-04-19 21:43:44 +02:00
riscv Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s390 S390: Regenerate ULPs. 2018-09-06 14:29:01 +02:00
sh Update SH libm-tests-ulps 2018-07-31 10:33:53 -03:00
sparc [BZ #20271] Add newlines in __libc_fatal calls. 2018-08-31 18:04:32 -07:00
unix Fix segfault in maybe_script_execute. 2018-09-06 14:27:03 +02:00
wordsize-32 Use libc_hidden_* for strtoumax (bug 15105). 2018-02-28 14:16:21 +00:00
wordsize-64 Use libc_hidden_* for strtoumax (bug 15105). 2018-02-28 14:16:21 +00:00
x86 Split fenv_private.h out of math_private.h more consistently. 2018-08-28 20:48:49 +00:00
x86_64 Remove x86_64 math_private.h asms. 2018-09-11 14:51:40 +00:00