mirror of
https://sourceware.org/git/glibc.git
synced 2025-01-15 05:20:05 +00:00
9583836785
The CORE-MATH implementation is correctly rounded (for any rounding mode), although it shows worse performance than the current one. The current implementation's performance comes mainly from its internal use of the optimized expf implementation, and it shows a maximum error of 2 ULPs for FE_TONEAREST and 3 ULPs for other rounding modes.

The code was adapted to glibc style and to use the definitions from math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

| Latency | master | patched | improvement |
|---|---|---|---|
| x86_64 | 40.6995 | 49.0737 | -20.58% |
| x86_64v2 | 40.5841 | 44.3604 | -9.30% |
| x86_64v3 | 39.3879 | 39.7502 | -0.92% |
| i686 | 112.3380 | 129.8570 | -15.59% |
| aarch64 (Neoverse) | 18.6914 | 17.0946 | 8.54% |
| power10 | 11.1343 | 9.3245 | 16.25% |

| reciprocal-throughput | master | patched | improvement |
|---|---|---|---|
| x86_64 | 18.6471 | 24.1077 | -29.28% |
| x86_64v2 | 17.7501 | 20.2946 | -14.34% |
| x86_64v3 | 17.8262 | 17.1877 | 3.58% |
| i686 | 64.1454 | 86.5645 | -34.95% |
| aarch64 (Neoverse) | 9.77226 | 12.2314 | -25.16% |
| power10 | 4.0200 | 5.3316 | -32.63% |

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>

| | | |
---|---|---|
.. | ||
abiv2 | ||
bits | ||
fpu | ||
nofpu | ||
nptl | ||
abort-instr.h | ||
atomic-machine.h | ||
bsd-_setjmp.S | ||
bsd-setjmp.S | ||
configure | ||
configure.ac | ||
dl-machine.h | ||
dl-procinfo.h | ||
dl-tls.h | ||
fpu_control.h | ||
gccframe.h | ||
Implies | ||
jmpbuf-unwind.h | ||
ldsodefs.h | ||
libc-tls.c | ||
linkmap.h | ||
machine-gmon.h | ||
Makefile | ||
preconfigure | ||
preconfigure.ac | ||
sfp-machine.h | ||
sotruss-lib.c | ||
sysdep.h | ||
tininess.h | ||
tst-audit.h | ||
utmp-size.h |
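
The improvement figures quoted in the commit message are consistent with (master - patched) / master expressed as a percentage, so a negative value means the patched code is slower than master. The following is a minimal sketch of that arithmetic in C. The names `improvement_percent` and `print_improvement` are hypothetical helpers for illustration and are not part of glibc or its benchtests.

```c
#include <stdio.h>

/* Hypothetical helper (not a glibc function): relative change of the
   patched timing against master, in percent.  Positive means the
   patched code is faster, negative means it is slower.  */
static double
improvement_percent (double master, double patched)
{
  return (master - patched) / master * 100.0;
}

/* Hypothetical helper: print one row in the style of the commit
   message, e.g. "x86_64  40.6995  49.0737  -20.58%".  */
static void
print_improvement (const char *target, double master, double patched)
{
  printf ("%-20s %10.4f %10.4f %8.2f%%\n", target, master, patched,
          improvement_percent (master, patched));
}
```

For example, `print_improvement ("x86_64", 40.6995, 49.0737)` uses the x86_64 latency figures from the commit message, and `improvement_percent (11.1343, 9.3245)` uses the power10 latency figures.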