mirror of
https://sourceware.org/git/glibc.git
synced 2024-11-30 16:50:07 +00:00
bccb0648ea
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance than the generic tanf.

The code was adapted to glibc style, to use the definitions of math_config.h, to remove errno handling, and to use a generic 128-bit routine for ABIs that do not support it natively.

Benchtest results on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (neoverse1, gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency | master | patched | improvement
---|---|---|---
x86_64 | 82.3961 | 54.8052 | 33.49%
x86_64v2 | 82.3415 | 54.8052 | 33.44%
x86_64v3 | 69.3661 | 50.4864 | 27.22%
i686 | 219.271 | 45.5396 | 79.23%
aarch64 | 29.2127 | 19.1951 | 34.29%
power10 | 19.5060 | 16.2760 | 16.56%

reciprocal-throughput | master | patched | improvement
---|---|---|---
x86_64 | 28.3976 | 19.7334 | 30.51%
x86_64v2 | 28.4568 | 19.7334 | 30.65%
x86_64v3 | 21.1815 | 16.1811 | 23.61%
i686 | 105.016 | 15.1426 | 85.58%
aarch64 | 18.1573 | 10.7681 | 40.70%
power10 | 8.7207 | 8.7097 | 0.13%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>

Name | ||
---|---|---|
bits | ||
fpu | ||
nofpu | ||
nptl | ||
__longjmp.S | ||
atomic-machine.h | ||
bsd-_setjmp.S | ||
bsd-setjmp.S | ||
configure | ||
configure.ac | ||
dl-machine.h | ||
dl-start.S | ||
dl-tls.h | ||
dl-trampoline.S | ||
fpu_control.h | ||
Implies | ||
jmpbuf-offsets.h | ||
jmpbuf-unwind.h | ||
ldsodefs.h | ||
libc-tls.c | ||
machine-gmon.h | ||
Makefile | ||
math-tests-snan-payload.h | ||
math-tests-trap.h | ||
memusage.h | ||
preconfigure | ||
setjmp.S | ||
sfp-machine.h | ||
sotruss-lib.c | ||
stackinfo.h | ||
start.S | ||
sysdep.h | ||
tininess.h | ||
tst-audit.h | ||
utmp-size.h |
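
The improvement column in the commit message's benchmark figures appears consistent with the relative reduction `(master - patched) / master`. The following is a minimal sketch of that arithmetic, not part of glibc or its benchtests; the function name `improvement_percent` and the `LATENCY` dictionary are hypothetical and introduced only for illustration, with values copied from the latency figures in the commit message.

```python
def improvement_percent(master: float, patched: float) -> float:
    """Relative reduction of `patched` versus `master`, in percent.

    Assumes lower values are better, which matches the sign of the
    improvement column in the commit message.
    """
    if master <= 0:
        raise ValueError("master measurement must be positive")
    return (master - patched) / master * 100.0


# Latency figures (master, patched) quoted in the commit message.
LATENCY = {
    "x86_64": (82.3961, 54.8052),
    "i686": (219.271, 45.5396),
    "aarch64": (29.2127, 19.1951),
    "power10": (19.5060, 16.2760),
}

for target, (master, patched) in LATENCY.items():
    print(f"{target}: {improvement_percent(master, patched):.2f}%")
```

Applying the same function to the reciprocal-throughput figures should reproduce that table's improvement column as well, for example the near-zero change reported for power10.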