mirror of
https://sourceware.org/git/glibc.git
synced 2025-01-14 21:10:19 +00:00
0e0be3ed80
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slight better performance to the generic tanhf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 51.5273 41.0951 20.25% x86_64v2 47.7021 39.1526 17.92% x86_64v3 45.0373 34.2737 23.90% i686 133.9970 83.8596 37.42% aarch64 (Neoverse) 21.5439 14.7961 31.32% power10 13.3301 8.4406 36.68% reciprocal-throughput master patched improvement x86_64 24.9493 12.8547 48.48% x86_64v2 20.7051 12.7761 38.29% x86_64v3 19.2492 11.0851 42.41% i686 78.6498 29.8211 62.08% aarch64 (Neoverse) 11.6026 7.11487 38.68% power10 6.3328 2.8746 54.61% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com> |
||
---|---|---|
.. | ||
bits | ||
fpu | ||
nofpu | ||
nptl | ||
__longjmp.S | ||
atomic-machine.h | ||
bsd-_setjmp.S | ||
bsd-setjmp.S | ||
configure | ||
configure.ac | ||
dl-machine.h | ||
dl-start.S | ||
dl-tls.h | ||
dl-trampoline.S | ||
fpu_control.h | ||
Implies | ||
jmpbuf-offsets.h | ||
jmpbuf-unwind.h | ||
ldsodefs.h | ||
libc-tls.c | ||
machine-gmon.h | ||
Makefile | ||
math-tests-snan-payload.h | ||
math-tests-trap.h | ||
memusage.h | ||
preconfigure | ||
setjmp.S | ||
sfp-machine.h | ||
sotruss-lib.c | ||
stackinfo.h | ||
start.S | ||
sysdep.h | ||
tininess.h | ||
tst-audit.h | ||
utmp-size.h |