glibc

mirror of https://sourceware.org/git/glibc.git synced 2025-01-04 00:31:09 +00:00

Adhemerval Zanella Netto cf9cf33199 math: Improve fmodf

This uses a new algorithm similar to the one proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:

   mx * 2^ex == 2 * mx * 2^(ex - 1)

   while (ex > ey)
     {
       mx *= 2;
       --ex;
       mx %= my;
     }

With mx/my being the mantissas of the floating-point inputs, the
argument reduction can be improved to 8 bits per step (which is the
bit width of uint32_t minus the sum of MANTISSA_WIDTH and the sign bit):

   while (ex > ey)
     {
       mx <<= 8;
       ex -= 8;
       mx %= my;
     }

The implementation uses builtin clz and ctz, along with shifts, to
convert hx/hy back to floating-point values.  Unlike the original
patch, this patch assumes the modulo/divide operation is slow, so it
uses multiplication by inverse values instead.

I see the following performance improvements using the fmod benchtests
(results show only the 'mean' value):

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals      | 17.2549  | 12.0318
x86_64 (Ryzen 9) | normal          | 85.4096  | 49.9641
x86_64 (Ryzen 9) | close-exponents | 19.1072  | 15.8224
aarch64 (N1)     | subnormal       | 10.2182  | 6.81778
aarch64 (N1)     | normal          | 60.0616  | 20.3667
aarch64 (N1)     | close-exponents | 11.5256  | 8.39685

I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it uses a lot of soft-fp implementations
(for modulo and multiplication):

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
armhf (N1)       | subnormal       | 11.6662  | 10.8955
armhf (N1)       | normal          | 69.2759  | 34.1524
armhf (N1)       | close-exponents | 13.6472  | 18.2131

Instead of using the math_private.h definitions, I used the
math_config.h ones, which are used by the newer math implementations.

Co-authored-by: kirill <kirill.okhotnikov@gmail.com>

[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

2023-04-03 16:45:18 -03:00
..
dbl-64	math: Improve fmod	2023-04-03 16:36:24 -03:00
float128	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
flt-32	math: Improve fmodf	2023-04-03 16:45:18 -03:00
ldbl-64-128	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
ldbl-96	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
ldbl-128	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
ldbl-128ibm	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
ldbl-128ibm-compat	Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions	2023-03-27 13:57:55 -03:00
ldbl-opt	C2x scanf binary constant handling	2023-03-02 19:10:37 +00:00
soft-fp	math: Suppress -O0 warnings for soft-fp fsqrt [BZ #19444]	2023-01-11 17:50:51 -03:00
ieee754.h	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
k_standard.c	Use copysign functions not __copysign functions in glibc libm.	2018-09-27 20:04:48 +00:00
k_standardf.c	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
k_standardl.c	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
libm-alias-finite.h	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
Makefile
s_lib_version.c
s_matherr.c
s_signgam.c	Remove unnecessary math_private.h includes.	2018-09-28 21:53:33 +00:00