glibc/sysdeps/x86_64
Wilco Dijkstra ea5c662c62 Improve performance of sincosf
This patch is a complete rewrite of sincosf.  The new version is
significantly faster, as well as simple and accurate.
The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over
all 4 billion inputs.  In non-nearest rounding modes the error is 1ULP.

The algorithm uses 3 main cases: small inputs which don't need argument
reduction, small inputs which need a simple range reduction and large inputs
requiring complex range reduction.  The code uses approximate integer
comparisons to quickly decide between these cases.

The small range reducer uses a single reduction step to handle values up to
120.0.  It is fastest on targets which support inlined round instructions.

The large range reducer uses integer arithmetic for simplicity.  It does a
32x96 bit multiply to compute a 64-bit modulo result.  This is more than
accurate enough to handle the worst-case cancellation for values close to
an integer multiple of PI/4.  It could be further optimized, however it is
already much faster than necessary.

sincosf throughput gains on Cortex-A72:
* |x| < 0x1p-12 : 1.6x
* |x| < M_PI_4  : 1.7x
* |x| < 2 * M_PI: 1.5x
* |x| < 120.0   : 1.8x
* |x| < Inf     : 2.3x

	* math/Makefile: Add s_sincosf_data.c.
	* sysdeps/ia64/fpu/s_sincosf_data.c: New file.
	* sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function.
	(sincosf_poly): Likewise.
	(reduce_small): Likewise.
	(reduce_large): Likewise.
	* sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite.
	* sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data.
	* sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file.
	* sysdeps/x86_64/fpu/s_sincosf_data.c: New file.
2018-08-10 17:34:39 +01:00
..
64
fpu Improve performance of sincosf 2018-08-10 17:34:39 +01:00
multiarch x86: Don't include <init-arch.h> in assembly codes 2018-08-03 08:05:00 -07:00
nptl x86: Rename __glibc_reserved2 to ssp_base in tcbhead_t 2018-07-25 04:39:39 -07:00
x32 Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
____longjmp_chk.S
__longjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
_mcount.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
abort-instr.h
add_n.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
addmul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
atomic-machine.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bzero.S
configure Add --enable-static-pie configure option to build static PIE [BZ #19574] 2017-12-15 17:12:14 -08:00
configure.ac Add --enable-static-pie configure option to build static PIE [BZ #19574] 2017-12-15 17:12:14 -08:00
crti.S x86: Add _CET_ENDBR to functions in crti.S 2018-07-17 16:05:18 -07:00
crtn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-irel.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-lookupcfg.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-machine.h elf: Unify symbol address run-time calculation [BZ #19818] 2018-04-04 23:09:37 +01:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-runtime.c
dl-tls.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tls.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tlsdesc.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tlsdesc.S x86: Add _CET_ENDBR to functions in dl-tlsdesc.S 2018-07-17 16:07:17 -07:00
dl-trampoline.h x86: Support IBT and SHSTK in Intel CET [BZ #21598] 2018-07-16 14:08:27 -07:00
dl-trampoline.S x86: Move STATE_SAVE_OFFSET/STATE_SAVE_MASK to sysdep.h 2018-08-06 06:25:43 -07:00
ffs.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ffsll.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
hp-timing.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
htonl.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Implies Add float128 support for x86_64, x86. 2017-06-26 22:02:24 +00:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
l10nflist.c
ldbl2mpn.c
link-defines.sym
locale-defines.sym
localplt.data ld.so: Introduce struct dl_exception 2017-08-10 16:54:57 +02:00
lshift.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
machine-gmon.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Makefile Rename the glibc.tune namespace to glibc.cpu 2018-08-02 23:49:19 +05:30
memchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcopy.h
memcpy_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcpy.S
memmove_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memmove.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mempcpy.S
memrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memset_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memset.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memusage.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mp_clz_tab.c
mul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
preconfigure
preconfigure.ac
rawmemchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
rshift.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
rtld-offsets.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
sched_cpucount.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
setjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
stack-aliasing.h
stackguard-macros.h
stackinfo.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
start.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
stpcpy.S
strcasecmp_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasecmp_l.S
strcasecmp.S
strcat.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strchrnul.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strcmp.S x86_64: Use _CET_NOTRACK in strcmp.S 2018-07-18 06:31:53 -07:00
strcpy.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strcspn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strlen.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strncase_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncase_l.S
strncase.S
strncmp.S
strnlen.S
strpbrk.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strspn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sub_n.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
submul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sysdep.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tls_get_addr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tls-macros.h
tlsdesc.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tlsdesc.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tst-audit3.c
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit4.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit5.c
tst-audit6.c
tst-audit7.c
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit10.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-auditmod3a.c
tst-auditmod3b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod4a.c
tst-auditmod4b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod5a.c
tst-auditmod5b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6a.c
tst-auditmod6b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6c.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod7a.c
tst-auditmod7b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-auditmod10b.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx512-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx512.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx512mod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avxmod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-mallocalign1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-platform-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-platformmod-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-platformmod-2.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-quad1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-quad1pie.c
tst-quad2.c
tst-quad2pie.c
tst-quadmod1.S x86-64: Add endbr64 to tst-quadmod[12].S 2018-07-24 05:12:14 -07:00
tst-quadmod1pie.S
tst-quadmod2.S x86-64: Add endbr64 to tst-quadmod[12].S 2018-07-24 05:12:14 -07:00
tst-quadmod2pie.S
tst-split-dynreloc.c
tst-split-dynreloc.lds
tst-sse.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-ssemod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-x86_64-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-x86_64mod-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Versions Move __fentry__ version definition to sysdeps/{i386,x86_64} 2018-08-10 09:07:44 +02:00
wcschr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wcscmp.S x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2 2018-06-01 16:32:43 -05:00
wcslen.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wcsrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wmemset_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wmemset.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wordcopy.c