13 Commits

Author SHA1 Message Date
Siddhesh Poyarekar
2506109403 Set/restore rounding mode only when needed
The most common use case of math functions is with the default rounding
mode, i.e. rounding to nearest.  Setting and restoring the rounding mode
is unnecessary overhead in that case, so I've added support for a
context, which does the set/restore only if the FP status needs a
change.  The code is written such that only x86 uses these.  Other
architectures should be unaffected by it, but would benefit if the
set/restore has as much overhead relative to the rest of the code as
it does on x86.

Here's a summary of the performance improvement due to these
changes; I've only mentioned functions that use the set/restore
and have benchmark inputs for x86_64:

Before:

cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy
exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy
pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy
sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy
tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy

After:

cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy
exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy
pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy
sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy
tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy

So the improvements in throughput (calls/Mcy) are:

cos: 27.9089%
exp: 22.6919%
pow: 4.01564%
sin: 19.1585%
tan: 1.96086%
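
These percentages appear to be the relative change in the calls/Mcy
column between the two runs.  A small sketch that recomputes them from
the figures quoted above; the helper names improvement_pct and
print_improvements are made up for this example.

```c
#include <stdio.h>

/* Relative throughput gain in percent, given calls/Mcy before and after. */
double improvement_pct(double before, double after)
{
    return (after / before - 1.0) * 100.0;
}

/* Recompute the gain for each function from the benchmark output. */
void print_improvements(void)
{
    static const struct {
        const char *name;
        double before;  /* calls/Mcy, "Before" run */
        double after;   /* calls/Mcy, "After" run */
    } rows[] = {
        { "cos", 16248.6, 20783.4 },
        { "exp", 15543.9, 19071.1 },
        { "pow", 5660.86, 5888.18 },
        { "sin", 13480.2, 16062.8 },
        { "tan", 2455.55, 2503.7  },
    };

    for (size_t i = 0; i < sizeof rows / sizeof rows[0]; i++)
        printf("%s: %g%%\n", rows[i].name,
               improvement_pct(rows[i].before, rows[i].after));
}
```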

The downside of the change is that it will have an adverse performance
impact on non-default rounding modes, but I think the tradeoff is
justified.
2013-06-12 10:36:48 +05:30
Joseph Myers
5b5b04d628 Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796). 2012-11-03 19:48:53 +00:00
Joseph Myers
a68d0680f8 conformtest: Add test data for fenv.h. 2012-11-02 23:21:36 +00:00
Ulrich Drepper
a784e50247 Remove pre-ISO C support
No more __const.
2012-01-07 23:57:22 -05:00
Jakub Jelinek
9ff8d36f27 Correct implementation of fmaf. 2010-10-11 09:27:05 -04:00
Andreas Schwab
7eb22e757e Avoid PLT call to fegetenv on s390 2010-02-09 22:34:17 -08:00
Ulrich Drepper
246ec41199 * sysdeps/powerpc/fpu/fenv_libc.h: Add libm_hidden_proto for
__fe_nomask_env.
	* sysdeps/powerpc/fpu/fe_nomask.c: Add libm_hidden_def.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/fe_nomask.c: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/fpu/fe_nomask.c: Likewise.

	* sysdeps/powerpc/bits/fenv.h: Make safe for C++.

	* sysdeps/unix/sysv/linux/powerpc/bits/mathinline.h: New file.
	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Rename
	function from fegetexcept and make old name weak alias.
	* include/fenv.h: Declare __fegetexcept.
	* sysdeps/powerpc/fpu/fedisblxcpt.c: Use __fegetexcept instead of
	fegetexcept.
	* sysdeps/powerpc/fpu/feenablxcpt.c: Likewise.
	* sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Avoid call
	to fetestexcept.
	* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Use __frexpl
	instead of frexpl to avoid local PLT.
	* math/s_significandl.c (__significandl): Use __ilogbl instead of
	ilogbl to avoid local PLT.
	* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __ldexpl
	instead of ldexpl to avoid local PLT.
	* sysdeps/ieee754/ldbl-128ibm/e_expl.c (__ieee754_expl): Use
	__roundl not roundl to avoid local PLT.
	* sysdeps/ieee754/ldbl-128/e_j0l.c: Use function names which avoid
	local PLTs.  Use __sincosl instead of separate sinl and cosl
	calls.
	* sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise.
2008-04-12 00:51:34 +00:00
Ulrich Drepper
9b8a727776 * include/fenv.h: Add libm_hidden_proto for fesetround and
feholdexcept.
	* sysdeps/alpha/fpu/feholdexcpt.c: Add libm_hidden_def.
	* sysdeps/alpha/fpu/fesetround.c: Likewise.
	* sysdeps/generic/feholdexcpt.c: Likewise.
	* sysdeps/generic/fesetround.c: Likewise.
	* sysdeps/i386/fpu/feholdexcpt.c: Likewise.
	* sysdeps/i386/fpu/fesetround.c: Likewise.
	* sysdeps/ia64/fpu/feholdexcpt.c: Likewise.
	* sysdeps/ia64/fpu/fesetround.c: Likewise.
	* sysdeps/powerpc/fpu/feholdexcpt.c: Likewise.
	* sysdeps/powerpc/fpu/fesetround.c: Likewise.
	* sysdeps/s390/fpu/feholdexcpt.c: Likewise.
	* sysdeps/s390/fpu/fesetround.c: Likewise.
	* sysdeps/sh/sh4/fpu/feholdexcpt.c: Likewise.
	* sysdeps/sh/sh4/fpu/fesetround.c: Likewise.
	* sysdeps/sparc/fpu/feholdexcpt.c: Likewise.
	* sysdeps/sparc/fpu/fesetround.c: Likewise.
	* sysdeps/x86_64/fpu/feholdexcpt.c: Likewise.
	* sysdeps/x86_64/fpu/fesetround.c: Likewise.
	* sysdeps/generic/s_significand.c (__significand): Use __ilogb not
	ilogb.
	* sysdeps/generic/s_significandf.c (__significandf): Use __ilogbf
	not ilogbf.
2005-07-08 18:54:49 +00:00
Ulrich Drepper
a334319f65 (CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4. 2004-12-22 20:10:10 +00:00
Jakub Jelinek
0ecb606cb6 2.5-18.1 2007-07-12 18:26:36 +00:00
Ulrich Drepper
76f2646f3d Update.
2002-09-09  Jakub Jelinek  <jakub@redhat.com>

	* include/math.h (__finite_internal, __finitef_internal,
	__finitel_internal, __isinf_internal, __isnan_internal): Remove.
	(isfinite): Remove.
	(__finite, __isinf, __isnan, __finitef, __isinff, __isnanf, __finitel,
	__isinfl, __isnanl): Add hidden_proto.
	(__fpclassify, __fpclassifyf, __fpclassifyl, __expm1l): Add
	libm_hidden_proto.
	* math/Makefile (libm-calls): Add s_isinf and s_isnan.
	* stdio-common/printf_fp.c (__printf_fp): Remove INTUSE from
	__is{inf,nan} calls.
	* stdio-common/printf_size.c (printf_size): Likewise.
	* sysdeps/generic/printf_fphex.c (__printf_fphex): Likewise.
	* sysdeps/generic/s_ldexp.c (__ldexp): Likewise.
	* sysdeps/generic/s_ldexpf.c (__ldexpf): Likewise.
	* sysdeps/generic/s_ldexpl.c (__ldexpl): Likewise.
	* sysdeps/generic/s_expm1l.c (__expm1l): Add libm_hidden_def.
	* sysdeps/i386/fpu/s_finite.S (__finite_internal): Remove alias.
	(__finite): Add hidden_def.
	* sysdeps/i386/fpu/s_finitef.S (__finitef_internal): Remove alias.
	(__finitef): Add hidden_def.
	* sysdeps/i386/fpu/s_finitel.S (__finitel_internal): Remove alias.
	(__finitel): Add hidden_def.
	* sysdeps/i386/fpu/s_isinfl.c (__isinfl): Remove INTDEF.  Add
	hidden_def.
	* sysdeps/i386/fpu/s_isnanl.c (__isnanl): Likewise.
	* sysdeps/i386/fpu/s_fpclassifyl.c (__fpclassifyl): Add
	libm_hidden_def.
	* sysdeps/i386/fpu/s_expm1l.S (__expm1l): Likewise.
	* sysdeps/ieee754/dbl-64/s_finite.c (__finite): Remove INTDEF.  Add
	hidden_def.
	* sysdeps/ieee754/dbl-64/s_isinf.c (__isinf): Likewise.
	(__isinfl): Remove INTDEF.
	* sysdeps/ieee754/dbl-64/s_isnan.c (__isnan): Remove INTDEF.  Add
	hidden_def.
	(__isnanl): Remove INTDEF.
	* sysdeps/ieee754/dbl-64/s_fpclassify.c (__fpclassify): Add
	libm_hidden_def.
	* sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Use __sin and __cos
	instead of sin and cos.
	* sysdeps/ieee754/flt-32/s_finitef.c (__finitef): Remove INTDEF.
	Add hidden_def.
	* sysdeps/ieee754/flt-32/s_isinff.c (__isinff): Likewise.
	* sysdeps/ieee754/flt-32/s_isnanf.c (__isnanf): Likewise.
	* sysdeps/ieee754/flt-32/s_fpclassifyf.c (__fpclassifyf): Add
	libm_hidden_def.
	* sysdeps/ieee754/ldbl-128/s_finitel.c (__finitel): Remove INTDEF.
	Add hidden_def.
	* sysdeps/ieee754/ldbl-128/s_isinfl.c (__isinfl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_isnanl.c (__isnanl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_fpclassifyl.c (__fpclassifyl): Add
	libm_hidden_def.
	* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Add
	libm_hidden_def.
	* sysdeps/ieee754/ldbl-96/s_finitel.c (__finitel): Remove INTDEF.
	Add hidden_def.
	* sysdeps/ieee754/ldbl-96/s_isinfl.c (__isinfl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_isnanl.c (__isnanl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fpclassifyl.c (__fpclassifyl): Add
	libm_hidden_def.
	* sysdeps/ia64/fpu/s_finite.S (__finite_internal, __finitef_internal,
	__finitel_internal): Remove aliases.
	(__finite, __finitef, __finitel): Add hidden_def.
	* sysdeps/ia64/fpu/s_isnan.S (__isnan_internal, __isnanf_internal,
	__isnanl_internal): Remove aliases.
	(__isnan, __isnanf, __isnanl): Add hidden_def.
	* sysdeps/ia64/fpu/s_isinf.S (__isinf_internal, __isinff_internal,
	__isinfl_internal): Remove aliases.
	(__isinf, __isinff, __isinfl): Add hidden_def.
	* sysdeps/ia64/fpu/s_fpclassify.S (__fpclassify, __fpclassifyf,
	__fpclassifyl): Add libm_hidden_def.
	* sysdeps/ia64/fpu/s_expm1l.S (__expm1l): Likewise.
	* sysdeps/m68k/s_isinfl.c (__isinfl): Remove INTDEF.  Add hidden_def.
	* sysdeps/m68k/fpu/s_isinf.c (INTDEFX): Remove.
	(hidden_defx): Define and use.
	* sysdeps/m68k/fpu/s_fpclassifyl.c (__fpclassifyl): Add
	libm_hidden_def.
	* sysdeps/m68k/fpu/s_expm1l.c (__expm1l): Likewise.
	* sysdeps/m68k/s_isnanl.c (__isnanl): Add hidden_def.
	* sysdeps/powerpc/fpu/s_isnan.c (__isnan, __isnanf, __isnanl):
	Remove INTDEF.
	(__isnan, __isnanf): Add hidden_def.
	* sysdeps/x86_64/fpu/s_finitel.S (__finitel_internal): Remove alias.
	(__finitel): Add libm_hidden_def.
	* sysdeps/x86_64/fpu/s_expm1l.S (__expm1l): Likewise.

	* include/fenv.h (feraiseexcept, fesetenv): Add libm_hidden_proto.
	* sysdeps/alpha/fpu/fesetenv.c (fesetenv): Add libm_hidden_ver.
	* sysdeps/alpha/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/arm/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/arm/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/generic/fesetenv.c (fesetenv): Likewise.
	* sysdeps/generic/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/i386/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/i386/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/m68k/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/m68k/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/mips/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/mips/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/powerpc/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/powerpc/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/sparc/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/sparc/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/hppa/fpu/fesetenv.c (fesetenv): Add libm_hidden_def.
	* sysdeps/hppa/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/ia64/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/ia64/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/sh/sh4/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/sh/sh4/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/s390/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/s390/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
	* sysdeps/x86_64/fpu/fesetenv.c (fesetenv): Likewise.
	* sysdeps/x86_64/fpu/fraiseexcpt.c (feraiseexcept): Likewise.
2002-09-10 01:40:26 +00:00
Andreas Jaeger
ed073f0e62 Add prototype for foo. 2000-12-27 19:58:36 +00:00
Ulrich Drepper
7cabd57c0d Update.
1998-08-03 16:36  Ulrich Drepper  <drepper@cygnus.com>

	* catgets/catgets.c: Use mmap/munmap only if _POSIX_MAPPED_FILES
	is defined.
	* catgets/open_catalog.c: Likewise.
	* iconv/iconv_prog.c: Likewise.
	* intl/loadmsgcat.c: Likewise.
	* locale/findlocale.c: Likewise.
	* locale/loadlocale.c: Likewise.
	* locale/programs/localedef.c: Likewise.
	* malloc/malloc.c: Likewise.

	* elf/elf.h: Fix typo.

	* math/Makefile: Use $(LN_S) instead of ln.

	* sysdeps/generic/getpgid.c: Fix return type.

1998-08-01 02:49 -0400  Zack Weinberg  <zack@rabi.phys.columbia.edu>

	* sysdeps/posix/tempname.c (__stdio_gen_tempname): Rename to
	__gen_tempname and simplify the interface.  Strip out the
	code to do path search and create FILE objects.  This function
	now takes a mktemp() style template and returns either a name
	or a file descriptor.
	(__path_search): New function; searches for directories for
	temp files.
	* sysdeps/generic/tempname.c: Stub out __gen_tempname and
	__path_search, not __stdio_gen_tempname.

	* libio/stdio.h: Prototype __gen_tempname and __path_search,
	not __stdio_gen_tempname.
	* stdio/stdio.h: Likewise.

	* stdio-common/tempnam.c: Use __path_search and __gen_tempname.
	* stdio-common/tmpfile.c: Likewise.
	* stdio-common/tmpfile64.c: Likewise.
	* stdio-common/tmpnam.c: Likewise.
	* stdio-common/tmpnam_r.c: Likewise.

	* misc/mkstemp.c: New file.  Use __gen_tempname.
	* misc/mktemp.c: Likewise.

	* sysdeps/posix/mkstemp.c: Removed.
	* sysdeps/posix/mktemp.c: Removed.
	* sysdeps/generic/mkstemp.c: Removed.
	* sysdeps/generic/mktemp.c: Removed.

1998-08-02  Thorsten Kukuk  <kukuk@vt.uni-paderborn.de>

	* configure.in: Check if door add-on is installed.
	* config.make.in: Add have_doors.
	* sunrpc/Makefile: Add HAVE_DOOR define.
	* sunrpc/key_call.c: Add keyserv/door interface.

	* sunrpc/svc_unix.c: Call setsockopt only if SO_PASSCRED is defined.
	* sunrpc/clnt_unix.c: Likewise.

1998-08-02  Andreas Jaeger  <aj@arthur.rhein-neckar.de>

	* inet/netinet/in.h (IN_CLASSC): Correct mask.
	Reported by Ian Staniforth <I.Staniforth@sheffield.ac.uk> [fixes
	PR libc/727].

1998-08-03 10:23  Ulrich Drepper  <drepper@cygnus.com>

	* misc/Makefile: Fix installation problem with --disable-shared.
	* posix/Makefile: Likewise.

1998-08-02  Andreas Schwab  <schwab@issan.informatik.uni-dortmund.de>

	* posix/regex.c (re_search_2): Optimize searching for anchored
	pattern if '^' cannot match at embedded newlines.
	(regerror): Renamed from __regerror, which should only be used
	if _LIBC is defined.

1998-07-31  Andreas Schwab  <schwab@issan.informatik.uni-dortmund.de>

	* sunrpc/svc_unix.c (__msgread): Check setsockopt return value.

1998-07-31  Andreas Schwab  <schwab@issan.informatik.uni-dortmund.de>

	* sysdeps/generic/glob.c: Remove obsolete cast.

1998-07-31  Andreas Schwab  <schwab@issan.informatik.uni-dortmund.de>

	* Rules (tests): Fix last change.
1998-08-03 16:47:01 +00:00