The most common use of the math functions is in the default rounding
mode, i.e. round-to-nearest. Setting and restoring the rounding mode
is unnecessary overhead in that case, so I've added support for a
context that does the set/restore only if the FP status needs to
change. The code is written such that only x86 uses it. Other
architectures should be unaffected, but would benefit if their
set/restore has as much overhead relative to the rest of the code
as the x86 bits do.
Here's a summary of the performance improvement due to these
changes; I've listed only the functions that use the set/restore
and have benchmark inputs for x86_64:
Before:
cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy
exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy
pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy
sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy
tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy
After:
cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy
exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy
pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy
sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy
tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy
So the improvements in throughput (calls/Mcy) are:
cos: 27.9089%
exp: 22.6919%
pow: 4.01564%
sin: 19.1585%
tan: 1.96086%
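For reference, the percentages above are consistent with taking the
ratio of the calls/Mcy figures after and before the change. A small
sketch of that arithmetic follows; improvement_percent is a
hypothetical helper written for this note and is not part of the
benchmark code.

```c
#include <assert.h>

/* Relative throughput gain in percent, given the calls/Mcy figure
   before and after the change.  Hypothetical helper for illustration.  */
double
improvement_percent (double before, double after)
{
  assert (before > 0.0);
  return (after / before - 1.0) * 100.0;
}
```

For example, improvement_percent (16248.6, 20783.4) gives the cos
figure and improvement_percent (2455.55, 2503.7) gives the tan figure.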
The downside of the change is that it will have an adverse performance
impact on non-default rounding modes, but I think the tradeoff is
justified.
__fe_nomask_env.
* sysdeps/powerpc/fpu/fe_nomask.c: Add libm_hidden_def.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/fe_nomask.c: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/fpu/fe_nomask.c: Likewise.
* sysdeps/powerpc/bits/fenv.h: Make safe for C++.
* sysdeps/unix/sysv/linux/powerpc/bits/mathinline.h: New file.
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Rename
function from fegetexcept and make old name weak alias.
* include/fenv.h: Declare __fegetexcept.
* sysdeps/powerpc/fpu/fedisblxcpt.c: Use __fegetexcept instead of
fegetexcept.
* sysdeps/powerpc/fpu/feenablxcpt.c: Likewise.
* sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Avoid call
to fetestexcept.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Use __frexpl
instead of frexpl to avoid local PLT.
* math/s_significandl.c (__significandl): Use __ilogbl instead of
ilogbl to avoid local PLT.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Use __ldexpl
instead of ldexpl to avoid local PLT.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c (__ieee754_expl): Use
__roundl not roundl to avoid local PLT.
* sysdeps/ieee754/ldbl-128/e_j0l.c: Use function names which avoid
local PLTs. Use __sincosl instead of separate sinl and cosl
calls.
* sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise.
1998-08-03 16:36 Ulrich Drepper <drepper@cygnus.com>
* catgets/catgets.c: Use mmap/munmap only if _POSIX_MAPPED_FILES
is defined.
* catgets/open_catalog.c: Likewise.
* iconv/iconv_prog.c: Likewise.
* intl/loadmsgcat.c: Likewise.
* locale/findlocale.c: Likewise.
* locale/loadlocale.c: Likewise.
* locale/programs/localedef.c: Likewise.
* malloc/malloc.c: Likewise.
* elf/elf.h: Fix typo.
* math/Makefile: Use $(LN_S) instead of ln.
* sysdeps/generic/getpgid.c: Fix return type.
1998-08-01 02:49 -0400 Zack Weinberg <zack@rabi.phys.columbia.edu>
* sysdeps/posix/tempname.c (__stdio_gen_tempname): Rename to
__gen_tempname and simplify the interface. Strip out the
code to do path search and create FILE objects. This function
now takes a mktemp() style template and returns either a name
or a file descriptor.
(__path_search): New function; searches for directories for
temp files.
* sysdeps/generic/tempname.c: Stub out __gen_tempname and
__path_search, not __stdio_gen_tempname.
* libio/stdio.h: Prototype __gen_tempname and __path_search,
not __stdio_gen_tempname.
* stdio/stdio.h: Likewise.
* stdio-common/tempnam.c: Use __path_search and __gen_tempname.
* stdio-common/tmpfile.c: Likewise.
* stdio-common/tmpfile64.c: Likewise.
* stdio-common/tmpnam.c: Likewise.
* stdio-common/tmpnam_r.c: Likewise.
* misc/mkstemp.c: New file. Use __gen_tempname.
* misc/mktemp.c: Likewise.
* sysdeps/posix/mkstemp.c: Removed.
* sysdeps/posix/mktemp.c: Removed.
* sysdeps/generic/mkstemp.c: Removed.
* sysdeps/generic/mktemp.c: Removed.
1998-08-02 Thorsten Kukuk <kukuk@vt.uni-paderborn.de>
* configure.in: Check if door add-on is installed.
* config.make.in: Add have_doors.
* sunrpc/Makefile: Add HAVE_DOOR define.
* sunrpc/key_call.c: Add keyserv/door interface.
* sunrpc/svc_unix.c: Call setsockopt only if SO_PASSCRED is defined.
* sunrpc/clnt_unix.c: Likewise.
1998-08-02 Andreas Jaeger <aj@arthur.rhein-neckar.de>
* inet/netinet/in.h (IN_CLASSC): Correct mask.
Reported by Ian Staniforth <I.Staniforth@sheffield.ac.uk> [fixes
PR libc/727].
1998-08-03 10:23 Ulrich Drepper <drepper@cygnus.com>
* misc/Makefile: Fix installation problem with --disable-shared.
* posix/Makefile: Likewise.
1998-08-02 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de>
* posix/regex.c (re_search_2): Optimize searching for anchored
pattern if '^' cannot match at embedded newlines.
(regerror): Renamed from __regerror, a name it should only have
if _LIBC is defined.
1998-07-31 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de>
* sunrpc/svc_unix.c (__msgread): Check setsockopt return value.
1998-07-31 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de>
* sysdeps/generic/glob.c: Remove obsolete cast.
1998-07-31 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de>
* Rules (tests): Fix last change.