glibc

mirror of https://sourceware.org/git/glibc.git synced 2025-01-11 20:00:07 +00:00

Patrick McGehearty 6fd0a3c6a8 Improve __ieee754_exp() performance by greater than 5x on sparc/x86.	2017-12-19 17:27:31 +00:00

These changes will be active for all platforms that don't provide
their own exp() routines. They will also be active for ieee754
versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and
erf.

Typical performance gains is typically around 5x when measured on
Sparc s7 for common values between exp(1) and exp(40).

Using the glibc perf tests on sparc,

            sparc (nsec)      x86 (nsec)
            old      new      old      new
    max     17629    395      5173     144
    min     399      54       15       13
    mean    5317     200      1349     23

The extreme max times for the old (ieee754) exp are due to the
multiprecision computation in the old algorithm when the true value is
very near 0.5 ulp away from an value representable in double
precision. The new algorithm does not take special measures for those
cases. The current glibc exp perf tests overrepresent those values.
Informal testing suggests approximately one in 200 cases might
invoke the high cost computation. The performance advantage of the new
algorithm for other values is still large but not as large as indicated
by the chart above.

Glibc correctness tests for exp() and expf() were run. Within the
test suite 3 input values were found to cause 1 bit differences (ulp)
when "FE_TONEAREST" rounding mode is set. No differences in exp() were
seen for the tested values for the other rounding modes.

Typical example:
exp(-0x1.760cd2p+0)  (-1.46113312244415283203125)
 new code:    2.31973271630014299393707e-01  0x1.db14cd799387ap-3
 old code:    2.31973271630014271638132e-01  0x1.db14cd7993879p-3
    exp    =  2.31973271630014285508337 (high precision)
Old delta: off by 0.49 ulp
New delta: off by 0.51 ulp

In addition, because ieee754_exp() is used by other routines, cexp()
showed test results with very small imaginary input values where the
imaginary portion of the result was off by 3 ulp when in upward
rounding mode, but not in the other rounding modes.

For x86, tgamma showed a few values where the ulp increased to 6 (max
ulp for tgamma is 5). Sparc tgamma did not show these failures. I
presume the tgamma differences are due to compiler optimization
differences within the gamma function. The gamma function is known to
be difficult to compute accurately.

* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl".
  (half): New constant.
  (one): Likewise.
  (__ieee754_exp): Rewrite.
  (__slowexp): Remove prototype.
* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
* sysdeps/i386/fpu/slowexp.c: Likewise.
* sysdeps/ia64/fpu/slowexp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment.
* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable.
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
  (CFLAGS-slowexp-fma.c): Remove variable.
  (CFLAGS-slowexp-fma4.c): Likewise.
  (CFLAGS-slowexp-avx.c): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
* math/Makefile (type-double-routines): Remove slowexp.
* manual/probes.texi (slowexp_p6): Remove.
  (slowexp_p32): Likewise.

examples	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
argp.texi	manual: Complete @standards in argp.texi.	2017-06-16 01:19:30 -07:00
arith.texi	Obsolete matherr, _LIB_VERSION, libieee.a.	2017-08-21 17:45:10 +00:00
charset.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
check-safety.sh	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
conf.texi	manual/conf.texi: add a missing underscore in front of SC_SSIZE_MAX [BZ #22588 ]	2017-12-12 00:11:29 +01:00
contrib.texi	Remove Banner mechanism.	2017-09-22 17:43:42 +00:00
creature.texi	manual: Complete @standards in creature.texi.	2017-07-27 03:21:56 -07:00
crypt.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
ctype.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
debug.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
dir	..	2005-11-21 15:45:19 +00:00
errno.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
fdl-1.3.texi	BZ#13738: Switch manual to FDL 1.3.	2012-02-24 12:58:10 -08:00
filesys.texi	manual: Document the linkat function	2017-11-04 00:28:37 +01:00
freemanuals.texi	Update to canonical freemanuals.texi file.	2013-09-24 14:06:56 -07:00
getopt.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
header.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
install-plain.texi	BZ #15941 : Fix INSTALL file regeneration failure with makeinfo 5.x	2013-12-05 09:58:20 +05:30
install.texi	Add --enable-static-pie configure option to build static PIE [BZ #19574 ]	2017-12-15 17:12:14 -08:00
intro.texi	manual: fix typo in the introduction	2016-05-19 23:22:59 -04:00
io.texi	Clean up glibc manual references to "GNU system" (bug 6911).	2012-03-08 01:27:38 +00:00
ipc.texi	manual/ipc.texi: Fix AC-safety notes.	2014-04-08 17:12:15 -04:00
job.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
lang.texi	manual: Rewrite the section on widths of integer types.	2017-08-10 20:28:28 -07:00
lgpl-2.1.texi	Use canonical FSF .texi files for LGPL and FDL texts.	2011-06-06 16:16:55 -07:00
libc-texinfo.sh	Remove add-ons mechanism.	2017-10-05 15:58:13 +00:00
libc.texinfo	Update copyright dates not handled by scripts/update-copyrights.	2017-01-01 00:26:24 +00:00
libcbook.texi
libdl.texi	* manual/libdl.texi: New.	2014-01-31 23:23:59 -02:00
libm-err-tab.pl	Prepare the manual to display math errors for float128 functions	2017-06-23 10:31:09 -03:00
llio.texi	Linux: Add memfd_create system call wrapper	2017-11-23 10:00:40 +01:00
locale.texi	manual: Fix a typo in locale.texi.	2017-12-12 03:21:53 -08:00
macros.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
maint.texi	Remove add-ons mechanism.	2017-10-05 15:58:13 +00:00
Makefile	Remove add-ons mechanism.	2017-10-05 15:58:13 +00:00
math.texi	Add _Float32 function aliases.	2017-12-07 00:48:31 +00:00
memory.texi	Linux: Implement interfaces for memory protection keys	2017-12-05 15:20:35 +01:00
message.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
nss.texi	Remove compat from DEFAULT_CONFIG lookup strings	2017-09-12 10:21:48 -07:00
nsswitch.texi
pattern.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
pipe.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
platform.texi	PowerPC: Extend Program Priority Register support	2015-08-19 17:43:26 -03:00
probes.texi	Improve __ieee754_exp() performance by greater than 5x on sparc/x86.	2017-12-19 17:27:31 +00:00
process.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
README.pretty-printers	Remove obsolete notes at top level of source tree.	2017-09-01 08:04:22 -04:00
README.tunables	Remove obsolete notes at top level of source tree.	2017-09-01 08:04:22 -04:00
resource.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
search.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
setjmp.texi	manual: Document getcontext uc_stack value on Linux [BZ #759 ]	2017-08-08 16:16:43 -03:00
signal.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
socket.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
startup.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
stdio-fp.c
stdio.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
string.texi	manual: Complete @standards in string.texi.	2017-06-16 01:23:17 -07:00
summary.pl	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
sysinfo.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
syslog.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
terminal.texi	manual: Update to mention ENODEV for ttyname and ttyname_r	2017-11-15 20:46:45 +01:00
texinfo.tex	Update miscellaneous files from upstream sources.	2016-12-21 16:05:55 +00:00
texis.awk	Correct close statement.	2001-05-18 13:01:32 +00:00
threads.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
time.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
tsort.awk	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
tunables.texi	powerpc: POWER8 memcpy optimization for cached memory	2017-12-11 17:39:42 -02:00
users.texi	manual: Replace summary.awk with summary.pl.	2017-06-15 21:26:20 -07:00
xtract-typefun.awk	Make shebang interpreter directives consistent	2016-01-07 04:03:21 -05:00

README.tunables

			TUNABLE FRAMEWORK
			=================

Tunables is a feature in the GNU C Library that allows application authors and
distribution maintainers to alter the runtime library behaviour to match their
workload.

The tunable framework allows modules within glibc to register variables that
may be tweaked through an environment variable.  It aims to enforce a strict
namespace rule to bring consistency to naming of these tunable environment
variables across the project.  This document is a guide for glibc developers to
add tunables to the framework.

ADDING A NEW TUNABLE
--------------------

The TOP_NAMESPACE macro is defined by default as 'glibc'.  If distributions
intend to add their own tunables, they should do so in a different top
namespace by overriding the TOP_NAMESPACE macro for that tunable.  Downstream
implementations are discouraged from using the 'glibc' top namespace for
tunables they don't already have consensus to push upstream.

There are three steps to adding a tunable:

1. Add a tunable to the list and fully specify its properties:

For each tunable you want to add, make an entry in elf/dl-tunables.list.  The
format of the file is as follows:

TOP_NAMESPACE {
  NAMESPACE1 {
    TUNABLE1 {
      # tunable attributes, one per line
    }
    # A tunable with default attributes, i.e. string variable.
    TUNABLE2
    TUNABLE3 {
      # its attributes
    }
  }
  NAMESPACE2 {
    ...
  }
}

The allowed attributes are:

- type:			Data type.  Defaults to STRING.  Allowed types are:
			INT_32, UINT_64, SIZE_T and STRING.  Values of numeric
			types may also be given in octal or hexadecimal format.

- minval:		Optional minimum acceptable value.  For a string type
			this is the minimum length of the value.

- maxval:		Optional maximum acceptable value.  For a string type
			this is the maximum length of the value.

- default:		Specify an optional default value for the tunable.

- env_alias:		An alias environment variable.

- security_level:	Specify security level of the tunable.  Valid values:

			SXID_ERASE: (default) Not read for AT_SECURE binaries, and
				    removed from the environment so that child
				    processes can't read it.
			SXID_IGNORE: Not read for AT_SECURE binaries, but retained
				     in the environment for non-AT_SECURE
				     subprocesses.
			NONE: Read all the time.

2. Use TUNABLE_GET/TUNABLE_SET to get and set tunables.

3. OPTIONAL: If tunables in a namespace are being used multiple times within a
   specific module, set the TUNABLE_NAMESPACE macro to reduce the amount of
   typing.
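
As an illustration of the type, minval and maxval attributes described in
step 1, the following is a minimal sketch of how a numeric value might be
checked.  The helper parse_int32_tunable is hypothetical and is not part of
glibc; the sketch also assumes that an out-of-range or malformed value is
simply rejected, which this document does not specify.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of glibc: parse STR as an INT_32-style
   value and accept it only if it lies within [MINVAL, MAXVAL].  */
static bool
parse_int32_tunable (const char *str, int32_t minval, int32_t maxval,
		     int32_t *out)
{
  assert (str != NULL && out != NULL);

  char *end;
  errno = 0;
  /* Base 0 lets strtoll accept decimal, octal (leading 0) and
     hexadecimal (leading 0x) input.  */
  long long v = strtoll (str, &end, 0);

  /* Reject empty input, trailing garbage and overflow.  */
  if (end == str || *end != '\0' || errno == ERANGE)
    return false;

  /* Reject values outside the declared range.  */
  if (v < minval || v > maxval)
    return false;

  *out = (int32_t) v;
  return true;
}
```

With minval 0 and maxval 3, the strings "3" and "0x2" would be accepted by
this sketch, while "4" and "abc" would be rejected.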

GETTING AND SETTING TUNABLES
----------------------------

When the TUNABLE_NAMESPACE macro is defined, one may get tunables in that
module using the TUNABLE_GET macro as follows:

  val = TUNABLE_GET (check, int32_t, TUNABLE_CALLBACK (check_callback))

where 'check' is the tunable name, 'int32_t' is the C type of the tunable and
'check_callback' is the function to call if the tunable got initialized to a
non-default value.  The macro returns the value as type 'int32_t'.

The callback function should be defined as follows:

  void
  TUNABLE_CALLBACK (check_callback) (int32_t *valp)
  {
  ...
  }

where it can expect the tunable value to be passed in VALP.

Tunables in the module can be updated using:

  TUNABLE_SET (check, int32_t, val)

where 'check' is the tunable name, 'int32_t' is the C type of the tunable and
'val' is a value of same type.

To get and set tunables in a different namespace from that module, use the full
form of the macros as follows:

  val = TUNABLE_GET_FULL (glibc, tune, hwcap_mask, uint64_t, NULL)

  TUNABLE_SET_FULL (glibc, tune, hwcap_mask, uint64_t, val)

where 'glibc' is the top namespace, 'tune' is the tunable namespace and the
remaining arguments are the same as the short form macros.

When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to
TUNABLE_GET_FULL, so you will need to provide full namespace information for
both macros.  Likewise for TUNABLE_SET and TUNABLE_SET_FULL.
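
The get, callback and set pattern described above can be tried outside of
glibc with a small mock.  Everything prefixed with mock_ below is a
hypothetical stand-in written for this illustration; it is not how glibc
implements TUNABLE_GET or TUNABLE_SET, and it models only the behaviour
stated in this document (the callback runs when the tunable was initialized
to a non-default value, and the value is returned).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of a single int32_t tunable.  This is an illustrative stand-in,
   NOT glibc's internal representation.  */
struct mock_tunable
{
  int32_t value;
  bool initialized;	/* True if a non-default value was supplied.  */
};

typedef void (*mock_callback_t) (int32_t *valp);

/* Stand-in for TUNABLE_GET: run CALLBACK only when the tunable was
   initialized to a non-default value, then return the value.  */
static int32_t
mock_tunable_get (struct mock_tunable *t, mock_callback_t callback)
{
  assert (t != NULL);
  if (t->initialized && callback != NULL)
    callback (&t->value);
  return t->value;
}

/* Stand-in for TUNABLE_SET: store VAL in the tunable.  */
static void
mock_tunable_set (struct mock_tunable *t, int32_t val)
{
  assert (t != NULL);
  t->value = val;
}

/* Example callback: records the value it was handed in VALP.  */
static int32_t seen_by_callback = -1;

static void
check_callback (int32_t *valp)
{
  seen_by_callback = *valp;
}
```

Calling mock_tunable_get on a tunable whose initialized flag is false
returns the default without invoking check_callback; once the flag is true,
the callback receives a pointer to the value before it is returned.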

** IMPORTANT NOTE **

The tunable list is set as read-only after the dynamic linker relocates itself,
so tunable values may only be set from within the dynamic linker, and only
before relocation.

FUTURE WORK
-----------

The framework currently only allows a one-time initialization of variables
through environment variables and, in some cases, modification of variables via
an API call.  Future goals for this project include:

- Setting system-wide and user-wide defaults for tunables through some
  mechanism like a configuration file.

- Allowing some tunables to be tweaked at runtime.