Commit Graph

618 Commits

Author SHA1 Message Date
Wilco Dijkstra
08ddd26814 math: remove exp10 wrappers
Remove the error handling wrapper from exp10.  This is very similar to
the changes done to exp and exp2, except that we also need to handle
pow10 and pow10l.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-12 16:02:12 +00:00
Paul Eggert
dff8da6b3e Update copyright dates with scripts/update-copyrights 2024-01-01 10:53:40 -08:00
Bruno Haible
787282dede x86: Do not raises floating-point exception traps on fesetexceptflag (BZ 30990)
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).

The flags can be set in the 387 unit or in the SSE unit.  When we need
to clear a flag, we need to do so in both units, due to the way
fetestexcept is implemented.

When we need to set a flag, it is sufficient to do it in the SSE unit,
because that is guaranteed to not trap.  However, on i386 CPUs that have
only a 387 unit, set the flags in the 387, as long as this cannot trap.

Co-authored-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Adhemerval Zanella
47a9eeb9ba i686: Do not raise exception traps on fesetexcept (BZ 30989)
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).

The flags can be set in the 387 unit or in the SSE unit.  To set
a flag, it is sufficient to do it in the SSE unit, because that is
guaranteed to not trap.  However, on i386 CPUs that have only a
387 unit, set the flags in the 387, as long as this cannot trap.

Checked on i686-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19 15:12:38 -03:00
Paul Pluzhnikov
65cc53fe7c Fix misspellings in sysdeps/ -- BZ 25337 2023-05-30 23:02:29 +00:00
Adhemerval Zanella Netto
16439f419b math: Remove the error handling wrapper from fmod and fmodf
The error handling is moved to sysdeps/ieee754 version with no SVID
support.  The compatibility symbol versions still use the wrapper
with SVID error handling around the new code.  There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).

The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.  For both i686 and m68k,
which provive arch specific implementation, wrappers are added so
no new symbol are added (which would require to change the
implementations).

It shows an small improvement, the results for fmod:

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  x86_64 (Ryzen 9) | subnormals      | 12.5049  | 9.40992
  x86_64 (Ryzen 9) | normal          | 296.939  | 296.738
  x86_64 (Ryzen 9) | close-exponents | 16.0244  | 13.119
  aarch64 (N1)     | subnormal       | 6.81778  | 4.33313
  aarch64 (N1)     | normal          | 155.620  | 152.915
  aarch64 (N1)     | close-exponents | 8.21306  | 5.76138
  armhf (N1)       | subnormal       | 15.1083  | 14.5746
  armhf (N1)       | normal          | 244.833  | 241.738
  armhf (N1)       | close-exponents | 21.8182  | 22.457

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2023-04-03 16:45:27 -03:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Adhemerval Zanella
114e299ca6 x86: Remove .tfloat usage
Some compiler does not support it (such as clang integrated assembler)
neither gcc emits it.
2022-10-03 14:03:21 -03:00
Adhemerval Zanella
efeb2bd1ab math: Add math-use-builtins-fabs (BZ#29027)
Both float, double, and _Float128 are assumed to be supported
(float and double already only uses builtins).  Only long double
is parametrized due GCC bug 29253 which prevents its usage on
powerpc.

It allows to remove i686, ia64, x86_64, powerpc, and sparc arch
specific implementation.

On ia64 it also fixes the sNAN handling:

  math/test-float64x-fabs
  math/test-ldouble-fabs

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
powerpc64-linux-gnu, sparc64-linux-gnu, and ia64-linux-gnu.
2022-05-23 17:49:18 -03:00
Carlos O'Donell
e465d97653 i386: Regenerate ulps
These failures were caught while building glibc master for Fedora
Rawhide which is built with '-mtune=generic -msse2 -mfpmath=sse'
using gcc 11.3 (gcc-11.3.1-2.fc35) on a Cascadelake Intel Xeon
processor.
2022-04-26 10:52:41 -04:00
Adhemerval Zanella
13d45cf9a7 x86: Remove fcopysign{f} implementation
The builtin used by generic code generates similar code.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2022-04-07 12:17:15 -03:00
Adhemerval Zanella
7eed708edf x86: Remove fabs{f} implementation
For x86_64 is the same as the generic implementation, while for i686
the builtin generates the same code.
2022-04-04 16:23:11 -03:00
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Adhemerval Zanella
104d2005d5 math: Remove the error handling wrapper from hypot and hypotf
The error handling is moved to sysdeps/ieee754 version with no SVID
support.  The compatibility symbol versions still use the wrapper with
SVID error handling around the new code.  There is no new symbol version
nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

Only ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
2021-12-13 10:08:46 -03:00
Adhemerval Zanella
a1d3c9b642 i386: Move hypot implementation to C
The generic hypotf is slight slower, mostly due the tricks the assembly
does to optimize the isinf/isnan/issignaling.  The generic hypot is way
slower, since the optimized implementation uses the i386 default
excessive precision to issue the operation directly.  A similar
implementation is provided instead of using the generic implementation:

Checked on i686-linux-gnu.
2021-12-13 09:08:02 -03:00
Siddhesh Poyarekar
5afe4c0d69 Cleanup encoding in comments
Replace non-UTF-8 and non-ASCII characters in comments with their UTF-8
equivalents so that files don't end up with mixed encodings.  With this,
all files (except tests that actually test different encodings) have a
single encoding.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-12-13 10:01:45 +05:30
Joseph Myers
1356f38df5 Fix f64xdivf128, f64xmulf128 spurious underflows (bug 28358)
As described in bug 28358, the round-to-odd computations used in the
libm functions that round their results to a narrower format can yield
spurious underflow exceptions in the following circumstances: the
narrowing only narrows the precision of the type and not the exponent
range (i.e., it's narrowing _Float128 to _Float64x on x86_64, x86 or
ia64), the architecture does after-rounding tininess detection (which
applies to all those architectures), the result is inexact, tiny
before rounding but not tiny after rounding (with the chosen rounding
mode) for _Float64x (which is possible for narrowing mul, div and fma,
not for narrowing add, sub or sqrt), so the underflow exception
resulting from the toward-zero computation in _Float128 is spurious
for _Float64x.

Fixed by making ROUND_TO_ODD call feclearexcept (FE_UNDERFLOW) in the
problem cases (as indicated by an extra argument to the macro); there
is never any need to preserve underflow exceptions from this part of
the computation, because the conversion of the round-to-odd value to
the narrower type will underflow in exactly the cases in which the
function should raise that exception, but it may be more efficient to
avoid the extra manipulation of the floating-point environment when
not needed.

Tested for x86_64 and x86, and with build-many-glibcs.py.
2021-09-21 21:54:37 +00:00
Joseph Myers
abd383584b Add narrowing square root functions
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.

The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well.  However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div.  Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.

The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32.  The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.

I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt.  The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.

Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float).  The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-10 20:56:22 +00:00
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Arjun Shankar
e785361ce3 i386: Regenerate ulps
These failures were caught while building glibc master for Fedora Rawhide
which is built with `-mtune=generic -msse2 -mfpmath=sse'.
2021-07-25 22:29:27 +02:00
Adhemerval Zanella
30c2a0e41b i386: Update ulps
Required after 43576de04a "Improve the accuracy of tgamma
(BZ #26983)"
2021-04-13 16:33:27 -03:00
Adhemerval Zanella
1d64e962ab i386: Update ulps
Required after 9acda61d94 "Fix the inaccuracy of j0f/j1f/y0f/y1f
[BZ #14469, #14470, #14471, #14472]".
2021-04-05 10:02:15 -03:00
Florian Weimer
f01a61e138 i386: Regenerate ulps 2021-03-02 15:41:29 +01:00
Siddhesh Poyarekar
d46c51e9f9 i686: Regenerate ULPs 2021-02-03 23:16:39 +05:30
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Siddhesh Poyarekar
94547d9209 x86 long double: Support pseudo numbers in isnanl
This syncs up isnanl behaviour with gcc.  Also move the isnanl
implementation to sysdeps/x86 and remove the sysdeps/x86_64 version.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-24 06:05:40 +05:30
Siddhesh Poyarekar
b7f8815617 x86 long double: Support pseudo numbers in fpclassifyl
Also move sysdeps/i386/fpu/s_fpclassifyl.c to
sysdeps/x86/fpu/s_fpclassifyl.c and remove
sysdeps/x86_64/fpu/s_fpclassifyl.c

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-12-24 06:05:26 +05:30
Florian Weimer
bca0283815 i386: Regenerate ulps
For new inputs added in commit cad5ad81d2.
2020-12-21 18:19:03 +01:00
Patsy Griffin
86a912c863 Update i686 ulps.
Without this ULP patch these 3 tests fail on i686:
FAIL: math/test-float128-j0
FAIL: math/test-float64x-j0
FAIL: math/test-ldouble-j0

CPU info:
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           85
Model name:                      Intel Xeon Processor (Cascadelake)
2020-09-02 10:00:29 -04:00
H.J. Lu
107e6a3c22 x86: Support usable check for all CPU features
Support usable check for all CPU features with the following changes:

1. Change struct cpu_features to

struct cpuid_features
{
  struct cpuid_registers cpuid;
  struct cpuid_registers usable;
};

struct cpu_features
{
  struct cpu_features_basic basic;
  struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
...
};

so that there is a usable bit for each cpuid bit.
2. After the cpuid bits have been initialized, copy the known bits to the
usable bits.  EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for
CPU feature detection.
3. Clear the usable bits which require OS support.
4. If the feature is supported by OS, copy its cpuid bit to its usable
bit.
5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE
and CPU_FEATURE_USABLE_P to check if a feature is usable.
6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13.
7. Unset MPX feature since it has been deprecated.

The results are

1. If the feature is known and doesn't requre OS support, its usable bit
is copied from the cpuid bit.
2. Otherwise, its usable bit is copied from the cpuid bit only if the
feature is known to supported by OS.
3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the
feature can be used.
4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports
the feature.
2020-07-13 06:05:16 -07:00
Patsy Franklin
b21c2c24ed Update i686 libm-test-ulps
Without my ULP patch these 18 tests fail on i686:
 https://koji.fedoraproject.org/koji/taskinfo?taskID=46467301

+ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel Xeon Processor (Cascadelake)

FAIL: math/test-double-j0
FAIL: math/test-double-y0
FAIL: math/test-float-erfc
FAIL: math/test-float-j0
FAIL: math/test-float-j1
FAIL: math/test-float-lgamma
FAIL: math/test-float-tgamma
FAIL: math/test-float-y0
FAIL: math/test-float32-erfc
FAIL: math/test-float32-j0
FAIL: math/test-float32-j1
FAIL: math/test-float32-lgamma
FAIL: math/test-float32-tgamma
FAIL: math/test-float32-y0
FAIL: math/test-float32x-j0
FAIL: math/test-float32x-y0
FAIL: math/test-float64-j0
FAIL: math/test-float64-y0

With my ULP patch applied these tests now pass:
 https://koji.fedoraproject.org/koji/taskinfo?taskID=46436310
2020-07-09 23:43:25 -04:00
Adhemerval Zanella
b24381e50f i386: Use builtin sqrtl
Checked on i686-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
4b2d8e4442 i386: Use generic exp10f
The generic implementation is twice as fast.  Using the exp10f
benchmark:

 * master:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.02967e+09,
    "iterations": 4.768e+07,
    "reciprocal-throughput": 18.3579,
    "latency": 24.8331,
    "max-throughput": 5.44725e+07,
    "min-throughput": 4.02688e+07
   }
  }

 * patched:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.01821e+09,
    "iterations": 6.1984e+07,
    "reciprocal-throughput": 13.1975,
    "latency": 19.6563,
    "max-throughput": 7.57719e+07,
    "min-throughput": 5.08743e+07
   }
  }

Checked on i686-linux-gnu.
2020-06-19 10:48:15 -03:00
Samuel Thibault
415d0b0b3f Update i386 libm-test-ulps 2020-05-26 13:21:57 +02:00
Adhemerval Zanella
1c15464ca0 math: Remove inline math tests
With mathinline removal there is no need to keep building and testing
inline math tests.

The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries.  The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.

Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-03-19 11:45:44 -03:00
Wilco Dijkstra
220622dde5 Add libm_alias_finite for _finite symbols
This patch adds a new macro, libm_alias_finite, to define all _finite
symbol.  It sets all _finite symbol as compat symbol based on its first
version (obtained from the definition at built generated first-versions.h).

The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redifinition,
on sysdeps/ieee754/float128/float128_private.h.

Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).

Passes buildmanyglibc.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:04 -03:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Andreas Schwab
b72971845a Update i386 libm-test-ulps 2019-08-20 17:06:02 +02:00
Andreas Schwab
56e098118a Update i386 libm-test-ulps 2019-08-15 12:25:51 +02:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Szabolcs Nagy
a502c5294b Remove the error handling wrapper from pow
Introduce new pow symbol version that doesn't do SVID compatible error
handling.  The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.

The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_pow.c and enabled for targets with their own pow implementation or
ifunc dispatch on __ieee754_pow by including math/w_pow.c.

The compatibility symbol version still uses the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously powl was an alias of pow, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling.  This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.

The __pow_finite symbol is now an alias of pow.  Both __pow_finite and
pow set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

On x86_64 #include <math.h> was added before macro definitions that
may affect that header.

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add pow.
	* math/w_pow_compat.c (__pow_compat): Change to versioned compat
	symbol.
	* math/w_pow.c: New file.
	* sysdeps/i386/fpu/w_pow.c: New file.
	* sysdeps/ia64/fpu/e_pow.S: Add versioned symbols.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_pow.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_pow.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to
	__pow.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.
2018-11-21 09:58:36 +00:00
Szabolcs Nagy
718d6542f2 Remove the error handling wrapper from log2
Introduce new log2 symbol version that doesn't do SVID compatible error
handling.  The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.

The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_log2.c and enabled for targets with their own log2 implementation by
including math/w_log2.c.

The compatibility symbol version still uses the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously log2l was an alias of log2, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling.  This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.

The __log2_finite symbol is now an alias of log2.  Both __log2_finite
and log2 set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add log2.
	* math/w_log2_compat.c (__log2_compat): Change to versioned compat
	symbol.
	* math/w_log2.c: New file.
	* sysdeps/i386/fpu/w_log2.c: New file.
	* sysdeps/ia64/fpu/e_log2.S: Add versioned symbols.
	* sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Rename to __log2
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_log2.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_log2.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
2018-11-21 09:57:21 +00:00
Szabolcs Nagy
f29b7c492d Remove the error handling wrapper from log
Introduce new log symbol version that doesn't do SVID compatible error
handling.  The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.

The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_log.c and enabled for targets with their own log implementation by
including math/w_log.c.

The compatibility symbol version still uses the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously logl was an alias of log, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling.  This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.

The __log_finite symbol is now an alias of log.  Both __log_finite and
log set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

On x86_64 #include <math.h> was added before macro definitions that may
affect that header.

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add log.
	* math/w_log_compat.c (__log_compat): Change to versioned compat
	symbol.
	* math/w_log.c: New file.
	* sysdeps/i386/fpu/w_log.c: New file.
	* sysdeps/ia64/fpu/e_log.S: Update.
	* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_log.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_log.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
	* sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to
	__log.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise.
	* sysdeps/x86_64/fpu/multiarch/w_log.c: New file.
2018-11-21 09:56:27 +00:00
Szabolcs Nagy
c20a10561a Remove the error handling wrapper from exp and exp2
Introduce new exp and exp2 symbol version that don't do SVID compatible
error handling.  The standard errno and fp exception based error handling
is inline in the new code and does not have significant overhead.

The double precision wrappers are disabled for sysdeps/ieee754/dbl-64
by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and
math/w_exp2.c files use the wrapper template and can be included by
targets that have their own exp and exp2 implementations or use ifunc
on the glibc internal __ieee754_exp symbol.

The compatibility symbol versions still use the wrapper with SVID error
handling around the new code.  There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

On targets where previously expl and exp2l were aliases of exp and exp2,
now they point to the compatibility symbols with the wrapper, because
they still need the SVID compatible error handling.  This affects
NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets
as well.

The _finite symbols are now aliases of the standard symbols (they have
no performance advantage anymore).  Both the standard symbols and
_finite symbols set errno and thus not const functions.

The ia64 asm is changed so the compat and new symbol versions map to the
same address.

On x86_64 #include <math.h> was added before macro definitions that may
affect that header (the new macro name is __exp instead of __ieee754_exp
which breaks some math.h macros).

Tested with build-many-glibcs.py.

	* math/Versions (GLIBC_2.29): Add exp and exp2.
	* math/w_exp2_compat.c (__exp2_compat): Change to versioned compat
	symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly.
	* math/w_exp_compat.c (__exp_compat): Likewise.
	* math/w_exp.c: New file.
	* math/w_exp2.c: New file.
	* sysdeps/i386/fpu/w_exp.c: New file.
	* sysdeps/i386/fpu/w_exp2.c: New file.
	* sysdeps/ia64/fpu/e_exp.S: Add versioned symbols.
	* sysdeps/ia64/fpu/e_exp2.S: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2
	and add necessary aliases.
	* sysdeps/ieee754/dbl-64/w_exp.c: New file.
	* sysdeps/ieee754/dbl-64/w_exp2.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp2.c: New file.
	* sysdeps/mach/hurd/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove.
	(__ieee754_exp): Rename to __exp.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove.
	(__ieee754_exp): Rename to __exp.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove.
	(__ieee754_exp): Rename to __exp.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to
	__exp.
	* sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.
2018-11-21 09:55:02 +00:00
Joseph Myers
c52944e8cc Remove unnecessary math_private.h includes.
After my changes to move various macros, inlines and other content
from math_private.h to more specific headers, many files including
math_private.h no longer need to do so.  Furthermore, since the
optimized inlines of various functions have been moved to
include/fenv.h or replaced by use of function names GCC inlines
automatically, a missing math_private.h include where one is
appropriate will reliably cause a build failure rather than possibly
causing code to be less well optimized while still building
successfully.  Thus, this patch removes includes of math_private.h
that are now unnecessary.  In the case of two RISC-V files, the
include is replaced by one of stdbool.h because the files in question
were relying on math_private.h to get a definition of bool.

Tested for x86_64 and x86, and with build-many-glibcs.py.

	* math/fromfp.h: Do not include <math_private.h>.
	* math/s_cacosh_template.c: Likewise.
	* math/s_casin_template.c: Likewise.
	* math/s_casinh_template.c: Likewise.
	* math/s_ccos_template.c: Likewise.
	* math/s_cproj_template.c: Likewise.
	* math/s_fdim_template.c: Likewise.
	* math/s_fmaxmag_template.c: Likewise.
	* math/s_fminmag_template.c: Likewise.
	* math/s_iseqsig_template.c: Likewise.
	* math/s_ldexp_template.c: Likewise.
	* math/s_nextdown_template.c: Likewise.
	* math/w_log1p_template.c: Likewise.
	* math/w_scalbln_template.c: Likewise.
	* sysdeps/aarch64/fpu/feholdexcpt.c: Likewise.
	* sysdeps/aarch64/fpu/fesetround.c: Likewise.
	* sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise.
	* sysdeps/aarch64/fpu/ftestexcept.c: Likewise.
	* sysdeps/aarch64/fpu/s_llrint.c: Likewise.
	* sysdeps/aarch64/fpu/s_llrintf.c: Likewise.
	* sysdeps/aarch64/fpu/s_lrint.c: Likewise.
	* sysdeps/aarch64/fpu/s_lrintf.c: Likewise.
	* sysdeps/i386/fpu/s_atanl.c: Likewise.
	* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
	* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
	* sysdeps/i386/fpu/s_fdim.c: Likewise.
	* sysdeps/i386/fpu/s_logbl.c: Likewise.
	* sysdeps/i386/fpu/s_rintl.c: Likewise.
	* sysdeps/i386/fpu/s_significandl.c: Likewise.
	* sysdeps/ia64/fpu/s_matherrf.c: Likewise.
	* sysdeps/ia64/fpu/s_matherrl.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_atan.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fma.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise.
	* sysdeps/ieee754/k_standardf.c: Likewise.
	* sysdeps/ieee754/k_standardl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise.
	* sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fma.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise.
	* sysdeps/ieee754/s_signgam.c: Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise.
	* sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_floor.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise.
	* sysdeps/riscv/rvd/s_finite.c: Likewise.
	* sysdeps/riscv/rvd/s_fmax.c: Likewise.
	* sysdeps/riscv/rvd/s_fmin.c: Likewise.
	* sysdeps/riscv/rvd/s_fpclassify.c: Likewise.
	* sysdeps/riscv/rvd/s_isinf.c: Likewise.
	* sysdeps/riscv/rvd/s_isnan.c: Likewise.
	* sysdeps/riscv/rvd/s_issignaling.c: Likewise.
	* sysdeps/riscv/rvf/fegetround.c: Likewise.
	* sysdeps/riscv/rvf/feholdexcpt.c: Likewise.
	* sysdeps/riscv/rvf/fesetenv.c: Likewise.
	* sysdeps/riscv/rvf/fesetround.c: Likewise.
	* sysdeps/riscv/rvf/feupdateenv.c: Likewise.
	* sysdeps/riscv/rvf/fgetexcptflg.c: Likewise.
	* sysdeps/riscv/rvf/ftestexcept.c: Likewise.
	* sysdeps/riscv/rvf/s_ceilf.c: Likewise.
	* sysdeps/riscv/rvf/s_finitef.c: Likewise.
	* sysdeps/riscv/rvf/s_floorf.c: Likewise.
	* sysdeps/riscv/rvf/s_fmaxf.c: Likewise.
	* sysdeps/riscv/rvf/s_fminf.c: Likewise.
	* sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise.
	* sysdeps/riscv/rvf/s_isinff.c: Likewise.
	* sysdeps/riscv/rvf/s_isnanf.c: Likewise.
	* sysdeps/riscv/rvf/s_issignalingf.c: Likewise.
	* sysdeps/riscv/rvf/s_nearbyintf.c: Likewise.
	* sysdeps/riscv/rvf/s_roundevenf.c: Likewise.
	* sysdeps/riscv/rvf/s_roundf.c: Likewise.
	* sysdeps/riscv/rvf/s_truncf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of
	<math_private.h>.
	* sysdeps/riscv/rvf/s_rintf.c: Likewise.
2018-09-28 21:53:33 +00:00
Szabolcs Nagy
424c4f60ed Add new pow implementation
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent).  The __exp1 internal symbol
is no longer necessary.

There is separate code path when fma is not available but the worst case
error is about 0.54 ULP in both cases.  The lookup table and consts for
log are 4168 bytes.  The .rodata+.text is decreased by 37908 bytes on
aarch64.  The non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.

	* NEWS: Mention pow improvements.
	* math/Makefile (type-double-routines): Add e_pow_log_data.
	* sysdeps/generic/math_private.h (__exp1): Remove.
	* sysdeps/i386/fpu/e_pow_log_data.c: New file.
	* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
	contraction.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
	(exp_inline): Remove.
	(__ieee754_exp): Only single double input is handled.
	* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
	(__pow_log_data): Define.
	* sysdeps/ieee754/dbl-64/upow.h: Remove.
	* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
	contraction.
	(CFLAGS-e_pow-fma4.c): Likewise.
2018-09-19 10:04:51 +01:00
Joseph Myers
f29b6f17e4 Use rint functions not __rint functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __rint functions to call the
corresponding rint names instead, with asm redirection to __rint when
the calls are not inlined.  The x86_64 math_private.h is removed as no
longer useful after this patch.

This patch is relative to a tree with my floor patch
<https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied,
and much the same considerations arise regarding possibly replacing an
IFUNC call with a direct inline expansion.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect
	using MATH_REDIRECT.
	* sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_rintf.c: Likewise.
	* sysdeps/alpha/fpu/s_rint.c: Likewise.
	* sysdeps/alpha/fpu/s_rintf.c: Likewise.
	* sysdeps/i386/fpu/s_rintl.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_rint.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise.
	* sysdeps/ieee754/float128/s_rintf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_rintf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
	* sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise.
	* sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise.
	* sysdeps/powerpc/fpu/s_rint.c: Likewise.
	* sysdeps/powerpc/fpu/s_rintf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_rint.c: Likewise.
	* sysdeps/riscv/rvf/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/math_private.h: Remove file.
	* math/e_scalb.c (invalid_fn): Use rint functions instead of
	__rint variants.
	* math/e_scalbf.c (invalid_fn): Likewise.
	* math/e_scalbl.c (invalid_fn): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
	* sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
2018-09-14 13:10:39 +00:00
Szabolcs Nagy
3e08ff544b Add new log2 implementation
Similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c)
where the last term is approximated by a polynomial of x/c - 1, the first
order coefficient is about 1/ln2 in this case.

There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, for which the table size is doubled.

The worst case error is 0.547 ULP (0.55 without fma), the read only
global data size is 1168 bytes (2192 without fma) on aarch64.  The
non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
log2 thruput: 2.00x in [0.01 11.1]
log2 latency: 2.04x in [0.01 11.1]
log2 thruput: 2.17x in [0.999 1.001]
log2 latency: 2.88x in [0.999 1.001]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linxu-gnu (defined __FP_FAST_FMA)
targets.

	* NEWS: Mention log2 improvements.
	* math/Makefile (type-double-routines): Add e_log2_data.
	* sysdeps/i386/fpu/e_log2_data.c: New file.
	* sysdeps/ia64/fpu/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log2.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add.
	* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.
2018-09-12 17:36:33 +01:00
Szabolcs Nagy
f41b0a43e4 Add new log implementation
Optimized log using carefully generated lookup table with 1/c and log(c)
values for small intervalls around 1.  The log(c) is very near a double
precision value, it has about 62 bits precision.  The algorithm is
log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is
approximated by a polynomial of x/c - 1.  Near 1 a single polynomial of
x - 1 is used.

There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, in which case the table size is doubled.
The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined
as an instruction.

With the default configuration settings the worst case error is 0.519 ULP
(and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma).
The non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
log thruput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log thruput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)
targets.

	* NEWS: Mention log improvement.
	* math/Makefile (type-double-routines): Add e_log_data.
	* sysdeps/i386/fpu/e_log_data.c: New file.
	* sysdeps/ia64/fpu/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
	* sysdeps/ieee754/dbl-64/ulog.h: Remove.
	* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
2018-09-12 17:33:30 +01:00